# Partner Integration System
**Last Updated**: February 2026 | **Active partners**: SatLoc (`SATLOC`) | **Planned**: AGIDRONEX

A multi-partner integration framework built on a dual-user model with a generic service layer. Originally implemented for SatLoc; designed to extend to any partner providing job-upload and log-download APIs.

For full architecture diagrams → [docs/PARTNER_INTEGRATION_ARCHITECTURE.md](docs/PARTNER_INTEGRATION_ARCHITECTURE.md)

For queue/DLQ data-flow → [docs/PARTNER_TASK_DATA_FLOW_ANALYSIS.md](docs/PARTNER_TASK_DATA_FLOW_ANALYSIS.md)

## Architecture Overview
### Dual User System
1. **Partner Organizations**: Companies providing integration services (e.g., SatLoc)
   - Stored as User entities with `kind: "PARTNER"`
   - Contains partner-specific configuration and metadata
2. **Partner System Users**: Customer credential records for each partner system
   - Stored as User entities with `kind: "PARTNER_SYSTEM_USER"`
   - Links AgMission customers to their partner system accounts
   - Contains partner-specific credentials and identifiers
   - **Used by workers and services to communicate with partner APIs**
### Assignment Design
**Job assignments are created with normal internal Aircraft user IDs**:
- Assignments use standard internal Aircraft user IDs (`kind: "9"`) for compatibility with existing job assignment logic
- Partner integration is determined by checking if the assigned user has associated PartnerSystemUser records
- Partner-specific metadata (like `partnerAircraftId`) is stored as assignment fields
- Workers and services use the Partner System User records to obtain credentials for external API calls
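
The detection rule above can be sketched as a lookup from the assigned internal user to any active Partner System User records. This is only an illustration under stated assumptions: the README does not say how a user is associated with its records, so matching on a shared `customerId` is assumed here, and `findPartnerLinks`, `isPartnerAssignment`, and the in-memory array are hypothetical stand-ins for a database query.

```javascript
// Hypothetical in-memory stand-in for PartnerSystemUser documents.
const partnerSystemUsers = [
  { kind: 'PARTNER_SYSTEM_USER', partnerId: 'p-satloc', customerId: 'cust-1', active: true },
  { kind: 'PARTNER_SYSTEM_USER', partnerId: 'p-satloc', customerId: 'cust-2', active: false },
];

// Return the active partner links for the customer that owns the assigned user.
// Assumption: the association is made through a shared customerId.
function findPartnerLinks(assignedUser, systemUsers) {
  return systemUsers.filter(
    (su) => su.active === true && su.customerId === assignedUser.customerId
  );
}

// An assignment is treated as partner-integrated when at least one active link exists.
function isPartnerAssignment(assignedUser, systemUsers) {
  return findPartnerLinks(assignedUser, systemUsers).length > 0;
}

// The assignment itself still carries the normal internal Aircraft user ID.
const aircraftUser = { _id: 'u-100', kind: '9', customerId: 'cust-1' };
console.log(isPartnerAssignment(aircraftUser, partnerSystemUsers)); // true
```

Because detection is a lookup rather than a special user kind, the existing assignment logic does not need to know about partners at all.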
## Service Layer
```
helpers/partner_service_factory.js   # ★ singleton factory used by all workers
services/base_partner_service.js     # abstract base; defines uploadJobDataToAircraft,
                                     #   healthCheck, getAircraftList, getStoragePath, etc.
services/satloc_service.js           # SatLoc implementation; Redis-backed auth cache
services/partner_sync_service.js     # orchestrates upload/sync; used by sync worker
services/task_id_generator.js        # generates unique task/execution IDs for TaskTracker
```
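
A singleton factory of this kind might look roughly like the sketch below. It is not the contents of `helpers/partner_service_factory.js`: the class `SatLocServiceSketch` and the function `getPartnerService` are hypothetical, while the `serviceMapping` name and the `partner_service_not_implemented` error string are taken from the Troubleshooting table in this README.

```javascript
// Hypothetical minimal service; the real one extends the abstract base class.
class SatLocServiceSketch {
  healthCheck() {
    return { partner: 'SATLOC', ok: true };
  }
}

// Partner code -> service class. A Map avoids accidental hits on
// inherited object properties such as "constructor".
const serviceMapping = new Map([['SATLOC', SatLocServiceSketch]]);

// One cached instance per partner code, shared by every caller.
const instances = new Map();

function getPartnerService(partnerCode) {
  const ServiceClass = serviceMapping.get(partnerCode);
  if (!ServiceClass) {
    // Unregistered partner codes fail loudly instead of returning undefined.
    throw new Error('partner_service_not_implemented');
  }
  if (!instances.has(partnerCode)) {
    instances.set(partnerCode, new ServiceClass());
  }
  return instances.get(partnerCode);
}

console.log(getPartnerService('SATLOC').healthCheck().ok); // true
```

Under this design, adding a partner means registering one more entry in the mapping; workers keep calling the factory with a partner code.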
### Auth / Credential Flow
1. Worker receives `customerId`
2. `satloc_service.getCachedAuth(customerId)` checks **Redis** for a valid JWT
3. On a cache miss, it queries `PartnerSystemUser` for the customer's credentials and authenticates against the SatLoc API
4. The JWT is cached in Redis with a TTL; if a call fails because the token has expired, the service re-authenticates and retries once
5. There are no global credentials: every API call uses the customer's own SatLoc account
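
The flow above is a cache-aside lookup with a single retry. The sketch below illustrates that pattern only; it is not the code in `satloc_service.js`. A `Map` stands in for Redis, and `loginToPartner`, `withAuth`, the cache key format, the 60-second TTL, and the `AUTH_EXPIRED` error code are all assumptions made for the example.

```javascript
// In-memory stand-in for Redis: key -> { token, expiresAt }.
const authCache = new Map();

// Hypothetical login call. The real service would read the customer's
// credentials from PartnerSystemUser and call the partner API.
let loginCalls = 0;
async function loginToPartner(customerId) {
  loginCalls += 1;
  return `jwt-for-${customerId}-${loginCalls}`;
}

// Cache-aside lookup: return a cached token if it has not expired,
// otherwise log in and cache the new token with a TTL.
async function getCachedAuth(customerId, ttlMs = 60000, now = Date.now()) {
  const key = `partner:auth:${customerId}`;
  const hit = authCache.get(key);
  if (hit && hit.expiresAt > now) return hit.token;
  const token = await loginToPartner(customerId);
  authCache.set(key, { token, expiresAt: now + ttlMs });
  return token;
}

// Run an API call with the cached token. On an auth-expired error,
// drop the cached token and retry exactly once with a fresh one.
async function withAuth(customerId, apiCall) {
  const token = await getCachedAuth(customerId);
  try {
    return await apiCall(token);
  } catch (err) {
    if (err.code !== 'AUTH_EXPIRED') throw err;
    authCache.delete(`partner:auth:${customerId}`);
    return apiCall(await getCachedAuth(customerId));
  }
}
```

Keying the cache by `customerId` is what keeps customers isolated: one customer's token is never reused for another customer's request.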
### Tracking Models
- **`PartnerLogTracker`** — per-log-file: `PENDING → DOWNLOADING → DOWNLOADED → PROCESSING → PROCESSED | FAILED | SKIPPED`
- **`TaskTracker`** — per-queue-task (Jan 2026): `QUEUED → PROCESSING → COMPLETED | FAILED`; retry-chain and stuck-task detection
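
The `PartnerLogTracker` lifecycle can be read as a small state machine. The sketch below encodes the happy path listed above; which states may move to `FAILED` or `SKIPPED` is an assumption (any non-terminal state), and resets performed by stuck-tracker cleanup are deliberately left out. `canTransition` and `advance` are hypothetical helper names.

```javascript
// Allowed transitions. Happy path from this README; the FAILED/SKIPPED
// edges are an assumption of this sketch.
const TRANSITIONS = new Map([
  ['PENDING', ['DOWNLOADING', 'FAILED', 'SKIPPED']],
  ['DOWNLOADING', ['DOWNLOADED', 'FAILED', 'SKIPPED']],
  ['DOWNLOADED', ['PROCESSING', 'FAILED', 'SKIPPED']],
  ['PROCESSING', ['PROCESSED', 'FAILED', 'SKIPPED']],
  ['PROCESSED', []],
  ['FAILED', []],
  ['SKIPPED', []],
]);

// True when moving from one status to another is allowed.
function canTransition(from, to) {
  return (TRANSITIONS.get(from) || []).includes(to);
}

// Return a copy of the tracker in the new status, or throw on an illegal move.
function advance(tracker, to) {
  if (!canTransition(tracker.status, to)) {
    throw new Error(`Illegal transition ${tracker.status} -> ${to}`);
  }
  return { ...tracker, status: to };
}

const tracker = advance({ logId: 'log-1', status: 'PENDING' }, 'DOWNLOADING');
console.log(tracker.status); // DOWNLOADING
```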
## Quick Start
### 1. Environment Configuration (`environment.env`)
```bash
# SatLoc API — customer credentials are per-customer in PartnerSystemUser records
SATLOC_API_ENDPOINT=https://www.satloccloudfc.com/api/Satloc
SATLOC_API_TIMEOUT=30000
SATLOC_RETRY_ATTEMPTS=3
SATLOC_RETRY_DELAY=10000
SATLOC_RATE_LIMIT=60
SATLOC_BURST_LIMIT=10
SATLOC_STORAGE_PATH=/path/to/uploads/satloc
SATLOC_REALTIME_ENABLED=true
SATLOC_FILE_UPLOAD_ENABLED=true
SATLOC_1ST_ASSIGNMENT_ALWAYS_MATCH=true
# Worker control
PARTNER_POLLING_ENABLED=true
PARTNER_SYNC_ENABLED=true
PARTNER_MAX_RETRIES=5
# 18 minutes in ms (18*60*1000); env files do not evaluate arithmetic
STUCK_TIMEOUT_MS=1080000
```
### 2. Run Setup Script
```bash
node setup_partners.js
```
Creates the SatLoc partner organization, a sample customer, and a `PartnerSystemUser` record linking them.
### 3. Start Workers
```bash
node start_workers.js
# or individually:
node workers/partner_data_polling_worker.js
node workers/partner_sync_worker.js
```
## API Reference
Partner endpoints are mounted at `/api/partners` and require authentication.
### Partner Organization CRUD
| Method | Path | Description |
|--------|------|-------------|
| `GET` | `/api/partners` | List all partners |
| `POST` | `/api/partners` | Create partner |
| `GET` | `/api/partners/:id` | Get partner details |
| `PUT` | `/api/partners/:id` | Update partner |
| `DELETE` | `/api/partners/:id` | Soft-delete (`active: false`) |
### Partner System User CRUD
| Method | Path | Description |
|--------|------|-------------|
| `GET` | `/api/partners/systemUsers` | List all system users |
| `POST` | `/api/partners/systemUsers` | Create system user |
| `POST` | `/api/partners/systemUsers/testAuth` | Test credentials against partner API |
| `GET` | `/api/partners/systemUsers/:id` | Get system user |
| `PUT` | `/api/partners/systemUsers/:id` | Update (partial updates supported) |
| `DELETE` | `/api/partners/systemUsers/:id` | Soft-delete |
**Create body:**
```json
{
"partnerId": "<partner_org_id>",
"customerId": "<agmission_customer_id>",
"username": "customer_satloc_username",
"password": "customer_satloc_password",
"name": "Customer SatLoc Account",
"active": true
}
```
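
A client might assemble the create request as sketched below. The helper only builds the request and does not send it. Several details are assumptions: `buildCreateSystemUserRequest` is a hypothetical name, the `Bearer` header is a placeholder (this README states only that authentication is required), and treating `partnerId`, `customerId`, `username`, and `password` as required is a guess based on the body above.

```javascript
// Build (but do not send) the request for creating a Partner System User.
function buildCreateSystemUserRequest(baseUrl, authToken, fields) {
  // Assumed required fields; the server's actual validation may differ.
  const required = ['partnerId', 'customerId', 'username', 'password'];
  const missing = required.filter((key) => !fields[key]);
  if (missing.length > 0) {
    throw new Error(`Missing required fields: ${missing.join(', ')}`);
  }
  return {
    url: `${baseUrl}/api/partners/systemUsers`,
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        // Placeholder auth scheme.
        Authorization: `Bearer ${authToken}`,
      },
      // "active" defaults to true unless the caller overrides it.
      body: JSON.stringify({ active: true, ...fields }),
    },
  };
}

const request = buildCreateSystemUserRequest('http://localhost:4100', 'token', {
  partnerId: 'p1',
  customerId: 'c1',
  username: 'customer_satloc_username',
  password: 'customer_satloc_password',
  name: 'Customer SatLoc Account',
});
console.log(request.options.method, request.url);
```

On Node 18 or later the result can be passed to the built-in `fetch(request.url, request.options)`.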
### Other Partner Routes
| Method | Path | Description |
|--------|------|-------------|
| `GET` | `/api/partners/customers` | List customers for a partner (with subscription info). Query: `?partnerId=<id>` |
| `GET` | `/api/partners/aircraft` | Get aircraft list from partner API |
| `POST` | `/api/partners/uploadJob` | Manually trigger job upload to partner |
| `POST` | `/api/partners/syncData` | Trigger partner data sync |
### Global DLQ API (for `partner_tasks`)
```bash
GET /api/dlq/partner_tasks/list
POST /api/dlq/partner_tasks/retryAll
POST /api/dlq/partner_tasks/retryByPosition # body: {"position":0}
POST /api/dlq/partner_tasks/retryByHeader # body: {"headerName":"x-partner-code","headerValue":"SATLOC"}
POST /api/dlq/partner_tasks/purge
```
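
To make the two selective retry modes concrete, the sketch below shows what selecting by position and by header could mean over a list of dead-lettered messages. The message shape, the zero-based position (suggested by the `{"position":0}` example), and the function names are assumptions; the actual controller operates on the broker's queue rather than an array.

```javascript
// Hypothetical dead-lettered messages; real ones come from the message broker.
const deadLetters = [
  { id: 1, headers: { 'x-partner-code': 'SATLOC' }, body: { type: 'PROCESS_PARTNER_LOG' } },
  { id: 2, headers: { 'x-partner-code': 'OTHER' }, body: { type: 'UPLOAD_PARTNER_JOB' } },
  { id: 3, headers: {}, body: { type: 'UPLOAD_PARTNER_JOB' } },
];

// Select every message whose header matches exactly.
function selectByHeader(messages, headerName, headerValue) {
  return messages.filter(
    (m) => m.headers != null && m.headers[headerName] === headerValue
  );
}

// Select the single message at a zero-based position, if it exists.
function selectByPosition(messages, position) {
  const inRange = Number.isInteger(position) && position >= 0 && position < messages.length;
  return inRange ? [messages[position]] : [];
}

console.log(selectByHeader(deadLetters, 'x-partner-code', 'SATLOC').map((m) => m.id)); // [ 1 ]
console.log(selectByPosition(deadLetters, 0).map((m) => m.id)); // [ 1 ]
```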
### Assign Jobs with Partner Integration
Use **internal** user IDs; partner detection is automatic. Send `POST /api/jobs/assign` with this body:
```json
{
  "jobId": "job_id",
  "dlOp": { "type": 1 },
  "asUsers": [
    {
      "uid": "<internal_pilot_user_id>",
      "notes": "High priority mission"
    }
  ]
}
```
Flow:
1. Assignment created in `controllers/job.js`
2. Partner integration detected from assigned user's context
3. **Immediate upload attempted** — if partner API is live, calls `partnerSyncService.uploadJobToPartner()` directly
4. **If immediate upload fails or API is offline** → queues `UPLOAD_PARTNER_JOB` task to `partner_tasks` queue
5. `partner_sync_worker` processes task → calls `uploadJobToPartner()` → stores `extJobId` on `JobAssign`, status set to `UPLOADED`
6. Polling worker (cron) finds `JobAssign` records with `status = UPLOADED` → polls partner for aircraft logs
7. New logs downloaded → `PROCESS_PARTNER_LOG` task queued → `partner_sync_worker` processes → `ApplicationDetail` records created
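
Steps 3 and 4 describe a "try now, otherwise queue" pattern, sketched below with injected dependencies so it runs without a partner API or message broker. This is an illustration, not the logic in `controllers/job.js`: `uploadOrQueue`, `isPartnerLive`, `enqueue`, and the task payload shape are hypothetical, while `uploadJobToPartner`, `UPLOAD_PARTNER_JOB`, and `partner_tasks` are names from this README.

```javascript
// Try an immediate upload; fall back to the queue if the partner API is
// offline or the upload throws.
async function uploadOrQueue(assignId, deps) {
  const { isPartnerLive, uploadJobToPartner, enqueue } = deps;
  if (await isPartnerLive()) {
    try {
      const extJobId = await uploadJobToPartner(assignId);
      return { mode: 'immediate', extJobId };
    } catch (err) {
      // Swallow the error here on purpose: the queued task is the retry path.
    }
  }
  await enqueue('partner_tasks', { type: 'UPLOAD_PARTNER_JOB', assignId });
  return { mode: 'queued' };
}

// Example wiring with stubs: the partner is offline, so the task is queued.
const queuedTasks = [];
uploadOrQueue('assign-1', {
  isPartnerLive: async () => false,
  uploadJobToPartner: async () => 'ext-1',
  enqueue: async (queue, task) => { queuedTasks.push({ queue, task }); },
}).then((result) => console.log(result.mode)); // queued
```

Because the queued task performs the same upload later, the worker side needs the idempotency check on `extJobId` so a job is not uploaded twice.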
## Partner Data Flow
```
Polling Worker (cron: every 15 min prod / 1 min dev)
  ├─ Query JobAssign WHERE status = UPLOADED (jobs successfully sent to partner)
  ├─ Group by partnerCode + customerId
  ├─ Per aircraft: call partnerService.getAircraftLogs(customerId, aircraftId)
  ├─ Filter out already-processed logs (PartnerLogTracker where processed = true)
  ├─ For each new log:
  │   ├─ Atomic upsert PartnerLogTracker (PENDING)
  │   ├─ Claim for download → status DOWNLOADING
  │   ├─ partnerService.getAircraftLogData() → save to SATLOC_STORAGE_PATH
  │   ├─ Update PartnerLogTracker → DOWNLOADED
  │   └─ Enqueue PROCESS_PARTNER_LOG → partner_tasks queue
  └─ Periodic cleanup of stuck DOWNLOADING / DOWNLOADED / PROCESSING trackers

Sync Worker (queue consumer: partner_tasks)
  ├─ UPLOAD_PARTNER_JOB
  │   ├─ Check idempotency (redelivered check via extJobId on JobAssign)
  │   ├─ partnerSyncService.uploadJobToPartner(assignId)
  │   ├─ POST to SatLoc UploadJobData
  │   ├─ Store extJobId on JobAssign
  │   └─ Set JobAssign.status = UPLOADED (AssignStatus.UPLOADED = 2)
  └─ PROCESS_PARTNER_LOG
      ├─ TaskTracker idempotency check (claim or skip if already processing)
      ├─ Circuit breaker: skip repeatedly-failing files
      ├─ Atomic claim PartnerLogTracker → PROCESSING
      ├─ SatLocLogParser → SatLocApplicationProcessor
      ├─ Creates ApplicationFile + ApplicationDetail records
      ├─ PartnerLogTracker → PROCESSED
      └─ On failure → nack → DLQ partner_tasks_failed
```
See [docs/PARTNER_TASK_DATA_FLOW_ANALYSIS.md](docs/PARTNER_TASK_DATA_FLOW_ANALYSIS.md) for full Mermaid diagrams.
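
The "atomic upsert" and "claim" steps in the flow are what keep two workers from downloading or processing the same log. A minimal sketch of the idea follows, with a `Map` standing in for the tracker collection; `upsertPending` and `claim` are hypothetical names. Assuming a MongoDB-style store, the claim would typically be a single `findOneAndUpdate` filtered on the current status.

```javascript
// In-memory stand-in for the PartnerLogTracker collection, keyed by log id.
const trackers = new Map();

// Create the tracker as PENDING only if it does not exist yet, so repeated
// polling runs never reset a log that is already in progress.
function upsertPending(logId) {
  if (!trackers.has(logId)) {
    trackers.set(logId, { logId, status: 'PENDING' });
  }
  return trackers.get(logId);
}

// Compare-and-set claim: succeeds only while the tracker is still in `from`.
// The first caller wins; later callers see false and skip the log.
function claim(logId, from, to) {
  const tracker = trackers.get(logId);
  if (!tracker || tracker.status !== from) return false;
  tracker.status = to;
  return true;
}

upsertPending('log-42');
console.log(claim('log-42', 'PENDING', 'DOWNLOADING')); // true
console.log(claim('log-42', 'PENDING', 'DOWNLOADING')); // false
```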
## File Structure
### Core Partner Files
```
server/
├── model/
│ ├── partner.js # Partner + PartnerSystemUser discriminators
│ ├── partner_log_tracker.js # Per-log-file lifecycle tracking
│ └── task_tracker.js # Per-queue-task tracking (Phase 2, Jan 2026)
├── controllers/partner.js # RESTful CRUD + action endpoints
├── routes/partner.js # Mounted at /api/partners
├── helpers/
│ ├── partner_config.js # Partner configuration (SATLOC, AGIDRONEX stub)
│ ├── partner_service_factory.js # ★ Singleton factory (used by all workers)
│ ├── satloc_log_parser.js # Binary .LOG file parser
│ ├── satloc_application_processor.js # Parsed records → ApplicationDetail
│ ├── file_satlog.js # Low-level binary read utilities
│ └── satloc_util.js # SatLoc utilities
├── services/
│ ├── base_partner_service.js # Abstract base class
│ ├── satloc_service.js # SatLoc API client + Redis auth cache
│ ├── partner_sync_service.js # Orchestrates upload/sync flows
│ └── task_id_generator.js # Task/execution ID generation
├── workers/
│ ├── partner_data_polling_worker.js # Cron: polls APIs, downloads, enqueues
│ ├── partner_sync_worker.js # Consumer: processes partner_tasks queue
│ ├── dlq_archival_worker.js # Archives expired DLQ messages to disk
│ └── dlq_alert_worker.js # Threshold-based email alerts
├── controllers/dlq.js # Global DLQ controller (all queues)
├── routes/dlq.js # Mounted at /api/dlq/:queueName/*
├── public/dlq-monitor.html # ★ Web DLQ monitoring dashboard
└── setup_partners.js # Dev setup script
```
## Key Milestones
| Date | Change |
|------|--------|
| Jul 2025 | RESTful API standardization — consistent `:id` params, soft deletes, controller reuse |
| Aug 2025 | Binary processing architecture — `SatLocBinaryProcessor`, file download + local storage |
| Oct 2025 | Auth refactoring — `authenticate()` vs `authenticateAndCache()` separation; Redis cache |
| Dec 2025 | DLQ Step 8 — queue-native global `/api/dlq/:queueName/*` endpoints; web dashboard |
| Jan 2026 | TaskTracker (Phase 2) — per-task lifecycle tracking with retry chain and stuck-task detection |
## Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `SATLOC_API_ENDPOINT` | `https://www.satloccloudfc.com/api/Satloc` | SatLoc Cloud base URL |
| `SATLOC_API_TIMEOUT` | `30000` | Request timeout (ms) |
| `SATLOC_RETRY_ATTEMPTS` | `3` | Max retry attempts |
| `SATLOC_RETRY_DELAY` | `10000` | Delay between retries (ms) |
| `SATLOC_RATE_LIMIT` | `60` | Requests per minute |
| `SATLOC_BURST_LIMIT` | `10` | Burst request limit |
| `SATLOC_STORAGE_PATH` | `/uploads/satloc` | Local log file storage |
| `SATLOC_MAX_FILE_SIZE` | `10485760` | Max log file size (bytes; 10 MiB) |
| `SATLOC_REALTIME_ENABLED` | `true` | Enable real-time polling |
| `SATLOC_FILE_UPLOAD_ENABLED` | `true` | Enable file download/storage |
| `SATLOC_1ST_ASSIGNMENT_ALWAYS_MATCH` | `true` | Match first assignment when no extJobId |
| `PARTNER_POLLING_ENABLED` | `true` | Enable polling worker |
| `PARTNER_SYNC_ENABLED` | `true` | Enable sync worker |
| `PARTNER_MAX_RETRIES` | `5` | Max retries for stuck tasks |
| `STUCK_TIMEOUT_MS` | `1080000` | Stuck-task detection threshold (ms; 18 minutes) |
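
Values read from an env file arrive as strings, so numeric settings need explicit parsing. The helper below is a hedged sketch of one way to do that, not the project's configuration loader; `readIntEnv` is a hypothetical name. It rejects anything that is not a plain integer, because an expression string such as `18*60*1000` would not be evaluated and `parseInt` would silently return `18`.

```javascript
// Read an integer setting from an env-like object, falling back to a default.
function readIntEnv(env, name, defaultValue) {
  const raw = env[name];
  if (raw === undefined || raw.trim() === '') return defaultValue;
  // Reject anything that is not a plain base-10 integer rather than truncating.
  if (!/^\d+$/.test(raw.trim())) {
    throw new Error(`${name} must be an integer, got "${raw}"`);
  }
  return Number.parseInt(raw.trim(), 10);
}

const exampleEnv = { SATLOC_API_TIMEOUT: '30000', PARTNER_MAX_RETRIES: '' };
console.log(readIntEnv(exampleEnv, 'SATLOC_API_TIMEOUT', 1000)); // 30000
console.log(readIntEnv(exampleEnv, 'PARTNER_MAX_RETRIES', 5)); // 5
```

In an application this would be called with `process.env` as the first argument.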
## Monitoring
### DLQ Web Dashboard
```
http://localhost:4100/public/dlq-monitor.html
```
### DLQ CLI
```bash
node scripts/publish_to_dlq.js --env ./environment.env --queue dev_partner_tasks --count 3
node scripts/test_dlq_e2e.js --env ./environment.env --queue dev_partner_tasks
```
### Debug Logging
```bash
# All partner + satloc modules:
DEBUG=agm:partner*,agm:satloc* node server.js
# Workers:
DEBUG=agm:* node workers/partner_data_polling_worker.js
```
### Log Level Filtering (Pino)
See `PINO_MODULE_FILTERING_GUIDE.md` for filtering by module (e.g., `LOG_MODULES=partner*,satloc*`).
## Benefits
### For Development
- **Simplified Architecture**: Reuses existing User model infrastructure
- **Environment Configuration**: No complex partner management UI needed
- **Easy Testing**: Standard User entity patterns
- **Maintainable**: Fewer moving parts than separate partner models
### For Operations
- **Customer Isolation**: Each customer has their own partner system account
- **Secure Credentials**: Per-customer credentials stored in `PartnerSystemUser` records, with no shared global account
- **Simple Monitoring**: Basic health checks without complex infrastructure
- **Scalable**: Easy to add new customers and partners
### For Business
- **Partner Flexibility**: Easy to add new partners with environment configuration
- **Cost Effective**: No complex monitoring infrastructure required
- **Mixed Assignments**: Support for both internal and partner aircraft in same job
## Troubleshooting
| Problem | Likely Cause | Solution |
|---------|-------------|----------|
| `partner_service_not_implemented` | Unregistered `partnerCode` | Check `helpers/partner_service_factory.js` `serviceMapping` |
| `No SatLoc system user found` | Missing `PartnerSystemUser` record | Run `setup_partners.js` or `POST /api/partners/systemUsers` |
| Auth fails after token expiry | Stale Redis cache | Service auto-retries once; check `REDIS_PWD` and Redis connectivity |
| Log files not processed | `PartnerLogTracker` stuck in `DOWNLOADING` | Startup cleanup resets stuck tasks automatically; tune `STUCK_TIMEOUT_MS` |
| Tasks accumulating in DLQ | Worker processing failures | `POST /api/dlq/partner_tasks/retryAll` or use web dashboard |
| Queue connection lost | RabbitMQ restart | Workers reconnect automatically; in-flight tasks are held in an in-memory offline queue until the connection returns |
## Documentation Index
| Document | Purpose | Status |
|----------|---------|--------|
| `docs/PARTNER_INTEGRATION_ARCHITECTURE.md` | Full architecture, diagrams, current state | ✅ Current |
| `docs/PARTNER_TASK_DATA_FLOW_ANALYSIS.md` | Queue/DLQ message lifecycle (Mermaid) | ✅ Current (Jan 2026) |
| `docs/PARTNER_RESPONSIBILITIES_ANALYSIS.md` | Worker responsibility boundaries | ✅ Current |
| `docs/PARTNER_AUTH_REFACTORING.md` | SatLoc auth separation + Redis cache design | ✅ Current |
| `docs/PARTNER_MODEL_SCHEMA_UPDATES.md` | PartnerSystemUser + Customer schema relationships | ✅ Current |
| `docs/PARTNER_LOG_FILE_PROCESSING.md` | Log processing pipeline | ✅ Current |
| `docs/PARTNER_LOG_DOWNLOAD_IMPLEMENTATION.md` | Polling worker download implementation | ✅ Current |
| `docs/PARTNER_INTEGRATION_IMPLEMENTATION.md` | Implementation patterns and examples | ✅ Current |
| `docs/PARTNER_SYSTEM_REFACTORING_SUMMARY.md` | Historical: Jul 2025 RESTful refactor | 📁 Historical |
| `docs/PARTNER_SYNC_INTEGRATION_SUMMARY.md` | Historical: Aug 2025 sync improvements | 📁 Historical |
| `docs/PARTNER_LOG_MIGRATION_SUMMARY.md` | Historical: log tracker schema migration | 📁 Historical |
| `docs/PARTNER_SYNC_WORKER_REFACTORING.md` | Historical: worker refactoring milestone | 📁 Historical |
| `docs/SATLOC_API_SPECIFICATION.md` | SatLoc external API endpoint reference | ✅ Reference |
| `docs/SATLOC_API_ACTUAL_BEHAVIOR.md` | Documented deviations from spec (important!) | ✅ Reference |
| `docs/SATLOC_BINARY_PROCESSING_ARCHITECTURE.md` | Binary log parsing architecture | ✅ Reference |
| `docs/SATLOC_APPLICATION_PROCESSOR_README.md` | `satloc_application_processor.js` usage | ✅ Reference |
| `docs/SATLOC_ERROR_PATTERNS.md` | Known error patterns and handling | ✅ Reference |
| `docs/SATLOC_LOG_NOTES.md` | Raw log format notes | ✅ Reference |
| `docs/Transland_SATLOC_Log_File_Formats_v3_76.md` | Official SatLoc log format spec | ✅ Reference |
| `docs/SATLOC_COMPLETE_IMPLEMENTATION.md` | Historical: implementation milestone | 📁 Historical |
| `docs/SATLOC_IMPLEMENTATION_SUMMARY.md` | Historical: implementation summary | 📁 Historical |
| `docs/SATLOC_INTEGRATION_SUMMARY.md` | Historical: integration summary | 📁 Historical |
| `docs/SATLOC_TESTING_SUMMARY.md` | Historical: testing milestone | 📁 Historical |