Partner Data Processing Analysis & Responsibilities
Current Issues Identified
1. ✅ processPartnerAssignment() - Removed Sync Task Queuing
Location: controllers/job.js
Change Made: Removed the delayed SYNC_PARTNER_DATA task queuing after successful job upload.
Rationale:
- The partner data polling worker (partner_data_polling_worker.js) already handles automatic data polling
- It polls for uploaded jobs and processes their log data automatically via PROCESS_PARTNER_LOG tasks
- No need to explicitly queue sync tasks since the polling worker discovers and processes data independently
2. ✅ syncDataFromPartner() - Removed
This function was removed from services/partner_sync_service.js. The polling worker's cron-driven discovery of new logs via PROCESS_PARTNER_LOG tasks fully covers data sync without any explicit manual sync trigger.
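A minimal sketch of why an explicit sync trigger is redundant, assuming the polling cycle is shaped roughly like the function below. The names pollPartnerLogs, listUploadedJobs, fetchNewLogs, and enqueue are hypothetical stand-ins, not the actual API of partner_data_polling_worker.js; only the PROCESS_PARTNER_LOG task type comes from this document.

```javascript
// Hypothetical polling cycle. Dependencies are injected so the sketch runs
// standalone; the real worker's function names and queue API may differ.
async function pollPartnerLogs({ listUploadedJobs, fetchNewLogs, enqueue }) {
  let queued = 0;
  // Discover every job that has already been uploaded to the partner.
  const jobs = await listUploadedJobs();
  for (const job of jobs) {
    // Ask the partner for logs not yet seen for this job.
    const logs = await fetchNewLogs(job);
    for (const log of logs) {
      // One processing task per new log; no separate sync task is needed.
      await enqueue({ type: 'PROCESS_PARTNER_LOG', jobId: job.id, logId: log.id });
      queued += 1;
    }
  }
  return queued;
}
```

Because each scheduled run walks all uploaded jobs, a job uploaded a moment ago is picked up on the next cycle without the upload path queuing anything itself.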
Responsibility Analysis: job_worker vs partner_sync_worker
🎯 Updated Responsibilities (Revised)
job_worker.js - Internal Job Processing Only
Primary Focus: AgMission internal job processing from uploaded files
Responsibilities:
- ✅ Process uploaded job files (SatLog, KML, Shapefile, etc.) from users
- ✅ Create ApplicationDetail records from processed internal files
- ✅ Job status management and validation
- ✅ File processing and data extraction for internal uploads
- ✅ Database operations for jobs and applications
- ❌ Handle UPLOAD_PARTNER_JOB tasks (moved to partner_sync_worker)
Task Handling:
- Internal file processing (various formats)
- ApplicationDetail creation from user uploads
partner_sync_worker.js - All Partner System Operations
Primary Focus: Complete partner system integration and communication
Responsibilities:
- ✅ Handle UPLOAD_PARTNER_JOB tasks (upload jobs TO partners)
- ✅ Handle PROCESS_PARTNER_LOG tasks (process logs FROM partners)
- ✅ Partner system API communication and health monitoring
- ✅ Data synchronization with external systems
- ✅ Error handling and retry logic for partner operations
- ✅ Enhanced matching logic using job ID and aircraft ID
- ✅ Multiple log file grouping under same application
Enhanced Features:
- Smart Matching: Uses assignment job ID + partner aircraft ID for accurate matching
- Log Grouping: Multiple log files from same aircraft/job are grouped under one Application
- Application Hierarchy: Application → ApplicationFile (per log) → ApplicationDetails
- Geographic Matching: Bounding box overlap calculation for better accuracy
- Confidence Scoring: Multi-factor matching with configurable thresholds
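The split of task types between the two workers can be sketched as a small ownership check. This is an illustration only: ownerOf, assertHandledBy, and the PROCESS_UPLOADED_FILE task name in the usage note are hypothetical, and the real workers may route tasks differently.

```javascript
// Task types that this document assigns to partner_sync_worker.
const PARTNER_TASKS = new Set(['UPLOAD_PARTNER_JOB', 'PROCESS_PARTNER_LOG']);

// Everything that is not a partner task is treated as internal job processing.
function ownerOf(taskType) {
  return PARTNER_TASKS.has(taskType) ? 'partner_sync_worker' : 'job_worker';
}

// Guard a worker can call before handling a task it should not own.
function assertHandledBy(worker, taskType) {
  const owner = ownerOf(taskType);
  if (owner !== worker) {
    throw new Error(`${taskType} belongs to ${owner}, not ${worker}`);
  }
}
```

For example, assertHandledBy('job_worker', 'UPLOAD_PARTNER_JOB') throws, while assertHandledBy('job_worker', 'PROCESS_UPLOADED_FILE') passes.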
🔄 Updated Data Flow Architecture
flowchart TD
A[User Uploads<br/>SatLog/KML/SHP] -->|Internal Files| B[job_worker]
B -->|Creates| C[ApplicationDetail<br/>Database]
D[partner_sync_worker] -->|UPLOAD_PARTNER_JOB| E[Partner System<br/>SatLoc]
E -->|Log Data| F[Partner System<br/>Log Files]
G[partner_data_polling_worker] -->|Auto Poll| F
G -->|PROCESS_PARTNER_LOG| D
D -->|Upload Job| E
🏗️ Enhanced Log Processing Logic
Matching Rules
- Primary Match: Partner Aircraft ID must match assignment user's partnerAircraftId
- Job ID Match: External Job ID from partner system (highest confidence +0.6)
- Time Proximity: Log time within 7 days of assignment creation (+0.3 max)
- Geographic Overlap: Bounding box intersection with job geometry (+0.2 max)
- Confidence Threshold: Minimum 0.5 required for match acceptance
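The rules above can be sketched as a scoring function. The weights (0.6, 0.3, 0.2), the 7-day window, and the 0.5 threshold come from the list; everything else is an assumption made for illustration: the field names on log and assignment, the linear decay of the time score, and scaling the geographic score by the fraction of the log's bounding box that falls inside the job's.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Fraction of box `a` covered by box `b`.
// Boxes are { minLon, minLat, maxLon, maxLat } (hypothetical shape).
function bboxOverlapRatio(a, b) {
  const w = Math.min(a.maxLon, b.maxLon) - Math.max(a.minLon, b.minLon);
  const h = Math.min(a.maxLat, b.maxLat) - Math.max(a.minLat, b.minLat);
  const area = (a.maxLon - a.minLon) * (a.maxLat - a.minLat);
  if (w <= 0 || h <= 0 || area <= 0) return 0;
  return (w * h) / area;
}

// `log.time` and `assignment.createdAt` are epoch milliseconds (assumption).
function scoreMatch(log, assignment, threshold = 0.5) {
  // Primary rule: the aircraft IDs must match, otherwise there is no match.
  if (log.aircraftId !== assignment.partnerAircraftId) {
    return { score: 0, accepted: false };
  }
  let score = 0;
  // External job ID match: strongest signal.
  if (log.externalJobId != null && log.externalJobId === assignment.externalJobId) {
    score += 0.6;
  }
  // Time proximity: up to 0.3, decaying linearly to 0 at 7 days (assumed curve).
  const ageDays = Math.abs(log.time - assignment.createdAt) / DAY_MS;
  if (ageDays <= 7) score += 0.3 * (1 - ageDays / 7);
  // Geographic overlap: up to 0.2, scaled by overlap fraction (assumed scaling).
  if (log.bbox && assignment.bbox) {
    score += 0.2 * bboxOverlapRatio(log.bbox, assignment.bbox);
  }
  return { score, accepted: score >= threshold };
}
```

With these weights, time proximity and geographic overlap alone reach at most 0.5, so a log without a matching external job ID is accepted only when both signals are at their maximum.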
Application Grouping Logic
- Same Application: Multiple log files from same aircraft + job combination
- Hierarchy: Application → ApplicationFile (per log) → ApplicationDetail (per record)
- Metadata Preservation: Each log file maintains individual metadata (parse stats, time range, etc.)
- Incremental Updates: New logs add to existing application without duplication
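A minimal sketch of the grouping described above, using plain objects in place of the real Application, ApplicationFile, and ApplicationDetail models. The groupLogs function, the aircraftId:jobId key, and de-duplication by file name are assumptions for illustration; the actual worker may identify duplicates differently.

```javascript
// Groups parsed logs into one application per aircraft + job combination.
// Passing an existing Map lets later batches extend earlier applications.
function groupLogs(logs, applications = new Map()) {
  for (const log of logs) {
    const key = `${log.aircraftId}:${log.jobId}`;
    let app = applications.get(key);
    if (!app) {
      app = { aircraftId: log.aircraftId, jobId: log.jobId, files: [] };
      applications.set(key, app);
    }
    // Incremental update: skip a log file that is already attached.
    if (app.files.some((f) => f.fileName === log.fileName)) continue;
    app.files.push({
      fileName: log.fileName,
      metadata: log.metadata, // per-file parse stats, time range, etc.
      details: log.records,   // one detail entry per parsed record
    });
  }
  return applications;
}
```

Calling groupLogs again with the same Map and a file that was already processed leaves the application unchanged, which is the no-duplication behaviour the list describes.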
SatLoc Data Mapping Summary
Core Position Data (Record Type 1)
- GPS Coordinates: lat, lon → Direct mapping to ApplicationDetail
- Timestamps: timestamp → Converted to Unix epoch (gpsTime)
- Motion Data: speed, track, altitude → grSpeed, head, alt
- Spray Status: sprayStat → Direct boolean mapping (0/1)
Environmental Data Integration
- Wind Record (Type 50): windSpeed, windDirection → windSpd, windDir
- Environmental (Type 110): temperature, humidity → temp, humid
- System Monitoring: Various sensor data mapped to corresponding fields
Flow & Application Data
- Flow Monitor (Type 30): pressure, flowRate → psi, lminApp
- Target Rates (Type 32): targetRate → lminReq
- Applied Rates (Type 36): actualRate → Tracked for accuracy
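The field mappings above can be expressed as a lookup table keyed by record type. This is a sketch, not the actual SatLoc parser: mapRecord and FIELD_MAP are hypothetical names, the target name for sprayStat is assumed to be unchanged, and the timestamp is assumed to be parseable by Date with gpsTime stored in whole seconds. Type 36 is left out because the summary gives no target field for actualRate.

```javascript
// Source field -> ApplicationDetail field, per record type.
const FIELD_MAP = new Map([
  [1, { lat: 'lat', lon: 'lon', speed: 'grSpeed', track: 'head', altitude: 'alt', sprayStat: 'sprayStat' }],
  [30, { pressure: 'psi', flowRate: 'lminApp' }],
  [32, { targetRate: 'lminReq' }],
  [50, { windSpeed: 'windSpd', windDirection: 'windDir' }],
  [110, { temperature: 'temp', humidity: 'humid' }],
]);

function mapRecord(type, record) {
  const fields = FIELD_MAP.get(type);
  if (!fields) return null; // record type not covered by this sketch
  const out = {};
  for (const [from, to] of Object.entries(fields)) {
    if (record[from] !== undefined) out[to] = record[from];
  }
  if (type === 1 && record.timestamp !== undefined) {
    // Assumption: Date-parseable input, Unix epoch stored in whole seconds.
    out.gpsTime = Math.floor(new Date(record.timestamp).getTime() / 1000);
  }
  return out;
}
```

Keeping the mapping in data rather than in per-type branches makes it easy to compare against this summary when a field is added or renamed.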
Recommendations
Immediate Actions
- ✅ Remove sync task queuing from processPartnerAssignment() - COMPLETED
- ✅ syncDataFromPartner() removed - polling worker fully covers data discovery
Architecture Improvements
- Clear Separation: job_worker for job processing, partner_sync_worker for partner communication
- Eliminate Redundancy: Remove duplicate sync mechanisms
- Centralized Polling: Let polling worker handle all partner data discovery
- Error Handling: Improve retry logic for failed partner operations
Data Processing Efficiency
- ✅ SatLoc parser properly maps all critical fields to ApplicationDetail
- ✅ Batch processing implemented for performance
- ✅ Cron-driven polling discovers new data automatically
- ✅ Error tracking and logging in place
The current architecture is mostly sound. With the explicit sync task queuing removed, the polling worker is the single path for partner data discovery, which eliminates redundant data synchronization operations.