# Partner Data Processing Analysis & Responsibilities

## Current Issues Identified

### 1. ✅ processPartnerAssignment() - Removed Sync Task Queuing

**Location**: `controllers/job.js`

**Change Made**: Removed the delayed `SYNC_PARTNER_DATA` task queuing after successful job upload.

**Rationale**:
- The partner data polling worker (`partner_data_polling_worker.js`) already handles automatic data polling
- It polls for uploaded jobs and processes their log data automatically via `PROCESS_PARTNER_LOG` tasks
- No need to explicitly queue sync tasks since the polling worker discovers and processes data independently

### 2. 🔍 syncDataFromPartner() - Removed

This function was removed from `services/partner_sync_service.js`. The polling worker's cron-driven discovery of new logs via `PROCESS_PARTNER_LOG` tasks fully covers data sync without any explicit manual sync trigger.

## Responsibility Analysis: job_worker vs partner_sync_worker

### 🎯 **Updated Responsibilities (Revised)**

#### **job_worker.js** - Internal Job Processing Only

**Primary Focus**: AgMission internal job processing from uploaded files

**Responsibilities**:
- ✅ Process uploaded job files (SatLog, KML, Shapefile, etc.)
  from users
- ✅ Create ApplicationDetail records from processed internal files
- ✅ Job status management and validation
- ✅ File processing and data extraction for internal uploads
- ✅ Database operations for jobs and applications
- ❌ ~~Handle `UPLOAD_PARTNER_JOB` tasks~~ (moved to partner_sync_worker)

**Task Handling**:
- Internal file processing (various formats)
- ApplicationDetail creation from user uploads

#### **partner_sync_worker.js** - All Partner System Operations

**Primary Focus**: Complete partner system integration and communication

**Responsibilities**:
- ✅ Handle `UPLOAD_PARTNER_JOB` tasks (upload jobs TO partners)
- ✅ Handle `PROCESS_PARTNER_LOG` tasks (process logs FROM partners)
- ✅ Partner system API communication and health monitoring
- ✅ Data synchronization with external systems
- ✅ Error handling and retry logic for partner operations
- ✅ Enhanced matching logic using job ID and aircraft ID
- ✅ Multiple log file grouping under same application

**Enhanced Features**:
- **Smart Matching**: Uses assignment job ID + partner aircraft ID for accurate matching
- **Log Grouping**: Multiple log files from same aircraft/job are grouped under one Application
- **Application Hierarchy**: Application → ApplicationFile (per log) → ApplicationDetails
- **Geographic Matching**: Bounding box overlap calculation for better accuracy
- **Confidence Scoring**: Multi-factor matching with configurable thresholds

### 🔄 **Updated Data Flow Architecture**

```mermaid
flowchart TD
    A[User Uploads<br/>SatLog/KML/SHP] -->|Internal Files| B[job_worker]
    B -->|Creates| C[ApplicationDetail<br/>Database]
    D[partner_sync_worker] -->|UPLOAD_PARTNER_JOB| E[Partner System<br/>SatLoc]
    E -->|Log Data| F[Partner System<br/>Log Files]
    G[partner_polling_worker] -->|Auto Poll| F
    G -->|PROCESS_PARTNER_LOG| D
    D -->|Upload Job| E
```

### 🏗️ **Enhanced Log Processing Logic**

#### **Matching Rules**

1. **Primary Match**: Partner Aircraft ID must match assignment user's partnerAircraftId
2. **Job ID Match**: External Job ID from partner system (highest confidence +0.6)
3. **Time Proximity**: Log time within 7 days of assignment creation (+0.3 max)
4. **Geographic Overlap**: Bounding box intersection with job geometry (+0.2 max)
5. **Confidence Threshold**: Minimum 0.5 required for match acceptance

#### **Application Grouping Logic**

- **Same Application**: Multiple log files from same aircraft + job combination
- **Hierarchy**: `Application` → `ApplicationFile` (per log) → `ApplicationDetail` (per record)
- **Metadata Preservation**: Each log file maintains individual metadata (parse stats, time range, etc.)
- **Incremental Updates**: New logs add to existing application without duplication

## SatLoc Data Mapping Summary

### Core Position Data (Record Type 1)

- **GPS Coordinates**: `lat`, `lon` → Direct mapping to ApplicationDetail
- **Timestamps**: `timestamp` → Converted to Unix epoch (`gpsTime`)
- **Motion Data**: `speed`, `track`, `altitude` → `grSpeed`, `head`, `alt`
- **Spray Status**: `sprayStat` → Direct boolean mapping (0/1)

### Environmental Data Integration

- **Wind Record (Type 50)**: `windSpeed`, `windDirection` → `windSpd`, `windDir`
- **Environmental (Type 110)**: `temperature`, `humidity` → `temp`, `humid`
- **System Monitoring**: Various sensor data mapped to corresponding fields

### Flow & Application Data

- **Flow Monitor (Type 30)**: `pressure`, `flowRate` → `psi`, `lminApp`
- **Target Rates (Type 32)**: `targetRate` → `lminReq`
- **Applied Rates (Type 36)**: `actualRate` → Tracked for accuracy

## Recommendations

### Immediate Actions

1. ✅ **Remove sync task queuing from processPartnerAssignment()** - COMPLETED
2. ✅ **syncDataFromPartner() removed** - polling worker fully covers data discovery

### Architecture Improvements

1. **Clear Separation**: job_worker for job processing, partner_sync_worker for partner communication
2. **Eliminate Redundancy**: Remove duplicate sync mechanisms
3. **Centralized Polling**: Let polling worker handle all partner data discovery
4. **Error Handling**: Improve retry logic for failed partner operations

### Data Processing Efficiency

- ✅ SatLoc parser properly maps all critical fields to ApplicationDetail
- ✅ Batch processing implemented for performance
- ✅ Cron-driven polling discovers new data automatically
- ✅ Error tracking and logging in place

The current architecture is mostly sound; removing the explicit sync task queuing improves efficiency by eliminating redundant data synchronization operations.
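
### Sketch: Confidence Scoring for Log Matching

The matching rules listed under "Enhanced Log Processing Logic" can be illustrated with a small scoring function. This is a minimal sketch, not the code in `partner_sync_worker.js`: the function names, the object shapes (`partnerAircraftId`, `externalJobId`, `startTime`, `createdAt`, `bbox`), the linear decay of the time score over the 7-day window, and the use of an overlap ratio for the geographic score are all assumptions. Only the weights (+0.6, +0.3 max, +0.2 max) and the 0.5 threshold come from the rules themselves.

```javascript
// Hypothetical sketch of the matching rules; names and scaling are assumptions.
const WEIGHTS = { jobId: 0.6, time: 0.3, geo: 0.2 };
const TIME_WINDOW_MS = 7 * 24 * 60 * 60 * 1000; // 7-day proximity window
const MATCH_THRESHOLD = 0.5;

// Fraction of box `a` covered by its intersection with box `b`.
// Boxes are assumed to look like { minLon, minLat, maxLon, maxLat }.
function bboxOverlapRatio(a, b) {
  const w = Math.min(a.maxLon, b.maxLon) - Math.max(a.minLon, b.minLon);
  const h = Math.min(a.maxLat, b.maxLat) - Math.max(a.minLat, b.minLat);
  if (w <= 0 || h <= 0) return 0; // boxes do not intersect
  const areaA = (a.maxLon - a.minLon) * (a.maxLat - a.minLat);
  return areaA > 0 ? (w * h) / areaA : 0;
}

// Times are assumed to be epoch milliseconds.
function scoreMatch(log, assignment) {
  // Rule 1: the aircraft ID is a hard requirement, not a weighted factor.
  if (log.partnerAircraftId !== assignment.partnerAircraftId) {
    return { score: 0, accepted: false };
  }

  let score = 0;

  // Rule 2: an exact external job ID match carries the largest weight.
  if (log.externalJobId != null && log.externalJobId === assignment.externalJobId) {
    score += WEIGHTS.jobId;
  }

  // Rule 3: time proximity, decaying linearly to zero at 7 days (assumed curve).
  const dt = Math.abs(log.startTime - assignment.createdAt);
  if (dt <= TIME_WINDOW_MS) {
    score += WEIGHTS.time * (1 - dt / TIME_WINDOW_MS);
  }

  // Rule 4: geographic overlap, scaled by the overlap ratio (assumed measure).
  if (log.bbox && assignment.bbox) {
    score += WEIGHTS.geo * bboxOverlapRatio(log.bbox, assignment.bbox);
  }

  // Rule 5: accept only at or above the confidence threshold.
  return { score, accepted: score >= MATCH_THRESHOLD };
}
```

Under these assumptions, a log that matches on aircraft ID and external job ID is accepted regardless of time or geometry, because the job ID weight alone clears the 0.5 threshold. Without a job ID match, time proximity and geographic overlap together can reach at most 0.5, so both would need to be near-perfect.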
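
### Sketch: Table-Driven SatLoc Field Mapping

The field mappings in "SatLoc Data Mapping Summary" lend themselves to a lookup table keyed by record type. The sketch below is illustrative only and is not the actual SatLoc parser: the `mapRecord` function, the `type` property on parsed records, and the choice of epoch seconds for `gpsTime` are assumptions. The source and destination field names are taken from the summary. Type 36 (`actualRate`) is left out because the summary names no destination field for it.

```javascript
// Hypothetical lookup table: SatLoc record type -> { sourceField: destField }.
const FIELD_MAP = {
  1: { lat: "lat", lon: "lon", speed: "grSpeed", track: "head", altitude: "alt", sprayStat: "sprayStat" },
  30: { pressure: "psi", flowRate: "lminApp" },
  32: { targetRate: "lminReq" },
  50: { windSpeed: "windSpd", windDirection: "windDir" },
  110: { temperature: "temp", humidity: "humid" },
};

// Maps one parsed record to a partial ApplicationDetail-style object.
// `record.type` is an assumed property holding the SatLoc record type number.
function mapRecord(record) {
  const mapping = FIELD_MAP[record.type];
  if (!mapping) return {}; // unmapped record types contribute no fields

  const out = {};
  for (const [src, dest] of Object.entries(mapping)) {
    if (record[src] !== undefined) out[dest] = record[src];
  }

  // Position records also carry a timestamp, converted here to Unix epoch
  // seconds (the unit is an assumption; the summary only says "Unix epoch").
  if (record.type === 1 && record.timestamp !== undefined) {
    out.gpsTime = Math.floor(new Date(record.timestamp).getTime() / 1000);
  }
  return out;
}

// Usage: a wind record yields only the wind fields.
console.log(mapRecord({ type: 50, windSpeed: 4, windDirection: 270 }));
```

Keeping the mapping in data rather than in branching code means a new record type can be supported by adding one table entry, and the table doubles as documentation of the mapping.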