# TaskTracker Integration Plan

## Overview

This document outlines the phased integration of TaskTracker into the partner_tasks queue as a pilot, with the goal of eventually rolling it out to all queue types.

## Implementation Status

### ✅ Phase 1: Foundation (COMPLETED)

- [x] TaskTracker model created (`model/task_tracker.js`)
- [x] Task ID generator service created (`services/task_id_generator.js`)
- [x] Test script created (`tests/test_task_tracker_2key.js`)
- [x] Documentation created (`docs/TASK_TRACKER_2KEY_DESIGN.md`)
- [x] Architecture diagrams moved to current docs

### 🔄 Phase 2: Partner Queue Integration (IN PROGRESS)

**Target**: Integrate TaskTracker with the partner_tasks queue alongside the existing PartnerLogTracker

#### 2.1 Parallel Tracking Implementation

**Files to Modify**:

1. `workers/partner_data_polling_worker.js` - Add TaskTracker creation at enqueue time
2. `workers/partner_sync_worker.js` - Add TaskTracker updates during processing

**Strategy**: Run both tracking systems in parallel

- Continue updating PartnerLogTracker (existing functionality)
- Add TaskTracker operations (new functionality)
- Log differences for validation
- No breaking changes to existing system

#### 2.2 Integration Points

**A. Enqueue Time** (`partner_data_polling_worker.js`):

```javascript
// Location: When enqueueing PROCESS_PARTNER_LOG tasks
// Current: Only creates/updates PartnerLogTracker
// Add: Create TaskTracker entry with taskId and executionId
```

**B. Processing Start** (`partner_sync_worker.js`):

```javascript
// Location: processPartnerLog() function start
// Current: Claims PartnerLogTracker with atomic update
// Add: Claim TaskTracker with atomic update using taskId + executionId
```

**C. Processing Success** (`partner_sync_worker.js`):

```javascript
// Location: After successful log processing
// Current: Updates PartnerLogTracker to PROCESSED
// Add: Update TaskTracker to completed status
```

**D. Processing Failure** (`partner_sync_worker.js`):

```javascript
// Location: Error handling in catch blocks
// Current: Updates PartnerLogTracker status and retryCount
// Add: Update TaskTracker status, error details, and retryCount
```

### ⏸️ Phase 3: Validation Period (PLANNED)

**Duration**: 2-4 weeks

**Activities**:

- Monitor both tracking systems side by side
- Compare data consistency between PartnerLogTracker and TaskTracker
- Validate that deduplication logic prevents duplicate enqueues
- Verify that idempotency prevents duplicate processing
- Test retry chain tracing via taskId
- Run performance tests (query speed, memory usage)

**Success Criteria**:

- [ ] 100% data consistency between trackers
- [ ] Zero duplicate tasks created
- [ ] Zero duplicate processing events
- [ ] Complete retry chains traceable via taskId
- [ ] No performance degradation
- [ ] No errors in production logs

### ⏸️ Phase 4: Switch to TaskTracker (PLANNED)

**Prerequisites**: All Phase 3 success criteria met

**Changes**:

1. Update API endpoints to query TaskTracker instead of PartnerLogTracker
2. Update monitoring dashboards to use TaskTracker metrics
3. Workers continue updating both systems (safety net)

**Rollback Plan**: Switch API/monitoring back to PartnerLogTracker if issues arise

### ⏸️ Phase 5: Deprecate PartnerLogTracker (FUTURE)

**Timeline**: 3+ months after Phase 4

**Activities**:

- Remove PartnerLogTracker update calls from workers
- Archive PartnerLogTracker data
- Remove PartnerLogTracker model and indexes
- Clean up legacy code references

### ⏸️ Phase 6: Expand to Other Queues (FUTURE)

**Target Queues**: dev_jobs, jobs, notifications (if created)

**Per-Queue Rollout**:

1. Implement TaskTracker in target queue
2. Run parallel tracking for validation period
3. Switch to TaskTracker
4. Deprecate old tracking (if any exists)

## Code Changes Required

### Phase 2 Implementation

#### File 1: `workers/partner_data_polling_worker.js`

**Location**: Where tasks are enqueued (around lines 600-700)

**Add Imports**:

```javascript
const TaskTracker = require('../model/task_tracker');
const { TaskTrackerStatus } = require('../model/task_tracker');
const { generateTaskId, generateExecutionId } = require('../services/task_id_generator');
```

**Modify Enqueue Logic**:

```javascript
// After PartnerLogTracker update, before taskQHelper.addTaskASync()
// PARTNER_QUEUE is assumed to be defined already in this worker.

// Generate TaskTracker IDs
const taskId = generateTaskId(PARTNER_QUEUE, {
  partnerCode: group.partnerCode,
  aircraftId,
  logId: logInfo.id
});
const executionId = generateExecutionId();

// Check for a recent duplicate (deduplication) within a 5-minute window.
// Note: this check-then-create sequence is not atomic, so two pollers running
// concurrently can both pass the check.
const DEDUP_WINDOW_MS = 5 * 60 * 1000;
const recentTask = await TaskTracker.findOne({
  taskId,
  status: { $in: [TaskTrackerStatus.QUEUED, TaskTrackerStatus.PROCESSING] },
  enqueuedAt: { $gt: new Date(Date.now() - DEDUP_WINDOW_MS) }
}).lean();

if (recentTask) {
  pino.debug({
    taskId,
    existingExecutionId: recentTask.executionId
  }, 'Task already queued/processing, skipping duplicate');
  continue; // Skip enqueue (this code runs inside the enqueue loop)
}

// Create TaskTracker entry
// (enqueuedAt is assumed to be populated by a schema default)
await TaskTracker.create({
  taskId,
  executionId,
  queueName: PARTNER_QUEUE,
  status: TaskTrackerStatus.QUEUED,
  metadata: {
    partnerCode: group.partnerCode,
    aircraftId,
    logId: logInfo.id,
    customerId: group.customerId,
    logFileName: logInfo.logFileName,
    uploadedDate: logInfo.uploadedDate,
    localFilePath: downloadedPath
  }
});

// Add IDs to task message
const taskData = {
  ...existingTaskData,
  taskId,      // Add for tracking
  executionId  // Add for idempotency
};

await taskQHelper.addTaskASync(PartnerTasks.PROCESS_PARTNER_LOG, taskData);
```

#### File 2: `workers/partner_sync_worker.js`

**Location**: `processPartnerLog()` function

**Add Imports**:

```javascript
const TaskTracker = require('../model/task_tracker');
const { TaskTrackerStatus, ErrorCategory } = require('../model/task_tracker');
```

**Modify Processing Start**:

```javascript
// At start of processPartnerLog(), after extracting taskData.
// Declare these outside the try block so the catch block can also use them.
const { taskId, executionId } = taskData;

// Atomic claim with TaskTracker (idempotency check)
if (taskId && executionId) {
  const taskTracker = await TaskTracker.findOneAndUpdate(
    {
      taskId,
      executionId,
      status: { $in: [TaskTrackerStatus.QUEUED, TaskTrackerStatus.FAILED] }
    },
    {
      $set: {
        status: TaskTrackerStatus.PROCESSING,
        processingStartedAt: new Date()
      }
    },
    { new: true }
  );

  if (!taskTracker) {
    pino.warn({ taskId, executionId }, 'Task already claimed or completed, skipping');
    return { skipped: true, reason: 'already_processed' };
  }
}

// Continue with existing PartnerLogTracker claim...
```

**Modify Success Handler**:

```javascript
// After successful processing, before PartnerLogTracker update
if (taskId && executionId) {
  await TaskTracker.updateOne(
    { taskId, executionId },
    {
      $set: {
        status: TaskTrackerStatus.COMPLETED,
        completedAt: new Date()
      }
    }
  );
}

// Continue with existing PartnerLogTracker update...
```

**Modify Error Handler**:

```javascript
// In catch block, after error logging
if (taskId && executionId) {
  try {
    // Determine error category
    const errorCategory = categorizeError(error);

    // Check retry eligibility
    const taskTracker = await TaskTracker.findOne({ taskId, executionId }).lean();
    const canRetry = Boolean(taskTracker) && taskTracker.retryCount < taskTracker.maxRetries;

    await TaskTracker.updateOne(
      { taskId, executionId },
      {
        $set: {
          status: canRetry ? TaskTrackerStatus.FAILED : TaskTrackerStatus.DLQ,
          errorMessage: error.message,
          errorCategory,
          errorStack: error.stack,
          failedAt: new Date()
        },
        $inc: { retryCount: 1 }
      }
    );
  } catch (trackerError) {
    // A TaskTracker failure must not mask the original processing error
    pino.error({ taskId, executionId, err: trackerError },
      'Failed to update TaskTracker after processing error');
  }
}

// Continue with existing error handling...
```

**Add Error Categorization Helper**:

```javascript
// Uses ErrorCategory from the imports added at the top of this file.
// Checks run in order, so the first matching category wins.
function categorizeError(error) {
  // Guard against thrown values that are not Error instances
  const message = String((error && error.message) || error || '').toLowerCase();

  if (message.includes('timeout') || message.includes('econnrefused') || message.includes('network')) {
    return ErrorCategory.TRANSIENT;
  }
  if (message.includes('invalid') || message.includes('missing') || message.includes('required')) {
    return ErrorCategory.VALIDATION;
  }
  if (message.includes('parse') || message.includes('format')) {
    return ErrorCategory.PROCESSING;
  }
  if (message.includes('database') || message.includes('mongo') || message.includes('fs ')) {
    return ErrorCategory.INFRASTRUCTURE;
  }
  if (message.includes('partner') || message.includes('api') || message.includes('satloc')) {
    return ErrorCategory.PARTNER_API;
  }

  return ErrorCategory.UNKNOWN;
}
```

## Testing Plan

### Unit Tests

- [ ] TaskTracker model creation and validation
- [ ] TaskId generation determinism
- [ ] ExecutionId uniqueness
- [ ] Status transitions
- [ ] Error categorization

### Integration Tests

- [ ] Enqueue with TaskTracker creation
- [ ] Deduplication prevents duplicate enqueues
- [ ] Idempotency prevents duplicate processing
- [ ] Successful processing updates both trackers
- [ ] Failed processing updates both trackers
- [ ] Retry chain via taskId query

### Load Tests

- [ ] 1000 concurrent enqueues (measure deduplication)
- [ ] 100 concurrent workers processing same queue
- [ ] Query performance with 100k+ TaskTracker records

## Monitoring & Metrics

### New Metrics to Track

- TaskTracker vs PartnerLogTracker consistency rate
- Deduplication rate (skipped enqueues)
- Idempotency effectiveness (skipped processing)
- Query performance (TaskTracker vs PartnerLogTracker)
- Memory usage with parallel tracking

### Alerts to Configure

- Inconsistency between trackers > 1%
- TaskTracker query latency > 500ms
- Failed TaskTracker operations
- Stuck tasks (PROCESSING > 30 minutes)

## Rollback Plan

### If Issues Arise in Phase 2:

1. Remove TaskTracker calls from workers (git revert)
2. Deploy previous version
3. No data loss: PartnerLogTracker is still the primary tracker

### If Issues Arise in Phase 4:

1. Switch API/monitoring back to PartnerLogTracker
2. Workers are still updating both trackers (no code change needed)
3. Investigate and fix TaskTracker issues

## Timeline

- **Phase 2**: 1-2 days (implementation + initial testing)
- **Phase 3**: 2-4 weeks (validation period)
- **Phase 4**: 1 week (switch + monitoring)
- **Phase 5**: After 3+ months of stable operation
- **Phase 6**: Per-queue rollout (1 month per queue)

## Success Metrics

- [ ] Zero duplicate tasks created
- [ ] Zero duplicate processing events
- [ ] 100% data consistency
- [ ] <10ms query performance overhead
- [ ] <5MB memory overhead per 1000 tasks
- [ ] Complete retry chains traceable
- [ ] Zero production errors related to TaskTracker

---

**Status**: Phase 1 Complete, Phase 2 Ready to Start
**Next Action**: Implement Phase 2 changes in workers
**Owner**: Development Team
**Last Updated**: January 21, 2026
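
## Appendix: Sketch of Deterministic Task IDs

The plan depends on two properties: a `taskId` that is deterministic (the same partner log always maps to the same ID, which is what makes deduplication and retry-chain queries possible) and an `executionId` that is unique per enqueue. The actual `services/task_id_generator.js` is not reproduced in this document, so the following is only a minimal sketch of how such a pair could behave. It assumes a SHA-256 hash over the sorted key fields and Node's `crypto.randomUUID()`; the function bodies, the ID format, and the sample values are hypothetical.

```javascript
const crypto = require('crypto');

// Hypothetical sketch: derive a stable ID from the queue name and key fields.
function generateTaskId(queueName, keyFields) {
  // Sort keys so that property order in the caller does not change the hash.
  const canonical = Object.keys(keyFields)
    .sort()
    .map((key) => `${key}=${String(keyFields[key])}`)
    .join('&');
  const digest = crypto
    .createHash('sha256')
    .update(`${queueName}|${canonical}`)
    .digest('hex');
  // Truncated for readability; the length is an arbitrary choice for this sketch.
  return `${queueName}:${digest.slice(0, 32)}`;
}

// Hypothetical sketch: a random UUID per enqueue attempt.
function generateExecutionId() {
  return crypto.randomUUID();
}

// Sample values below are made up for illustration.
const a = generateTaskId('partner_tasks', { partnerCode: 'P1', aircraftId: 'A7', logId: 42 });
const b = generateTaskId('partner_tasks', { logId: 42, aircraftId: 'A7', partnerCode: 'P1' });
console.log(a === b); // true: same fields, different property order
console.log(generateExecutionId() !== generateExecutionId()); // true
```

Sorting the keys before hashing is the design choice that matters here: without it, two callers passing the same fields in a different order would produce different IDs and defeat deduplication.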
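
## Appendix: Sketch of a Tracker Consistency Check

Phase 3 calls for comparing PartnerLogTracker and TaskTracker data, and the alert list sets a threshold of more than 1% inconsistency. One way to compute that rate is sketched below as a pure function over plain records, so it can be tested without a database. Only the `PROCESSED` to `COMPLETED` correspondence comes from this plan; the other status names in the mapping, the record shape `{ logId, status }`, and the function name `compareTrackers` are assumptions. The sketch also assumes one TaskTracker record per log, so retries would need to be reduced to the latest execution first.

```javascript
// Hypothetical mapping from a PartnerLogTracker status to the TaskTracker
// status it should correspond to. Only PROCESSED -> COMPLETED is taken from
// the plan; the remaining entries are placeholders.
const EXPECTED_STATUS = {
  PENDING: 'QUEUED',
  PROCESSING: 'PROCESSING',
  PROCESSED: 'COMPLETED',
  FAILED: 'FAILED',
};

// Compare two lists of { logId, status } records and report mismatches.
function compareTrackers(partnerLogRecords, taskTrackerRecords, expected = EXPECTED_STATUS) {
  const taskStatusByLogId = new Map(
    taskTrackerRecords.map((record) => [String(record.logId), record.status])
  );
  const mismatches = [];
  for (const record of partnerLogRecords) {
    const actual = taskStatusByLogId.get(String(record.logId));
    const wanted = expected[record.status];
    // A missing TaskTracker record or an unmapped status counts as a mismatch.
    if (actual === undefined || actual !== wanted) {
      mismatches.push({
        logId: record.logId,
        partnerLogStatus: record.status,
        taskTrackerStatus: actual === undefined ? null : actual,
      });
    }
  }
  const total = partnerLogRecords.length;
  const inconsistencyRate = total === 0 ? 0 : mismatches.length / total;
  // Threshold from the alert list: inconsistency between trackers > 1%.
  return { total, mismatches, inconsistencyRate, alert: inconsistencyRate > 0.01 };
}

const report = compareTrackers(
  [
    { logId: 1, status: 'PROCESSED' },
    { logId: 2, status: 'PROCESSED' },
    { logId: 3, status: 'FAILED' },
    { logId: 4, status: 'PROCESSED' },
  ],
  [
    { logId: 1, status: 'COMPLETED' },
    { logId: 2, status: 'PROCESSING' }, // mismatch: expected COMPLETED
    { logId: 3, status: 'FAILED' },
    // logId 4 has no TaskTracker record
  ]
);
console.log(report.inconsistencyRate); // 0.5
console.log(report.alert); // true
```

In practice the two input lists would come from queries against each collection over the same time window; keeping the comparison itself free of database calls makes the 1% threshold easy to unit test.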
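
## Appendix: Sketch of Stuck-Task Detection

The alert list includes stuck tasks, defined as tasks in PROCESSING for more than 30 minutes. The check can be expressed as a small filter over TaskTracker records using the `processingStartedAt` field that the claim step sets. This is a sketch under assumptions: the status string `'PROCESSING'`, the function name `findStuckTasks`, and the sample records are hypothetical, and a production version would more likely run the equivalent filter as a database query.

```javascript
const STUCK_THRESHOLD_MS = 30 * 60 * 1000; // 30 minutes, from the alert list

// Return the records that have been in PROCESSING longer than the threshold.
function findStuckTasks(tasks, now = new Date(), thresholdMs = STUCK_THRESHOLD_MS) {
  return tasks.filter((task) =>
    task.status === 'PROCESSING' &&
    task.processingStartedAt instanceof Date &&
    now.getTime() - task.processingStartedAt.getTime() > thresholdMs
  );
}

// Sample records are made up for illustration.
const now = new Date('2026-01-21T12:00:00Z');
const stuck = findStuckTasks(
  [
    { executionId: 'e1', status: 'PROCESSING', processingStartedAt: new Date('2026-01-21T11:00:00Z') },
    { executionId: 'e2', status: 'PROCESSING', processingStartedAt: new Date('2026-01-21T11:45:00Z') },
    { executionId: 'e3', status: 'COMPLETED', processingStartedAt: new Date('2026-01-21T10:00:00Z') },
  ],
  now
);
console.log(stuck.map((task) => task.executionId)); // [ 'e1' ]
```

Passing `now` as a parameter rather than reading the clock inside the function keeps the check deterministic in tests.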