# Partner DLQ API Endpoints

## Overview

RESTful API endpoints for monitoring and managing the Partner Dead Letter Queue (DLQ). These endpoints allow administrators to view statistics, process failed messages, retry tasks, and perform maintenance operations.

## Authentication

All DLQ endpoints require admin authentication. Include the authentication token in the request headers:

```
Authorization: Bearer <token>
```

## Endpoints

### 1. Get DLQ Statistics

Get comprehensive statistics about the DLQ and partner log processing status.

**Endpoint:** `GET /api/dlq/partner_tasks/stats`

**Authentication:** Required (Admin)

**Response:**
```json
{
  "dlq": {
    "messageCount": 5,
    "consumerCount": 0,
    "queueName": "partner_tasks_failed"
  },
  "trackers": {
    "failed": 12,
    "processing": 3,
    "downloaded": 8,
    "processed": 245,
    "archived": 7
  },
  "recentFailures": [
    {
      "id": "507f1f77bcf86cd799439011",
      "logFileName": "application_20250101_120000.log",
      "partner": {
        "id": "507f1f77bcf86cd799439012",
        "name": "SatLoc Systems",
        "code": "SATLOC"
      },
      "customer": {
        "id": "507f1f77bcf86cd799439013",
        "name": "John Doe",
        "username": "john@example.com"
      },
      "errorMessage": "Connection timeout",
      "retryCount": 3,
      "failedAt": "2025-10-02T10:30:00.000Z"
    }
  ]
}
```

**Example:**
```bash
curl -X GET http://localhost:3000/api/dlq/partner_tasks/stats \
  -H "Authorization: Bearer <token>"
```

---

### 2. Get DLQ Messages

Retrieve messages from the Dead Letter Queue without consuming them (peek mode).

**Endpoint:** `GET /api/dlq/partner_tasks/messages`

**Authentication:** Required (Admin)

**Query Parameters:**
- `limit` (optional): Maximum number of messages to retrieve (default: 50)

**Response:**
```json
{
  "messages": [
    {
      "taskInfo": {
        "logFileName": "application_20250101_120000.log",
        "partnerId": "507f1f77bcf86cd799439012",
        "customerId": "507f1f77bcf86cd799439013"
      },
      "errorMessage": "Connection timeout",
      "retryCount": 3,
      "enqueuedAt": "2025-10-02T10:00:00.000Z",
      "headers": {
        "x-death": [...]
      }
    }
  ]
}
```

**Example:**
```bash
curl -X GET "http://localhost:3000/api/dlq/partner_tasks/messages?limit=20" \
  -H "Authorization: Bearer <token>"
```

---

### 3. Process DLQ

Process messages in the Dead Letter Queue: categorize each error, then automatically retry or archive the message based on error type and age.

**Endpoint:** `POST /api/dlq/:queueName/process`

**Authentication:** Required (Admin)

**Request Body:**
```json
{
  "maxMessages": 100,
  "dryRun": false
}
```

**Parameters:**
- `maxMessages` (optional): Maximum number of messages to process (default: 100)
- `dryRun` (optional): If true, analyze without taking action (default: false)

**Response:**
```json
{
  "processed": 15,
  "retried": 8,
  "archived": 5,
  "categorization": {
    "transient": 8,
    "validation": 3,
    "processing": 2,
    "infrastructure": 1,
    "partner_api": 1,
    "unknown": 0
  },
  "dryRun": false,
  "timestamp": "2025-10-02T11:00:00.000Z"
}
```

**Error Categories:**
- **transient**: Network timeouts, temporary connection issues (auto-retried within 2h window)
- **validation**: Invalid data, missing fields (archived immediately)
- **processing**: Calculation errors, parsing errors (kept for review)
- **infrastructure**: Database errors, filesystem errors (retried with backoff)
- **partner_api**: API authentication failures, rate limiting (retried with delay)
- **unknown**: Unclassified errors (kept for review)

**Example:**
```bash
# Process DLQ
curl -X POST http://localhost:3000/api/dlq/partner_tasks/process \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"maxMessages": 50, "dryRun": false}'

# Dry run (analyze only)
curl -X POST http://localhost:3000/api/dlq/partner_tasks/process \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"dryRun": true}'
```

---

### 4. Retry All Messages

Retry all messages currently in the DLQ by moving them back to the main queue.

**Endpoint:** `POST /api/dlq/:queueName/retryAll`

**Authentication:** Required (Admin)

**URL Parameters:**
- `queueName`: Queue name (e.g., "partner_tasks")

**Response:**
```json
{
  "success": true,
  "message": "Retried 15 messages from DLQ",
  "retriedCount": 15,
  "queueName": "partner_tasks"
}
```

**Example:**
```bash
curl -X POST http://localhost:3000/api/dlq/partner_tasks/retryAll \
  -H "Authorization: Bearer <token>"
```

---

### 5. Retry Messages by Position

Retry messages from specific positions in the DLQ.

**Endpoint:** `POST /api/dlq/:queueName/retryByPosition`

**Authentication:** Required (Admin)

**URL Parameters:**
- `queueName`: Queue name

**Request Body:**
```json
{
  "startPosition": 1,
  "endPosition": 10
}
```

**Response:**
```json
{
  "success": true,
  "message": "Retried 10 messages from positions 1-10",
  "retriedCount": 10
}
```

---

### 6. Retry Messages by Header

Retry messages matching specific header values (e.g., partner code).

**Endpoint:** `POST /api/dlq/:queueName/retryByHeader`

**Authentication:** Required (Admin)

**URL Parameters:**
- `queueName`: Queue name

**Request Body:**
```json
{
  "headerName": "partnerCode",
  "headerValue": "SATLOC"
}
```

**Response:**
```json
{
  "success": true,
  "message": "Retried 8 messages matching header partnerCode=SATLOC",
  "retriedCount": 8
}
```

---

### 7. Archive Failed Task

Archive a single failed task by its ID.

**Endpoint:** `POST /api/dlq/:queueName/archive/:id`

**Authentication:** Required (Admin)

**Parameters:**
- `reason` (optional): Reason for archiving

**Response:**
```json
{
  "success": true,
  "message": "Task has been archived"
}
```

**Example:**
```bash
curl -X POST http://localhost:3000/api/dlq/partner_tasks/archive/507f1f77bcf86cd799439011 \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"reason": "Invalid file format"}'
```

---

### 8. Purge DLQ

⚠️ **DANGEROUS OPERATION** - Permanently delete all messages from the Dead Letter Queue.
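
Because a purge cannot be undone, a caller may want a client-side guard in addition to the server-side `confirm` flag documented for this endpoint. The following is a minimal sketch under assumptions: `buildPurgeRequest` and its `confirmed` option are hypothetical names, not part of the API, and the helper only assembles arguments for `fetch()` without sending anything.

```javascript
// Sketch only: buildPurgeRequest and the `confirmed` option are hypothetical,
// not part of the DLQ API. The helper builds fetch() arguments; it sends nothing.
function buildPurgeRequest(queueName, token, { confirmed = false } = {}) {
  // Refuse to build the request unless the caller opted in explicitly.
  if (confirmed !== true) {
    throw new Error(`Refusing to purge "${queueName}": pass { confirmed: true }`);
  }
  return {
    url: `/api/dlq/${encodeURIComponent(queueName)}/purge`,
    options: {
      method: 'DELETE',
      headers: {
        'Authorization': `Bearer ${token}`,
        'Content-Type': 'application/json'
      },
      // The endpoint itself requires confirm: true in the request body.
      body: JSON.stringify({ confirm: true })
    }
  };
}

// Possible usage (token is assumed to hold a valid admin bearer token):
//   const { url, options } = buildPurgeRequest('partner_tasks', token, { confirmed: true });
//   const response = await fetch(url, options);
```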

**Endpoint:** `DELETE /api/dlq/:queueName/purge`

**Authentication:** Required (Admin)

**Request Body:**
```json
{
  "confirm": true
}
```

**Parameters:**
- `confirm` (required): Must be `true` to confirm the purge operation

**Response:**
```json
{
  "success": true,
  "purgedCount": 25,
  "message": "Purged 25 messages from DLQ"
}
```

**Example:**
```bash
curl -X DELETE http://localhost:3000/api/dlq/partner_tasks/purge \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"confirm": true}'
```

---

## Web Dashboard

A web-based monitoring dashboard is available at:

```
http://localhost:3000/dlq-monitor.html
```

**Features:**
- Real-time statistics display
- Recent failures with error categorization
- One-click retry/archive operations
- Bulk DLQ processing
- Auto-refresh every 30 seconds

---

## Error Handling

All endpoints return errors in a consistent shape:

**400 Bad Request:**
```json
{
  "error": "Invalid ID format"
}
```

**404 Not Found:**
```json
{
  "error": "Partner log tracker not found"
}
```

**500 Internal Server Error:**
```json
{
  "error": "Failed to get DLQ statistics"
}
```

---

## Usage Examples

### Monitor DLQ Health

```bash
#!/bin/bash
# Check if DLQ has too many messages
STATS=$(curl -s -H "Authorization: Bearer $TOKEN" \
  http://localhost:3000/api/dlq/partner_tasks/stats)

DLQ_COUNT=$(echo "$STATS" | jq -r '.dlq.messageCount')

if [ "$DLQ_COUNT" -gt 50 ]; then
  echo "WARNING: DLQ has $DLQ_COUNT messages!"
  # Send alert to admin
fi
```

### Automated DLQ Processing

```bash
#!/bin/bash
# Process DLQ every hour via cron
curl -X POST http://localhost:3000/api/dlq/partner_tasks/process \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"maxMessages": 100}' \
  >> /var/log/dlq-processing.log 2>&1
```

### Retry All Failed Messages (Queue-Native)

```javascript
// `token` is assumed to hold a valid admin bearer token.

// Retry all failed messages in a queue (up to max limit)
async function retryAllDLQMessages(queueName = 'partner_tasks', maxMessages = 100) {
  const response = await fetch(`/api/dlq/${queueName}/retryAll`, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${token}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ maxMessages })
  });

  const result = await response.json();
  console.log(`Retried ${result.retriedCount} messages from ${queueName} DLQ`);
  return result;
}

// Retry by position range (0-based indexing)
async function retryByPosition(queueName, startPosition, endPosition) {
  const response = await fetch(`/api/dlq/${queueName}/retryByPosition`, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${token}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ startPosition, endPosition })
  });

  return await response.json();
}
```

---

## Integration with Monitoring

### Prometheus Metrics (Future Enhancement)

```
# HELP agm_dlq_messages_total Total messages in DLQ
# TYPE agm_dlq_messages_total gauge
agm_dlq_messages_total 5

# HELP agm_failed_tasks_total Total failed tasks
# TYPE agm_failed_tasks_total gauge
agm_failed_tasks_total 12

# HELP agm_processed_tasks_total Total successfully processed tasks
# TYPE agm_processed_tasks_total counter
agm_processed_tasks_total 245
```

### Grafana Dashboard Query Examples

```sql
-- Failed tasks by partner
SELECT p.name, COUNT(*) as failed_count
FROM partnerlogtrackers plt
JOIN partners p ON plt.partnerId = p._id
WHERE plt.status = 'failed'
GROUP BY p.name;

-- Error categories over time
SELECT DATE(updatedAt) as date, COUNT(*) as count, errorMessage
FROM partnerlogtrackers
WHERE status = 'failed'
GROUP BY DATE(updatedAt), errorMessage;
```

---

## Best Practices

1. **Regular Monitoring**: Check DLQ stats daily
2. **Automated Processing**: Run DLQ processing every 4-6 hours
3. **Manual Review**: Review archived tasks weekly
4. **Alert Thresholds**:
   - Warning: DLQ > 20 messages
   - Critical: DLQ > 50 messages
5. **Cleanup**: Archive tasks older than 7 days
6. **Documentation**: Document recurring error patterns

---

## Related Documentation

- [Partner DLQ Handling Guide](./PARTNER_DLQ_HANDLING.md)
- [Partner Integration Architecture](./PARTNER_INTEGRATION_ARCHITECTURE.md)
- [SatLoc Implementation Summary](./SATLOC_IMPLEMENTATION_SUMMARY.md)
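
## Appendix: Alert Threshold Sketch

The alert thresholds listed under Best Practices (warning above 20 messages, critical above 50) could be applied to the `dlq.messageCount` field of the stats response with a small helper. This is a minimal sketch, not part of the API: `classifyDlqDepth` and its `warning` / `critical` option names are hypothetical, and the sample payload mirrors only the `dlq` portion of the documented stats response.

```javascript
// Sketch only: classifyDlqDepth and its option names are hypothetical.
// Default thresholds are taken from the Best Practices list.
function classifyDlqDepth(messageCount, { warning = 20, critical = 50 } = {}) {
  if (!Number.isInteger(messageCount) || messageCount < 0) {
    throw new TypeError('messageCount must be a non-negative integer');
  }
  // Thresholds are exclusive: exactly 20 is still "ok", exactly 50 is "warning".
  if (messageCount > critical) return 'critical';
  if (messageCount > warning) return 'warning';
  return 'ok';
}

// Example using the `dlq` portion of the documented stats response.
const sampleStats = {
  dlq: { messageCount: 5, consumerCount: 0, queueName: 'partner_tasks_failed' }
};
console.log(classifyDlqDepth(sampleStats.dlq.messageCount)); // prints "ok"
```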