Partner DLQ API Endpoints
Overview
RESTful API endpoints for monitoring and managing the Partner Dead Letter Queue (DLQ). These endpoints allow administrators to view statistics, process failed messages, retry tasks, and perform maintenance operations.
Authentication
All DLQ endpoints require admin authentication. Include the authentication token in the request headers:
Authorization: Bearer <token>
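From JavaScript, the same header can be built once and reused for every call. This is a minimal sketch: the helper name dlqHeaders is hypothetical, and how the admin token is obtained is assumed to be handled elsewhere.

```javascript
// Hypothetical helper (not part of the documented API): builds the headers
// that every DLQ request needs. The token is assumed to come from the
// application's own login flow.
function dlqHeaders(token, { json = false } = {}) {
  if (typeof token !== 'string' || token.length === 0) {
    throw new Error('An admin token is required for DLQ endpoints');
  }
  const headers = { Authorization: `Bearer ${token}` };
  // Only requests that send a JSON body need a Content-Type header.
  if (json) headers['Content-Type'] = 'application/json';
  return headers;
}
```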
Endpoints
1. Get DLQ Statistics
Get comprehensive statistics about the DLQ and partner log processing status.
Endpoint: GET /api/dlq/partner_tasks/stats
Authentication: Required (Admin)
Response:
{
"dlq": {
"messageCount": 5,
"consumerCount": 0,
"queueName": "partner_tasks_failed"
},
"trackers": {
"failed": 12,
"processing": 3,
"downloaded": 8,
"processed": 245,
"archived": 7
},
"recentFailures": [
{
"id": "507f1f77bcf86cd799439011",
"logFileName": "application_20250101_120000.log",
"partner": {
"id": "507f1f77bcf86cd799439012",
"name": "SatLoc Systems",
"code": "SATLOC"
},
"customer": {
"id": "507f1f77bcf86cd799439013",
"name": "John Doe",
"username": "john@example.com"
},
"errorMessage": "Connection timeout",
"retryCount": 3,
"failedAt": "2025-10-02T10:30:00.000Z"
}
]
}
Example:
curl -X GET http://localhost:3000/api/dlq/partner_tasks/stats \
-H "Authorization: Bearer <token>"
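Once the response is parsed, the counters can be reduced to a few headline numbers. The sketch below assumes the response shape shown in the sample above; the function name summarizeStats and the failedShare field are illustrative, not part of the API.

```javascript
// Illustrative only: condenses a /stats response body into headline numbers.
// Field names (dlq.messageCount, trackers.*) follow the sample response.
function summarizeStats(stats) {
  const trackers = stats.trackers || {};
  const totalTrackers = Object.values(trackers).reduce((sum, n) => sum + n, 0);
  const failed = trackers.failed || 0;
  return {
    dlqMessages: stats.dlq ? stats.dlq.messageCount : 0,
    totalTrackers,
    // Fraction of all tracked log files currently in the "failed" state.
    failedShare: totalTrackers === 0 ? 0 : failed / totalTrackers,
  };
}
```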
2. Get DLQ Messages
Retrieve messages from the Dead Letter Queue without consuming them (peek mode).
Endpoint: GET /api/dlq/partner_tasks/messages
Authentication: Required (Admin)
Query Parameters:
limit (optional): Maximum number of messages to retrieve (default: 50)
Response:
{
"messages": [
{
"taskInfo": {
"logFileName": "application_20250101_120000.log",
"partnerId": "507f1f77bcf86cd799439012",
"customerId": "507f1f77bcf86cd799439013"
},
"errorMessage": "Connection timeout",
"retryCount": 3,
"enqueuedAt": "2025-10-02T10:00:00.000Z",
"headers": {
"x-death": [...]
}
}
]
}
Example:
curl -X GET "http://localhost:3000/api/dlq/partner_tasks/messages?limit=20" \
-H "Authorization: Bearer <token>"
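Because peeking leaves the queue untouched, the result is safe to analyze before deciding what to retry. As a sketch, the hypothetical helper below counts peeked messages per partner, assuming the taskInfo.partnerId field from the sample response.

```javascript
// Illustrative helper: counts peeked DLQ messages per partner ID.
// Messages without a taskInfo.partnerId are grouped under "unknown".
function countByPartner(messagesResponse) {
  const counts = {};
  for (const msg of messagesResponse.messages || []) {
    const partnerId =
      msg.taskInfo && msg.taskInfo.partnerId ? msg.taskInfo.partnerId : 'unknown';
    counts[partnerId] = (counts[partnerId] || 0) + 1;
  }
  return counts;
}
```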
3. Process DLQ
Process messages in the Dead Letter Queue: each error is categorized, and the message is then automatically retried or archived based on its error type and age.
Endpoint: POST /api/dlq/:queueName/process
Authentication: Required (Admin)
Request Body:
{
"maxMessages": 100,
"dryRun": false
}
Parameters:
maxMessages (optional): Maximum number of messages to process (default: 100)
dryRun (optional): If true, analyze without taking action (default: false)
Response:
{
"processed": 15,
"retried": 8,
"archived": 5,
"categorization": {
"transient": 8,
"validation": 3,
"processing": 2,
"infrastructure": 1,
"partner_api": 1,
"unknown": 0
},
"dryRun": false,
"timestamp": "2025-10-02T11:00:00.000Z"
}
Error Categories:
- transient: Network timeouts, temporary connection issues (auto-retried within 2h window)
- validation: Invalid data, missing fields (archived immediately)
- processing: Calculation errors, parsing errors (kept for review)
- infrastructure: Database errors, filesystem errors (retried with backoff)
- partner_api: API authentication failures, rate limiting (retried with delay)
- unknown: Unclassified errors (kept for review)
Example:
# Process DLQ
curl -X POST http://localhost:3000/api/dlq/partner_tasks/process \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-d '{"maxMessages": 50, "dryRun": false}'
# Dry run (analyze only)
curl -X POST http://localhost:3000/api/dlq/partner_tasks/process \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-d '{"dryRun": true}'
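The error categories listed above can be read as a lookup table from category to handling. The sketch below encodes that table on the client side, for example to label dry-run results. The action labels are hypothetical names chosen for illustration, and the server's actual implementation may differ.

```javascript
// Illustrative mapping from error category to the handling described in the
// Error Categories list. The action strings are made-up labels.
const CATEGORY_ACTIONS = {
  transient: 'retry',                   // auto-retried within the 2h window
  validation: 'archive',                // archived immediately
  processing: 'review',                 // kept for review
  infrastructure: 'retry_with_backoff', // retried with backoff
  partner_api: 'retry_with_delay',      // retried with delay
  unknown: 'review',                    // kept for review
};

// Unrecognized categories fall back to the "unknown" handling.
function actionFor(category) {
  return Object.prototype.hasOwnProperty.call(CATEGORY_ACTIONS, category)
    ? CATEGORY_ACTIONS[category]
    : CATEGORY_ACTIONS.unknown;
}
```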
4. Retry All Failed Messages
Retry all messages currently in the DLQ back to the main queue.
Endpoint: POST /api/dlq/:queueName/retryAll
Authentication: Required (Admin)
URL Parameters:
queueName: Queue name (e.g., "partner_tasks")
Response:
{
"success": true,
"message": "Retried 15 messages from DLQ",
"retriedCount": 15,
"queueName": "partner_tasks"
}
Example:
curl -X POST http://localhost:3000/api/dlq/partner_tasks/retryAll \
-H "Authorization: Bearer <token>"
5. Retry Messages by Position
Retry messages from specific positions in the DLQ.
Endpoint: POST /api/dlq/:queueName/retryByPosition
Authentication: Required (Admin)
URL Parameters:
queueName: Queue name
Request Body:
{
"startPosition": 1,
"endPosition": 10
}
Response:
{
"success": true,
"message": "Retried 10 messages from positions 1-10",
"retriedCount": 10
}
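A request for this endpoint can be assembled and validated before it is sent. The sketch below is a hypothetical client-side builder. The range checks are an assumption rather than documented server behavior, and it accepts 0 as a start position because the source does not state whether positions are 0-based or 1-based.

```javascript
// Hypothetical builder: returns the URL and fetch options for retryByPosition.
// Add the Authorization header before sending.
function buildRetryByPositionRequest(queueName, startPosition, endPosition) {
  if (!Number.isInteger(startPosition) || !Number.isInteger(endPosition)) {
    throw new TypeError('startPosition and endPosition must be integers');
  }
  if (startPosition < 0 || endPosition < startPosition) {
    throw new RangeError('expected 0 <= startPosition <= endPosition');
  }
  return {
    url: `/api/dlq/${encodeURIComponent(queueName)}/retryByPosition`,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ startPosition, endPosition }),
    },
  };
}
```

The returned pair can be passed to fetch(url, options) once the bearer token has been merged into options.headers.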
6. Retry Messages by Header
Retry messages matching specific header values (e.g., partner code).
Endpoint: POST /api/dlq/:queueName/retryByHeader
Authentication: Required (Admin)
URL Parameters:
queueName: Queue name
Request Body:
{
"headerName": "partnerCode",
"headerValue": "SATLOC"
}
Response:
{
"success": true,
"message": "Retried 8 messages matching header partnerCode=SATLOC",
"retriedCount": 8
}
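7. Archive Failed Task
Archive a single failed task by its ID.
Endpoint: POST /api/dlq/:queueName/archive/:id
Authentication: Required (Admin)
Request Body:
{
"reason": "Invalid file format"
}
Parameters:
reason (optional): Reason for archiving
Response:
{
"success": true,
"message": "Task has been archived"
}
Example:
curl -X POST http://localhost:3000/api/dlq/partner_tasks/archive/507f1f77bcf86cd799439011 \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-d '{"reason": "Invalid file format"}'
8. Purge DLQ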
⚠️ DANGEROUS OPERATION - Permanently delete all messages from the Dead Letter Queue.
Endpoint: DELETE /api/dlq/:queueName/purge
Authentication: Required (Admin)
Request Body:
{
"confirm": true
}
Parameters:
confirm (required): Must be true to confirm the purge operation
Response:
{
"success": true,
"purgedCount": 25,
"message": "Purged 25 messages from DLQ"
}
Example:
curl -X DELETE http://localhost:3000/api/dlq/partner_tasks/purge \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-d '{"confirm": true}'
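Because a purge cannot be undone, client code may want its own guard in front of this call. The sketch below is a hypothetical builder that refuses to produce a request unless the caller passes the boolean true explicitly. It mirrors the documented confirm flag and is not a substitute for the server-side check.

```javascript
// Hypothetical guard: only builds a purge request when confirm is exactly
// the boolean true (truthy values such as "true" or 1 are rejected).
function buildPurgeRequest(queueName, { confirm } = {}) {
  if (confirm !== true) {
    throw new Error('Refusing to purge: pass { confirm: true } explicitly');
  }
  return {
    url: `/api/dlq/${encodeURIComponent(queueName)}/purge`,
    options: {
      method: 'DELETE',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ confirm: true }),
    },
  };
}
```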
Web Dashboard
A web-based monitoring dashboard is available at:
http://localhost:3000/dlq-monitor.html
Features:
- Real-time statistics display
- Recent failures with error categorization
- One-click retry/archive operations
- Bulk DLQ processing
- Auto-refresh every 30 seconds
Error Handling
All endpoints return consistent error responses:
400 Bad Request:
{
"error": "Invalid ID format"
}
404 Not Found:
{
"error": "Partner log tracker not found"
}
500 Internal Server Error:
{
"error": "Failed to get DLQ statistics"
}
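Since every error body carries a single error field, a client can turn any failed response into one kind of exception. The sketch below assumes that shape. The function name toDlqError and the retryable flag are illustrative additions: treating 5xx responses as worth retrying is an assumption, not documented behavior.

```javascript
// Illustrative: converts an HTTP status and parsed error body into an Error.
function toDlqError(status, body) {
  const message =
    body && typeof body.error === 'string'
      ? body.error
      : `DLQ request failed with status ${status}`;
  const err = new Error(message);
  err.status = status;
  // Assumption: server-side failures may succeed later, whereas 4xx
  // responses need a corrected request first.
  err.retryable = status >= 500;
  return err;
}
```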
Usage Examples
Monitor DLQ Health
#!/bin/bash
# Check if DLQ has too many messages
STATS=$(curl -s -H "Authorization: Bearer $TOKEN" \
http://localhost:3000/api/dlq/partner_tasks/stats)
DLQ_COUNT=$(echo "$STATS" | jq -r '.dlq.messageCount')
if [ "$DLQ_COUNT" -gt 50 ]; then
echo "WARNING: DLQ has $DLQ_COUNT messages!"
# Send alert to admin
fi
Automated DLQ Processing
#!/bin/bash
# Process DLQ every hour via cron
curl -X POST http://localhost:3000/api/dlq/partner_tasks/process \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"maxMessages": 100}' \
>> /var/log/dlq-processing.log 2>&1
Retry All Failed Messages (Queue-Native)
// Assumes `token` holds a valid admin token in the enclosing scope.

// Retry all failed messages in a queue (up to max limit)
async function retryAllDLQMessages(queueName = 'partner_tasks', maxMessages = 100) {
  const response = await fetch(`/api/dlq/${queueName}/retryAll`, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${token}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ maxMessages })
  });
  // Surface HTTP errors instead of logging an undefined count
  if (!response.ok) {
    throw new Error(`retryAll failed with status ${response.status}`);
  }
  const result = await response.json();
  console.log(`Retried ${result.retriedCount} messages from ${queueName} DLQ`);
  return result;
}

// Retry by position range (0-based indexing)
async function retryByPosition(queueName, startPosition, endPosition) {
  const response = await fetch(`/api/dlq/${queueName}/retryByPosition`, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${token}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ startPosition, endPosition })
  });
  if (!response.ok) {
    throw new Error(`retryByPosition failed with status ${response.status}`);
  }
  return response.json();
}
Integration with Monitoring
Prometheus Metrics (Future Enhancement)
# HELP agm_dlq_messages_total Total messages in DLQ
# TYPE agm_dlq_messages_total gauge
agm_dlq_messages_total 5
# HELP agm_failed_tasks_total Total failed tasks
# TYPE agm_failed_tasks_total gauge
agm_failed_tasks_total 12
# HELP agm_processed_tasks_total Total successfully processed tasks
# TYPE agm_processed_tasks_total counter
agm_processed_tasks_total 245
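Until such an exporter exists, output like the above could be produced from the stats endpoint's response. The sketch below is one possible rendering. It assumes the metrics map to dlq.messageCount, trackers.failed and trackers.processed, which matches the sample values but is not confirmed by the source.

```javascript
// Illustrative: renders a /stats response body in Prometheus text format.
// The metric-to-field mapping is an assumption based on the sample values.
function renderPrometheusMetrics(stats) {
  const metrics = [
    ['agm_dlq_messages_total', 'Total messages in DLQ', 'gauge', stats.dlq.messageCount],
    ['agm_failed_tasks_total', 'Total failed tasks', 'gauge', stats.trackers.failed],
    ['agm_processed_tasks_total', 'Total successfully processed tasks', 'counter',
      stats.trackers.processed],
  ];
  const lines = [];
  for (const [name, help, type, value] of metrics) {
    lines.push(`# HELP ${name} ${help}`);
    lines.push(`# TYPE ${name} ${type}`);
    lines.push(`${name} ${value}`);
  }
  // The exposition format expects a trailing newline after the last sample.
  return lines.join('\n') + '\n';
}
```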
Grafana Dashboard Query Examples
-- Failed tasks by partner
SELECT p.name, COUNT(*) as failed_count
FROM partnerlogtrackers plt
JOIN partners p ON plt.partnerId = p._id
WHERE plt.status = 'failed'
GROUP BY p.name
-- Error categories over time
SELECT DATE(updatedAt) as date,
COUNT(*) as count,
errorMessage
FROM partnerlogtrackers
WHERE status = 'failed'
GROUP BY DATE(updatedAt), errorMessage
Best Practices
- Regular Monitoring: Check DLQ stats daily
- Automated Processing: Run DLQ processing every 4-6 hours
- Manual Review: Review archived tasks weekly
- Alert Thresholds:
- Warning: DLQ > 20 messages
- Critical: DLQ > 50 messages
- Cleanup: Archive tasks older than 7 days
- Documentation: Document recurring error patterns
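The alert thresholds above translate directly into a small check that a monitoring script can run against dlq.messageCount. This is a minimal sketch, and the function name and level labels are illustrative.

```javascript
// Thresholds taken from the Alert Thresholds list above.
const WARNING_THRESHOLD = 20;
const CRITICAL_THRESHOLD = 50;

// Returns 'critical', 'warning' or 'ok' for a given DLQ message count.
// Both thresholds are exclusive ("DLQ > 20", "DLQ > 50").
function dlqAlertLevel(messageCount) {
  if (messageCount > CRITICAL_THRESHOLD) return 'critical';
  if (messageCount > WARNING_THRESHOLD) return 'warning';
  return 'ok';
}
```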