# AgMission Server - AI Coding Instructions
## ⚠️ READ THIS FIRST ⚠️
**MANDATORY RULE: Do not make things up.**
- Never invent endpoint names, route groupings, field names, model properties, or terminology that does not exist in the actual code
- Always verify against the real source files (routes, controllers, models, constants, and so on) before documenting
- If something is uncertain, read the code first — do not guess or assume
**MANDATORY RULE: Always run tests and scripts before claiming work is complete!**
- Never say "tests pass" without actually running them
- Never create scripts without executing them to verify they work
- Fix all errors until actual execution succeeds
- See "CRITICAL TESTING REQUIREMENT" section below for full details
---
## System Architecture
**AgMission** is a Node.js/Express agricultural mission planning system with:
- **MongoDB** (Mongoose 6.x) for data persistence with replica set support
- **RabbitMQ** (amqplib) for async task processing with DLQ patterns
- **Redis** (ioredis) for caching and session management
- **Stripe** API for subscription billing
- **External Partner APIs** (SatLoc) for equipment integration
### Critical Architecture Concepts
**Dual-Queue Worker Pattern**: Main application queue + partner-specific queues
- Main queue: `dev_jobs` (dev) / `jobs` (prod) - internal job processing
- Partner queue: `dev_partner_tasks` (dev) / `partner_tasks` (prod) - external sync
- DLQ: `{queueName}_failed` - dead letter queue with auto-retry logic
**Workers**: `job_worker.js`, `partner_sync_worker.js`, `partner_data_polling_worker.js`, `dlq_archival_worker.js`, `dlq_alert_worker.js`
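The main-queue/DLQ pairing can be sketched as a small helper that builds the queue-declaration arguments for a queue and its `{queueName}_failed` DLQ. The helper name and option wiring are illustrative assumptions, not code from the repo; the actual setup lives in the worker files.

```javascript
// Hypothetical helper (not from the codebase): build assertQueue arguments
// for a main queue and its DLQ, following the `{queueName}_failed` convention.
function queueTopology(queueName) {
  return {
    main: {
      name: queueName,
      options: {
        durable: true,
        // nack(msg, false, false) routes rejected messages to the DLQ via
        // the default exchange, using the DLQ name as the routing key
        deadLetterExchange: '',
        deadLetterRoutingKey: `${queueName}_failed`,
      },
    },
    dlq: { name: `${queueName}_failed`, options: { durable: true } },
  };
}
```

For example, `queueTopology('dev_partner_tasks')` pairs the dev partner queue with a `dev_partner_tasks_failed` DLQ.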
**Dual-User Partner System** (critical for partner integration):
- Partner Organizations: `User` model with `kind: "PARTNER"` (e.g., SatLoc company)
- Partner System Users: `User` model with `kind: "PARTNER_SYSTEM_USER"` (customer credentials)
- **Key**: Assignments use internal user IDs, but workers look up Partner System User records to get credentials for API calls
- See `README_PARTNER_INTEGRATION.md` for full explanation
**Queue-Native DLQ Operations** (Step 8 refactor - completed):
- ❌ Old: `/api/partners/dlq/retry/:id` (tracker-ID based, MongoDB-dependent)
- ✅ New: `/api/dlq/:queueName/retryAll`, `/api/dlq/:queueName/retryByPosition`, `/api/dlq/:queueName/retryByHeader`
- Direct RabbitMQ operations, no MongoDB coupling, supports multiple queues
- Global endpoints work for ANY queue type (partner_tasks, jobs, etc.)
- See `docs/STEP8_IMPLEMENTATION_COMPLETE.md` for migration context
## Development Workflow
### 🚨 CRITICAL TESTING REQUIREMENT 🚨
**MANDATORY: ALWAYS RUN TESTS/SCRIPTS BEFORE CLAIMING COMPLETION**
**Core Principle**: Never report work as "done" or "complete" without actually executing and verifying the code works.
**What This Means**:
- **NEVER** create test scripts and assume they work
- **NEVER** claim "tests pass" without running them
- **NEVER** say "this should work" without proving it
- **ALWAYS** execute every script/test you create
- **ALWAYS** fix all errors until tests actually pass
- **ALWAYS** include actual execution output in reports
**When Creating Test Scripts**:
1. Write the test script in `tests/` directory
2. **RUN IT IMMEDIATELY** using `run_in_terminal`
3. If errors occur: debug, fix, and run again (repeat until success)
4. Only after seeing successful execution: report completion
5. Include actual test output (success/failure) in final report
**When Creating Utility Scripts**:
1. Write the script
2. **EXECUTE IT** with sample/test data
3. Verify output is correct
4. Fix any errors (authentication, connection, logic, etc.)
5. Run again until it works
6. Report with actual execution proof
**Testing Checklist** (ALL must be ✅ before claiming done):
1. ✅ Test/script file created in appropriate directory
2. ✅ Script has proper environment loading (`environment.env`)
3. ✅ **Script executed successfully** (via `run_in_terminal`)
4. ✅ All errors fixed (authentication, parameters, logic)
5. ✅ Output confirms expected behavior
6. ✅ Edge cases handled gracefully
7. ✅ Actual execution output included in completion report
**Common Test Failures to Check**:
- Authentication errors (wrong tokens, credentials)
- API endpoint errors (404, 409, wrong paths)
- Parameter mismatches (wrong field names, missing required fields)
- Database connection issues
- Environment variable problems
- Missing dependencies
**Example Workflow**:
```bash
# 1. Create test
# 2. RUN IT
node tests/my_new_test.js
# 3. See error? Fix and run again
node tests/my_new_test.js
# 4. Keep fixing until you see: ✅ ALL TESTS PASSED
# 5. THEN report completion with output
```
**Remember**: The user is frustrated by untested code. Earn trust by delivering **working, verified** solutions.
### 🚨 STRIPE API RATE LIMITING BEST PRACTICES 🚨
**CRITICAL: Avoid hitting Stripe's 25 ops/sec test mode rate limit**
**Test Writing Principles**:
- **NEVER** disable/fetch 100+ existing records in tests
- **NEVER** clean up all records before each test case
- **ALWAYS** use unique names (timestamps) to avoid conflicts
- **ALWAYS** track only what you create and clean up only those
- **ALWAYS** use 100ms+ delays between API calls (10 ops/sec is safe)
**Example Pattern**:
```javascript
// BAD: Disables 100+ old promos before each test
async function test1() {
  await disableAllPromos(); // 100+ API calls!
  await createPromo(...);
  // test logic
}

// GOOD: Use unique names, track what we create
const TEST_RUN_ID = Date.now();
const createdIds = [];

async function createPromo(data) {
  const uniqueData = { ...data, name: `${data.name}_${TEST_RUN_ID}` };
  const result = await api.post('/promos', uniqueData);
  createdIds.push(result.id); // Track for cleanup
  return result;
}

async function cleanup() {
  // Only clean up our 5-10 promos, not 100+
  for (const id of createdIds) {
    await api.delete(`/promos/${id}`);
    await sleep(100); // Rate limiting
  }
}
```
**Rate Limit Guidelines**:
- Test mode: 25 operations/second
- Safe rate: 10 ops/sec (100ms between calls)
- List operations are expensive - minimize them
- Don't use cleanup between test cases - use unique names
- Cleanup once at end, not between tests
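The delay guideline can be wrapped in a small sequential-throttle helper. This is a minimal sketch with assumed names (`sleep`, `throttledEach` are not from the codebase):

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Run an async fn over items one at a time, pausing between calls so the
// effective rate stays near 1000/delayMs ops/sec (10 ops/sec at 100ms),
// well under Stripe's 25 ops/sec test-mode limit.
async function throttledEach(items, fn, delayMs = 100) {
  const results = [];
  for (const item of items) {
    results.push(await fn(item));
    await sleep(delayMs);
  }
  return results;
}
```

Deliberately sequential: firing calls with `Promise.all` would burst past the rate limit even with delays attached.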
### 🚨 AVOID LIMIT-BASED QUERIES 🚨
**CRITICAL: Never use `.limit()` for fetching all records**
**Database Query Principles**:
- **NEVER** use `.find().limit(100)` to get "all" records (there may be more!)
- **NEVER** assume a limit covers all data
- **ALWAYS** use cursor-based pagination or auto-pagination
- **ALWAYS** use the Stripe SDK's async iteration (handles pagination automatically)
**Bad Pattern**:
```javascript
// BAD: Only gets first 100, ignores rest
const subs = await stripe.subscriptions.list({
  customer: custId,
  limit: 100
});
for (const sub of subs.data) { ... }
```
**Good Pattern**:
```javascript
// GOOD: Auto-pagination fetches ALL subscriptions
const allSubs = [];
for await (const sub of stripe.subscriptions.list({ customer: custId })) {
  allSubs.push(sub);
}
```
**MongoDB Cursor Pagination**:
```javascript
// For large datasets, use cursor pagination
let cursor = null;
do {
  const query = cursor ? { _id: { $gt: cursor } } : {};
  const batch = await Model.find(query).sort({ _id: 1 }).limit(100).lean();
  for (const doc of batch) {
    // Process doc
  }
  cursor = batch.length > 0 ? batch[batch.length - 1]._id : null;
} while (cursor);
```
### Running the System
```bash
# Start main server (with debugger)
DEBUG=agm:* node --inspect server.js
# Start all workers (PM2 or standalone)
node start_workers.js
# Or start individual workers:
node workers/partner_sync_worker.js
node workers/partner_data_polling_worker.js
# DLQ monitoring
node scripts/monitor_partner_dlq.js
# Or web UI: http://localhost:4100/public/dlq-monitor.html
```
### Environment Configuration
**Critical**: Environment variables are loaded from `environment.env` (not `.env`). See `helpers/env.js` for all mappings.
**Queue Name Auto-Prefixing**:
- Development: `QUEUE_NAME_PARTNER=partner_tasks` → actual queue: `dev_partner_tasks`
- Production: `QUEUE_NAME_PARTNER=partner_tasks` → actual queue: `partner_tasks`
- Logic in `helpers/env.js` line ~115
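The prefixing rule can be paraphrased as a pure function. This is a sketch only; the authoritative implementation is in `helpers/env.js` and may differ in detail:

```javascript
// Sketch of the auto-prefixing rule; see helpers/env.js for the real logic.
function resolveQueueName(baseName, nodeEnv) {
  return nodeEnv === 'production' ? baseName : `dev_${baseName}`;
}
```

So `resolveQueueName('partner_tasks', 'development')` yields `'dev_partner_tasks'`, while production passes the name through unchanged.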
**Debug Patterns**:
- `DEBUG=agm:*` - all modules
- `DEBUG=agm:partner*,agm:satloc*` - partner integration only
- `DEBUG=agm:queue*,agm:dlq*` - queue/DLQ operations
- See `PINO_MODULE_FILTERING_GUIDE.md` for Pino logger filtering
### Testing Partner Integration
```bash
# Setup test data
node setup_partners.js
# Test SatLoc log parsing (brief output)
node test_satloc_pattern_brief.js
# Test queue-native DLQ operations
node test_queue_native_retry.js
# Test race condition handling
node test-race-condition.js
```
## Code Conventions
### Route Organization
Routes follow function-based mounting pattern:
```javascript
// routes/partner.js
module.exports = function (app) {
  const router = require('express').Router();
  router.get('/api/partners', controller.listPartners);
  app.use(router);
};
```
All routes mounted in `server.js` via `require('./routes')(app)`.
**Endpoint Naming Convention**: Use **camelCase** for endpoint paths (NOT snake_case):
- ✅ Correct: `/uploadJob`, `/syncData`, `/retryAll`, `/getPartnerCustomers`
- ❌ Wrong: `/upload_job`, `/sync_data`, `/retry-all`, `/get_partner_customers`
### Controller Patterns
Controllers are organized by domain (not CRUD):
- `controllers/partner.js` - Partner management + job uploads
- `controllers/dlq.js` - Global DLQ operations (all queues)
- `controllers/job.js` - Job CRUD and processing
**JSDoc Required**: All controller functions must have JSDoc for apidoc generation:
```javascript
/**
 * @api {post} /api/partners/dlq/:queueName/retryAll Retry All DLQ Messages
 * @apiName RetryAllDLQ
 * @apiGroup PartnerDLQ
 * @apiDescription Retry all messages in specified DLQ (or up to maxMessages)
 *
 * @apiParam {String} queueName Queue name (e.g., 'partner_tasks')
 * @apiBody {Number} [maxMessages=100] Maximum messages to retry
 */
```
### Worker Error Handling
Workers MUST use queue-native error handling (not tracker status):
```javascript
// ✅ Correct - queue-native
channel.nack(msg, false, false); // Send to DLQ
channel.ack(msg); // Success
channel.nack(msg, false, true); // Requeue for retry
// ❌ Wrong - old tracker-based approach
await PartnerLogTracker.updateOne({_id}, {status: 'failed'});
```
DLQ messages are managed via global API endpoints (`/api/dlq/:queueName/*`) and web dashboard (`public/dlq-monitor.html`). Workers send failures to DLQ, and administrators can retry, archive, or purge messages through the API.
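The ack/nack pattern above can be wrapped in a consume handler. A minimal sketch with illustrative names (real workers add logging, validation, and retry heuristics):

```javascript
// Wrap a processing function so success acks the message and any failure
// nacks with requeue=false, letting RabbitMQ route it to the DLQ.
function makeQueueNativeHandler(channel, processFn) {
  return async (msg) => {
    try {
      await processFn(JSON.parse(msg.content.toString()));
      channel.ack(msg); // success
    } catch (err) {
      channel.nack(msg, false, false); // send to DLQ, do not requeue
    }
  };
}
```

Wired up as `channel.consume(queueName, makeQueueNativeHandler(channel, processTask))`; note no tracker-status writes anywhere in the failure path.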
### Model Patterns
Mongoose models in `model/` directory:
- Use `mongoose-sequence` for auto-incrementing IDs where needed
- Discriminators for inheritance (e.g., `User` base, `Partner` discriminator)
- Always use `.lean()` for read-only queries to get plain objects
**Partner System User Queries**:
```javascript
// Find customer's SatLoc credentials
const psu = await User.findOne({
  kind: 'PARTNER_SYSTEM_USER',
  customerId: ObjectId('...'),
  partnerId: ObjectId('...') // SatLoc partner ID
});
// psu.partnerUsername, psu.partnerPassword for API calls
```
### Async Error Handling
`express-async-errors` is loaded globally - controllers can use async/await without try/catch:
```javascript
exports.myRoute = async (req, res) => {
  const data = await Model.findById(id); // auto-caught
  res.json(data);
};
```
Custom errors use `helpers/app_error.js`:
- `AppError(Errors.NOT_FOUND, 'Resource not found')`
- `AppParamError('Invalid ID format')`
**Error Response Format**: All API errors follow standardized format via `ErrorHandler` middleware:
```javascript
// Error object structure
{
  "error": {
    ".tag": "error_constant_value", // Lowercase value from helpers/constants.js Errors
    "message": "Details" // Only in development mode
  }
}
```
**Error Classes and Status Codes**:
- `AppAuthError` → 401 (authentication) → `.tag`: "not_authorized"
- `AppParamError` → 409 (invalid parameters) → `.tag`: "invalid_param"
- `AppInputError` → 409 (invalid input) → `.tag`: "invalid_input"
- `AppMembershipError` → 410 (subscription issues) → `.tag`: "subscription_not_found"
- `AppError` → 409 (general application errors) → `.tag`: "unknown_app_error"
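The mapping above feeds the standardized body. A sketch of how the `ErrorHandler` middleware might assemble it — illustrative only, with an assumed function name; the real logic lives in the middleware:

```javascript
// Build the standardized error body; `message` is included only in dev mode.
function buildErrorBody(tag, message, isDev) {
  const error = { '.tag': tag };
  if (isDev && message) error.message = message;
  return { error };
}
```

In production the same error thus serializes without the `message` field, so internal details are never leaked to clients.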
**Usage Example**:
```javascript
// Throw error using constant name (uppercase)
throw new AppParamError(Errors.INVALID_PARAM, 'Queue name is required');
// Results in response (lowercase value):
// { "error": { ".tag": "invalid_param", "message": "Queue name is required" } }
```
**JSDoc for Error Responses**:
```javascript
/**
 * @apiError (409) {Object} error Error object
 * @apiError (409) {String} error..tag Error constant value (e.g., "invalid_param")
 * @apiError (409) {String} [error.message] Error details (dev mode only)
 */
```
## Critical Files
**Entry Points**:
- `server.js` - Express app initialization, middleware, route mounting
- `start_workers.js` - Worker process manager (spawns all workers)
**Partner Integration Core**:
- `workers/partner_sync_worker.js` - Main partner log processor
- `workers/partner_data_polling_worker.js` - Downloads logs from partner APIs
- `controllers/dlq.js` - Global DLQ API endpoints (all queues)
- `services/partner_service.js` - Partner API client abstractions
**DLQ System**:
- `routes/dlq.js` - Global DLQ API routes (all queue types)
- `controllers/dlq.js` - Global DLQ controller logic
- `scripts/monitor_partner_dlq.js` - CLI monitoring tool
- `public/dlq-monitor.html` - Web monitoring dashboard
- `docs/DLQ_INDEX.md` - DLQ documentation hub
**Configuration**:
- `helpers/env.js` - Environment variable mappings (source of truth)
- `environment.env` - Local dev environment (NOT .env)
- `helpers/db/connect.js` - MongoDB connection with retry logic
## Common Pitfalls
**Queue Name Confusion**: Development auto-prefixes `dev_`. If worker can't find queue, check actual name:
```javascript
// Expected: 'partner_tasks' → Actual: 'dev_partner_tasks' (in dev)
```
**Partner Auth Lookup**: Workers need Partner System User records for credentials:
```javascript
// ❌ Wrong: Using internal user ID to call partner API
// ✅ Right: Lookup Partner System User, use partnerUsername/partnerPassword
```
**DLQ Retry Pattern**: Use queue-native operations (Step 8), not tracker-based:
```javascript
// ❌ Old: POST /api/partners/dlq/retry/:trackerId
// ✅ New: POST /api/dlq/partner_tasks/retryAll
// ✅ Works for any queue: /api/dlq/dev_jobs/retryAll
```
**Mongoose .lean()**: Always use for read-only queries to avoid document overhead:
```javascript
const jobs = await Job.find({}).lean(); // Plain JS objects
```
**Always filter `active` and `markedDelete` in User/Partner queries**: Every query against the `users` collection (including discriminators like `PartnerSystemUser`, `Partner`, `Vehicle`) MUST include these unless intentionally retrieving inactive/deleted records:
```javascript
// ❌ Wrong: Missing active/markedDelete filters
const psu = await PartnerSystemUser.findOne({ parent: customerId, partner: partnerId });
// ✅ Correct: Always include both filters
const psu = await PartnerSystemUser.findOne({
  parent: customerId,
  partner: partnerId,
  active: true,
  markedDelete: { $ne: true }
});
```
This applies to all `User` discriminators: `PartnerSystemUser`, `Partner`, regular users, `Vehicle` (DEVICE type). Omitting these filters silently returns soft-deleted or deactivated records.
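One way to make the filters hard to forget is a small merge helper. The name is hypothetical, not from the codebase:

```javascript
// Hypothetical helper: merge the mandatory active/soft-delete filters
// into any users-collection query object.
function withLiveUserFilters(query) {
  return { ...query, active: true, markedDelete: { $ne: true } };
}
```

Used as `PartnerSystemUser.findOne(withLiveUserFilters({ parent: customerId, partner: partnerId }))`, so every query gets both filters by construction.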
**Process Fatal Handlers**: Custom error logging in `helpers/process_fatal_handlers.js`. Don't override with generic handlers.
**CLI Scripts Environment Loading**: All CLI scripts MUST load environment variables from `environment.env` (not `.env`):
```javascript
// Required pattern at top of every CLI script:
const path = require('path');
// Parse --env argument (default: ./environment.env)
const args = process.argv.slice(2);
let envFile = './environment.env';
for (let i = 0; i < args.length; i++) {
  if (args[i] === '--env' && args[i + 1]) {
    envFile = args[i + 1];
    i++;
  }
}
// Load environment before requiring any modules
const envPath = path.resolve(process.cwd(), envFile);
require('dotenv').config({ path: envPath });
```
## Documentation Standards
**Update These When Changing Partner/DLQ Code**:
- `README_PARTNER_INTEGRATION.md` - Partner integration guide
- `docs/DLQ_INDEX.md` - DLQ documentation hub
- `docs/DLQ_API_REFERENCE.md` - API reference with examples
- `docs/DLQ_OPERATIONS.md` - Operational guide
- JSDoc comments for apidoc generation
**API Documentation**: Generated via `npm run docs` (apidoc). Output: `public/apidoc/`.
**Mermaid Diagrams**: Use Mermaid for architecture diagrams in markdown docs. See `docs/PARTNER_DLQ_ARCHITECTURE_DIAGRAMS.md` for examples.
## Key Dependencies
- **mongoose@6.12.0** - MongoDB ODM (v6 syntax, NOT v7+)
- **amqplib@0.10.3** - RabbitMQ client (callback-based, use promisify)
- **express@4.18.1** - Web framework (v4, NOT v5)
- **ioredis@5.3.2** - Redis client
- **stripe** - Payment processing (API version in env vars)
- **axios@1.7.2** - HTTP client (prefer over node-fetch)
- **debug@4.1.1** - Debug logging (`DEBUG=agm:*`)
## Project Organization Standards
**Directory Structure**:
- `tests/` - Test scripts (manual and automated)
- `docs/` - All project documentation (*.md files ONLY in docs/)
- `workers/` - Background worker processes
- `scripts/` - Utility and maintenance scripts
- `controllers/` - Domain-organized controllers (not CRUD)
- `routes/` - Express route definitions
**File Placement Rules**:
- **Test scripts**: MUST be in `tests/` directory (e.g., `tests/test_setup_intent.js`)
- **Documentation**: MUST be in `docs/` directory (e.g., `docs/SETUP_INTENT_IMPLEMENTATION.md`)
- **Utility scripts**: In `scripts/` directory
- **Never** place test scripts or documentation files in project root
**Testing Approach**:
- Manual testing scripts in `tests/` directory for important functions
- Integration test scripts named `test_*.js` (e.g., `test_satloc_pattern_brief.js`)
- Use `simple_test.js` for quick validation
- Postman collections in `docs/` for API testing
- **Note**: No formal test framework yet - scripts designed for future automation
**Documentation Requirements**:
- **ALWAYS update relevant documentation after code changes**
- Update JSDoc comments for API documentation generation
- Update markdown docs in `docs/` when changing partner/DLQ features
- Keep README files synchronized with actual implementation
**DLQ Testing**: Use `docs/Partner_DLQ_API.postman_collection.json` to test all 6 queue-native endpoints.