# SatLoc Application Processor
A comprehensive log grouping and application management system for SatLoc binary log files, built around the Job Worker pattern to organize application files and process their data.
## Overview
The SatLoc Application Processor provides:
- **Application Grouping**: Groups multiple log files under the same Application based on job ID and upload date
- **File Management**: Creates individual ApplicationFile records for each log with optimized metadata storage
- **Data Processing**: Extracts ApplicationDetail records with proper precision formatting and spray segment compression
- **Retry Logic**: Handles reprocessing of existing files with data reset capability
- **Accumulated Statistics**: Calculates totals for spray time, flight time, sprayed area, and material usage
- **Transaction Safety**: Uses MongoDB transactions for data integrity
## Architecture
```
Application (Job/Date Grouping)
├── ApplicationFile 1 (morning_001.log)
│   ├── meta: { flowControllerName, statistics, timeRange }
│   ├── data: [ spraySegments... ]
│   └── ApplicationDetails: [ detail1, detail2, ... ]
├── ApplicationFile 2 (morning_002.log)
│   ├── meta: { flowControllerName, statistics, timeRange }
│   ├── data: [ spraySegments... ]
│   └── ApplicationDetails: [ detail1, detail2, ... ]
└── Accumulated Fields: { totalSprayTime, totalSprayed, etc. }
```
## Key Features
### 1. Log Grouping Logic
Files are grouped under the same Application when they have:
- Same `jobId` (from context or SatLoc job records)
- Same `userId` (pilot/operator)
- Upload date within `groupingTolerance` (default: 24 hours)
```javascript
const processor = new SatLocApplicationProcessor({
  groupingTolerance: 24 * 60 * 60 * 1000 // 24 hours
});
```
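For illustration, the grouping check can be sketched as a simple predicate. `belongsToSameApplication` is a hypothetical helper (the real check lives inside the processor); its field names mirror the context object and the Application model described below:

```javascript
// Hypothetical sketch of the grouping rule described above; not the
// processor's actual internal function.
function belongsToSameApplication(existingApp, context, groupingTolerance) {
  const sameJob = existingApp.jobId === context.jobId;
  const sameUser = existingApp.byUser === context.userId;
  const timeDelta = Math.abs(
    existingApp.uploadedDate.getTime() - context.uploadedDate.getTime()
  );
  return sameJob && sameUser && timeDelta <= groupingTolerance;
}

const app = {
  jobId: 'job_123',
  byUser: 'pilot_456',
  uploadedDate: new Date('2022-01-15T08:00:00Z')
};
const sameDayUpload = {
  jobId: 'job_123',
  userId: 'pilot_456',
  uploadedDate: new Date('2022-01-15T14:30:00Z')
};

// 6.5 hours apart, same job and user → grouped together
console.log(belongsToSameApplication(app, sameDayUpload, 24 * 60 * 60 * 1000)); // true
```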
### 2. Optimized Metadata Storage
Flow controller names and other metadata are stored in `ApplicationFile.meta` to save database space:
```javascript
applicationFile.meta = {
  flowControllerName: "FlowController_A",
  satlocJobId: "JOB123",
  aircraftId: "DRONE001",
  pilotName: "John Pilot",
  parseStatistics: { /* ... */ },
  timeRange: { startDateTime, endDateTime }
};
```
### 3. Spray Segment Compression
Application details are compressed into spray segments stored in `ApplicationFile.data`:
```javascript
applicationFile.data = [
  {
    startTime: 1642234800,
    endTime: 1642234860,
    startLat: -34.123,
    startLon: 138.456,
    endLat: -34.124,
    endLon: 138.457,
    points: 60,
    avgRate: 12.5,
    avgSpeed: 15.2,
    swathWidth: 18.0,
    duration: 60
  }
  // ... more segments
];
```
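The compression idea can be sketched as follows, assuming per-point records with `gpsTime`, `lat`, `lon`, and `sprayStat` fields. This is a simplified illustration: the real processor also averages rate and speed and records swath width per segment.

```javascript
// Sketch: collapse consecutive spray-on (sprayStat === 1) points into
// segments of the shape shown above. Illustrative only.
function compressToSegments(points) {
  const segments = [];
  let current = null;
  for (const p of points) {
    if (p.sprayStat === 1) {
      if (!current) {
        // Boom just turned on: open a new segment
        current = { startTime: p.gpsTime, startLat: p.lat, startLon: p.lon, points: 0 };
      }
      current.endTime = p.gpsTime;
      current.endLat = p.lat;
      current.endLon = p.lon;
      current.points += 1;
    } else if (current) {
      // Boom turned off: close the open segment
      current.duration = current.endTime - current.startTime;
      segments.push(current);
      current = null;
    }
  }
  if (current) {
    current.duration = current.endTime - current.startTime;
    segments.push(current);
  }
  return segments;
}

const segments = compressToSegments([
  { gpsTime: 1642234800, lat: -34.1230, lon: 138.4560, sprayStat: 1 },
  { gpsTime: 1642234801, lat: -34.1231, lon: 138.4561, sprayStat: 1 },
  { gpsTime: 1642234802, lat: -34.1232, lon: 138.4562, sprayStat: 0 }
]);
console.log(segments.length); // 1
```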
### 4. Precision Formatting
All numeric values are formatted with appropriate precision using `utils.fixedTo()`:
- **Swath Width**: 1 decimal place
- **Application Rates**: 2 decimal places
- **Ground Speed**: 2 decimal places
- **Humidity**: 0 decimal places (whole numbers)
- **Heading**: 1 decimal place
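A minimal stand-in for `utils.fixedTo()` illustrates the rounding behavior; the actual helper's signature and rounding mode may differ:

```javascript
// Simplified stand-in for utils.fixedTo(): round to N decimal places and
// return a number. Assumption — the real helper may behave differently.
function fixedTo(value, decimals) {
  const factor = 10 ** decimals;
  return Math.round(value * factor) / factor;
}

fixedTo(18.04, 1);   // → 18   (swath width: 1 decimal place)
fixedTo(12.4567, 2); // → 12.46 (application rate: 2 decimal places)
fixedTo(15.219, 2);  // → 15.22 (ground speed: 2 decimal places)
fixedTo(63.7, 0);    // → 64   (humidity: whole number)
fixedTo(271.28, 1);  // → 271.3 (heading: 1 decimal place)
```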
### 5. Bit Flag Processing
Spray status (`sprayStat`) properly handles boom on/off using bit flags:
```javascript
if (position.isEnhanced) {
  // Enhanced position records (78 bytes)
  sprayStat = (position.boomControlStatus & 0x01) ? 1 : 0;
} else {
  // Short position records (43 bytes)
  sprayStat = (position.flags === 2) ? 1 : 0;
}
```
## Usage
### Basic Processing
```javascript
const SatLocApplicationProcessor = require('./helpers/satloc_application_processor');

const processor = new SatLocApplicationProcessor();
const result = await processor.processLogFile(
  { filePath: '/path/to/file.log' },
  {
    jobId: 'job_123',
    userId: 'pilot_456',
    uploadedDate: new Date()
  }
);

if (result.success) {
  console.log('Application ID:', result.application._id);
  console.log('File ID:', result.applicationFile._id);
  console.log('Details:', result.applicationDetails.length);
}
```
### Enhanced Parser Integration
```javascript
const SatLocLogParser = require('./helpers/satloc_log_parser');
const parser = new SatLocLogParser();
// Parse and process in one call
const result = await parser.parseAndProcessFile('/path/to/file.log', contextData);
// Retry existing file
const retryResult = await parser.retryParseAndProcessFile('/path/to/file.log', contextData);
```
### Multiple File Grouping
```javascript
const logFiles = [
  '/path/to/job123/morning_001.log',
  '/path/to/job123/morning_002.log',
  '/path/to/job123/morning_003.log'
];
const baseContext = {
  jobId: 'job_123',
  userId: 'pilot_456',
  uploadedDate: new Date()
};

// All files will be grouped under the same Application
for (const logFile of logFiles) {
  await processor.processLogFile({ filePath: logFile }, baseContext);
}
```
### Retry Processing
```javascript
// Reset existing data and reprocess
const retryResult = await processor.retryLogFile('/path/to/file.log', contextData);
```
## Configuration Options
```javascript
const processor = new SatLocApplicationProcessor({
  batchSize: 1000,                        // Batch size for ApplicationDetail inserts
  enableRetryLogic: true,                 // Enable retry functionality
  groupingTolerance: 24 * 60 * 60 * 1000, // Time tolerance for grouping (24 hours)
  validateChecksums: true                 // Validate record checksums
});
```
## Data Models
### Application
- `jobId`: External job identifier
- `fileName`: Virtual grouping file name (e.g., "satloc_logs.zip")
- `byUser`: User/pilot identifier
- `status`: Processing status (IN_PROGRESS, DONE)
- `totalSprayTime`: Accumulated spray time (seconds)
- `totalFlightTime`: Accumulated flight time (seconds)
- `totalSprayed`: Accumulated sprayed area (hectares)
- `totalSprayMat`: Accumulated spray material (liters)
- `meta.satlocJobId`: SatLoc job ID from log files
- `meta.logFileCount`: Number of log files grouped
### ApplicationFile
- `appId`: Reference to parent Application
- `name`: Original log file name
- `agn`: Generated AgNav identifier (timestamp-based)
- `meta`: Optimized metadata storage (flow controller, statistics, etc.)
- `data`: Compressed spray segments array
- `totalSprayTime`: File-specific spray time
- `totalFlightTime`: File-specific flight time
- `totalSprayed`: File-specific sprayed area
- `totalSprayMat`: File-specific spray material
### ApplicationDetail
- `fileId`: Reference to ApplicationFile (new field)
- `appId`: Reference to Application (legacy support)
- `gpsTime`: GPS timestamp
- `lat`, `lon`: GPS coordinates
- `grSpeed`: Ground speed (2 decimal places)
- `swath`: Swath width (1 decimal place)
- `lminApp`: Application rate (2 decimal places)
- `sprayStat`: Boom on/off status (1/0 from bit flags)
- Plus all other existing fields with proper precision
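The accumulated fields on the parent Application can be thought of as a roll-up of the per-file totals. This sketch uses the field names from the models above; the actual implementation may accumulate incrementally inside the transaction rather than summing after the fact:

```javascript
// Illustrative roll-up of ApplicationFile totals into the parent
// Application's accumulated fields.
function accumulateTotals(applicationFiles) {
  return applicationFiles.reduce(
    (acc, f) => ({
      totalSprayTime: acc.totalSprayTime + (f.totalSprayTime || 0),
      totalFlightTime: acc.totalFlightTime + (f.totalFlightTime || 0),
      totalSprayed: acc.totalSprayed + (f.totalSprayed || 0),
      totalSprayMat: acc.totalSprayMat + (f.totalSprayMat || 0)
    }),
    { totalSprayTime: 0, totalFlightTime: 0, totalSprayed: 0, totalSprayMat: 0 }
  );
}

const totals = accumulateTotals([
  { totalSprayTime: 600, totalFlightTime: 900, totalSprayed: 12.5, totalSprayMat: 150 },
  { totalSprayTime: 480, totalFlightTime: 700, totalSprayed: 9.5, totalSprayMat: 114 }
]);
console.log(totals.totalSprayed); // 22
```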
## Testing
Run the comprehensive test suite:
```bash
node tests/test_satloc_application_processor.js
```
This tests:
- ✅ Application/ApplicationFile creation with proper grouping
- ✅ ApplicationDetail batch processing with spray segments
- ✅ Accumulated field calculations
- ✅ Retry logic with data reset
- ✅ Enhanced parser integration
- ✅ Multiple file grouping under same application
- ✅ Metadata optimization and spray segment extraction
## Performance Considerations
1. **Batch Processing**: ApplicationDetails are inserted in configurable batches (default: 1000)
2. **Transaction Safety**: All operations use MongoDB transactions for consistency
3. **Memory Efficiency**: Large files are processed in chunks to avoid memory issues
4. **Index Optimization**: Proper indexing on `appId`, `fileId`, and `gpsTime` fields
5. **Metadata Compression**: Flow controller names stored in meta fields vs repeated in every detail
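The batching in point 1 can be sketched as follows; `insertInBatches` and `insertFn` are placeholders for illustration, standing in for the processor's actual persistence call (e.g. an `insertMany` with the transaction session):

```javascript
// Sketch of batched inserts using the configurable batchSize (default 1000).
// insertFn is a placeholder for the real call, e.g.
// ApplicationDetail.insertMany(batch, { session }).
async function insertInBatches(details, insertFn, batchSize = 1000) {
  let inserted = 0;
  for (let i = 0; i < details.length; i += batchSize) {
    const batch = details.slice(i, i + batchSize);
    await insertFn(batch);
    inserted += batch.length;
  }
  return inserted;
}

// 2,500 details → 3 batches (1000, 1000, 500)
insertInBatches(Array.from({ length: 2500 }, (_, i) => ({ seq: i })), async () => {})
  .then((count) => console.log(count)); // 2500
```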
## Migration from Legacy System
The new system maintains backward compatibility:
- Existing `appId` fields in ApplicationDetail are preserved
- New `fileId` fields link details to specific log files
- Legacy applications continue to work unchanged
- Gradual migration to new grouping system possible
## Error Handling
- **Parse Errors**: Continue processing with error statistics
- **Transaction Failures**: Full rollback with detailed error reporting
- **Retry Logic**: Automatic retry with configurable backoff
- **Checksum Validation**: Optional validation with error tracking
- **Memory Management**: Chunked processing for large files
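A retry-with-backoff wrapper of roughly this shape could sit around the processing call; this is a generic sketch, and the processor's actual backoff parameters and retry entry point are not documented here:

```javascript
// Hypothetical retry wrapper with exponential backoff (500ms, 1000ms, 2000ms, ...).
async function withRetry(fn, { attempts = 3, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Wait before the next attempt, doubling the delay each time
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
    }
  }
  throw lastError;
}

// Usage (sketch): wrap the processing call
// const result = await withRetry(() => processor.processLogFile(file, context));
```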
## Monitoring and Debugging
Enable debug logging:
```bash
DEBUG=agm:satloc-processor,agm:satloc-parser node your_script.js
```
This provides detailed logs for:
- Application grouping decisions
- File processing progress
- Spray segment extraction
- Performance metrics
- Error details and retry attempts
## Future Enhancements
1. **Real-time Processing**: WebSocket support for live log streaming
2. **Data Validation**: Enhanced validation rules for application data
3. **Analytics Integration**: Built-in analytics and reporting capabilities
4. **Cloud Storage**: Support for cloud-based log file storage
5. **Parallel Processing**: Multi-threaded processing for large datasets