# Cursor-Based Pagination Guide
## Overview
This guide explains how to implement cursor-based pagination (Stripe API style) in your endpoints to efficiently handle large datasets without Chrome DevTools cache eviction issues.
## Configuration
Set environment variables to configure pagination limits:
```bash
# Default number of records per page (default: 1000)
PAGINATION_DEFAULT_LIMIT=1000
# Maximum allowed records per page (default: 10000)
PAGINATION_MAX_LIMIT=10000
```
## Why Cursor-Based Pagination?
### Problems with Offset-Based (`skip/limit`)
- **Performance degrades** with deep pagination: `skip(20000)` scans and discards 20,000 documents
- **Inefficient** for large datasets (millions of records)
- **Not scalable** as dataset grows
### Benefits of Cursor-Based
- **Constant-time performance** regardless of page depth
- **Uses MongoDB indexes** efficiently (the `_id` index)
- **Prevents Chrome DevTools** cache eviction (responses stay under 6MB)
- **Compatible with Stripe API** patterns (industry standard)
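Conceptually, a cursor query replaces `skip` with an indexed range filter: instead of discarding the first N documents, MongoDB seeks directly to the last-seen record via the index. A minimal sketch of the core idea (`buildForwardFilter` is a hypothetical illustration, not part of the helper):

```javascript
// Hypothetical illustration of cursor pagination's core idea:
// instead of skipping N documents, filter on the indexed cursor field.
function buildForwardFilter(baseFilter, startingAfter) {
  const filter = { ...baseFilter };
  if (startingAfter) {
    // MongoDB can seek straight to this point via the _id index
    filter._id = { $gt: startingAfter };
  }
  return filter;
}
```

The first page sends no cursor, so the base filter is used as-is; every subsequent page adds `{ _id: { $gt: <last seen _id> } }`, which is why performance stays constant at any page depth.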
## Quick Start
### 1. Import the Helper
```javascript
const { paginateWithCursor, validateCursorParams } = require('../helpers/cursor_pagination');
```
### 2. Basic Usage
```javascript
async function myEndpoint_post(req, res) {
  const params = req.body;

  // Validate pagination parameters
  const validation = validateCursorParams(params);
  if (!validation.valid) {
    return res.status(400).json({ error: validation.error });
  }

  // Apply cursor-based pagination
  const result = await paginateWithCursor(
    MyModel,              // Mongoose model
    params,               // Request params with limit, startingAfter, endingBefore
    { status: 'active' }, // Base filter (optional)
    {
      defaultLimit: 1000, // Default records per page
      maxLimit: 10000,    // Maximum allowed limit
      cursorField: '_id'  // Field to use for cursor (must be indexed)
    }
  );

  res.json(result);
}
```
### 3. Response Format
```json
{
  "data": [...],                               // Array of records
  "hasMore": true,                             // Whether more records exist
  "startingAfter": "507f1f77bcf86cd799439011", // Cursor for the next page
  "endingBefore": "507f191e810c19729de860ea"   // Cursor for the previous page
}
```
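A client can walk the entire dataset by feeding each response's `startingAfter` cursor back into the next request until `hasMore` is `false`. A minimal sketch (the `fetchPage` callback is a stand-in for your actual HTTP call, not part of the helper):

```javascript
// Hypothetical client-side loop. fetchPage is any function that takes
// { limit, startingAfter } and resolves to { data, hasMore, startingAfter }.
async function fetchAllPages(fetchPage, limit = 1000) {
  const all = [];
  let startingAfter = null;
  let hasMore = true;

  while (hasMore) {
    const page = await fetchPage({ limit, startingAfter });
    all.push(...page.data);
    hasMore = page.hasMore;
    startingAfter = page.startingAfter; // cursor for the next request
  }

  return all;
}
```

In practice `fetchPage` would POST to your endpoint with `fetch` or `axios`; the loop itself is transport-agnostic.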
## Client Usage Examples
### First Page Request
```javascript
POST /api/endpoint
{
  "limit": 1000
}
```
### Next Page Request
```javascript
POST /api/endpoint
{
  "limit": 1000,
  "startingAfter": "507f1f77bcf86cd799439011" // From previous response
}
```
### Previous Page Request
```javascript
POST /api/endpoint
{
  "limit": 1000,
  "endingBefore": "507f191e810c19729de860ea" // From previous response
}
```
### Return All Records (No Pagination)
```javascript
// Option 1: Use the returnAll flag
POST /api/endpoint
{
  "returnAll": true
}

// Option 2: Use limit: -1
POST /api/endpoint
{
  "limit": -1
}

// Option 3: Use limit: 0
POST /api/endpoint
{
  "limit": 0
}
```
**⚠️ Warning:** Use "return all" carefully! For large datasets (>10,000 records), it can:
- Cause Chrome DevTools cache eviction
- Increase server memory usage
- Slow down response times
- Time out on slow networks

**Best for:** Small datasets (<5,000 records) or complete data exports.
## Advanced Usage
### Using Custom Cursor Field
For timestamp-based pagination:
```javascript
const result = await paginateWithCursor(
  MyModel,
  params,
  { userId: req.userInfo.uid },
  {
    defaultLimit: 100,
    cursorField: 'createdAt' // Must have an index on createdAt
  }
);
```
**Important:** A non-unique cursor field like `createdAt` needs `_id` as a tie-breaker to keep the ordering deterministic, so create a compound index:
```javascript
// In your model file
MyModelSchema.index({ createdAt: 1, _id: 1 });
```
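With a non-unique cursor field, the "everything after this record" condition needs two clauses: strictly later timestamps, or equal timestamps with a later `_id`. A sketch of that condition (`buildCompoundCursorFilter` is a hypothetical illustration; the helper's internals may differ):

```javascript
// Hypothetical illustration of an "after (createdAt, _id)" filter.
// cursor holds the createdAt and _id of the last record on the previous page.
function buildCompoundCursorFilter(cursor) {
  return {
    $or: [
      // Strictly newer timestamps
      { createdAt: { $gt: cursor.createdAt } },
      // Same timestamp: break the tie with _id
      { createdAt: cursor.createdAt, _id: { $gt: cursor._id } }
    ]
  };
}
```

This `$or` shape is exactly what the compound index `{ createdAt: 1, _id: 1 }` serves efficiently.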
### Disable "Return All" Feature
For security or performance reasons, you can disable the ability to return all records:
```javascript
const result = await paginateWithCursor(
  MyModel,
  params,
  { status: 'active' },
  {
    defaultLimit: 100,
    allowReturnAll: false // Rejects limit: -1 and returnAll: true
  }
);
```
### Manual Query Building
For more control:
```javascript
const { buildCursorQuery, processCursorResults } = require('../helpers/cursor_pagination');

async function customEndpoint_post(req, res) {
  const params = req.body;

  // Build query configuration
  const queryConfig = buildCursorQuery(
    params,
    { status: 'pending' },
    { defaultLimit: 500 }
  );

  // Execute a custom query with additional options
  const records = await MyModel
    .find(queryConfig.filter)
    .select('name status createdAt')
    .populate('user')
    .sort(queryConfig.options.sort)
    .limit(queryConfig.options.limit)
    .lean();

  // Process results into the standard response shape
  const result = processCursorResults(
    records,
    queryConfig.limit,
    queryConfig.isBackward
  );

  res.json(result);
}
```
### Express Middleware
Validate pagination parameters globally:
```javascript
const { cursorPaginationMiddleware } = require('../helpers/cursor_pagination');
router.post('/api/jobs',
cursorPaginationMiddleware(),
jobController.getJobs_post
);
```
## Real-World Example: Multiple Filters
```javascript
async function getJobs_post(req, res) {
  const { status, clientId, dateRange } = req.body;

  // Validate pagination parameters (limit, startingAfter, endingBefore)
  const validation = validateCursorParams(req.body);
  if (!validation.valid) {
    return res.status(400).json({ error: validation.error });
  }

  // Build the base filter from business parameters
  const baseFilter = {};
  if (status) baseFilter.status = status;
  if (clientId) baseFilter.clientId = clientId;
  if (dateRange) {
    baseFilter.createdAt = {
      $gte: new Date(dateRange.start),
      $lte: new Date(dateRange.end)
    };
  }

  const result = await paginateWithCursor(
    Job,
    req.body, // the helper reads pagination params from the body
    baseFilter,
    { defaultLimit: 50, maxLimit: 500 }
  );

  res.json(result);
}
```
## Migration from Skip/Limit
### Before (Offset-Based)
```javascript
const limit = parseInt(params.limit) || 1000;
const skip = parseInt(params.skip) || 0;

const records = await MyModel.find(filter)
  .skip(skip)
  .limit(limit)
  .lean();

const total = await MyModel.countDocuments(filter);

res.json({
  data: records,
  total,
  skip,
  limit,
  hasMore: (skip + limit) < total
});
```
### After (Cursor-Based)
```javascript
const result = await paginateWithCursor(
  MyModel,
  params,
  filter,
  { defaultLimit: 1000 }
);
res.json(result);
```
## Performance Comparison
| Dataset Size | Skip/Limit (page 100) | Cursor-Based (page 100) |
|--------------|-----------------------|-------------------------|
| 10K records  | ~50ms                 | ~5ms                    |
| 100K records | ~500ms                | ~5ms                    |
| 1M records   | ~5s                   | ~5ms                    |
| 10M records  | ~50s                  | ~5ms                    |
## Best Practices
1. **Always use indexed fields** for cursors (default `_id` is always indexed)
2. **Set reasonable default limits** via environment variables (`PAGINATION_DEFAULT_LIMIT`, `PAGINATION_MAX_LIMIT`)
3. **Don't expose total counts** for large datasets (use `hasMore` instead)
4. **Validate cursor parameters** before processing
5. **Handle errors gracefully** with clear error messages
## Troubleshooting
### Error: "Cannot use both startingAfter and endingBefore"
Client sent both cursors. Only one is allowed at a time.
### Error: "Invalid cursor: must be a valid ObjectId"
The cursor value is not a valid MongoDB ObjectId. Ensure you're passing the exact value from a previous response.
### Empty results and `hasMore: false`
You've reached the end of the dataset. To restart from the beginning, omit both cursors.
## API Reference
See `helpers/cursor_pagination.js` for complete function signatures and options.