# Background Jobs
TimeTiles uses Payload CMS's built-in job queue and workflow system for all asynchronous processing. Jobs and workflows are defined in `lib/jobs/` and registered in `lib/config/payload-shared-config.ts`.
## Key Behavior

- Auto-deletion: Completed jobs are automatically deleted (`deleteJobOnComplete: true`)
- Workflow-based orchestration: The ingestion pipeline uses 4 Payload Workflows, queued by collection `afterChange` hooks
- 3-queue architecture: `ingest` (user-facing workflows), `default` (trigger jobs), `maintenance` (scheduled system jobs)
- Production: 1 Docker worker container per queue via `pnpm payload jobs:run --cron --queue <name>`
- Development: `autoRun` processes all queues within the Next.js process
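The behavior above maps onto the `jobs` section of the Payload config. A minimal sketch, assuming the Payload 3.x jobs config shape; the import paths and helper names (`ingestTasks`, `ingestWorkflows`, `systemJobs`) are illustrative, not the project's actual exports:

```typescript
// Hypothetical excerpt of lib/config/payload-shared-config.ts (names illustrative)
import { buildConfig } from "payload";
import { ingestTasks, ingestWorkflows, systemJobs } from "../jobs"; // hypothetical exports

export default buildConfig({
  // ...collections, db adapter, etc.
  jobs: {
    // Completed job records are removed rather than kept around
    deleteJobOnComplete: true,
    tasks: [...ingestTasks, ...systemJobs],
    workflows: ingestWorkflows,
    // Development only: poll every queue inside the Next.js process
    autoRun:
      process.env.NODE_ENV === "development"
        ? [
            { queue: "ingest", cron: "* * * * *" },
            { queue: "default", cron: "* * * * *" },
            { queue: "maintenance", cron: "* * * * *" },
          ]
        : undefined,
  },
});
```

In production `autoRun` stays off and each queue is drained by its own worker container running `pnpm payload jobs:run --cron --queue <name>`.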
## Ingest Workflows
The ingestion pipeline is orchestrated by 4 Payload Workflows. Each workflow sequences multiple task handlers into a linear pipeline. See Data Ingestion Pipeline for detailed stage documentation.
| Workflow | Trigger | Pipeline |
|---|---|---|
| `manual-ingest` | `ingest-files` `afterChange` hook | dataset-detection, then per-sheet: analyze, detect-schema, validate, create-schema-version, geocode, create-events |
| `scheduled-ingest` | `schedule-manager` job | url-fetch, dataset-detection, then per-sheet pipeline |
| `scraper-ingest` | `schedule-manager` job | scraper-execution, dataset-detection, then per-sheet pipeline |
| `ingest-process` | `ingest-jobs` `afterChange` hook (NEEDS_REVIEW approval) | create-schema-version, geocode, create-events |
All ingest workflows run on the `ingest` queue with per-resource concurrency keys (e.g., `file:{id}`, `sched:{id}`).
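A collection `afterChange` hook that queues one of these workflows might look like the following; a sketch assuming the Payload 3.x `payload.jobs.queue` API, with illustrative field names:

```typescript
// Hypothetical hook for the ingest-files collection (names illustrative)
import type { CollectionAfterChangeHook } from "payload";

export const queueManualIngest: CollectionAfterChangeHook = async ({
  doc,
  operation,
  req,
}) => {
  // Only newly uploaded files should start the pipeline
  if (operation !== "create") return doc;

  await req.payload.jobs.queue({
    workflow: "manual-ingest",
    queue: "ingest",
    input: { fileId: doc.id },
  });

  return doc;
};
```

Queueing from the hook (rather than running the pipeline inline) keeps the request fast and lets the `ingest` worker pick the work up asynchronously.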
## Error Model

- Throw for transient failures: Payload retries the task
- Return `{ needsReview: true }` for human review: the pipeline pauses for that sheet
- Return data for success: the pipeline continues to the next task
- Multi-sheet files use `Promise.allSettled` with per-sheet try/catch; individual sheet failures do not block other sheets
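The per-sheet isolation can be seen in a self-contained simulation (no Payload APIs; sheet names and the three result states are illustrative): each sheet either completes, pauses for review, or fails, and `Promise.allSettled` plus a per-sheet try/catch keeps one sheet's failure from blocking the others.

```typescript
// Simulated per-sheet processing with the three outcomes from the error model
type SheetResult =
  | { status: "completed"; sheet: string }
  | { status: "needsReview"; sheet: string }
  | { status: "failed"; sheet: string; error: string };

async function processSheet(sheet: string): Promise<SheetResult> {
  try {
    if (sheet === "broken") throw new Error("transient parse failure");
    if (sheet === "ambiguous") return { status: "needsReview", sheet };
    return { status: "completed", sheet };
  } catch (err) {
    // Per-sheet try/catch: a failure becomes a result, not a rejection
    return { status: "failed", sheet, error: (err as Error).message };
  }
}

async function processAllSheets(sheets: string[]): Promise<SheetResult[]> {
  const settled = await Promise.allSettled(sheets.map(processSheet));
  // processSheet never rejects, so every entry is fulfilled
  return settled.map((s) =>
    s.status === "fulfilled"
      ? s.value
      : { status: "failed", sheet: "unknown", error: String(s.reason) },
  );
}

processAllSheets(["orders", "broken", "ambiguous"]).then((results) => {
  console.log(results.map((r) => r.status).join(","));
  // completed,failed,needsReview
});
```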
## Ingest Task Handlers
These task handlers are composed by the workflows above. They are not queued individually.
| Task | Purpose |
|---|---|
| `dataset-detection` | Parse file, create ingestion jobs per sheet |
| `analyze-duplicates` | Find internal/external duplicate rows |
| `schema-detection` | Build progressive JSON Schema from data |
| `validate-schema` | Compare detected vs existing schema |
| `create-schema-version` | Persist approved schema version |
| `geocode-batch` | Geocode unique locations via providers |
| `create-events-batch` | Create event records in database |
| `url-fetch` | Download file from URL for scheduled ingest |
| `scraper-execution` | Run scraper in Podman container |
## System Jobs

System jobs use Payload's native `schedule` property for cron-based scheduling.
| Job | Queue | Schedule |
|---|---|---|
| `schedule-manager` | `default` | Every minute |
| `quota-reset` | `maintenance` | Daily at midnight |
| `cache-cleanup` | `maintenance` | Every 6 hours |
| `schema-maintenance` | `maintenance` | Daily at 3:00 AM |
| `audit-log-ip-cleanup` | `maintenance` | Daily at 4:00 AM |
| `execute-account-deletion` | `maintenance` | Daily at 2:00 AM |
| `data-export-cleanup` | `maintenance` | Hourly |
| `cleanup-stuck-scheduled-ingests` | `maintenance` | Hourly |
| `cleanup-stuck-scrapers` | `maintenance` | Hourly |
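A scheduled system job pairs a cron expression with a queue directly on the task definition. A sketch, assuming the Payload 3.x `TaskConfig` shape and its `schedule` property; the handler body is illustrative:

```typescript
// Hypothetical cache-cleanup task using Payload's native schedule property
import type { TaskConfig } from "payload";

export const cacheCleanup: TaskConfig<"cache-cleanup"> = {
  slug: "cache-cleanup",
  // Run every 6 hours on the maintenance queue
  schedule: [{ cron: "0 */6 * * *", queue: "maintenance" }],
  handler: async ({ req }) => {
    // ...delete expired cache entries via req.payload here...
    return { output: {} };
  },
};
```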
## Standalone Task Jobs
These are queued on demand (not scheduled):
| Job | Purpose | Trigger |
|---|---|---|
| `scraper-repo-sync` | Sync scraper manifest from Git repo | Admin action |
| `data-export` | Generate ZIP archive of user data | User request |
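Queueing one of these on demand is a single local-API call. A sketch, assuming the Payload 3.x `payload.jobs.queue` API; the helper name, input fields, and `@payload-config` alias are illustrative:

```typescript
// Hypothetical server-side helper that queues a data export for a user
import { getPayload } from "payload";
import config from "@payload-config"; // path alias is an assumption

export async function requestDataExport(userId: string): Promise<void> {
  const payload = await getPayload({ config });
  await payload.jobs.queue({
    task: "data-export",
    input: { userId },
  });
}
```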
## Adding a New Job

- Create the handler in `lib/jobs/handlers/my-job.ts`
- Export the job config from `lib/jobs/ingest-jobs.ts`
- Add it to the `ALL_JOBS` array in `lib/config/payload-shared-config.ts`
- If it is a workflow task, add it to the appropriate workflow in `lib/jobs/workflows/`
- Create a migration if the job needs new fields
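A handler skeleton for the first step might look like this; a sketch assuming the Payload 3.x `TaskConfig` shape, with a hypothetical slug, input field, and retry count:

```typescript
// Hypothetical lib/jobs/handlers/my-job.ts
import type { TaskConfig } from "payload";

export const myJob: TaskConfig<"my-job"> = {
  slug: "my-job",
  inputSchema: [{ name: "targetId", type: "text", required: true }],
  retries: 2, // transient throws are retried before the job is marked failed
  handler: async ({ input, req }) => {
    // Throw for transient failures; return output for success (see Error Model)
    req.payload.logger.info(`my-job running for ${input.targetId}`);
    return { output: { processed: true } };
  },
};
```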
## Testing Jobs
See Integration Testing Patterns for job testing. Key points:
- Query pending jobs before running (`completedAt: { exists: false }`)
- Verify side effects after running, not job records (they are deleted)
- Use `describe.sequential()` for tests that interact with the job queue
- Use a drain loop with `payload.jobs.run()` to process chained workflow tasks in tests
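The drain loop can be a small test helper: run the queue, count jobs that have not completed, and repeat until none remain. A sketch, assuming `payload.jobs.run()`, Payload's default `payload-jobs` collection slug, and the local `count` API; the pass limit is an arbitrary safety cap:

```typescript
// Hypothetical test helper: drain chained workflow tasks until the queue is empty
import type { Payload } from "payload";

export const drainJobs = async (payload: Payload, maxPasses = 20): Promise<void> => {
  for (let pass = 0; pass < maxPasses; pass++) {
    // Each pass may queue follow-up tasks, so keep running until nothing is pending
    await payload.jobs.run({ queue: "ingest" });
    const pending = await payload.count({
      collection: "payload-jobs",
      where: { completedAt: { exists: false } },
    });
    if (pending.totalDocs === 0) return;
  }
  throw new Error("jobs did not drain within the pass limit");
};
```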