Transcript Import

Canonical Model

Commands

  1. Audit current repository transcript integrity:
    • ./bin/transcripts audit
  2. Build ID-suffixed staging files (recommended for ambiguous filenames):
    • ./bin/transcripts prepare --source-dir /Volumes/Dock_1TB/vimeo/outbox --output-dir tmp/transcript-id-staging --min-confidence 0.8 --clean-output
  3. Run import in dry-run mode:
    • ./bin/transcripts dry-run --source-dir tmp/transcript-id-staging --min-confidence 0.9
  4. Review output reports:
    • tmp/transcript-import-report.json
    • tmp/transcript-import-report.md
  5. Apply high-confidence mappings:
    • ./bin/transcripts ingest --source-dir tmp/transcript-id-staging --min-confidence 0.9
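
The sequence above can be sketched as two shell functions, split at the manual review in step 4. This is a minimal illustration, not part of the tool: the function names stage_and_dry_run and apply_staged and the TRANSCRIPTS_BIN override variable are hypothetical. Only the subcommands and flags are taken from the steps listed above.

```shell
# Hypothetical wrapper functions; only the ./bin/transcripts subcommands
# and flags come from the steps above.
TRANSCRIPTS_BIN="${TRANSCRIPTS_BIN:-./bin/transcripts}"

# Steps 1-3: audit, build staging files, then dry-run against staging.
# Returns non-zero at the first command that fails.
stage_and_dry_run() {
  source_dir="${1:?usage: stage_and_dry_run SOURCE_DIR [STAGING_DIR]}"
  staging_dir="${2:-tmp/transcript-id-staging}"

  "$TRANSCRIPTS_BIN" audit || return 1
  "$TRANSCRIPTS_BIN" prepare --source-dir "$source_dir" \
    --output-dir "$staging_dir" --min-confidence 0.8 --clean-output || return 1
  "$TRANSCRIPTS_BIN" dry-run --source-dir "$staging_dir" --min-confidence 0.9 || return 1
}

# Step 5: ingest from staging. Meant to be run only after the reports
# from step 4 have been reviewed by hand.
apply_staged() {
  staging_dir="${1:-tmp/transcript-id-staging}"
  "$TRANSCRIPTS_BIN" ingest --source-dir "$staging_dir" --min-confidence 0.9
}
```

With these definitions loaded, one would run stage_and_dry_run /Volumes/Dock_1TB/vimeo/outbox, read the two report files, and then run apply_staged. Keeping the ingest in a separate function leaves the review step as a deliberate pause rather than something a script skips past.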

Direct Import Mode

If filenames already include explicit IDs and do not need staging:

Report Files

Legacy sequence (kept for reference)

  1. Run import in dry-run mode:
    • ./bin/transcripts dry-run --source-dir /Volumes/Dock_1TB/vimeo/outbox --min-confidence 0.9
  2. Review output reports:
    • tmp/transcript-import-report.json
    • tmp/transcript-import-report.md
  3. Apply high-confidence mappings:
    • ./bin/transcripts ingest --source-dir /Volumes/Dock_1TB/vimeo/outbox --min-confidence 0.9
  4. Re-run pipeline validation:
    • ./bin/transcripts validate

One-Command Batch Mode

Notes