A large number of interviews are still pending transcription. We will use the local system’s ztranscribe capability (alias for ~/.config/zsh/recipes/yt-transcribe) to download and transcribe these videos directly from YouTube.
The plan:
1. Find video_assets missing a transcript_id that have a valid youtube platform ID.
2. Run the yt-transcribe pipeline: ~/.config/zsh/recipes/yt-transcribe https://www.youtube.com/watch?v=<id>.
3. Collect the resulting .txt transcripts from ~/Downloads/transcripts/ into a staging area and use the project's ./bin/transcripts pipeline to ingest them into _data/transcripts/.
4. Run the transcript-conversational-audit skill (using rake audit:prepare[slug] and rake audit:ingest[slug]) to clean the transcript, separate speakers, and generate the durable insights and SEO metadata.
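Step 2 above can be sketched as a small Ruby driver. The `transcribe_command` helper and the `pending_video_ids` list are hypothetical illustrations; only the recipe path and URL shape come from the notes.

```ruby
# Path to the yt-transcribe recipe mentioned in the notes.
YT_TRANSCRIBE = File.expand_path("~/.config/zsh/recipes/yt-transcribe")

# Build the command line for one YouTube platform ID.
# (Helper name is a hypothetical illustration, not the project's API.)
def transcribe_command(youtube_id)
  [YT_TRANSCRIBE, "https://www.youtube.com/watch?v=#{youtube_id}"]
end

# Placeholder IDs; in the project these would come from the video_assets
# records that are missing a transcript_id.
pending_video_ids = %w[abc123def45]

pending_video_ids.each do |id|
  cmd = transcribe_command(id)
  puts "would run: #{cmd.join(' ')}"
  # system(*cmd)  # uncomment to actually download and transcribe
end
```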
Revised plan, run asynchronously:
1. Find video_assets missing a transcript_id with a valid youtube platform ID.
2. Run zdots-ctx enqueue transcription '{"url": "https://www.youtube.com/watch?v=<id>"}' to queue the download and transcription asynchronously.
3. Run zdots-ctx worker in a background process or terminal pane to process the queued transcriptions without blocking the main workflow.
4. Collect finished .txt files from ~/Downloads/transcripts/, stage them in tmp/transcript-id-staging/ using their video_asset_id, and run ./bin/transcripts ingest.
5. Finally, run the transcript-conversational-audit skill to generate insights and metadata.
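The enqueue step might be driven from Ruby roughly as follows. The `{"url": ...}` payload shape and the zdots-ctx subcommands are from the notes; the `enqueue_command` helper name is an assumption for illustration.

```ruby
require "json"

# Build the zdots-ctx enqueue invocation for one YouTube platform ID.
# (Hypothetical helper; payload shape matches the notes above.)
def enqueue_command(youtube_id)
  payload = JSON.generate(url: "https://www.youtube.com/watch?v=#{youtube_id}")
  ["zdots-ctx", "enqueue", "transcription", payload]
end

cmd = enqueue_command("abc123def45") # placeholder ID
puts cmd.join(" ")
# system(*cmd)  # queue it for the background `zdots-ctx worker` to pick up
```

Passing the command as an array to `system` avoids shell quoting problems with the JSON payload.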
Background worker is currently processing the queue of 41 videos.
Wrote a script bin/stage_completed_transcripts.rb to automatically stage any finished .txt files from the worker.
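A minimal sketch of what such a staging script could look like, assuming the worker drops .txt files into ~/Downloads/transcripts/ as described above. The `mapping` hash (download basename → video_asset_id) is an assumption; the real bin/stage_completed_transcripts.rb presumably derives it from the project's video_assets data.

```ruby
require "fileutils"

DOWNLOADS = File.expand_path("~/Downloads/transcripts")
STAGING   = "tmp/transcript-id-staging"

# Map a downloaded transcript filename to its staged path, renamed to the
# video_asset_id so ./bin/transcripts ingest can associate it.
def staged_path(filename, video_asset_id, staging_dir = STAGING)
  File.join(staging_dir, "#{video_asset_id}#{File.extname(filename)}")
end

# Copy every finished .txt the worker has produced into the staging area.
# `mapping` is a hypothetical hash of download basename => video_asset_id.
def stage_completed!(mapping)
  FileUtils.mkdir_p(STAGING)
  Dir.glob(File.join(DOWNLOADS, "*.txt")).each do |path|
    asset_id = mapping[File.basename(path, ".txt")]
    next unless asset_id # skip files not yet mapped to a video_asset
    FileUtils.cp(path, staged_path(path, asset_id))
  end
end
```

After staging, `./bin/transcripts ingest` picks the files up from tmp/transcript-id-staging/.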
Successfully staged, ingested, and performed a canonical audit on the first completed transcript (david-heinemeier-hansson-dhh-railsconf-2014).
The worker will continue processing the remaining ~40 videos in the background over the next few hours. Once it finishes, the staging and ingestion scripts can be rerun to process the rest of the batch.