Interview Data Model
This document describes the data model for the interview/video content on the site.
Data Flow Overview
Rendered Diagram
flowchart TB
subgraph Canonical["Canonical Data"]
interviews[_data/interviews.yml<br/>Interview records]
assets[_data/video_assets.yml<br/>Canonical video assets]
transcripts[_data/transcripts/*.yml<br/>Transcript entries referenced by video_assets]
end
subgraph Buckets["Video Buckets"]
oneoffs[_data/oneoff_videos.yml<br/>One-off video list]
scmc[_data/scmc_videos.yml<br/>SCMC video list]
end
subgraph Lookups["Lookup Tables"]
confs[_data/interview_conferences.yml<br/>Conference definitions]
comms[_data/interview_communities.yml<br/>Community definitions]
resources[_data/resources.yml<br/>Trusted source registry]
end
interviews <-->|video_asset_id / interview_id| assets
oneoffs -->|video_asset_id| assets
scmc -->|video_asset_id| assets
interviews -->|conference field| confs
interviews -->|community field| comms
confs -->|slug| resources
assets -.->|transcript_id| transcripts
Diagram Source
flowchart TB
subgraph Canonical["Canonical Data"]
interviews[_data/interviews.yml<br/>Interview records]
assets[_data/video_assets.yml<br/>Canonical video assets]
transcripts[_data/transcripts/*.yml<br/>Transcript entries referenced by video_assets]
end
subgraph Buckets["Video Buckets"]
oneoffs[_data/oneoff_videos.yml<br/>One-off video list]
scmc[_data/scmc_videos.yml<br/>SCMC video list]
end
subgraph Lookups["Lookup Tables"]
confs[_data/interview_conferences.yml<br/>Conference definitions]
comms[_data/interview_communities.yml<br/>Community definitions]
resources[_data/resources.yml<br/>Trusted source registry]
end
interviews <-->|video_asset_id / interview_id| assets
oneoffs -->|video_asset_id| assets
scmc -->|video_asset_id| assets
interviews -->|conference field| confs
interviews -->|community field| comms
confs -->|slug| resources
assets -.->|transcript_id| transcripts
File Descriptions
Canonical Data Files
_data/interviews.yml
The authoritative source for interview records. Each interview contains:
items:
  - id: string  # Unique identifier (slug)
    title: string  # Display title
    interviewees: [string]  # List of interviewee names
    interviewer: string  # Usually "Mike Hall"
    topic: string  # Interview topic (optional)
    conference: string  # Conference series (matched against interview_conferences.yml conference)
    conference_year: int  # Year of conference (optional; matched against interview_conferences.yml year)
    community: string  # Community name (matched against interview_communities.yml name)
    recorded_date: date  # When the interview was recorded
    tags: [string]  # Categorization tags
    video_asset_id: string  # References video_assets.yml id
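Because id is described as a unique identifier, a duplicate check is a natural sanity test for this file. The following is a minimal sketch, assuming the YAML has already been loaded into a list of dicts shaped like the items above; the function name and the sample records are hypothetical.

```python
from collections import Counter


def duplicate_ids(items):
    """Return the ids that appear more than once, sorted for stable output."""
    counts = Counter(item["id"] for item in items)
    return sorted(id_ for id_, n in counts.items() if n > 1)


# Hypothetical records, for illustration only.
sample_interviews = [
    {"id": "jane-doe-example-conf", "title": "Jane Doe on Testing"},
    {"id": "john-roe-example-conf", "title": "John Roe on Refactoring"},
    {"id": "jane-doe-example-conf", "title": "Jane Doe on Testing (copy)"},
]

print(duplicate_ids(sample_interviews))
```

The same helper would apply unchanged to video_assets.yml, where id is the primary key.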
_data/video_assets.yml
Canonical video asset records. The id is the primary key and permalink for a video, and each record stores one or more platform publications in platforms[]. An asset can link back to an interview through the optional interview_id field:
items:
  - id: string  # Primary key / permalink
    interview_id: string  # Links to interviews.yml id (optional)
    title: string  # Canonical title
    primary_platform: string  # Preferred platform for defaults (optional)
    source: string  # Source identifier (optional, e.g., "ugtastic")
    published_date: date  # Primary published date (optional)
    thumbnail: string  # Primary thumbnail URL (optional)
    thumbnail_local: string  # Local thumbnail path (optional)
    duration_seconds: int  # Video length (optional)
    duration_minutes: int  # Video length (rounded, optional)
    description: string  # Canonical description (optional)
    tags: [string]  # Tags
    transcript_id: string  # References _data/transcripts/<transcript_id>.yml (optional)
    platforms:  # Per-platform publications
      - platform: string  # "vimeo", "youtube", "pechakucha", etc.
        asset_id: string  # Platform-specific video ID
        url: string  # Direct URL to video (optional)
        embed_url: string  # Embeddable player URL (optional)
        title_on_platform: string  # Title as shown on the platform
        published_date: date  # Publication date on platform (optional)
        thumbnail: string  # Platform thumbnail URL (optional)
        thumbnail_local: string  # Local thumbnail path (optional)
        duration_seconds: int  # Video length (optional)
        duration_minutes: int  # Video length (rounded, optional)
        description: string  # Platform description (optional)
        source: string  # Platform source identifier (optional)
        playlist: string  # Playlist name/label (optional)
        video_url: string  # Direct video file URL (PechaKucha)
        image_url: string  # Poster image (PechaKucha)
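One way to read primary_platform is as a selector over platforms[]. The sketch below illustrates that reading; it is not the site's actual template logic. Falling back to the first publication when primary_platform is missing or unmatched is an assumption, and the function name and sample data are hypothetical.

```python
from typing import Optional


def preferred_publication(asset: dict) -> Optional[dict]:
    """Pick the publication matching primary_platform, else the first one.

    The first-entry fallback is an assumption, not documented behavior.
    """
    platforms = asset.get("platforms") or []
    primary = asset.get("primary_platform")
    if primary:
        for publication in platforms:
            if publication.get("platform") == primary:
                return publication
    return platforms[0] if platforms else None


# Hypothetical asset, for illustration only.
sample_asset = {
    "id": "example-talk",
    "primary_platform": "youtube",
    "platforms": [
        {"platform": "vimeo", "asset_id": "111"},
        {"platform": "youtube", "asset_id": "abc"},
    ],
}

print(preferred_publication(sample_asset)["asset_id"])
```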
_data/transcripts/*.yml
Canonical transcripts referenced by video_assets.yml. Each transcript is stored as its own data file keyed by transcript ID (file basename):
# File: _data/transcripts/<transcript_id>.yml
content: |
  Transcript text...
_data/transcripts.yml remains as a legacy placeholder and is not the active transcript content source.
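Since each transcript file is keyed by its basename, resolving a video asset's transcript is a path computation. This is a minimal sketch assuming only the layout described above; the helper name is hypothetical and it does not check that the file exists.

```python
from pathlib import PurePosixPath

TRANSCRIPTS_DIR = PurePosixPath("_data/transcripts")


def transcript_path(asset: dict):
    """Return the expected transcript file path for an asset, or None.

    transcript_id is optional, so assets without one yield None.
    """
    transcript_id = asset.get("transcript_id")
    if not transcript_id:
        return None
    return TRANSCRIPTS_DIR / f"{transcript_id}.yml"


# Hypothetical asset, for illustration only.
print(transcript_path({"id": "example-talk", "transcript_id": "example-talk"}))
```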
Lookup Tables
_data/interview_conferences.yml
Conference definitions referenced by interviews:
conferences:
  - slug: string  # URL-friendly identifier
    name: string  # Full conference name (display)
    conference: string  # Conference series (matched by interviews.conference)
    year: int  # Conference year (matched by interviews.conference_year)
    start_date: date  # Conference start
    end_date: date  # Conference end
    location: string  # Venue location
    description: string  # Conference description
_data/interview_communities.yml
Community definitions referenced by interviews:
communities:
  - slug: string  # URL-friendly identifier
    name: string  # Full community name (matched by interviews.community)
    description: string  # Community description
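The two lookups match on different keys: conferences on the pair conference + year, communities on name. The sketch below shows both, assuming exact string and integer equality (the real matching may normalize case or whitespace). Function names and sample data are hypothetical.

```python
def find_conference(interview: dict, conferences: list):
    """Match interviews.conference + conference_year to conference + year."""
    series = interview.get("conference")
    if not series:
        return None
    year = interview.get("conference_year")
    for conf in conferences:
        if conf.get("conference") == series and conf.get("year") == year:
            return conf
    return None


def find_community(interview: dict, communities: list):
    """Match interviews.community to communities[].name."""
    name = interview.get("community")
    if not name:
        return None
    for community in communities:
        if community.get("name") == name:
            return community
    return None


# Hypothetical lookup tables and interview, for illustration only.
sample_conferences = [
    {"slug": "example-conf-2011", "conference": "Example Conf", "year": 2011},
    {"slug": "example-conf-2012", "conference": "Example Conf", "year": 2012},
]
sample_communities = [{"slug": "example-community", "name": "Example Community"}]
sample_interview = {
    "id": "jane-doe-example-conf",
    "conference": "Example Conf",
    "conference_year": 2012,
    "community": "Example Community",
}

print(find_conference(sample_interview, sample_conferences)["slug"])
print(find_community(sample_interview, sample_communities)["slug"])
```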
_data/resources.yml
Trusted source registry keyed by conference slug:
source_policy: repository-only
conferences:
  conference-slug:
    - label: string  # Source label
      url: string  # Absolute URL
      kind: string  # Optional source kind
      notes: string  # Optional context
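Because the registry is a mapping keyed by conference slug, looking up the trusted sources for a conference is a nested dictionary access. A minimal sketch, assuming the file has been loaded into a dict; the function name, slug, and URL are hypothetical.

```python
def trusted_sources(resources: dict, slug: str) -> list:
    """Return the source entries registered for a conference slug.

    Unknown slugs yield an empty list rather than raising.
    """
    by_slug = resources.get("conferences") or {}
    return by_slug.get(slug) or []


# Hypothetical registry contents, for illustration only.
sample_resources = {
    "source_policy": "repository-only",
    "conferences": {
        "example-conf-2012": [
            {"label": "Conference site", "url": "https://example.com/2012"},
        ],
    },
}

for source in trusted_sources(sample_resources, "example-conf-2012"):
    print(source["label"], source["url"])
```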
Source Data Files
_data/oneoff_videos.yml
One-off videos (standalone talks and presentations):
items:
  - video_asset_id: string  # References video_assets.yml id
    slug: string  # URL-friendly identifier
    title: string  # Video title
    topic: string  # Topic (optional)
    speaker: string  # Speaker name (optional)
    people: [string]  # People featured (optional)
    speakers: [string]  # Speakers list (optional)
    created: date  # Creation date
    tags: [string]  # Tags
    views: int  # View count (optional)
    category: string  # Category label (optional)
    categories: [string]  # Category list (optional)
_data/scmc_videos.yml
SCMC (Software Craftsmanship McHenry County) recordings:
items:
  - video_asset_id: string  # References video_assets.yml id
    slug: string  # URL-friendly identifier
    title: string  # Video title
    topic: string  # Topic (optional)
    speakers: [string]  # Speakers list (optional)
    created: date  # Creation date
    tags: [string]  # Tags
    views: int  # View count (optional)
    category: string  # Category label (optional)
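Both bucket files point at video_assets.yml through video_asset_id, so one check can cover either list. The following sketch reports bucket entries whose reference does not resolve; it assumes both files are already loaded as lists of dicts, and the function name and sample data are hypothetical.

```python
def dangling_bucket_refs(bucket_items: list, assets: list) -> list:
    """Return slugs of bucket entries whose video_asset_id matches no asset id."""
    known_ids = {asset["id"] for asset in assets}
    return [
        item["slug"]
        for item in bucket_items
        if item.get("video_asset_id") not in known_ids
    ]


# Hypothetical data, for illustration only.
sample_assets = [{"id": "talk-one"}, {"id": "talk-two"}]
sample_bucket = [
    {"slug": "talk-one", "video_asset_id": "talk-one"},
    {"slug": "talk-three", "video_asset_id": "talk-three"},
]

print(dangling_bucket_refs(sample_bucket, sample_assets))
```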
Relationships
| From | To | Relationship | Field |
|---|---|---|---|
| interviews | video_assets | N:1 | video_asset_id |
| video_assets | interviews | N:1 (optional) | interview_id |
| oneoff_videos | video_assets | N:1 | video_asset_id |
| scmc_videos | video_assets | N:1 | video_asset_id |
| video_assets | transcripts | N:1 (optional) | transcript_id |
| interviews | conferences | N:1 | conference + conference_year |
| interviews | communities | N:1 | community (name match) |
| conferences | resources | 1:N | slug → resources.conferences[slug] |
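The first two rows of the table describe links in both directions between interviews and video assets. As a hedged sketch of how those links might be verified, the function below reports references that fail to resolve in either direction. It treats interview_id as optional, per the table, and does not require the two directions to agree with each other. The function name and sample data are hypothetical.

```python
def unresolved_links(interviews: list, assets: list) -> list:
    """Return (file, record id, missing target id) for every dangling link."""
    interview_ids = {interview["id"] for interview in interviews}
    asset_ids = {asset["id"] for asset in assets}
    problems = []
    for interview in interviews:
        target = interview.get("video_asset_id")
        if target not in asset_ids:
            problems.append(("interviews", interview["id"], target))
    for asset in assets:
        target = asset.get("interview_id")
        # interview_id is optional, so only check it when present.
        if target is not None and target not in interview_ids:
            problems.append(("video_assets", asset["id"], target))
    return problems


# Hypothetical data, for illustration only.
sample_link_interviews = [
    {"id": "interview-a", "video_asset_id": "asset-a"},
    {"id": "interview-b", "video_asset_id": "asset-missing"},
]
sample_link_assets = [
    {"id": "asset-a", "interview_id": "interview-a"},
    {"id": "asset-c", "interview_id": "interview-gone"},
    {"id": "asset-d"},
]

for problem in unresolved_links(sample_link_interviews, sample_link_assets):
    print(problem)
```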
Maintenance
Imports are complete. The canonical sources of truth are interviews.yml and video_assets.yml. Ongoing work focuses on pruning, deduping, and maintaining canonical metadata.