Interview Data Model

This document describes the data model for the interview/video content on the site.

Data Flow Overview

Diagram Source

flowchart TB
  subgraph Canonical["Canonical Data"]
    interviews[_data/interviews.yml<br/>Interview records]
    assets[_data/video_assets.yml<br/>Canonical video assets]
    transcripts[_data/transcripts/*.yml<br/>Transcript entries referenced by video_assets]
  end

  subgraph Buckets["Video Buckets"]
    oneoffs[_data/oneoff_videos.yml<br/>One-off video list]
    scmc[_data/scmc_videos.yml<br/>SCMC video list]
  end

  subgraph Lookups["Lookup Tables"]
    confs[_data/interview_conferences.yml<br/>Conference definitions]
    comms[_data/interview_communities.yml<br/>Community definitions]
    resources[_data/resources.yml<br/>Trusted source registry]
  end
  interviews <-->|video_asset_id / interview_id| assets
  oneoffs -->|video_asset_id| assets
  scmc -->|video_asset_id| assets
  interviews -->|conference field| confs
  interviews -->|community field| comms
  confs -->|slug| resources
  assets -.->|transcript_id| transcripts

File Descriptions

Canonical Data Files

`_data/interviews.yml`

The authoritative source for interview records. Each interview contains:

items:
  - id: string              # Unique identifier (slug)
    title: string           # Display title
    interviewees: [string]  # List of interviewee names
    interviewer: string     # Usually "Mike Hall"
    topic: string           # Interview topic (optional)
    conference: string      # Conference series (matched against conference in interview_conferences.yml)
    conference_year: int    # Year of conference (optional; matched against year in interview_conferences.yml)
    community: string       # Community name (references interview_communities.yml)
    recorded_date: date     # When the interview was recorded
    tags: [string]          # Categorization tags
    video_asset_id: string  # video_assets.id
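
Because `id` is described as a unique identifier, a quick uniqueness check can be useful when editing the file by hand. The following is a minimal sketch, not part of the site's tooling: `duplicate_ids` is a hypothetical helper, and the records are made-up placeholders written as plain Python dicts (shaped like entries under `items:`) so the example needs no YAML parser.

```python
from collections import Counter


def duplicate_ids(items):
    """Return ids that appear more than once, sorted for stable output."""
    counts = Counter(item["id"] for item in items)
    return sorted(id_ for id_, n in counts.items() if n > 1)


# Hypothetical records shaped like entries under `items:` in interviews.yml.
interviews = [
    {"id": "example-interview-a", "title": "Example A", "video_asset_id": "asset-a"},
    {"id": "example-interview-b", "title": "Example B", "video_asset_id": "asset-b"},
    {"id": "example-interview-a", "title": "Example A (copy)", "video_asset_id": "asset-a"},
]

print(duplicate_ids(interviews))
```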

`_data/video_assets.yml`

Canonical video asset records. The id is the primary key and permalink for a video, and its one or more platform publications are stored in platforms[]. An asset can optionally link back to an interview via interview_id:

items:
  - id: string                # Primary key / permalink
    interview_id: string      # Links to interviews.yml id (optional)
    title: string             # Canonical title
    primary_platform: string  # Preferred platform for defaults (optional)
    source: string            # Source identifier (optional, e.g., "ugtastic")
    published_date: date      # Primary published date (optional)
    thumbnail: string         # Primary thumbnail URL (optional)
    thumbnail_local: string   # Local thumbnail path (optional)
    duration_seconds: int     # Video length (optional)
    duration_minutes: int     # Video length (rounded, optional)
    description: string       # Canonical description (optional)
    tags: [string]            # Tags
    transcript_id: string     # Basename of _data/transcripts/<transcript_id>.yml (optional)
    platforms:                # Per-platform publications
      - platform: string      # "vimeo", "youtube", "pechakucha", etc.
        asset_id: string      # Platform-specific video ID
        url: string           # Direct URL to video (optional)
        embed_url: string     # Embeddable player URL (optional)
        title_on_platform: string # Title as shown on the platform
        published_date: date  # Publication date on platform (optional)
        thumbnail: string     # Platform thumbnail URL (optional)
        thumbnail_local: string # Local thumbnail path (optional)
        duration_seconds: int   # Video length (optional)
        duration_minutes: int   # Video length (rounded, optional)
        description: string     # Platform description (optional)
        source: string          # Platform source identifier (optional)
        playlist: string        # Playlist name/label (optional)
        video_url: string       # Direct video file URL (PechaKucha)
        image_url: string       # Poster image (PechaKucha)
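
To illustrate how `primary_platform` could select among the entries in `platforms[]`, here is a minimal sketch. The helper name `pick_publication` and the fallback to the first entry are assumptions made for this example; the document only says that `primary_platform` is the preferred platform for defaults. The asset record and its IDs are placeholders.

```python
def pick_publication(asset):
    """Return the platforms[] entry that matches primary_platform.

    Assumption: if primary_platform is absent or matches nothing, fall back
    to the first entry. Returns None when there are no publications.
    """
    platforms = asset.get("platforms") or []
    preferred = asset.get("primary_platform")
    if preferred is not None:
        for pub in platforms:
            if pub.get("platform") == preferred:
                return pub
    return platforms[0] if platforms else None


# Hypothetical asset shaped like an entry in video_assets.yml.
asset = {
    "id": "example-talk",
    "primary_platform": "youtube",
    "platforms": [
        {"platform": "vimeo", "asset_id": "v-000"},
        {"platform": "youtube", "asset_id": "y-000"},
    ],
}

print(pick_publication(asset)["asset_id"])
```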

`_data/transcripts/*.yml`

Canonical transcripts referenced by video_assets.yml. Each transcript is stored as its own data file keyed by transcript ID (file basename):

# File: _data/transcripts/<transcript_id>.yml
content: |
  Transcript text...

_data/transcripts.yml remains as a legacy placeholder and is not the active transcript content source.
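
Since each transcript lives in a file named after its ID, resolving an asset's `transcript_id` to a path is a simple join. The sketch below assumes a hypothetical helper `transcript_path` and a placeholder ID; it only builds the path and does not read or parse the file.

```python
from pathlib import Path

TRANSCRIPTS_DIR = Path("_data/transcripts")


def transcript_path(asset, root=Path(".")):
    """Map an asset's transcript_id to _data/transcripts/<transcript_id>.yml.

    Returns None when the asset has no transcript_id (the field is optional).
    """
    transcript_id = asset.get("transcript_id")
    if not transcript_id:
        return None
    return root / TRANSCRIPTS_DIR / f"{transcript_id}.yml"


# Hypothetical asset record; "example-talk" is a placeholder ID.
print(transcript_path({"id": "example-talk", "transcript_id": "example-talk"}))
```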

Lookup Tables

`_data/interview_conferences.yml`

Conference definitions referenced by interviews:

conferences:
  - slug: string        # URL-friendly identifier
    name: string        # Full conference name (display)
    conference: string  # Conference series (matched by interviews.conference)
    year: int           # Conference year (matched by interviews.conference_year)
    start_date: date    # Conference start
    end_date: date      # Conference end
    location: string    # Venue location
    description: string # Conference description
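
The comments above describe a two-field match: `interviews.conference` against `conference`, and `interviews.conference_year` against `year`. A minimal sketch of that lookup follows. The helper `find_conference` is hypothetical, the records are placeholders, and exact equality on both fields is an assumption; under it, an interview without `conference_year` matches nothing.

```python
def find_conference(interview, conferences):
    """Return the conference whose (conference, year) equals the interview's
    (conference, conference_year), or None if there is no exact match."""
    key = (interview.get("conference"), interview.get("conference_year"))
    for conf in conferences:
        if (conf.get("conference"), conf.get("year")) == key:
            return conf
    return None


# Hypothetical records shaped like interview_conferences.yml entries.
conferences = [
    {"slug": "exampleconf-2012", "name": "ExampleConf 2012", "conference": "ExampleConf", "year": 2012},
    {"slug": "exampleconf-2013", "name": "ExampleConf 2013", "conference": "ExampleConf", "year": 2013},
]
interview = {"id": "example-interview", "conference": "ExampleConf", "conference_year": 2013}

print(find_conference(interview, conferences)["slug"])
```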

`_data/interview_communities.yml`

Community definitions referenced by interviews:

communities:
  - slug: string        # URL-friendly identifier
    name: string        # Full community name (matched by interviews.community)
    description: string # Community description

`_data/resources.yml`

Trusted source registry keyed by conference slug:

source_policy: repository-only
conferences:
  conference-slug:
    - label: string      # Source label
      url: string        # Absolute URL
      kind: string       # Optional source kind
      notes: string      # Optional context
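
As an illustration of the registry shape, the sketch below looks up the sources for a conference slug and checks that each `url` is absolute, as the schema comment requires. The helper names `sources_for` and `is_absolute_url`, the slug, and the URLs are assumptions for this example.

```python
from urllib.parse import urlparse


def sources_for(slug, resources):
    """Return the list of trusted sources registered for a conference slug."""
    return resources.get("conferences", {}).get(slug, [])


def is_absolute_url(url):
    """True when the URL has both a scheme and a host."""
    parts = urlparse(url)
    return bool(parts.scheme and parts.netloc)


# Hypothetical registry shaped like resources.yml.
resources = {
    "source_policy": "repository-only",
    "conferences": {
        "exampleconf-2013": [
            {"label": "Official site", "url": "https://example.com/2013"},
            {"label": "Broken entry", "url": "/relative/path"},
        ],
    },
}

for source in sources_for("exampleconf-2013", resources):
    print(source["label"], is_absolute_url(source["url"]))
```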

Source Data Files

`_data/oneoff_videos.yml`

One-off videos (standalone talks and presentations):

items:
  - video_asset_id: string  # References video_assets.yml id
    slug: string            # URL-friendly identifier
    title: string           # Video title
    topic: string           # Topic (optional)
    speaker: string         # Speaker name (optional)
    people: [string]        # People featured (optional)
    speakers: [string]      # Speakers list (optional)
    created: date           # Creation date
    tags: [string]          # Tags
    views: int              # View count (optional)
    category: string        # Category label (optional)
    categories: [string]    # Category list (optional)

`_data/scmc_videos.yml`

SCMC (Software Craftsmanship McHenry County) recordings:

items:
  - video_asset_id: string  # References video_assets.yml id
    slug: string            # URL-friendly identifier
    title: string           # Video title
    topic: string           # Topic (optional)
    speakers: [string]      # Speakers list (optional)
    created: date           # Creation date
    tags: [string]          # Tags
    views: int              # View count (optional)
    category: string        # Category label (optional)

Relationships

| From | To | Relationship | Field |
| --- | --- | --- | --- |
| interviews | video_assets | N:1 | `video_asset_id` |
| video_assets | interviews | N:1 (optional) | `interview_id` |
| oneoff_videos | video_assets | N:1 | `video_asset_id` |
| scmc_videos | video_assets | N:1 | `video_asset_id` |
| video_assets | transcripts | N:1 (optional) | `transcript_id` |
| interviews | conferences | N:1 | `conference` + `conference_year` |
| interviews | communities | N:1 | `community` (name match) |
| conferences | resources | 1:N | `slug` → `resources.conferences[slug]` |
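
The first two rows of the table describe a link in each direction between interviews and video assets. The sketch below shows one way the two directions could be cross-checked. `mismatched_links` is a hypothetical helper and the records are placeholders. Since both directions are N:1, an asset that points at a different interview is not necessarily an error, so the output is best read as a list of entries to review.

```python
def mismatched_links(interviews, assets):
    """Flag interviews whose video_asset_id is unknown, or whose asset
    carries an interview_id that names a different interview."""
    assets_by_id = {asset["id"]: asset for asset in assets}
    problems = []
    for interview in interviews:
        asset = assets_by_id.get(interview.get("video_asset_id"))
        if asset is None:
            problems.append((interview["id"], "missing asset"))
        elif asset.get("interview_id") not in (None, interview["id"]):
            problems.append((interview["id"], "asset points elsewhere"))
    return problems


# Hypothetical records shaped like interviews.yml and video_assets.yml entries.
interviews = [
    {"id": "i1", "video_asset_id": "a1"},
    {"id": "i2", "video_asset_id": "a2"},
    {"id": "i3", "video_asset_id": "a-missing"},
]
assets = [
    {"id": "a1", "interview_id": "i1"},
    {"id": "a2", "interview_id": "i-other"},
]

print(mismatched_links(interviews, assets))
```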

Maintenance

Imports are complete. The canonical sources of truth are interviews.yml and video_assets.yml. Ongoing work focuses on pruning, deduplicating, and maintaining canonical metadata.
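
As one possible aid for the deduplication work mentioned above, the sketch below groups assets by the platform publications they list and reports any publication claimed more than once. This is an assumption about what counts as a duplicate, not a description of the site's actual process. `duplicate_publications` is a hypothetical helper and the records are placeholders.

```python
from collections import defaultdict


def duplicate_publications(assets):
    """Map (platform, asset_id) to the asset ids that list it, keeping only
    publications listed more than once."""
    seen = defaultdict(list)
    for asset in assets:
        for pub in asset.get("platforms") or []:
            key = (pub.get("platform"), pub.get("asset_id"))
            seen[key].append(asset["id"])
    return {key: ids for key, ids in seen.items() if len(ids) > 1}


# Hypothetical records shaped like video_assets.yml entries.
assets = [
    {"id": "a1", "platforms": [{"platform": "youtube", "asset_id": "y-1"}]},
    {
        "id": "a2",
        "platforms": [
            {"platform": "youtube", "asset_id": "y-1"},
            {"platform": "vimeo", "asset_id": "v-9"},
        ],
    },
]

print(duplicate_publications(assets))
```

An asset that lists the same publication twice is also reported, with its own id repeated.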