Repository Hygiene
This policy keeps technical debt from accumulating in the repository by making file ownership, retention, and pruning rules explicit.
File Classes
- canonical: source files required to generate public site output.
- generated: repo-tracked generated artifacts required by the site (interviews/, videos/, selected _data outputs).
- reference: implementation documentation required for maintainers.
- ephemeral: local reports and build outputs (tmp/, _site/) that are never canonical.
- legacy: historical context files kept for migration support, not as active source of truth.
Retention Rules
- Keep canonical, generated, and reference files in version control.
- Exclude ephemeral files from commits unless explicitly required by policy.
- Mark legacy directories in _data/repo_hygiene.yml and require an explicit retention decision.
- Any new top-level file or directory must be declared in _data/repo_hygiene.yml.
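
The declaration rule can be pictured as an allowlist lookup. The sketch below is illustrative only: the manifest keys (top_level, path, class, status) and the helper name undeclared_entries are assumptions, since this policy does not show the actual schema of _data/repo_hygiene.yml or how the validator reads it.

```ruby
require "yaml"

# Hypothetical manifest shape; the real _data/repo_hygiene.yml schema may differ.
EXAMPLE_MANIFEST = <<~YAML
  top_level:
    - path: _data
      class: canonical
    - path: interviews
      class: generated
    - path: docs
      class: reference
    - path: context
      class: legacy
      status: review_required
YAML

# Returns the top-level entries that exist but are not declared in the manifest.
def undeclared_entries(manifest_yaml, entries)
  manifest = YAML.safe_load(manifest_yaml)
  declared = manifest.fetch("top_level").map { |item| item.fetch("path") }
  entries.reject { |name| declared.include?(name) }.sort
end

# Sample listing of top-level names; in a real check this might come from Dir.children(".").
present = ["_data", "interviews", "docs", "context", "scratch.txt"]
puts undeclared_entries(EXAMPLE_MANIFEST, present).inspect
```

Under these assumptions, scratch.txt is the only entry reported, because it is the only name missing from the manifest.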
Pruning Workflow
- Detect candidates with ruby ./bin/validate_repo_hygiene.rb.
- Confirm that a candidate is not referenced by build scripts, templates, docs, or validators.
- If uncertain, quarantine under an archive path before deletion.
- Remove references and update _data/repo_hygiene.yml in the same commit.
- Run ./bin/pipeline validate.
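
The "not referenced" confirmation step could be approximated with a plain text search before anything is deleted. This is a minimal sketch rather than the repository's actual tooling: SEARCH_GLOBS, load_sources, references_to, and the sample file names are all hypothetical, and a substring match can produce false positives that still need a manual look.

```ruby
# Hypothetical globs for files that may reference a repo path; adjust per repository.
SEARCH_GLOBS = ["bin/**/*", "docs/**/*", "**/*.{yml,html,md}"].freeze

# Reads every regular file matched by the globs under root into { relative_path => text }.
def load_sources(root, globs = SEARCH_GLOBS)
  globs.flat_map { |glob| Dir.glob(glob, base: root) }.uniq.each_with_object({}) do |rel, acc|
    full = File.join(root, rel)
    acc[rel] = File.read(full, encoding: "UTF-8").scrub if File.file?(full)
  end
end

# Returns the sorted paths whose text mentions the candidate, excluding the candidate itself.
def references_to(candidate, sources)
  needle = candidate.chomp("/")
  sources.select { |path, text| path != candidate && text.include?(needle) }.keys.sort
end

# Sample in-memory sources; a real run might use load_sources(".") instead.
sources = {
  "bin/build.rb" => "notes = File.read('old_notes.md')",
  "docs/index.md" => "See interviews/ for transcripts."
}
p references_to("old_notes.md", sources)
```

An empty result suggests the candidate is safe to quarantine or remove; any hit means the referencing file has to be updated in the same commit.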
Current Decisions
- career_history.md is deprecated and must not be reintroduced.
- context/interviews-history/ is legacy reference data and must stay explicitly marked as review_required until archival policy is finalized.
CI Enforcement
bin/validate_repo_hygiene.rb is part of ./bin/pipeline validate and enforces:
- top-level allowlist compliance
- deprecated-path blocking
- docs-link style consistency checks for /docs/* routes
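
As an illustration of deprecated-path blocking, the sketch below flags any tracked path that matches a deprecated entry. It is not taken from bin/validate_repo_hygiene.rb: the constant DEPRECATED_PATHS and the helper deprecated_violations are hypothetical names, and the tracked list is sample data (a real check might read it from git ls-files).

```ruby
# Hypothetical list; this policy names career_history.md as deprecated.
DEPRECATED_PATHS = ["career_history.md"].freeze

# Returns tracked paths that equal a deprecated path or live under a deprecated directory.
def deprecated_violations(tracked_paths, deprecated = DEPRECATED_PATHS)
  tracked_paths.select do |path|
    deprecated.any? { |dep| path == dep || path.start_with?(dep.chomp("/") + "/") }
  end.sort
end

# Sample tracked paths for demonstration only.
tracked = ["README.md", "career_history.md", "docs/index.md"]
violations = deprecated_violations(tracked)
violations.each { |path| warn "deprecated path reintroduced: #{path}" }
puts violations.empty? ? "ok" : "failed"
```

A CI wrapper would typically turn a non-empty violation list into a non-zero exit status so that the validate step fails.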