Shared Utilities
The _linguistic_shared/ directory contains shared utilities used across all 18 linguistic skills. These are not user-facing skills — they are internal libraries that provide consistent behavior across the suite.
Library Contents
interaction_utils.py
Utilities for reading and writing workspace_state.md, managing phase transitions, and formatting phase indicators.
Key functions:
| Function | Purpose |
|---|---|
| `read_workspace_state(path)` | Read and parse workspace_state.md from the given path |
| `write_workspace_state(path, state)` | Write structured state to workspace_state.md |
| `snapshot_workspace_state(path)` | Create a timestamped snapshot in logs/ before destructive updates |
| `format_phase_indicator(state)` | Format the `[Phase: X \| ...]` phase indicator |
| `get_current_phase(state)` | Extract current pipeline phase from state |
| `record_skill_routing(state, skill, reason)` | Append to skill routing history |
| `record_decision(state, topic, decision, rationale)` | Append to decisions log |
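To make the phase-indicator helpers concrete, the following is a minimal sketch of how a parsed state dictionary might be rendered into an indicator line. It is not the library's implementation: the `sketch_` function names and the `phase`, `language`, and `iso_code` keys are assumptions made for illustration, and the real indicator carries further fields that this sketch omits.

```python
from typing import Any


def sketch_get_current_phase(state: dict[str, Any]) -> str:
    """Return the current phase, or "Unknown" when the state has none."""
    # The "phase" key is an assumption; the real state layout may differ.
    return state.get("phase", "Unknown")


def sketch_format_phase_indicator(state: dict[str, Any]) -> str:
    """Render a one-line indicator from a parsed state dictionary."""
    targets = state.get("targets", {})
    parts = [
        f"Phase: {sketch_get_current_phase(state)}",
        # "language" and "iso_code" are hypothetical keys.
        f"Language: {targets.get('language', '?')} ({targets.get('iso_code', '?')})",
        f"Resource Class: {targets.get('resource_class', '?')}",
    ]
    return "[" + " | ".join(parts) + "]"


example_state = {
    "phase": "Scope",
    "targets": {"language": "Yoruba", "iso_code": "yor", "resource_class": 2},
}
print(sketch_format_phase_indicator(example_state))
```

Missing fields fall back to a placeholder rather than raising, so the sketch still prints an indicator for a partially populated state.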
Usage pattern:
from _linguistic_shared.interaction_utils import (
    read_workspace_state,
    write_workspace_state,
    format_phase_indicator,
)

# Load the current state, update one field, and persist the change.
state = read_workspace_state("workspace_state.md")
state["targets"]["resource_class"] = 2
write_workspace_state("workspace_state.md", state)

# Print the phase indicator line.
print(format_phase_indicator(state))
# [Phase: Scope | Language: Yoruba (yor) | Resource Class: 2 | ...]

findings_presenter.py
Utilities for accumulating, deduplicating, and presenting structured findings across skills.
Key functions:
| Function | Purpose |
|---|---|
| `add_finding(state, severity, skill, evidence, action)` | Add a finding to the workspace state |
| `get_findings(state, severity=None)` | Retrieve findings, optionally filtered by severity |
| `format_findings_report(state)` | Format the full findings report (HIGH / MEDIUM / LOW sections) |
| `count_open_findings(state)` | Count findings for phase indicator |
Severity levels:
| Level | Examples |
|---|---|
| HIGH | License violation risk; eval contamination >10%; fertility >5×; sacred-text in training corpus |
| MEDIUM | Register imbalance (Bible >30%); URIEL distance to transfer source >0.6; missing ethics sign-off |
| LOW | Minor dedup gain available; stale benchmark version; non-canonical romanization |
Usage pattern:
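from _linguistic_shared.findings_presenter import add_finding, format_findings_report
from _linguistic_shared.interaction_utils import read_workspace_state

# Findings are stored on the shared workspace state, so load it first.
state = read_workspace_state("workspace_state.md")

add_finding(
    state,
    severity="HIGH",
    skill="linguistic-corpus",
    evidence="Bible-NLP constitutes 62% of corpus (threshold: 30%)",
    action="Reduce Bible slice to ≤30%; supplement with web/news sources",
)

print(format_findings_report(state))

Design Principles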
Single source of truth: All skills read from and write to the same workspace_state.md structure via these utilities. Inconsistent state access across skills was the primary source of inter-skill bugs in early development.
Snapshot before mutation: write_workspace_state always calls snapshot_workspace_state before writing destructive updates. This enables /linguistic:rollback to restore prior state.
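The snapshot-before-mutation rule can be sketched as follows. This is an illustrative approximation, not the code in interaction_utils.py: the `sketch_` helper names, the snapshot filename format, and the placement of logs/ next to the state file are all assumptions.

```python
import shutil
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def sketch_snapshot(path: Path) -> Optional[Path]:
    """Copy the state file into a sibling logs/ directory (assumed location)."""
    if not path.exists():
        return None  # nothing to preserve on the very first write
    logs_dir = path.parent / "logs"
    logs_dir.mkdir(exist_ok=True)
    # Hypothetical naming scheme: <stem>_<UTC timestamp><suffix>
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
    snapshot_path = logs_dir / f"{path.stem}_{stamp}{path.suffix}"
    shutil.copy2(path, snapshot_path)
    return snapshot_path


def sketch_write(path: Path, text: str) -> Optional[Path]:
    """Snapshot the existing file first, then overwrite it."""
    snapshot_path = sketch_snapshot(path)
    path.write_text(text, encoding="utf-8")
    return snapshot_path


with tempfile.TemporaryDirectory() as tmp:
    state_file = Path(tmp) / "workspace_state.md"
    first = sketch_write(state_file, "state v1\n")   # no prior file, no snapshot
    second = sketch_write(state_file, "state v2\n")  # snapshots the v1 contents
    restored_text = second.read_text(encoding="utf-8")  # what a rollback could restore
    current_text = state_file.read_text(encoding="utf-8")
```

Because the copy happens inside the write helper, callers cannot skip the snapshot by accident, which is the property a rollback command depends on.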
Transparent findings: The findings system is append-only during a pipeline run. Skills add findings; the orchestrator presents them. Skills never suppress or overwrite findings from other skills.
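A minimal sketch of an append-only findings store, assuming findings live in a `findings` list on the state dictionary. The `sketch_` names, the dictionary layout, the exact-duplicate check, and the report format are assumptions for illustration, not the behavior of findings_presenter.py. The second finding's action text is also hypothetical.

```python
from typing import Any, Optional

SEVERITY_ORDER = ("HIGH", "MEDIUM", "LOW")


def sketch_add_finding(state: dict[str, Any], severity: str, skill: str,
                       evidence: str, action: str) -> None:
    """Append a finding; existing entries are never edited or removed."""
    if severity not in SEVERITY_ORDER:
        raise ValueError(f"unknown severity: {severity!r}")
    entry = {"severity": severity, "skill": skill,
             "evidence": evidence, "action": action}
    findings = state.setdefault("findings", [])
    if entry not in findings:  # assumed dedup rule: skip exact duplicates
        findings.append(entry)


def sketch_get_findings(state: dict[str, Any],
                        severity: Optional[str] = None) -> list[dict[str, str]]:
    """Return a copy of the findings, optionally filtered by severity."""
    findings = state.get("findings", [])
    return [f for f in findings if severity is None or f["severity"] == severity]


def sketch_format_report(state: dict[str, Any]) -> str:
    """Group findings into HIGH / MEDIUM / LOW sections."""
    lines = []
    for level in SEVERITY_ORDER:
        matching = sketch_get_findings(state, level)
        lines.append(f"{level} ({len(matching)})")
        lines.extend(f"- [{f['skill']}] {f['evidence']} -> {f['action']}"
                     for f in matching)
    return "\n".join(lines)


state: dict[str, Any] = {}
sketch_add_finding(state, "HIGH", "linguistic-corpus",
                   "Bible-NLP constitutes 62% of corpus (threshold: 30%)",
                   "Reduce Bible slice to ≤30%; supplement with web/news sources")
sketch_add_finding(state, "LOW", "linguistic-corpus",
                   "Minor dedup gain available",
                   "Run a deduplication pass")  # hypothetical action text
report = sketch_format_report(state)
print(report)
```

The sketch exposes no function that removes or rewrites an entry, so one skill cannot suppress another skill's finding through this interface.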
Testing
The shared utilities are covered by the suite's unit tests (tests/unit/test_interaction_utils.py, tests/unit/test_findings_presenter.py). The full suite of 226 tests also includes integration tests that exercise the shared utilities through realistic multi-skill scenarios.
Location in Repository
skills/
└── _linguistic_shared/
    ├── __init__.py
    ├── interaction_utils.py
    └── findings_presenter.py

The leading underscore in _linguistic_shared/ follows the Claude Code Skills convention for shared utility directories: it prevents the directory from being treated as an independent skill.