Commands
/linguistic:explore
Investigate a corpus or language interactively, without committing to a pipeline plan.
Syntax
/linguistic:explore [language or corpus]
Description
Enter exploration mode — read-only, curiosity-driven inspection of a language or corpus before deciding what to do. The orchestrator will read available data and surface:
- Language identification confidence (via linguistic-scope)
- Script and encoding observations (via linguistic-scripts)
- Corpus size, register mix, dedup stats (via linguistic-corpus)
- Tokenizer fertility against a baseline (via linguistic-tokenize)
You stay in exploration mode until you opt into a plan via /linguistic:propose or /linguistic:lifecycle.
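To make the fertility figure concrete, here is a minimal sketch of one common way to compute it: the average number of tokens per whitespace-delimited word. This page does not specify how linguistic-tokenize measures fertility, so the definition, the `fertility` helper, and the toy `char_bigram_tokenize` tokenizer below are all illustrative assumptions, not the tool's implementation.

```python
from typing import Callable, Iterable, List


def fertility(texts: Iterable[str], tokenize: Callable[[str], List]) -> float:
    """Average tokens per whitespace-delimited word (assumed definition)."""
    total_tokens = 0
    total_words = 0
    for text in texts:
        total_words += len(text.split())
        total_tokens += len(tokenize(text))
    if total_words == 0:
        raise ValueError("no words found in input")
    return total_tokens / total_words


def char_bigram_tokenize(text: str) -> List[str]:
    """Hypothetical stand-in tokenizer: cuts each word into 2-character chunks."""
    return [word[i:i + 2] for word in text.split() for i in range(0, len(word), 2)]


sample = ["abcd ef", "ghijkl"]
# 6 tokens over 3 words
print(fertility(sample, char_bigram_tokenize))  # → 2.0
```

To measure against a real baseline, any callable that maps a string to a list of tokens can be passed in place of the toy tokenizer; for example, with the `tiktoken` package installed, `tiktoken.get_encoding("cl100k_base").encode` fits that shape. Note that whitespace word counts are only a rough denominator for languages that do not separate words with spaces.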
Options
| Argument | Description |
|---|---|
| [language] | Target language name or ISO code |
| [corpus path] | Path to a corpus file or directory to explore |
Example
/linguistic:explore Yoruba
> Exploring Yoruba (yor)...
ISO: yor | Glottolog: yoru1245 | Script: Latin + tone diacritics
Resource class: Joshi 2 (Hopefuls)
Fertility on tiktoken-cl100k_base: 3.4× — vocab extension recommended
Corpus available: Bible-NLP (CC-BY), OPUS, CulturaX subset
Register: liturgical 45%, web 35%, news 20% — Bible% high; flag
Related Commands
- /linguistic:propose generates a plan based on exploration findings
- /linguistic:lifecycle enters the full pipeline