
linguistic-orchestrator

Entry point for any linguistic / NLP / LLM-for-low-resource-language task. Coordinates the 5-phase pipeline and routes to the right specialist skill. Use this skill whenever a target language is mentioned in conjunction with any ML/NLP operation.

Overview

The orchestrator is the conductor — it reads workspace_state.md, identifies the current pipeline phase, and routes to the appropriate specialist skill(s). It never duplicates specialist content; it always hands off. Every session begins here, and the orchestrator resumes seamlessly from wherever the last session left off.

Pipeline Position

Phase: All phases (entry point and coordinator)

Activates: All other linguistic skills through routing

Entry point skill: Yes — this is where every linguistic pipeline begins

When It Activates

  • User mentions a target language (especially non-English / low-resource) with any LLM/NLP task
  • User needs the multi-step linguistic pipeline
  • Session needs phase tracking, multi-skill coordination, or workspace state
  • User is unsure which linguistic-* specialist to use — this skill triages

Natural language triggers (identical to slash commands):

  • "help me build an LLM for [language]"
  • "my tokenizer produces garbage for [language]"
  • "train a Cantonese model"
  • "low-resource MT"
  • "evaluate on FLORES / Belebele / AfroBench"
  • "what data exists for [language]?"
  • Any bare target language name + ML/NLP verb

When NOT to use: A single isolated operation where the specific skill handles it directly (e.g., "just compute fertility for this tokenizer" → linguistic-tokenize directly).
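The "bare target language name + ML/NLP verb" trigger can be pictured as a simple set intersection. This is only a minimal sketch: the `LANGUAGES` and `NLP_VERBS` sets and the `should_activate` function are hypothetical stand-ins, and the skill itself presumably relies on the model's own judgment rather than keyword lists. The sketch also does not model the "single isolated operation" exception described above.

```python
import re

# Illustrative word lists only; not taken from the skill itself.
LANGUAGES = {"yoruba", "khmer", "cantonese"}
NLP_VERBS = {"train", "tokenize", "evaluate", "translate", "build"}


def should_activate(message):
    """Return True when a message pairs a known language name with an ML/NLP verb."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    return bool(words & LANGUAGES) and bool(words & NLP_VERBS)


if __name__ == "__main__":
    print(should_activate("train a Cantonese model"))
    print(should_activate("what is the capital of Cambodia?"))
```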

What It Does

On First Touch

  1. Check workspace — if no workspace_state.md exists, create one with: target language(s), Glottolog/ISO code, resource class, pipeline phase (start at Scope)
  2. Check ethics gate early — before recommending any data sources, route to linguistic-ethics for FPIC/CARE awareness
  3. Identify phase — map the user's request to: Scope / Acquire / Analyze / Evaluate / Release
  4. Route to specialist(s) — never duplicate specialist content; always hand off
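Step 1 above can be sketched as follows. The source lists the fields that a new workspace_state.md records but not its layout, so the template below is an assumption, as is the `ensure_workspace` helper name.

```python
from pathlib import Path

# Hypothetical layout: the fields come from the first-touch checklist,
# but the exact file format is an assumption.
TEMPLATE = """# Workspace State

- Target language(s): {languages}
- Glottolog/ISO code: {code}
- Resource class: {resource_class}
- Pipeline phase: {phase}
"""


def ensure_workspace(directory, languages, code, resource_class):
    """Create workspace_state.md if missing; never overwrite an existing one.

    Returns (path, created), where created is False when prior state was found.
    """
    path = Path(directory) / "workspace_state.md"
    if path.exists():
        return path, False  # resume from the previous session's state
    path.write_text(
        TEMPLATE.format(
            languages=", ".join(languages),
            code=code,
            resource_class=resource_class,
            phase="Scope",  # a new workspace starts at Scope
        ),
        encoding="utf-8",
    )
    return path, True
```

Returning the `created` flag lets a caller distinguish a first touch from a resumed session without reading the file twice.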

Phase Routing Table

Phase      Specialists Routed
Scope      scope → scripts → ethics (seed)
Acquire    corpus + bitext + transfer + tokenize; ethics (per-dataset)
Analyze    morph, syntax, semantics, discourse, speech, annotate (as needed)
Evaluate   eval
Release    ethics (final gate)
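As a rough illustration, the routing table can be held as a plain mapping from phase to specialists. The `PHASE_ROUTES` name and `route` function are hypothetical, and the `linguistic-` prefix is assumed to apply to every specialist, although the page only spells it out for a few of them.

```python
# Phase -> specialists, transcribed from the routing table.
PHASE_ROUTES = {
    "Scope": ["scope", "scripts", "ethics"],
    "Acquire": ["corpus", "bitext", "transfer", "tokenize", "ethics"],
    "Analyze": ["morph", "syntax", "semantics", "discourse", "speech", "annotate"],
    "Evaluate": ["eval"],
    "Release": ["ethics"],
}


def route(phase):
    """Return the full skill names for a phase, or raise on an unknown phase."""
    try:
        names = PHASE_ROUTES[phase]
    except KeyError:
        raise ValueError(f"unknown phase: {phase!r}") from None
    # Assumption: every specialist is published as linguistic-<name>.
    return [f"linguistic-{name}" for name in names]
```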

Phase Indicator

Every substantive response includes:

[Phase: Scope | Language: Yoruba (yor) | Resource Class: 2 | Skills routed: scope, ethics]
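The indicator has a fixed shape, so one way to produce it is a small formatter. The `phase_indicator` function and its parameter names are illustrative only.

```python
def phase_indicator(phase, language, iso_code, resource_class, skills):
    """Format the status line shown at the top of each substantive response."""
    return (
        f"[Phase: {phase} | Language: {language} ({iso_code}) | "
        f"Resource Class: {resource_class} | Skills routed: {', '.join(skills)}]"
    )


if __name__ == "__main__":
    print(phase_indicator("Scope", "Yoruba", "yor", 2, ["scope", "ethics"]))
```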

Workspace State Management

The orchestrator reads and writes workspace_state.md in the current working directory. This file is the shared memory — scope writes language identity, scripts writes normalization policy, corpus writes the data manifest. The orchestrator snapshots state before every destructive update; use /linguistic:rollback to restore from snapshots in logs/.
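The snapshot-then-write behavior might look like the sketch below. The source says only that snapshots live in logs/, so the numbered file naming scheme and the helper names `write_with_snapshot` and `rollback` are assumptions. The real /linguistic:rollback command may work differently.

```python
import shutil
from pathlib import Path

SNAPSHOT_GLOB = "workspace_state.*.md"  # hypothetical snapshot naming scheme


def write_with_snapshot(workdir, new_text):
    """Copy the current state into logs/ before overwriting it.

    Returns the snapshot path, or None if there was no prior state to save.
    """
    workdir = Path(workdir)
    state = workdir / "workspace_state.md"
    snapshot = None
    if state.exists():
        logs = workdir / "logs"
        logs.mkdir(exist_ok=True)
        # Zero-padded counter keeps snapshots in lexicographic order.
        index = len(list(logs.glob(SNAPSHOT_GLOB))) + 1
        snapshot = logs / f"workspace_state.{index:04d}.md"
        shutil.copy2(state, snapshot)
    state.write_text(new_text, encoding="utf-8")
    return snapshot


def rollback(workdir):
    """Restore the most recent snapshot; return its path, or None if none exist."""
    workdir = Path(workdir)
    snapshots = sorted((workdir / "logs").glob(SNAPSHOT_GLOB))
    if not snapshots:
        return None
    shutil.copy2(snapshots[-1], workdir / "workspace_state.md")
    return snapshots[-1]
```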

Disambiguation Query

When a user query matches multiple specialists, the orchestrator decomposes it and routes to the ≥2 nearest skills, with an explicit "partial match" caveat. Queries in the linguistic domain that match no single skill are likewise decomposed, not refused.
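A toy version of this decomposition is sketched below, using keyword overlap as a stand-in for whatever notion of "nearest" the orchestrator actually applies. The `SKILL_KEYWORDS` table and the `disambiguate` function are hypothetical, and the keyword sets are invented for illustration.

```python
import re

# Invented keyword sets; the real matching criteria are not documented here.
SKILL_KEYWORDS = {
    "linguistic-corpus": {"corpus", "data", "crawl"},
    "linguistic-eval": {"evaluate", "benchmark", "flores"},
    "linguistic-morph": {"morphology", "inflection", "segmentation"},
    "linguistic-tokenize": {"tokenizer", "fertility", "vocabulary"},
}


def disambiguate(query, k=2):
    """Route to one skill on a clean match, else to at least k with a caveat."""
    words = set(re.findall(r"\w+", query.lower()))
    scores = {skill: len(words & kw) for skill, kw in SKILL_KEYWORDS.items()}
    # Highest overlap first; ties broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda skill: (-scores[skill], skill))
    hits = [skill for skill in ranked if scores[skill] > 0]
    if len(hits) == 1:
        return {"skills": hits, "caveat": None}
    # Several matches, or none: decompose instead of refusing. With zero
    # overlap this toy scorer cannot rank, so it falls back to the first k.
    return {"skills": ranked[: max(k, len(hits))], "caveat": "partial match"}
```

A query such as "evaluate tokenizer fertility" touches two keyword sets, so both skills are returned along with the caveat.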

Pipeline Overview

Scope → Acquire → Analyze → Evaluate → Release
  |        |         |          |           |
scope    corpus    morph      eval       ethics
scripts  bitext    syntax              (release gate)
tokenize transfer  semantics
ethics   (ethics   discourse
(early   gate at   speech
gate)    each      annotate
         dataset)

Phases overlap and loop back. The orchestrator provides the skeleton; specialists own the content.

Inputs & Outputs

Input                        Description
Any linguistic/NLP request   Natural language or slash command
workspace_state.md           Prior session state (if exists)

Output                       Description
Phase indicator              Current phase + language + resource class
Specialist routing           Which skill(s) activated and why
workspace_state.md           Updated with new phase outputs
Open questions               Decisions pending user input

Example Usage

Natural language: "help me build an LLM for Khmer"

[Phase: Scope | Language: Khmer (khm) | Resource Class: 2 | Skills routed: scope]

Routing to linguistic-scope for:
  1. ISO 639-3 + Glottolog resolution (khm / khmr1253)
  2. Joshi resource classification
  3. Typological profile (Austroasiatic; abugida script; analytic morphology)
  4. Transfer source recommendations (Vietnamese distance 0.31)
  5. Vitality assessment (EGIDS 1 — national language; standard FPIC)

Next: linguistic-scripts for Khmer abugida normalization policy
      linguistic-ethics seed (EGIDS 1; standard gate)

Related skills: all linguistic-* skills. The orchestrator is the coordinator for the entire suite.
