magic-data-exploration

Explore data interactively and detect patterns systematically. Use when investigating a dataset — freely exploring quality issues, comparing segments, discovering correlations, or running automated pattern detection. Covers both interactive investigation (asking questions, following threads) and scripted analysis (pattern detection, segment comparison, relationship exploration).

When It Activates

Use this skill when investigating data patterns or comparing segments. Trigger phrases: explore, investigate, patterns, what patterns, look into, understand data, compare groups, segment analysis, find templates, similarity.

The user wants to investigate data interactively before committing to a processing plan
The user wants to understand quality issues, patterns, or structure
The task calls for discovering patterns and insights with automated scripts
The task calls for comparing statistics across groups or segments
The task calls for exploring pairwise relationships between columns
The dataset has already been through magic-data-profiling and needs deeper, systematic investigation

When NOT to Use: Use magic-data-profiling for initial quality scoring and distribution overview. Use magic-data-cleaning for applying fixes. Use magic-statistical-analysis for formal hypothesis testing. Use magic-data-lifecycle for full multi-step processing.

Quick Facts

Property	Value
Version	2.0.0
Complexity	medium
Phase	1
Scripts	4

Scripts

Scriptable Tools (call directly or read + adapt)

Script	Standard CLI Usage	When to Customize
`detect_patterns.py`	`python3 detect_patterns.py data.csv patterns.csv`	`--max-findings 20` for broader coverage. 6 detectors: temporal cycle, categorical imbalance, numeric cluster, text pattern, outlier presence, correlation
`prepare_for_exploration.py`	`python3 prepare_for_exploration.py data.csv prepared.csv`	`--columns col1,col2` to restrict; `--derive '{"new_col": "src:expression"}'` for custom derives
`relationship_explorer.py`	`python3 relationship_explorer.py data.csv relationships.csv`	`--columns col1,col2` to restrict; `--max-pairs 20` for wider coverage. Produces PNG charts in `{stem}_charts/`
`segment_analysis.py`	`python3 segment_analysis.py data.csv segments.csv`	`--group_col col` when auto-detect picks wrong column; `--value_cols col1,col2` to narrow metrics
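To make the idea behind segment comparison concrete, here is a minimal sketch of grouping rows by one column and summarizing a numeric column per group. It is not the implementation of `segment_analysis.py`: the function name `compare_segments`, the sample rows, and the choice of count and mean as the statistics are all assumptions made for illustration. It uses only the standard library so it runs without the dependencies listed further down.

```python
from collections import defaultdict
from statistics import mean


def compare_segments(rows, group_col, value_col):
    """Group rows by group_col and summarize value_col per group.

    Hypothetical helper for illustration only; the real script's
    statistics and output layout may differ.
    """
    groups = defaultdict(list)
    for row in rows:
        # Values arrive as strings when read from CSV, so convert them.
        groups[row[group_col]].append(float(row[value_col]))
    return {
        name: {"count": len(values), "mean": mean(values)}
        for name, values in groups.items()
    }


# Made-up sample rows standing in for a parsed CSV file.
rows = [
    {"plan": "free", "spend": "0"},
    {"plan": "free", "spend": "4"},
    {"plan": "pro", "spend": "30"},
]

summary = compare_segments(rows, "plan", "spend")
for name, stats in summary.items():
    print(name, stats)
```

In this sketch `group_col` and `value_col` loosely mirror what `--group_col` and `--value_cols` select on the command line, though the script's actual handling of multiple value columns is not shown here.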

New in v2.0.0

prepare_for_exploration.py — Text Column Enrichment

prepare_for_exploration.py enriches a CSV with numeric representations of text columns, enabling exploration scripts to operate on text-heavy datasets. For each text column it automatically derives {col}_length, {col}_word_count, and {col}_is_present.

Run this before detect_patterns.py or relationship_explorer.py on text-only datasets — exploration scripts require at least some numeric or categorical columns to function.

python3 prepare_for_exploration.py data.csv prepared.csv

# Restrict to specific columns
python3 prepare_for_exploration.py data.csv prepared.csv --columns title,body
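The three derived columns can be pictured with a small sketch. This is an assumed reading of what {col}_length, {col}_word_count, and {col}_is_present mean, not the script's actual code: the helper name `enrich_text_columns`, the whitespace stripping, and the treatment of empty strings as missing are all assumptions. It uses only the standard library rather than pandas so it stays self-contained.

```python
import csv
import io


def enrich_text_columns(rows, text_columns):
    """Return copies of rows with numeric features added for each text column.

    Hypothetical helper; how the real script handles whitespace and
    missing values is not documented here.
    """
    enriched = []
    for row in rows:
        out = dict(row)
        for col in text_columns:
            # Assumption: empty or whitespace-only values count as missing.
            value = (row.get(col) or "").strip()
            out[f"{col}_length"] = len(value)
            out[f"{col}_word_count"] = len(value.split())
            out[f"{col}_is_present"] = int(bool(value))
        enriched.append(out)
    return enriched


# Made-up CSV content standing in for data.csv.
sample = io.StringIO("title,body\nHello world,Some longer body text\nNo body,\n")
rows = list(csv.DictReader(sample))

prepared = enrich_text_columns(rows, ["title", "body"])
for row in prepared:
    print(row)
```

After enrichment every row carries numeric columns alongside the original text, which is what lets scripts that expect numeric or categorical input work on text-heavy data.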

Dependencies

pandas numpy scipy matplotlib seaborn
