magic-statistical-analysis
Perform descriptive statistics, hypothesis testing, and correlation analysis with mandatory uncertainty communication. Use when computing statistics, testing hypotheses, comparing groups, or analyzing correlations with significance.
When It Activates
Use this skill when computing statistics or testing hypotheses. Trigger phrases: statistics, statistical, hypothesis test, t-test, chi-square, correlation, regression, significance, p-value, distribution, balance check.
- Need descriptive statistics with narrative interpretation
- Need hypothesis testing (group comparisons)
- Need correlation analysis with significance
- After magic-data-profiling or magic-data-cleaning, before reporting
- Results naturally feed into magic-report-generation for structured deliverables, or magic-data-visualization for charts
When NOT to Use: Use magic-data-profiling for initial exploration; use magic-data-exploration for pattern discovery.
Quick Facts
| Property | Value |
|---|---|
| Version | 2.0.0 |
| Complexity | high |
| Phase | 1 |
| Scripts | 3 |
Tags
data-science statistics hypothesis-testing correlation analysis
Scripts
Scriptable Tools (call directly or read + adapt)
| Script | Standard CLI Usage | When to Customize |
|---|---|---|
| descriptive_stats.py | python3 descriptive_stats.py --input data.csv --output stats.json | --columns col1,col2 to restrict; --explain to preview the execution plan; --auto-checkpoint for versioned snapshots |
| hypothesis_test.py | python3 hypothesis_test.py --input data.csv --output test.json --group_col region --value_col revenue | --group_col and --value_col are required in practice; --test to override the auto-selected test; --explain to preview the execution plan; --auto-checkpoint for versioned snapshots |
| correlation_analysis.py | python3 correlation_analysis.py --input data.csv --output corr.json | --method pearson\|spearman\|kendall to override the auto-selected method; --columns to restrict |
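As a rough illustration of what "correlation with significance" means for the three methods named above, the following sketch computes a coefficient and p-value with SciPy. It is not the code of correlation_analysis.py; the function name `correlate` and the sample data are assumptions made for this example, and the script's own auto-selection logic is not documented here.

```python
from scipy import stats

# Hypothetical helper, not part of correlation_analysis.py.
METHODS = {
    "pearson": stats.pearsonr,     # linear association
    "spearman": stats.spearmanr,   # monotonic association on ranks
    "kendall": stats.kendalltau,   # rank concordance
}

def correlate(x, y, method="pearson"):
    """Return (coefficient, p_value) for the chosen correlation method."""
    if method not in METHODS:
        raise ValueError(f"unknown method: {method}")
    coefficient, p_value = METHODS[method](x, y)
    return float(coefficient), float(p_value)

# Illustrative data only.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2]
for name in METHODS:
    r, p = correlate(x, y, name)
    print(f"{name}: coefficient={r:.3f}, p={p:.4f}")
```

Reporting the p-value next to the coefficient keeps the strength of the relationship and the evidence for it visibly separate.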
New in v2.0.0
--auto-checkpoint Flag
descriptive_stats.py and hypothesis_test.py support --auto-checkpoint, which saves a numbered snapshot (ckpt_NN_*.csv) after each successful analysis run.
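One way a numbered snapshot name matching ckpt_NN_*.csv could be derived is sketched below. This is an assumption about the scheme, not the scripts' actual implementation: the helper `next_checkpoint_path`, the label after the number, and numbering from 01 are all hypothetical.

```python
import re
import tempfile
from pathlib import Path

# Matches names such as ckpt_03_anything.csv (assumed naming scheme).
CKPT_PATTERN = re.compile(r"^ckpt_(\d{2,})_.*\.csv$")

def next_checkpoint_path(directory, label):
    """Return the next unused ckpt_NN_<label>.csv path inside `directory`."""
    directory = Path(directory)
    numbers = [
        int(match.group(1))
        for path in directory.glob("ckpt_*_*.csv")
        if (match := CKPT_PATTERN.match(path.name))
    ]
    next_number = max(numbers, default=0) + 1  # assumption: numbering starts at 01
    return directory / f"ckpt_{next_number:02d}_{label}.csv"

# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    first = next_checkpoint_path(tmp, "descriptive_stats")
    first.write_text("col\n1\n")
    second = next_checkpoint_path(tmp, "descriptive_stats")
    print(first.name, second.name)
```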
--explain Flag
Both descriptive_stats.py and hypothesis_test.py support --explain, which outputs a JSON execution plan describing which test will be run, on which columns, with which parameters — without writing any result files.
# Preview what hypothesis test will be selected
python3 hypothesis_test.py --input data.csv --output test.json \
  --group_col region --value_col revenue --explain
Test Selection
The skill auto-selects the appropriate statistical test based on data characteristics:
| Condition | Test |
|---|---|
| 2 groups + normal distribution | t-test |
| 2 groups + non-normal | Mann-Whitney U |
| 3+ groups + normal | One-way ANOVA |
| 3+ groups + non-normal | Kruskal-Wallis |
| Both categorical | Chi-square |
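The numeric rows of the table above can be sketched as a small decision function. This is a minimal sketch under stated assumptions, not the logic of hypothesis_test.py: it assumes normality is judged with a Shapiro-Wilk test at alpha = 0.05 on every group, and the names `is_normal` and `select_and_run` are invented for the example. The chi-square row (both variables categorical) is omitted here because it takes a contingency table rather than numeric groups.

```python
import numpy as np
from scipy import stats

def is_normal(sample, alpha=0.05):
    """Assumed check: Shapiro-Wilk p > alpha means no evidence against normality."""
    return stats.shapiro(sample).pvalue > alpha

def select_and_run(groups, alpha=0.05):
    """Pick a test from the number of groups and their normality, then run it."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    if len(groups) < 2:
        raise ValueError("need at least two groups")
    normal = all(is_normal(g, alpha) for g in groups)
    if len(groups) == 2:
        if normal:
            name, result = "t-test", stats.ttest_ind(*groups)
        else:
            name, result = "Mann-Whitney U", stats.mannwhitneyu(
                *groups, alternative="two-sided")
    elif normal:
        name, result = "One-way ANOVA", stats.f_oneway(*groups)
    else:
        name, result = "Kruskal-Wallis", stats.kruskal(*groups)
    return name, float(result.statistic), float(result.pvalue)

# Illustrative data: normal quantiles (near-normal shape) and a heavily skewed sample.
bell = stats.norm.ppf((np.arange(1, 21) - 0.5) / 20)
skewed = np.array([1.0] * 9 + [100.0])

print(select_and_run([bell, bell + 1.0]))
print(select_and_run([skewed, skewed * 2]))
print(select_and_run([bell, bell + 1.0, bell + 2.0]))
```

A failed normality check on any single group sends the whole comparison to the non-parametric test, which is the conservative reading of the table.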
Every test result includes an effect size (Cohen's d, eta-squared, rank-biserial, or Cramér's V) alongside the p-value.
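To make two of these effect sizes concrete, here is a hedged sketch using their textbook definitions: Cohen's d with a pooled standard deviation, and Cramér's V from an uncorrected chi-square statistic. The function names are assumptions, and the skill's scripts may compute these differently (for example with a continuity or bias correction).

```python
import numpy as np
from scipy import stats

def cohens_d(a, b):
    """Standardized mean difference using the pooled sample standard deviation."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return float((a.mean() - b.mean()) / np.sqrt(pooled_var))

def cramers_v(table):
    """Return (V, p_value) for a contingency table of observed counts."""
    table = np.asarray(table, dtype=float)
    # correction=False: no Yates continuity correction (a choice made for this sketch)
    chi2, p_value, dof, expected = stats.chi2_contingency(table, correction=False)
    n = table.sum()
    k = min(table.shape) - 1
    return float(np.sqrt(chi2 / (n * k))), float(p_value)

# Illustrative inputs only.
print(cohens_d([1, 2, 3], [2, 3, 4]))
print(cramers_v([[10, 0], [0, 10]]))
```

The effect size says how large a difference or association is, while the p-value only says how surprising it would be under the null hypothesis, which is why the two are reported together.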
Dependencies
pandas numpy scipy matplotlib seaborn
Related Skills from Other Suites
- Linguistic Eval — NLP evaluation metrics