Generate a Quick Report

This tutorial shows the full three-skill pipeline in action: load a CSV, profile its quality, and generate a structured markdown report, all from a single prompt. The agent chains magic-data-loading → magic-data-profiling → magic-report-generation automatically, producing a 6-section report with severity-rated issues and recommended cleaning steps.

Setup

Follow these steps to set up your environment. If you've already installed the skills, skip ahead to downloading the sample data.

Install the skills

npx @votee-ai/magic-data-agent-skills init

Create a workspace

mkdir magic-tutorial && cd magic-tutorial

Download sample data

curl --create-dirs -o data/sales_data.csv https://docs.votee.ai/tutorials/data/sales_data.csv

Sample Data & Output

sales_data.csv

Input — 515 rows × 12 columns with embedded quality issues

quality_report.md

Output: the actual 6-section report the agent generated

The Agent Trace

The trace below shows how the agent chains three skills together (loading, profiling, and report generation) from a single prompt.

magic-data-loading · magic-data-profiling · magic-report-generation · 3 minutes

Load data/sales_data.csv, profile it, and generate a complete data quality report. Save the report as reports/quality_report.md

User prompt

00:00

magic-data-loading · Skill · load, csv

00:05

Detected 'Load' with a CSV file path — starting the pipeline with data ingestion. The agent also lists the skill's scripts/ directory to find callable tools.

detect_format.py (reference script) · Execution

00:10

Input

{
  "file_path": "data/sales_data.csv"
}

Output

{
  "format": "csv",
  "encoding": "utf-8",
  "delimiter": ",",
  "has_header": true,
  "estimated_rows": 515
}

python3 .claude/skills/magic-data-loading/scripts/detect_format.py data/sales_data.csv /tmp/detect_result.json
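
The trace does not show what detect_format.py does internally. As a rough illustration of how comparable fields could be derived, here is a minimal sketch using only Python's standard csv module. The function name sniff_csv and the in-memory sample are assumptions for this example, not part of the skill.

```python
import csv
import io

def sniff_csv(text: str) -> dict:
    """Guess delimiter and header presence for a CSV sample held in memory."""
    sniffer = csv.Sniffer()
    # Restrict the candidates to common delimiters to avoid odd guesses.
    dialect = sniffer.sniff(text, delimiters=",;\t|")
    rows = list(csv.reader(io.StringIO(text), dialect))
    has_header = sniffer.has_header(text)
    return {
        "format": "csv",
        "delimiter": dialect.delimiter,
        "has_header": has_header,
        # Data rows only: subtract the header line when one is detected.
        "estimated_rows": len(rows) - (1 if has_header else 0),
    }

# Hypothetical three-row sample; the real file has 515 rows and 12 columns.
sample = "order_id,quantity,notes\n1,10,a\n2,20,b\n3,30,c\n"
result = sniff_csv(sample)
print(result)
```

csv.Sniffer is heuristic, so on messy files its guesses are worth confirming against a few parsed rows before trusting them.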

load_file.py (reference script) · Execution

00:15

Input

{
  "file_path": "data/sales_data.csv",
  "output_format": "csv"
}

Output

{
  "rows": 515,
  "columns": 12,
  "output_file": "data/sales_loaded.csv"
}

python3 .claude/skills/magic-data-loading/scripts/load_file.py data/sales_data.csv data/sales_loaded.csv --output-format csv

magic-data-profiling · Skill · profile

00:19

magic-data-loading → magic-data-profiling: Data loaded; 515 rows × 12 columns saved to checkpoint. Transitioning to profiling for quality assessment.

User asked to 'profile' — loading profiling SKILL.md with quality scoring patterns and issue detection methods

quality_score.py + detect_all_issues.py (reference scripts) · Execution

00:22

Input

{
  "file_path": "data/sales_loaded.csv"
}

Output

{
  "quality_score": 88.9,
  "grade": "B",
  "issues_detected": 6,
  "duplicates": 15,
  "date_format_inconsistency": "20.4% non-ISO",
  "quantity_outliers": 5,
  "notes_missing": "89.7%"
}

mkdir -p logs
python3 .claude/skills/magic-data-profiling/scripts/quality_score.py data/sales_loaded.csv logs/quality_report.json
python3 .claude/skills/magic-data-profiling/scripts/detect_all_issues.py data/sales_loaded.csv logs/issues_report.json --include-content-validation

deep-dive analysis (custom code) · Execution

00:30

Input

{
  "focus": [
    "duplicates",
    "date_formats",
    "distributions"
  ]
}

Output

{
  "duplicate_rows": 30,
  "duplicate_order_ids": 15,
  "date_formats": {
    "ISO": 410,
    "US_slash": 69,
    "DD-Mon-YYYY": 36
  },
  "quantity_outlier_range": "5,741–15,862 vs median ~165"
}

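import re
import pandas as pd

# Load the checkpoint written by the loading skill
df = pd.read_csv('data/sales_loaded.csv')

# keep=False marks every row in a duplicate group, so both copies of each pair are counted
dupes = df[df.duplicated(keep=False)]
print(f'Duplicate rows: {len(dupes)}')

# Patterns for the ISO (YYYY-MM-DD) and US slash (MM/DD/YYYY) date formats
iso_pat = re.compile(r'^\d{4}-\d{2}-\d{2}$')
us_pat = re.compile(r'^\d{2}/\d{2}/\d{4}$')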
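
The excerpt stops after defining the patterns. One way the per-format counts in the output could be tallied is sketched below, using only the standard library. The DD-Mon-YYYY pattern, the helper names, and the sample values are assumptions added for illustration; the agent's actual code is not shown in the trace.

```python
import re
from collections import Counter

# ISO and US_slash mirror the patterns in the trace excerpt;
# the DD-Mon-YYYY pattern is an assumption added for this sketch.
DATE_PATTERNS = {
    "ISO": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
    "US_slash": re.compile(r"^\d{2}/\d{2}/\d{4}$"),
    "DD-Mon-YYYY": re.compile(r"^\d{2}-[A-Za-z]{3}-\d{4}$"),
}

def classify_date(value: str) -> str:
    """Return the label of the first pattern that matches, else 'other'."""
    for label, pattern in DATE_PATTERNS.items():
        if pattern.match(value.strip()):
            return label
    return "other"

def count_formats(values):
    """Tally how many values fall into each date format."""
    return Counter(classify_date(v) for v in values)

# Hypothetical sample values, not rows from sales_data.csv.
sample = ["2024-01-05", "01/05/2024", "05-Jan-2024", "2024-02-10", "n/a"]
counts = count_formats(sample)
print(dict(counts))
```

Keeping an explicit "other" bucket makes it easy to spot values that fit none of the expected formats instead of silently dropping them.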

magic-report-generation · Skill · generate, report

00:37

magic-data-profiling → magic-report-generation: Profiling complete; 6 findings documented in logs/issues_report.json. The user explicitly asked for a report, so the agent activates the report generation skill.

Detected 'generate a complete data quality report' — loading report generation SKILL.md with template patterns and validation rules

write quality_report.md · Execution

00:42

Input

{
  "template": "standard",
  "output_path": "reports/quality_report.md",
  "profiling_data": "logs/quality_report.json + logs/issues_report.json"
}

Output

{
  "file": "reports/quality_report.md",
  "word_count": 1417,
  "sections": 6,
  "findings": 6
}

validate_report.py (reference script) · Execution

00:48

Input

{
  "report_path": "reports/quality_report.md",
  "template": "standard"
}

Output

{
  "all_sections_present": true,
  "structure_valid": true,
  "word_count": 1417,
  "template_ceiling": 600,
  "verdict": "valid — word count exceeds ceiling but justified by 6 distinct findings"
}

python3 .claude/skills/magic-report-generation/scripts/validate_report.py reports/quality_report.md logs/report_validation.json --template standard
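
The validation output combines a structural check (are the required sections present?) with a word-count check against a template ceiling. The sketch below shows one way such a check could work. It is not the real validate_report.py: the function name, the section titles, and the heading convention (markdown # headings) are all assumptions for this example.

```python
import re

def validate_report(text: str, required_sections: list[str], word_ceiling: int) -> dict:
    """Check that required headings exist and compare word count to a ceiling."""
    # Collect heading titles from lines such as "## Summary".
    headings = {
        m.group(1).strip().lower()
        for m in re.finditer(r"^#{1,6}\s+(.+)$", text, flags=re.MULTILINE)
    }
    missing = [s for s in required_sections if s.lower() not in headings]
    word_count = len(text.split())
    return {
        "all_sections_present": not missing,
        "missing_sections": missing,
        "word_count": word_count,
        "template_ceiling": word_ceiling,
        "over_ceiling": word_count > word_ceiling,
    }

# Hypothetical report text and section names, used only to exercise the check.
report = "# Summary\nScore is fine.\n\n## Issues\nNone found.\n"
check = validate_report(report, ["Summary", "Issues", "Recommendations"], word_ceiling=600)
print(check)
```

Reporting over_ceiling as a separate flag, rather than failing outright, leaves room for the kind of judgment seen in the trace, where a 1417-word report was accepted against a 600-word ceiling.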

reports/quality_report.md · Checkpoint

00:50

reports/quality_report.md

Complete data quality report with 6 sections, validated against standard template. Supporting artifacts in logs/.

Results delivered · Output

00:52

Report saved to `reports/quality_report.md`

Headline: 88.9/100 (Grade B)

All 6 mandatory sections present, structure validated.

Key findings:

20.4% of dates use non-ISO formats — 69 rows MM/DD/YYYY, 36 rows DD-Mon-YYYY

15 duplicated `order_id` values, covering 30 rows that form 15 exact-duplicate pairs

5 extreme `quantity` outliers (5,741–15,862) that account for the bulk of revenue variance

`notes` column 89.7% empty with uniform whitespace padding

`email` ↔ `sales_rep` mapping broken — emails shared across multiple reps

Missing contact data — email 13%, phone 8.2%

Supporting artifacts saved to logs/quality_report.json and logs/issues_report.json.

Artifacts

quality_report.md

6-section data quality report, validated against standard template

quality_report.json

Quality score and grade

issues_report.json

Detailed issue detection results

report_validation.json

Template validation results

Data Flow

Step	Skill	Input	Output	Key Config
1	`magic-data-loading`	`sales_data.csv` (raw file)	DataFrame (515×12), saved to `sales_loaded.csv`	encoding: auto-detect
2	`magic-data-profiling`	`sales_loaded.csv`	Quality score (88.9/100), 6 issues in `logs/`	scripts: quality_score.py, detect_all_issues.py
3	`magic-report-generation`	Profiling results in `logs/`	`reports/quality_report.md` (6 sections, 1417 words)	template: standard, validated

Report Preview

The agent generated a comprehensive quality report with 6 sections. Here's the issues summary:

#	Severity	Issue	Affected Rows	Recommendation
1	High	15 exact duplicate rows	30 (15 pairs)	Deduplicate by dropping exact duplicates
2	High	3 inconsistent date formats	105 (non-ISO rows)	Normalize all dates to `YYYY-MM-DD`
3	High	Emails shared across all reps	448	Do not use email as a rep identifier
4	Medium	5 extreme quantity outliers	5	Flag for business review
5	Medium	67 missing emails (13.0%)	67	Accept or impute based on business rules
6	Medium	42 missing phones (8.2%)	42	Accept or impute based on business rules
7	Low	Notes field 89.7% empty	515	Drop the column
8	Low	Whitespace in notes	53	Strip whitespace during cleaning
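
The first recommendations in the table (drop exact duplicates, normalize dates to YYYY-MM-DD, strip whitespace) could be applied roughly as follows. This is a minimal standard-library sketch, not the agent's cleaning code: the helper names and the order_date column name are hypothetical, since the tutorial does not name the date column.

```python
from datetime import datetime

# Formats matching the three date styles the report lists:
# YYYY-MM-DD, MM/DD/YYYY, and DD-Mon-YYYY.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d-%b-%Y")

def normalize_date(value: str) -> str:
    """Return the date as YYYY-MM-DD, or raise ValueError if no format fits."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {value!r}")

def clean_rows(rows, date_field):
    """Drop exact duplicates (keeping the first), strip strings, normalize dates."""
    seen = set()
    cleaned = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key in seen:
            continue  # exact copy of an earlier row
        seen.add(key)
        fixed = {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}
        fixed[date_field] = normalize_date(fixed[date_field])
        cleaned.append(fixed)
    return cleaned

# Hypothetical rows; "order_date" is an assumed column name.
rows = [
    {"order_id": "A1", "order_date": "03/15/2024", "notes": "  rush  "},
    {"order_id": "A1", "order_date": "03/15/2024", "notes": "  rush  "},
    {"order_id": "A2", "order_date": "15-Mar-2024", "notes": ""},
]
cleaned = clean_rows(rows, date_field="order_date")
print(cleaned)
```

Raising on an unrecognized date, rather than guessing, keeps ambiguous values visible for review. The %b directive depends on the locale, so month abbreviations such as "Mar" assume an English or C locale.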

What Happened

Three-Skill Chain

The prompt triggered a pipeline across three skills:

"Load" → magic-data-loading
"profile" → magic-data-profiling
"generate a complete data quality report" → magic-report-generation

The agent routed through all three automatically, activating each skill and reading its SKILL.md as it transitioned.
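
As a mental model only, the trigger-to-skill mapping above can be pictured as an ordered keyword lookup. The real agent decides which skill to activate by reading each skill's SKILL.md, not by substring matching, so the plan_skills function and the TRIGGERS table below are purely illustrative assumptions.

```python
# Trigger words taken from the tutorial prompt; the mapping itself is illustrative.
TRIGGERS = [
    ("load", "magic-data-loading"),
    ("profile", "magic-data-profiling"),
    ("report", "magic-report-generation"),
]

def plan_skills(prompt: str) -> list[str]:
    """Return skills in the order their trigger words first appear in the prompt."""
    lowered = prompt.lower()
    hits = [(lowered.find(word), skill) for word, skill in TRIGGERS if word in lowered]
    return [skill for _, skill in sorted(hits)]

prompt = ("Load data/sales_data.csv, profile it, and generate a complete "
          "data quality report. Save the report as reports/quality_report.md")
plan = plan_skills(prompt)
print(plan)
```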

Data Loading

Loaded sales_data.csv — 515 rows × 12 columns. The agent checked file existence, line count, and column types in a single pass, then saved the result to sales_loaded.csv for downstream skills.

Comprehensive Profiling

Because the prompt included "profile", the agent activated magic-data-profiling, transitioning from the loading skill. It ran a single comprehensive analysis covering all quality dimensions:

Completeness — 90.8% (email/phone/notes have gaps)
Uniqueness — 97.1% (15 exact duplicate rows)
Consistency — 3 mixed date formats, emails shared across reps
Validity — 5 quantity outliers, but revenue math is 100% consistent
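
The percentages above can be reproduced from the counts reported elsewhere in this tutorial. The sketch below assumes completeness is the share of non-missing cells and uniqueness is the share of rows left after removing one copy from each duplicate pair. The scoring script's actual formulas are not shown, but these assumptions match the reported figures. The notes count is inferred from the reported 89.7%, and the calculation assumes no other column has gaps.

```python
ROWS, COLUMNS = 515, 12

# Missing-cell counts: email and phone are stated in the report;
# the notes count is inferred here from the reported 89.7%.
missing = {"email": 67, "phone": 42, "notes": round(0.897 * ROWS)}

total_cells = ROWS * COLUMNS
completeness = 100 * (1 - sum(missing.values()) / total_cells)

duplicate_extras = 15  # the second copy in each of the 15 duplicate pairs
uniqueness = 100 * (1 - duplicate_extras / ROWS)

non_iso_dates = 69 + 36  # MM/DD/YYYY rows + DD-Mon-YYYY rows
non_iso_share = 100 * non_iso_dates / ROWS

print(f"completeness: {completeness:.1f}%")     # completeness: 90.8%
print(f"uniqueness: {uniqueness:.1f}%")         # uniqueness: 97.1%
print(f"non-ISO dates: {non_iso_share:.1f}%")   # non-ISO dates: 20.4%
```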

Report Generation

Because the prompt asked for a "complete data quality report", the agent activated magic-report-generation, transitioning from the profiling skill. It wrote reports/quality_report.md — a 6-section, 1417-word report with severity-rated issues and a prioritized cleaning sequence.

Try It Yourself

Copy this prompt into your MAGIC session:

Load data/sales_data.csv, profile it, and generate a complete data quality
report. Save the report as reports/quality_report.md

Want a shorter output? Try:

Load data/sales_data.csv and give me a brief quality summary

The agent adjusts verbosity depending on whether you ask for a "complete report" or a "brief summary".

What You Learned

Multi-skill chaining works from a single prompt — you don't need to invoke each skill separately.
The agent produces a structured markdown report you can commit to version control or share with stakeholders.
Reports include severity ratings (High/Medium/Low) and specific row counts — not vague descriptions.
The recommended cleaning sequence is ordered by priority, with the highest-impact fixes first.
All three skills share context automatically — profiling results flow directly into the report without you needing to specify file paths.
