# Core Commands
The core commands provide essential data quality operations for validation, profiling, and data protection.
## Overview
| Command | Description | Primary Use Case |
|---|---|---|
| `learn` | Learn schema from data | Schema inference |
| `check` | Validate data quality | Data validation |
| `scan` | Scan for PII | Privacy compliance |
| `mask` | Mask sensitive data | Data anonymization |
| `profile` | Generate data profile | Data exploration |
| `compare` | Detect data drift | Model monitoring |
## Typical Workflow
```mermaid
graph LR
    A[Raw Data] --> B[learn]
    B --> C[schema.yaml]
    A --> D[check]
    C --> D
    D --> E{Issues?}
    E -->|Yes| F[Fix Data]
    E -->|No| G[scan]
    G --> H{PII Found?}
    H -->|Yes| I[mask]
    H -->|No| J[Ready]
    I --> J
```
## 1. Schema Learning
First, learn a schema from your reference data:
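A minimal sketch, assuming the shared `-o` output option (see Common Options below) also applies to `learn`; verify the exact flags with your installation's help output:

```bash
# Infer a schema from reference data and save it for later validation runs.
# The -o destination flag is an assumption; only the learn command itself is documented above.
truthound learn reference_data.csv -o schema.yaml
```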
## 2. Data Validation
Validate new data against the schema:
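A sketch assuming a `--schema` option for pointing `check` at the learned schema file; that flag name is illustrative and not documented on this page:

```bash
# Validate new data against the schema learned in the previous step.
# The --schema flag is an assumption; --format, -o, and --strict are the options shown elsewhere on this page.
truthound check new_data.csv --schema schema.yaml
```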
## 3. PII Detection
Scan for personally identifiable information:
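For example, writing the findings to a JSON report using the shared format and output options:

```bash
# Scan a dataset for personally identifiable information and save the findings.
truthound scan data.csv --format json -o pii_report.json
```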
## 4. Data Masking
Mask sensitive columns before sharing:
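A sketch of masking selected columns; the `--columns` selector is an assumed flag, so check the command's help for the actual interface:

```bash
# Mask sensitive columns and write an anonymized copy of the data.
# The --columns flag is illustrative; only the mask command itself is documented above.
truthound mask data.csv --columns email,ssn -o masked_data.csv
```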
## 5. Data Profiling
Generate statistical profile for analysis:
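A sketch assuming the shared format and output options also apply to `profile`:

```bash
# Generate a statistical profile and render it as an HTML report.
# Whether profile supports --format html is an assumption based on the shared options below.
truthound profile data.csv --format html -o profile.html
```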
## 6. Drift Detection
Compare datasets to detect distribution changes:
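A sketch; the baseline-then-current argument order is an assumption, since only the `compare` command name appears in the overview table:

```bash
# Compare the current dataset against a baseline to detect distribution drift.
# Argument order (baseline first, current second) is assumed.
truthound compare baseline.csv current.csv
```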
## Common Options
All core commands share these common patterns:
### Output Format (`-f`, `--format`)
```bash
# Console output (default)
truthound check data.csv

# JSON output
truthound check data.csv --format json

# HTML report
truthound check data.csv --format html -o report.html
```
### Output File (`-o`, `--output`)
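Write results to a file instead of printing to the console, as in the format examples above:

```bash
# Save validation results as JSON
truthound check data.csv --format json -o results.json
```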
### Strict Mode (`--strict`)
Exit with code 1 if issues are found (useful for CI/CD):
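```bash
# Fails with exit code 1 when any quality issue is detected
truthound check data.csv --strict
```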
## CI/CD Integration
Use core commands in your CI/CD pipeline:
```yaml
# GitHub Actions example
- name: Validate Data Quality
  run: truthound check data/*.csv --strict

- name: Check for PII
  run: truthound scan data/*.csv --format json -o pii_report.json
```