Skip to content

CLI Reference

The CLI is the fastest way to run Truthound from a terminal, shell script, or CI job. Use it when you want file-based validation, profiling, docs generation, checkpoints, benchmark flows, or scaffolding without embedding Truthound in Python code.

Installation

pip install truthound

After installation, the truthound command is available globally.

When To Use The CLI

Choose the CLI when you want:

  • a shell-first validation workflow
  • simple CI or cron execution with explicit commands
  • team-friendly reproducibility without asking everyone to write Python
  • quick access to profiling, docs generation, plugin scaffolding, and benchmark commands

CLI vs Python API

The CLI only supports file-based inputs. For SQL databases (PostgreSQL, MySQL, SQLite), Spark DataFrames, or Cloud Data Warehouses (BigQuery, Snowflake, Redshift, Databricks), use the Python API with the source= parameter.

  1. truthound check for a first validation
  2. profiler commands for profile-driven suite generation
  3. checkpoint commands for automation
  4. plugin and scaffolding commands for extension work

If you are still learning the platform, start with Getting Started. If you already know the feature area, jump directly to the command-group pages below.

Supported Input Formats

Format Extension Description
CSV .csv Comma-separated values
JSON .json Standard JSON array
Parquet .parquet Columnar storage format
NDJSON .ndjson Newline-delimited JSON
JSONL .jsonl JSON Lines (same as NDJSON)

Quick Reference

Core Commands Summary

Command Arguments Options
learn FILE (required) --output, -o (schema.yaml), --no-constraints
check FILE (required) --validators, -v, --min-severity, -s (low/medium/high/critical), --schema, --auto-schema, --format, -f (console/json/html), --output, -o, --strict
scan FILE (required) --format, -f (console/json/html), --output, -o
mask FILE (required) --output, -o (required), --columns, -c, --strategy, -s (redact/hash/fake), --strict
profile FILE (required) --format, -f (console/json), --output, -o
compare BASELINE CURRENT (required) --columns, -c, --method, -m (auto/ks/psi/chi2/js), --threshold, -t, --format, -f (console/json), --output, -o, --strict

Profiler Commands Summary

Command Arguments Options
auto-profile FILE (required) --output, -o, --format, -f (console/json/yaml), --patterns/--no-patterns, --correlations/--no-correlations, --sample, -s, --top-n (10)
generate-suite PROFILE_FILE (required) --output, -o, --format, -f (yaml/json/python/toml/checkpoint), --strictness, -s (loose/medium/strict), --include, -i, --exclude, -e, --min-confidence (low/medium/high), --name, -n, --preset, -p, --config, -c, --group-by-category, --code-style (functional/class_based/declarative)
quick-suite FILE (required) --output, -o, --format, -f (yaml/json/python/toml/checkpoint), --strictness, -s (loose/medium/strict), --include, -i, --exclude, -e, --min-confidence, --name, -n, --preset, -p, --sample-size
list-formats - -
list-presets - -
list-categories - -

Checkpoint Commands Summary

checkpoint vs realtime checkpoint

truthound checkpoint is for CI/CD pipelines (YAML configuration file based).

For streaming validation state management, use truthound realtime checkpoint.

Command Arguments Options
checkpoint run NAME (required) --config, -c (truthound.yaml), --data, -d, --validators, -v, --output, -o, --format, -f (console/json), --strict, --store, --slack, --webhook, --github-summary
checkpoint list - --config, -c, --format, -f (console/json)
checkpoint validate CONFIG_FILE (required) --strict, -s
checkpoint init - --output, -o (truthound.yaml), --format, -f (yaml/json)

ML Commands Summary

Command Arguments Options
ml anomaly FILE (required) --method, -m (zscore/iqr/mad/isolation_forest), --contamination, -c (0.1), --columns, --output, -o, --format, -f (console/json)
ml drift BASELINE CURRENT (required) --method, -m (distribution/feature/multivariate), --threshold, -t (0.1), --columns, --output, -o
ml learn-rules FILE (required) --output, -o (learned_rules.json), --strictness, -s (loose/medium/strict), --min-confidence (0.9), --max-rules (100)

Docs Commands Summary

Command Arguments Options
docs generate PROFILE_FILE (required) --output, -o, --title, -t ("Data Profile Report"), --subtitle, -s, --theme (light/dark/professional/minimal/modern), --format, -f (html/pdf)
docs themes - -

Data Docs Dashboard UI Command Summary

Command Arguments Options
dashboard - --profile, -p, --port (8080), --host (localhost), --title, -t ("Truthound Dashboard"), --debug

This command launches the local Data Docs dashboard UI. For the first-party control-plane product, see Truthound Dashboard.

Realtime Commands Summary

realtime checkpoint vs checkpoint

truthound realtime checkpoint is for streaming validation state management (--dir option based).

For CI/CD pipelines, use truthound checkpoint.

Command Arguments Options
realtime validate SOURCE (required) --validators, -v, --batch-size, -b (1000), --max-batches (10, 0=unlimited), --output, -o
realtime monitor SOURCE (required) --interval, -i (5), --duration, -d (60, 0=unlimited)
realtime checkpoint list - --dir, -d (./checkpoints), --format, -f (console/json)
realtime checkpoint show CHECKPOINT_ID (required) --dir, -d
realtime checkpoint delete CHECKPOINT_ID (required) --dir, -d, --force, -f

Benchmark Commands Summary

Command Arguments Options
benchmark run BENCHMARK (optional) --suite, -s (quick/ci/full/profiling/validation), --size (tiny/small/medium/large/xlarge), --rows, -r, --iterations, -i (5), --warmup, -w (2), --output, -o, --format, -f (console/json/html), --save-baseline, --compare-baseline, --verbose, -v
benchmark list - --format, -f (console/json)
benchmark compare BASELINE CURRENT (required) --threshold, -t (10.0%), --format, -f (console/json)
benchmark parity - --suite (pr-fast/nightly-core/nightly-sql/release-ga), --frameworks (truthound/gx/both), --backend, --output, --save-baseline, --compare-baseline, --strict

Scaffolding Commands Summary

Command Arguments Options
new validator NAME (required) --output, -o (.), --template, -t (basic/column/pattern/range/comparison/composite/full), --author, -a, --description, -d, --category, -c (custom), --tests/--no-tests (--tests), --docs/--no-docs (--no-docs), --severity, -s (MEDIUM), --pattern, --min, --max
new reporter NAME (required) --output, -o (.), --template, -t (basic/full), --author, -a, --description, -d, --tests/--no-tests (--tests), --docs/--no-docs (--no-docs), --extension, -e (.txt), --content-type (text/plain)
new plugin NAME (required) --output, -o (.), --type, -t (validator/reporter/hook/datasource/action/full), --author, -a, --description, -d, --tests/--no-tests (--tests), --min-version (defaults to current Truthound release), --python (3.10)
new list - --verbose, -v
new templates SCAFFOLD_TYPE (required) -

Plugin Commands Summary

Command Arguments Options
plugins list - --type, -t (validator/reporter/hook/datasource/action/custom), --state, -s (discovered/loading/loaded/active/inactive/error/unloading), --verbose, -v, --json
plugins info NAME (required) --json
plugins load NAME (required) --activate/--no-activate (--activate)
plugins unload NAME (required) -
plugins enable NAME (required) -
plugins disable NAME (required) -
plugins create NAME (required) --output, -o (.), --type, -t (validator/reporter/hook/custom), --author

Global Options

These options are available for all commands:

Option Description
--help Show help message and exit
--version Show version and exit

Command Groups

Core Commands

Essential data quality operations:

Command Description
learn Learn schema from data
check Validate data quality
scan Scan for PII
mask Mask sensitive data
profile Generate data profile
compare Detect data drift

Profiler Commands

Advanced profiling and rule generation:

Command Description
auto-profile Profile with auto-detection
generate-suite Generate validation rules from profile
quick-suite Profile and generate rules in one step
list-formats List supported output formats
list-presets List available presets
list-categories List rule categories

Checkpoint Commands

CI/CD pipeline integration:

Command Description
checkpoint run Run validation pipeline
checkpoint list List available checkpoints
checkpoint validate Validate configuration
checkpoint init Initialize sample config

ML Commands

Machine learning-based detection:

Command Description
ml anomaly Detect anomalies
ml drift Detect data drift
ml learn-rules Learn validation rules

Lineage Commands Summary

Command Arguments Options
lineage show LINEAGE_FILE (required) --node, -n, --direction, -d (upstream/downstream/both), --format, -f (console/json/dot)
lineage impact LINEAGE_FILE NODE (required) --max-depth (-1), --output, -o
lineage visualize LINEAGE_FILE (required) --output, -o (required), --renderer, -r (d3/cytoscape/graphviz/mermaid), --theme, -t (light/dark), --focus, -f

Lineage Commands

Data lineage tracking:

Command Description
lineage show Display lineage information
lineage impact Analyze change impact
lineage visualize Generate lineage visualization

Docs Commands

Documentation generation:

Command Description
docs generate Generate HTML/PDF report
docs themes List available themes

Data Docs Dashboard UI Command

Interactive data exploration:

Command Description
dashboard Launch interactive dashboard

Realtime Commands

Streaming validation:

Command Description
realtime validate Validate streaming data
realtime monitor Monitor validation metrics
realtime checkpoint Manage validation checkpoints

Benchmark Commands

Performance testing:

Command Description
benchmark run Run performance benchmarks
benchmark list List available benchmarks
benchmark compare Compare benchmark results
benchmark parity Run release-grade Truthound vs GX parity suites

Scaffolding Commands

Code generation:

Command Description
new validator Create custom validator
new reporter Create custom reporter
new plugin Create plugin package
new list List scaffold types
new templates List available templates

Plugin Commands

Plugin management:

Command Description
plugins list List discovered plugins
plugins info Show plugin details
plugins load Load a plugin
plugins unload Unload a plugin
plugins enable Enable a plugin
plugins disable Disable a plugin
plugin create Create plugin template

Exit Codes

Code Meaning
0 Success
1 General error or validation failed (with --strict)
2 Usage error (invalid arguments)

Quick Examples

# Learn schema from data
truthound learn data.csv -o schema.yaml

# Validate data quality
truthound check data.csv --strict

# Scan for PII
truthound scan customers.csv

# Mask sensitive data
truthound mask data.csv -o masked.csv --strategy hash

# Generate data profile
truthound profile data.csv --format json -o profile.json

# Compare datasets for drift
truthound compare baseline.csv current.csv --method psi