CLI Reference¶
The CLI is the fastest way to run Truthound from a terminal, shell script, or CI job. Use it when you want file-based validation, profiling, docs generation, checkpoints, benchmark flows, or scaffolding without embedding Truthound in Python code.
Installation¶
After installation, the truthound command is available globally.
When To Use The CLI¶
Choose the CLI when you want:
- a shell-first validation workflow
- simple CI or cron execution with explicit commands
- team-friendly reproducibility without asking everyone to write Python
- quick access to profiling, docs generation, plugin scaffolding, and benchmark commands
CLI vs Python API
The CLI only supports file-based inputs. For SQL databases (PostgreSQL, MySQL, SQLite), Spark DataFrames, or Cloud Data Warehouses (BigQuery, Snowflake, Redshift, Databricks), use the Python API with the source= parameter.
Recommended Reading Path¶
truthound checkfor a first validation- profiler commands for profile-driven suite generation
- checkpoint commands for automation
- plugin and scaffolding commands for extension work
If you are still learning the platform, start with Getting Started. If you already know the feature area, jump directly to the command-group pages below.
Supported Input Formats¶
| Format | Extension | Description |
|---|---|---|
| CSV | .csv |
Comma-separated values |
| JSON | .json |
Standard JSON array |
| Parquet | .parquet |
Columnar storage format |
| NDJSON | .ndjson |
Newline-delimited JSON |
| JSONL | .jsonl |
JSON Lines (same as NDJSON) |
Quick Reference¶
Core Commands Summary¶
| Command | Arguments | Options |
|---|---|---|
learn |
FILE (required) | --output, -o (schema.yaml), --no-constraints |
check |
FILE (required) | --validators, -v, --min-severity, -s (low/medium/high/critical), --schema, --auto-schema, --format, -f (console/json/html), --output, -o, --strict |
scan |
FILE (required) | --format, -f (console/json/html), --output, -o |
mask |
FILE (required) | --output, -o (required), --columns, -c, --strategy, -s (redact/hash/fake), --strict |
profile |
FILE (required) | --format, -f (console/json), --output, -o |
compare |
BASELINE CURRENT (required) | --columns, -c, --method, -m (auto/ks/psi/chi2/js), --threshold, -t, --format, -f (console/json), --output, -o, --strict |
Profiler Commands Summary¶
| Command | Arguments | Options |
|---|---|---|
auto-profile |
FILE (required) | --output, -o, --format, -f (console/json/yaml), --patterns/--no-patterns, --correlations/--no-correlations, --sample, -s, --top-n (10) |
generate-suite |
PROFILE_FILE (required) | --output, -o, --format, -f (yaml/json/python/toml/checkpoint), --strictness, -s (loose/medium/strict), --include, -i, --exclude, -e, --min-confidence (low/medium/high), --name, -n, --preset, -p, --config, -c, --group-by-category, --code-style (functional/class_based/declarative) |
quick-suite |
FILE (required) | --output, -o, --format, -f (yaml/json/python/toml/checkpoint), --strictness, -s (loose/medium/strict), --include, -i, --exclude, -e, --min-confidence, --name, -n, --preset, -p, --sample-size |
list-formats |
- | - |
list-presets |
- | - |
list-categories |
- | - |
Checkpoint Commands Summary¶
checkpoint vs realtime checkpoint
truthound checkpoint is for CI/CD pipelines (YAML configuration file based).
For streaming validation state management, use truthound realtime checkpoint.
| Command | Arguments | Options |
|---|---|---|
checkpoint run |
NAME (required) | --config, -c (truthound.yaml), --data, -d, --validators, -v, --output, -o, --format, -f (console/json), --strict, --store, --slack, --webhook, --github-summary |
checkpoint list |
- | --config, -c, --format, -f (console/json) |
checkpoint validate |
CONFIG_FILE (required) | --strict, -s |
checkpoint init |
- | --output, -o (truthound.yaml), --format, -f (yaml/json) |
ML Commands Summary¶
| Command | Arguments | Options |
|---|---|---|
ml anomaly |
FILE (required) | --method, -m (zscore/iqr/mad/isolation_forest), --contamination, -c (0.1), --columns, --output, -o, --format, -f (console/json) |
ml drift |
BASELINE CURRENT (required) | --method, -m (distribution/feature/multivariate), --threshold, -t (0.1), --columns, --output, -o |
ml learn-rules |
FILE (required) | --output, -o (learned_rules.json), --strictness, -s (loose/medium/strict), --min-confidence (0.9), --max-rules (100) |
Docs Commands Summary¶
| Command | Arguments | Options |
|---|---|---|
docs generate |
PROFILE_FILE (required) | --output, -o, --title, -t ("Data Profile Report"), --subtitle, -s, --theme (light/dark/professional/minimal/modern), --format, -f (html/pdf) |
docs themes |
- | - |
Data Docs Dashboard UI Command Summary¶
| Command | Arguments | Options |
|---|---|---|
dashboard |
- | --profile, -p, --port (8080), --host (localhost), --title, -t ("Truthound Dashboard"), --debug |
This command launches the local Data Docs dashboard UI. For the first-party control-plane product, see Truthound Dashboard.
Realtime Commands Summary¶
realtime checkpoint vs checkpoint
truthound realtime checkpoint is for streaming validation state management (--dir option based).
For CI/CD pipelines, use truthound checkpoint.
| Command | Arguments | Options |
|---|---|---|
realtime validate |
SOURCE (required) | --validators, -v, --batch-size, -b (1000), --max-batches (10, 0=unlimited), --output, -o |
realtime monitor |
SOURCE (required) | --interval, -i (5), --duration, -d (60, 0=unlimited) |
realtime checkpoint list |
- | --dir, -d (./checkpoints), --format, -f (console/json) |
realtime checkpoint show |
CHECKPOINT_ID (required) | --dir, -d |
realtime checkpoint delete |
CHECKPOINT_ID (required) | --dir, -d, --force, -f |
Benchmark Commands Summary¶
| Command | Arguments | Options |
|---|---|---|
benchmark run |
BENCHMARK (optional) | --suite, -s (quick/ci/full/profiling/validation), --size (tiny/small/medium/large/xlarge), --rows, -r, --iterations, -i (5), --warmup, -w (2), --output, -o, --format, -f (console/json/html), --save-baseline, --compare-baseline, --verbose, -v |
benchmark list |
- | --format, -f (console/json) |
benchmark compare |
BASELINE CURRENT (required) | --threshold, -t (10.0%), --format, -f (console/json) |
benchmark parity |
- | --suite (pr-fast/nightly-core/nightly-sql/release-ga), --frameworks (truthound/gx/both), --backend, --output, --save-baseline, --compare-baseline, --strict |
Scaffolding Commands Summary¶
| Command | Arguments | Options |
|---|---|---|
new validator |
NAME (required) | --output, -o (.), --template, -t (basic/column/pattern/range/comparison/composite/full), --author, -a, --description, -d, --category, -c (custom), --tests/--no-tests (--tests), --docs/--no-docs (--no-docs), --severity, -s (MEDIUM), --pattern, --min, --max |
new reporter |
NAME (required) | --output, -o (.), --template, -t (basic/full), --author, -a, --description, -d, --tests/--no-tests (--tests), --docs/--no-docs (--no-docs), --extension, -e (.txt), --content-type (text/plain) |
new plugin |
NAME (required) | --output, -o (.), --type, -t (validator/reporter/hook/datasource/action/full), --author, -a, --description, -d, --tests/--no-tests (--tests), --min-version (defaults to current Truthound release), --python (3.10) |
new list |
- | --verbose, -v |
new templates |
SCAFFOLD_TYPE (required) | - |
Plugin Commands Summary¶
| Command | Arguments | Options |
|---|---|---|
plugins list |
- | --type, -t (validator/reporter/hook/datasource/action/custom), --state, -s (discovered/loading/loaded/active/inactive/error/unloading), --verbose, -v, --json |
plugins info |
NAME (required) | --json |
plugins load |
NAME (required) | --activate/--no-activate (--activate) |
plugins unload |
NAME (required) | - |
plugins enable |
NAME (required) | - |
plugins disable |
NAME (required) | - |
plugins create |
NAME (required) | --output, -o (.), --type, -t (validator/reporter/hook/custom), --author |
Global Options¶
These options are available for all commands:
| Option | Description |
|---|---|
--help |
Show help message and exit |
--version |
Show version and exit |
Command Groups¶
Core Commands¶
Essential data quality operations:
| Command | Description |
|---|---|
learn |
Learn schema from data |
check |
Validate data quality |
scan |
Scan for PII |
mask |
Mask sensitive data |
profile |
Generate data profile |
compare |
Detect data drift |
Profiler Commands¶
Advanced profiling and rule generation:
| Command | Description |
|---|---|
auto-profile |
Profile with auto-detection |
generate-suite |
Generate validation rules from profile |
quick-suite |
Profile and generate rules in one step |
list-formats |
List supported output formats |
list-presets |
List available presets |
list-categories |
List rule categories |
Checkpoint Commands¶
CI/CD pipeline integration:
| Command | Description |
|---|---|
checkpoint run |
Run validation pipeline |
checkpoint list |
List available checkpoints |
checkpoint validate |
Validate configuration |
checkpoint init |
Initialize sample config |
ML Commands¶
Machine learning-based detection:
| Command | Description |
|---|---|
ml anomaly |
Detect anomalies |
ml drift |
Detect data drift |
ml learn-rules |
Learn validation rules |
Lineage Commands Summary¶
| Command | Arguments | Options |
|---|---|---|
lineage show |
LINEAGE_FILE (required) | --node, -n, --direction, -d (upstream/downstream/both), --format, -f (console/json/dot) |
lineage impact |
LINEAGE_FILE NODE (required) | --max-depth (-1), --output, -o |
lineage visualize |
LINEAGE_FILE (required) | --output, -o (required), --renderer, -r (d3/cytoscape/graphviz/mermaid), --theme, -t (light/dark), --focus, -f |
Lineage Commands¶
Data lineage tracking:
| Command | Description |
|---|---|
lineage show |
Display lineage information |
lineage impact |
Analyze change impact |
lineage visualize |
Generate lineage visualization |
Docs Commands¶
Documentation generation:
| Command | Description |
|---|---|
docs generate |
Generate HTML/PDF report |
docs themes |
List available themes |
Data Docs Dashboard UI Command¶
Interactive data exploration:
| Command | Description |
|---|---|
dashboard |
Launch interactive dashboard |
Realtime Commands¶
Streaming validation:
| Command | Description |
|---|---|
realtime validate |
Validate streaming data |
realtime monitor |
Monitor validation metrics |
realtime checkpoint |
Manage validation checkpoints |
Benchmark Commands¶
Performance testing:
| Command | Description |
|---|---|
benchmark run |
Run performance benchmarks |
benchmark list |
List available benchmarks |
benchmark compare |
Compare benchmark results |
benchmark parity |
Run release-grade Truthound vs GX parity suites |
Scaffolding Commands¶
Code generation:
| Command | Description |
|---|---|
new validator |
Create custom validator |
new reporter |
Create custom reporter |
new plugin |
Create plugin package |
new list |
List scaffold types |
new templates |
List available templates |
Plugin Commands¶
Plugin management:
| Command | Description |
|---|---|
plugins list |
List discovered plugins |
plugins info |
Show plugin details |
plugins load |
Load a plugin |
plugins unload |
Unload a plugin |
plugins enable |
Enable a plugin |
plugins disable |
Disable a plugin |
plugin create |
Create plugin template |
Exit Codes¶
| Code | Meaning |
|---|---|
| 0 | Success |
| 1 | General error or validation failed (with --strict) |
| 2 | Usage error (invalid arguments) |
Quick Examples¶
# Learn schema from data
truthound learn data.csv -o schema.yaml
# Validate data quality
truthound check data.csv --strict
# Scan for PII
truthound scan customers.csv
# Mask sensitive data
truthound mask data.csv -o masked.csv --strategy hash
# Generate data profile
truthound profile data.csv --format json -o profile.json
# Compare datasets for drift
truthound compare baseline.csv current.csv --method psi