CLI Reference
Complete reference for the Truthound command-line interface.
Installation
After installation, the truthound command is available globally.
CLI vs Python API
The CLI only supports file-based inputs. For SQL databases (PostgreSQL, MySQL, SQLite), Spark DataFrames, or Cloud Data Warehouses (BigQuery, Snowflake, Redshift, Databricks), use the Python API with the source= parameter.
| Format |
Extension |
Description |
| CSV |
.csv |
Comma-separated values |
| JSON |
.json |
Standard JSON array |
| Parquet |
.parquet |
Columnar storage format |
| NDJSON |
.ndjson |
Newline-delimited JSON |
| JSONL |
.jsonl |
JSON Lines (same as NDJSON) |
Quick Reference
Core Commands Summary
| Command |
Arguments |
Options |
learn |
FILE (required) |
--output, -o (schema.yaml), --no-constraints |
check |
FILE (required) |
--validators, -v, --min-severity, -s (low/medium/high/critical), --schema, --auto-schema, --format, -f (console/json/html), --output, -o, --strict |
scan |
FILE (required) |
--format, -f (console/json/html), --output, -o |
mask |
FILE (required) |
--output, -o (required), --columns, -c, --strategy, -s (redact/hash/fake), --strict |
profile |
FILE (required) |
--format, -f (console/json), --output, -o |
compare |
BASELINE CURRENT (required) |
--columns, -c, --method, -m (auto/ks/psi/chi2/js), --threshold, -t, --format, -f (console/json), --output, -o, --strict |
Profiler Commands Summary
| Command |
Arguments |
Options |
auto-profile |
FILE (required) |
--output, -o, --format, -f (console/json/yaml), --patterns/--no-patterns, --correlations/--no-correlations, --sample, -s, --top-n (10) |
generate-suite |
PROFILE_FILE (required) |
--output, -o, --format, -f (yaml/json/python/toml/checkpoint), --strictness, -s (loose/medium/strict), --include, -i, --exclude, -e, --min-confidence (low/medium/high), --name, -n, --preset, -p, --config, -c, --group-by-category, --code-style (functional/class_based/declarative) |
quick-suite |
FILE (required) |
--output, -o, --format, -f (yaml/json/python/toml/checkpoint), --strictness, -s (loose/medium/strict), --include, -i, --exclude, -e, --min-confidence, --name, -n, --preset, -p, --sample-size |
list-formats |
- |
- |
list-presets |
- |
- |
list-categories |
- |
- |
Checkpoint Commands Summary
checkpoint vs realtime checkpoint
truthound checkpoint is for CI/CD pipelines (YAML configuration file based).
For streaming validation state management, use truthound realtime checkpoint.
| Command |
Arguments |
Options |
checkpoint run |
NAME (required) |
--config, -c (truthound.yaml), --data, -d, --validators, -v, --output, -o, --format, -f (console/json), --strict, --store, --slack, --webhook, --github-summary |
checkpoint list |
- |
--config, -c, --format, -f (console/json) |
checkpoint validate |
CONFIG_FILE (required) |
--strict, -s |
checkpoint init |
- |
--output, -o (truthound.yaml), --format, -f (yaml/json) |
ML Commands Summary
| Command |
Arguments |
Options |
ml anomaly |
FILE (required) |
--method, -m (zscore/iqr/mad/isolation_forest), --contamination, -c (0.1), --columns, --output, -o, --format, -f (console/json) |
ml drift |
BASELINE CURRENT (required) |
--method, -m (distribution/feature/multivariate), --threshold, -t (0.1), --columns, --output, -o |
ml learn-rules |
FILE (required) |
--output, -o (learned_rules.json), --strictness, -s (loose/medium/strict), --min-confidence (0.9), --max-rules (100) |
Docs Commands Summary
| Command |
Arguments |
Options |
docs generate |
PROFILE_FILE (required) |
--output, -o, --title, -t ("Data Profile Report"), --subtitle, -s, --theme (light/dark/professional/minimal/modern), --format, -f (html/pdf) |
docs themes |
- |
- |
Dashboard Command Summary
| Command |
Arguments |
Options |
dashboard |
- |
--profile, -p, --port (8080), --host (localhost), --title, -t ("Truthound Dashboard"), --debug |
Realtime Commands Summary
realtime checkpoint vs checkpoint
truthound realtime checkpoint is for streaming validation state management (--dir option based).
For CI/CD pipelines, use truthound checkpoint.
| Command |
Arguments |
Options |
realtime validate |
SOURCE (required) |
--validators, -v, --batch-size, -b (1000), --max-batches (10, 0=unlimited), --output, -o |
realtime monitor |
SOURCE (required) |
--interval, -i (5), --duration, -d (60, 0=unlimited) |
realtime checkpoint list |
- |
--dir, -d (./checkpoints), --format, -f (console/json) |
realtime checkpoint show |
CHECKPOINT_ID (required) |
--dir, -d |
realtime checkpoint delete |
CHECKPOINT_ID (required) |
--dir, -d, --force, -f |
Benchmark Commands Summary
| Command |
Arguments |
Options |
benchmark run |
BENCHMARK (optional) |
--suite, -s (quick/ci/full/profiling/validation), --size (tiny/small/medium/large/xlarge), --rows, -r, --iterations, -i (5), --warmup, -w (2), --output, -o, --format, -f (console/json/html), --save-baseline, --compare-baseline, --verbose, -v |
benchmark list |
- |
--format, -f (console/json) |
benchmark compare |
BASELINE CURRENT (required) |
--threshold, -t (10.0%), --format, -f (console/json) |
Scaffolding Commands Summary
| Command |
Arguments |
Options |
new validator |
NAME (required) |
--output, -o (.), --template, -t (basic/column/pattern/range/comparison/composite/full), --author, -a, --description, -d, --category, -c (custom), --tests/--no-tests (--tests), --docs/--no-docs (--no-docs), --severity, -s (MEDIUM), --pattern, --min, --max |
new reporter |
NAME (required) |
--output, -o (.), --template, -t (basic/full), --author, -a, --description, -d, --tests/--no-tests (--tests), --docs/--no-docs (--no-docs), --extension, -e (.txt), --content-type (text/plain) |
new plugin |
NAME (required) |
--output, -o (.), --type, -t (validator/reporter/hook/datasource/action/full), --author, -a, --description, -d, --tests/--no-tests (--tests), --min-version (0.1.0), --python (3.10) |
new list |
- |
--verbose, -v |
new templates |
SCAFFOLD_TYPE (required) |
- |
Plugin Commands Summary
| Command |
Arguments |
Options |
plugin list |
- |
--type, -t (validator/reporter/hook/datasource/action/custom), --state, -s (discovered/loading/loaded/active/inactive/error/unloading), --verbose, -v, --json |
plugin info |
NAME (required) |
--json |
plugin load |
NAME (required) |
--activate/--no-activate (--activate) |
plugin unload |
NAME (required) |
- |
plugin enable |
NAME (required) |
- |
plugin disable |
NAME (required) |
- |
plugin create |
NAME (required) |
--output, -o (.), --type, -t (validator/reporter/hook/custom), --author |
Global Options
These options are available for all commands:
| Option |
Description |
--help |
Show help message and exit |
--version |
Show version and exit |
Command Groups
Essential data quality operations:
| Command |
Description |
learn |
Learn schema from data |
check |
Validate data quality |
scan |
Scan for PII |
mask |
Mask sensitive data |
profile |
Generate data profile |
compare |
Detect data drift |
Advanced profiling and rule generation:
| Command |
Description |
auto-profile |
Profile with auto-detection |
generate-suite |
Generate validation rules from profile |
quick-suite |
Profile and generate rules in one step |
list-formats |
List supported output formats |
list-presets |
List available presets |
list-categories |
List rule categories |
CI/CD pipeline integration:
Machine learning-based detection:
Lineage Commands Summary
| Command |
Arguments |
Options |
lineage show |
LINEAGE_FILE (required) |
--node, -n, --direction, -d (upstream/downstream/both), --format, -f (console/json/dot) |
lineage impact |
LINEAGE_FILE NODE (required) |
--max-depth (-1), --output, -o |
lineage visualize |
LINEAGE_FILE (required) |
--output, -o (required), --renderer, -r (d3/cytoscape/graphviz/mermaid), --theme, -t (light/dark), --focus, -f |
Data lineage tracking:
Documentation generation:
Interactive data exploration:
| Command |
Description |
dashboard |
Launch interactive dashboard |
Streaming validation:
Performance testing:
Code generation:
Plugin management:
Exit Codes
| Code |
Meaning |
| 0 |
Success |
| 1 |
General error or validation failed (with --strict) |
| 2 |
Usage error (invalid arguments) |
Quick Examples
# Learn schema from data
truthound learn data.csv -o schema.yaml
# Validate data quality
truthound check data.csv --strict
# Scan for PII
truthound scan customers.csv
# Mask sensitive data
truthound mask data.csv -o masked.csv --strategy hash
# Generate data profile
truthound profile data.csv --format json -o profile.json
# Compare datasets for drift
truthound compare baseline.csv current.csv --method psi