API Reference

Complete API documentation for Truthound.

Core Module

The main entry point for Truthound functionality.

import truthound as th

# Core functions
report = th.check("data.csv")      # Validate data
profile = th.profile("data.csv")   # Profile data
schema = th.learn("data.csv")      # Learn schema
pii = th.scan("data.csv")          # Scan for PII
masked = th.mask("data.csv")       # Mask data
drift = th.compare("old.csv", "new.csv")  # Detect drift between two datasets

Module Index

Core API

Module                    Description
truthound.api             Main API functions (check, scan, mask, profile)
truthound.schema          Schema learning and validation
truthound.drift           Data drift detection
truthound.report          Report classes and formatting

Validation

Module                    Description
truthound.validators      289 built-in validators
truthound.validators.sdk  Custom validator SDK

Profiling

Module                    Description
truthound.profiler        Data profiling and rule generation

Storage & Reporting

Module                    Description
truthound.stores          Result storage backends
truthound.datadocs        HTML reports and documentation

CI/CD

Module                    Description
truthound.checkpoint      Checkpoint and CI/CD integration

Advanced Features

Module                    Description
truthound.ml              Machine learning features
truthound.lineage         Data lineage tracking
truthound.realtime        Streaming validation

Extensions

Module                    Description
truthound.plugins         Plugin system

Quick Examples

Basic Validation

import truthound as th

# Check with default validators
report = th.check("data.csv")
print(f"Issues found: {report.issue_count}")

# Check with specific validators
report = th.check(
    "data.csv",
    validators=["null", "duplicate", "range"],
    min_severity="high"
)

# Check with schema
schema = th.learn("reference_data.csv")
report = th.check("new_data.csv", schema=schema)
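
To make the learn-then-check flow concrete, here is a minimal, library-free sketch of the idea: infer column types from reference rows, then flag rows in new data that deviate. The names `learn_schema` and `check_against_schema` and the dict-based schema are hypothetical illustrations, not Truthound's API or its internal representation.

```python
def learn_schema(rows):
    """Infer a column -> type-name mapping from reference rows (list of dicts)."""
    schema = {}
    for row in rows:
        for name, value in row.items():
            if value is not None:
                # Keep the first non-null type seen for each column
                schema.setdefault(name, type(value).__name__)
    return schema


def check_against_schema(rows, schema):
    """Return human-readable issues for rows that deviate from the schema."""
    issues = []
    for i, row in enumerate(rows):
        # Columns the schema expects but the row lacks
        for name in sorted(set(schema) - set(row)):
            issues.append(f"row {i}: missing column '{name}'")
        for name, value in row.items():
            expected = schema.get(name)
            if expected is None:
                issues.append(f"row {i}: unexpected column '{name}'")
            elif value is not None and type(value).__name__ != expected:
                actual = type(value).__name__
                issues.append(
                    f"row {i}: column '{name}' expected {expected}, got {actual}"
                )
    return issues


# Hypothetical in-memory data standing in for the reference and new CSV files
reference = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
new = [{"id": 3, "name": "c"}, {"id": "4", "name": "d"}, {"id": 5}]

schema = learn_schema(reference)
issues = check_against_schema(new, schema)
for issue in issues:
    print(issue)
```

A real schema would typically also carry constraints such as nullability, ranges, or allowed values; this sketch covers only column presence and type.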

Profiling

from truthound import DataProfiler, ProfilerConfig

# Configure profiler
config = ProfilerConfig(
    include_patterns=True,
    include_correlations=True,
    sample_size=10000
)

# Profile data
profiler = DataProfiler(config=config)
profile = profiler.profile("data.csv")

# Access profile information
print(f"Rows: {profile.row_count}")
print(f"Columns: {profile.column_count}")

for col in profile.columns:
    print(f"{col.name}: {col.inferred_type}, {col.null_ratio:.1%} nulls")
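
The loop above formats `null_ratio` with the `.1%` format specifier, which suggests a fraction between 0 and 1. As a rough illustration of what such a per-column figure represents, the following sketch computes it on plain Python lists. The `null_ratio` helper and the sample columns are hypothetical and do not reflect how Truthound computes its profile.

```python
def null_ratio(values):
    """Fraction of entries that are None; 0.0 for an empty column."""
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)


# Hypothetical columns standing in for profiled data
columns = {
    "age": [31, None, 45, 28],
    "email": ["a@x.io", "b@x.io", None, None],
}

for name, values in columns.items():
    # The .1% specifier multiplies by 100 and keeps one decimal place
    print(f"{name}: {null_ratio(values):.1%} nulls")
```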

Custom Validators

from truthound.validators.sdk import validator, ValidationResult

@validator("my_validator", category="custom")
def my_custom_validator(df, column: str, threshold: float = 0.1):
    """Custom validation logic."""
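    values = df[column]
    total = len(values)
    # Count negative values, which this example treats as invalid
    invalid_count = int((values < 0).sum())
    # Guard against division by zero on an empty column
    invalid_ratio = invalid_count / total if total else 0.0

    return ValidationResult(
        passed=invalid_ratio <= threshold,
        message=f"Found {invalid_ratio:.1%} invalid values",
        details={"invalid_count": invalid_count}
    )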
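
The threshold logic in this validator can be exercised without Truthound installed. The sketch below mirrors it on plain Python lists; `SketchResult` is a hypothetical stand-in for the SDK's `ValidationResult`, and `check_negative_ratio` is an illustrative name, so neither should be read as part of the library.

```python
from dataclasses import dataclass, field


@dataclass
class SketchResult:
    """Hypothetical stand-in for the SDK's ValidationResult."""
    passed: bool
    message: str
    details: dict = field(default_factory=dict)


def check_negative_ratio(values, threshold=0.1):
    """Pass when the share of negative values is at most `threshold`."""
    total = len(values)
    invalid_count = sum(1 for v in values if v < 0)
    # An empty input has no invalid values, so treat its ratio as 0.0
    invalid_ratio = invalid_count / total if total else 0.0
    return SketchResult(
        passed=invalid_ratio <= threshold,
        message=f"Found {invalid_ratio:.1%} invalid values",
        details={"invalid_count": invalid_count},
    )


# One negative value out of ten sits exactly at the default threshold
ok = check_negative_ratio([1, 2, 3, 4, 5, 6, 7, 8, 9, -1])
# Two negative values out of four exceed it
bad = check_negative_ratio([1, -2, -3, 4])

print(ok.passed, ok.message)
print(bad.passed, bad.message)
```

Because the comparison is `<=`, a ratio exactly equal to the threshold passes; use `<` instead if the threshold should be exclusive.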