Configuration Guide¶
This guide covers configuration management with Truthound's Python API. It includes practical workflows for environment setup, secrets management, logging, metrics, and resilience patterns.
Quick Start¶
from truthound.infrastructure.config import load_config
# Load configuration (auto-detects environment)
config = load_config()
# Access values with type safety
db_host = config.get("database.host", default="localhost")
db_port = config.get_int("database.port", default=5432)
debug = config.get_bool("debug", default=False)
Common Workflows¶
Workflow 1: Production Environment Setup¶
from truthound.infrastructure.config import load_config, Environment
# Production configuration with Vault secrets
config = load_config(
environment=Environment.PRODUCTION,
config_path="config/",
env_prefix="TRUTHOUND",
use_vault=True,
vault_url="http://vault:8200",
vault_path="truthound/production",
validate=True,
)
# Access secrets (fetched from Vault)
db_password = config.get("database.password")
api_key = config.get("external_api.key")
Workflow 2: Structured Logging Setup¶
from truthound.infrastructure.logging import setup_logging, LogConfig
# Configure structured JSON logging
log_config = LogConfig(
level="INFO",
format="json",
correlation_id_header="X-Request-ID",
include_timestamp=True,
include_hostname=True,
)
logger = setup_logging(log_config)
# Log with correlation ID
logger.info(
"Validation completed",
extra={
"data_asset": "customers.csv",
"issue_count": 5,
"duration_ms": 1234,
}
)
Workflow 3: Prometheus Metrics¶
from truthound.infrastructure.metrics import MetricsCollector, start_http_server
# Initialize metrics
metrics = MetricsCollector()
# Start Prometheus endpoint
start_http_server(port=9090)
# Record validation metrics
with metrics.validation_duration.time():
report = th.check("data.csv")
metrics.validation_total.inc(labels={"status": "success"})
metrics.issues_found.inc(len(report.issues))
Workflow 4: Resilience Patterns¶
from truthound.common.resilience import ResilienceBuilder
import truthound as th
# Build resilience policy
resilience = (
ResilienceBuilder()
.with_retry(max_attempts=3, backoff_factor=2.0)
.with_circuit_breaker(failure_threshold=5, recovery_timeout=30)
.with_timeout(seconds=60)
.with_bulkhead(max_concurrent=10)
.build()
)
# Execute with resilience
from truthound.datasources import PostgreSQLDataSource
source = PostgreSQLDataSource(table="users", host="db.example.com")
result = resilience.execute(lambda: th.check(source=source))
Workflow 5: Audit Logging for Compliance¶
from truthound.infrastructure.audit import AuditLogger, AuditConfig
# Configure audit logging
audit_config = AuditConfig(
enabled=True,
storage="elasticsearch",
elasticsearch_url="http://elasticsearch:9200",
index_prefix="truthound-audit",
retention_days=365, # SOC 2 compliance
)
audit = AuditLogger(audit_config)
# Log validation operations
audit.log(
operation="validation",
actor="system",
resource="customers.csv",
result="success",
metadata={"issue_count": 0},
)
Full Documentation¶
Truthound provides an enterprise-grade configuration management system with support for multiple environments, configuration sources, validation, and hot reloading.
Architecture¶
ConfigSource[] (ordered by priority)
|
+---> EnvConfigSource (environment variables)
+---> FileConfigSource (YAML, JSON, TOML)
+---> VaultConfigSource (HashiCorp Vault)
+---> AwsSecretsSource (AWS Secrets Manager)
|
v
ConfigManager
|
+---> Merge & Validate
|
v
ConfigProfile (typed access)
Configuration Priority¶
Configuration sources are processed in priority order. Higher priority sources override lower priority ones:
| Source | Priority | Description |
|---|---|---|
Base file (base.yaml) |
10 | Common configuration |
Environment file (production.yaml) |
20 | Environment-specific |
Local file (local.yaml) |
30 | Local overrides (git-ignored) |
| Environment variables | 100 | Runtime overrides |
| HashiCorp Vault | 200 | Secrets (highest priority) |
| AWS Secrets Manager | 200 | Secrets (highest priority) |
Quick Start¶
Basic Configuration Loading¶
from truthound.infrastructure.config import load_config, get_config
# Load configuration (uses defaults + environment variables)
config = load_config()
# Access configuration values
db_host = config.get("database.host", default="localhost")
db_port = config.get_int("database.port", default=5432)
debug = config.get_bool("debug", default=False)
Full Configuration with All Sources¶
from truthound.infrastructure.config import (
load_config,
Environment,
)
config = load_config(
environment=Environment.PRODUCTION,
config_path="config/",
env_prefix="TRUTHOUND",
use_vault=True,
vault_url="http://vault:8200",
vault_path="truthound/production",
use_aws_secrets=True,
aws_secret_name="truthound/production",
auto_reload=True,
validate=True,
)
Environments¶
Truthound supports five environment types:
from truthound.infrastructure.config import Environment
class Environment(Enum):
DEVELOPMENT = "development"
TESTING = "testing"
STAGING = "staging"
PRODUCTION = "production"
LOCAL = "local"
Environment Detection¶
The environment is detected from environment variables in this order:
TRUTHOUND_ENVENVIRONMENTENV
If none are set, defaults to DEVELOPMENT.
Environment Aliases¶
| Input | Environment |
|---|---|
dev, development |
DEVELOPMENT |
test, testing |
TESTING |
stage, staging |
STAGING |
prod, production |
PRODUCTION |
local |
LOCAL |
Environment Properties¶
config = load_config()
# Check environment type
if config.is_production: # True for PRODUCTION and STAGING
# Production-specific logic
pass
if config.is_development: # True for DEVELOPMENT, LOCAL, and TESTING
# Development-specific logic
pass
Typed Configuration Access¶
ConfigProfile provides type-safe access to configuration values:
config = load_config()
# String (default)
host = config.get("database.host", default="localhost")
host = config.get_str("database.host", default="localhost")
# Integer
port = config.get_int("database.port", default=5432)
# Float
timeout = config.get_float("request.timeout", default=30.0)
# Boolean
debug = config.get_bool("debug", default=False)
# List
hosts = config.get_list("database.hosts", default=[])
# Dictionary
options = config.get_dict("database.options", default={})
# Required values (raises ConfigError if missing)
secret = config.get("api.secret", required=True)
Configuration Validation¶
Define a schema to validate configuration:
from truthound.infrastructure.config import (
ConfigSchema,
ConfigField,
ConfigValidator,
)
# Define schema
schema = ConfigSchema()
schema.add_field("database.host", str, required=True)
schema.add_field(
"database.port",
int,
default=5432,
min_value=1,
max_value=65535,
)
schema.add_field(
"logging.level",
str,
choices=["DEBUG", "INFO", "WARNING", "ERROR"],
)
# Validate
validator = ConfigValidator(schema)
errors = validator.validate(config_dict)
if errors:
raise ConfigValidationError(errors)
ConfigField Options¶
| Option | Type | Description |
|---|---|---|
name |
str | Configuration key (dot-separated) |
type |
type | Expected Python type |
required |
bool | Whether the field is required |
default |
Any | Default value if not found |
min_value |
float | Minimum value (for numbers) |
max_value |
float | Maximum value (for numbers) |
pattern |
str | Regex pattern (for strings) |
choices |
list | Allowed values |
description |
str | Field description |
Default Schema¶
Truthound includes a default schema with common configuration fields:
Default Fields¶
| Field | Type | Default | Constraints |
|---|---|---|---|
logging.level |
str | "INFO" |
TRACE, DEBUG, INFO, WARNING, ERROR, CRITICAL |
logging.format |
str | "console" |
console, json, logfmt |
metrics.enabled |
bool | True |
- |
metrics.port |
int | 9090 |
1-65535 |
database.host |
str | "localhost" |
- |
database.port |
int | 5432 |
1-65535 |
database.pool_size |
int | 10 |
1-100 |
validation.timeout |
int | 300 |
>= 1 |
validation.max_workers |
int | 4 |
1-32 |
Hot Reload¶
Configuration supports automatic reloading when sources change:
from truthound.infrastructure.config import load_config, reload_config
# Enable auto-reload
config = load_config(auto_reload=True)
# Manual reload
config = reload_config()
Reload Callbacks¶
Register callbacks to be notified when configuration changes:
from truthound.infrastructure.config import ConfigManager
manager = ConfigManager(auto_reload=True)
def on_config_change(new_config):
print("Configuration changed!")
# Re-initialize services with new config
manager.on_reload(on_config_change)
config = manager.load()
File Structure¶
Recommended configuration file structure:
config/
├── base.yaml # Common configuration (priority 10)
├── development.yaml # Development-specific (priority 20)
├── staging.yaml # Staging-specific (priority 20)
├── production.yaml # Production-specific (priority 20)
└── local.yaml # Local overrides, git-ignored (priority 30)
Example: base.yaml¶
logging:
level: INFO
format: console
database:
host: localhost
port: 5432
pool_size: 10
validation:
timeout: 300
max_workers: 4
metrics:
enabled: true
port: 9090
Example: production.yaml¶
logging:
level: WARNING
format: json
database:
pool_size: 50
metrics:
enable_http: true
push_gateway_url: http://pushgateway:9091
Configuration Guide Pages¶
This guide is split into the following sections for detailed documentation:
- Configuration Sources - File, Environment Variables, Vault, AWS Secrets Manager
- Resilience Patterns - Circuit Breaker, Retry, Bulkhead, Rate Limiter
- Logging Configuration - Enterprise logging with multiple sinks
- Metrics Configuration - Prometheus metrics
- Audit Logging - Compliance and audit trails
- Encryption - Data encryption and Cloud KMS
- Environment Variables - Complete reference
See Also¶
- CI/CD Integration - Checkpoint configuration in CI/CD
- Data Sources - DataSource configuration details
- Storage Backends - Store configuration options
- Data Profiling - Profiler configuration