Reporter SDK¶
Truthound provides a comprehensive SDK for custom reporter development.
Overview¶
The Reporter SDK provides the following features:
- Mixins: Reusable common functionality (formatting, aggregation, filtering, etc.)
- Builder: Create reporters with decorators and builder patterns
- Templates: Pre-defined reporter templates (CSV, YAML, JUnit, etc.)
- Schema: Output format validation
- Testing: Testing utilities and mock data generation
Quick Start¶
Create Simple Reporter with Decorator¶
from truthound.reporters.sdk import create_reporter
@create_reporter("my_format", extension=".myf")
def render_my_format(result, config):
return f"Status: {result.status.value}"
# Usage
from truthound.reporters import get_reporter
reporter = get_reporter("my_format")
output = reporter.render(validation_result)
Full Reporter with Mixins¶
from truthound.reporters.sdk import (
FormattingMixin,
AggregationMixin,
FilteringMixin,
)
from truthound.reporters.base import ValidationReporter, ReporterConfig
class MyReporterConfig(ReporterConfig):
custom_option: str = "default"
class MyReporter(FormattingMixin, AggregationMixin, FilteringMixin, ValidationReporter[MyReporterConfig]):
name = "my_format"
file_extension = ".myf"
@classmethod
def _default_config(cls):
return MyReporterConfig()
def render(self, data):
# Use mixin methods
issues = self.filter_by_severity(data, min_severity="medium")
grouped = self.group_by_column(issues)
return self.format_as_table(grouped)
Mixins¶
The SDK provides 6 mixins.
FormattingMixin¶
Output formatting utilities:
from truthound.reporters.sdk import FormattingMixin
class MyReporter(FormattingMixin, ValidationReporter):
def render(self, data):
# Table formatting (ascii, markdown, grid, simple styles)
rows = [{"name": r.column, "message": r.message} for r in data.results]
table = self.format_as_table(rows, style="markdown")
# Number formatting
rate = self.format_percentage(data.statistics.pass_rate)
# Date formatting
date = self.format_datetime(data.run_time)
# Byte size formatting
size = self.format_bytes(1024000) # "1000.0 KB"
return f"{table}\nPass Rate: {rate}"
Key Methods:
| Method | Description |
|---|---|
format_as_table(rows, columns, style) |
Format data as table (style: ascii/markdown/grid/simple) |
format_percentage(value, precision) |
Format percentage (e.g., "85.5%") |
format_number(value, precision) |
Format number (thousands separator) |
format_datetime(dt, format) |
Format date/time |
format_duration(seconds) |
Format execution time (e.g., "2h 30m 15s") |
format_bytes(size) |
Format byte size (e.g., "1.5 MB") |
format_relative_time(dt) |
Format relative time (e.g., "5 minutes ago") |
truncate(text, max_length, suffix) |
Limit text length |
indent(text, prefix) |
Indent text |
wrap(text, width) |
Wrap text lines |
AggregationMixin¶
Data aggregation utilities:
from truthound.reporters.sdk import AggregationMixin
class MyReporter(AggregationMixin, ValidationReporter):
def render(self, data):
# Group by column
by_column = self.group_by_column(data.results)
# Group by severity
by_severity = self.group_by_severity(data.results)
# Group by validator
by_validator = self.group_by_validator(data.results)
# Calculate statistics
stats = self.get_summary_stats(data)
return self.format_groups(by_severity)
Key Methods:
| Method | Description |
|---|---|
group_by_column(results) |
Group results by column |
group_by_severity(results) |
Group results by severity |
group_by_validator(results) |
Group results by validator |
group_by(results, key) |
Group by custom key function |
get_summary_stats(result) |
Calculate statistics (pass_rate, counts, etc.) |
count_by_severity(results) |
Count by severity |
count_by_column(results) |
Count by column |
FilteringMixin¶
Data filtering utilities:
from truthound.reporters.sdk import FilteringMixin
class MyReporter(FilteringMixin, ValidationReporter):
def render(self, data):
# Filter by severity
critical = self.filter_by_severity(data.results, min_severity="critical")
# Failed items only
failed = self.filter_failed(data.results)
# Specific columns only
email_issues = self.filter_by_column(data.results, include_columns=["email"])
# Specific validators only
null_issues = self.filter_by_validator(data.results, include_validators=["NullValidator"])
# Sort by severity
sorted_results = self.sort_by_severity(failed)
return self.format_issues(sorted_results)
Key Methods:
| Method | Description |
|---|---|
filter_by_severity(results, min_severity, max_severity) |
Filter by severity range |
filter_failed(results) |
Filter failed results only |
filter_passed(results) |
Filter passed results only |
filter_by_column(results, include_columns, exclude_columns) |
Filter by specific columns |
filter_by_validator(results, include_validators, exclude_validators) |
Filter by specific validators |
sort_by_severity(results, ascending) |
Sort by severity |
sort_by_column(results, ascending) |
Sort by column name |
limit(results, count, offset) |
Limit result count |
SerializationMixin¶
Serialization utilities:
from truthound.reporters.sdk import SerializationMixin
class MyReporter(SerializationMixin, ValidationReporter):
def render(self, data):
# To JSON string
as_json = self.to_json(data, indent=2)
# To CSV string
rows = [{"name": "col1", "count": 10}]
as_csv = self.to_csv(rows, columns=["name", "count"])
# Create XML element
xml_elem = self.to_xml_element(
"issue",
value="message",
attributes={"severity": "high"}
)
return as_json
Key Methods:
| Method | Description |
|---|---|
to_json(data, indent, sort_keys) |
Serialize to JSON string |
to_csv(rows, columns, delimiter) |
Serialize to CSV string |
to_xml_element(tag, value, attributes) |
Create XML element string |
TemplatingMixin¶
Template rendering utilities:
from truthound.reporters.sdk import TemplatingMixin
class MyReporter(TemplatingMixin, ValidationReporter):
template_string = """
Report: {{ data.data_asset }}
Status: {{ data.status.value }}
{% for issue in data.issues %}
- {{ issue.message }}
{% endfor %}
"""
def render(self, data):
return self.render_template(self.template_string, data=data)
Key Methods:
| Method | Description |
|---|---|
render_template(template, context) |
Render Jinja2 template |
render_template_file(path, context) |
Render file-based template |
interpolate(template, context) |
Simple string interpolation (no Jinja2 required) |
StreamingMixin¶
Streaming output utilities:
from truthound.reporters.sdk import StreamingMixin
class MyReporter(StreamingMixin, ValidationReporter):
def render(self, data):
# Generate in chunks
for chunk in self.stream_results(data.results, chunk_size=100):
yield self.format_chunk(chunk)
def render_lines(self, data):
# Line-by-line streaming
formatter = lambda r: f"{r.validator_name}: {r.message}"
return self.render_streaming(data.results, formatter)
Key Methods:
| Method | Description |
|---|---|
stream_results(results, chunk_size) |
Chunk iterator |
stream_lines(results, formatter) |
Line iterator |
render_streaming(results, formatter) |
Combine streaming results to string |
Builder¶
@create_reporter Decorator¶
Convert function to reporter:
from truthound.reporters.sdk import create_reporter
@create_reporter(
name="simple",
extension=".txt",
content_type="text/plain"
)
def render_simple(result, config):
"""Simple text reporter."""
lines = [
f"Data Asset: {result.data_asset}",
f"Status: {result.status.value}",
f"Pass Rate: {result.pass_rate * 100:.1f}%",
]
return "\n".join(lines)
# Automatically registered
from truthound.reporters import get_reporter
reporter = get_reporter("simple")
@create_validation_reporter Decorator¶
Include full ValidationReporter functionality:
from truthound.reporters.sdk import create_validation_reporter
from truthound.reporters.base import ReporterConfig
class MyConfig(ReporterConfig):
prefix: str = ">"
include_timestamp: bool = True
@create_validation_reporter(
name="prefixed",
extension=".txt",
config_class=MyConfig
)
def render_prefixed(result, config):
lines = []
if config.include_timestamp:
lines.append(f"{config.prefix} Time: {result.run_time}")
lines.append(f"{config.prefix} Status: {result.status.value}")
return "\n".join(lines)
ReporterBuilder¶
Fluent builder pattern:
from truthound.reporters.sdk import ReporterBuilder
# ReporterBuilder takes name in constructor
reporter_class = (
ReporterBuilder("custom")
.with_extension(".custom")
.with_content_type("text/plain")
.with_mixin(FormattingMixin)
.with_mixin(FilteringMixin)
.with_renderer(lambda self, data: f"Status: {data.status.value}")
.build()
)
# Create instance
instance = reporter_class()
output = instance.render(validation_result)
Builder Methods:
| Method | Description |
|---|---|
ReporterBuilder(name) |
Create builder with reporter name |
with_extension(ext) |
Set file extension |
with_content_type(type) |
Set MIME type |
with_mixin(mixin_class) |
Add mixin |
with_mixins(*mixins) |
Add multiple mixins |
with_config_class(cls) |
Specify config class |
with_renderer(func) |
Specify render function (takes self, data as arguments) |
with_post_processor(func) |
Add post-processor function |
with_attribute(name, value) |
Add class attribute |
register_as(name) |
Specify factory registration name |
build() |
Create reporter class |
Templates¶
The SDK provides pre-defined reporter templates.
CSVReporter¶
from truthound.reporters.sdk import CSVReporter
reporter = CSVReporter(
delimiter=",",
include_header=True,
include_passed=False,
quoting="minimal" # minimal, all, none, nonnumeric
)
csv_output = reporter.render(result)
CSVReporterConfig Options:
| Option | Type | Default | Description |
|---|---|---|---|
delimiter |
str |
"," |
Field delimiter |
include_header |
bool |
True |
Include header row |
include_passed |
bool |
False |
Include passed items |
quoting |
str |
"minimal" |
Quoting style |
columns |
list[str] |
None |
Columns to include (None=all) |
YAMLReporter¶
from truthound.reporters.sdk import YAMLReporter
reporter = YAMLReporter(
default_flow_style=False,
indent=2,
include_passed=False,
sort_keys=False
)
yaml_output = reporter.render(result)
YAMLReporterConfig Options:
| Option | Type | Default | Description |
|---|---|---|---|
default_flow_style |
bool |
False |
Use flow style |
indent |
int |
2 |
Indentation size |
include_passed |
bool |
False |
Include passed items |
sort_keys |
bool |
False |
Sort keys |
JUnitXMLReporter¶
JUnit XML format for CI/CD integration:
from truthound.reporters.sdk import JUnitXMLReporter
reporter = JUnitXMLReporter(
testsuite_name="Truthound Validation",
include_stdout=True,
include_properties=True
)
xml_output = reporter.render(result)
JUnitXMLReporterConfig Options:
| Option | Type | Default | Description |
|---|---|---|---|
testsuite_name |
str |
"Truthound Validation" |
Test suite name |
include_stdout |
bool |
True |
Include system-out |
include_properties |
bool |
True |
Include properties |
include_passed |
bool |
False |
Include passed tests |
NDJSONReporter¶
Newline Delimited JSON (for log collection system integration):
from truthound.reporters.sdk import NDJSONReporter
reporter = NDJSONReporter(
include_metadata=True,
include_passed=False,
compact=True
)
ndjson_output = reporter.render(result)
NDJSONReporterConfig Options:
| Option | Type | Default | Description |
|---|---|---|---|
include_metadata |
bool |
True |
Include metadata line |
include_passed |
bool |
False |
Include passed items |
compact |
bool |
True |
Compact JSON |
TableReporter¶
Text table output:
from truthound.reporters.sdk import TableReporter
reporter = TableReporter(
style="grid", # ascii, markdown, grid, simple
max_width=120,
include_passed=False
)
table_output = reporter.render(result)
TableReporterConfig Options:
| Option | Type | Default | Description |
|---|---|---|---|
style |
str |
"ascii" |
Table style |
max_width |
int |
120 |
Maximum width |
include_passed |
bool |
False |
Include passed items |
show_index |
bool |
False |
Show index |
Schema Validation¶
A schema system for output format validation.
Basic Usage¶
from truthound.reporters.sdk import validate_output, JSONSchema
# Define schema
schema = JSONSchema(
required_fields=["status", "data_asset", "issues"],
field_types={
"status": str,
"data_asset": str,
"issues": list,
"pass_rate": float,
}
)
# Validate output
result = validate_output(json_output, schema)
if not result.is_valid:
for error in result.errors:
print(f"Error: {error.message}")
Schema Types¶
JSONSchema¶
from truthound.reporters.sdk import JSONSchema
schema = JSONSchema(
required_fields=["status", "data_asset"],
field_types={"status": str, "issues": list},
allow_extra_fields=True,
max_depth=10
)
Options:
| Option | Type | Description |
|---|---|---|
required_fields |
list[str] |
Required field list |
field_types |
dict[str, type] |
Field types |
allow_extra_fields |
bool |
Allow extra fields |
max_depth |
int |
Maximum nesting depth |
XMLSchema¶
from truthound.reporters.sdk import XMLSchema
schema = XMLSchema(
root_element="testsuites",
required_elements=["testsuite", "testcase"],
required_attributes={"testsuite": ["name", "tests"]},
validate_dtd=False
)
Options:
| Option | Type | Description |
|---|---|---|
root_element |
str |
Root element name |
required_elements |
list[str] |
Required element list |
required_attributes |
dict[str, list[str]] |
Required attributes per element |
validate_dtd |
bool |
Validate DTD |
CSVSchema¶
from truthound.reporters.sdk import CSVSchema
schema = CSVSchema(
required_columns=["validator", "column", "severity", "message"],
column_types={"severity": str, "count": int},
allow_extra_columns=True,
min_rows=0
)
Options:
| Option | Type | Description |
|---|---|---|
required_columns |
list[str] |
Required column list |
column_types |
dict[str, type] |
Column types |
allow_extra_columns |
bool |
Allow extra columns |
min_rows |
int |
Minimum row count |
TextSchema¶
from truthound.reporters.sdk import TextSchema
schema = TextSchema(
required_patterns=[r"Status:", r"Pass Rate:"],
forbidden_patterns=[r"ERROR", r"EXCEPTION"],
max_length=100000,
encoding="utf-8"
)
Options:
| Option | Type | Description |
|---|---|---|
required_patterns |
list[str] |
Required patterns (regex) |
forbidden_patterns |
list[str] |
Forbidden patterns (regex) |
max_length |
int |
Maximum character count |
encoding |
str |
Encoding |
Schema Registration and Management¶
from truthound.reporters.sdk import (
register_schema,
get_schema,
unregister_schema,
validate_reporter_output,
)
# Register schema
register_schema("my_format", schema)
# Get schema
my_schema = get_schema("my_format")
# Remove schema
unregister_schema("my_format")
# Auto-validate reporter output
is_valid = validate_reporter_output("json", json_output)
Schema Inference and Merging¶
from truthound.reporters.sdk import infer_schema, merge_schemas
# Infer schema from sample data
inferred = infer_schema(sample_output, format="json")
# Merge multiple schemas
merged = merge_schemas([schema1, schema2], strategy="union")
Testing Utilities¶
Utilities for reporter testing.
Mock Data Generation¶
create_mock_result¶
from truthound.reporters.sdk import create_mock_result
# Default mock result
result = create_mock_result()
# Custom settings
result = create_mock_result(
data_asset="test_data.csv",
status="failure",
pass_rate=0.75,
issue_count=5,
severity_distribution={"critical": 2, "high": 2, "medium": 1}
)
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
data_asset |
str |
"test_data.csv" |
Data asset name |
status |
str |
"failure" |
Validation status |
pass_rate |
float |
0.8 |
Pass rate |
issue_count |
int |
3 |
Issue count |
severity_distribution |
dict |
None |
Issues per severity |
MockResultBuilder¶
Fluent builder pattern:
from truthound.reporters.sdk import MockResultBuilder
result = (
MockResultBuilder()
.with_data_asset("orders.parquet")
.with_status("failure")
.with_pass_rate(0.65)
.add_issue(
validator="NullValidator",
column="email",
severity="critical",
message="Found 10 null values"
)
.add_issue(
validator="RangeValidator",
column="age",
severity="high",
message="5 values out of range"
)
.with_run_time("2024-01-15T10:30:45")
.build()
)
Builder Methods:
| Method | Description |
|---|---|
with_data_asset(name) |
Set data asset name |
with_status(status) |
Set validation status |
with_pass_rate(rate) |
Set pass rate |
with_run_time(time) |
Set run time |
add_issue(...) |
Add issue |
add_issues(issues) |
Add multiple issues |
build() |
Create MockValidationResult |
create_mock_results¶
Generate multiple results:
from truthound.reporters.sdk import create_mock_results
# Generate 5 random results
results = create_mock_results(count=5)
# Various status distributions
results = create_mock_results(
count=10,
status_distribution={"success": 0.7, "failure": 0.3}
)
Assertion Functions¶
General Validation¶
from truthound.reporters.sdk import assert_valid_output
# Auto-detect and validate output format
assert_valid_output(output, format="json")
assert_valid_output(output, format="xml")
assert_valid_output(output, format="csv")
JSON Validation¶
from truthound.reporters.sdk import assert_json_valid
# Basic JSON validation
assert_json_valid(json_output)
# Validate with schema
assert_json_valid(json_output, schema=my_schema)
# Validate required fields
assert_json_valid(json_output, required_fields=["status", "issues"])
XML Validation¶
from truthound.reporters.sdk import assert_xml_valid
# Basic XML validation
assert_xml_valid(xml_output)
# Validate root element
assert_xml_valid(xml_output, root_element="testsuites")
# Validate with XSD schema
assert_xml_valid(xml_output, xsd_path="schema.xsd")
CSV Validation¶
from truthound.reporters.sdk import assert_csv_valid
# Basic CSV validation
assert_csv_valid(csv_output)
# Validate columns
assert_csv_valid(
csv_output,
required_columns=["validator", "column", "severity"],
min_rows=1
)
Pattern Matching¶
from truthound.reporters.sdk import assert_contains_patterns
assert_contains_patterns(
output,
patterns=[
r"Status: (success|failure)",
r"Pass Rate: \d+\.\d+%",
r"Total Issues: \d+"
]
)
ReporterTestCase¶
Base class for test cases:
from truthound.reporters.sdk import ReporterTestCase
from truthound.reporters import get_reporter
class TestMyReporter(ReporterTestCase):
reporter_name = "my_format"
def test_basic_render(self):
"""Basic rendering test."""
reporter = get_reporter(self.reporter_name)
result = self.create_sample_result()
output = reporter.render(result)
self.assert_output_valid(output)
self.assertIn("Status:", output)
def test_empty_issues(self):
"""Test with no issues."""
result = self.create_result_with_no_issues()
output = self.render(result)
self.assert_output_valid(output)
def test_edge_cases(self):
"""Edge case tests."""
for edge_case in self.get_edge_cases():
with self.subTest(edge_case=edge_case.name):
output = self.render(edge_case.data)
self.assert_output_valid(output)
Provided Methods:
| Method | Description |
|---|---|
create_sample_result() |
Create standard sample result |
create_result_with_no_issues() |
Create result with no issues |
create_result_with_many_issues(n) |
Create result with n issues |
get_edge_cases() |
Return edge case list |
render(result) |
Render with reporter |
assert_output_valid(output) |
Validate output |
Test Data Generation¶
from truthound.reporters.sdk import (
create_sample_data,
create_edge_case_data,
create_stress_test_data,
)
# Standard sample data
sample = create_sample_data()
# Edge case data
edge_cases = create_edge_case_data()
# Returns: empty_result, single_issue, max_severity, unicode_content, ...
# Stress test data
stress = create_stress_test_data(
issue_count=10000,
validator_count=100
)
Output Capture and Benchmarking¶
capture_output¶
from truthound.reporters.sdk import capture_output
# Capture stdout/stderr
with capture_output() as captured:
reporter.print(result)
print(f"Stdout: {captured.stdout}")
print(f"Stderr: {captured.stderr}")
benchmark_reporter¶
from truthound.reporters.sdk import benchmark_reporter, BenchmarkResult
# Benchmark reporter performance
result: BenchmarkResult = benchmark_reporter(
reporter=get_reporter("json"),
data=create_stress_test_data(issue_count=1000),
iterations=100
)
print(f"Mean time: {result.mean_time:.4f}s")
print(f"Std dev: {result.std_dev:.4f}s")
print(f"Min time: {result.min_time:.4f}s")
print(f"Max time: {result.max_time:.4f}s")
print(f"Throughput: {result.throughput:.2f} ops/sec")
BenchmarkResult Fields:
| Field | Type | Description |
|---|---|---|
mean_time |
float |
Mean execution time (seconds) |
std_dev |
float |
Standard deviation |
min_time |
float |
Minimum execution time |
max_time |
float |
Maximum execution time |
throughput |
float |
Throughput per second |
iterations |
int |
Iteration count |
Custom Reporter Registration¶
Register with Decorator¶
from truthound.reporters import register_reporter
from truthound.reporters.base import ValidationReporter, ReporterConfig
@register_reporter("my_custom")
class MyCustomReporter(ValidationReporter[ReporterConfig]):
name = "my_custom"
file_extension = ".custom"
def render(self, data):
return f"Custom: {data.status.value}"
Manual Registration¶
from truthound.reporters.factory import register_reporter
register_reporter("my_custom", MyCustomReporter)
# Usage
reporter = get_reporter("my_custom")
API Reference¶
SDK Exports¶
from truthound.reporters.sdk import (
# Mixins
FormattingMixin,
AggregationMixin,
FilteringMixin,
SerializationMixin,
TemplatingMixin,
StreamingMixin,
# Builder
ReporterBuilder,
create_reporter,
create_validation_reporter,
# Templates
CSVReporter,
YAMLReporter,
JUnitXMLReporter,
NDJSONReporter,
TableReporter,
# Schema
ReportSchema,
JSONSchema,
XMLSchema,
CSVSchema,
TextSchema,
ValidationResult,
ValidationError,
SchemaError,
validate_output,
register_schema,
get_schema,
unregister_schema,
validate_reporter_output,
infer_schema,
merge_schemas,
# Testing
ReporterTestCase,
create_mock_result,
create_mock_results,
create_mock_validator_result,
MockResultBuilder,
MockValidationResult,
MockValidatorResult,
assert_valid_output,
assert_json_valid,
assert_xml_valid,
assert_csv_valid,
assert_contains_patterns,
create_sample_data,
create_edge_case_data,
create_stress_test_data,
capture_output,
benchmark_reporter,
BenchmarkResult,
)