dbt Macros¶
SQL macros for data quality validation.
truthound_check¶
Data quality validation macro for tables/models:
{{ truthound_check(
table='my_table',
rules=[
{'type': 'not_null', 'column': 'id'},
{'type': 'unique', 'column': 'id'},
]
) }}
Parameters¶
| Parameter | Type | Description |
|---|---|---|
table |
string | Table or model name |
rules |
list | List of validation rules |
Rule Format¶
Rule Types¶
not_null¶
NULL value check:
unique¶
Uniqueness check:
in_range¶
Range check:
{{ truthound_check(
table='orders',
rules=[{
'type': 'in_range',
'column': 'amount',
'min': 0,
'max': 1000000
}]
) }}
in_set¶
Value set check:
{{ truthound_check(
table='orders',
rules=[{
'type': 'in_set',
'column': 'status',
'values': ['pending', 'completed', 'cancelled']
}]
) }}
regex¶
Regex pattern check:
{{ truthound_check(
table='customers',
rules=[{
'type': 'regex',
'column': 'email',
'pattern': '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$'
}]
) }}
Database-Specific Adapters¶
Handles SQL syntax differences across databases.
BigQuery¶
-- adapters/bigquery.sql
{% macro bigquery_regex_check(column, pattern) %}
REGEXP_CONTAINS({{ column }}, r'{{ pattern }}')
{% endmacro %}
Snowflake¶
-- adapters/snowflake.sql
{% macro snowflake_regex_check(column, pattern) %}
REGEXP_LIKE({{ column }}, '{{ pattern }}')
{% endmacro %}
Redshift¶
-- adapters/redshift.sql
{% macro redshift_regex_check(column, pattern) %}
{{ column }} ~ '{{ pattern }}'
{% endmacro %}
Databricks¶
-- adapters/databricks.sql
{% macro databricks_regex_check(column, pattern) %}
REGEXP({{ column }}, '{{ pattern }}')
{% endmacro %}
Utility Macros¶
truthound_utils.sql¶
-- Check if column exists
{% macro column_exists(table, column) %}
...
{% endmacro %}
-- Get table row count
{% macro row_count(table) %}
...
{% endmacro %}
Rule Definition Macros¶
Reusable rule definitions:
-- macros/truthound_rules.sql
{% macro common_id_rules(column='id') %}
{{ return([
{'type': 'not_null', 'column': column},
{'type': 'unique', 'column': column},
]) }}
{% endmacro %}
{% macro common_email_rules(column='email') %}
{{ return([
{'type': 'not_null', 'column': column},
{'type': 'regex', 'column': column, 'pattern': '^.*@.*$'},
]) }}
{% endmacro %}
Usage:
Test Models¶
Example test model: