Skip to content

GreatExpectationsAdapter

The Great Expectations adapter employs rules-based validation methodology.

Basic Usage

from common.engines import GreatExpectationsAdapter

engine = GreatExpectationsAdapter()

# Using common rule format (automatically converted to GE expectations)
result = engine.check(
    data=df,
    rules=[
        {"type": "not_null", "column": "id"},
        {"type": "unique", "column": "email"},
        {"type": "in_range", "column": "age", "min": 0, "max": 150},
        {"type": "in_set", "column": "status", "values": ["active", "inactive"]},
    ],
)

Rule Type Conversion

Common rule types are automatically converted to GE Expectations:

Common Rule Type GE Expectation
not_null expect_column_values_to_not_be_null
unique expect_column_values_to_be_unique
in_set expect_column_values_to_be_in_set
in_range expect_column_values_to_be_between
regex expect_column_values_to_match_regex
dtype expect_column_values_to_be_of_type
min_length / max_length expect_column_value_lengths_to_be_between
column_exists expect_column_to_exist

Direct GE Expectation Usage

result = engine.check(
    data=df,
    rules=[
        {"type": "expect_column_to_exist", "column": "id"},
        {
            "type": "expect_column_mean_to_be_between",
            "column": "value",
            "min_value": 0,
            "max_value": 100,
        },
    ],
)

GE-Specific Parameters

result = engine.check(
    data=df,
    rules=rules,
    result_format="COMPLETE",  # BOOLEAN_ONLY, BASIC, COMPLETE
    fail_on_error=True,
)

Data Conversion

GE requires Pandas DataFrames. Polars DataFrames are automatically converted:

import polars as pl

polars_df = pl.read_csv("data.csv")
result = engine.check(polars_df, rules)  # Operates transparently

Profiling

profile = engine.profile(df)

for col in profile.columns:
    print(f"{col.column_name}: {col.dtype}")
    print(f"  Null: {col.null_percentage}%")

Schema Learning

learn_result = engine.learn(df)

for rule in learn_result.rules:
    print(f"{rule.column}: {rule.rule_type}")

Lifecycle Management

with GreatExpectationsAdapter() as engine:
    result = engine.check(df, rules)
    health = engine.health_check()

Configuration

from common.engines import GreatExpectationsConfig

config = GreatExpectationsConfig(
    result_format="COMPLETE",
    context_root_dir=None,
    include_profiling=True,
    catch_exceptions=True,
    enable_data_docs=False,
)

engine = GreatExpectationsAdapter(config=config)

Supported Data Types

Data Type Support
Pandas DataFrame Native
Polars DataFrame Auto-conversion