Upstream Source
This page is part of Truthound Orchestration 3.x.
Source repository: seadonggyun4/truthound-orchestration
Upstream docs path: docs/prefect/retries-caching-concurrency.md
Prefect Retries, Caching, and Concurrency¶
Prefect is strongest when operational behavior is explicit in Python. Truthound leans into that by exposing task and flow configuration surfaces for retries, cache settings, and execution control instead of hiding them behind adapter-specific magic.
Who This Is For¶
- platform teams defining retry and cache policy
- flow authors tuning expensive quality checks
- operators standardizing concurrency and replay behavior
When To Use It¶
Use this page when:
- validation tasks should retry transient failures
- repeated profile or learn operations should reuse cached results
- the team needs clear concurrency boundaries for larger validation workloads
Prerequisites¶
- familiarity with Prefect tasks and flows
- awareness of the shared Truthound preflight/runtime split
- a clear policy for transient failure vs hard data-quality failure
Minimal Quickstart¶
Use a flow config with retries:
```python
from truthound_prefect import QualityFlowConfig, create_quality_flow

flow = create_quality_flow(
    "validate_users",
    cfg=QualityFlowConfig(
        rules=[{"column": "id", "check": "not_null"}],
        retries=2,
        retry_delay_seconds=30,
        engine_name="truthound",
    ),
)
```
Task builders also expose cache-related settings:
```python
from truthound_prefect import create_check_task

check_task = create_check_task(
    retries=1,
    cache_key="dim_users_quality",
    cache_expiration_seconds=300,
)
```
Production Pattern¶
Use this decision table:
| Concern | Recommended Policy |
|---|---|
| transient network or warehouse failure | retry at the task or flow layer |
| known bad dataset | do not retry blindly; fix data or downgrade severity |
| expensive repeated quality summaries | use a short cache window where safe |
| multi-table fan-out | control concurrency at the Prefect deployment/work-pool layer |
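The first two rows of the table can be made explicit in code. Prefect's task decorator accepts a `retry_condition_fn` hook that decides per-failure whether a retry is worthwhile; the classification helper below is a sketch, and the exception taxonomy (what counts as transient) is an assumption you should adapt to your warehouse client:

```python
def is_transient_failure(exc: BaseException) -> bool:
    """Retry only infrastructure-level errors; data-quality failures are deterministic."""
    # Assumption: transport-level errors surface as these exception types.
    transient_types = (ConnectionError, TimeoutError, OSError)
    return isinstance(exc, transient_types)

# Possible wiring (sketch): Prefect calls retry_condition_fn with the task,
# task run, and terminal state, and retries only when it returns True.
#
# from prefect import task
#
# @task(retries=2, retry_delay_seconds=30,
#       retry_condition_fn=lambda task, task_run, state: ...)  # delegate to is_transient_failure
# def validate_users(): ...

assert is_transient_failure(TimeoutError("warehouse timed out"))
assert not is_transient_failure(ValueError("null values in id column"))
```

This keeps the decision table's policy in one place: a known bad dataset raises a deterministic error, which the classifier refuses to retry.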
Recommended checklist:
- retry only infrastructure failures, not deterministic data failures
- keep cache keys dataset-specific and environment-aware
- document whether a cached result is allowed to gate downstream execution
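The checklist's cache-key guidance can be sketched as a small key builder. The naming scheme below is one possible convention, not a Truthound requirement:

```python
def build_cache_key(dataset: str, environment: str, rules_version: str) -> str:
    """Compose a cache key that is dataset-specific and environment-aware,
    so staging runs never reuse production results (and vice versa)."""
    for part in (dataset, environment, rules_version):
        if not part:
            raise ValueError("cache key parts must be non-empty")
    return f"{environment}:{dataset}:rules-{rules_version}"

# Distinct environments produce distinct keys for the same dataset:
assert build_cache_key("dim_users", "prod", "v3") != build_cache_key("dim_users", "staging", "v3")
```

Including a rules version in the key means a policy change invalidates old cached summaries automatically, rather than silently gating downstream execution on stale results.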
Failure Modes and Troubleshooting¶
| Symptom | Likely Cause | What To Do |
|---|---|---|
| failures repeat across retries | the issue is data quality, not transport | stop retrying and surface the result directly |
| stale quality state appears | cache keys are too broad or cache expiration is too long | narrow the key and shorten expiration |
| flow overwhelms workers | parallel fan-out lacks an explicit concurrency policy | move concurrency control to deployment/work-pool settings |
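For the last row, concurrency is best capped outside the flow code. With the Prefect CLI, tag-based concurrency limits provide one such boundary; the tag name and limit below are illustrative, and work-pool concurrency settings are an alternative at the deployment layer:

```shell
# Cap concurrent runs of tasks carrying the "truthound-validate" tag to 5
# (the tag name is an example; apply it to your validation tasks first).
prefect concurrency-limit create truthound-validate 5

# Inspect the active limit and the task runs currently holding slots.
prefect concurrency-limit inspect truthound-validate
```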