Internationalization (i18n)
This document describes the profiler's multilingual support system.
Overview
The internationalization system, implemented in src/truthound/profiler/i18n.py, supports seven languages. Three of them (English, Korean, and Japanese) ship with built-in translations; the remaining four (Chinese, German, French, and Spanish) are catalog only.
Supported Languages
| Language Code | Language | Built-in Translation |
|---|---|---|
| en | English | ✅ Fully supported |
| ko | 한국어 | ✅ Fully supported |
| ja | 日本語 | ✅ Fully supported |
| zh | 中文 | 📝 Catalog only |
| de | Deutsch | 📝 Catalog only |
| fr | Français | 📝 Catalog only |
| es | Español | 📝 Catalog only |
MessageCode
Over 50 message codes are defined, grouped by category. Each value uses dot notation in the form category.name, for example analysis.failed.
from enum import Enum

class MessageCode(str, Enum):
    """Message codes"""
    # Analysis messages (analysis.*)
    ANALYSIS_FAILED = "analysis.failed"
    ANALYSIS_COLUMN_FAILED = "analysis.column_failed"
    ANALYSIS_TABLE_FAILED = "analysis.table_failed"
    ANALYSIS_EMPTY_DATA = "analysis.empty_data"
    ANALYSIS_SKIPPED = "analysis.skipped"
    # Pattern messages (pattern.*)
    PATTERN_UNKNOWN = "pattern.unknown"
    PATTERN_INVALID = "pattern.invalid"
    PATTERN_NOT_FOUND = "pattern.not_found"
    # Type messages (type.*)
    TYPE_UNKNOWN = "type.unknown"
    TYPE_MISMATCH = "type.mismatch"
    TYPE_INFERENCE_FAILED = "type.inference_failed"
    # IO messages (io.*)
    IO_READ_FAILED = "io.read_failed"
    IO_WRITE_FAILED = "io.write_failed"
    IO_FILE_NOT_FOUND = "io.file_not_found"
    # Timeout messages (timeout.*)
    TIMEOUT_EXCEEDED = "timeout.exceeded"
    TIMEOUT_WARNING = "timeout.warning"
    # Validation messages (validation.*)
    VALIDATION_FAILED = "validation.failed"
    VALIDATION_SKIPPED = "validation.skipped"
    # Config messages (config.*)
    CONFIG_INVALID = "config.invalid"
    CONFIG_MISSING = "config.missing"
    # Error messages (err.*)
    ERR_UNKNOWN = "err.unknown"
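Because `MessageCode` mixes in `str`, each member compares equal to its dot-notation value and can be looked up from that value. The sketch below re-declares a three-member subset of the enum so that it runs on its own; the `category` helper is hypothetical and is not part of the library.

```python
from enum import Enum


class MessageCode(str, Enum):
    """Subset of the message codes listed above, re-declared for illustration."""

    ANALYSIS_FAILED = "analysis.failed"
    PATTERN_NOT_FOUND = "pattern.not_found"
    IO_FILE_NOT_FOUND = "io.file_not_found"


def category(code: MessageCode) -> str:
    """Hypothetical helper: return the part of the value before the first dot."""
    return code.value.split(".", 1)[0]


# A str-based enum member compares equal to its raw string value.
assert MessageCode.ANALYSIS_FAILED == "analysis.failed"

# Lookup by value returns the corresponding member.
assert MessageCode("io.file_not_found") is MessageCode.IO_FILE_NOT_FOUND

# Group member names by their category prefix.
by_category = {}
for code in MessageCode:
    by_category.setdefault(category(code), []).append(code.name)
```

Deriving the category from the value keeps the grouping in sync with the dot-notation convention, with no need for a separate lookup table.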
LocaleManager
A thread-safe locale manager.
from truthound.profiler.i18n import LocaleManager
manager = LocaleManager()
# Get current locale (property)
current = manager.current # "en"
# Set locale
manager.set_locale("ko")
# Get locale (method)
locale = manager.get_locale() # "ko"
# Get fallback chain
chain = manager.get_fallback_chain("ko_KR") # ["ko_KR", "ko", "en"]
# Register new locale
manager.register_locale("zh_CN", fallback="zh")
# Per-thread locale: each thread can hold its own locale independently
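How `LocaleManager` isolates locales between threads is not shown in this document. One common approach is `threading.local`; the sketch below is a minimal stand-in under that assumption. `ThreadLocalLocale` is a hypothetical class, not the library's implementation.

```python
import threading


class ThreadLocalLocale:
    """Hypothetical per-thread locale holder built on threading.local."""

    def __init__(self, default: str = "en") -> None:
        self._default = default
        self._state = threading.local()

    def set_locale(self, locale: str) -> None:
        # Stored on the calling thread only; other threads are unaffected.
        self._state.locale = locale

    def get_locale(self) -> str:
        # Threads that never called set_locale see the default.
        return getattr(self._state, "locale", self._default)


manager = ThreadLocalLocale()
seen = {}


def worker(name: str, locale: str) -> None:
    manager.set_locale(locale)
    seen[name] = manager.get_locale()


threads = [
    threading.Thread(target=worker, args=("t1", "ko")),
    threading.Thread(target=worker, args=("t2", "ja")),
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

# Each worker saw its own locale; the main thread still has the default.
assert seen == {"t1": "ko", "t2": "ja"}
assert manager.get_locale() == "en"
```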
MessageCatalog
A catalog of translated messages. One catalog is created per locale.
from truthound.profiler.i18n import MessageCatalog
# Create with locale and message dictionary
messages = {
"analysis.failed": "Analysis failed: {reason}",
"analysis.column_failed": "Column analysis failed: {column}",
}
catalog = MessageCatalog("en", messages)
# Get message (supports pluralization)
msg = catalog.get("analysis.failed") # "Analysis failed: {reason}"
msg = catalog.get("item.count", count=5)  # Plural handling (the catalog must define an "item.count" entry)
# Check key existence
if catalog.has("analysis.failed"):
    print("Key exists")
# Get all keys
all_keys = catalog.keys() # ["analysis.failed", "analysis.column_failed", ...]
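The document does not say how plural forms are stored. As a rough sketch only, the catalog below assumes a plural entry is a dictionary with `"one"` and `"other"` forms selected by an English-style rule. `PluralCatalog` and this entry layout are hypothetical and may differ from what `MessageCatalog` actually does.

```python
class PluralCatalog:
    """Hypothetical catalog; not the library's MessageCatalog."""

    def __init__(self, locale, messages):
        self.locale = locale
        self._messages = dict(messages)

    def get(self, key, count=None, default=None):
        entry = self._messages.get(key, default)
        if isinstance(entry, dict):
            # Assumed layout: {"one": ..., "other": ...}, English-style rule.
            form = "one" if count == 1 else "other"
            entry = entry.get(form, entry.get("other"))
        return entry

    def has(self, key):
        return key in self._messages

    def keys(self):
        return list(self._messages)


catalog = PluralCatalog("en", {
    "analysis.failed": "Analysis failed: {reason}",
    "item.count": {"one": "{count} item", "other": "{count} items"},
})

# get() returns the template; the caller fills in the placeholders.
plural = catalog.get("item.count", count=5).format(count=5)
singular = catalog.get("item.count", count=1).format(count=1)
```

Languages such as Korean and Japanese do not distinguish singular from plural in this way, so a real implementation would need per-locale rules in place of the single rule used here.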
I18n Singleton
The global I18n instance, accessed through I18n.get_instance().
from truthound.profiler.i18n import I18n
# Get singleton instance
i18n = I18n.get_instance()
# Set locale
i18n.set_locale("ko")
# Get translated message (using t method)
msg = i18n.t("analysis.failed", reason="No data")
print(msg) # "분석 실패: No data"
# Check key existence
if i18n.has("analysis.failed"):
    print("Key exists")
# Get all keys
all_keys = i18n.list_keys()
# Add custom loader (my_custom_loader is a loader you define yourself)
i18n.add_loader(my_custom_loader)
Convenience Functions
from truthound.profiler.i18n import set_locale, get_locale, get_message, t
# Set global locale
set_locale("ko")
# Get current locale
locale = get_locale() # "ko"
# Get message
msg = get_message("analysis.failed", reason="Data error")
print(msg) # "분석 실패: Data error"
# Shorthand alias (t function)
msg = t("pattern.not_found", pattern="email")
Context Manager
A context manager for switching the locale temporarily. The previous locale is restored when the block exits.
from truthound.profiler.i18n import MessageCode, get_message, locale_context
# Temporarily use Japanese
with locale_context("ja"):
    msg = get_message(MessageCode.ANALYSIS_EMPTY_DATA)
    print(msg)  # Japanese message
# Reverts to original locale after context exits
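The restore-on-exit behavior can be sketched with `contextlib.contextmanager`. This is a self-contained illustration, not the library's code: the module-level `_current_locale` variable and the local `set_locale`, `get_locale`, and `locale_context` functions are stand-ins that only borrow the names used in this document.

```python
from contextlib import contextmanager

_current_locale = "en"  # stand-in for the library's locale state


def set_locale(locale):
    global _current_locale
    _current_locale = locale


def get_locale():
    return _current_locale


@contextmanager
def locale_context(locale):
    previous = get_locale()
    set_locale(locale)
    try:
        yield locale
    finally:
        # Restore the previous locale even if the body raised.
        set_locale(previous)


with locale_context("ja"):
    inside = get_locale()
after = get_locale()

# The locale is also restored when the block exits with an exception.
try:
    with locale_context("ko"):
        raise RuntimeError("boom")
except RuntimeError:
    pass
after_error = get_locale()
```

Putting the restore step in `finally` is what guarantees that a failure inside the block cannot leave the locale switched.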
Hierarchical Fallback
When a message is not found in the requested locale, lookup follows that locale's fallback chain until a translation is found.
# Fallback order: ko_KR → ko → en
# If ko_KR message not found, uses ko
# If ko message not found, uses en
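The fallback order above can be sketched as a chain builder plus a lookup loop. This is an illustration under assumptions: `fallback_chain` and `resolve` are hypothetical helpers, and returning the key itself when no catalog contains it is an assumed behavior that the library may not share.

```python
def fallback_chain(locale, default="en"):
    """Build a chain such as ["ko_KR", "ko", "en"] for "ko_KR"."""
    chain = [locale]
    # Drop the region suffix when present: "ko_KR" -> "ko".
    if "_" in locale:
        chain.append(locale.split("_", 1)[0])
    if default not in chain:
        chain.append(default)
    return chain


def resolve(key, locale, catalogs):
    """Return the first translation found while walking the chain."""
    for candidate in fallback_chain(locale):
        messages = catalogs.get(candidate, {})
        if key in messages:
            return messages[key]
    # Assumed behavior: fall back to the key when nothing matches.
    return key


catalogs = {
    "en": {
        "analysis.failed": "Analysis failed: {reason}",
        "type.unknown": "Unknown type",
    },
    "ko": {"analysis.failed": "분석 실패: {reason}"},
}

# No ko_KR catalog exists, so the ko translation is used.
korean = resolve("analysis.failed", "ko_KR", catalogs)

# ko has no "type.unknown" entry, so lookup falls through to en.
english = resolve("type.unknown", "ko_KR", catalogs)
```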
I18nError Exception Hierarchy
Exception classes whose messages are translated into the current locale.
from truthound.profiler.i18n import (
    I18nError,
    I18nAnalysisError,
    I18nPatternError,
    I18nTypeError,
    I18nIOError,
    I18nTimeoutError,
    I18nValidationError,
)
try:
    profile = profiler.profile_file("nonexistent.csv")
except I18nIOError as e:
    # Error message translated to current locale
    print(e.localized_message)  # "File not found: nonexistent.csv"
    print(e.message_code)  # MessageCode.IO_FILE_NOT_FOUND
Default message codes for each exception class:
| Exception Class | Default Message Code |
|---|---|
| I18nError | err.unknown |
| I18nAnalysisError | analysis.failed |
| I18nPatternError | pattern.unknown |
| I18nTypeError | type.unknown |
| I18nIOError | io.read_failed |
| I18nTimeoutError | timeout.exceeded |
| I18nValidationError | validation.failed |
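One way to give each exception class its own default code is a class attribute that subclasses override. The sketch below assumes that design. `LocalizedError`, `LocalizedIOError`, and the module-level `MESSAGES` dictionary are hypothetical stand-ins, not the library's classes, and the two templates are copied from the English messages in this document.

```python
MESSAGES = {
    "io.read_failed": "Failed to read: {path}",
    "io.file_not_found": "File not found: {path}",
}


class LocalizedError(Exception):
    """Hypothetical base class: one default message code per class."""

    default_code = "err.unknown"

    def __init__(self, message_code=None, **params):
        # Fall back to the class default when no code is given.
        self.message_code = message_code or self.default_code
        self.params = params
        super().__init__(self.localized_message)

    @property
    def localized_message(self):
        # Unknown codes render as the code itself.
        template = MESSAGES.get(self.message_code, self.message_code)
        return template.format(**self.params)


class LocalizedIOError(LocalizedError):
    default_code = "io.read_failed"


default_error = LocalizedIOError(path="data.csv")
specific_error = LocalizedIOError("io.file_not_found", path="nonexistent.csv")
```

A subclass only declares its `default_code`; callers can still pass a more specific code, as the second instance does.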
Built-in Translations (English)
ENGLISH_MESSAGES = {
    "analysis.failed": "Analysis failed: {reason}",
    "analysis.column_failed": "Column analysis failed: {column}",
    "analysis.table_failed": "Table analysis failed: {table}",
    "analysis.empty_data": "Empty data provided",
    "pattern.unknown": "Unknown pattern",
    "pattern.not_found": "Pattern not found: {pattern}",
    "type.unknown": "Unknown type",
    "type.mismatch": "Type mismatch: expected {expected}, got {actual}",
    "io.read_failed": "Failed to read: {path}",
    "io.file_not_found": "File not found: {path}",
    # ...
}
Built-in Translations (Korean)
KOREAN_MESSAGES = {
    "analysis.failed": "분석 실패: {reason}",
    "analysis.column_failed": "컬럼 분석 실패: {column}",
    "analysis.table_failed": "테이블 분석 실패: {table}",
    "analysis.empty_data": "빈 데이터가 제공되었습니다",
    "pattern.unknown": "알 수 없는 패턴",
    "pattern.not_found": "패턴을 찾을 수 없습니다: {pattern}",
    "type.unknown": "알 수 없는 타입",
    "type.mismatch": "타입 불일치: {expected} 예상, {actual} 받음",
    "io.read_failed": "읽기 실패: {path}",
    "io.file_not_found": "파일을 찾을 수 없습니다: {path}",
    # ...
}
Built-in Translations (Japanese)
JAPANESE_MESSAGES = {
    "analysis.failed": "分析失敗: {reason}",
    "analysis.column_failed": "カラム分析失敗: {column}",
    "analysis.table_failed": "テーブル分析失敗: {table}",
    "analysis.empty_data": "空のデータが提供されました",
    "pattern.unknown": "不明なパターン",
    "pattern.not_found": "パターンが見つかりません: {pattern}",
    "type.unknown": "不明な型",
    "type.mismatch": "型の不一致: {expected}を期待、{actual}を受信",
    "io.read_failed": "読み取り失敗: {path}",
    "io.file_not_found": "ファイルが見つかりません: {path}",
    # ...
}
Custom Message Registration
from truthound.profiler.i18n import register_messages
# Register custom messages
custom_messages = {
    "MY_CUSTOM_MSG": "Custom message with {placeholder}",
}
register_messages("en", custom_messages)
# Load from file
from truthound.profiler.i18n import load_messages_from_file
load_messages_from_file("my_messages.yaml", locale="ko")
Report Internationalization
from truthound.profiler.visualization import HTMLReportGenerator, ReportConfig
from truthound.profiler.i18n import set_locale
# Generate Korean report
set_locale("ko")
generator = HTMLReportGenerator()
html = generator.generate(profile)
# All text in report displayed in Korean
CLI Usage
# Profile in Korean
th profile data.csv --locale ko
# Generate Japanese report
th profile data.csv --locale ja --output report.html
# Use system locale
th profile data.csv --locale auto
Environment Variables
| Variable | Description | Default |
|---|---|---|
| TRUTHOUND_LOCALE | Default locale | en |
| LANG | System locale (fallback) | - |
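The table suggests a precedence of `TRUTHOUND_LOCALE`, then `LANG`, then `en`. The sketch below implements that reading. The `resolve_default_locale` function is hypothetical, and both the exact precedence and the way `LANG` is normalized are assumptions, not documented library behavior.

```python
import os


def resolve_default_locale(environ=None):
    """Pick a locale from the environment, falling back to "en"."""
    env = os.environ if environ is None else environ

    # 1. An explicit TRUTHOUND_LOCALE wins.
    explicit = env.get("TRUTHOUND_LOCALE", "").strip()
    if explicit:
        return explicit

    # 2. Otherwise use LANG, minus encoding and modifier:
    #    "ko_KR.UTF-8" -> "ko_KR", "de_DE@euro" -> "de_DE".
    lang = env.get("LANG", "").strip()
    lang = lang.split(".", 1)[0].split("@", 1)[0]
    if lang and lang not in ("C", "POSIX"):
        return lang

    # 3. Nothing usable was set.
    return "en"


both_set = resolve_default_locale({"TRUTHOUND_LOCALE": "ja", "LANG": "ko_KR.UTF-8"})
lang_only = resolve_default_locale({"LANG": "ko_KR.UTF-8"})
neither = resolve_default_locale({})
```

Passing the environment as an argument keeps the function testable without modifying `os.environ`.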
Next Steps
- Visualization - Generate internationalized reports
- Distributed Processing - Locale management in distributed environments