Error Handling

Transmog raises exceptions when errors occur during processing. All Transmog exceptions inherit from TransmogError, so catching it handles every error type described below.

Error Types

TransmogError

Base exception for all Transmog errors. Available as tm.TransmogError.

import transmog as tm

data = {"id": 1, "name": "Product"}  # sample input

try:
    result = tm.flatten(data)
except tm.TransmogError as e:
    print(f"Transmog error: {e}")

ValidationError

Raised when input data validation or processing fails. Available as tm.ValidationError.

# Invalid data type
invalid_data = "not a dict or list"

try:
    result = tm.flatten(invalid_data)
except tm.ValidationError as e:
    print(f"Validation error: {e}")

MissingDependencyError

Raised when an optional dependency is missing. Available as tm.MissingDependencyError.

# result is the object returned by an earlier tm.flatten() call
try:
    result.save("output.parquet")
except tm.MissingDependencyError as e:
    print(f"Missing dependency: {e}")
    print("Install with: pip install pyarrow")

ConfigurationError

Raised when TransmogConfig receives invalid parameters (e.g., batch_size < 1 or an invalid id_generation value). ConfigurationError is not exported in the public API, so catch it using TransmogError as the base class.

OutputError

Raised when writing output files fails. OutputError is not exported in the public API, so catch it using TransmogError as the base class. Common triggers:

  • Schema drift in strict mode (CSV streaming encounters unexpected fields)

  • File permission errors or disk full during writes

  • Avro schema mismatch between batches

try:
    tm.flatten_stream(data, "output/", output_format="csv")
except tm.TransmogError as e:
    # OutputError is caught via the base class
    print(f"Write failed: {e}")
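
Because ConfigurationError and OutputError are not exported, one way to tell them apart after catching the base class is to inspect the exception's class name. The sketch below is only an illustration: it defines stand-in exception classes so that it runs without Transmog installed, and describe_error is a hypothetical helper, not part of the library. It also assumes the raised exception is an instance of the named class itself rather than a further subclass.

```python
# Stand-in classes that mirror the hierarchy described above.
# In real code you would catch tm.TransmogError instead.
class TransmogError(Exception):
    """Stand-in for tm.TransmogError."""


class ConfigurationError(TransmogError):
    """Stand-in for the unexported ConfigurationError."""


class OutputError(TransmogError):
    """Stand-in for the unexported OutputError."""


def describe_error(exc: Exception) -> str:
    """Label an exception by its class name (hypothetical helper)."""
    name = type(exc).__name__
    if name == "ConfigurationError":
        return f"configuration problem: {exc}"
    if name == "OutputError":
        return f"write problem: {exc}"
    return f"other error: {exc}"


try:
    raise OutputError("disk full")
except TransmogError as e:
    # Caught via the base class, then classified by name.
    message = describe_error(e)
    print(message)
```

Matching on the class name avoids importing from internal modules, at the cost of being sensitive to renames in future releases.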

Custom Error Handling

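import logging

import transmog as tm


def safe_flatten(data, **kwargs):
    """Flatten data, returning None instead of raising on Transmog errors."""
    try:
        return tm.flatten(data, **kwargs)
    except tm.ValidationError as e:
        # Bad input is expected occasionally, so log it as a warning.
        logging.warning("Invalid data: %s", e)
        return None
    except tm.TransmogError as e:
        # Any other Transmog failure is logged as an error.
        logging.error("Processing failed: %s", e)
        return None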
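
The same pattern extends to processing many records where one bad record should not abort the whole run. The following is a minimal sketch, assuming the flatten function and the exception class are passed in as arguments (with Transmog these would be tm.flatten and tm.TransmogError). The names flatten_all and stub_flatten are hypothetical, and the stub stands in for the library so the example runs on its own.

```python
from typing import Any, Callable, Iterable, List, Tuple, Type


def flatten_all(
    records: Iterable[Any],
    flatten_fn: Callable[[Any], Any],
    error_type: Type[Exception],
) -> Tuple[List[Any], List[Tuple[int, str]]]:
    """Flatten each record, collecting failures instead of stopping."""
    results, failures = [], []
    for index, record in enumerate(records):
        try:
            results.append(flatten_fn(record))
        except error_type as exc:
            # Remember which record failed and why, then continue.
            failures.append((index, str(exc)))
    return results, failures


# Demonstration with a stub in place of tm.flatten / tm.TransmogError.
def stub_flatten(record):
    if not isinstance(record, (dict, list)):
        raise ValueError("expected a dict or list")
    return record


results, failures = flatten_all([{"a": 1}, "bad", [1, 2]], stub_flatten, ValueError)
print(results)
print(failures)
```

Returning the failures alongside the results lets the caller decide whether a partial result is acceptable.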

Examples

Missing Natural IDs

config = tm.TransmogConfig(id_generation="natural", id_field="id")
data = {"name": "Product"}  # Missing 'id'

try:
    result = tm.flatten(data, config=config)
except tm.TransmogError as e:
    print(f"Error: {e}")

Malformed JSONL

# File with invalid JSON on line 2
try:
    result = tm.flatten("malformed.jsonl")
except tm.TransmogError as e:
    print(f"Error processing file: {e}")
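
To locate the offending lines before handing a file to Transmog, the file can be pre-checked with the standard library. This sketch does not call Transmog at all; find_bad_jsonl_lines is a hypothetical helper, and it assumes one JSON value per non-empty line.

```python
import json
import os
import tempfile
from typing import List, Tuple


def find_bad_jsonl_lines(path: str) -> List[Tuple[int, str]]:
    """Return (line_number, error_message) for each line that is not valid JSON."""
    bad = []
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # skip blank lines
            try:
                json.loads(line)
            except json.JSONDecodeError as exc:
                bad.append((line_number, exc.msg))
    return bad


# Demonstration: a temporary file whose second line is truncated JSON.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.jsonl")
    with open(sample_path, "w", encoding="utf-8") as handle:
        handle.write('{"id": 1}\n{"id": 2,\n{"id": 3}\n')
    bad_lines = find_bad_jsonl_lines(sample_path)

for line_number, reason in bad_lines:
    print(f"line {line_number}: {reason}")
```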

Missing Optional Dependency

try:
    tm.flatten_stream(data, "output/", output_format="avro")
except tm.MissingDependencyError as e:
    print(f"Missing dependency: {e}")
    print("Install with: pip install fastavro cramjam")
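
An alternative to catching MissingDependencyError is to check for the optional packages up front. The sketch below uses only the standard library; missing_modules and the REQUIRED mapping are hypothetical, with module names taken from the install commands in this document and format labels chosen for illustration.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed mapping from output format to the modules it needs.
REQUIRED = {
    "parquet": ["pyarrow"],
    "orc": ["pyarrow"],
    "avro": ["fastavro", "cramjam"],
}

missing = missing_modules(REQUIRED["avro"])
if missing:
    print(f"Install first: pip install {' '.join(missing)}")
```

Checking before a long streaming job starts avoids discovering the missing package only when the first batch is written.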

Troubleshooting

Common Errors

“Missing dependency” when saving Parquet/ORC: Install PyArrow with pip install pyarrow.

“Missing dependency” when saving Avro: Install fastavro and cramjam with pip install fastavro cramjam.

Schema drift error during Avro streaming: When using flatten_stream() with Avro output, the schema is locked after the first batch. If later batches contain fields not present in the first batch, a schema drift error is raised. Ensure input data has a consistent structure, or process a representative sample first to establish the schema.
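
One way to check for a consistent structure ahead of time is to compare the fields of later records against those seen in the first batch. The sketch below is a rough approximation under stated assumptions: find_new_fields is a hypothetical helper, it compares top-level keys only (nested structures would need a deeper comparison), and first_batch_size is assumed to match the batch_size used for streaming.

```python
from typing import Any, Dict, Iterable, List, Set, Tuple


def find_new_fields(
    records: Iterable[Dict[str, Any]], first_batch_size: int
) -> List[Tuple[int, Set[str]]]:
    """Report records after the first batch whose top-level keys were not seen in it."""
    known: Set[str] = set()
    drift: List[Tuple[int, Set[str]]] = []
    for index, record in enumerate(records):
        if index < first_batch_size:
            known.update(record)  # fields that would establish the schema
            continue
        extra = set(record) - known
        if extra:
            drift.append((index, extra))
    return drift


records = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
    {"id": 3, "name": "c", "price": 9.5},
]
drift = find_new_fields(records, first_batch_size=2)
print(drift)
```

An empty result suggests that later batches introduce no new top-level fields; any reported records can be moved into the first batch or normalized before streaming.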

ConfigurationError on invalid config: Catch using TransmogError since ConfigurationError is not exported:

try:
    config = tm.TransmogConfig(batch_size=-1)
except tm.TransmogError as e:
    print(f"Invalid config: {e}")