ID Management¶
ID generation strategies track records and maintain relationships between parent and child tables.
Random IDs (Default)¶
The default strategy generates unique UUIDs for all records:
import transmog as tm
data = {"product": {"name": "Laptop"}}
result = tm.flatten(data, name="products")
print(result.main[0])
# {'product_name': 'Laptop', '_id': 'uuid-generated', '_timestamp': '...'}
Natural IDs¶
Use existing ID fields from your data:
data = {
"product": {
"product_id": "PROD123",
"name": "Gaming Laptop",
"reviews": [
{"review_id": "REV456", "rating": 5},
{"review_id": "REV789", "rating": 4}
]
}
}
config = tm.TransmogConfig(id_generation="natural", id_field="product_id")
result = tm.flatten(data, name="products", config=config)
print(result.main[0])
# {'product_id': 'PROD123', 'product_name': 'Gaming Laptop'}
print(result.tables["products_reviews"][0])
# {'review_id': 'REV456', 'rating': 5, '_parent_id': 'PROD123'}
Important
Strategy "natural" requires the specified field to exist in all records.
A missing, empty, or null ID field raises TransmogError.
Child records use their own natural ID if the field is present; otherwise they fall back to generated IDs.
Hash-Based IDs¶
Generate deterministic IDs based on record content:
# Hash entire record
config = tm.TransmogConfig(id_generation="hash")
data = {"name": "Laptop", "price": 999}
result1 = tm.flatten(data, name="products", config=config)
result2 = tm.flatten(data, name="products", config=config)
# Same data produces same ID
assert result1.main[0]["_id"] == result2.main[0]["_id"]
Composite Key IDs¶
Hash only specific fields to create composite keys:
data1 = {"region": "US", "store": "001", "product": "laptop", "price": 999}
data2 = {"region": "US", "store": "001", "product": "laptop", "price": 899}
config = tm.TransmogConfig(id_generation=["region", "store", "product"])
result1 = tm.flatten(data1, name="sales", config=config)
result2 = tm.flatten(data2, name="sales", config=config)
# Same composite key produces same ID (price is ignored)
assert result1.main[0]["_id"] == result2.main[0]["_id"]
Metadata Field Names¶
id_field, parent_field, and time_field control the names of metadata
columns in the output. They do not affect how source data is read, with one
exception: id_field doubles as the source field name when
id_generation="natural" (see Natural IDs above).
Customize these names when the defaults conflict with your data schema:
config = tm.TransmogConfig(
id_field="record_id",
parent_field="parent_ref",
time_field="_created_at"
)
result = tm.flatten(data, config=config)
# Records use custom field names
print(result.main[0])
# {'name': 'Product', 'record_id': '...', '_created_at': '...'}
All three names must be distinct. Supplying the same value for any two raises a
ConfigurationError.
Disable timestamp tracking:
config = tm.TransmogConfig(time_field=None)
result = tm.flatten(data, config=config)
Parent-Child Relationships¶
Child records reference their parents through the parent_field output column.
This link is built automatically from the nesting structure — no configuration
beyond the field name is required.
result = tm.flatten(data, name="products")
main_id = result.main[0]["_id"]
for review in result.tables["products_reviews"]:
assert review["_parent_id"] == main_id