Changelog
Release history and migration guides.
0.2.0 (2026-04-13)
Multi-table synthesis, API redesign, and developer experience improvements.
Multi-Table Synthesis
- Table class: typed table definition with primary_key, foreign_keys, sequential, and sequence_by parameters
- synthesize_tables(): synthesize related tables with automatic dependency ordering, PK auto-assignment, and FK remapping
- Sequential generation: child tables are generated conditioned on the parent, preserving per-entity patterns (transaction counts, temporal ordering, value distributions)
- Fan-out support: multiple child tables referencing the same parent (e.g. accounts → transactions + loans)
- N-parent tables: sequence_by for disambiguation when a table has multiple foreign keys
- Flat remap: sequential=False for independent generation with FK integrity only
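
The dependency ordering, PK auto-assignment, and flat FK remap listed above can be illustrated with the standard library alone. This is a minimal sketch and not dataxid's implementation: the schema dict and the helpers generation_order, assign_primary_keys, and flat_remap are hypothetical names, rows are plain dicts rather than DataFrames, and picking a parent uniformly at random is an assumption about what "FK integrity only" could mean.

```python
import random
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical schema: table name -> {fk_column: parent_table_name}
foreign_keys = {
    "accounts": {},
    "transactions": {"account_id": "accounts"},
    "loans": {"account_id": "accounts"},
}


def generation_order(fks):
    """Return table names so that every parent comes before its children."""
    graph = {table: set(parents.values()) for table, parents in fks.items()}
    return list(TopologicalSorter(graph).static_order())


def assign_primary_keys(rows, pk):
    """Give each synthetic row a fresh sequential primary key."""
    return [{**row, pk: i} for i, row in enumerate(rows, start=1)]


def flat_remap(child_rows, fk, parent_rows, pk, rng):
    """Point each child row at an existing parent key (integrity only)."""
    parent_ids = [row[pk] for row in parent_rows]
    return [{**row, fk: rng.choice(parent_ids)} for row in child_rows]


order = generation_order(foreign_keys)
rng = random.Random(0)  # seeded so the sketch is reproducible
accounts = assign_primary_keys(
    [{"balance": 10.0}, {"balance": 25.5}], "account_id"
)
transactions = flat_remap(
    [{"amount": 3.0}, {"amount": 7.5}, {"amount": 1.2}],
    "account_id", accounts, "account_id", rng,
)
```

Because both transactions and loans depend only on accounts, any order that places accounts first is valid. The flat remap guarantees that every child key exists in the parent but does not preserve per-entity patterns, which is the trade-off the sequential mode addresses.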
API Redesign
- foreign_keys parameter accepts Table objects instead of string references: IDE autocomplete, type-safe, typo → immediate error
- context_key removed: sequential generation is automatic when foreign_keys is set
- Model.create(): foreign_key parameter (renamed from group_by), parent_key inference from FK column name
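
To show why object references surface typos earlier than string references, here is a small sketch. SketchTable is a hypothetical stand-in written for this example, not dataxid.Table, and the registry dict is only an assumed model of how string references might be resolved.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class SketchTable:
    """Hypothetical stand-in for a typed table definition (not dataxid.Table)."""
    name: str
    primary_key: Optional[str] = None
    foreign_keys: Dict[str, "SketchTable"] = field(default_factory=dict)


accounts = SketchTable("accounts", primary_key="account_id")

# Object reference: the parent is passed directly, so nothing is looked up later.
transactions = SketchTable("transactions", foreign_keys={"account_id": accounts})

# String reference: a misspelled name is still a valid string and is only
# detected when something finally resolves it against the known tables.
registry = {"accounts": accounts}
misspelled = "acounts"
string_typo_found_late = misspelled not in registry

# Object reference with the same typo: Python raises NameError on this line.
try:
    SketchTable("loans", foreign_keys={"account_id": acounts})  # deliberate typo
    object_typo_caught = False
except NameError:
    object_typo_caught = True
```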
Developer Experience
- Logging: dataxid.enable_logging() / dataxid.disable_logging() / DATAXID_LOG environment variable
- ModelConfig: typed dataclass with IDE-discoverable fields, replaces untyped config dict (dict still works)
- Datetime auto-detection: string columns with datetime-like names are automatically encoded as datetime
- TrainingTimeoutError: raised when server-side training exceeds timeout
- TrainingError: raised when training fails on the server
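
The "typed dataclass, dict still works" pattern can be sketched as follows. SketchConfig, its fields max_epochs and batch_size, and coerce_config are all hypothetical names chosen for illustration; the actual ModelConfig fields are not listed in this changelog.

```python
from dataclasses import dataclass, fields
from typing import Any, Mapping, Union


@dataclass
class SketchConfig:
    # Hypothetical fields, for illustration only
    max_epochs: int = 100
    batch_size: int = 256


def coerce_config(
    config: Union[SketchConfig, Mapping[str, Any], None]
) -> SketchConfig:
    """Accept a typed config, a plain dict, or None, and return a typed config."""
    if config is None:
        return SketchConfig()
    if isinstance(config, SketchConfig):
        return config
    known = {f.name for f in fields(SketchConfig)}
    unknown = set(config) - known
    if unknown:
        # A misspelled key fails loudly instead of being silently ignored.
        raise TypeError(f"unknown config keys: {sorted(unknown)}")
    return SketchConfig(**config)
```

A design along these lines keeps existing dict-based callers working while giving new code autocomplete and early detection of misspelled keys.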
Migration from 0.1.0
No breaking changes. All 0.1.0 code works unchanged.
import dataxid

# 0.1.0: still works
synthetic = dataxid.synthesize(data=df, n_samples=1000)

# 0.2.0: new capabilities
from dataxid import Table

# Define the parent once so the child can reference the same Table object
accounts_tbl = Table(accounts_df, primary_key="account_id")
synthetic = dataxid.synthesize_tables({
    "accounts": accounts_tbl,
    "transactions": Table(transactions_df, foreign_keys={"account_id": accounts_tbl}),
})

0.1.0 (2026-03-19)
Initial release.
- dataxid.synthesize(): single-table synthetic data generation in one call
- dataxid.Model.create() / model.generate(): step-by-step control for large datasets and custom config
- Privacy by architecture: raw data never leaves your machine, only embeddings (64 floats/row) cross the API boundary
- Error handling: typed exception hierarchy (AuthenticationError, RateLimitError, QuotaExceededError, etc.)
- Automatic retries: exponential backoff for 5xx errors and rate limits
- DATAXID_API_KEY environment variable support
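
For readers unfamiliar with exponential backoff, the retry behaviour named above can be sketched generically. This is not the client's actual retry code: TransientError is a hypothetical stand-in for a 5xx response or a rate-limit error, and the retry count, base delay, and cap are assumed values.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for a 5xx response or a rate-limit error."""


def backoff_delays(max_retries, base=0.5, cap=30.0):
    """Delays double on each attempt and are capped at `cap` seconds."""
    return [min(cap, base * 2 ** attempt) for attempt in range(max_retries)]


def call_with_retries(func, max_retries=4, base=0.5, cap=30.0, sleep=time.sleep):
    """Call `func`, retrying on TransientError with exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except TransientError:
            if attempt == max_retries:
                raise  # retries exhausted, surface the last error
            sleep(min(cap, base * 2 ** attempt))
```

The sleep function is injectable so the schedule can be inspected without real waiting. Production clients often add random jitter to the delay, which this sketch omits for clarity.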