One grammar, many dialects
Datoria parses, renders, and analyzes 15 SQL dialects from a single grammar definition. Each dialect gets its own generated parser, lexer, renderer, and typed AST. They share a consistent API and the same semantic analysis tools.
One grammar, all dialects
All 15 dialects live in one grammar. Dialect-specific syntax is gated by predicates: when(bigquery) activates BigQuery's STRUCT syntax, when(snowflake) activates QUALIFY, and so on. At codegen time the predicates are evaluated, and each dialect gets a specialized parser with no runtime branching.
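To make the predicate idea concrete, here is a minimal sketch of codegen-time specialization. It is not Datoria's grammar format or API: `Alternative`, `SHARED_GRAMMAR`, and `specialize` are hypothetical names invented for illustration, and the dialect sets attached to each predicate are illustrative rather than complete. The sketch only shows predicates being evaluated once per dialect, before any parser exists.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Alternative:
    """One alternative of a grammar rule (hypothetical model, not Datoria's)."""
    rule: str                          # rule this alternative belongs to
    production: str                    # right-hand side, kept as plain text
    when: Optional[frozenset] = None   # dialects that enable it; None = shared


# A tiny shared grammar. Dialect-specific syntax carries a predicate.
# The dialect sets below are illustrative, not a statement of real coverage.
SHARED_GRAMMAR = [
    Alternative("select", "SELECT select_list FROM table_ref"),
    Alternative("select_tail", "WHERE expr"),
    Alternative("select_tail", "QUALIFY expr", when=frozenset({"snowflake"})),
    Alternative("type", "identifier"),
    Alternative("type", "STRUCT '<' field_list '>'", when=frozenset({"bigquery"})),
]


def specialize(grammar: List[Alternative], dialect: str) -> List[Alternative]:
    """Evaluate every predicate once for `dialect` and strip it from the result.

    The returned grammar contains no predicates, so a parser generated from it
    has nothing left to branch on at runtime.
    """
    return [
        Alternative(alt.rule, alt.production)
        for alt in grammar
        if alt.when is None or dialect in alt.when
    ]


snowflake = specialize(SHARED_GRAMMAR, "snowflake")
bigquery = specialize(SHARED_GRAMMAR, "bigquery")
```

In this sketch the Snowflake grammar keeps the `QUALIFY` alternative and drops `STRUCT`, the BigQuery grammar does the reverse, and neither carries a predicate any more. That is the single-grammar, per-dialect codegen design described above.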
This approach differs architecturally from other multi-dialect parsers. SQLGlot maintains 31 hand-written dialect overrides. SDF/Fusion uses separate ANTLR grammars per dialect. Datoria writes the grammar once and generates the rest.
The practical consequences:
- Adding a dialect is fast. Redshift was added in a day, SQLite in a week. You add predicates, not a new parser.
- Bug fixes propagate. Fix a CTE parsing issue in the shared grammar and it's fixed in all 15 dialects.
- Consistent API. The interface hierarchy, optimizer pipeline, formatter config, and error recovery work identically across all dialects.
Tests are identity roundtrips: parse SQL, render it back, verify byte-identical output. The corpus comes from 34+ real-world sources including PostgreSQL's pg_regress suite, the DuckDB test suite, Apache Spark tests, and more. These tests are not synthetic: they are the ones the database vendors themselves use.
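An identity roundtrip check can be sketched in a few lines. The harness below is an assumption about the shape of such a test, not Datoria's test code: `roundtrip_failures` is a hypothetical helper, and `toy_parse` / `toy_render` are trivial stand-ins for a real parser and renderer so that the example runs on its own.

```python
import re
from typing import Callable, Iterable, List, Tuple


def roundtrip_failures(
    corpus: Iterable[str],
    parse: Callable[[str], object],
    render: Callable[[object], str],
) -> List[Tuple[str, str]]:
    """Return (original, rendered) pairs whose bytes differ after parse + render."""
    failures = []
    for sql in corpus:
        rendered = render(parse(sql))
        # Compare encoded bytes, so whitespace, case, and comments all count.
        if rendered.encode("utf-8") != sql.encode("utf-8"):
            failures.append((sql, rendered))
    return failures


# Stand-ins for a real parser and renderer (hypothetical, for illustration only).
def toy_parse(sql: str) -> List[str]:
    # Split into alternating runs of whitespace and non-whitespace, losing nothing.
    return re.findall(r"\s+|\S+", sql)


def toy_render(tokens: List[str]) -> str:
    return "".join(tokens)


corpus = ["SELECT  a\nFROM t  -- trailing comment", "SELECT 1"]

# A lossless parse/render pair reproduces every input exactly.
lossless = roundtrip_failures(corpus, toy_parse, toy_render)

# A renderer that normalizes whitespace fails the first statement.
lossy = roundtrip_failures(corpus, str.split, " ".join)
```

Here `lossless` is empty, while `lossy` contains the first statement, because collapsing its double space and newline changes the bytes. Byte-identical comparison is strict by design: any formatting detail the parser drops shows up as a failure.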