One grammar, many dialects

Datoria parses, renders, and analyzes 15 SQL dialects from a single grammar definition. Each dialect gets its own generated parser, lexer, renderer, and typed AST. They share a consistent API and the same semantic analysis tools.

One grammar, all dialects

All 15 dialects live in one grammar. Dialect-specific syntax is gated by predicates: when(bigquery) activates BigQuery's STRUCT syntax, when(snowflake) activates QUALIFY, and so on. At codegen time the predicates are evaluated, and each dialect gets a specialized parser with no runtime branching on dialect.

That's architecturally different from every other multi-dialect parser. SQLGlot maintains 31 hand-written dialect overrides. SDF/Fusion uses separate ANTLR grammars per dialect. Datoria writes the grammar once and generates the rest.
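To make the predicate idea concrete, here is a minimal Python sketch of how dialect-gated alternatives could be resolved ahead of time. It is not Datoria's grammar format or API: the Alt class, the GRAMMAR table, the rule names, and the specialize function are all hypothetical, and the gating only mirrors the two examples in the text rather than stating which dialects support which syntax.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class Alt:
    """One alternative of a grammar rule, optionally gated to some dialects."""
    production: str
    when: Optional[FrozenSet[str]] = None  # None means shared by every dialect


# Hypothetical shared grammar: rule name -> alternatives.
# The gating below is illustrative only.
GRAMMAR: Dict[str, List[Alt]] = {
    "select_tail": [
        Alt("where_clause"),
        Alt("group_by_clause"),
        Alt("qualify_clause", when=frozenset({"snowflake"})),
    ],
    "type_expr": [
        Alt("scalar_type"),
        Alt("struct_type", when=frozenset({"bigquery"})),
    ],
}


def specialize(grammar: Dict[str, List[Alt]], dialect: str) -> Dict[str, List[str]]:
    """Evaluate every predicate once for a single dialect.

    The result contains no predicates at all, so a parser generated
    from it never has to check the dialect while parsing.
    """
    return {
        rule: [a.production for a in alts if a.when is None or dialect in a.when]
        for rule, alts in grammar.items()
    }


snowflake = specialize(GRAMMAR, "snowflake")
bigquery = specialize(GRAMMAR, "bigquery")
sqlite = specialize(GRAMMAR, "sqlite")
```

In this sketch, a fix to a shared alternative such as where_clause lands in every specialized grammar, while a new dialect only needs its name added to the relevant when sets.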

The practical consequences:

Adding a dialect is fast. Redshift was added in a day, SQLite in a week. You add predicates, not a new parser.
Bug fixes propagate. Fix a CTE parsing issue in the shared grammar and it's fixed in all 15 dialects.
Consistent API. The same interface hierarchy, optimizer pipeline, formatter config, and error recovery work identically across all dialects.

Tests are identity roundtrips: parse SQL, render it back, verify byte-identical output. The corpus comes from 34+ real-world sources, including PostgreSQL's pg_regress suite, the DuckDB test suite, and Apache Spark tests. It is not synthetic: these are the tests the database vendors themselves use.
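An identity roundtrip check can be sketched in a few lines of Python. The harness below takes any parse and render callables. The toy_parse, toy_render, and lossy_render functions are stand-ins invented for this example (a whitespace-preserving tokenizer, not a real SQL parser) and say nothing about Datoria's actual test harness or API.

```python
import re
from typing import Callable, Iterable, List, Tuple

# Toy stand-in: split into whitespace runs, words, and single characters.
# Every character lands in exactly one token, so nothing is dropped.
TOKEN = re.compile(r"\s+|\w+|.", re.DOTALL)


def toy_parse(sql: str) -> List[str]:
    return TOKEN.findall(sql)


def toy_render(tree: List[str]) -> str:
    return "".join(tree)


def lossy_render(tree: List[str]) -> str:
    # Deliberately normalizes whitespace, which breaks byte identity.
    return " ".join(tok for tok in tree if not tok.isspace())


def roundtrip_failures(
    corpus: Iterable[str],
    parse: Callable[[str], List[str]],
    render: Callable[[List[str]], str],
) -> List[Tuple[str, str]]:
    """Return (original, rendered) pairs whose bytes differ after a roundtrip."""
    failures = []
    for sql in corpus:
        rendered = render(parse(sql))
        if rendered.encode("utf-8") != sql.encode("utf-8"):
            failures.append((sql, rendered))
    return failures


corpus = [
    "SELECT 1",
    "SELECT  a,\n  b FROM t -- trailing comment\n",
]
exact = roundtrip_failures(corpus, toy_parse, toy_render)
lossy = roundtrip_failures(corpus, toy_parse, lossy_render)
```

Comparing bytes rather than parsed structure is the strict part: a renderer that collapses whitespace or drops a trailing newline, like lossy_render here, fails the check even though the SQL means the same thing.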
