dquery

An SQL compiler, generated from one grammar.

43 dialects of SQL, covering every common database and still growing, defined once as a declarative grammar. A code generator emits the parser, the renderer, and a full semantic layer: types, column lineage, and optimization. Engineered for performance and precision, so your product doesn't need an SQL compiler team.

43 Dialects

609k+ Tests

99.9% Roundtrip Pass Rate

44µs Per File

What it is

An SQL compiler, generated from a declarative grammar.

Not a hand-written parser, and not a wrapper around a database. SQL is described once as pure data — a declarative grammar — and a code generator emits the entire engine from it. That architecture is where the performance and the precision come from.

One declarative grammar

Every SQL construct is written once as pure data — no code, no functions — so the whole grammar can be analyzed statically. 43 dialects share it through inheritance: PostgreSQL extends ANSI, DuckDB extends PostgreSQL. Adding the next one is a grammar change, not a new parser — which is why coverage keeps growing.
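
As a rough illustration of "pure data plus inheritance", the sketch below stores each dialect as a dictionary with a parent and the rules it adds, then merges them from the root down. The rule names and the data layout are hypothetical and chosen only for the example; they are not dquery's actual grammar format.

```python
# Hypothetical layout: every dialect is plain data, a parent plus the rules it adds.
DIALECTS = {
    "ansi": {"parent": None, "rules": {
        "fetch_first": ["FETCH", "FIRST", "<n>", "ROWS", "ONLY"],
        "cast": ["CAST", "(", "<expr>", "AS", "<type>", ")"],
    }},
    "postgresql": {"parent": "ansi", "rules": {
        "limit": ["LIMIT", "<n>"],
        "cast_shorthand": ["<expr>", "::", "<type>"],
    }},
    "duckdb": {"parent": "postgresql", "rules": {
        "qualify": ["QUALIFY", "<expr>"],
    }},
}

def resolve(dialect):
    """Merge rules from the root ancestor down, so a child can add or override."""
    spec = DIALECTS[dialect]
    merged = dict(resolve(spec["parent"])) if spec["parent"] else {}
    merged.update(spec["rules"])
    return merged

duckdb_rules = resolve("duckdb")
print(sorted(duckdb_rules))
```

Because nothing here is executable logic, a tool can inspect the whole structure without running it, which is the property the paragraph above relies on.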

The parser is generated, not written

The grammar is the source of truth. A language-agnostic code generator reads it and emits the entire parser, lexer, typed AST, and renderer — none of it hand-written. Today it's a Java library, callable from any JVM language; Rust is next, with Python and TypeScript wrappers likely to follow.

A semantic layer on top

Above the parser: scope resolution, type inference, column-level lineage, and a multi-pass optimizer — all dialect-agnostic. Parse PostgreSQL, analyze it neutrally, render it as Snowflake.

Performance: 44µs / file

A JVM library that out-parses native code: 2.6x faster than sqlparser-rs (Rust) and 5.0x faster than libpg_query, the C parser inside PostgreSQL. Generated code, not interpreted — zero backtracking, with dispatch trees computed statically from the grammar.

Faster than hand-written parsers because the structure is decided at generation time, not at runtime.
A Java library today, callable from any JVM language; Rust is next, with native and WebAssembly builds underway.
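
To make "dispatch computed statically from the grammar" concrete, here is a minimal one-level sketch: a lookup table from first keyword to rule is built once, before any input is seen, and ambiguous grammars are rejected at build time. The grammar, rule names, and function names are assumptions for illustration; a real dispatch tree would branch on more than one token.

```python
# Hypothetical grammar: each statement alternative is a sequence of tokens.
GRAMMAR = {
    "select_stmt": ["SELECT", "<columns>", "FROM", "<table>"],
    "insert_stmt": ["INSERT", "INTO", "<table>"],
    "delete_stmt": ["DELETE", "FROM", "<table>"],
}

def build_dispatch(grammar):
    """Precompute first-token -> rule; fail early if two rules would collide."""
    table = {}
    for rule, tokens in grammar.items():
        first = tokens[0]
        if first in table:
            raise ValueError(f"{rule} and {table[first]} both start with {first}")
        table[first] = rule
    return table

DISPATCH = build_dispatch(GRAMMAR)  # built once, ahead of parsing

def choose_rule(sql):
    """Pick a rule with one dictionary lookup; no alternative is tried and undone."""
    words = sql.split()
    first = words[0].upper() if words else ""
    return DISPATCH.get(first)

print(choose_rule("select a from t"))
```

The point of the design is that the decision about which rule to try is made when the table is built, so the parser never has to attempt one alternative and back out of it.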

Precision: 99.9% pass

Lossless roundtripping: parse and re-render byte-for-byte, every token preserved — keywords, punctuation, even whitespace and comments.

609,098 identity tests from 47 independent sources, parsed, rendered, and re-parsed.
Fully typed AST — 10k+ immutable node types, not untyped dictionaries.
Column lineage and type inference that survive CTEs, subqueries, and window functions.
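
One way to picture lossless roundtripping is a tokenizer that assigns every character, including whitespace and comments, to some token, so that concatenating the token texts reproduces the input exactly. The sketch below assumes a deliberately tiny token set and is not dquery's lexer.

```python
import re

# Every character of the input lands in exactly one token, trivia included.
TOKEN = re.compile(r"""
    (?P<comment>--[^\n]*|/\*.*?\*/)
  | (?P<space>\s+)
  | (?P<string>'(?:[^']|'')*')
  | (?P<word>[A-Za-z_][A-Za-z_0-9]*)
  | (?P<number>\d+(?:\.\d+)?)
  | (?P<punct>.)
""", re.VERBOSE | re.DOTALL)

def tokenize(sql):
    """Return (kind, text) pairs; the catch-all 'punct' ensures nothing is skipped."""
    return [(m.lastgroup, m.group()) for m in TOKEN.finditer(sql)]

def render(tokens):
    """Re-render by concatenating token texts unchanged."""
    return "".join(text for _, text in tokens)

sql = "SELECT  a, -- pick a\n  'it''s' /* note */ FROM t;"
assert render(tokenize(sql)) == sql
```

Keeping comments and whitespace as tokens, rather than discarding them in the lexer, is what lets a renderer reproduce the original bytes.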

Try it

Types, lineage, and AST, live

Edit the SQL in the playground. The analyzer returns inferred types (with nullability, VARIANT, and struct types), column-level lineage through CTEs and JOINs, and the full typed AST. Switch dialects to see how parsing differs.

What you can build

Anything that has to understand SQL.

The hard part — reading SQL precisely, across 43 dialects, with types and lineage — is done. Anything that has to read, check, transform, or trace SQL becomes a product you can build on top. A few of the directions it opens:

AI agents & copilots

Give an agent real tools for SQL: a column’s full lineage to reason about a query, formatting, and verification of the SQL it generates — checked before it ever runs.

Dialect migration

Datoria parses every major dialect today — Snowflake, Redshift, BigQuery, MySQL. Migration becomes paste-and-go for your users, not a six-month consulting engagement.

Pipeline & dbt lineage

Stitch every transformation — raw, templated, or dbt — into one typed graph. Trace any column end to end, and catch type and breaking changes before they ship.

Query-log analysis

Which of last week’s 80,000 queries touch users.email? Seconds, not a manual audit. Pattern mining, regression detection, and BI lineage all fall out of it.

Smarter SQL editors

Column-aware autocomplete, semantic errors, jump-to-definition. The understanding layer ships today — wire it into VS Code, JetBrains, or your own IDE.

Schema evolution

Before a migration applies, flag the views that will break, the policy referencing a renamed column, the ALTER that isn’t type-compatible. Before, not after.

The first product built on it is dfmt, our free SQL formatter.

The coverage board

Full dialects. All 43 of them.
609,098 statements, byte for byte.

A box is not a demo — it is a complete SQL dialect, DDL through window functions through vendor quirks, proven by parse → render → re-parse roundtrips on its own corpus — including the official PostgreSQL and Google ZetaSQL compliance suites. A line groups dialects sharing one grammar; +N engines are the compatible services each also answers as. Gold means 100%.

Dialect	Pass rate	Tests	Compatible engines
ClickHouse	100%	147,669	+1 engine
IBM DB2	100%	27,243
SQLite	100%	13,672
Snowflake	100%	3,396
Oracle	100%	2,393
Amazon Redshift	100%	968
ANSI SQL	98.9%	714
PostgreSQL	100%	116,581	+7 engines
YugabyteDB	100%	63,036
CockroachDB	96.9%	3,605
Apache Hive	99.1%	55,424
Spark SQL	100%	16,228
Apache Impala	99.8%	15,466
Databricks SQL	100%	271
StarRocks	100%	15,195
Apache Doris	100%	13,815
MariaDB	100%	11,872
MySQL	100%	10,044	+1 engine
SingleStore	100%	9,910
TiDB	94.1%	3,838
PlanetScale	100%	157
DuckDB	100%	46,313
MotherDuck	100%	59
Trino	100%	6,116
Presto	100%	4,510
Amazon Athena	100%	162
DuneSQL	100%	29
Starburst	100%	48
BigQuery	100%	10,270
Cloud Spanner	96.7%	457
T-SQL	100%	9,445	+1 engine
Azure Synapse	100%	42
Microsoft Fabric	100%	150
PromQL	100%	2,052
KQL	100%	1,771
GraphQL	100%	3

Performance

44µs per file

22.9x faster than SQLGlot. No backtracking, linear time, JIT-friendly generated code.

Parser	Time per file	Relative to Datoria
Druid (JVM) (12/15)	27µs	0.6x
Datoria (JVM)	44µs	1.0x
jOOQ (JVM)	103µs	2.4x
sqlparser-rs	114µs	2.6x
Polyglot	115µs	2.6x
sqloxide	168µs	3.8x
Trino (JVM) (14/15)	175µs	4.0x
Flink SQL (JVM) (6/15)	200µs	4.6x
Spark (JVM)	208µs	4.8x
pg_query	217µs	5.0x

1000 iterations, 15 files, Apple M1 Max.

The stack

From grammar to column lineage in one pipeline

Grammar DSL

One grammar definition with dialect predicates. Pure data, no embedded functions.

Code Generator

Generates parser, lexer, 10,814 immutable typed AST interfaces, and a renderer per dialect.

dbt & Jinja

Full Jinja evaluator with ref(), source(), var(), config(). Compiles 164 public dbt projects (27,665 models) end-to-end.

Scope Resolution

Fully qualified references across nested CTEs, subqueries, and lateral joins.
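
A minimal sketch of the idea, assuming a simple chain of scopes: an unqualified column is looked up in the innermost query level first and then in each enclosing level, which is how a correlated subquery can see an outer table. The class and method names are hypothetical and do not reflect dquery's API.

```python
class Scope:
    """One query level: relations visible in its FROM clause, plus an outer scope."""

    def __init__(self, relations, parent=None):
        self.relations = relations  # alias -> set of column names
        self.parent = parent

    def qualify(self, column):
        """Return 'alias.column', searching the innermost scope first."""
        owners = [alias for alias, cols in self.relations.items() if column in cols]
        if len(owners) > 1:
            raise LookupError(f"ambiguous column {column}: {sorted(owners)}")
        if owners:
            return f"{owners[0]}.{column}"
        if self.parent is None:
            raise LookupError(f"unknown column {column}")
        return self.parent.qualify(column)

# SELECT ... FROM users u WHERE EXISTS (SELECT 1 FROM orders o WHERE user_id = id)
outer = Scope({"u": {"id", "email"}})
inner = Scope({"o": {"order_id", "user_id"}}, parent=outer)
print(inner.qualify("user_id"), inner.qualify("id"))
```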

Type Inference

20+ rule types, 134 function signatures, dialect-aware coercion, nullable tracking.
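
As an illustration of coercion combined with nullable tracking, the sketch below infers the type of `left + right`: the result widens to the larger numeric type and is nullable if either operand is. The widening order is an assumption made for the example; real dialects differ, which is why the inference has to be dialect-aware.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SqlType:
    name: str
    nullable: bool

# Hypothetical widening order, for illustration only.
NUMERIC_RANK = {"INTEGER": 0, "BIGINT": 1, "DECIMAL": 2, "DOUBLE": 3}

def infer_add(left, right):
    """Type of `left + right`: widest numeric type; nullable if either side is."""
    if left.name not in NUMERIC_RANK or right.name not in NUMERIC_RANK:
        raise TypeError(f"cannot add {left.name} and {right.name}")
    wider = max(left.name, right.name, key=NUMERIC_RANK.get)
    return SqlType(wider, left.nullable or right.nullable)

print(infer_add(SqlType("INTEGER", False), SqlType("DOUBLE", True)))
```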

Column Lineage

One-pass DAG traces columns through JOINs, CTEs, window functions, and star expansion. Tested across 164 public dbt projects (27,665 models).
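
The following sketch shows the shape of a lineage trace through a CTE and a JOIN. It assumes the query has already been parsed and scope-resolved into a mapping from each output column to the columns it reads; that mapping, and the names `DERIVED` and `trace`, are hypothetical. Memoization means each node of the graph is expanded only once.

```python
# Hypothetical, pre-resolved form of:
#   WITH recent AS (SELECT id, email AS contact FROM users)
#   SELECT r.contact, o.total FROM recent r JOIN orders o ON o.user_id = r.id
DERIVED = {
    "recent": {"id": [("users", "id")], "contact": [("users", "email")]},
    "result": {"contact": [("recent", "contact")], "total": [("orders", "total")]},
}

def trace(relation, column, cache=None):
    """Return the set of base-table columns that feed relation.column."""
    cache = {} if cache is None else cache
    key = (relation, column)
    if key in cache:
        return cache[key]
    if relation not in DERIVED:  # a base table: lineage stops here
        return {key}
    sources = set()
    for src_relation, src_column in DERIVED[relation][column]:
        sources |= trace(src_relation, src_column, cache)
    cache[key] = sources
    return sources

print(trace("result", "contact"))
```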

Error Recovery

Partial parses with precise error positions. Never crashes on broken or incomplete SQL.

Query Optimizer

15-pass pipeline: qualify, simplify, pushdown, unnest, merge, eliminate.
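
A toy version of a pass pipeline, assuming expressions are nested tuples: each pass is a function from expression to expression, and the optimizer applies them in order. The two passes shown (double-negation removal and boolean simplification) are illustrative stand-ins, not the passes dquery ships.

```python
# Expressions: ("col", name), ("lit", value), ("and", left, right), ("not", expr).

def eliminate_double_not(expr):
    """NOT NOT x -> x."""
    if expr[0] == "not":
        inner = eliminate_double_not(expr[1])
        return inner[1] if inner[0] == "not" else ("not", inner)
    if expr[0] == "and":
        return ("and", eliminate_double_not(expr[1]), eliminate_double_not(expr[2]))
    return expr

def simplify(expr):
    """x AND TRUE -> x, x AND FALSE -> FALSE (both hold under SQL's NULL logic)."""
    if expr[0] == "and":
        left, right = simplify(expr[1]), simplify(expr[2])
        for a, b in ((left, right), (right, left)):
            if a == ("lit", True):
                return b
            if a == ("lit", False):
                return ("lit", False)
        return ("and", left, right)
    if expr[0] == "not":
        return ("not", simplify(expr[1]))
    return expr

PASSES = [eliminate_double_not, simplify]

def optimize(expr):
    """Run every pass in order, feeding each the previous pass's output."""
    for run_pass in PASSES:
        expr = run_pass(expr)
    return expr

print(optimize(("and", ("not", ("not", ("col", "active"))), ("lit", True))))
```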

SQL Formatter

Adaptive formatting with trivia-signal detection. 20 config options, Jinja-aware.

dbt & Jinja

164 public dbt projects. 27,665 models. Zero errors.

We don't shell out to dbt or rely on regex. We wrote our own Jinja parser, evaluator, and dbt project loader, because you can't analyze a template you can't evaluate.

Jinja parser and evaluator

A purpose-built Jinja engine with its own generated parser (same grammar IR as the SQL parser). Handles macros, filters, control flow, and nested expressions, with proper scoping.

dbt project loader

Reads dbt_project.yml, resolves ref() and source(), loads seeds and schemas, and builds the DAG. Every model is compiled to pure SQL, then parsed, type-checked, and traced for lineage.
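
The DAG-building step can be sketched with Python's standard `graphlib`. The example assumes the `ref()` dependencies of each model have already been collected by evaluating its Jinja; the project contents and the `compile_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical project: model name -> models it references through ref().
REFS = {
    "stg_users": set(),
    "stg_orders": set(),
    "orders_enriched": {"stg_orders", "stg_users"},
    "revenue_daily": {"orders_enriched"},
}

def compile_order(refs):
    """Order models so each one comes after everything it references."""
    return list(TopologicalSorter(refs).static_order())

print(compile_order(REFS))
```

`TopologicalSorter` raises `graphlib.CycleError` if models reference each other in a loop, which is a convenient place to report a broken project.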

Adapter support

dbt-utils, dbt-expectations, and the core adapter macros for Snowflake, BigQuery, PostgreSQL, Redshift, Spark, Databricks, and more. Each project declares its dialect, and the matching parser handles the compiled SQL.

Runtime macros

Some dbt macros call the warehouse at build time (run_query, execute). We analyze the SQL they intend to run statically, so your pipeline doesn't need live database credentials.

Grammar

One grammar, 43 dialects

Every SQL construct is defined once in a declarative grammar, with dialect predicates picking the variants. In the interactive grammar browser, click any element to drill in, and switch dialects to see how the syntax varies.

Comparison

How it stacks up

The other open-source SQL compilers (SQLGlot, SDF, SQLMesh) are now owned by Fivetran or dbt Labs. Datoria is the one that isn't, and the only engine pairing this dialect breadth with full semantic analysis.

Feature	Datoria	SQLGlot (Python, Fivetran-owned)	SDF/Fusion (Rust, dbt Labs-owned)	Calcite (Java, Apache)	jOOQ (Java, commercial)
SQL dialects	43, growing	24+	4	1 (ANSI)	20+
Parse speed	44µs	999µs	—	221µs	103µs
Roundtripping	Lossless	Lossy	Lossy	Lossy	Lossy
Typed AST	3k+ shared + 7k+ dialect types	Untyped dict	Rust structs	Java classes	Java DSL
Column lineage	Yes (O(n))	Limited	Yes	No	No
Type inference	Dialect-aware	Limited	Yes	Yes	Yes
dbt / Jinja	Built-in	No	Built-in	No	No
Error recovery	Yes	No	No	No	No
SQL formatter	Adaptive	Basic	No	No	No
Independence	Yes	Fivetran	dbt Labs	Apache	Data Geekery

Two ways in.

Try the formatter now. If you want the full engine, we're opening early access to a small number of companies, so get in touch.
