Error Recovery
Most SQL parsers treat a syntax error as total failure -- one unexpected token and you get nothing back. For any tool that needs to work with SQL as it's being written (IDEs, code agents, autocomplete) or with messy real-world SQL (migration analysis, batch linting), this is a dealbreaker.
Datoria's parser handles broken, incomplete, and malformed SQL gracefully -- it produces partial ASTs with precise error locations instead of failing outright. The valid part of a query, often the large majority of it, retains its full structure, typing, and lineage capability.
What You Get
- Partial ASTs -- the correctly-parsed portions of a query are represented as normal typed AST nodes; only the broken portions become error nodes
- Precise error positions -- each error carries an exact source position (or span), not just a line number
- Multiple errors per statement -- the parser recovers and continues, finding as many errors as possible in a single pass
- All tokens preserved -- even tokens inside error spans are retained in the AST, so the representation stays lossless
- Error messages -- each error node carries a diagnostic message describing what went wrong
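To make the properties above concrete, here is a minimal Python sketch of how a partial AST with error nodes could be modelled. It is not Datoria's actual API: the names `Token`, `Node`, `ErrorNode`, `source_text`, and `collect_errors` are hypothetical and chosen only for illustration. The sketch assumes each token keeps its exact source text (including trailing whitespace), which is one common way to keep a tree lossless.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Token:
    text: str   # exact source text, including any trailing whitespace
    start: int  # character offset where the token begins


@dataclass
class ErrorNode:
    message: str  # diagnostic describing what went wrong
    start: int    # span start (inclusive character offset)
    end: int      # span end (exclusive character offset)
    tokens: List[Token] = field(default_factory=list)  # skipped tokens are kept


@dataclass
class Node:
    kind: str  # e.g. "Select", "Column"
    children: List[Union["Node", ErrorNode, Token]] = field(default_factory=list)


def source_text(node) -> str:
    """Rebuild the original text by concatenating every token in order."""
    if isinstance(node, Token):
        return node.text
    if isinstance(node, ErrorNode):
        return "".join(t.text for t in node.tokens)
    return "".join(source_text(child) for child in node.children)


def collect_errors(node) -> List[ErrorNode]:
    """Walk the tree and gather every error node in source order."""
    if isinstance(node, ErrorNode):
        return [node]
    if isinstance(node, Token):
        return []
    return [err for child in node.children for err in collect_errors(child)]


# Hand-built tree for a select list with a stray comma.
sql = "SELECT a ,, b"
tree = Node("Select", [
    Token("SELECT ", 0),
    Node("Column", [Token("a ", 7)]),
    Token(",", 9),
    ErrorNode("expected a column, found ','", 10, 11, [Token(", ", 10)]),
    Node("Column", [Token("b", 12)]),
])
errors = collect_errors(tree)
```

In this sketch the two valid columns remain ordinary typed nodes, the stray comma lives inside a single error node with an exact span, and `source_text(tree)` reproduces the input because no token was dropped.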
Try it in the playground -- the ANSI playground includes an error recovery example you can edit.
Example
Given this broken SQL with a typo in the WHERE clause:
SELECT customer_id, name, email
FROM customers
WHERE status = 'active'
ANND created_at > '2024-01-01'
ORDER BY name
The parser produces a full AST for the SELECT, FROM, and ORDER BY clauses. The WHERE clause contains a partial AST with the valid status = 'active' condition, plus an error node spanning ANND created_at > '2024-01-01' with a precise position pointing to the unexpected token ANND.
Most other parsers would reject the entire statement and return nothing -- losing the structural information about the valid 80% of the query.
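The behaviour described for this example can be imitated with a toy parser. The sketch below uses panic-mode recovery (skip to the next clause keyword after an unexpected token), which is a standard textbook technique; the source does not say which strategy Datoria actually uses, so treat this as an assumption. All names (`tokenize`, `parse_conditions`, `recover`, `parse`) are hypothetical, the grammar is drastically simplified, and the input is assumed to start with a clause keyword.

```python
import re
from dataclasses import dataclass
from typing import List

TOKEN_RE = re.compile(r"\s*('[^']*'|\w+|[<>=,]+)")
CLAUSE_STARTS = {"SELECT", "FROM", "WHERE", "ORDER"}  # synchronization points
OPS = {"=", "<", ">", "<=", ">=", "<>"}


@dataclass
class Tok:
    text: str
    start: int  # character offsets into the source
    end: int


@dataclass
class Error:
    message: str
    start: int
    end: int
    skipped: List[Tok]  # tokens inside the error span are kept, not dropped


def tokenize(sql):
    return [Tok(m.group(1), m.start(1), m.end(1)) for m in TOKEN_RE.finditer(sql)]


def at_clause_start(toks, i):
    return i >= len(toks) or toks[i].text.upper() in CLAUSE_STARTS


def recover(toks, i, message, errors):
    """Panic mode: skip to the next clause keyword and record what was skipped."""
    j = i
    while not at_clause_start(toks, j):
        j += 1
    skipped = toks[i:j]
    start = skipped[0].start if skipped else (toks[i - 1].end if i > 0 else 0)
    end = skipped[-1].end if skipped else start
    errors.append(Error(message, start, end, skipped))
    return j


def parse_conditions(toks, i):
    """Parse `cmp ((AND|OR) cmp)*`, where cmp is operand, operator, operand."""
    conditions, errors = [], []
    while True:
        if i + 2 < len(toks) and toks[i + 1].text in OPS:
            conditions.append(tuple(t.text for t in toks[i:i + 3]))
            i += 3
        else:
            i = recover(toks, i, "expected a comparison", errors)
            break
        if at_clause_start(toks, i):
            break
        if toks[i].text.upper() in {"AND", "OR"}:
            i += 1
            continue
        i = recover(toks, i, f"unexpected token {toks[i].text!r}", errors)
        break
    return conditions, errors, i


def parse(sql):
    toks = tokenize(sql)
    clauses, errors, i = {}, [], 0
    while i < len(toks):
        keyword = toks[i].text.upper()
        i += 1
        if keyword == "WHERE":
            clauses[keyword], errs, i = parse_conditions(toks, i)
            errors.extend(errs)
        else:
            # Other clauses: collect raw tokens up to the next clause keyword.
            j = i
            while not at_clause_start(toks, j):
                j += 1
            clauses[keyword] = [t.text for t in toks[i:j]]
            i = j
    return clauses, errors


SQL = (
    "SELECT customer_id, name, email\n"
    "FROM customers\n"
    "WHERE status = 'active'\n"
    "ANND created_at > '2024-01-01'\n"
    "ORDER BY name"
)
clauses, errors = parse(SQL)
```

Run on the query above, this sketch keeps the `status = 'active'` condition, records one error whose span covers `ANND created_at > '2024-01-01'`, and still parses the ORDER BY clause that follows. A production parser would build real expression trees and use finer-grained synchronization, but the shape of the result is the same.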
Zero Overhead on Valid SQL
Error recovery adds no performance cost when parsing valid SQL. The recovery mechanisms only activate when the parser encounters an unexpected token. On the happy path, parsing runs at full speed -- the same 56 microseconds per file reported in the benchmarks.
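One way to see why recovery can be free on valid input is to place the recovery code behind the branch that is only taken after a mismatch. The Python sketch below illustrates that structure with a hypothetical column-list parser and a counter; it is an assumption about the general design, not a description of Datoria's internals, and it makes no claim about the benchmark figure.

```python
stats = {"recovery_calls": 0}  # how many times the recovery path ran


def recover(tokens, i):
    """Cold path: skip ahead to the next comma (or the end of input)."""
    stats["recovery_calls"] += 1
    while i < len(tokens) and tokens[i] != ",":
        i += 1
    return i


def parse_column_list(tokens):
    """Parse comma-separated names; return (columns, error_count)."""
    columns, errors, i = [], 0, 0
    while i < len(tokens):
        if tokens[i].isidentifier():  # hot path: one check per token
            columns.append(tokens[i])
            i += 1
        else:  # only reached on an unexpected token
            errors += 1
            i = recover(tokens, i)
        if i < len(tokens) and tokens[i] == ",":
            i += 1  # consume the separator
    return columns, errors


valid = parse_column_list(["customer_id", ",", "name", ",", "email"])
calls_after_valid = stats["recovery_calls"]
broken = parse_column_list(["customer_id", ",", "1name", ",", "email"])
calls_after_broken = stats["recovery_calls"]
```

For the valid list the counter stays at zero, so the parser does exactly the work a non-recovering parser would do. Only the broken list enters `recover`, and it still yields the two well-formed columns.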