Type Inference
The type inference engine determines the data type of every expression in a SQL query -- from simple column references and literals to complex nested expressions involving functions, CASE, CAST, aggregates, and window functions.
Nullable-Aware Type System
Every inferred type carries nullability information:
| Type | Meaning |
|---|---|
| Non-nullable | Known data type, guaranteed non-null |
| Nullable | Known data type, may be null |
| Null | The NULL literal |
| Unknown | Type could not be resolved (with diagnostic reason) |
This four-state model propagates through all expressions. A COALESCE(nullable_col, 0) produces a non-nullable result -- because at least one argument (0) is non-nullable, the output is guaranteed non-null. A LEFT JOIN column becomes Nullable even if the source column is non-nullable. A CASE with no ELSE clause is Nullable.
Data Types
The type system includes 100+ data types covering all 15 dialects:
- Numeric -- TinyInt, SmallInt, Integer, BigInt, Int128, Decimal, BigDecimal, Float32, Float64
- String -- Char, NChar, VarChar, NVarChar, Text
- Temporal -- Date, Time, DateTime, Timestamp, Timestamptz, Timetz, Interval
- Boolean -- Boolean
- Binary -- Binary, VarBinary, Blob
- Complex -- Array, Struct, Map (with recursive element types)
- Dialect-specific -- JSON, JSONB (PostgreSQL), SUPER (Redshift), GEOGRAPHY (BigQuery), VARIANT (Snowflake)
Function Signatures
134+ built-in ANSI function signatures are registered, with dialect-aware extensions. Each function signature defines how input types map to output types:
- Fixed return --
COUNT(*)always returns Integer - Passthrough --
ABS(x)returns the same type as its argument - Coercion --
COALESCE(a, b)returns the least upper bound of all argument types - Math-aware -- division of two integers produces a floating-point result
- Conditional --
IF(cond, then, else)coerces the then/else branches - Constructor --
ARRAY[1, 'a']infers the array element type from its contents - Aggregate --
SUM(integer_col)widens to BigInt;SUM(decimal_col)preserves Decimal
Type Coercion
When types need to be combined (UNION columns, CASE branches, COALESCE arguments), the engine applies coercion hierarchies:
- Numeric rank -- TinyInt < SmallInt < Integer < BigInt < Int128 < Decimal < BigDecimal < Float32 < Float64
- Temporal -- Date widens to DateTime widens to Timestamp widens to Timestamptz
- String -- Text > VarChar > Char > NVarChar > NChar
- Parameterized -- Decimal precision/scale, VarChar length, and Timestamp precision are merged to the wider variant
- Complex -- Array and Struct element types are coerced recursively
What It Powers
Type inference is not just informational -- it enables:
- Column lineage accuracy -- knowing that
SUM(a)produces BigInt vs Decimal affects downstream type propagation - Optimization safety -- the optimizer uses types to determine whether simplifications preserve semantics (e.g.,
x = NULLcan only be simplified if the nullability is known) - Schema validation -- verify that UNION branches, INSERT values, and CASE branches have compatible types
- IDE features -- type-aware auto-complete and inline type hints