SQL Nulls and Tristate Logic - Fooled Again
Categories: Programming
I work only intermittently with SQL, and every once in a while I fall into the old SQL tristate logic trap - like yesterday. It’s just not natural..
For those who don’t know, here’s a quick recap…
In most programming languages, boolean expressions have two possible outcomes: true or false.
In SQL, they have three: true, false, or unknown. Unknown and sql null are the same thing.
An SQL null can be considered to represent “missing data”; it is a placeholder indicating “this value is not known” - aka “unknown”. So for some integer column:
- (null == 1) –> unknown (the value might be 1 or not, we can’t say as we don’t have the data)
- (null != 1) –> unknown (also might or might not be the case, we don’t have the data)
The usual boolean and/or operators also have interesting behaviour:
- (true or null) –> true (defininitely true even when the second part of the or is an “unknown”)
- (true and null) –> unknown
- (false or null) –> unknown
- (false and null) –> false (definitely false even when the second part of the and is an “unknown”)
A SQL select statement’s WHERE clause is a boolean expression - and matches (selects) only records for which the where-clause returns true. Records for which the where-clause evaluates to false are ignored (not selected) - as are those records for which the boolean where-clause evaluates to “unknown”!
So given a table with (id integer not null, domain varchar) and contents:
- (id=1, domain=”vonos.net”)
- (id=2, domain=”foo.example”)
- (id=3, domain=null)
the following occurs:
- “
select id where domain='vonos.net'
” returns just (1) - expected by most programmers - “
select id where domain<>'vonos.net'
” returns just (2) - often not expected
In the second case, “null <> 'vonos.net'
” evaluates to unknown, which is not true so the select skips over it.
Possibly the most unintuitive part is that a matched pair of two “equals” and “not-equals” selects still don’t return the full set of data (the trap I just fell into). To match all records, a third statement is needed:
- “
select id where domain is null
”