SQL Nulls and Tristate Logic - Fooled Again

Categories: Programming

I work only intermittently with SQL, and every once in a while I fall into the old SQL tristate logic trap - like yesterday. It’s just not natural..

For those who don’t know, here’s a quick recap…

In most programming languages, boolean expressions have two possible outcomes: true or false.

In SQL, they have three: true, false, or unknown. Unknown and sql null are the same thing.

An SQL null can be considered to represent “missing data”; it is a placeholder indicating “this value is not known” - aka “unknown”. So for some integer column:

  • (null == 1) –> unknown (the value might be 1 or not, we can’t say as we don’t have the data)
  • (null != 1) –> unknown (also might or might not be the case, we don’t have the data)

The usual boolean and/or operators also have interesting behaviour:

  • (true or null) –> true (defininitely true even when the second part of the or is an “unknown”)
  • (true and null) –> unknown
  • (false or null) –> unknown
  • (false and null) –> false (definitely false even when the second part of the and is an “unknown”)

A SQL select statement’s WHERE clause is a boolean expression - and matches (selects) only records for which the where-clause returns true. Records for which the where-clause evaluates to false are ignored (not selected) - as are those records for which the boolean where-clause evaluates to “unknown”!

So given a table with (id integer not null, domain varchar) and contents:

  • (id=1, domain=”vonos.net”)
  • (id=2, domain=”foo.example”)
  • (id=3, domain=null)

the following occurs:

  • select id where domain='vonos.net'” returns just (1) - expected by most programmers
  • select id where domain<>'vonos.net'” returns just (2) - often not expected

In the second case, “null <> 'vonos.net'” evaluates to unknown, which is not true so the select skips over it.

Possibly the most unintuitive part is that a matched pair of two “equals” and “not-equals” selects still don’t return the full set of data (the trap I just fell into). To match all records, a third statement is needed:

  • select id where domain is null