Dear group,
I would like to ask a brief question about NULL values and the IN operator.
The following SQL evaluates to (with ANSI_NULLS ON):
1 IN (1, 2, 3, NULL) --> Evaluates to true (makes perfect sense).
NULL IN (1, 2, 3) --> Evaluates to unknown (makes perfect sense).
4 IN (1, 2, 3) --> Evaluates to false (makes perfect sense).
ButK
4 IN (1, 2, 3, NULL) --> Evaluates to unknown and not to false.
I read that most major databases do this exactly the same and I actually
was able to find something about this behavior in the PostGreSQL
documentation (I couldnt find anything about this in the BOL, so I hoped
that the PostgreSQL might apply)
The docs state that for the IN operator:
If there are no equal right-hand values and at least one right-hand row
yields null, the result of the IN construct will be null, not false. This
is in accordance with SQL's normal rules for Boolean combinations of null
values.
So, why does SQL compare to a NULL value (if present) when no matching
values can be found for the right hand of the IN construct?
Kind regards,
Marcel> 4 IN (1, 2, 3, NULL) --> Evaluates to unknown and not to false.
In this expression, we know for sure that 4 is not equal to 1, 2 or 3.
However, we can't say with certainty whether or not 4 is equal to or is not
equal to the unknown NULL value. The expression would be true if the
unknown value were 4 and would be false if the unknown value were 5.
Without certainty of the unknown value, standard SQL rules implemented by
various DBMS products return NULL rather than true or false.
Hope this helps.
Dan Guzman
SQL Server MVP
"Marcel van den Hof" <marcelvdh@.gmail.com> wrote in message
news:jetivog7pu0t$.jqeavwovs3nw.dlg@.40tude.net...
> Dear group,
> I would like to ask a brief question about NULL values and the IN
> operator.
> The following SQL evaluates to (with ANSI_NULLS ON):
> 1 IN (1, 2, 3, NULL) --> Evaluates to true (makes perfect sense).
> NULL IN (1, 2, 3) --> Evaluates to unknown (makes perfect sense).
> 4 IN (1, 2, 3) --> Evaluates to false (makes perfect sense).
> ButK
> 4 IN (1, 2, 3, NULL) --> Evaluates to unknown and not to false.
> I read that most major databases do this exactly the same and I actually
> was able to find something about this behavior in the PostGreSQL
> documentation (I couldnt find anything about this in the BOL, so I hoped
> that the PostgreSQL might apply)
> The docs state that for the IN operator:
> If there are no equal right-hand values and at least one right-hand row
> yields null, the result of the IN construct will be null, not false. This
> is in accordance with SQL's normal rules for Boolean combinations of null
> values.
> So, why does SQL compare to a NULL value (if present) when no matching
> values can be found for the right hand of the IN construct?
> Kind regards,
> Marcel|||On Sat, 6 Aug 2005 19:22:01 +0100, Marcel van den Hof wrote:
(snip)
>The docs state that for the IN operator:
>If there are no equal right-hand values and at least one right-hand row
>yields null, the result of the IN construct will be null, not false. This
>is in accordance with SQL's normal rules for Boolean combinations of null
>values.
Hi Marcel,
This is from the PostgreSQL docs you mentioned, I presume?
This behaviour is in compliance with the ANSI standard. Dan has already
explained the rationale. There is only one minor mistake in the
PostGreSQL doc - the result of the IN construct with a NULL at the
right-hand side is not NULL, but UNKNOWN.
This distinction IS relevant. Null means "no valid data". Unknown is
valid data in three-valued logic.
Of course, the PostgreSQL doc is a thousand times better than the SQL
Server Books Online. BOL states:
"If the value of test_expression is equal to any value returned by
subquery or is equal to any expression from the comma-separated list,
the result value is TRUE. Otherwise, the result value is FALSE.
Using NOT IN negates the returned value."
And that is not a minor mistake - it is just plain wrong.
Best, Hugo
--
(Remove _NO_ and _SPAM_ to get my e-mail address)|||> 4 IN (1, 2, 3, NULL) --> Evaluates to unknown and not to false.
Because we don't know the value represented by the NULL and therefore we
don't know whether 4 is in the list or not.
This is reasonably intuitive but ultimately you will search in vain for
satisfactory logic in SQL's handling of NULLs and three-value logic.
Consider the boolean expression:
(x=x) AND (y=y)
where x is NULL is y is non-NULL. The expected result is UNKNOWN, not TRUE.
This defies rational explanation. If the value of x is unknown then the one
thing we DO know for sure about x is that it is equal to itself! On the
other hand if the value x is deemed "inapplicable" then the comparison (x=x)
is surely a no-op and the rest of the expression should be evaluated without
it:
(y=y) = TRUE ... (but not in SQL).
Sorry, but the correct answer to your question is "because the SQL Standard
says so". :-)
David Portas
SQL Server MVP
--|||Dan, Hugo and David thank you for your very clear and concise answers. You
have really helped me to improve my understanding of the three valued logic
and NULL values that are used in SQL server. A pity the BOL documentation
is somewhat inaccurate about these important matters.
If I want to further my understanding about these matters then I suppose
the best place for me is to study the ANSI SQL 92/ 99 standard?
Any links or pointers to relevant documentation (that is accurate ;-)) are
greatly appreciated.
Once again, thanks for the prompt reply to my question.
Kind regards,
Marcel van den Hof|||Marcel van den Hof (marcelvdh@.gmail.com) writes:
> Dan, Hugo and David thank you for your very clear and concise answers.
> You have really helped me to improve my understanding of the three
> valued logic and NULL values that are used in SQL server. A pity the BOL
> documentation is somewhat inaccurate about these important matters.
I checked the SQL 2005 docs, and they are equally wrong. I submitted
a bug for this, although I believe it's too late for it to be fixed
for SQL 2005 RTM.
The bug is on
http://lab.msdn.microsoft.com/Produ...BK340
83
--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se
Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp|||On Sun, 7 Aug 2005 01:18:09 +0100, Marcel van den Hof wrote:
(snip)
>If I want to further my understanding about these matters then I suppose
>the best place for me is to study the ANSI SQL 92/ 99 standard?
Hi Marcel,
Not exactly. Studying the ANSI documentation is not a job for the faint
of heart. Seriously - they are written to define a standard, in as
concise a way as possible. They are not written to facilitate easy
understanding.
>Any links or pointers to relevant documentation (that is accurate ;-)) are
>greatly appreciated.
Most books are fairly accurate. Just keep in mind that all authors are
human, and humans can err. Also keep in mind that the more entry-level
books have to simplify things; books aimed at expert level will usually
present more of the fine details.
Best, Hugo
--
(Remove _NO_ and _SPAM_ to get my e-mail address)
No comments:
Post a Comment