我想知道为什么
BINARY_CHECKSUM
函数返回相同的不同结果:
SELECT *, BINARY_CHECKSUM(a,b) AS bc
FROM (VALUES(1, NULL, 100),
(2, NULL, NULL),
(3, 1, 2)) s(id,a,b);
SELECT *, BINARY_CHECKSUM(a,b) AS bc
FROM (VALUES(1, NULL, 100),
(2, NULL, NULL)) s(id,a,b);
Ouput:
+-----+----+------+-------------+
| id | a | b | bc |
+-----+----+------+-------------+
| 1 | | 100 | -109 |
| 2 | | | -2147483640 |
| 3 | 1 | 2 | 18 |
+-----+----+------+-------------+
-- -109 vs 100
+-----+----+------+------------+
| id | a | b | bc |
+-----+----+------+------------+
| 1 | | 100 | 100 |
| 2 | | | 2147483647 |
+-----+----+------+------------+
对于第二个样本,我得到了我预期的结果:
SELECT *, BINARY_CHECKSUM(a,b) AS bc
FROM (VALUES(1, 1, 100),
(2, 3, 4),
(3,1,1)) s(id,a,b);
SELECT *, BINARY_CHECKSUM(a,b) AS bc
FROM (VALUES(1, 1, 100),
(2, 3, 4)) s(id,a,b);
前两行的输出:
+-----+----+------+-----+
| id | a | b | bc |
+-----+----+------+-----+
| 1 | 1 | 100 | 116 |
| 2 | 3 | 4 | 52 |
+-----+----+------+-----+
db<>fiddle demo
当我想比较两个表/查询时,会产生奇怪的结果:
WITH t AS (
SELECT 1 AS id, NULL AS a, 100 b
UNION ALL SELECT 2, NULL, NULL
UNION ALL SELECT 3, 1, 2
), s AS (
SELECT 1 AS id ,100 AS a, NULL as b
UNION ALL SELECT 2, NULL, NULL
UNION ALL SELECT 3, 2, 1
)
SELECT t.*,s.*
,BINARY_CHECKSUM(t.a, t.b) AS bc_t, BINARY_CHECKSUM(s.a, s.b) AS bc_s
FROM t
JOIN s
ON s.id = t.id
WHERE BINARY_CHECKSUM(t.a, t.b) = BINARY_CHECKSUM(s.a, s.b);
db<>fiddle demo2
对于3行,我得到单个结果:
+-----+----+----+-----+----+----+--------------+-------------+
| id | a | b | id | a | b | bc_t | bc_s |
+-----+----+----+-----+----+----+--------------+-------------+
| 2 | | | 2 | | | -2147483640 | -2147483640 |
+-----+----+----+-----+----+----+--------------+-------------+
但对于2行,我也得到id=1:
+-----+----+------+-----+------+----+-------------+------------+
| id | a | b | id | a | b | bc_t | bc_s |
+-----+----+------+-----+------+----+-------------+------------+
| 1 | | 100 | 1 | 100 | | 100 | 100 |
| 2 | | | 2 | | | 2147483647 | 2147483647 |
+-----+----+------+-----+------+----+-------------+------------+
评论:
对于这个定义,我们说指定类型的空值,
以相等值进行比较。如果
表达式列表更改,表达式校验和也可以更改。
然而,这并不能保证。因此,要检测值是否
已更改,我们建议仅在
应用程序可以容忍偶尔错过的更改。
这对我来说很奇怪
哈希函数为相同的输入参数返回不同的结果
.
这种行为是设计出来的还是某种小故障?
编辑:
AS
@scsimon
指出它适用于物化表,但不适用于CTE。
db<>fiddle actual table
CTE元数据:
SELECT name, system_type_name
FROM sys.dm_exec_describe_first_result_set('
SELECT *
FROM (VALUES(1, NULL, 100),
(2, NULL, NULL),
(3, 1, 2)) s(id,a,b)', NULL,0);
SELECT name, system_type_name
FROM sys.dm_exec_describe_first_result_set('
SELECT *
FROM (VALUES(1, NULL, 100),
(2, NULL, NULL)) s(id,a,b)', NULL,0)
SELECT name, system_type_name
FROM sys.dm_exec_describe_first_result_set('
SELECT *
FROM (VALUES(1, cast(NULL as int), 100),
(2, NULL, NULL)) s(id,a,b)', NULL,0)
对于所有情况,所有列都是
INT
但是有明确的
CAST
它的行为应该如此。
db<>fidde metadata