Filter a SQL query by a unique set of column values, regardless of their order

oracle sql

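Kevin Babcock · Tech Community · 15 years ago

I have a table in Oracle with two columns, and I want to query for records containing a unique combination of values, regardless of the order of those values. For example, given the following table: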

create table RELATIONSHIPS (
    PERSON_1 number not null,
    PERSON_2 number not null,
    RELATIONSHIP  number not null,
    constraint PK_RELATIONSHIPS
        primary key (PERSON_1, PERSON_2)
);

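I want to query for all the unique relationships. So if I have a record for one pair of people, I do not want to see another record for the same two people with PERSON_1 and PERSON_2 swapped.

Is there an easy way to do this?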

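10 replies | last activity 15 years ago

Bill Karwin 15 years ago

It is unclear whether you want to prevent duplicates from being inserted into the database. You may just want to fetch unique pairs while keeping the duplicates.

So, for the latter case, here is an alternative solution that queries for unique pairs even when duplicates exist: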

SELECT r1.*
FROM Relationships r1
LEFT OUTER JOIN Relationships r2
  ON (r1.person_1 = r2.person_2 AND r1.person_2 = r2.person_1)
WHERE r1.person_1 < r1.person_2
  OR  r2.person_1 IS NULL;

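So if there is a matching row with the IDs reversed, there is a rule for which of the two rows the query should prefer: the one whose IDs are in ascending numerical order.

If there is no matching row, the r2 columns will be NULL (that is how an outer join works), so in that case the query just uses whatever is found in r1.

There is no need for GROUP BY or DISTINCT, because there can only be zero or one matching row.

Trying this in MySQL, I get the following optimization plan: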

+----+-------------+-------+--------+---------------+---------+---------+-----------------------------------+------+--------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref                               | rows | Extra                    |
+----+-------------+-------+--------+---------------+---------+---------+-----------------------------------+------+--------------------------+
|  1 | SIMPLE      | r1    | ALL    | NULL          | NULL    | NULL    | NULL                              |    2 |                          | 
|  1 | SIMPLE      | r2    | eq_ref | PRIMARY       | PRIMARY | 8       | test.r1.person_2,test.r1.person_1 |    1 | Using where; Using index | 
+----+-------------+-------+--------+---------------+---------+---------+-----------------------------------+------+--------------------------+

That seems like a reasonably good use of the index.
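To see which rows the outer-join query keeps, here is a minimal sketch that runs it against a few made-up rows. It assumes SQLite (through Python's `sqlite3` module) as a stand-in for Oracle; the sample data and the helper name `unique_pairs` are illustrative and not part of the original answer.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; the rows below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE relationships (
        person_1     INTEGER NOT NULL,
        person_2     INTEGER NOT NULL,
        relationship INTEGER NOT NULL,
        PRIMARY KEY (person_1, person_2)
    )
""")
conn.executemany(
    "INSERT INTO relationships VALUES (?, ?, 0)",
    [(1, 2), (2, 1), (1, 3), (4, 3)],  # (1,2) and (2,1) are the same pair
)

def unique_pairs(conn):
    """Return one (person_1, person_2) row per unordered pair."""
    return conn.execute("""
        SELECT r1.person_1, r1.person_2
        FROM relationships r1
        LEFT OUTER JOIN relationships r2
          ON (r1.person_1 = r2.person_2 AND r1.person_2 = r2.person_1)
        WHERE r1.person_1 < r1.person_2   -- mirrored pair exists: keep the ascending row
           OR r2.person_1 IS NULL         -- no mirrored row: keep the row as it is
        ORDER BY r1.person_1, r1.person_2
    """).fetchall()

print(unique_pairs(conn))
```

The row (2, 1) is dropped because its mirror (1, 2) exists and is in ascending order, while (4, 3) survives because no mirrored row exists for it.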

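Marc Gravell 15 years ago

Is the relationship always stored in both directions? That is, if John and Jill are related, are there always both a {John, Jill} row and a {Jill, John} row? If so, just restrict the query to rows where PERSON_1 < PERSON_2 and take the distinct set.

tekBlues 15 years ago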

select distinct
case when PERSON_1>=PERSON_2 then PERSON_1 ELSE PERSON_2 END person_a,
case when PERSON_1>=PERSON_2 then PERSON_2 ELSE PERSON_1 END person_b
FROM RELATIONSHIPS;

Rob van Wijk 15 years ago

Untested:

select least(person_1,person_2)
     , greatest(person_1,person_2)
  from relationships
 group by least(person_1,person_2)
     , greatest(person_1,person_2)

To prevent such duplicate entries in the first place, you can use the same idea to add a unique index (tested!):

SQL> create table relationships
  2  ( person_1 number not null
  3  , person_2 number not null
  4  , relationship number not null
  5  , constraint pk_relationships primary key (person_1, person_2)
  6  )
  7  /

Table created.

SQL> create unique index ui_relationships on relationships(least(person_1,person_2),greatest(person_1,person_2))
  2  /

Index created.

SQL> insert into relationships values (1,2,0)
  2  /

1 row created.

SQL> insert into relationships values (1,3,0)
  2  /

1 row created.

SQL> insert into relationships values (2,1,0)
  2  /
insert into relationships values (2,1,0)
*
ERROR at line 1:
ORA-00001: unique constraint (RWIJK.UI_RELATIONSHIPS) violated

Regards, Rob.
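Both ideas in this answer can be tried without an Oracle instance. The following is only a sketch under stated assumptions: SQLite stands in for Oracle, and its two-argument scalar `MIN`/`MAX` functions play the role of `LEAST`/`GREATEST`. The sample rows are made up.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; MIN(a, b) / MAX(a, b) replace LEAST / GREATEST.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE relationships (
        person_1     INTEGER NOT NULL,
        person_2     INTEGER NOT NULL,
        relationship INTEGER NOT NULL,
        PRIMARY KEY (person_1, person_2)
    )
""")
conn.executemany(
    "INSERT INTO relationships VALUES (?, ?, 0)",
    [(1, 2), (2, 1), (4, 3)],
)

# Group on the order-normalized pair so (1,2) and (2,1) collapse into one row.
normalized = conn.execute("""
    SELECT MIN(person_1, person_2) AS person_a,
           MAX(person_1, person_2) AS person_b
    FROM relationships
    GROUP BY MIN(person_1, person_2), MAX(person_1, person_2)
    ORDER BY person_a, person_b
""").fetchall()
print(normalized)

# Remove the existing mirrored row, then enforce uniqueness with an expression index.
conn.execute("DELETE FROM relationships WHERE person_1 = 2 AND person_2 = 1")
conn.execute("""
    CREATE UNIQUE INDEX ui_relationships
    ON relationships (MIN(person_1, person_2), MAX(person_1, person_2))
""")
try:
    conn.execute("INSERT INTO relationships VALUES (2, 1, 0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the mirrored pair violates the unique index
print(rejected)
```

Note that the unique index can only be created once the table no longer contains mirrored pairs, which is why the sketch deletes (2, 1) first.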

Bill Karwin 15 years ago

You should add a constraint to the Relationships table requiring that the numeric person_1 value be less than the numeric person_2 value.

create table RELATIONSHIPS (
    PERSON_1 number not null,
    PERSON_2 number not null,
    RELATIONSHIP  number not null,
    constraint PK_RELATIONSHIPS
        primary key (PERSON_1, PERSON_2),
    constraint UNIQ_RELATIONSHIPS
        CHECK (PERSON_1 < PERSON_2)
);

This ensures that (2,1) can never be inserted; it has to be stored as (1,2). The PRIMARY KEY constraint then prevents duplicates.
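As a rough illustration of how the two constraints work together, the sketch below creates the same table in SQLite (an assumption made so the example is runnable; the answer targets Oracle) and tries three inserts. The helper `try_insert` is hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; the constraint names follow the answer above.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE relationships (
        person_1     INTEGER NOT NULL,
        person_2     INTEGER NOT NULL,
        relationship INTEGER NOT NULL,
        CONSTRAINT pk_relationships PRIMARY KEY (person_1, person_2),
        CONSTRAINT uniq_relationships CHECK (person_1 < person_2)
    )
""")

def try_insert(conn, p1, p2):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO relationships VALUES (?, ?, 0)", (p1, p2))
        return True
    except sqlite3.IntegrityError:
        return False

outcomes = [
    try_insert(conn, 1, 2),  # accepted: ascending order, not yet present
    try_insert(conn, 2, 1),  # rejected by the CHECK constraint
    try_insert(conn, 1, 2),  # rejected by the primary key
]
print(outcomes)
```

With this design the application has to order the two IDs before inserting, and a person can never be related to themselves, because the check uses a strict less-than.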

PS: I see Marc Gravell answered faster than I did, with a similar solution.

Aistina 15 years ago

I think something like this should do it:

select * from RELATIONSHIPS group by PERSON_1, PERSON_2

MikeNereson 15 years ago

I think KM almost had it right; I added to it:

SELECT DISTINCT *
    FROM (SELECT DISTINCT concat(Person_1,Person_2) FROM RELATIONSHIPS
          UNION 
          SELECT DISTINCT concat(Person_2, Person_1) FROM RELATIONSHIPS
         ) dt

copaX 15 years ago

This is kludgy, but it will at least tell you which unique combinations you have, just not in a very convenient form...

select distinct(case when person_1 <= person_2 then person_1||'|'||person_2 else person_2||'|'||person_1 end)
from relationships;

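Flatline75 15 years ago

Probably the simplest solution (one that does not require changing the data structure or creating triggers) is to build a result set of the rows that have no mirrored duplicate, and then add one row from each duplicated pair to that set.

It would look like this: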

 select * from relationships where rowid not in 
    (select a.rowid from  relationships a,relationships b 
       where a.person_1=b.person_2 and a.person_2=b.person_1)
union all
 select * from relationships where rowid in 
    (select a.rowid from  relationships a,relationships b where 
       a.person_1=b.person_2 and a.person_2=b.person_1 and a.person_1>a.person_2)

But normally I would not create a table without a single-column primary key.

Scott Swank 15 years ago

You could do this:

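with rel as (
select r.*,  -- Oracle needs a table alias to combine * with other select-list items
       -- ROW_NUMBER requires an ORDER BY in Oracle; here the row with the smaller person_1 wins
       row_number() over (partition by least(r.person_1, r.person_2),
                                       greatest(r.person_1, r.person_2)
                          order by r.person_1) as rn
  from relationships r
       )
select *
  from rel
 where rn = 1;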
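The window-function approach can be sketched the same way. This assumes SQLite 3.25 or later (for window function support) as a stand-in for Oracle, with two-argument `MIN`/`MAX` in place of `LEAST`/`GREATEST`. The `ORDER BY` inside the window is an added assumption that makes the choice of surviving row deterministic, and the sample rows are made up.

```python
import sqlite3

# Assumption: SQLite >= 3.25 stands in for Oracle; the rows below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE relationships (
        person_1     INTEGER NOT NULL,
        person_2     INTEGER NOT NULL,
        relationship INTEGER NOT NULL,
        PRIMARY KEY (person_1, person_2)
    )
""")
conn.executemany(
    "INSERT INTO relationships VALUES (?, ?, 0)",
    [(1, 2), (2, 1), (1, 3), (4, 3)],
)

# Number the rows within each unordered pair and keep only the first one.
first_per_pair = conn.execute("""
    WITH rel AS (
        SELECT r.person_1, r.person_2,
               ROW_NUMBER() OVER (
                   PARTITION BY MIN(r.person_1, r.person_2),
                                MAX(r.person_1, r.person_2)
                   ORDER BY r.person_1
               ) AS rn
        FROM relationships r
    )
    SELECT person_1, person_2
    FROM rel
    WHERE rn = 1
    ORDER BY person_1, person_2
""").fetchall()
print(first_per_pair)
```

Unlike the grouped `LEAST`/`GREATEST` query, this keeps the original rows intact, so (4, 3) is returned as stored rather than normalized to (3, 4).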