代码之家 › 专栏 › 技术社区 › Maryam Ghafarinia

如何删除选择到自身表中的对

mysql sql

Maryam Ghafarinia · 技术社区 · 3 年前

我用它自己连接了一张桌子,我重复了几对,正如我在下图中强调的那样,如何删除它们?

select DISTINCT A.name as name1 , B.name as name2
from (select name , ratings.* from reviewers inner join ratings on reviewers.id = 
ratings.reviewer_id ) A ,
(select name , ratings.* from reviewers inner join ratings on reviewers.id = 
ratings.reviewer_id ) B
where A.reviewer_id <> B.reviewer_id 
and A.book_id = B.book_id
order by name1 , name2 ASC

名称1	名称2
艾莉丝露易斯	伊丽莎白·布莱克
克里斯·托马斯	约翰·史密斯
克里斯·托马斯	Mike White
伊丽莎白·布莱克	艾莉丝露易斯
伊丽莎白·布莱克	杰克·格林
杰克·格林	伊丽莎白·布莱克
乔·马丁内斯	安德森先生
约翰·史密斯	克里斯·托马斯
安德森先生	乔·马丁内斯
Mike White	克里斯·托马斯

上表曾经是 image

5 回复 | 直到 3 年前

alejandroMAD 3 年前

我创建了DDL和DML语句来重现数据库,并编写一个检索未复制对的查询。以下是对其他人有帮助的“建筑”规范:

CREATE TABLE books (
  id INT PRIMARY KEY,
  title VARCHAR(100)
);

CREATE TABLE reviewers (
  id INT PRIMARY KEY,
  name VARCHAR(50)
);

CREATE TABLE ratings (
  id INT PRIMARY KEY,
  reviewer_id INT,
  book_id INT,
  rating INT,
  FOREIGN KEY (reviewer_id) REFERENCES reviewers(id),
  FOREIGN KEY (book_id) REFERENCES books(id)
);

-- Inserting sample records
INSERT INTO reviewers (id, name)
VALUES
  (1, 'Alice Lewis'),
  (2, 'Elizabeth Black'),
  (3, 'Chris Thomas'),
  (4, 'John Smith'),
  (5, 'Mike White'),
  (6, 'Jack Green'),
  (7, 'Joe Martinez'),
  (8, 'Mike Anderson');

INSERT INTO books (id, title)
VALUES
  (1, 'The Gulag Archipelago'),
  (2, 'One Day in the Life of Ivan Denisovich'),
  (3, 'Cancer Ward');

-- Insertion of rating records
INSERT INTO ratings (id, reviewer_id, book_id, rating)
VALUES
  (1, 1, 1, 4),
  (2, 1, 2, 3),
  (3, 2, 1, 5),
  (4, 2, 2, 4),
  (5, 2, 3, 2),
  (6, 3, 1, 3),
  (7, 3, 3, 4),
  (8, 4, 1, 2),
  (9, 4, 3, 3),
  (10, 5, 2, 5),
  (11, 6, 1, 1),
  (12, 6, 2, 3),
  (13, 6, 3, 4),
  (14, 7, 1, 3),
  (15, 7, 2, 4),
  (16, 8, 3, 2);

下面是重构后的查询:

SELECT DISTINCT
  A.name AS name1,
  B.name AS name2
FROM
  (
    SELECT
      reviewers.id,
      reviewers.name,
      ratings.book_id
    FROM
      reviewers
      INNER JOIN ratings ON reviewers.id = ratings.reviewer_id
  ) A
  JOIN (
    SELECT
      reviewers.id,
      reviewers.name,
      ratings.book_id
    FROM
      reviewers
      INNER JOIN ratings ON reviewers.id = ratings.reviewer_id
  ) B ON A.book_id = B.book_id
     AND A.id <> B.id
     AND A.name < B.name
ORDER BY
  name1,
  name2 ASC;

同一个子查询与别名A和B一起使用两次,将 reviewers 和 ratings 表,并检索每对评论者书籍评级的数据。

主外部查询然后从子查询结果中选择不同的审阅者名称对。我们使用 JOIN 在子查询A和B之间关于3个条件:

A.book_id = B.book_id 因此,两位评论者对同一本书进行了评分。
A.id <> B.id 以筛选出具有相同ID的成对审阅者,从而防止自匹配。
A.name < B.name 以确保配对的顺序与下面的输出一致,从而消除重复。这样,给定的一对只考虑一个名字组合,例如“伊丽莎白-布莱克-杰克-格林”,而不是“杰克-格林-伊丽莎白-布莱克”。

这是您将从重构后的查询中获得的输出:

+-----------------+-----------------+
|      name1      |      name2      |
+-----------------+-----------------+
| Alice Lewis     | Chris Thomas    |
| Alice Lewis     | Elizabeth Black |
| Alice Lewis     | Jack Green      |
| Alice Lewis     | Joe Martinez    |
| Alice Lewis     | John Smith      |
| Alice Lewis     | Mike White      |
| Chris Thomas    | Elizabeth Black |
| Chris Thomas    | Jack Green      |
| Chris Thomas    | Joe Martinez    |
| Chris Thomas    | John Smith      |
| Chris Thomas    | Mike Anderson   |
| Elizabeth Black | Jack Green      |
| Elizabeth Black | Joe Martinez    |
| Elizabeth Black | John Smith      |
| Elizabeth Black | Mike Anderson   |
| Elizabeth Black | Mike White      |
| Jack Green      | Joe Martinez    |
| Jack Green      | John Smith      |
| Jack Green      | Mike Anderson   |
| Jack Green      | Mike White      |
| Joe Martinez    | John Smith      |
| Joe Martinez    | Mike White      |
| John Smith      | Mike Anderson   |
+-----------------+-----------------+

Stu 3 年前

你可以做

select Name1, Name2
from ...
where Name1 < Name2;

看见 this example

SelVazi 3 年前

这可以使用 greatest 和 least 要识别跨列的重复项,请执行以下操作:

此查询将获得重复项:

select greatest(name, name2), least(name, name2)
from mytable
group by greatest(name, name2), least(name, name2)
having count(1) > 1

结果:

first_name  second_name
Jack Green  Elizabeth Black

然后:

DELETE t.* FROM mytable t
INNER JOIN (
  select greatest(name, name2) as first_name, least(name, name2) as second_name
  from mytable
  group by greatest(name, name2), least(name, name2)
  having count(1) > 1
) as s ON (t.name = s.first_name and t.name2 = s.second_name)
          OR (t.name2 = s.first_name and t.name = s.second_name)

Demo here

Luuk WHIP 3 年前

要删除重复项,您首先需要找到它们:

SELECT 
  LEAST(name1,name2) as L,
  GREATEST(name1,name2) as G
FROM names
GROUP BY LEAST(name1,name2), GREATEST(name1,name2) 
HAVING count(*)>1;

然后您可以删除它们:

WITH cte as (
  SELECT 
    LEAST(name1,name2) as L,
    GREATEST(name1,name2) as G
  FROM names
  GROUP BY LEAST(name1,name2), GREATEST(name1,name2) 
  HAVING count(*)>1
)
DELETE FROM names
WHERE (name2,name1) in (select * from cte);

请参阅: DBFIDDLE