代码之家  ›  专栏  ›  技术社区  ›  IanCun

从MySQL数据库中选择多个字段副本

  •  1
  • IanCun  · 技术社区  · 6 年前

    我有一个旧论坛,其中包含重复的第一个帖子线程(可能不同的答复)。我想删除除一个线程以外的所有线程(使该线程具有最高的视图计数)。

    我有下面的SQL查询来帮助识别重复的线程,但是我找不到方法让它只列出xf的值最低的重复线程_thread.view\u计数列:

    SELECT
        t.thread_id, MIN(t.view_count)
    FROM
        xf_thread t
    INNER JOIN
        xf_post p ON p.thread_id = t.thread_id
    WHERE
        t.first_post_id = p.post_id
    GROUP BY
        t.title,
        t.username,
        p.message
    HAVING
        COUNT(t.title) > 1
        AND COUNT(t.username) > 1
        AND COUNT(p.message) > 1;
    

    我已经阅读了如何解决这个问题,但我不知道如何实现这一点-因为看起来不可能按查询分组对行排序。

    编辑

    多亏了Madhur的帮助,查询现在返回所有要删除的线程id。但是,我可以找出如何删除具有匹配线程id的行( https://stackoverflow.com/a/52314208/2469308 )以秒为单位运行:

    DELETE FROM xf_thread 
    WHERE  thread_id IN (SELECT Substring_index(Group_concat(DISTINCT t.thread_id 
                                                ORDER BY 
                                                t.view_count 
                                                                    ASC 
                                                SEPARATOR ','), 
                                       ',', 1) AS 
                                thread_id_with_minimum_views 
                         FROM   (SELECT * 
                                 FROM   xf_thread) t 
                                INNER JOIN xf_post p 
                                        ON p.thread_id = t.thread_id 
                         WHERE  t.first_post_id = p.post_id 
                                AND t.user_id = 0 
                                AND t.reply_count < 2 
                         GROUP  BY t.title, 
                                   t.username, 
                                   p.message 
                         HAVING Count(t.title) > 1 
                                AND Count(t.username) > 1 
                                AND Count(p.message) > 1 
                         ORDER  BY t.thread_id); 
    
    1 回复  |  直到 6 年前
        1
  •  1
  •   Madhur Bhaiya    6 年前

    一个非常棘手的解决办法是 thread_id 排序依据 view_count 在一个 GROUP_CONCAT . 然后,我们可以利用字符串操作来获得 最低限度 查看\u计数

    在你的 SELECT 子句,而不是 t.thread_id

    SUBSTRING_INDEX(GROUP_CONCAT(DISTINCT t.thread_id 
                                 ORDER BY t.view_count ASC 
                                 SEPARATOR ','), 
                    ',', 
                    1) AS thread_id_with_minimum_views
    

    现在,根据 选择 查询以识别具有最小视图的重复记录, DELETE 查询以从 xf_thread 表格如下:

    DELETE t_delete FROM xf_thread AS t_delete 
    INNER JOIN (SELECT CAST(SUBSTRING_INDEX(GROUP_CONCAT(DISTINCT t.thread_id ORDER BY t.view_count ASC SEPARATOR ','), ',', 1) AS UNSIGNED) AS tid_min_view 
                FROM (SELECT * FROM xf_thread) t 
                INNER JOIN xf_post p ON p.thread_id = t.thread_id 
                WHERE t.first_post_id = p.post_id 
                  AND t.user_id = 0 
                  AND t.reply_count < 2 
                GROUP BY t.title, t.username, p.message 
                HAVING Count(t.title) > 1 
                   AND Count(t.username) > 1 
                   AND Count(p.message) > 1 
                ORDER BY t.thread_id) AS t_dup 
      ON t_delete.thread_id = t_dup.tid_min_view