
MySQL: GROUP BY on similar timestamps

  •  3
  • Ben L.  · 15 years ago

    Query:

      SELECT project_id, 
             COUNT(*) AS count,
             MIN(date_added) AS date_start, 
             MAX(date_added) AS date_end
        FROM my_table 
    GROUP BY project_id, TIMESTAMPDIFF(MINUTE, date_added) < 5
    WHERE user_id = 1 LIMIT 10
    

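    How can I accomplish this? I want to group items so that no two consecutive items within a group are more than 5 minutes apart, while the start and end of a group may be any distance apart. Is there a way to do this in the database, or do I need to fetch all the data and work it out in my program?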

    3 answers  |  as of 15 years ago
        1
  •  3
  •   Mike    15 years ago

    OK, here it is:

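    -- Number the groups with user variables, then aggregate per group.
    -- MySQL does not guarantee the evaluation order of user-variable assignments
    -- within a SELECT, so check the result on your server version.
    SELECT MIN(id) AS id, project_id, MIN(start_time) AS start_time, MAX(end_time) AS end_time FROM (
      SELECT
        -- 1 when the gap to the previous row exceeds 5 minutes or the project changes
        @new_group :=
          ((UNIX_TIMESTAMP(date_added) - @prev_second) > (5 * 60)) OR
          (project_id <> @prev_project_id) AS new_group,
        -- running count of group starts, used as the group number
        @date_added_group := @date_added_group + @new_group AS date_added_group,
        @start_time := IF(@new_group, date_added, @start_time) AS start_time,
        id,
        project_id,
        date_added AS end_time,
        -- UNIX_TIMESTAMP instead of TIME_TO_SEC, so gaps across midnight are measured correctly
        @prev_second := UNIX_TIMESTAMP(date_added) AS prev_sec,
        @prev_project_id := project_id AS prev_project
      FROM my_table,
      (SELECT
        @new_group := 0,
        @date_added_group := 0,
        @start_time := 0,
        @prev_second := 0,
        @prev_project_id := 0) AS vars
      ORDER BY project_id, date_added
    ) AS my_table GROUP BY project_id, date_added_group;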
    

    Given this data:

    +----+------------+---------------------+
    | id | project_id | date_added          |
    +----+------------+---------------------+
    |  1 |          1 | 2010-07-15 19:00:00 | < new project
    |  2 |          1 | 2010-07-15 19:01:00 |
    |  3 |          1 | 2010-07-15 19:02:00 |
    |  4 |          2 | 2010-07-15 19:03:00 | < new project
    |  5 |          2 | 2010-07-15 19:04:00 |
    |  6 |          2 | 2010-07-15 19:25:00 | < new interval
    |  7 |          2 | 2010-07-15 19:26:00 |
    |  8 |          2 | 2010-07-15 19:27:00 |
    |  9 |          2 | 2010-07-15 19:48:00 | < new interval
    | 10 |          2 | 2010-07-15 19:49:00 |
    | 11 |          3 | 2010-07-15 19:50:00 | < new project
    | 12 |          3 | 2010-07-15 20:11:00 | < new interval
    | 13 |          4 | 2010-07-15 20:12:00 | < new project
    | 14 |          4 | 2010-07-15 20:13:00 |
    | 15 |          4 | 2010-07-15 20:14:00 |
    | 16 |          5 | 2010-07-15 20:15:00 | < new project
    | 17 |          5 | 2010-07-15 20:16:00 |
    | 18 |          5 | 2010-07-15 21:27:00 | < new interval
    | 19 |          5 | 2010-07-15 21:28:00 |
    | 20 |          6 | 2010-07-15 21:29:00 | < new project
    | 21 |          6 | 2010-07-15 21:30:00 |
    | 22 |          6 | 2010-07-15 21:31:00 |
    +----+------------+---------------------+
    

    The query returns this result set:

    +----+------------+---------------------+---------------------+
    | id | project_id | start_time          | end_time            |
    +----+------------+---------------------+---------------------+
    |  1 |          1 | 2010-07-15 19:00:00 | 2010-07-15 19:02:00 | 
    |  4 |          2 | 2010-07-15 19:03:00 | 2010-07-15 19:04:00 |
    |  6 |          2 | 2010-07-15 19:25:00 | 2010-07-15 19:27:00 |
    |  9 |          2 | 2010-07-15 19:48:00 | 2010-07-15 19:49:00 |
    | 11 |          3 | 2010-07-15 19:50:00 | 2010-07-15 19:50:00 |
    | 12 |          3 | 2010-07-15 20:11:00 | 2010-07-15 20:11:00 |
    | 13 |          4 | 2010-07-15 20:12:00 | 2010-07-15 20:14:00 |
    | 16 |          5 | 2010-07-15 20:15:00 | 2010-07-15 20:16:00 |
    | 18 |          5 | 2010-07-15 21:27:00 | 2010-07-15 21:28:00 |
    | 20 |          6 | 2010-07-15 21:29:00 | 2010-07-15 21:31:00 |
    +----+------------+---------------------+---------------------+
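
    For readers who would rather do the grouping in application code, as the question also asks, here is a minimal Python sketch of the same idea: flag each row that starts a new group, take a running sum of the flags as the group number, then aggregate per group. The names `SAMPLE`, `ROWS`, `MAX_GAP` and `group_rows` are made up for this illustration, and the rows are simply the sample data above.

```python
from datetime import datetime, timedelta
from itertools import accumulate, groupby

MAX_GAP = timedelta(minutes=5)  # same threshold as 5 * 60 seconds in the query

# (id, project_id, time of day on 2010-07-15): the sample rows from the table above.
SAMPLE = [
    (1, 1, "19:00"), (2, 1, "19:01"), (3, 1, "19:02"),
    (4, 2, "19:03"), (5, 2, "19:04"), (6, 2, "19:25"), (7, 2, "19:26"),
    (8, 2, "19:27"), (9, 2, "19:48"), (10, 2, "19:49"),
    (11, 3, "19:50"), (12, 3, "20:11"),
    (13, 4, "20:12"), (14, 4, "20:13"), (15, 4, "20:14"),
    (16, 5, "20:15"), (17, 5, "20:16"), (18, 5, "21:27"), (19, 5, "21:28"),
    (20, 6, "21:29"), (21, 6, "21:30"), (22, 6, "21:31"),
]
ROWS = [
    (row_id, project_id, datetime.strptime("2010-07-15 " + hhmm, "%Y-%m-%d %H:%M"))
    for row_id, project_id, hhmm in SAMPLE
]


def group_rows(rows, max_gap=MAX_GAP):
    """Return one (first_id, project_id, start_time, end_time) tuple per group."""
    # Equivalent of ORDER BY project_id, date_added.
    rows = sorted(rows, key=lambda r: (r[1], r[2]))
    # Flag is 1 when the project changes or the gap to the previous row exceeds
    # max_gap; the first row always starts a group.
    flags = [1] + [
        int(cur[1] != prev[1] or cur[2] - prev[2] > max_gap)
        for prev, cur in zip(rows, rows[1:])
    ]
    # Running sum of the flags plays the role of @date_added_group.
    group_ids = accumulate(flags)
    result = []
    for _, members in groupby(zip(group_ids, rows), key=lambda pair: pair[0]):
        members = [row for _, row in members]
        first, last = members[0], members[-1]
        result.append((first[0], first[1], first[2], last[2]))
    return result


groups = group_rows(ROWS)
for first_id, project_id, start_time, end_time in groups:
    print(first_id, project_id, start_time, end_time)
```

    On the sample rows this yields the same ten groups as the result set above. A gap of exactly 5 minutes does not start a new group, matching the `>` comparison in the query.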
    
        2
  •  2
  •   OMG Ponies    15 years ago

    Off the top of my head, untested:

      SELECT project_id, 
             COUNT(*) AS count,
             MIN(date_added) AS date_start, 
             MAX(date_added) AS date_end
        FROM my_table 
       WHERE user_id = 1 
    GROUP BY project_id, ROUND(date_added / (5 * 60))
       LIMIT 10
    

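    This assumes, of course, that date_added is stored as a number of seconds. For a DATETIME or TIMESTAMP column, wrap it in UNIX_TIMESTAMP() first.

    In other words, items are grouped according to the 5-minute slice they fall into.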
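
    To make the slicing concrete, here is a small Python sketch of fixed 5-minute buckets. It uses floor division rather than ROUND, which moves the slice boundaries but not the principle; `EPOCH`, `slice_index` and `count_per_slice` are names invented for this illustration. Because the boundaries are fixed, two items only 2 minutes apart can land in different slices, so this is not the same as chaining consecutive items that are at most 5 minutes apart.

```python
from collections import Counter
from datetime import datetime

EPOCH = datetime(1970, 1, 1)
SLICE_SECONDS = 5 * 60


def slice_index(ts):
    """Index of the fixed 5-minute slice that contains ts."""
    return int((ts - EPOCH).total_seconds()) // SLICE_SECONDS


def count_per_slice(timestamps):
    """Count items per slice, like COUNT(*) ... GROUP BY slice."""
    return Counter(slice_index(ts) for ts in timestamps)


times = [
    datetime(2010, 7, 15, 19, 0),
    datetime(2010, 7, 15, 19, 4),  # same slice as 19:00
    datetime(2010, 7, 15, 19, 6),  # only 2 minutes later, but in the next slice
]
counts = count_per_slice(times)
print(sorted(counts.values()))
```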

        3
  •  0
  •   Harold L    15 years ago

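    Since you want to link together the longest chains of records in which each consecutive pair is less than 5 minutes apart, I don't think you can do this with GROUP BY.

    Try using a WHILE loop. Start with a cursor that selects the records ORDER BY date_added. You can create a TEMPORARY TABLE and insert a row into it at the end of each group. Doing this as a stored procedure avoids pulling all the records back to your program (usually over the network), which makes it faster.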
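
    The loop itself is short. As a rough sketch of its logic in Python rather than in a stored procedure: `iter_groups` is a hypothetical helper, and it assumes the rows arrive as (project_id, date_added) pairs ordered by project_id, then date_added, since the question also groups by project. Each yielded tuple corresponds to the row that would be inserted into the temporary table when a group ends.

```python
from datetime import datetime, timedelta


def iter_groups(cursor_rows, max_gap=timedelta(minutes=5)):
    """Yield (project_id, count, date_start, date_end) each time a group closes.

    cursor_rows: iterable of (project_id, date_added) pairs, assumed to be
    ordered by project_id, then date_added.
    """
    current = None  # [project_id, count, date_start, date_end] of the open group
    for project_id, date_added in cursor_rows:
        if (current is not None
                and project_id == current[0]
                and date_added - current[3] <= max_gap):
            # Still within 5 minutes of the previous row: extend the open group.
            current[1] += 1
            current[3] = date_added
        else:
            # Project changed or gap too large: close the open group, start a new one.
            if current is not None:
                yield tuple(current)
            current = [project_id, 1, date_added, date_added]
    if current is not None:
        yield tuple(current)  # close the last group


rows = [
    (1, datetime(2010, 7, 15, 19, 0)),
    (1, datetime(2010, 7, 15, 19, 4)),
    (1, datetime(2010, 7, 15, 19, 8)),   # chain spans 8 minutes, each step is 4
    (1, datetime(2010, 7, 15, 19, 30)),  # 22-minute gap starts a new group
    (2, datetime(2010, 7, 15, 19, 31)),  # new project starts a new group
]
result = list(iter_groups(rows))
for group in result:
    print(group)
```

    Because the generator keeps only the open group in memory, it works the same way on a streaming database cursor as on a list.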