代码之家  ›  专栏  ›  技术社区  ›  Hannover Fist

分组连续记录

  •  0
  • Hannover Fist  · 技术社区  · 6 年前

    我正在尝试对一些记录进行分组,以便为给定客户的特定站点找到第一个记录。问题是这些记录在不同的站点之间来回移动,所以我需要将非连续的站点日期范围分开。

    考虑到样本数据,我想最终得到3条记录——一条记录是从2018年7月3日开始的站点1,第二条记录是从2018年9月3日开始的站点2,第三条记录是从2018年11月3日开始的站点1。

    SELECT 9999 AS CLIENT_ID, 1 AS SITE_NUM, '2018-07-03' AS START_DATE, '2018-08-05' AS CREATED_DATE, 1 AS RECORD_ID
    INTO #TEMP
    UNION 
    SELECT 9999 AS MEMBER_ID, 1 AS SITE_NUM, '2018-08-01' AS CONSENT_SIGN_DATE, '2018-10-05' AS CREATED_DATE, 2
    UNION 
    SELECT 9999 AS MEMBER_ID, 1 AS SITE_NUM, '2018-07-03' AS CONSENT_SIGN_DATE, '2018-09-22' AS CREATED_DATE, 3
    UNION 
    SELECT 9999 AS MEMBER_ID, 2 AS SITE_NUM, '2018-09-03' AS CONSENT_SIGN_DATE, '2018-09-05' AS CREATED_DATE, 4
    UNION 
    SELECT 9999 AS MEMBER_ID, 2 AS SITE_NUM, '2018-10-03' AS CONSENT_SIGN_DATE, '2018-10-05' AS CREATED_DATE, 5
    UNION 
    SELECT 9999 AS MEMBER_ID, 1 AS SITE_NUM, '2018-11-03' AS CONSENT_SIGN_DATE, '2018-11-05' AS CREATED_DATE, 6
    UNION 
    SELECT 9999 AS MEMBER_ID, 1 AS SITE_NUM, '2018-12-01' AS CONSENT_SIGN_DATE, '2018-12-05' AS CREATED_DATE, 7
    

    我一直在玩ROW_NUM,但还没有弄清楚如何区分Site 1的两组日期。

    SELECT *, ROW_NUMBER()OVER(PARTITION BY T.CLIENT_ID, T.SITE_NUM ORDER BY T.START_DATE, T.RECORD_ID)
    FROM #TEMP T 
    LEFT JOIN #TEMP T2 ON T2.CLIENT_ID = T.CLIENT_ID AND T2.RECORD_ID = T.RECORD_ID - 1
    ORDER BY T.RECORD_ID
    

    如何按客户和单个站点的连续日期对结果进行分组?

    2 回复  |  直到 6 年前
        1
  •  2
  •   Gordon Linoff    6 年前

    这是一个缺口和岛屿问题。为此,行号的差异是最好的方法:

    select t.client_id, t.site_num, min(t.start_date), max(t.start_date)
    from (select t.*,
                 row_number() over (partition by t.client_id order by T.START_DATE, T.RECORD_ID) as seqnum_c,
                 row_number() over (partition by t.client_id, t.site_num order by T.START_DATE, T.RECORD_ID) as seqnum_cs
          from #temp t
         ) t
    group by client_id, site_num, (seqnum_c - seqnum_cs)
    
        2
  •  1
  •   Sahil Dhoked    6 年前

    您需要的是,连续行不应具有相同的SITE_NUM值。您只需在查询末尾添加where子句。

    SELECT *, ROW_NUMBER()OVER(PARTITION BY T.CLIENT_ID, T.SITE_NUM ORDER BY T.START_DATE, T.RECORD_ID)
    FROM #TEMP T 
    LEFT JOIN #TEMP T2 ON T2.CLIENT_ID = T.CLIENT_ID AND T2.RECORD_ID = T.RECORD_ID - 1
    ORDER BY T.RECORD_ID
    WHERE T.SITE_NUM <> T2.SITE_NUM OR T2.SITE_NUM IS NULL
    

    编辑 根据@SteveB的建议,添加 T2.SITE_NUM IS NULL 也要播放最后一张唱片。