代码之家  ›  专栏  ›  技术社区  ›  kee

BigQuery:如何从重复记录中只提取某个字段作为另一个重复字段

  •  0
  • kee  · 技术社区  · 7 年前

    下面是BigQuery中的一个示例表:

    WITH test AS (
      SELECT
        [ 
          STRUCT("Rudisha" as name, 123 as id),
          STRUCT("Murphy" as name, 124 as id),
          STRUCT("Bosse" as name, 125 as id),
          STRUCT("Rotich" as name,  126 as id)
        ] AS data
    
        UNION
    
        [
          STRUCT("Lewandowski" as name, 127 as id),
          STRUCT("Kipketer" as name, 128 as id),
          STRUCT("Berian" as name, 129 as id)
        ] AS data
    )
    

    在这里,我想提取记录字段(“data”)中的“id”字段作为可重复字段。因此,行数将保持不变,但仅限于重复类型的ids字段:

    ids: [123, 124, 125, 126]
    ids: [127, 128, 129]
    

    我该怎么做?

    1 回复  |  直到 5 年前
        1
  •  3
  •   Mikhail Berlyant    7 年前

    下面是BigQuery标准SQL

    #standardSQL
    WITH test AS (
      SELECT
        [ 
          STRUCT("Rudisha" AS name, 123 AS id),
          STRUCT("Murphy" AS name, 124 AS id),
          STRUCT("Bosse" AS name, 125 AS id),
          STRUCT("Rotich" AS name,  126 AS id)
        ] AS data
        UNION ALL SELECT
        [
          STRUCT("Lewandowski" AS name, 127 AS id),
          STRUCT("Kipketer" AS name, 128 AS id),
          STRUCT("Berian" AS name, 129 AS id)
        ] AS data
    )
    SELECT ARRAY(SELECT id FROM UNNEST(data)) ids
    FROM test