代码之家  ›  专栏  ›  技术社区  ›  tototl

BigQuery如何获取JSON结构中的值之和?

  •  2
  • tototl  · 技术社区  · 8 年前

    SELECT 
        JSON_EXTRACT(json, '$.Weights') as weight 
    from 
    (select '{"Weights":{"blue":1.0,"purple":0.0,"yellow":1.0,"green":1.0}}' as json)
    

    返回:

    {“蓝色”:1.0,“紫色”:0.0,“黄色”:1.0,“绿色”:1.0}

    我想看看有没有办法 总结 所有颜色的值。返回的含义:

    3

    我一直在尝试使用拆分和最不常用的函数,但没有任何成功,有什么建议吗?谢谢

    2 回复  |  直到 8 年前
        1
  •  2
  •   Elliott Brossard    8 年前

    下面是一个使用 REGEXP_EXTRACT_ALL :

    WITH T AS (
      SELECT '{"Weights":{"blue":1.0,"purple":0.0,"yellow":1.0,"green":1.0}}' AS json
    )
    SELECT
      (
        SELECT SUM(CAST(val AS FLOAT64))
        FROM UNNEST(
          REGEXP_EXTRACT_ALL(
            JSON_EXTRACT(json, '$.Weights'),
            r':([^,}]+)')
        ) AS val
      )
    FROM T;
    
        2
  •  1
  •   Mikhail Berlyant    8 年前

    为了探索其他选择-

    下面是BigQuery标准SQL

    第一个示例是为每行提取键:值对

    #standardSQL
    WITH `project.dataset.yourTbale` AS (
      SELECT 1 AS id, '{"Weights":{"blue":1.0,"purple":0.0,"yellow":1.0,"green":1.0}}' AS json 
      UNION ALL SELECT 2, '{"Weights":{"blue":1.0,"red":2.0,"yellow":1.0,"orange":3.0}}'
    )
    SELECT id,
      REPLACE(SPLIT(pair, ':')[OFFSET (0)], '"', '') color, 
      SAFE_CAST(SPLIT(pair, ':')[OFFSET (1)] AS FLOAT64) value
    FROM `project.dataset.yourTbale`, 
    UNNEST(SPLIT(REGEXP_REPLACE(JSON_EXTRACT(json, '$.Weights'), r'{|}', ''))) pair
    

    这将给出以下结果

    id  color   value    
    1   blue    1.0  
    1   purple  0.0  
    1   yellow  1.0  
    1   green   1.0  
    2   blue    1.0  
    2   red     2.0  
    2   yellow  1.0  
    2   orange  3.0  
    

    现在很容易把上面的问题推广到 if there is a way to sum up all the values of the colors 甚至可以通过对特定颜色的潜在过滤来扩展它-参见下面的示例

    #standardSQL
    WITH `project.dataset.yourTbale` AS (
      SELECT 1 AS id, '{"Weights":{"blue":1.0,"purple":0.0,"yellow":1.0,"green":1.0}}' AS json 
      UNION ALL SELECT 2, '{"Weights":{"blue":1.0,"red":2.0,"yellow":1.0,"orange":3.0}}'
    )
    SELECT id,
      SUM(SAFE_CAST(SPLIT(pair, ':')[OFFSET (1)] AS FLOAT64)) AS total
    FROM `project.dataset.yourTbale`, 
    UNNEST(SPLIT(REGEXP_REPLACE(JSON_EXTRACT(json, '$.Weights'), r'{|}', ''))) pair
    WHERE REPLACE(SPLIT(pair, ':')[OFFSET (0)], '"', '') != 'blue'
    GROUP BY id
    

    结果如下(不包括计算中的颜色=蓝色)

    id  total    
    1   2.0  
    2   6.0