
Duplicate records when joining

  •  -1
  • Yunus Einsteinium  · 6 years ago

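    I am working in BigQuery.

    I have an inventory table whose grain is depot_id and product_id, and an inventorytransaction table in which every action (addition or subtraction) against the inventory table is logged.

    What I need is the sum of quantity for each month (January to December) of the current year, as extra columns in the inventory SELECT, like this: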

    SELECT inventory.*, janTotalQuantity, febTotalQuantity, marTotalQuantity,...
    

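    To do this I LEFT JOIN the inventory table with a subquery that returns the total quantity per month-year (e.g. Jan-2019, Feb-2019, Mar-2019, ...) for each depot and product. The SQL statement is below.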

    SELECT inv.inventory_id, p.product_name, p.product_type, p.product_distributor as distributor, p.product_category as category, d.depot_name as location, inv.quantity, inv.lower_limit, inv.unit_cost, inv.quantity * inv.unit_cost as value, p.product_id, d.depot_id, TIMESTAMP_SECONDS(inv.update_date) as last_update, inv.delete_status,
    IF(agg_sd.mon_year = "Jan-{{ execution_date.year }}", agg_sd.totalQuantity, 0) AS janQuantityTotal,
    IF(agg_sd.mon_year = "Feb-{{ execution_date.year }}", agg_sd.totalQuantity, 0) AS febQuantityTotal,
    IF(agg_sd.mon_year = "Mar-{{ execution_date.year }}", agg_sd.totalQuantity, 0) AS marQuantityTotal,
    IF(agg_sd.mon_year = "Apr-{{ execution_date.year }}", agg_sd.totalQuantity, 0) AS aprQuantityTotal,
    IF(agg_sd.mon_year = "May-{{ execution_date.year }}", agg_sd.totalQuantity, 0) AS mayQuantityTotal,
    IF(agg_sd.mon_year = "Jun-{{ execution_date.year }}", agg_sd.totalQuantity, 0) AS junQuantityTotal,
    IF(agg_sd.mon_year = "Jul-{{ execution_date.year }}", agg_sd.totalQuantity, 0) AS julQuantityTotal,
    IF(agg_sd.mon_year = "Aug-{{ execution_date.year }}", agg_sd.totalQuantity, 0) AS augQuantityTotal,
    IF(agg_sd.mon_year = "Sep-{{ execution_date.year }}", agg_sd.totalQuantity, 0) AS sepQuantityTotal,
    IF(agg_sd.mon_year = "Oct-{{ execution_date.year }}", agg_sd.totalQuantity, 0) AS octQuantityTotal,
    IF(agg_sd.mon_year = "Nov-{{ execution_date.year }}", agg_sd.totalQuantity, 0) AS novQuantityTotal,
    IF(agg_sd.mon_year = "Dec-{{ execution_date.year }}", agg_sd.totalQuantity, 0) AS decQuantityTotal
    FROM iprocure_stage.inventory inv
    JOIN iprocure_ods.product p ON p.product_id = inv.product_id 
    JOIN iprocure_ods.depot d ON d.depot_id = inv.depot_id
    LEFT JOIN (
         SELECT FORMAT_TIMESTAMP('%b-%Y', transaction_date) mon_year, product_id, depot_id, SUM(quantity) as totalQuantity
         FROM `iprocure_ods.inventorytransaction`
         WHERE EXTRACT(YEAR FROM transaction_date) = {{ execution_date.year }}
                    AND transaction_type = 1 AND (reference_type = 1 OR reference_type = 6) AND delete_status = 0
                    GROUP BY mon_year, product_id, depot_id 
     ) AS agg_sd ON agg_sd.product_id = inv.product_id AND agg_sd.depot_id = inv.depot_id
    

    The problem is that this returns duplicate inventory rows, one per month:

    --------------------------------------------------------------------------------
      inventory_id    depot_id    product_id    janTotalQuantity    febTotalQuantity
    --------------------------------------------------------------------------------
        123             2             3              56                   0
        123             2             3              0                    65
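
    The duplication can be reproduced on a toy dataset. The sketch below is only an illustration: it uses Python's sqlite3 as a stand-in for BigQuery, a hypothetical `monthly` table in place of the `agg_sd` subquery, and `CASE WHEN` in place of BigQuery's `IF`. Because `monthly` holds one row per month for the same depot and product, the LEFT JOIN emits one output row per month.

```python
import sqlite3

# In-memory SQLite database standing in for BigQuery (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE inventory (inventory_id INTEGER, depot_id INTEGER, product_id INTEGER);
    -- Hypothetical table that plays the role of the agg_sd subquery:
    -- one row per (month, depot, product).
    CREATE TABLE monthly (mon_year TEXT, depot_id INTEGER, product_id INTEGER,
                          totalQuantity INTEGER);
    INSERT INTO inventory VALUES (123, 2, 3);
    INSERT INTO monthly VALUES ('Jan-2019', 2, 3, 56), ('Feb-2019', 2, 3, 65);
""")

# Same shape as the query in the question: no GROUP BY, so every matching
# monthly row produces its own output row for inventory 123.
rows = conn.execute("""
    SELECT inv.inventory_id,
           CASE WHEN m.mon_year = 'Jan-2019' THEN m.totalQuantity ELSE 0 END AS janTotalQuantity,
           CASE WHEN m.mon_year = 'Feb-2019' THEN m.totalQuantity ELSE 0 END AS febTotalQuantity
    FROM inventory inv
    LEFT JOIN monthly m
      ON m.product_id = inv.product_id AND m.depot_id = inv.depot_id
    ORDER BY janTotalQuantity DESC
""").fetchall()

for row in rows:
    print(row)
```

    With the two sample months this prints two rows for inventory 123, matching the output shown above.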
    

    How can I avoid the duplicate inventory rows and add the monthly total quantity columns in BigQuery?

    2 answers  |  latest 6 years ago
        1
  •  2
  •   Piotr Kamoda    6 years ago

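    You can group by every column except the partial sums and apply the SUM aggregate function to those sums. This flattens the output dataset to one row per inventory record: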

    SELECT inv.inventory_id, p.product_name, p.product_type, p.product_distributor as distributor, p.product_category as category, d.depot_name as location, inv.quantity, inv.lower_limit, inv.unit_cost, inv.quantity * inv.unit_cost as value, p.product_id, d.depot_id, TIMESTAMP_SECONDS(inv.update_date) as last_update, inv.delete_status,
    SUM(IF(agg_sd.mon_year = "Jan-{{ execution_date.year }}", agg_sd.totalQuantity, 0)) AS janQuantityTotal,
    SUM(IF(agg_sd.mon_year = "Feb-{{ execution_date.year }}", agg_sd.totalQuantity, 0)) AS febQuantityTotal,
    SUM(IF(agg_sd.mon_year = "Mar-{{ execution_date.year }}", agg_sd.totalQuantity, 0)) AS marQuantityTotal,
    SUM(IF(agg_sd.mon_year = "Apr-{{ execution_date.year }}", agg_sd.totalQuantity, 0)) AS aprQuantityTotal,
    SUM(IF(agg_sd.mon_year = "May-{{ execution_date.year }}", agg_sd.totalQuantity, 0)) AS mayQuantityTotal,
    SUM(IF(agg_sd.mon_year = "Jun-{{ execution_date.year }}", agg_sd.totalQuantity, 0)) AS junQuantityTotal,
    SUM(IF(agg_sd.mon_year = "Jul-{{ execution_date.year }}", agg_sd.totalQuantity, 0)) AS julQuantityTotal,
    SUM(IF(agg_sd.mon_year = "Aug-{{ execution_date.year }}", agg_sd.totalQuantity, 0)) AS augQuantityTotal,
    SUM(IF(agg_sd.mon_year = "Sep-{{ execution_date.year }}", agg_sd.totalQuantity, 0)) AS sepQuantityTotal,
    SUM(IF(agg_sd.mon_year = "Oct-{{ execution_date.year }}", agg_sd.totalQuantity, 0)) AS octQuantityTotal,
    SUM(IF(agg_sd.mon_year = "Nov-{{ execution_date.year }}", agg_sd.totalQuantity, 0)) AS novQuantityTotal,
    SUM(IF(agg_sd.mon_year = "Dec-{{ execution_date.year }}", agg_sd.totalQuantity, 0)) AS decQuantityTotal      
    FROM iprocure_stage.inventory inv
    JOIN iprocure_ods.product p ON p.product_id = inv.product_id 
    JOIN iprocure_ods.depot d ON d.depot_id = inv.depot_id
    LEFT JOIN (
         SELECT FORMAT_TIMESTAMP('%b-%Y', transaction_date) mon_year, product_id, depot_id, SUM(quantity) as totalQuantity
         FROM `iprocure_ods.inventorytransaction`
         WHERE EXTRACT(YEAR FROM transaction_date) = {{ execution_date.year }}
                    AND transaction_type = 1 AND (reference_type = 1 OR reference_type = 6) AND delete_status = 0
                    GROUP BY mon_year, product_id, depot_id 
     ) AS agg_sd ON agg_sd.product_id = inv.product_id AND agg_sd.depot_id = inv.depot_id
    GROUP BY inv.inventory_id, p.product_name, p.product_type, p.product_distributor, p.product_category, d.depot_name, inv.quantity, inv.lower_limit, inv.unit_cost, inv.quantity * inv.unit_cost, p.product_id, d.depot_id, TIMESTAMP_SECONDS(inv.update_date), inv.delete_status
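
    To see the flattening on a small scale, here is a minimal sketch of the same SUM plus GROUP BY pattern. It is not the production query: it runs on Python's sqlite3 rather than BigQuery, the `monthly` table is a hypothetical stand-in for the `agg_sd` subquery, and `CASE WHEN` replaces `IF`. A second inventory row without any transactions is included to show that the LEFT JOIN still keeps it, with zeros in the month columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE inventory (inventory_id INTEGER, depot_id INTEGER, product_id INTEGER);
    -- Hypothetical stand-in for the agg_sd subquery.
    CREATE TABLE monthly (mon_year TEXT, depot_id INTEGER, product_id INTEGER,
                          totalQuantity INTEGER);
    INSERT INTO inventory VALUES (123, 2, 3), (124, 2, 4);
    INSERT INTO monthly VALUES ('Jan-2019', 2, 3, 56), ('Feb-2019', 2, 3, 65);
""")

# Group by the inventory columns and SUM the per-month expressions, so the
# month rows collapse into a single row per inventory record.
flattened = conn.execute("""
    SELECT inv.inventory_id,
           SUM(CASE WHEN m.mon_year = 'Jan-2019' THEN m.totalQuantity ELSE 0 END) AS janQuantityTotal,
           SUM(CASE WHEN m.mon_year = 'Feb-2019' THEN m.totalQuantity ELSE 0 END) AS febQuantityTotal
    FROM inventory inv
    LEFT JOIN monthly m
      ON m.product_id = inv.product_id AND m.depot_id = inv.depot_id
    GROUP BY inv.inventory_id, inv.depot_id, inv.product_id
    ORDER BY inv.inventory_id
""").fetchall()

for row in flattened:
    print(row)
```

    Inventory 123 now appears once with both month totals, and inventory 124, which has no matching rows, appears once with zeros.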
    
        2
  •  1
  •   ScaisEdge    6 years ago

    You are trying to emulate a pivot table; for that you should use a (fake) aggregation function together with GROUP BY:

    SELECT inv.inventory_id
      , p.product_name
      , p.product_type
      , p.product_distributor as distributor
      , p.product_category as category
      , d.depot_name as location
      , inv.quantity
      , inv.lower_limit
      , inv.unit_cost
      , inv.quantity * inv.unit_cost as value
      , p.product_id, d.depot_id
      , TIMESTAMP_SECONDS(inv.update_date) as last_update
      , inv.delete_status
      , max(IF(agg_sd.mon_year = "Jan-{{ execution_date.year }}", agg_sd.totalQuantity, 0)) AS janQuantityTotal
      , max(IF(agg_sd.mon_year = "Feb-{{ execution_date.year }}", agg_sd.totalQuantity, 0)) AS febQuantityTotal
      , max(IF(agg_sd.mon_year = "Mar-{{ execution_date.year }}", agg_sd.totalQuantity, 0)) AS marQuantityTotal
      , max(IF(agg_sd.mon_year = "Apr-{{ execution_date.year }}", agg_sd.totalQuantity, 0)) AS aprQuantityTotal
      , max(IF(agg_sd.mon_year = "May-{{ execution_date.year }}", agg_sd.totalQuantity, 0)) AS mayQuantityTotal
      , max(IF(agg_sd.mon_year = "Jun-{{ execution_date.year }}", agg_sd.totalQuantity, 0)) AS junQuantityTotal
      , max(IF(agg_sd.mon_year = "Jul-{{ execution_date.year }}", agg_sd.totalQuantity, 0)) AS julQuantityTotal
      , max(IF(agg_sd.mon_year = "Aug-{{ execution_date.year }}", agg_sd.totalQuantity, 0)) AS augQuantityTotal
      , max(IF(agg_sd.mon_year = "Sep-{{ execution_date.year }}", agg_sd.totalQuantity, 0)) AS sepQuantityTotal
      , max(IF(agg_sd.mon_year = "Oct-{{ execution_date.year }}", agg_sd.totalQuantity, 0)) AS octQuantityTotal
      , max(IF(agg_sd.mon_year = "Nov-{{ execution_date.year }}", agg_sd.totalQuantity, 0)) AS novQuantityTotal
      , max(IF(agg_sd.mon_year = "Dec-{{ execution_date.year }}", agg_sd.totalQuantity, 0)) AS decQuantityTotal      
    FROM iprocure_stage.inventory inv
    JOIN iprocure_ods.product p ON p.product_id = inv.product_id 
    JOIN iprocure_ods.depot d ON d.depot_id = inv.depot_id
    LEFT JOIN (
         SELECT FORMAT_TIMESTAMP('%b-%Y', transaction_date) mon_year, product_id, depot_id, SUM(quantity) as totalQuantity
         FROM `iprocure_ods.inventorytransaction`
         WHERE EXTRACT(YEAR FROM transaction_date) = {{ execution_date.year }}
                    AND transaction_type = 1 AND (reference_type = 1 OR reference_type = 6) AND delete_status = 0
                    GROUP BY mon_year, product_id, depot_id 
     ) AS agg_sd ON agg_sd.product_id = inv.product_id AND agg_sd.depot_id = inv.depot_id
    GROUP BY inv.inventory_id
      , p.product_name
      , p.product_type
      , p.product_distributor
      , p.product_category
      , d.depot_name
      , inv.quantity
      , inv.lower_limit
      , inv.unit_cost
      , inv.quantity * inv.unit_cost
      , p.product_id, d.depot_id
      , TIMESTAMP_SECONDS(inv.update_date)
      , inv.delete_status 
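
    One thing worth checking before choosing MAX over SUM as the fake aggregate: with a default of 0, MAX hides a month whose total is negative, because 0 from the other months' rows is larger. Whether negative totals can occur under the `transaction_type = 1` filter is not stated in the question, so the data below is purely hypothetical. The sketch again uses Python's sqlite3 instead of BigQuery, a made-up `monthly` table in place of `agg_sd`, and `CASE WHEN` in place of `IF`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE inventory (inventory_id INTEGER, depot_id INTEGER, product_id INTEGER);
    -- Hypothetical stand-in for agg_sd, with a negative January total.
    CREATE TABLE monthly (mon_year TEXT, depot_id INTEGER, product_id INTEGER,
                          totalQuantity INTEGER);
    INSERT INTO inventory VALUES (123, 2, 3);
    INSERT INTO monthly VALUES ('Jan-2019', 2, 3, -5), ('Feb-2019', 2, 3, 65);
""")

result = conn.execute("""
    SELECT inv.inventory_id,
           -- MAX with a 0 default: the 0 from the February row beats -5.
           MAX(CASE WHEN m.mon_year = 'Jan-2019' THEN m.totalQuantity ELSE 0 END) AS jan_max_zero,
           -- SUM with a 0 default: -5 + 0 keeps the real total.
           SUM(CASE WHEN m.mon_year = 'Jan-2019' THEN m.totalQuantity ELSE 0 END) AS jan_sum,
           -- MAX without a default: non-matching rows are NULL and ignored.
           MAX(CASE WHEN m.mon_year = 'Jan-2019' THEN m.totalQuantity END) AS jan_max_null
    FROM inventory inv
    LEFT JOIN monthly m
      ON m.product_id = inv.product_id AND m.depot_id = inv.depot_id
    GROUP BY inv.inventory_id
""").fetchall()

print(result)
```

    If every monthly total is non-negative, MAX and SUM with a 0 default give the same values. Note that the variant without a default returns NULL rather than 0 for a month with no transactions.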
    