代码之家  ›  专栏  ›  技术社区  ›  user2490003

如何使用PL/pgSQL构造具有动态列的表

  •  3
  • user2490003  · 技术社区  · 7 年前

    我有一个Postgres表叫做 locations . 它有几百万行,数据格式如下

     id |  location_a  |  location_b
    ----+--------------+--------------
     36 | Sydney       | London
     37 | Atlanta      | London
     38 | New York     | Tokyo
     39 | Tokyo        | Sydney
     40 | Tokyo        | Sydney
    .....
    

    enter image description here

    问题是列的数量是可变的,因此必须通过编程/动态确定,而不是使用静态 SELECT 查询

    但我在开始时遇到了很多麻烦。有没有一种简单的方法来计算上述内容?

    1 回复  |  直到 7 年前
        1
  •  2
  •   klin    7 年前

    可以动态创建视图。描述了一个比您的更简单的案例的想法和解决方案 in this answer. 请在继续之前阅读。

    with all_locations(location) as (
        select distinct location_a
        from locations
        union
        select distinct location_b
        from locations
    )
    
    select location_a as location, json_object_agg(location_b, count order by location_b) as data
    from (
        select a.location as location_a, b.location as location_b, count(l.*)
        from all_locations a
        cross join all_locations b
        left join locations l on location_a = a.location and location_b = b.location
        group by 1, 2
        ) s
    group by 1
    order by 1;
    

    结果:

     location |                                    data                                    
    ----------+----------------------------------------------------------------------------
     Atlanta  | { "Atlanta" : 0, "London" : 1, "New York" : 0, "Sydney" : 0, "Tokyo" : 0 }
     London   | { "Atlanta" : 0, "London" : 0, "New York" : 0, "Sydney" : 0, "Tokyo" : 0 }
     New York | { "Atlanta" : 0, "London" : 0, "New York" : 0, "Sydney" : 0, "Tokyo" : 1 }
     Sydney   | { "Atlanta" : 0, "London" : 1, "New York" : 0, "Sydney" : 0, "Tokyo" : 0 }
     Tokyo    | { "Atlanta" : 0, "London" : 0, "New York" : 0, "Sydney" : 2, "Tokyo" : 0 }
    (5 rows)
    

    cities . 注意,您可以将函数中的第一个查询替换为更简单的查询(它是不同城市的简单有序列表)。

    create or replace function create_locations_view()
    returns void language plpgsql as $$
    declare
        cities text[];
        list text;
    begin
    --  fill array with all cities in alphabetical order
        select array_agg(location_a order by location_a)
        from (
            select distinct location_a
            from locations
            union
            select distinct location_b
            from locations
            ) s
        into cities;
    
    --  construct list of columns to use in select list
        select string_agg(format($s$data->>'%1$s' "%1$s"$s$, city), ', ')
        from unnest(cities) city
        into list;
    
    --  create view from select based on the above list
        execute format($ex$
            drop view if exists locations_view;
            create view locations_view as 
            select location, %1$s
            from (
                select location_a as location, json_object_agg(location_b, count order by location_b) as data
                from (
                    select a.location as location_a, b.location as location_b, count(l.*)
                    from unnest(%2$L::text[]) a(location)
                    cross join unnest(%2$L::text[]) b(location)
                    left join locations l on location_a = a.location and location_b = b.location
                    group by 1, 2
                    ) s
                group by 1
            ) s
            order by 1
            $ex$, list, cities);
    end $$;
    

    select create_locations_view();
    select * from locations_view;
    
     location | Atlanta | London | New York | Sydney | Tokyo 
    ----------+---------+--------+----------+--------+-------
     Atlanta  | 0       | 1      | 0        | 0      | 0
     London   | 0       | 0      | 0        | 0      | 0
     New York | 0       | 0      | 0        | 0      | 1
     Sydney   | 0       | 1      | 0        | 0      | 0
     Tokyo    | 0       | 0      | 0        | 2      | 0
    (5 rows)
    

    我多次使用这种方法,但我没有处理真正大数据的经验,所以我不能保证它是有效的。