How to conditionally print data in Python based on MySQL query results

  •  1
  • learningstudent  · 7 years ago

    I have some sample MySQL data; the query result looks like the table below:

    main_cat| sub_cat | number | org_id
    Career  | school  | 5      | A
    Career  | college | 3      | A
    Career  | higher  | 4      | A 
    Job     | Blr     | 6      | A
    Job     | Hyd     | 11     | A
    Job     | Chennai | 12     | A 
    Career  | school  | 15     | B
    Career  | college | 30     | B
    Career  | higher  | 5      | B 
    Job     | Blr     | 5      | B
    Career  | college | 8      | C
    Job     | Chennai | 4      | C 
    

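    I want to print the top 2 main_cat for each organization and, for each of those top 2 main_cat, the top 2 sub_cat. So each organization should end up with 4 or fewer records: the top 2 sub_cat for each of its top 2 main_cat.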

    Please help me.

    2 answers  |  latest 7 years ago
        1
  •  1
  •   mata    7 years ago

    For grouping, Python provides itertools.groupby(), which groups sorted input by a given key function.
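
    The requirement that the input be sorted matters because groupby() only merges consecutive items with equal keys. A minimal sketch, using a made-up list of (main_cat, number) pairs that is not part of the question's data:

```python
from itertools import groupby

# Hypothetical sample: (main_cat, number) pairs, deliberately unsorted.
rows = [("Job", 6), ("Career", 5), ("Job", 11), ("Career", 3)]

# Without sorting, each run of equal keys becomes its own group.
unsorted_keys = [key for key, _ in groupby(rows, key=lambda r: r[0])]

# After sorting by the same key, every category appears exactly once.
rows_sorted = sorted(rows, key=lambda r: r[0])
sorted_groups = {
    key: [number for _, number in group]
    for key, group in groupby(rows_sorted, key=lambda r: r[0])
}

print(unsorted_keys)  # ['Job', 'Career', 'Job', 'Career']
print(sorted_groups)  # {'Career': [5, 3], 'Job': [6, 11]}
```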

    In this case, the input first needs to be sorted by org_id, then by main_cat, then by number in descending order. For example, if the data is in a list like this:

    data = [
     ['Career', 'school', 5, 'A'],
     ['Career', 'college', 3, 'A'],
     ['Career', 'higher', 4, 'A'],
     ['Job', 'Blr', 6, 'A'],
     ['Job', 'Hyd', 11, 'A'],
     ['Job', 'Chennai', 12, 'A'],
     ['Career', 'school', 15, 'B'],
     ['Career', 'college', 30, 'B'],
     ['Career', 'higher', 5, 'B'],
     ['Job', 'Blr', 5, 'B'],
     ['Career', 'college', 8, 'C'],
     ['Job', 'Chennai', 4, 'C']
    ]
    

    then you would sort it like this:

    data.sort(key = lambda x: (x[3], x[0], -x[2]))
    

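    Alternatively, add ORDER BY org_id, main_cat, number DESC to the SQL statement (the same ordering as the Python sort key), so that the database returns the rows already in the right order.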
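
    A rough sketch of letting the database do the ordering, so that it matches the Python sort key (org_id, then main_cat, then number descending). An in-memory SQLite database stands in for MySQL here so the snippet runs on its own; the table name results and the reduced row set are assumptions for illustration only.

```python
import sqlite3

# Assumption: an in-memory SQLite table named "results" stands in for the MySQL table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results (main_cat TEXT, sub_cat TEXT, number INTEGER, org_id TEXT)"
)
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?, ?)",
    [
        ("Career", "school", 5, "A"),
        ("Career", "college", 3, "A"),
        ("Job", "Blr", 6, "A"),
        ("Job", "Hyd", 11, "A"),
        ("Career", "college", 30, "B"),
        ("Career", "school", 15, "B"),
    ],
)

# Same ordering as the Python sort key (x[3], x[0], -x[2]).
query = """
    SELECT main_cat, sub_cat, number, org_id
    FROM results
    ORDER BY org_id, main_cat, number DESC
"""
data = [list(row) for row in conn.execute(query)]
conn.close()

for row in data:
    print(row)
```

    With a real MySQL connection only the connection setup would differ; the ORDER BY clause is standard SQL.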

    Now you can use groupby to do the grouping, and islice to limit the number of results taken from each group:

    from itertools import groupby, islice
    from operator import itemgetter
    
    # already sorted data
    data = [
     ['Career', 'school', 5, 'A'],
     ['Career', 'higher', 4, 'A'],
     ['Career', 'college', 3, 'A'],
     ['Job', 'Chennai', 12, 'A'],
     ['Job', 'Hyd', 11, 'A'],
     ['Job', 'Blr', 6, 'A'],
     ['Career', 'college', 30, 'B'],
     ['Career', 'school', 15, 'B'],
     ['Career', 'higher', 5, 'B'],
     ['Job', 'Blr', 5, 'B'],
     ['Career', 'college', 8, 'C'],
     ['Job', 'Chennai', 4, 'C']
    ]
    
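    # the list above is already in this order, so sorting is a no-op here;
    # it is kept so the snippet also works on unsorted rows
    data.sort(key = lambda x: (x[3], x[0], -x[2]))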
    
    for org, by_org in groupby(data, key=itemgetter(3)):
        print("org:", org)
        for cat, by_cat in islice(groupby(by_org, key=itemgetter(0)), 2):
            print("  cat:", cat)
            for subcat, by_subcat in islice(groupby(by_cat, key=itemgetter(1)), 2):
                print("    subcat:", subcat, " = ", list(by_subcat))
    

    Output:

    org: A
      cat: Career
        subcat: school  =  [['Career', 'school', 5, 'A']]
        subcat: higher  =  [['Career', 'higher', 4, 'A']]
      cat: Job
        subcat: Chennai  =  [['Job', 'Chennai', 12, 'A']]
        subcat: Hyd  =  [['Job', 'Hyd', 11, 'A']]
    org: B
      cat: Career
        subcat: college  =  [['Career', 'college', 30, 'B']]
        subcat: school  =  [['Career', 'school', 15, 'B']]
      cat: Job
        subcat: Blr  =  [['Job', 'Blr', 5, 'B']]
    org: C
      cat: Career
        subcat: college  =  [['Career', 'college', 8, 'C']]
      cat: Job
        subcat: Chennai  =  [['Job', 'Chennai', 4, 'C']]
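
    Note that the loop above keeps the first two main_cat in sort order (alphabetical), which happens to cover every category in the sample data. If "top 2 main_cat" is instead meant as a ranking, one possible reading is to rank each main_cat by the sum of its number values within the organization. The sketch below follows that assumption; the function name top_per_org and its parameters are hypothetical, and it returns a nested dict rather than printing.

```python
from collections import defaultdict


def top_per_org(rows, n_cat=2, n_sub=2):
    """Return {org_id: {main_cat: [(sub_cat, number), ...]}}.

    Assumption: a main_cat is ranked by the sum of its numbers within the org.
    """
    nested = defaultdict(lambda: defaultdict(list))
    for main_cat, sub_cat, number, org_id in rows:
        nested[org_id][main_cat].append((sub_cat, number))

    result = {}
    for org_id, cats in nested.items():
        # Rank categories by total number, highest first, and keep n_cat of them.
        ranked = sorted(
            cats, key=lambda c: sum(n for _, n in cats[c]), reverse=True
        )[:n_cat]
        result[org_id] = {
            # Within each kept category, keep the n_sub largest sub_cat rows.
            cat: sorted(cats[cat], key=lambda pair: pair[1], reverse=True)[:n_sub]
            for cat in ranked
        }
    return result


data = [
    ['Career', 'school', 5, 'A'],
    ['Career', 'college', 3, 'A'],
    ['Career', 'higher', 4, 'A'],
    ['Job', 'Blr', 6, 'A'],
    ['Job', 'Hyd', 11, 'A'],
    ['Job', 'Chennai', 12, 'A'],
    ['Career', 'school', 15, 'B'],
    ['Career', 'college', 30, 'B'],
    ['Career', 'higher', 5, 'B'],
    ['Job', 'Blr', 5, 'B'],
    ['Career', 'college', 8, 'C'],
    ['Job', 'Chennai', 4, 'C'],
]

top = top_per_org(data)
for org_id, cats in top.items():
    print(org_id, cats)
```

    This version does not need pre-sorted input, since it collects rows into dictionaries before ranking them.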
    
        2
  •  0
  •   zipa    7 years ago

    You can use pandas to load the query result into a DataFrame with read_sql:

    import pandas as pd
    df = pd.read_sql(query, connection)  # read_sql takes the SQL first, then the connection
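    # Sort by number descending, then keep the top 2 rows per (org_id, main_cat);
    # grouping by sub_cat as well would leave one row per group and filter nothing.
    result = df.sort_values('number', ascending=False).groupby(['org_id', 'main_cat']).head(2)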
    

    Here connection is your connection to the database and query is your SELECT statement as a string.