How do I store nested JSON held in a list into a text file using Python?

  •  0
  • Shankar Panda  · Tech Community  · 6 years ago

    I am building a nested JSON structure and storing it in a list object. Below is my code, which produces the correct hierarchical JSON as expected.

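    Sample input (queryhive16273.csv):

    datasource,datasource_cnt,category,category_cnt,subcategory,subcategory_cnt
    Bureau of Labor Statistics,44,Employment and wages,44,Employment and wages,44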

    import pandas as pd

    df = pd.read_csv('queryhive16273.csv')


    def split_df(df):
        # One record per (datasource, datasource_cnt) pair
        for (vendor, count), df_vendor in df.groupby(["datasource", "datasource_cnt"]):
            yield {
                "vendor_name": vendor,
                "count": count,
                "categories": list(split_category(df_vendor)),
            }


    def split_category(df_vendor):
        # One record per (category, category_cnt) pair within a vendor
        for (category, count), df_category in df_vendor.groupby(
            ["category", "category_cnt"]
        ):
            yield {
                "name": category,
                "count": count,
                "subCategories": list(split_subcategory(df_category)),
            }


    def split_subcategory(df_category):
        # One record per (subcategory, subcategory_cnt) pair within a category
        for (subcategory, count), df_subcategory in df_category.groupby(
            ["subcategory", "subcategory_cnt"]
        ):
            yield {
                "count": count,
                "name": subcategory,
            }


    abc = list(split_df(df))
    

    abc contains the data shown below. This is the expected result.

    [{
        'count': 44,
        'vendor_name': 'Bureau of Labor Statistics',
        'categories': [{
            'count': 44,
            'name': 'Employment and wages',
            'subCategories': [{
                'count': 44,
                'name': 'Employment and wages'
            }]
        }]
    }]
    

    Now I am trying to store it in a JSON file.

    with open('your_file2.json', 'w') as f:
        for item in abc:
            f.write("%s\n" % item)
            #f.write(abc)
    

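    Can you help me? The file currently ends up with the first form below (a Python dict written with single quotes), but I need the second form (a JSON list with double quotes).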

    {
        'count': 44,
        'vendor_name': 'Bureau of Labor Statistics',
        'categories': [{
            'count': 44,
            'name': 'Employment and wages',
            'subCategories': [{
                'count': 44,
                'name': 'Employment and wages'
            }]
        }]
    }
    

    [{
        "count": 44,
        "vendor_name": "Bureau of Labor Statistics",
        "categories": [{
            "count": 44,
            "name": "Employment and wages",
            "subCategories": [{
                "count": 44,
                "name": "Employment and wages"
            }]
        }]
    }]
    
    2 answers  |  up to 6 years ago
        1
  •  1
  •   jlandercy    6 years ago

    Using your data with the json module from the Python Standard Library gives me:

    TypeError: Object of type 'int64' is not JSON serializable
    

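    This simply means that some numpy objects (here, the int64 counts) sit in your nested structure, and there is no encode method that converts them for JSON serialization.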
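
    To see the failure in isolation, here is a minimal sketch. It assumes the counts arrive as numpy.int64 scalars, which is what the error message above suggests; the variable names are illustrative only.

```python
import json

import numpy as np

# Assumption: this mimics the kind of scalar that ended up in the nested structure
count = np.int64(44)

try:
    json.dumps({"count": count})
    error_message = None
except TypeError as exc:
    # json only knows built-in types such as int, float, str, list and dict
    error_message = str(exc)

print(error_message)

# Casting to a built-in int first makes the same value serializable
plain = json.dumps({"count": int(count)})
print(plain)  # {"count": 44}
```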

    Forcing the encoder to fall back to string conversion whenever an object cannot be serialized natively is enough to make your code work:

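    import io
    import json

    import pandas as pd

    # Same sample row as in the question, read from an in-memory CSV
    d = io.StringIO("datasource,datasource_cnt,category,category_cnt,subcategory,subcategory_cnt\nBureau of Labor Statistics,44,Employment and wages,44,Employment and wages,44")
    df = pd.read_csv(d)

    # split_df is the generator defined in the question
    abc = list(split_df(df))

    # default=str is called for every object json cannot serialize natively
    json.dumps(abc, default=str)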
    

    It returns valid JSON (but with each int converted to a str):

    '[{"vendor_name": "Bureau of Labor Statistics", "count": "44", "categories": [{"name": "Employment and wages", "count": "44", "subCategories": [{"count": "44", "name": "Employment and wages"}]}]}]'
    

    If that does not suit your needs, use a dedicated Encoder:

    import numpy as np
    class MyEncoder(json.JSONEncoder):
        def default(self, obj):
            if isinstance(obj, np.int64):
                return int(obj)
            return json.JSONEncoder.default(self, obj)
    
    json.dumps(abc, cls=MyEncoder)
    

    This returns the requested JSON:

    '[{"vendor_name": "Bureau of Labor Statistics", "count": 44, "categories": [{"name": "Employment and wages", "count": 44, "subCategories": [{"count": 44, "name": "Employment and wages"}]}]}]'
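
    Since the question asks for a file rather than a string, the encoder approach can be combined with json.dump. The sketch below is one possible variant, not the answer's exact code: the class name NumpyEncoder is hypothetical, it widens the check from np.int64 to np.integer, np.floating and np.ndarray, and it writes to a temporary directory so the example does not touch your working directory.

```python
import json
import os
import tempfile

import numpy as np


class NumpyEncoder(json.JSONEncoder):
    """Hypothetical encoder: converts common numpy types to built-ins."""

    def default(self, obj):
        if isinstance(obj, np.integer):
            return int(obj)
        if isinstance(obj, np.floating):
            return float(obj)
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        # Anything else still raises TypeError, as in the base class
        return super().default(obj)


# Stand-in for the list built by split_df, with numpy counts
abc = [{
    "vendor_name": "Bureau of Labor Statistics",
    "count": np.int64(44),
    "categories": [{
        "name": "Employment and wages",
        "count": np.int64(44),
        "subCategories": [{"count": np.int64(44), "name": "Employment and wages"}],
    }],
}]

path = os.path.join(tempfile.mkdtemp(), "your_file2.json")
with open(path, "w") as f:
    json.dump(abc, f, cls=NumpyEncoder, indent=2)

# Read the file back to confirm it is valid JSON with integer counts
with open(path) as f:
    loaded = json.load(f)
```

    Passing the whole list to a single json.dump call, rather than writing item by item, is what keeps the outer square brackets in the file.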
    

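    Another option is to cast the data directly before encoding (the same cast applies to the count in split_df and split_subcategory):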

    def split_category(df_vendor):
       for (category, count), df_category in df_vendor.groupby(
           ["category", "category_cnt"]
       ):
           yield {
               "name": category,
               "count": int(count), # Cast here before encoding
               "subCategories": list(split_subcategory(df_category)),
           }
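
    If casting inside every generator feels repetitive, the conversion can also be done once on the finished structure. The helper below is a sketch under that assumption; to_builtin is a hypothetical name, and it relies on numpy scalars exposing .item() to return the matching built-in value.

```python
import json

import numpy as np


def to_builtin(obj):
    """Recursively replace numpy scalars with built-in Python values."""
    if isinstance(obj, dict):
        return {key: to_builtin(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [to_builtin(item) for item in obj]
    if isinstance(obj, np.generic):
        # np.generic is the base class of all numpy scalar types
        return obj.item()
    return obj


# Stand-in for the list built by split_df, with numpy counts
nested = [{
    "vendor_name": "Bureau of Labor Statistics",
    "count": np.int64(44),
    "categories": [{
        "name": "Employment and wages",
        "count": np.int64(44),
        "subCategories": [{"count": np.int64(44), "name": "Employment and wages"}],
    }],
}]

converted = to_builtin(nested)
text = json.dumps(converted)  # no custom encoder needed after conversion
```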
    
        2
  •  0
  •   buran    6 years ago
    import json
    
    data = [{
        'count': 44,
        'vendor_name': 'Bureau of Labor Statistics',
        'categories': [{
            'count': 44,
            'name': 'Employment and wages',
            'subCategories': [{
                'count': 44,
                'name': 'Employment and wages'
            }]
        }]
    }]
    
    with open('your_file2.json', 'w') as f:
        json.dump(data, f, indent=2)
    

    [
      {
        "count": 44,
        "vendor_name": "Bureau of Labor Statistics",
        "categories": [
          {
            "count": 44,
            "name": "Employment and wages",
            "subCategories": [
              {
                "count": 44,
                "name": "Employment and wages"
              }
            ]
          }
        ]
      }
    ]
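
    For context on why the loop in the question wrote single quotes: "%s" % item formats each dict with str(), which produces Python's repr rather than JSON. A minimal sketch of the difference, using a shortened record for illustration:

```python
import json

item = {"count": 44, "name": "Employment and wages"}

as_repr = "%s" % item       # Python repr of the dict: single quotes
as_json = json.dumps(item)  # JSON text: double quotes

print(as_repr)  # {'count': 44, 'name': 'Employment and wages'}
print(as_json)  # {"count": 44, "name": "Employment and wages"}

# The repr form is not valid JSON, so a JSON parser rejects it
try:
    json.loads(as_repr)
    repr_is_valid_json = True
except json.JSONDecodeError:
    repr_is_valid_json = False
```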