
Changing row values in a JSON dataset under a specific condition, using the row's own value

  •  3
  • Aso Strife  · Tech Community  · 7 years ago

    {
        "date": "2018-01-01", 
        "body": "some txt", 
        "id": 111, 
        "sentiment": null
    }, 
    {
        "date": "2018-01-02", 
        "body": "some txt", 
        "id": 112, 
        "sentiment": {
            "basic": "Bearish"
        }
    }
    

    I want to read this with pandas and change the sentiment column for every row where it is not null.

    pd.read_json(path)
    

    This is the result I get:

    body           ...    sentiment
    0                      None
    1                      {u'basic': u'Bullish'}
    

    I want to replace {u'basic': u'Bullish'} with just the value of basic. To select the right rows, I use

    df.loc[self.df['sentiment'].isnull() != True, 'sentiment'] = (?)
    

    I tried this, but it does not work:

    df.loc[self.df['sentiment'].isnull() != True, 'sentiment'] = df['sentiment']['basic']
    

    Any ideas? Thanks.

    3 Answers  |  as of 7 years ago
        1
  •  3
  •   paulo.filip3    7 years ago

    You can try:

    mask = df['sentiment'].notnull()
    df.loc[mask, 'sentiment'] = df.loc[mask, 'sentiment'].apply(lambda x: x['basic'])
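
As a self-contained sketch of the approach above, the masked `apply` can be run on a small frame. The sample frame is an assumption modeled on the question's data. The second variant uses `Series.str.get`, which is not part of this answer; it indexes into dict elements and leaves null rows as NaN.

```python
import pandas as pd

# Sample frame modeled on the question's data (illustrative values).
df = pd.DataFrame({
    "id": [111, 112],
    "sentiment": [None, {"basic": "Bearish"}],
})

# Variant 1: mask the non-null rows, then pull out the 'basic' key.
by_apply = df.copy()
mask = by_apply["sentiment"].notnull()
by_apply.loc[mask, "sentiment"] = by_apply.loc[mask, "sentiment"].apply(
    lambda d: d["basic"]
)

# Variant 2: Series.str.get looks up a key in each dict and skips null rows.
by_str_get = df.copy()
by_str_get["sentiment"] = by_str_get["sentiment"].str.get("basic")

print(by_apply)
print(by_str_get)
```

Both variants leave row 0 null and put the plain string in row 1. The original attempt fails because `df['sentiment']['basic']` looks for an index label named 'basic' in the column rather than a key inside each dict.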
    
        2
  •  2
  •   Deepak Saini    7 years ago

    You can do it like this:

    df = pd.read_json(path)  # creates the dataframe with dict objects in the sentiment column
    df = pd.concat([df.drop(['sentiment'], axis=1), df['sentiment'].apply(pd.Series)], axis=1)  # creates a new column for each sentiment key
    

    For this sample input:

    [{
        "date": "2018-01-01", 
        "body": "some txt", 
        "id": 111, 
        "sentiment": null
    }, 
    {
        "date": "2018-01-02", 
        "body": "some txt", 
        "id": 112, 
        "sentiment": {
            "basic": "Bearish"
        }
    },
    {
        "date": "2018-01-03", 
        "body": "some other txt", 
        "id": 113, 
        "sentiment": {
            "basic" : "Bullish",
            "non_basic" : "Bearish"
        }
    }]
    

    df after line 1:

                 body       date   id                                     sentiment
    0        some txt 2018-01-01  111                                          None
    1        some txt 2018-01-02  112                          {'basic': 'Bearish'}
    2  some other txt 2018-01-03  113  {'basic': 'Bullish', 'non_basic': 'Bearish'}
    

    df after line 2:

                 body       date   id    basic non_basic
    0        some txt 2018-01-01  111      NaN       NaN
    1        some txt 2018-01-02  112  Bearish       NaN
    2  some other txt 2018-01-03  113  Bullish   Bearish
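
Since `path` is not given in the thread, here is a hedged, self-contained version of the same two steps that reads the sample input from an in-memory string. The names `json_text` and `expanded` are made up for this sketch.

```python
import io

import pandas as pd

# The sample input from this answer, held in memory instead of a file.
json_text = """
[{"date": "2018-01-01", "body": "some txt", "id": 111, "sentiment": null},
 {"date": "2018-01-02", "body": "some txt", "id": 112,
  "sentiment": {"basic": "Bearish"}},
 {"date": "2018-01-03", "body": "some other txt", "id": 113,
  "sentiment": {"basic": "Bullish", "non_basic": "Bearish"}}]
"""

# Step 1: the sentiment column holds dicts (or None).
df = pd.read_json(io.StringIO(json_text))

# Step 2: expand each dict into its own columns and drop the original column.
expanded = pd.concat(
    [df.drop(["sentiment"], axis=1), df["sentiment"].apply(pd.Series)],
    axis=1,
)

print(expanded)
```

Rows whose sentiment is null, or whose dict lacks a given key, end up with NaN in the corresponding new column.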
    


        3
  •  0
  •   jpp    7 years ago

    fillna + pop + join

    Here is an extensible solution that avoids a row-wise apply and converts an arbitrary number of keys into series:

    import pandas as pd

    df = pd.DataFrame({'body': [0, 1],
                       'sentiment': [None, {u'basic': u'Bullish'}]})
    
    df['sentiment'] = df['sentiment'].fillna(pd.Series([{}]*len(df.index), index=df.index))
    
    df = df.join(pd.DataFrame(df.pop('sentiment').values.tolist()))
    
    print(df)
    
       body    basic
    0     0      NaN
    1     1  Bullish
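
One point worth noting: `join` aligns on the index, and a frame built from a plain list gets a fresh 0..n-1 index, so the snippet above relies on `df` having the default index. A minimal sketch for a frame with a custom index follows; the labels 'a' and 'b' and the names `records`, `expanded`, and `result` are assumptions for illustration.

```python
import pandas as pd

# Same data as above, but with a non-default index (labels are made up).
df = pd.DataFrame(
    {"body": [0, 1], "sentiment": [None, {"basic": "Bullish"}]},
    index=["a", "b"],
)

# Remove the column and replace nulls with empty dicts.
sentiment = df.pop("sentiment")
records = [d if isinstance(d, dict) else {} for d in sentiment]

# Reuse df's index so that join lines up the rows correctly.
expanded = pd.DataFrame(records, index=df.index)
result = df.join(expanded)

print(result)
```

Passing `index=df.index` keeps the expanded columns attached to the right rows regardless of how `df` is indexed.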