
How to find the difference between two dates in years

  •  0
  • HHH  · Tech Community  · 6 years ago

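    Two columns in my DataFrame have already been converted to datetime. I am trying to subtract them to find the difference in years. This is the code I am using: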

    from dateutil.relativedelta import relativedelta
    difference_in_years = relativedelta(x['start'], x['end']).year
    

    However, I get the following error message:

    ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
    

    What is the problem?

    4 answers  |  as of 6 years ago
        1
  •  2
  •   jezrael    6 years ago

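    relativedelta works on scalar dates, not on whole Series, which is why passing the two columns raises the ValueError. Use the .years attribute (plural, not .year) with apply and axis=1 to process the frame row by row: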

    import pandas as pd
    
    df = pd.DataFrame({'start':['2015-10-02','2014-11-05'],
                       'end':['2018-01-02','2018-10-05']})
    
    df['start'] = pd.to_datetime(df['start'])
    df['end'] = pd.to_datetime(df['end'])
    
    from dateutil.relativedelta import relativedelta
    
    df['y'] = df.apply(lambda x: relativedelta(x['end'], x['start']).years, axis=1)
    

    Or use a list comprehension:

    df['y'] = [relativedelta(i, j).years for i, j in zip(df['end'], df['start'])]
    

    print (df)
           start        end  y
    0 2015-10-02 2018-01-02  2
    1 2014-11-05 2018-10-05  3
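
    To see why the original call fails and the row-wise version works, here is a minimal sketch on the same sample data. The helper name years_between is hypothetical and not part of dateutil.

```python
import pandas as pd
from dateutil.relativedelta import relativedelta

df = pd.DataFrame({
    'start': pd.to_datetime(['2015-10-02', '2014-11-05']),
    'end': pd.to_datetime(['2018-01-02', '2018-10-05']),
})

# Passing whole columns fails: relativedelta expects scalar dates, and
# testing the truth value of a Series raises the ValueError from the question.
try:
    relativedelta(df['end'], df['start'])
    failed = False
except ValueError:
    failed = True

def years_between(start, end):
    """Hypothetical helper: whole calendar years from start to end (scalars)."""
    return relativedelta(end, start).years

# Calling the helper once per row avoids the ambiguity.
df['y'] = [years_between(s, e) for s, e in zip(df['start'], df['end'])]
print(failed, df['y'].tolist())
```

    Note that .years is the relative difference, while .year (as used in the question) is an absolute field that stays unset when two dates are subtracted.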
    

    Edit: if some dates are missing, compute the difference only for rows where both dates are present:

    import numpy as np
    import pandas as pd
    
    df = pd.DataFrame({'start':['2015-10-02','2014-11-05'],
                       'end':['2018-01-02',np.nan]})
    
    df['start'] = pd.to_datetime(df['start'])
    df['end'] = pd.to_datetime(df['end'])
    
    from dateutil.relativedelta import relativedelta
    
    m = df[['start','end']].notnull().all(axis=1)
    df.loc[m, 'y'] = df[m].apply(lambda x: relativedelta(x['end'], x['start']).years, axis=1)
    print (df)
           start        end    y
    0 2015-10-02 2018-01-02  2.0
    1 2014-11-05        NaT  NaN
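
    The masked assignment leaves NaN in the skipped rows, so y becomes a float column. If integer output matters, one possible variation (a sketch, not part of the original answer) is pandas' nullable Int64 dtype. The helper whole_years is hypothetical.

```python
import pandas as pd
from dateutil.relativedelta import relativedelta

df = pd.DataFrame({
    'start': pd.to_datetime(['2015-10-02', '2014-11-05']),
    'end': pd.to_datetime(['2018-01-02', None]),  # second end date is missing (NaT)
})

def whole_years(start, end):
    """Hypothetical helper: whole years, or pd.NA if either date is missing."""
    if pd.isna(start) or pd.isna(end):
        return pd.NA
    return relativedelta(end, start).years

# The nullable integer dtype keeps 2 as an integer next to a missing value.
df['y'] = pd.array(
    [whole_years(s, e) for s, e in zip(df['start'], df['end'])],
    dtype='Int64',
)
print(df)
```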
    
        2
  •  1
  •   Jorge    6 years ago

    See this answer: calculate the difference between two datetime.date() dates in years and months

    from dateutil import relativedelta as rdelta
    from datetime import date
    d1 = date(2001,5,1)
    d2 = date(2012,1,1)
    rd = rdelta.relativedelta(d2,d1)
    print(rd)
    # relativedelta(years=+10, months=+8)
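
    The individual components can be read off the result. A short sketch using the same two dates; the variable names are illustrative only:

```python
from datetime import date
from dateutil.relativedelta import relativedelta

d1 = date(2001, 5, 1)
d2 = date(2012, 1, 1)
rd = relativedelta(d2, d1)

whole_years = rd.years                    # completed years only
total_months = rd.years * 12 + rd.months  # completed months overall
fractional_years = total_months / 12      # ignores any leftover days
print(whole_years, rd.months, total_months)
```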
    
        3
  •  0
  •   jpp    6 years ago

    You can divide a timedelta series by a one-year unit and round the result if necessary:

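    # data from jezrael
    import numpy as np
    import pandas as pd
    
    # 365.2425 days is the mean Gregorian year, the length NumPy assigns to its 'Y' unit;
    # recent pandas versions reject np.timedelta64(1, 'Y') as ambiguous
    df['years'] = (df['end'] - df['start']) / pd.Timedelta(days=365.2425)
    df['years_floor'] = np.floor(df['years'])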
    
    print(df)
    
           start        end     years  years_floor
    0 2015-10-02 2018-01-02  2.253297          2.0
    1 2014-11-05        NaT       NaN          NaN
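
    Dividing by a fixed year length is an approximation, since calendar years differ in length. If exact calendar years are needed without a row-wise apply, one possible vectorized sketch compares the date parts directly. It assumes no missing values, and for a Feb 29 start date it may differ from relativedelta's result. The third row is an extra illustrative date pair, not from the thread.

```python
import pandas as pd

df = pd.DataFrame({
    'start': pd.to_datetime(['2015-10-02', '2014-11-05', '2015-01-01']),
    'end': pd.to_datetime(['2018-01-02', '2018-10-05', '2016-01-01']),
})

start, end = df['start'], df['end']

# True where the end date falls before the anniversary of the start date
before_anniversary = (end.dt.month < start.dt.month) | (
    (end.dt.month == start.dt.month) & (end.dt.day < start.dt.day)
)

# Year difference, minus one if the last anniversary has not been reached yet
df['whole_years'] = end.dt.year - start.dt.year - before_anniversary.astype(int)
print(df)
```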
    
        4
  •  0
  •   ayorgo    6 years ago

    You can do this with

    # assumes a mean year of 365.2425 days; newer pandas versions reject the ambiguous 'Y' unit
    (df['end'] - df['start'])/pd.Timedelta(days=365.2425)
    

    and round the result if needed.

    From pandas v0.23.4 onward you can use floor division

    # assumes a mean year of 365.2425 days; newer pandas versions reject the ambiguous 'Y' unit
    (df['end'] - df['start'])//pd.Timedelta(days=365.2425)
    

    to get the difference in whole years directly.
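
    One caveat: floor division by a fixed year length can undercount when the span is an exact calendar anniversary. A small sketch of that boundary case, assuming a mean year of 365.2425 days (the dates are illustrative, not from the thread):

```python
import pandas as pd
from dateutil.relativedelta import relativedelta

start = pd.Timestamp('2015-01-01')
end = pd.Timestamp('2016-01-01')

# 365 elapsed days is slightly shorter than a mean year of 365.2425 days,
# so floor division reports zero whole years.
approx_years = (end - start) // pd.Timedelta(days=365.2425)

# relativedelta counts calendar anniversaries instead.
calendar_years = relativedelta(end, start).years

print(approx_years, calendar_years)
```

    If results must agree with calendar anniversaries, the relativedelta approach from the first answer is the safer choice.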