
Tricky offset of dates within DataFrame rows using Pandas

group-by datetime numpy pandas python

Lynn · Tech Community · 2 years ago

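I would like to append a new row, with dates that continue the sequence, at the end of each category in a column. For example, grouped by the 'Area' column.

Data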

   Date         Start       End         Area    ID  Stat
   1/1/2022     2/1/2022    3/1/2022    NY      222 Y
   2/1/2022     3/1/2022    4/1/2022    NY      111 Y
   1/1/2022     2/1/2022    3/1/2022    CA      333 Y
   2/1/2022     3/1/2022    4/1/2022    CA      100 Y

Desired

 Date          Start        End         Area    ID  Stat
 1/1/2022      2/1/2022     3/1/2022    NY      222 Y
 2/1/2022      3/1/2022     4/1/2022    NY      111 Y
 3/1/2022      4/1/2022     5/1/2022    NY      
 1/1/2022      2/1/2022     3/1/2022    CA      333 Y
 2/1/2022      3/1/2022     4/1/2022    CA      100 Y
 3/1/2022      4/1/2022     5/1/2022    CA

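Attempt

An SO member helped me write this code. It only partially works for me, because the dates in the new rows are not offset as desired: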

# Convert the cols to datetime
c = ['Start', 'End']
df[c] = df[c].apply(pd.to_datetime, dayfirst=True)

# Drop duplicate rows by Area, keeping only the last row of each Area
rows = df[[*c, 'Area']].drop_duplicates('Area', keep='last')

# Add a dateoffset of 1 day
rows[c] += pd.DateOffset(days=1)

# Concat the rows and sort index to maintain order
pd.concat([df, rows]).sort_index(ignore_index=True)

Any suggestions are welcome. I am currently researching this.
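
For reference, here is a minimal sketch of one way the snippet above might be adjusted. It rests on two assumptions. First, the likely reason the new rows come out without an offset 'Date' is that 'Date' is not in the list of converted columns, so it is neither parsed nor carried into the new rows. Second, the dates are day-first, as `dayfirst=True` in the snippet implies, which makes the step between rows one day. If the dates are actually month-first, `pd.DateOffset(months=1)` without `dayfirst=True` would be the analogous choice. The names `date_cols` and `out` are illustrative only.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Date":  ["1/1/2022", "2/1/2022", "1/1/2022", "2/1/2022"],
    "Start": ["2/1/2022", "3/1/2022", "2/1/2022", "3/1/2022"],
    "End":   ["3/1/2022", "4/1/2022", "3/1/2022", "4/1/2022"],
    "Area":  ["NY", "NY", "CA", "CA"],
    "ID":    [222, 111, 333, 100],
    "Stat":  ["Y", "Y", "Y", "Y"],
})

# Convert all three date columns, not only Start and End
# (assumption: day-first dates, as in the original snippet)
date_cols = ["Date", "Start", "End"]
for col in date_cols:
    df[col] = pd.to_datetime(df[col], dayfirst=True)

# Take the last row of each Area, keeping only the date columns and Area
rows = df[[*date_cols, "Area"]].drop_duplicates("Area", keep="last")

# Shift every date column of the new rows forward by one day
for col in date_cols:
    rows[col] = rows[col] + pd.DateOffset(days=1)

# Each new row shares its index label with the last row of its Area,
# so a stable sort on the index places it directly after that row.
# ID and Stat are absent from `rows`, so they are NaN in the new rows.
out = pd.concat([df, rows]).sort_index(kind="mergesort", ignore_index=True)

print(out)
```

A stable sort (`kind="mergesort"`) is used because the default sort algorithm does not guarantee the relative order of duplicate index labels, and the new row has to stay after the row it was derived from.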