One idea is to use merge_asof, but the last row is different:
main_df['created_at'] = pd.to_datetime(main_df['created_at'])
aux_df['created_at'] = pd.to_datetime(aux_df['created_at'])
df = pd.merge_asof(aux_df[['created_at']], main_df, on=['created_at'])
print(df)
                 created_at  value    feed_id
0 2019-03-06 07:35:33-05:00    NaN        NaN
1 2019-03-06 07:36:34-05:00    NaN        NaN
2 2019-03-06 07:37:36-05:00    NaN        NaN
3 2019-03-06 07:38:36-05:00    0.0  1010077.0
4 2019-03-06 07:39:37-05:00    1.0  1010077.0
5 2019-03-06 07:40:38-05:00    1.0  1010077.0
6 2019-03-06 07:41:38-05:00    1.0  1010077.0
7 2019-03-06 07:42:39-05:00    1.0  1010077.0
8 2019-03-06 07:43:40-05:00    1.0  1010077.0
9 2019-03-06 07:44:41-05:00    1.0  1010077.0
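The last row is filled because merge_asof matches backward by default: each left row takes the most recent right row whose key is less than or equal to its own, however far back that row lies. Below is a minimal sketch of that behaviour, assuming small made-up frames (`left`, `right`) rather than the data from the question. It also shows the `tolerance` parameter, which may help if distant matches are unwanted.

```python
import pandas as pd

# Hypothetical sample data, not the frames from the question.
left = pd.DataFrame({
    "created_at": pd.to_datetime([
        "2019-03-06 07:37:36",
        "2019-03-06 07:38:36",
        "2019-03-06 07:44:41",
    ])
})
right = pd.DataFrame({
    "created_at": pd.to_datetime([
        "2019-03-06 07:38:18",
        "2019-03-06 07:43:56",
    ]),
    "value": [0, 1],
})

# Default direction='backward': each left row takes the last right row
# whose key is <= the left key. Both frames must be sorted by the key.
backward = pd.merge_asof(left, right, on="created_at")

# tolerance caps how far back a match may lie; rows with no right row
# inside the window get NaN instead.
limited = pd.merge_asof(
    left, right, on="created_at", tolerance=pd.Timedelta("30s")
)

print(backward)
print(limited)
```

With a 30-second tolerance the last left row no longer matches, because the nearest earlier right row is 45 seconds away.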
Another is to use Series.dt.floor to round the timestamps down to the minute:
main_df['created_at'] = pd.to_datetime(main_df['created_at'])
aux_df['created_at'] = pd.to_datetime(aux_df['created_at'])
main_df['created_at_2'] = main_df.created_at.dt.floor('min')
aux_df['created_at_2'] = aux_df.created_at.dt.floor('min')
df = pd.merge(aux_df[['created_at_2']], main_df, on=['created_at_2'], how='left')
print(df)
               created_at_2  value    feed_id                created_at
0 2019-03-06 07:35:00-05:00    NaN        NaN                       NaT
1 2019-03-06 07:36:00-05:00    NaN        NaN                       NaT
2 2019-03-06 07:37:00-05:00    NaN        NaN                       NaT
3 2019-03-06 07:38:00-05:00    0.0  1010077.0 2019-03-06 07:38:18-05:00
4 2019-03-06 07:39:00-05:00    1.0  1010077.0 2019-03-06 07:39:26-05:00
5 2019-03-06 07:40:00-05:00    1.0  1010077.0 2019-03-06 07:40:33-05:00
6 2019-03-06 07:41:00-05:00    1.0  1010077.0 2019-03-06 07:41:41-05:00
7 2019-03-06 07:42:00-05:00    1.0  1010077.0 2019-03-06 07:42:49-05:00
8 2019-03-06 07:43:00-05:00    1.0  1010077.0 2019-03-06 07:43:56-05:00
9 2019-03-06 07:44:00-05:00    NaN        NaN                       NaT
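For reference, the floor-and-merge approach can be reproduced on a small scale. This is only a sketch under assumed data: the frames `main` and `aux` and the helper column name `minute` are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data, not the frames from the question.
main = pd.DataFrame({
    "created_at": pd.to_datetime([
        "2019-03-06 07:38:18",
        "2019-03-06 07:39:26",
    ]),
    "value": [0, 1],
})
aux = pd.DataFrame({
    "created_at": pd.to_datetime([
        "2019-03-06 07:37:36",
        "2019-03-06 07:38:36",
        "2019-03-06 07:39:37",
    ])
})

# Truncate both timestamp columns to the start of their minute so that
# rows from the same minute share an exact join key.
main["minute"] = main["created_at"].dt.floor("min")
aux["minute"] = aux["created_at"].dt.floor("min")

# Left merge keeps every aux row; minutes without a main row get NaN/NaT.
merged = aux[["minute"]].merge(main, on="minute", how="left")

print(merged)
```

Note that if several rows of the right-hand frame fall within the same minute, a left merge returns one output row per match, so the result can have more rows than the left frame.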