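I think the problem is the default parameter in np.select (default=df['logout_datetime']), which does not match the type of the other choices. Change it to default=df['logout_datetime'].dt.date so that all values returned from np.select have the same type: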
df['logout_date'] = np.select([m1 & m3, m2 & m3],
                              [df['login_date'].dt.date,
                               (df['login_date'] + pd.DateOffset(days=2)).dt.date],
                              default=df['logout_date'].dt.date)
print(df)
   person_id person_type          login_date logout_date
0        101           A 2013-05-07 09:27:00  2013-05-09
1        101           A 2013-09-08 11:21:00  2013-11-08
2        101           B 2014-06-06 08:00:00  2014-06-06
3        101           C 2014-06-06 05:00:00  2014-06-06
4        202           D 2011-12-11 10:00:00  2011-12-11
5        202           B 2012-10-13 00:00:00  2012-10-13
6        202           A 2012-12-13 11:45:00  2012-12-15
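Since m1, m2 and m3 come from the question and are not shown here, the following is only a minimal self-contained sketch of the same idea. The sample rows and the three masks are my own assumptions for illustration. It shows that when every choice and the default use .dt.date, np.select returns an object array of datetime.date values:

```python
import datetime

import numpy as np
import pandas as pd

# Hypothetical sample data (not the data from the question).
df = pd.DataFrame({
    "person_type": ["A", "B", "C"],
    "login_date": pd.to_datetime(
        ["2013-05-07 09:27:00", "2014-06-06 08:00:00", "2014-06-06 05:00:00"]
    ),
    "logout_date": pd.to_datetime(
        ["2013-05-07 00:00:00", "2014-06-06 00:00:00", "2014-06-08 05:00:00"]
    ),
})

# Hypothetical masks standing in for m1, m2, m3 from the question.
m1 = df["person_type"].eq("B")
m2 = df["person_type"].eq("A")
m3 = df["logout_date"].eq(df["logout_date"].dt.normalize())  # logout at midnight

# Every choice and the default are datetime.date objects, so the types match.
result = np.select(
    [m1 & m3, m2 & m3],
    [df["login_date"].dt.date,
     (df["login_date"] + pd.DateOffset(days=2)).dt.date],
    default=df["logout_date"].dt.date,
)

print(result.dtype)
print(result)
```

The result is a plain NumPy object array, so assigning it back gives a column of Python date objects rather than a datetime64 column.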
Or, to keep datetimes, use Series.dt.normalize, which removes the times (sets them to 00:00:00). All values are then datetimes, so it works well:
df['logout_date'] = np.select([m1 & m3, m2 & m3],
                              [df['login_date'].dt.normalize(),
                               (df['login_date'] + pd.DateOffset(days=2)).dt.normalize()],
                              default=df['logout_date'])
print(df)
   person_id person_type          login_date         logout_date
0        101           A 2013-05-07 09:27:00 2013-05-09 00:00:00
1        101           A 2013-09-08 11:21:00 2013-11-08 11:21:00
2        101           B 2014-06-06 08:00:00 2014-06-06 00:00:00
3        101           C 2014-06-06 05:00:00 2014-06-06 05:00:00
4        202           D 2011-12-11 10:00:00 2011-12-11 00:00:00
5        202           B 2012-10-13 00:00:00 2012-10-13 00:00:00
6        202           A 2012-12-13 11:45:00 2012-12-15 00:00:00
If the times should be kept, use the datetime columns without normalize:
df['logout_date'] = np.select([m1 & m3, m2 & m3],
                              [df['login_date'],
                               (df['login_date'] + pd.DateOffset(days=2))],
                              default=df['logout_date'])
print(df)
   person_id person_type          login_date         logout_date
0        101           A 2013-05-07 09:27:00 2013-05-09 09:27:00
1        101           A 2013-09-08 11:21:00 2013-11-08 11:21:00
2        101           B 2014-06-06 08:00:00 2014-06-06 08:00:00
3        101           C 2014-06-06 05:00:00 2014-06-06 05:00:00
4        202           D 2011-12-11 10:00:00 2011-12-11 10:00:00
5        202           B 2012-10-13 00:00:00 2012-10-13 00:00:00
6        202           A 2012-12-13 11:45:00 2012-12-15 11:45:00
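To try the two datetime variants without the original data, here is a small sketch. The sample rows and the masks m1, m2, m3 are assumptions of mine, not the conditions from the question. It checks that both results stay datetime64 as long as every choice and the default are datetimes:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (not the data from the question).
df = pd.DataFrame({
    "person_type": ["A", "B", "C"],
    "login_date": pd.to_datetime(
        ["2013-05-07 09:27:00", "2014-06-06 08:00:00", "2014-06-06 05:00:00"]
    ),
    "logout_date": pd.to_datetime(
        ["2013-05-07 00:00:00", "2014-06-06 00:00:00", "2014-06-08 05:00:00"]
    ),
})

# Hypothetical masks standing in for m1, m2, m3 from the question.
m1 = df["person_type"].eq("B")
m2 = df["person_type"].eq("A")
m3 = df["logout_date"].eq(df["logout_date"].dt.normalize())  # logout at midnight

conds = [m1 & m3, m2 & m3]
shifted = df["login_date"] + pd.DateOffset(days=2)

# Variant 1: times reset to 00:00:00 in the selected rows.
normalized = np.select(
    conds,
    [df["login_date"].dt.normalize(), shifted.dt.normalize()],
    default=df["logout_date"],
)

# Variant 2: times from login_date are kept.
with_times = np.select(
    conds,
    [df["login_date"], shifted],
    default=df["logout_date"],
)

out = df.assign(logout_normalized=normalized, logout_with_times=with_times)
print(out.dtypes)
print(out)
```

Rows that match neither condition keep their original logout_date, including its time, because the default is not normalized.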