My organization has account numbers made up of several fields joined together. The last field is always 4 characters long (usually 0000).
Org Account
01 01-123-0000
01 01-456-0000
02 02-789-0000
02 02-456-0000
03 03-987-0000
03 03-123-1234
I also have a dictionary that maps each Org to the number of characters its last component should have.
MAP = {'01': 4, '02': 3, '03': 3}
However, Org 03 also has a special mapping:
D03_SPECIAL_MAP = {'0000': '012', '1234': '123'}
My code to update the last component is:
for i, r in df.iterrows():
    updated = False  # Keep track of whether we have updated this row
    # Split off the last component from the rest of the account
    Acct, last_comp = r['Account'].rsplit('-', 1)
    # Check if we need to update the code length and the current length does not match
    if r['Org'] in MAP and len(last_comp) != MAP[r['Org']]:
        width = MAP[r['Org']]
        # zfill only pads, so trim to the target width first (keeping the rightmost characters)
        df.at[i, 'Account'] = Acct + "-" + last_comp[-width:].zfill(width)
        updated = True
    # Special mapping for Org 03
    if r['Org'] == '03' and last_comp in D03_SPECIAL_MAP:
        df.at[i, 'Account'] = Acct + "-" + D03_SPECIAL_MAP[last_comp]
        updated = True
    if not updated:  # Keep the default if we have not hit either of the conditions above
        df.at[i, 'Account'] = Acct + "-" + last_comp
The output would be:
Org Account
01 01-123-0000
01 01-456-0000
02 02-789-000
02 02-456-000
03 03-987-012
03 03-123-123
My code works as expected, but it is a bit slow because it checks every record one at a time. Is there a way to do the same thing without using df.iterrows()?
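For illustration, here is a rough sketch of one direction that avoids df.iterrows(), not a definitive answer. It rebuilds the sample data shown above, splits every account once with Series.str.rsplit, then loops over the few entries of MAP (not over rows) and applies each width with a boolean mask. The names parts, head, last, new_last, is_03 and special are made up for this sketch. Trimming to the rightmost characters before padding is an assumption, chosen only because it reproduces the output listed above.

```python
import pandas as pd

# Sample data and mappings copied from the question.
df = pd.DataFrame({
    "Org": ["01", "01", "02", "02", "03", "03"],
    "Account": ["01-123-0000", "01-456-0000", "02-789-0000",
                "02-456-0000", "03-987-0000", "03-123-1234"],
})
MAP = {"01": 4, "02": 3, "03": 3}
D03_SPECIAL_MAP = {"0000": "012", "1234": "123"}

# Split every account once into the leading part and the last component.
parts = df["Account"].str.rsplit("-", n=1, expand=True)
head, last = parts[0], parts[1]

# Apply the width rule one Org at a time. This loops over the handful of
# Orgs in MAP, not over rows. Assumption: keep the rightmost n characters,
# then left-pad with zeros to n.
new_last = last.copy()
for org, n in MAP.items():
    mask = df["Org"].eq(org)
    new_last.loc[mask] = last[mask].str[-n:].str.zfill(n)

# The special mapping for Org 03 overrides the width rule where it matches.
# It is looked up on the original last component, as in the loop version.
is_03 = df["Org"].eq("03")
special = last[is_03].map(D03_SPECIAL_MAP).dropna()
new_last.loc[special.index] = special

df["Account"] = head + "-" + new_last
print(df)
```

Orgs that are missing from MAP keep their last component unchanged, which matches the default branch of the loop version. The final assignment by special.index assumes the DataFrame index is unique.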