I run the following on two of my machines:
import os, sqlite3
import pandas as pd
from feat_transform import filter_anevexp
db_path = r'C:\Users\timregan\Desktop\anondb_280718.sqlite3'
db = sqlite3.connect(db_path)
anevexp_df = filter_anevexp(db, 0)
On my laptop (8 GB of RAM) this runs without problems (although filter_anevexp takes a few minutes). On my desktop (128 GB of RAM) it fails with a MemoryError:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\timregan\source\MentalHealth\code\preprocessing\feat_transform.py", line 171, in filter_anevexp
    anevexp_df = anevexp_df[anevexp_df["user_id"].isin(df)].copy()
  File "C:\Users\timregan\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\frame.py", line 2682, in __getitem__
    return self._getitem_array(key)
  File "C:\Users\timregan\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\frame.py", line 2724, in _getitem_array
    return self._take(indexer, axis=0)
  File "C:\Users\timregan\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\generic.py", line 2789, in _take
    verify=True)
  File "C:\Users\timregan\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\internals.py", line 4539, in take
    axis=axis, allow_dups=True)
  File "C:\Users\timregan\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\internals.py", line 4425, in reindex_indexer
    for blk in self.blocks]
  File "C:\Users\timregan\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\internals.py", line 4425, in <listcomp>
    for blk in self.blocks]
  File "C:\Users\timregan\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\internals.py", line 1258, in take_nd
    allow_fill=True, fill_value=fill_value)
  File "C:\Users\timregan\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\algorithms.py", line 1655, in take_nd
    out = np.empty(out_shape, dtype=dtype)
MemoryError
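Is there anything special I need to do on the machine with more memory to prevent this error (for example, something related to addressing)?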
Note: I have not included the code for filter_anevexp.
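One thing that may be worth ruling out: the install path in the traceback contains Python37-32, which suggests (but does not prove) that the desktop is running a 32-bit build of Python. A 32-bit process on Windows can only address a few GB of memory no matter how much RAM is installed. The sketch below uses only the standard library to report the pointer width of the running interpreter; the helper name interpreter_bits is made up for illustration and is not part of feat_transform.

```python
import struct
import sys


def interpreter_bits():
    """Return the pointer width of the running Python build, in bits."""
    # struct.calcsize("P") is the size in bytes of a C pointer (void *).
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    # sys.maxsize is 2**31 - 1 on a 32-bit build and 2**63 - 1 on a 64-bit build.
    print(f"{interpreter_bits()}-bit Python, sys.maxsize = {sys.maxsize}")
```

Running this on both machines would show whether the laptop and the desktop are using the same kind of build. If the desktop reports 32 bits, the amount of installed RAM is probably not the limiting factor.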