The BigARTM Score Tracker API changed slightly between v0.7.9 and v0.8.0. The following example should work with v0.8.0:
import artm

# Load the pre-built batches of the collection.
batch_vectorizer = artm.BatchVectorizer(data_path=r'D:\Datasets\kos',
                                        data_format='batches')
dictionary = artm.Dictionary(data_path=r'D:\Datasets\kos')

# Register the top-tokens score under a name so it can be looked up later.
model = artm.ARTM(num_topics=15,
                  num_document_passes=5,
                  dictionary=dictionary,
                  scores=[artm.TopTokensScore(name='top_tokens_score')])
model.fit_offline(batch_vectorizer=batch_vectorizer, num_collection_passes=3)

# The tracker is indexed by the score name given above.
top_tokens = model.score_tracker['top_tokens_score']
for topic_name in model.topic_names:
    print('\n' + topic_name)
    # last_tokens and last_weights are both indexed by topic name.
    for (token, weight) in zip(top_tokens.last_tokens[topic_name],
                               top_tokens.last_weights[topic_name]):
        print('{} - {}'.format(token, weight))
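If you want the tokens ranked by weight rather than just printed, the same pairing step can be pulled into a small helper. This is only a sketch: `top_tokens_table` is a hypothetical name, not part of BigARTM, and it assumes that `last_tokens` and `last_weights` can be indexed by topic name the way the example above does. The data below is made up so the snippet runs without BigARTM installed.

```python
def top_tokens_table(tokens_by_topic, weights_by_topic, topic_names):
    """Return {topic: [(token, weight), ...]} sorted by descending weight.

    Hypothetical helper; the two mappings are assumed to be indexable
    by topic name, like last_tokens / last_weights in the example above.
    """
    table = {}
    for topic in topic_names:
        pairs = zip(tokens_by_topic[topic], weights_by_topic[topic])
        table[topic] = sorted(pairs, key=lambda pair: pair[1], reverse=True)
    return table


# Made-up stand-in data, not real model output.
tokens = {'topic_0': ['bush', 'iraq', 'war'], 'topic_1': ['poll', 'vote']}
weights = {'topic_0': [0.02, 0.05, 0.03], 'topic_1': [0.04, 0.01]}

table = top_tokens_table(tokens, weights, ['topic_0', 'topic_1'])
for topic in ['topic_0', 'topic_1']:
    print(topic)
    for token, weight in table[topic]:
        print('  {} - {:.3f}'.format(token, weight))
```

With a fitted model you would presumably pass `top_tokens.last_tokens`, `top_tokens.last_weights` and `model.topic_names` instead of the stand-in dictionaries.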
For other changes to the BigARTM Python API, see the release notes:
http://docs.bigartm.org/en/stable/release_notes/python.html