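I found a better solution.

As I suggested in the comments: create a `result` list and store each block's data in a separate element of that list (for example, a dict). When you read a non-header line, you can be sure that the line you just read belongs to the current block of data, which is the last element of `result`. When you read a header line, just append a new element to `result`.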

If the size of each block is constant, you can use an `itertools.cycle` iterator to "orchestrate" the parsing process:

from itertools import cycle

text1 = """Header1
number of Samples1
Content1
a1, aa1, aaa1
b1, bb1, bbb1
Header2
number of Samples2
Content2
a2, aa2, aaa2
b2, bb2, bbb2"""

size = 5  # number of lines in every block
iterator = cycle(range(size))  # yields 0, 1, 2, 3, 4, 0, 1, ...
result = []
for line in text1.split('\n'):
    i = next(iterator)  # position of the line inside its block
    if i == 0:
        result.append({'header': line})  # a new block starts here
    elif i == 1:
        result[-1]['num_of_samples'] = line
    elif i == 2:
        result[-1]['content_header'] = line
    elif i == 3:
        result[-1]['content'] = [line.split(', ')]  # first content row
    else:
        result[-1]['content'].append(line.split(', '))
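
To make the resulting structure easier to inspect, the same fixed-size idea can be wrapped in a function. This is only a sketch: `parse_fixed_blocks` and its `size` parameter are names made up for illustration, and it assumes every block has exactly `size` lines.

```python
from itertools import cycle


def parse_fixed_blocks(text, size=5):
    """Parse blocks of exactly `size` lines (hypothetical helper)."""
    result = []
    # cycle(range(size)) yields 0, 1, ..., size - 1, 0, 1, ... forever,
    # so `position` is the index of the line inside its block.
    # zip() stops as soon as the lines run out.
    for position, line in zip(cycle(range(size)), text.split('\n')):
        if position == 0:
            result.append({'header': line, 'content': []})
        elif position == 1:
            result[-1]['num_of_samples'] = line
        elif position == 2:
            result[-1]['content_header'] = line
        else:
            result[-1]['content'].append(line.split(', '))
    return result


sample = """Header1
number of Samples1
Content1
a1, aa1, aaa1
b1, bb1, bbb1
Header2
number of Samples2
Content2
a2, aa2, aaa2
b2, bb2, bbb2"""

blocks = parse_fixed_blocks(sample)
print(blocks[0]['header'])   # Header1
print(blocks[1]['content'])  # [['a2', 'aa2', 'aaa2'], ['b2', 'bb2', 'bbb2']]
```

Creating the empty `content` list together with the header removes the need for a separate branch for the first content row.
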

If the number of content lines varies between blocks, decide what each line is from the line itself:

text2 = """Header1
number of Samples1
Content1
a1, aa1, aaa1
b1, bb1, bbb1
b1, bb1, bbb1
Header2
number of Samples2
Content2
b2, bb2, bbb2
Header3
number of Samples3
Content3
a3, aa3, aaa3
b3, bb3, bbb3"""

result = []
for line in text2.split('\n'):
    if line.startswith('Header'):  # Your condition for headers
        result.append({'header': line})
    elif line.startswith('number'):  # Your condition for number of samples
        result[-1]['num_of_samples'] = line
    elif line.startswith('Content'):  # Your condition for content headers
        result[-1]['content_header'] = line
    else:
        # We don't know whether the content list exists yet,
        # so create it on the first content row.
        result[-1].setdefault('content', []).append(line.split(', '))
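
If the input is large, the same logic could be written as a generator that yields each block once the next header shows up. This is a rough sketch rather than a drop-in solution: `iter_blocks` is a hypothetical name, the `startswith` checks are the same placeholder conditions as above, and skipping blank lines is an assumption about the data.

```python
def iter_blocks(lines):
    """Yield one dict per block from any iterable of lines (hypothetical helper)."""
    block = None
    for raw in lines:
        line = raw.rstrip('\n')
        if not line:
            continue  # assumption: blank lines carry no data
        if line.startswith('Header'):  # placeholder header condition
            if block is not None:
                yield block  # the previous block is complete
            block = {'header': line, 'content': []}
        elif block is None:
            # Data before the first header cannot belong to any block.
            raise ValueError(f'data before first header: {line!r}')
        elif line.startswith('number'):
            block['num_of_samples'] = line
        elif line.startswith('Content'):
            block['content_header'] = line
        else:
            block['content'].append(line.split(', '))
    if block is not None:
        yield block  # do not forget the last block


sample = """Header1
number of Samples1
Content1
a1, aa1, aaa1
Header2
number of Samples2
Content2
a2, aa2, aaa2
b2, bb2, bbb2
c2, cc2, ccc2"""

blocks = list(iter_blocks(sample.splitlines()))
print(len(blocks))                # 2
print(len(blocks[1]['content']))  # 3
```

Because it accepts any iterable of strings, an open file object can be passed in directly instead of a list of lines.
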