Unable to check for an empty row in a list after the first parse

null csv list python

Seth · Tech Community · 7 years ago

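I'm parsing a routing table from a Cisco device, and I only need the BGP entries. A few odd OSPF lines are laid out differently; I don't care about those. But because the slice [8:10] of the split line lands on different fields for them, I end up with an empty row. For now I write a placeholder instead (network 0.0.0.0, netmask 0.0.0.0), but I would like to skip those rows entirely in the final write. Below is a sample of the file and the code that parses it. The real file is large, so I'm trying to avoid looping over it twice.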

Edit: the goal is to write a CSV file with two columns, containing the IP address and the netmask.

B        10.34.86.0/24 [20/0] via 10.15.33.73, 2w3d
B        10.34.93.0/24 [20/0] via 10.15.33.73, 2w3d
O E1     10.34.95.0/24  <- DON'T CARE ABOUT HIM  
B        10.34.97.0/24 [20/0] via 10.15.33.73, 2w0d
B        10.34.98.0/24 [20/0] via 10.15.33.73, 2w3d

Desired output (note that the O line is absent):

10.34.86.0,24
10.34.93.0,24
10.34.97.0,24
10.34.98.0,24

And my Python 3 code:

import csv
import re
import time
timestr = time.strftime("%m%d%y")
with open('RemediationStatus' + timestr + '.csv', 'a', newline='') as csvfile:
    with open('routes-3-28.txt', 'r') as msroutes:
        headwrite = csv.writer(csvfile)
        headwrite.writerow(["Network", "Netmask"])
        for line in msroutes:
            firstpass = re.split(r'[\s,/]', line)
            finalpass = (firstpass[8:10])
            if not finalpass[0]:
                finalpass = (["0.0.0.0", "0.0.0.0"])
            # Written once per input line, so OSPF lines produce the placeholder row.
            print(finalpass)
            writer = csv.writer(csvfile)
            writer.writerow(finalpass)

2 answers | latest 7 years ago

bigbounty 7 years ago

You should make use of Python's re module. If I've understood your logic, you want something like this:

import re

text = '''B        10.34.86.0/24 [20/0] via 10.15.33.73, 2w3d
B        10.34.93.0/24 [20/0] via 10.15.33.73, 2w3d
O E1     10.34.95.0/24  <- DON'T CARE ABOUT HIM  
B        10.34.97.0/24 [20/0] via 10.15.33.73, 2w0d
B        10.34.98.0/24 [20/0] via 10.15.33.73, 2w3d'''

# Raw string, so that \s and \d reach the regex engine unchanged.
result = re.findall(r'B\s+(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/(\d+)', text)
result
#[('10.34.86.0', '24'),
# ('10.34.93.0', '24'),
# ('10.34.97.0', '24'),
# ('10.34.98.0', '24')]
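
To get from that list of tuples to the two-column CSV the question asks for, something along these lines should work. This is only a sketch: the function name write_bgp_csv and the in-memory buffer are my own choices for illustration, and I anchored the pattern to the start of each line with re.MULTILINE so that a "B" in the middle of a line cannot match. Note that findall needs the whole text in memory, which may matter for a very large file.

```python
import csv
import io
import re

# Anchored to line starts so a stray "B" elsewhere in a line cannot match.
BGP_ROUTE = re.compile(r'^B\s+(\d{1,3}(?:\.\d{1,3}){3})/(\d+)', re.MULTILINE)


def write_bgp_csv(text, out):
    """Write a header plus one Network,Netmask row per BGP line in text."""
    writer = csv.writer(out)
    writer.writerow(["Network", "Netmask"])
    rows = BGP_ROUTE.findall(text)  # list of (network, prefix_length) tuples
    writer.writerows(rows)
    return rows


sample = """B        10.34.86.0/24 [20/0] via 10.15.33.73, 2w3d
O E1     10.34.95.0/24
B        10.34.97.0/24 [20/0] via 10.15.33.73, 2w0d
"""

# A StringIO stands in for the real output file here.
buffer = io.StringIO()
rows = write_bgp_csv(sample, buffer)
print(buffer.getvalue())
```

With a real file, pass the object returned by open(..., 'a', newline='') in place of the buffer.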

pault Tanjin 7 years ago

IIUC, here is one way to do it using regex:

import re

full_pattern = r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}/\d{1,3} \[\d+/\d+]"
small_pattern = r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}/\d{1,3}"

msroutes = [
    "B        10.34.86.0/24 [20/0] via 10.15.33.73, 2w3d",
    "B        10.34.93.0/24 [20/0] via 10.15.33.73, 2w3d",
    "O E1     10.34.95.0/24",
    "B        10.34.97.0/24 [20/0] via 10.15.33.73, 2w0d",
    "B        10.34.98.0/24 [20/0] via 10.15.33.73, 2w3d"
]

for line in msroutes:
    finalpass = re.findall(full_pattern, line)
    if finalpass:
        finalpass = re.findall(small_pattern, finalpass[0])
        finalpass = finalpass[0].split('/')
        print(finalpass)

#['10.34.86.0', '24']
#['10.34.93.0', '24']
#['10.34.97.0', '24']
#['10.34.98.0', '24']

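Because I wasn't sure whether there is a simpler way to ignore the lines you asked to skip (perhaps by checking whether the line starts with 'B' rather than 'O'?), I did two regex searches. The first looks for something shaped like this pattern: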

##.##.##.##/## [##/#]

where # represents a digit. The pattern \d{1,3} matches between 1 and 3 digits. This eliminates the lines that don't have a [20/0] field in them.

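Next, for the lines that matched, I run the smaller regex search to get just the IP and mask values. No error checking is needed here, because we know the text already matched the bigger pattern.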
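
For what it's worth, the two searches can probably be folded into one by putting capture groups around the address and the mask. A minimal sketch, assuming the same line layout as the sample; the names route_pattern and parse_route are made up for this example:

```python
import re

# One pattern: capture the address and the mask, and still require the
# bracketed "[20/0]"-style field right after them.
route_pattern = re.compile(
    r"(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/(\d{1,3}) \[\d+/\d+\]"
)


def parse_route(line):
    """Return [network, mask] for a matching line, or None otherwise."""
    match = route_pattern.search(line)
    if match is None:
        return None
    return [match.group(1), match.group(2)]


print(parse_route("B        10.34.86.0/24 [20/0] via 10.15.33.73, 2w3d"))
print(parse_route("O E1     10.34.95.0/24"))
```

Returning None for non-matching lines lets the caller decide whether to skip the line or write a placeholder.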

If you know you only want to process lines that start with "B":

for line in msroutes:
    if line.startswith('B'):
        finalpass = re.findall(small_pattern, line)[0].split('/')
        print(finalpass)
#['10.34.86.0', '24']
#['10.34.93.0', '24']
#['10.34.97.0', '24']
#['10.34.98.0', '24']
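
One caveat with the loop above: re.findall(...)[0] raises IndexError if a line starts with "B" but carries no address/mask pair. A slightly more defensive sketch is a generator that checks the match first; iter_bgp_routes is a hypothetical name, and the third sample line is invented purely to exercise that case:

```python
import re

small_pattern = re.compile(r"(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/(\d{1,3})")


def iter_bgp_routes(lines):
    """Yield [network, mask] for each line that starts with 'B'."""
    for line in lines:
        if not line.startswith("B"):
            continue  # OSPF and any other non-BGP entries
        match = small_pattern.search(line)
        if match:  # a "B" line without an address/mask pair is skipped
            yield [match.group(1), match.group(2)]


sample_lines = [
    "B        10.34.86.0/24 [20/0] via 10.15.33.73, 2w3d",
    "O E1     10.34.95.0/24",
    "B        no prefix on this made-up line",
]
routes = list(iter_bgp_routes(sample_lines))
print(routes)
```

Because it is a generator, it can be fed an open file object directly and never holds more than one line at a time.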

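You could also keep your existing code and simplify the logic by treating a run of consecutive whitespace as a single separator, rather than using a long regex pattern.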

for line in msroutes:
    firstpass = re.split(r'\s+', line)
    if len(firstpass) > 1 and "/" in firstpass[1]:
        finalpass = firstpass[1].split("/")
    else:
        finalpass = ["0.0.0.0", "0.0.0.0"]
    print(finalpass)
#['10.34.86.0', '24']
#['10.34.93.0', '24']
#['0.0.0.0', '0.0.0.0']
#['10.34.97.0', '24']
#['10.34.98.0', '24']

\s+ matches one or more whitespace characters.
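
If the goal is to leave the non-BGP rows out of the CSV altogether rather than write a 0.0.0.0 placeholder, the same idea can be sketched in a single pass, without regex. The assumptions here are mine: the function name write_routes is invented, a BGP line is taken to be one whose first field is exactly "B", and str.split() with no argument is used because it already splits on runs of whitespace.

```python
import csv
import io


def write_routes(lines, out):
    """Write Network,Netmask rows for BGP lines only; return the row count."""
    writer = csv.writer(out)
    writer.writerow(["Network", "Netmask"])
    written = 0
    for line in lines:
        fields = line.split()  # no argument: splits on runs of whitespace
        if len(fields) > 1 and fields[0] == "B" and "/" in fields[1]:
            writer.writerow(fields[1].split("/", 1))
            written += 1
        # Any other line is skipped, so no placeholder row is written.
    return written


sample_lines = [
    "B        10.34.86.0/24 [20/0] via 10.15.33.73, 2w3d\n",
    "O E1     10.34.95.0/24  <- DON'T CARE ABOUT HIM  \n",
    "B        10.34.97.0/24 [20/0] via 10.15.33.73, 2w0d\n",
]

# A StringIO stands in for the real output file here.
output = io.StringIO()
count = write_routes(sample_lines, output)
print(output.getvalue())
```

The lines argument can be the open routes file itself, so the input is read only once.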