If you run this JavaScript in the browser console, it will extract all the names and descriptions from the page.
let trs = document.querySelectorAll('#TableWithRules tbody tr')
trs.forEach((el) => {
    let tds = el.querySelectorAll('td')
    let name = tds[0].innerText;
    let description = tds[1].innerText;
    console.log(name, description)
})
The same logic using Selenium, for example:
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()
driver.get("https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=hp")

# Every row in the body of the table inside #TableWithRules
trs = driver.find_elements(By.XPATH, "//div[@id='TableWithRules']//tbody//tr")
for tr in trs:
    # Cells of this row only (note the leading "." in the XPath)
    tds = tr.find_elements(By.XPATH, ".//td")
    name = tds[0].text
    description = tds[1].text
    print(name, description)

driver.close()
Output
...
CVE-1999-0016 Land IP denial of service.
CVE-1999-0014 Unauthorized privileged access or denial of service via dtappgather program in CDE.
CVE-1999-0011 Denial of Service vulnerabilities in BIND 4.9 and BIND 8 Releases via CNAME record and zone transfer.
CVE-1999-0010 Denial of Service vulnerability in BIND 8 Releases via maliciously formatted DNS messages.
CVE-1999-0009 Inverse query buffer overflow in BIND 4.9 and BIND 8 Releases.
...
Code explanation
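First, all tr elements are retrieved from the tbody of the #TableWithRules table. Then a for loop iterates over these tr elements and extracts all the td elements each one contains. There are normally two td elements per row: one for the name and one for the description. Finally, the text is read from tds[0] and tds[1].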
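As a minimal sketch, the loop described above could be wrapped in a helper that returns the rows instead of printing them. The name `rows_to_records` is hypothetical, and the check for rows with fewer than two cells is an added assumption rather than something the page is known to need.

```python
# By.XPATH from selenium.webdriver.common.by is the plain string "xpath";
# it is spelled out here so the helper can be imported without a browser.
XPATH = "xpath"


def rows_to_records(trs):
    """Turn tr elements into dicts with 'name' and 'description' keys."""
    records = []
    for tr in trs:
        # The leading "." keeps the search inside the current row.
        tds = tr.find_elements(XPATH, ".//td")
        if len(tds) < 2:
            # Assumption: skip rows that do not have both cells.
            continue
        records.append({"name": tds[0].text, "description": tds[1].text})
    return records
```

With the driver from the example above, this would be called as `rows_to_records(trs)` right after the `find_elements` call that collects the rows.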
What about the THEAD?
The process for the THEAD is similar to the above. The main difference is that you target THEAD instead of TBODY and look for th elements rather than td.
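A rough sketch of that variation, assuming the table inside #TableWithRules has a thead with th cells. `header_cells` and `scrape_headers` are hypothetical helper names, not part of Selenium.

```python
# By.XPATH from selenium.webdriver.common.by is the plain string "xpath";
# it is spelled out here so header_cells can be used without a browser.
XPATH = "xpath"


def header_cells(driver):
    """Return the text of every th inside the thead of #TableWithRules."""
    ths = driver.find_elements(XPATH, "//div[@id='TableWithRules']//thead//th")
    return [th.text for th in ths]


def scrape_headers(url):
    """Open url in Firefox and return the header texts.

    Needs Selenium and a working Firefox/geckodriver setup.
    """
    from selenium import webdriver  # imported lazily: only needed for a live run

    driver = webdriver.Firefox()
    try:
        driver.get(url)
        return header_cells(driver)
    finally:
        driver.quit()
```

If a driver is already open, as in the example above, calling `header_cells(driver)` before `driver.close()` is enough; `scrape_headers` is only a convenience for a standalone run.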