The HTML I downloaded doesn't match what you expected. Here is an expression that works for me:
tree.xpath('//div[@id="technicalProductFeaturesATF"]/ul/li[1]/text()')
Complete program:
from lxml import html
import requests
from pprint import pprint

# Download the product page and parse it into an element tree
url = 'http://www.amazon.co.uk/dp/B009CX5VN2'
page = requests.get(url)
tree = html.fromstring(page.text)

# No [1] on the li step here, so this collects the text of every bullet
feature_bullets = tree.xpath('//div[@id="technicalProductFeaturesATF"]/ul/li/text()')
pprint(feature_bullets)
Result:
$ python foo.py
['Fun, credit card-sized prints',
'LCD film counter and shooting mode display',
'Camera mounted mirror for self portraits',
'Powered by CR2 Batteries, Built-in, Automatic electronic flash',
'Fujifilm Instax Mini 25 + 30 Instax Mini Film']
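Note that the first expression uses li[1] while the full program uses a bare li step. The difference can be checked offline with a rough sketch like the one below. It is not the code from this answer: to avoid a network request and third-party packages it swaps lxml for the standard library's xml.etree.ElementTree, which supports only a subset of XPath (there is no text() step, so the text is read from each element's .text attribute). The SAMPLE markup is a made-up stand-in, not Amazon's real page.

```python
# Offline sketch: SAMPLE is a hypothetical stand-in for the product page,
# and ElementTree is used here in place of lxml (an assumption of this sketch).
import xml.etree.ElementTree as ET

SAMPLE = """
<html><body>
  <div id="technicalProductFeaturesATF">
    <ul>
      <li>Fun, credit card-sized prints</li>
      <li>LCD film counter and shooting mode display</li>
    </ul>
  </div>
</body></html>
""".strip()

root = ET.fromstring(SAMPLE)
base = './/div[@id="technicalProductFeaturesATF"]/ul'

# li[1] keeps only the first <li> under the matching <ul> (positions start at 1).
first_only = [li.text for li in root.findall(base + '/li[1]')]

# A bare li step keeps every <li> child, in document order.
all_items = [li.text for li in root.findall(base + '/li')]

print(first_only)
print(all_items)
```

With lxml the same two path shapes, written with a leading // and a trailing /text(), behave the same way: li[1] yields one string, li yields one string per bullet.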