from bs4 import BeautifulSoup
from bs4.dammit import EncodingDetector
import requests

resp = requests.get("https://finance.yahoo.com/gainers")
http_encoding = resp.encoding if 'charset' in resp.headers.get('content-type', '').lower() else None
html_encoding = EncodingDetector.find_declared_encoding(resp.content, is_html=True)
encoding = html_encoding or http_encoding
soup = BeautifulSoup(resp.content, from_encoding=encoding)
myclass = soup.findAll("a", {"class": "Fw(600) C($linkColor)"})
myclass
This gives me the following:
[<a class="Fw(600) C($linkColor)" data-reactid="79" href="/quote/TSNP?p=TSNP" title="Tesoro Enterprises, Inc.">TSNP</a>,
 <a class="Fw(600) C($linkColor)" data-reactid="105" href="/quote/FDVRF?p=FDVRF" title="Facedrive Inc.">FDVRF</a>,
 <a class="Fw(600) C($linkColor)" data-reactid="131" href="/quote/SKLZ?p=SKLZ" title="Skillz Inc.">SKLZ</a>,
 <a class="Fw(600) C($linkColor)" data-reactid="157" href="/quote/GOOS?p=GOOS" title="Canada Goose Holdings Inc.">GOOS</a>,
 <a class="Fw(600) C($linkColor)" data-reactid="183" href="/quote/WMS?p=WMS" title="Advanced Drainage Systems, Inc.">WMS</a>,
 etc., etc.
What I really want is just the stock symbols: TSNP, FDVRF, SKLZ, GOOS, WMS, and so on.
Thanks, everyone.
You can use the .text attribute of each element returned by .findAll():
for e in soup.findAll("a", {"class": "Fw(600) C($linkColor)"}):
    print(e.text)
Output:
TSNP
FDVRF
SKLZ
GOOS
WMS
APPS
...
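If you want to try this without hitting the live page, here is a minimal sketch that runs the same lookup against a hard-coded snippet. The `SAMPLE_HTML` string is my own stand-in, copied from two of the links shown in the question, and the explicit `"html.parser"` argument is an assumption on my part. I also use `get_text(strip=True)` instead of `.text`, only to trim any surrounding whitespace.

```python
from bs4 import BeautifulSoup

# Assumption: a small sample copied from the question's output, so the
# example runs offline. The real page may be structured differently.
SAMPLE_HTML = """
<table>
  <tr><td><a class="Fw(600) C($linkColor)" href="/quote/TSNP?p=TSNP" title="Tesoro Enterprises, Inc.">TSNP</a></td></tr>
  <tr><td><a class="Fw(600) C($linkColor)" href="/quote/FDVRF?p=FDVRF" title="Facedrive Inc.">FDVRF</a></td></tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Same lookup as above: match the exact string value of the class attribute.
links = soup.find_all("a", {"class": "Fw(600) C($linkColor)"})

# get_text(strip=True) returns the link text with whitespace trimmed.
symbols = [a.get_text(strip=True) for a in links]
for symbol in symbols:
    print(symbol)
```

On this sample it prints TSNP and FDVRF, one per line.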
If you want them in a list, a simple list comprehension will do:
gainers = soup.findAll("a", {"class": "Fw(600) C($linkColor)"})
tickers = [e.text for e in gainers]
['TSNP', 'FDVRF', 'SKLZ', 'GOOS', 'WMS', 'APPS', 'TIGR', ...]
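Since each link also carries the company name in its `title` attribute, the same idea extends to a ticker-to-name mapping. The sketch below is only an illustration: `tickers_with_names` is a hypothetical helper, the sample markup is copied from the question, and selecting by `href` prefix rather than by the generated class names is my own assumption about what is stable on the page. On the real site, other links starting with `/quote/` could match too.

```python
from bs4 import BeautifulSoup


def tickers_with_names(html):
    """Hypothetical helper: map each ticker symbol to its company name."""
    soup = BeautifulSoup(html, "html.parser")
    result = {}
    # CSS attribute selector: every <a> whose href starts with "/quote/".
    for a in soup.select('a[href^="/quote/"]'):
        # .get() returns the default instead of raising if title is missing.
        result[a.get_text(strip=True)] = a.get("title", "")
    return result


# Assumption: sample markup copied from the question, used offline.
sample = (
    '<a class="Fw(600) C($linkColor)" href="/quote/SKLZ?p=SKLZ" '
    'title="Skillz Inc.">SKLZ</a>'
    '<a class="Fw(600) C($linkColor)" href="/quote/GOOS?p=GOOS" '
    'title="Canada Goose Holdings Inc.">GOOS</a>'
)

names = tickers_with_names(sample)
print(names)
# {'SKLZ': 'Skillz Inc.', 'GOOS': 'Canada Goose Holdings Inc.'}
```

If you only need the symbols, `list(names)` gives them back in page order.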