我可能是完全错误的,但是我现在有一个如下所示的函数,它获取了我在搜索结果中出现的第一个YouTube视频的链接,给出了一个字符串输入:
def searchYTLink(title):
query = urllib.parse.quote(title)
url = "https://www.youtube.com/results?search_query=" + query
response = urllib.request.urlopen(url)
html = response.read()
soup = BeautifulSoup(html, 'html.parser')
result = soup.findAll(attrs={'class': 'yt-uix-tile-link'})[0]
return 'https://www.youtube.com' + result['href']
现在,我想向这个函数输入一个字符串列表,并将它映射到我的所有工作节点上。为此,我编写了以下代码:
# Make sure that you initialize the Sppark Context
sc = SparkContext(appName="MusicClassifier")
searchTest = ['videoa', 'videob', ...]
sc.parallelize(searchTest).map(searchYTLink)
这样做对吗?