
Cloud Natural Language API intermittently returns socket.gaierror: nodename nor servname provided after running sentiment analysis

  •  7
  • ababuji  · 6 years ago

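    I am running the code in a Jupyter notebook. I modified the code from this link so that it takes its input from the notebook instead of the console and iterates over a list of files.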

    """Demonstrates how to make a simple call to the Natural Language API."""
    
    import glob
    import os

    import pandas as pd
    from google.cloud import language
    from google.cloud.language import enums
    from google.cloud.language import types


    def print_result(annotations, movie_review_filename):
        """Write the per-sentence sentiment of one file to a CSV file."""
        # Use the file name without its directory and extension as the status id.
        file_name = os.path.splitext(os.path.basename(movie_review_filename))[0]

        # One row per sentence: status id, text, magnitude, score.
        sentence_rows = []
        for sentence in annotations.sentences:
            sentence_rows.append([
                file_name,
                sentence.text.content,
                sentence.sentiment.magnitude,
                sentence.sentiment.score,
            ])

        output_df = pd.DataFrame(
            sentence_rows,
            columns=['status_id', 'sentence', 'sentence_magnitude', 'sentence_sentiment'])
        output_df.to_csv(
            "/Users/abhi/Desktop/RetrySentenceCSVs/" + file_name + ".csv", index=False)


    def analyze(movie_review_filename):
        """Run a sentiment analysis request on text within a passed filename."""
        client = language.LanguageServiceClient()

        with open(movie_review_filename, 'r') as review_file:
            content = review_file.read()

        # Instantiates a plain text document.
        document = types.Document(
            content=content,
            type=enums.Document.Type.PLAIN_TEXT)
        annotations = client.analyze_sentiment(document=document)

        # Write the results to a CSV file.
        print_result(annotations, movie_review_filename)


    if __name__ == '__main__':
        txt_file_list = glob.glob("/Users/abhi/Desktop/mytxtfilepath/*.txt")
        for file_path in txt_file_list:  # Iterate through a list of file paths
            analyze(file_path)

    The code runs fine for about 10% of my text files (I have 687), but after a while it starts throwing errors:

    ERROR:root:AuthMetadataPluginCallback "<google.auth.transport.grpc.AuthMetadataPlugin object at 0x113b76588>" raised exception!
    Traceback (most recent call last):
      File "/anaconda3/lib/python3.6/site-packages/urllib3/connection.py", line 171, in _new_conn
        (self._dns_host, self.port), self.timeout, **extra_kw)
      File "/anaconda3/lib/python3.6/site-packages/urllib3/util/connection.py", line 56, in create_connection
        for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
      File "/anaconda3/lib/python3.6/socket.py", line 745, in getaddrinfo
        for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
    socket.gaierror: [Errno 8] nodename nor servname provided, or not known
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/anaconda3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 600, in urlopen
        chunked=chunked)
      File "/anaconda3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 343, in _make_request
        self._validate_conn(conn)
      File "/anaconda3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 849, in _validate_conn
        conn.connect()
      File "/anaconda3/lib/python3.6/site-packages/urllib3/connection.py", line 314, in connect
        conn = self._new_conn()
      File "/anaconda3/lib/python3.6/site-packages/urllib3/connection.py", line 180, in _new_conn
        self, "Failed to establish a new connection: %s" % e)
    urllib3.exceptions.NewConnectionError: <urllib3.connection.VerifiedHTTPSConnection object at 0x113b840b8>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/anaconda3/lib/python3.6/site-packages/requests/adapters.py", line 445, in send
        timeout=timeout
      File "/anaconda3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 638, in urlopen
        _stacktrace=sys.exc_info()[2])
      File "/anaconda3/lib/python3.6/site-packages/urllib3/util/retry.py", line 398, in increment
        raise MaxRetryError(_pool, url, error or ResponseError(cause))
    urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='accounts.google.com', port=443): Max retries exceeded with url: /o/oauth2/token (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x113b840b8>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known',))
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/anaconda3/lib/python3.6/site-packages/google/auth/transport/requests.py", line 120, in __call__
        **kwargs)
      File "/anaconda3/lib/python3.6/site-packages/requests/sessions.py", line 512, in request
        resp = self.send(prep, **send_kwargs)
      File "/anaconda3/lib/python3.6/site-packages/requests/sessions.py", line 622, in send
        r = adapter.send(request, **kwargs)
      File "/anaconda3/lib/python3.6/site-packages/requests/adapters.py", line 513, in send
        raise ConnectionError(e, request=request)
    requests.exceptions.ConnectionError: HTTPSConnectionPool(host='accounts.google.com', port=443): Max retries exceeded with url: /o/oauth2/token (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x113b840b8>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known',))
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "/anaconda3/lib/python3.6/site-packages/grpc/_plugin_wrapping.py", line 77, in __call__
        callback_state, callback))
      File "/anaconda3/lib/python3.6/site-packages/google/auth/transport/grpc.py", line 77, in __call__
        callback(self._get_authorization_headers(context), None)
      File "/anaconda3/lib/python3.6/site-packages/google/auth/transport/grpc.py", line 65, in _get_authorization_headers
        headers)
      File "/anaconda3/lib/python3.6/site-packages/google/auth/credentials.py", line 122, in before_request
        self.refresh(request)
      File "/anaconda3/lib/python3.6/site-packages/google/oauth2/service_account.py", line 322, in refresh
        request, self._token_uri, assertion)
      File "/anaconda3/lib/python3.6/site-packages/google/oauth2/_client.py", line 145, in jwt_grant
        response_data = _token_endpoint_request(request, token_uri, body)
      File "/anaconda3/lib/python3.6/site-packages/google/oauth2/_client.py", line 106, in _token_endpoint_request
        method='POST', url=token_uri, headers=headers, body=body)
      File "/anaconda3/lib/python3.6/site-packages/google/auth/transport/requests.py", line 124, in __call__
        six.raise_from(new_exc, caught_exc)
      File "<string>", line 3, in raise_from
    google.auth.exceptions.TransportError: HTTPSConnectionPool(host='accounts.google.com', port=443): Max retries exceeded with url: /o/oauth2/token (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x113b840b8>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known',))
    ERROR:root:AuthMetadataPluginCallback "<google.auth.transport.grpc.AuthMetadataPlugin object at 0x113b76588>" raised exception!
    Traceback (most recent call last):
      File "/anaconda3/lib/python3.6/site-packages/urllib3/connection.py", line 171, in _new_conn
        (self._dns_host, self.port), self.timeout, **extra_kw)
      File "/anaconda3/lib/python3.6/site-packages/urllib3/util/connection.py", line 56, in create_connection
        for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
      File "/anaconda3/lib/python3.6/socket.py", line 745, in getaddrinfo
        for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
    socket.gaierror: [Errno 8] nodename nor servname provided, or not known
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/anaconda3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 600, in urlopen
        chunked=chunked)
      File "/anaconda3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 343, in _make_request
        self._validate_conn(conn)
      File "/anaconda3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 849, in _validate_conn
        conn.connect()
      File "/anaconda3/lib/python3.6/site-packages/urllib3/connection.py", line 314, in connect
        conn = self._new_conn()
      File "/anaconda3/lib/python3.6/site-packages/urllib3/connection.py", line 180, in _new_conn
        self, "Failed to establish a new connection: %s" % e)
    urllib3.exceptions.NewConnectionError: <urllib3.connection.VerifiedHTTPSConnection object at 0x113b84470>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known
    
    During handling of the above exception, another exception occurred:
    ...
    

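    The error repeats itself, then sentiment analysis runs on some files, then the error appears several more times, then sentiment analysis runs on more files, and finally everything stops with a RendezVous error (I forgot to capture that message). What I want to know is: why does the code run for a while on some set of files, throw the error, run for a while longer, throw the error again, and then stop working entirely after a certain point?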

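    I reran the code and found that it returns socket.gaierror after some random files in the folder. So one can be reasonably sure that the problem is not with the file contents.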

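    EDIT1: The files are just ordinary .txt files with text in them. Can anyone help me solve this? I can also assure you that the text in all 680 files adds up to 1,400 requests in total; I calculated this very carefully based on how the Cloud Natural Language API defines a request. So I am well within my limits.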

    I tried sleep(10), which seemed to work well for a while, but then the errors started again.
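A fixed sleep(10) spaces the requests out but does not retry a call that has already failed. A common alternative for transient network failures, such as the DNS lookup error in the traceback, is to retry with exponential backoff. The following is a minimal sketch and is not confirmed anywhere in this thread to solve the problem. `call_with_retry` and its parameters are hypothetical names, and catching `OSError` (the base class of `socket.gaierror`) by default is an assumption, since the exception that actually reaches the calling code may be of a different type.

```python
import time


def call_with_retry(func, *args, retries=5, base_delay=1.0,
                    retry_on=(OSError,), sleep=time.sleep, **kwargs):
    """Call func(*args, **kwargs), retrying on the given exception types.

    Hypothetical helper: waits base_delay, 2*base_delay, 4*base_delay, ...
    seconds between attempts and re-raises after the last attempt fails.
    """
    for attempt in range(retries):
        try:
            return func(*args, **kwargs)
        except retry_on:
            if attempt == retries - 1:
                raise  # out of attempts, let the caller see the error
            sleep(base_delay * 2 ** attempt)
```

With the code from the question, this could be used as `call_with_retry(analyze, file_path)`, with `retry_on` set to whichever exception the client library actually raises in your environment.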

    1 answer  |  as of 6 years ago
  •  3
  •   ababuji    6 years ago

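    I figured it out. Instead of reading all 600 files at once, try reading them in batches of 50 files (create 12 folders with 50 files each) and run the code manually each time it finishes scanning a folder. I don't know why this works, but it just does.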
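The manual batching described above could also be done in code instead of with separate folders. The sketch below splits a list of paths into batches of 50 and pauses between batches. `batches`, `process_in_batches`, and `pause_seconds` are hypothetical names, and it is an assumption that a pause inside one run has the same effect as rerunning the code by hand, which starts from a fresh state each time.

```python
import time


def batches(items, size=50):
    """Yield consecutive slices of items with at most size elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def process_in_batches(paths, handler, size=50, pause_seconds=0.0,
                       sleep=time.sleep):
    """Call handler(path) for every path, pausing between batches.

    Returns the number of paths processed.
    """
    processed = 0
    for index, batch in enumerate(batches(paths, size)):
        if index > 0:
            sleep(pause_seconds)  # pause before every batch except the first
        for path in batch:
            handler(path)
            processed += 1
    return processed
```

With the code from the question, `handler` would be the `analyze` function and `paths` the list returned by `glob.glob`.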