
Correct way to try/except using the Python requests module?

try:
    r = requests.get(url, params={'s': thing})
except requests.ConnectionError, e:
    print e #should I also sys.exit(1) after this?

Is this correct? Is there a better way to structure this? Will this cover all my bases?


Community

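Have a look at the Requests exception docs. In short:

In the event of a network problem (e.g. DNS failure, refused connection, etc.), Requests will raise a ConnectionError exception. In the event of the rare invalid HTTP response, Requests will raise an HTTPError exception. If a request times out, a Timeout exception is raised. If a request exceeds the configured number of maximum redirections, a TooManyRedirects exception is raised. All exceptions that Requests explicitly raises inherit from requests.exceptions.RequestException.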

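To answer your question, what you show will not cover all of your bases. You'll only catch connection-related errors, not ones that time out.

What to do when you catch the exception is really up to the design of your script/program. Is it acceptable to exit? Can you go on and try again? If the error is catastrophic and you can't go on, then yes, you may abort your program by raising SystemExit (a nice way to both print an error and call sys.exit).

You can either catch the base-class exception, which will handle all cases: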

try:
    r = requests.get(url, params={'s': thing})
except requests.exceptions.RequestException as e:  # This is the correct syntax
    raise SystemExit(e)

Or you can catch them separately and do different things.

try:
    r = requests.get(url, params={'s': thing})
except requests.exceptions.Timeout:
    # Maybe set up for a retry, or continue in a retry loop
    pass
except requests.exceptions.TooManyRedirects:
    # Tell the user their URL was bad and try a different one
    pass
except requests.exceptions.RequestException as e:
    # catastrophic error. bail.
    raise SystemExit(e)
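
The "retry loop" hinted at in the Timeout comment above could look roughly like the sketch below. The helper name get_with_retries, the attempt count, the delay and the default timeout are all illustrative assumptions, not part of Requests; the get parameter is there only so the function can be exercised without a network.

```python
import time

import requests


def get_with_retries(url, attempts=3, delay=1.0, get=requests.get, **kwargs):
    """Retry on Timeout only; let every other RequestException propagate.

    Hypothetical helper: the name, `attempts`, `delay` and the injectable
    `get` callable are illustrative choices, not part of Requests.
    """
    kwargs.setdefault("timeout", 5)  # without a timeout the call may wait forever
    for attempt in range(1, attempts + 1):
        try:
            return get(url, **kwargs)
        except requests.exceptions.Timeout:
            if attempt == attempts:
                raise  # out of attempts: hand the Timeout to the caller
            time.sleep(delay)  # simple fixed pause before the next attempt
```

A caller would typically still wrap get_with_retries(url, params={'s': thing}) in an except requests.exceptions.RequestException clause, so that the final Timeout and any non-timeout error end up in one place.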

As Christian pointed out:

If you want HTTP errors (e.g. 401 Unauthorized) to raise exceptions, you can call Response.raise_for_status. That will raise an HTTPError if the response was an HTTP error.

An example:

try:
    r = requests.get('http://www.google.com/nothere')
    r.raise_for_status()
except requests.exceptions.HTTPError as err:
    raise SystemExit(err)

Will print:

404 Client Error: Not Found for url: http://www.google.com/nothere

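Very good answer for dealing with the specifics of the requests library, and also for general exception catching.
Note that due to a bug in the underlying urllib3 library, you will also need to catch socket.timeout exceptions if you're using a timeout: github.com/kennethreitz/requests/issues/1236
Future comment readers: this was fixed in Requests 2.9 (which bundles urllib3 1.13)
The exception list on the Requests website isn't complete. You can read the full list here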
Sam

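One additional suggestion: be explicit. It seems best to order the except clauses from specific to general, so that the error you want is caught by its own handler and the specific exceptions are not masked by the general one.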

url='http://www.google.com/blahblah'

try:
    r = requests.get(url,timeout=3)
    r.raise_for_status()
except requests.exceptions.HTTPError as errh:
    print ("Http Error:",errh)
except requests.exceptions.ConnectionError as errc:
    print ("Error Connecting:",errc)
except requests.exceptions.Timeout as errt:
    print ("Timeout Error:",errt)
except requests.exceptions.RequestException as err:
    print ("OOps: Something Else",err)

Http Error: 404 Client Error: Not Found for url: http://www.google.com/blahblah

vs

url='http://www.google.com/blahblah'

try:
    r = requests.get(url,timeout=3)
    r.raise_for_status()
except requests.exceptions.RequestException as err:
    print ("OOps: Something Else",err)
except requests.exceptions.HTTPError as errh:
    print ("Http Error:",errh)
except requests.exceptions.ConnectionError as errc:
    print ("Error Connecting:",errc)
except requests.exceptions.Timeout as errt:
    print ("Timeout Error:",errt)     

OOps: Something Else 404 Client Error: Not Found for url: http://www.google.com/blahblah
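
The reason the ordering matters can be checked directly from the class hierarchy. The snippet below is only a small sketch using issubclass on the exception classes that Requests exports; it makes no network calls.

```python
import requests.exceptions as rex

# Every handler class used above derives from RequestException, so a
# RequestException clause placed first matches before the others are tried.
for cls in (rex.HTTPError, rex.ConnectionError, rex.Timeout):
    print(cls.__name__, issubclass(cls, rex.RequestException))

# ConnectTimeout inherits from both ConnectionError and Timeout, so in the
# first ordering above a connect timeout is reported by the ConnectionError
# handler rather than the Timeout one.
print(issubclass(rex.ConnectTimeout, rex.ConnectionError))
print(issubclass(rex.ConnectTimeout, rex.Timeout))

# ReadTimeout is a Timeout only, so it does reach the Timeout handler.
print(issubclass(rex.ReadTimeout, rex.ConnectionError))
```

If connect timeouts should be reported as timeouts, one option is to list the Timeout clause before the ConnectionError clause.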

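Is this valid syntax for post too?
@ScipioAfricanus Yes.
What would be the exception for "Max retries exceeded with url:"? I've added all the exceptions to the exception list but it's still not handled.
@theking2 Try urllib3.exceptions.MaxRetryError or requests.exceptions.RetryError
@theking2 Try requests.ConnectionError, it will work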
tsh

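The exception object also contains the original response as e.response, which can be useful if you need to see the error body returned by the server. For example: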

try:
    r = requests.post('https://somerestapi.com/post-here', data={'birthday': '9/9/3999'})
    r.raise_for_status()
except requests.exceptions.HTTPError as e:
    print (e.response.text)
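
Many REST APIs put a JSON document in the error body, and exceptions that are raised before any response arrives carry e.response set to None. A slightly more defensive variant might look like the sketch below; describe_http_error is a hypothetical helper name used only for illustration.

```python
import requests


def describe_http_error(err):
    """Summarise a RequestException; hypothetical helper for illustration."""
    response = err.response  # None when no response was received at all
    if response is None:
        return f"no response: {err}"
    try:
        body = response.json()  # many REST APIs send a JSON error document
    except ValueError:  # the body was not valid JSON
        body = response.text
    return f"{response.status_code}: {body}"
```

It would be called from an except requests.exceptions.RequestException as e: block, in place of the bare print above.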

mike rodent

This is a generic way of doing things, which at least means that you don't have to surround each and every requests call with try ... except:

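import logging
import traceback

import requests

logger = logging.getLogger(__name__)  # assumes logging is configured elsewhere

# see the docs: if you set no timeout the call never times out! A tuple means "max 
# connect time" and "max read time"
DEFAULT_REQUESTS_TIMEOUT = (5, 15) # for example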

def log_exception(e, verb, url, kwargs):
    # the reason for making this a separate function will become apparent
    raw_tb = traceback.extract_stack()
    if 'data' in kwargs and len(kwargs['data']) > 500: # anticipate giant data string
        kwargs['data'] = f'{kwargs["data"][:500]}...'  
    msg = f'BaseException raised: {e.__class__.__module__}.{e.__class__.__qualname__}: {e}\n' \
        + f'verb {verb}, url {url}, kwargs {kwargs}\n\n' \
        + 'Stack trace:\n' + ''.join(traceback.format_list(raw_tb[:-2]))
    logger.error(msg) 

def requests_call(verb, url, **kwargs):
    response = None
    exception = None
    try:
        if 'timeout' not in kwargs:
            kwargs['timeout'] = DEFAULT_REQUESTS_TIMEOUT
        response = requests.request(verb, url, **kwargs)
    except BaseException as e:
        log_exception(e, verb, url, kwargs)
        exception = e
    return (response, exception)
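
The point of traceback.extract_stack() in log_exception is that it records how execution reached the wrapper, which the exception's own traceback does not. A standard-library-only sketch of the difference follows; the function names caller_stack, wrapper and business_logic are made up for the demonstration.

```python
import traceback


def caller_stack(skip=1):
    """Return the formatted call stack, omitting the `skip` innermost
    frames (by default just this helper)."""
    stack = traceback.extract_stack()
    frames = stack[:len(stack) - skip]
    return "".join(traceback.format_list(frames))


def wrapper():
    try:
        raise ValueError("boom")
    except ValueError:
        # traceback.format_exc() would show only `wrapper` here; the
        # extracted stack also shows how execution reached `wrapper`.
        return caller_stack()


def business_logic():
    return wrapper()


print(business_logic())
```

In log_exception above, raw_tb[:-2] plays the role of skip=2: it drops the frames for log_exception and requests_call so that the log ends at the line that made the request.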

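NB

Be aware of the builtin ConnectionError, which has nothing to do with the class requests.ConnectionError*. I assume the latter is more common in this context but have no real idea...

When examining a non-None returned exception, note that requests.RequestException, the superclass of all the requests exceptions (including requests.ConnectionError), is not "requests.exceptions.RequestException" according to the docs. Maybe this has changed since the accepted answer.**

Obviously this assumes a logger has been configured. Calling logger.exception in the except block might seem a good idea, but that would only give the stack within this method! Instead, get the trace leading up to the call to this method, then log it (with details of the exception, and of the call which caused the problem).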

*I looked at the source code: requests.ConnectionError subclasses the single class requests.RequestException, which subclasses the single class IOError (builtin)

**However, at the time of writing (February 2022), at the bottom of this page you find "requests.exceptions.RequestException"... but it links to the above page: confusing.
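
For what it's worth, the naming question can be settled from an interpreter rather than from the docs. As far as I'm aware the top-level names in the requests package are simply re-exports of the classes in requests.exceptions, and the sketch below checks that along with the relationship to the builtin ConnectionError. No network access is involved.

```python
import builtins

import requests

# The top-level names are re-exports of the classes in requests.exceptions.
print(requests.RequestException is requests.exceptions.RequestException)
print(requests.ConnectionError is requests.exceptions.ConnectionError)

# requests.ConnectionError is a different class from the builtin of the
# same name, and it does not derive from it...
print(requests.ConnectionError is builtins.ConnectionError)
print(issubclass(requests.ConnectionError, builtins.ConnectionError))

# ...although both are OSError subclasses (IOError is an alias of OSError
# in Python 3).
print(issubclass(requests.RequestException, OSError))
print(issubclass(builtins.ConnectionError, OSError))
```

One practical consequence: a bare except ConnectionError: clause refers to the builtin and will not catch the Requests exception unless the name has been imported from requests.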

Usage is simple:

search_response, exception = utilities.requests_call('get',
    f'http://localhost:9200/my_index/_search?q={search_string}')

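First you check the response: if it is None, something funny has happened, and you will have an exception which has to be acted on in some way depending on context (and on the exception). In GUI applications (PyQt5) I usually implement a "visual log" to give some output to the user (and also log simultaneously to the log file), but messages added there should be non-technical. So something like this might typically follow: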

if search_response is None:
    # you might check here for (e.g.) a requests.Timeout, tailoring the message
    # accordingly, as the kind of error anyone might be expected to understand
    msg = f'No response searching on |{search_string}|. See log'
    MainWindow.the().visual_log(msg, log_level=logging.ERROR)
    return
response_json = search_response.json()
if search_response.status_code != 200: # NB 201 ("created") may be acceptable sometimes... 
    msg = f'Bad response searching on |{search_string}|. See log'
    MainWindow.the().visual_log(msg, log_level=logging.ERROR)
    # usually response_json will give full details about the problem
    log_msg = f'search on |{search_string}| bad response\n{json.dumps(response_json, indent=4)}'
    logger.error(log_msg)
    return

# now examine the keys and values in response_json: these may of course 
# indicate an error of some kind even though the response returned OK (status 200)... 

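Given that the stack trace is logged automatically, you often need no more than that...

However, to cross the Ts:

If, as above, an exception gives the message "No response" and a non-200 status gives "Bad response", I suggest that

a missing expected key in the response's JSON structure should give rise to the message "Anomalous response"

an out-of-range or strange value should give the message "Unexpected response"

and the presence of a key such as "error" or "errors", with value True or whatever, should give the message "Error response"

These may or may not prevent the code from continuing.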
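
As a rough illustration of that scheme, the categories can be derived from the status code and the decoded JSON body. This is a simplified sketch: classify_response and its parameters are hypothetical, the key check is flat rather than nested, and the "Unexpected response" case is left out because deciding what counts as an out-of-range value is specific to each API.

```python
def classify_response(status_code, body, required_keys=(), acceptable_statuses=(200,)):
    """Map a decoded JSON response onto the message categories listed above.

    Hypothetical helper: the name, parameters and the flat key check are
    illustrative simplifications.
    """
    if status_code not in acceptable_statuses:
        return "Bad response"
    if not isinstance(body, dict) or any(key not in body for key in required_keys):
        return "Anomalous response"
    if body.get("error") or body.get("errors"):
        return "Error response"
    return "OK"
```

For example, classify_response(200, {"hits": []}, required_keys=["took"]) falls into the "Anomalous response" category because the expected key is missing.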

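... and in fact, to my mind, it is worth making the process even more generic. For me, these next functions typically cut 20 lines of code using the above requests_call down to about 3, and standardise most of your handling and log messages. With more than a handful of requests calls in your project, the code gets a lot nicer and less bloated: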

def log_response_error(response_type, call_name, deliverable, verb, url, **kwargs):
    # NB this function can also be used independently
    if response_type == 'No': # exception was raised (and logged)
        if isinstance(deliverable, requests.Timeout):
            MainWindow.the().visual_log(f'Time out of {call_name} before response received!', logging.ERROR)
            return    
    else:
        if isinstance(deliverable, BaseException):
            # NB if response.json() raises an exception we end up here
            log_exception(deliverable, verb, url, kwargs)
        else:
            # if we get here no exception has been raised, so no stack trace has yet been logged.  
            # a response has been returned, but is either "Bad" or "Anomalous"
            response_json = deliverable.json()

            raw_tb = traceback.extract_stack()
            if 'data' in kwargs and len(kwargs['data']) > 500: # anticipate giant data string
                kwargs['data'] = f'{kwargs["data"][:500]}...'
            added_message = ''     
            if hasattr(deliverable, 'added_message'):
                added_message = deliverable.added_message + '\n'
                del deliverable.added_message
            call_and_response_details = f'{response_type} response\n{added_message}' \
                + f'verb {verb}, url {url}, kwargs {kwargs}\nresponse:\n{json.dumps(response_json, indent=4)}'
            logger.error(f'{call_and_response_details}\nStack trace: {"".join(traceback.format_list(raw_tb[:-1]))}')
    MainWindow.the().visual_log(f'{response_type} response {call_name}. See log.', logging.ERROR)
    
def check_keys(req_dict_structure, response_dict_structure, response):
    # both structures MUST be dict
    if not isinstance(req_dict_structure, dict):
        response.added_message = f'req_dict_structure not dict: {type(req_dict_structure)}\n'
        return False
    if not isinstance(response_dict_structure, dict):
        response.added_message = f'response_dict_structure not dict: {type(response_dict_structure)}\n'
        return False
    for dict_key in req_dict_structure.keys():
        if dict_key not in response_dict_structure:
            response.added_message = f'key |{dict_key}| missing\n'
            return False
        req_value = req_dict_structure[dict_key]
        response_value = response_dict_structure[dict_key]
        if isinstance(req_value, dict):
            # if the response at this point is a list apply the req_value dict to each element:
            # failure in just one such element leads to "Anomalous response"... 
            if isinstance(response_value, list):
                for resp_list_element in response_value:
                    if not check_keys(req_value, resp_list_element, response):
                        return False
            elif not check_keys(req_value, response_value, response): # any other response value must be a dict (tested in next level of recursion)
                return False
        elif isinstance(req_value, list):
            if not isinstance(response_value, list): # if the req_value is a list the response must be one
                response.added_message = f'key |{dict_key}| not list: {type(response_value)}\n'
                return False
            # it is OK for the value to be a list, but these must be strings (keys) or dicts
            for req_list_element, resp_list_element in zip(req_value, response_value):
                if isinstance(req_list_element, dict):
                    if not check_keys(req_list_element, resp_list_element, response):
                        return False
                # elif, so that a dict element which passed check_keys is not rejected as "not string"
                elif not isinstance(req_list_element, str):
                    response.added_message = f'req_list_element not string: {type(req_list_element)}\n'
                    return False
                elif req_list_element not in response_value:
                    response.added_message = f'key |{req_list_element}| missing from response list\n'
                    return False
        # put None as a dummy value (otherwise something like {'my_key'} will be seen as a set, not a dict 
        elif req_value is not None: 
            response.added_message = f'required value of key |{dict_key}| must be None (dummy), dict or list: {type(req_value)}\n'
            return False
    return True

def process_json_requests_call(verb, url, **kwargs):
    # "call_name" is a mandatory kwarg
    if 'call_name' not in kwargs:
        raise Exception('kwarg "call_name" not supplied!')
    call_name = kwargs['call_name']
    del kwargs['call_name']

    required_keys = {}    
    if 'required_keys' in kwargs:
        required_keys = kwargs['required_keys']
        del kwargs['required_keys']

    acceptable_statuses = [200]
    if 'acceptable_statuses' in kwargs:
        acceptable_statuses = kwargs['acceptable_statuses']
        del kwargs['acceptable_statuses']

    exception_handler = log_response_error
    if 'exception_handler' in kwargs:
        exception_handler = kwargs['exception_handler']
        del kwargs['exception_handler']
        
    response, exception = requests_call(verb, url, **kwargs)

    if response is None:
        exception_handler('No', call_name, exception, verb, url, **kwargs)
        return (False, exception)
    try:
        response_json = response.json()
    except BaseException as e:
        logger.error(f'response.status_code {response.status_code} but calling json() raised exception')
        # an exception raised at this point can't truthfully lead to a "No response" message... so say "bad"
        exception_handler('Bad', call_name, e, verb, url, **kwargs)
        return (False, response)
        
    status_ok = response.status_code in acceptable_statuses
    if not status_ok:
        response.added_message = f'status code was {response.status_code}'
        log_response_error('Bad', call_name, response, verb, url, **kwargs)
        return (False, response)
    check_result = check_keys(required_keys, response_json, response)
    if not check_result:
        log_response_error('Anomalous', call_name, response, verb, url, **kwargs)
    return (check_result, response)      

Example call:

success, deliverable = utilities.process_json_requests_call('get', 
    f'{ES_URL}{INDEX_NAME}/_doc/1', 
    call_name=f'checking index {INDEX_NAME}',
    required_keys={'_source':{'status_text': None}})
if not success: return False
# here, we know the deliverable is a response, not an exception
# we also don't need to check for the keys being present
index_status = deliverable.json()['_source']['status_text']
if index_status != 'successfully completed':
    # ... i.e. an example of a 200 response, but an error nonetheless
    msg = f'Error response: ES index {INDEX_NAME} does not seem to have been built OK: cannot search'
    MainWindow.the().visual_log(msg)
    logger.error(f'index |{INDEX_NAME}|: deliverable.json() {json.dumps(deliverable.json(), indent=4)}')
    return False

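So, for example, in the case of a missing key "status_text", the "visual log" message seen by the user would be "Anomalous response checking index XYZ. See log." (and the log would show the key in question).

NB

mandatory kwarg: call_name; optional kwargs: required_keys, acceptable_statuses, exception_handler.

the required_keys dict can be nested to any depth

finer-grained exception handling can be accomplished by including a function exception_handler in kwargs (though don't forget that requests_call will have logged the call details, the exception type and __str__, and the stack trace).

in the above I also implement a check on the key "data" in any kwargs which may be logged. This is because a bulk operation (e.g. to populate an index in the case of Elasticsearch) can consist of enormous strings. So curtail it to the first 500 characters, for example.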
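
The truncation of "data" could also be pulled out into a small function that returns a copy, so that the dict being logged is never the one that gets modified. This is only a sketch: truncated_for_log is a hypothetical name, and it deliberately leaves non-string payloads (dicts, file objects) alone.

```python
def truncated_for_log(kwargs, limit=500):
    """Return a copy of kwargs whose 'data' value is shortened for logging.

    Hypothetical helper; only str/bytes payloads are shortened, and the
    original dict is left untouched.
    """
    loggable = dict(kwargs)
    data = loggable.get("data")
    if isinstance(data, (str, bytes)) and len(data) > limit:
        loggable["data"] = f"{data[:limit]!r}... (length {len(data)})"
    return loggable
```

The logging functions above would then format truncated_for_log(kwargs) instead of editing kwargs['data'] in place.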

PS Yes, I do know about the elasticsearch Python module (a "thin wrapper" around requests). All the above is for illustration purposes only.