try:
    r = requests.get(url, params={'s': thing})
except requests.ConnectionError, e:
    print e #should I also sys.exit(1) after this?
Is this correct? Is there a better way to structure this? Will this cover all my bases?
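Have a look at the Requests exception docs. In short:
In the event of a network problem (e.g. DNS failure, refused connection, etc.), Requests will raise a ConnectionError exception. In the event of the rare invalid HTTP response, Requests will raise an HTTPError exception. If a request times out, a Timeout exception is raised. If a request exceeds the configured number of maximum redirections, a TooManyRedirects exception is raised. All exceptions that Requests explicitly raises inherit from requests.exceptions.RequestException.
To answer your question, what you show will not cover all of your bases. You'll only catch connection-related errors, not ones that time out.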
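What to do when you catch the exception is really up to the design of your script/program. Is it acceptable to exit? Can you go on and try again? If the error is catastrophic and you can't go on, then yes, you may abort your program by raising SystemExit (a nice way to both print an error and call sys.exit).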
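To illustrate the SystemExit remark, here is a minimal sketch that runs a one-line child script and inspects what it leaves behind. The message text is made up for the example. When SystemExit is raised with a non-integer argument, the interpreter prints that argument to stderr and exits with status 1:

```python
import subprocess
import sys

# Child script: aborts the way the answer suggests, with a made-up message.
child_code = "raise SystemExit('could not reach the server')"

result = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    text=True,
)

exit_status = result.returncode        # 1 for a non-integer SystemExit argument
error_output = result.stderr.strip()   # the argument, printed to stderr
print(exit_status, error_output)
```

So raising SystemExit(e) with a caught exception e both reports the error and signals failure to the calling shell, without a separate print and sys.exit(1).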
Either catch the base-class exception, which will handle all cases:
try:
    r = requests.get(url, params={'s': thing})
except requests.exceptions.RequestException as e:  # This is the correct syntax
    raise SystemExit(e)
Or you can catch them separately and do different things.
try:
    r = requests.get(url, params={'s': thing})
except requests.exceptions.Timeout:
    # Maybe set up for a retry, or continue in a retry loop
    pass
except requests.exceptions.TooManyRedirects:
    # Tell the user their URL was bad and try a different one
    pass
except requests.exceptions.RequestException as e:
    # catastrophic error. bail.
    raise SystemExit(e)
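The retry idea in the Timeout branch could look something like the following sketch. The helper name get_with_retries, its parameters and the linear backoff are assumptions made for illustration, not part of Requests. The send parameter exists only so that the function can be exercised without a network.

```python
import time

import requests


def get_with_retries(url, attempts=3, backoff=0.5, send=requests.get, **kwargs):
    """Hypothetical helper: retry on Timeout, let every other exception propagate."""
    kwargs.setdefault("timeout", 5)  # requests waits indefinitely unless a timeout is set
    for attempt in range(1, attempts + 1):
        try:
            return send(url, **kwargs)
        except requests.exceptions.Timeout:
            if attempt == attempts:
                raise  # out of retries: the caller decides what happens next
            time.sleep(backoff * attempt)  # wait a little longer each time
```

A caller would still wrap this in an except requests.exceptions.RequestException handler, since the final Timeout and all other request errors are passed on unchanged.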
As Christian pointed out:
If you want HTTP errors (e.g. 401 Unauthorized) to raise exceptions, you can call Response.raise_for_status. That will raise an HTTPError if the response was an HTTP error.
An example:
try:
    r = requests.get('http://www.google.com/nothere')
    r.raise_for_status()
except requests.exceptions.HTTPError as err:
    raise SystemExit(err)
Will print:
404 Client Error: Not Found for url: http://www.google.com/nothere
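One additional suggestion: be explicit. It seems best to order the handlers from specific to general, so that the error you want is caught by its own handler and the specific errors are not masked by the general one.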
url = 'http://www.google.com/blahblah'
try:
    r = requests.get(url, timeout=3)
    r.raise_for_status()
except requests.exceptions.HTTPError as errh:
    print("Http Error:", errh)
except requests.exceptions.ConnectionError as errc:
    print("Error Connecting:", errc)
except requests.exceptions.Timeout as errt:
    print("Timeout Error:", errt)
except requests.exceptions.RequestException as err:
    print("OOps: Something Else", err)
Http Error: 404 Client Error: Not Found for url: http://www.google.com/blahblah
vs.
url = 'http://www.google.com/blahblah'
try:
    r = requests.get(url, timeout=3)
    r.raise_for_status()
except requests.exceptions.RequestException as err:
    print("OOps: Something Else", err)
except requests.exceptions.HTTPError as errh:
    print("Http Error:", errh)
except requests.exceptions.ConnectionError as errc:
    print("Error Connecting:", errc)
except requests.exceptions.Timeout as errt:
    print("Timeout Error:", errt)
OOps: Something Else 404 Client Error: Not Found for url: http://www.google.com/blahblah
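The same masking effect applies between sibling handlers. In the Requests source, ConnectTimeout inherits from both ConnectionError and Timeout, so whichever of those two handlers comes first wins. The sketch below raises the exceptions by hand, so no network is needed. The classify function and its labels are made up for the example.

```python
import requests


def classify(exc):
    """Return the label of the first handler that matches exc."""
    try:
        raise exc
    except requests.exceptions.HTTPError:
        return "http"
    except requests.exceptions.Timeout:
        return "timeout"
    except requests.exceptions.ConnectionError:
        return "connection"
    except requests.exceptions.RequestException:
        return "other"


# ConnectTimeout is both a ConnectionError and a Timeout; Timeout is listed
# first here, so that is the handler that catches it.
connect_timeout_label = classify(requests.exceptions.ConnectTimeout())
redirect_label = classify(requests.exceptions.TooManyRedirects())
print(connect_timeout_label, redirect_label)
```

With the handler order used in the first example above (ConnectionError before Timeout), a connect timeout would instead be reported as "Error Connecting".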
What is the exception for "Max retries exceeded with url:"? I have added all the exceptions to the exception list, but it is still not handled.
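The exception object also contains the original response, e.response, which can be useful if you need to see the error body in the server's response. For example: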
try:
    # a URL without a scheme raises MissingSchema before any request is sent
    r = requests.post('https://somerestapi.com/post-here', data={'birthday': '9/9/3999'})
    r.raise_for_status()
except requests.exceptions.HTTPError as e:
    print(e.response.text)
Here's a generic way to do things, which at least means that you don't have to surround each and every requests call with try ... except:
import logging
import traceback

import requests

logger = logging.getLogger(__name__)

# see the docs: if you set no timeout the call never times out! A tuple means "max
# connect time" and "max read time"
DEFAULT_REQUESTS_TIMEOUT = (5, 15) # for example

def log_exception(e, verb, url, kwargs):
    # the reason for making this a separate function will become apparent
    raw_tb = traceback.extract_stack()
    if 'data' in kwargs and len(kwargs['data']) > 500: # anticipate giant data string
        kwargs['data'] = f'{kwargs["data"][:500]}...'
    msg = f'BaseException raised: {e.__class__.__module__}.{e.__class__.__qualname__}: {e}\n' \
        + f'verb {verb}, url {url}, kwargs {kwargs}\n\n' \
        + 'Stack trace:\n' + ''.join(traceback.format_list(raw_tb[:-2]))
    logger.error(msg)

def requests_call(verb, url, **kwargs):
    response = None
    exception = None
    try:
        if 'timeout' not in kwargs:
            kwargs['timeout'] = DEFAULT_REQUESTS_TIMEOUT
        response = requests.request(verb, url, **kwargs)
    except BaseException as e:
        log_exception(e, verb, url, kwargs)
        exception = e
    return (response, exception)
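NB
Be aware of the builtin ConnectionError, which has nothing to do with the class requests.ConnectionError*. I assume the latter is more common in this context, but I have no real idea...
When examining a non-None returned exception, note that requests.RequestException, the superclass of all the requests exceptions (including requests.ConnectionError), is not "requests.exceptions.RequestException" according to the docs. Maybe it has changed since the accepted answer.**
Obviously this assumes a logger has been configured. Calling logger.exception in the except block might seem a good idea, but that would only give the stack within this method! Instead, get the trace leading up to the call to this method. Then log (with details of the exception and of the call which caused the problem).
*I looked at the source code: requests.ConnectionError subclasses the single class requests.RequestException, which subclasses the single class IOError (builtin).
**However, at the time of writing (February 2022), you find "requests.exceptions.RequestException" at the bottom of this page... but it links to the above page: confusing.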
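The naming confusion in these notes can be checked directly against an installed copy of Requests. The sketch below only inspects classes and makes no network calls. As far as the package itself is concerned, the short and the long spelling appear to be the same object, because the top-level package re-exports the names from requests.exceptions.

```python
import builtins

import requests

# Both spellings refer to the same class object.
same_class = requests.RequestException is requests.exceptions.RequestException

# requests.ConnectionError derives from RequestException, which in turn derives
# from IOError (an alias of OSError in Python 3) ...
derives_from_base = issubclass(requests.ConnectionError, requests.RequestException)
base_is_oserror = issubclass(requests.RequestException, OSError)

# ... but it is unrelated to the builtin ConnectionError.
related_to_builtin = issubclass(requests.ConnectionError, builtins.ConnectionError)

print(same_class, derives_from_base, base_is_oserror, related_to_builtin)
# True True True False
```

In practice this means an except ConnectionError clause written without the requests prefix will not catch connection failures raised by Requests.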
Usage is simple enough:
search_response, exception = utilities.requests_call('get',
    f'http://localhost:9200/my_index/_search?q={search_string}')
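First you check the response: if it is None, something funny has happened and you will have an exception, which has to be acted on in some way depending on context (and on the exception). In GUI applications (PyQt5) I usually implement a "visual log" to give some output to the user (and also log simultaneously to the log file), but messages added there should be non-technical. So something like this might typically follow: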
if search_response is None:
    # you might check here for (e.g.) a requests.Timeout, tailoring the message
    # accordingly, as the kind of error anyone might be expected to understand
    msg = f'No response searching on |{search_string}|. See log'
    MainWindow.the().visual_log(msg, log_level=logging.ERROR)
    return
response_json = search_response.json()
if search_response.status_code != 200: # NB 201 ("created") may be acceptable sometimes...
    msg = f'Bad response searching on |{search_string}|. See log'
    MainWindow.the().visual_log(msg, log_level=logging.ERROR)
    # usually response_json will give full details about the problem
    log_msg = f'search on |{search_string}| bad response\n{json.dumps(response_json, indent=4)}'
    logger.error(log_msg)
    return
# now examine the keys and values in response_json: these may of course
# indicate an error of some kind even though the response returned OK (status 200)...
Given that the stack trace is logged automatically, you often need no more than that...
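The logged stack trace comes from traceback.extract_stack() in log_exception rather than from the exception's own traceback. A small stdlib-only sketch of the difference follows. The function names are invented for the demonstration, with wrapper standing in for requests_call and caller for the application code.

```python
import traceback


def failing_call():
    raise ValueError("boom")


def wrapper():
    try:
        failing_call()
    except ValueError as e:
        # Frames recorded on the exception: from the try block downwards only.
        exc_frames = [f.name for f in traceback.extract_tb(e.__traceback__)]
        # Frames of the live call stack: everything that led up to this point.
        stack_frames = [f.name for f in traceback.extract_stack()]
        return exc_frames, stack_frames


def caller():
    return wrapper()


exc_frames, stack_frames = caller()
print(exc_frames)         # ['wrapper', 'failing_call']
print(stack_frames[-2:])  # ['caller', 'wrapper']
```

The exception's traceback never mentions caller, which is why logging it alone would not show where in the application the failing request was made.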
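However, to cross the Ts:
If, as above, an exception gives the message "No response" and a non-200 status the message "Bad response", I suggest that
a missing expected key in the response's JSON structure should give rise to the message "Anomalous response"
an out-of-range or strange value to the message "Unexpected response"
and the presence of a key such as "error" or "errors", with value True or whatever, to the message "Error response"
These may or may not prevent the code from continuing.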
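As a rough sketch of how these message categories could be derived in one place, the function below maps a status code and a decoded JSON payload to one of the labels. The name response_category and its signature are hypothetical and exist only for this illustration. The "Unexpected" category is left out because what counts as an out-of-range value depends on the application.

```python
def response_category(status_code, payload, required_keys=(), ok_statuses=(200,)):
    """Hypothetical helper: return 'No', 'Bad', 'Anomalous', 'Error' or None."""
    if status_code is None:              # no response at all: an exception was raised
        return "No"
    if status_code not in ok_statuses:   # e.g. 404 or 500
        return "Bad"
    if any(key not in payload for key in required_keys):
        return "Anomalous"               # an expected key is missing
    if payload.get("error") or payload.get("errors"):
        return "Error"                   # status was fine, but the body reports a problem
    return None                          # nothing to complain about


print(response_category(200, {"hits": [], "errors": True}, required_keys=["hits"]))
# Error
```

The returned label can then be dropped into a user-facing message such as f'{label} response searching on ...'.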
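... and in fact, to my mind, it is worth making the process even more generic. For me, these next functions typically cut 20 lines of code using the above requests_call down to about 3, and standardise most of your handling and your log messages. With more than a handful of requests calls in your project, the code becomes a lot nicer and less bloated: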
import json  # needed for json.dumps below

def log_response_error(response_type, call_name, deliverable, verb, url, **kwargs):
    # NB this function can also be used independently
    if response_type == 'No': # exception was raised (and logged)
        if isinstance(deliverable, requests.Timeout):
            MainWindow.the().visual_log(f'Time out of {call_name} before response received!', logging.ERROR)
            return
    else:
        if isinstance(deliverable, BaseException):
            # NB if response.json() raises an exception we end up here
            log_exception(deliverable, verb, url, kwargs)
        else:
            # if we get here no exception has been raised, so no stack trace has yet been logged.
            # a response has been returned, but is either "Bad" or "Anomalous"
            response_json = deliverable.json()
            raw_tb = traceback.extract_stack()
            if 'data' in kwargs and len(kwargs['data']) > 500: # anticipate giant data string
                kwargs['data'] = f'{kwargs["data"][:500]}...'
            added_message = ''
            if hasattr(deliverable, 'added_message'):
                added_message = deliverable.added_message + '\n'
                del deliverable.added_message
            call_and_response_details = f'{response_type} response\n{added_message}' \
                + f'verb {verb}, url {url}, kwargs {kwargs}\nresponse:\n{json.dumps(response_json, indent=4)}'
            logger.error(f'{call_and_response_details}\nStack trace: {"".join(traceback.format_list(raw_tb[:-1]))}')
    MainWindow.the().visual_log(f'{response_type} response {call_name}. See log.', logging.ERROR)
def check_keys(req_dict_structure, response_dict_structure, response):
    # both structures MUST be dict
    if not isinstance(req_dict_structure, dict):
        response.added_message = f'req_dict_structure not dict: {type(req_dict_structure)}\n'
        return False
    if not isinstance(response_dict_structure, dict):
        response.added_message = f'response_dict_structure not dict: {type(response_dict_structure)}\n'
        return False
    for dict_key in req_dict_structure.keys():
        if dict_key not in response_dict_structure:
            response.added_message = f'key |{dict_key}| missing\n'
            return False
        req_value = req_dict_structure[dict_key]
        response_value = response_dict_structure[dict_key]
        if isinstance(req_value, dict):
            # if the response at this point is a list apply the req_value dict to each element:
            # failure in just one such element leads to "Anomalous response"...
            if isinstance(response_value, list):
                for resp_list_element in response_value:
                    if not check_keys(req_value, resp_list_element, response):
                        return False
            elif not check_keys(req_value, response_value, response): # any other response value must be a dict (tested in next level of recursion)
                return False
        elif isinstance(req_value, list):
            if not isinstance(response_value, list): # if the req_value is a list the response must be one
                response.added_message = f'key |{dict_key}| not list: {type(response_value)}\n'
                return False
            # it is OK for the value to be a list, but these must be strings (keys) or dicts
            for req_list_element, resp_list_element in zip(req_value, response_value):
                if isinstance(req_list_element, dict):
                    if not check_keys(req_list_element, resp_list_element, response):
                        return False
                # elif, so that a dict element which passed the check above is not rejected as "not string"
                elif not isinstance(req_list_element, str):
                    response.added_message = f'req_list_element not string: {type(req_list_element)}\n'
                    return False
                elif req_list_element not in response_value:
                    response.added_message = f'key |{req_list_element}| missing from response list\n'
                    return False
        # put None as a dummy value (otherwise something like {'my_key'} will be seen as a set, not a dict)
        elif req_value is not None:
            response.added_message = f'required value of key |{dict_key}| must be None (dummy), dict or list: {type(req_value)}\n'
            return False
    return True
def process_json_requests_call(verb, url, **kwargs):
    # "call_name" is a mandatory kwarg
    if 'call_name' not in kwargs:
        raise Exception('kwarg "call_name" not supplied!')
    call_name = kwargs['call_name']
    del kwargs['call_name']
    required_keys = {}
    if 'required_keys' in kwargs:
        required_keys = kwargs['required_keys']
        del kwargs['required_keys']
    acceptable_statuses = [200]
    if 'acceptable_statuses' in kwargs:
        acceptable_statuses = kwargs['acceptable_statuses']
        del kwargs['acceptable_statuses']
    exception_handler = log_response_error
    if 'exception_handler' in kwargs:
        exception_handler = kwargs['exception_handler']
        del kwargs['exception_handler']
    response, exception = requests_call(verb, url, **kwargs)
    if response is None:
        exception_handler('No', call_name, exception, verb, url, **kwargs)
        return (False, exception)
    try:
        response_json = response.json()
    except BaseException as e:
        logger.error(f'response.status_code {response.status_code} but calling json() raised exception')
        # an exception raised at this point can't truthfully lead to a "No response" message... so say "bad"
        exception_handler('Bad', call_name, e, verb, url, **kwargs)
        return (False, response)
    status_ok = response.status_code in acceptable_statuses
    if not status_ok:
        response.added_message = f'status code was {response.status_code}'
        log_response_error('Bad', call_name, response, verb, url, **kwargs)
        return (False, response)
    check_result = check_keys(required_keys, response_json, response)
    if not check_result:
        log_response_error('Anomalous', call_name, response, verb, url, **kwargs)
    return (check_result, response)
Example call:
success, deliverable = utilities.process_json_requests_call('get',
    f'{ES_URL}{INDEX_NAME}/_doc/1',
    call_name=f'checking index {INDEX_NAME}',
    required_keys={'_source':{'status_text': None}})
if not success: return False
# here, we know the deliverable is a response, not an exception
# we also don't need to check for the keys being present
index_status = deliverable.json()['_source']['status_text']
if index_status != 'successfully completed':
    # ... i.e. an example of a 200 response, but an error nonetheless
    msg = f'Error response: ES index {INDEX_NAME} does not seem to have been built OK: cannot search'
    MainWindow.the().visual_log(msg)
    logger.error(f'index |{INDEX_NAME}|: deliverable.json() {json.dumps(deliverable.json(), indent=4)}')
    return False
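So the "visual log" message seen by the user in the case of the missing key "status_text", for example, would be "Anomalous response checking index XYZ. See log." (and the log would show the key in question).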
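NB
mandatory kwarg: call_name; optional kwargs: required_keys, acceptable_statuses, exception_handler.
the required_keys dict can be nested to any depth
finer-grained exception handling can be accomplished by including a function exception_handler in kwargs (though don't forget that requests_call will have logged the call details, the exception type and __str__, and the stack trace).
in the above I also implement a check on the key "data" in any kwargs which may be logged. This is because a bulk operation (e.g. to populate an index in the case of Elasticsearch) can consist of enormous strings. So curtail to the first 500 characters, for example.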
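To make the nesting of required_keys concrete, here is a deliberately simplified sketch of the idea. It is not the check_keys function above: it handles only nested dicts (no lists), and the name first_missing_key is made up for the illustration. It walks the required structure and reports the dotted path of the first key that the response lacks.

```python
def first_missing_key(required, actual, path=""):
    """Return the dotted path of the first required key absent from actual, or None."""
    for key, sub_required in required.items():
        here = f"{path}.{key}" if path else key
        if not isinstance(actual, dict) or key not in actual:
            return here
        if isinstance(sub_required, dict):
            # a nested dict of requirements: descend one level
            missing = first_missing_key(sub_required, actual[key], here)
            if missing is not None:
                return missing
    return None


required_keys = {"_source": {"status_text": None}}
good = {"_source": {"status_text": "successfully completed"}}
bad = {"_source": {"title": "no status here"}}

print(first_missing_key(required_keys, good))  # None
print(first_missing_key(required_keys, bad))   # _source.status_text
```

As in the original, None serves as a dummy value for leaf keys, so that the requirement stays a dict rather than turning into a set.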
PS Yes, I do know about the elasticsearch Python module (a "thin wrapper" around requests). All the above is for illustration purposes.
socket.timeout exception: github.com/kennethreitz/requests/issues/1236