
Proxying to another web service with Flask

I want to proxy requests made to my Flask app to another web service running locally on the machine. I'd rather use Flask for this than our higher-level nginx instance so that we can reuse our existing authentication system built into our app. The more we can keep this "single sign on" the better.

Is there an existing module or other code to do this? Trying to bridge the Flask app through to something like httplib or urllib is proving to be a pain.

This question is also relevant when building AJAX services for old browsers like IE7, which do not support cross-domain requests.
What specific problem are you having with httplib?
@jd: Given that flask is on the app side of WSGI, I am not sure I get all of the data to effectively forward. For example, the Flask request object doesn't seem to include the raw request (or even the request headers) that I'd want to pass into httplib. It's not that it's impossible, it's just a pain and I was hoping for an existing module which did it already.

Stephen Ostermiller

I spent a good deal of time working on this same thing and eventually found a solution using the requests library that seems to work well. It even handles setting multiple cookies in one response, which took a bit of investigation to figure out. Here's the flask view function:

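from flask import request, Response
import requests

def _proxy(*args, **kwargs):
    # Forward the incoming request to the other service. request.host_url
    # includes the scheme and a trailing slash, so the replacement must too.
    resp = requests.request(
        method=request.method,
        url=request.url.replace(request.host_url, 'http://new-domain.example/'),
        headers={key: value for (key, value) in request.headers if key.lower() != 'host'},
        data=request.get_data(),
        cookies=request.cookies,
        allow_redirects=False)

    # Drop headers that no longer describe the body or that apply only to the
    # connection between this app and the proxied service.
    excluded_headers = ['content-encoding', 'content-length', 'transfer-encoding', 'connection']
    # resp.raw.headers keeps repeated headers such as Set-Cookie as separate entries.
    headers = [(name, value) for (name, value) in resp.raw.headers.items()
               if name.lower() not in excluded_headers]

    response = Response(resp.content, resp.status_code, headers)
    return response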

Update April 2021: excluded_headers should probably include all "hop-by-hop headers" defined by RFC 2616 section 13.5.1.
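
As a rough sketch of that update, the exclusion list could be built from the hop-by-hop headers that RFC 2616 section 13.5.1 lists, plus the two body-related headers the answer already drops. The names HOP_BY_HOP, BODY_DEPENDENT, EXCLUDED_HEADERS and filter_response_headers are made up for this illustration and are not part of Flask or requests.

```python
# Hop-by-hop headers as listed in RFC 2616 section 13.5.1 (lowercased).
HOP_BY_HOP = frozenset({
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailers", "transfer-encoding", "upgrade",
})

# Also dropped in the answer above: resp.content is already decoded and fully
# read, so the upstream encoding and length no longer describe the body.
BODY_DEPENDENT = frozenset({"content-encoding", "content-length"})

EXCLUDED_HEADERS = HOP_BY_HOP | BODY_DEPENDENT


def filter_response_headers(header_items):
    """Return the (name, value) pairs whose name is not in EXCLUDED_HEADERS.

    Duplicate names (for example several Set-Cookie headers) are kept as
    separate pairs, in their original order.
    """
    return [(name, value) for name, value in header_items
            if name.lower() not in EXCLUDED_HEADERS]
```

In the view function this would presumably be called as headers = filter_response_headers(resp.raw.headers.items()).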


@Evan nice solution. It doesn't handle 3xx redirections, however, since the redirection url might point to the proxied host
Could someone add how you call this in a MWE app?
This is great, many thanks! (This is what one needs to get ngrok to work with both front and back ends.) But, for me request.host_url includes http:// and also a trailing slash so the replace line for me was: request.url.replace(request.host_url, 'http://new-domain.com/')
@Ire I encountered this problem and have added an edit to fix. All I did was replace the header filter line with headers = [(name, value) if (name.lower() != 'location') else (name, value.replace('http://new-domain.com/', request.host_url)) for (name, value) in resp.raw.headers.items() if name.lower() not in excluded_headers]. This just fixes the URL in the Location header. (thanks @jbasko for pointing out the issue with the trailing slash)
You can also stream the response content, instead of reading it entirely on the server. For this replace resp.content above with resp.iter_content(chunk_size=10*1024) and add a content_type=resp.headers['Content-Type'] argument to the Response constructor.
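
The two URL fixes from the comments above (the scheme and trailing slash in request.host_url, and the Location header on redirects) can be pulled into small helpers. This is only a sketch: to_upstream_url and rewrite_location are hypothetical names, and the host_url and upstream_base arguments are assumed to both include the scheme and a trailing slash, as request.host_url does.

```python
def to_upstream_url(request_url, host_url, upstream_base):
    """Swap the proxy's own base URL for the upstream one.

    Only the leading host_url is replaced, so a copy of it appearing later
    in the URL (for example inside a query string) is left alone.
    """
    if not request_url.startswith(host_url):
        raise ValueError("request_url does not start with host_url")
    return upstream_base + request_url[len(host_url):]


def rewrite_location(headers, host_url, upstream_base):
    """Point Location headers that target the upstream back at the proxy."""
    rewritten = []
    for name, value in headers:
        if name.lower() == "location" and value.startswith(upstream_base):
            value = host_url + value[len(upstream_base):]
        rewritten.append((name, value))
    return rewritten
```

Replacing only the prefix, rather than calling str.replace on the whole string, is a design choice made for this sketch; the one-liners in the comments behave the same for ordinary URLs.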
jd.

I have an implementation of a proxy using httplib in a Werkzeug-based app (as in your case, I needed to use the webapp's authentication and authorization).

Although the Flask docs don't state how to access the HTTP headers, you can use request.headers (see Werkzeug documentation). If you don't need to modify the response, and the headers used by the proxied app are predictable, proxying is straightforward.

Note that if you don't need to modify the response, you should use the werkzeug.wsgi.wrap_file to wrap httplib's response stream. That allows passing of the open OS-level file descriptor to the HTTP server for optimal performance.
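
For illustration only, here is roughly what the forwarding step might look like with http.client (the Python 3 name for httplib). This is not the implementation described in this answer: the function name forward and its parameters are assumptions, it buffers the whole upstream body instead of streaming it through wrap_file, and it collapses repeated request headers into one.

```python
import http.client


def forward(method, path, headers, body, upstream_host, upstream_port):
    """Send one request upstream and return (status, header pairs, body bytes).

    headers is an iterable of (name, value) pairs, for example the result of
    iterating over Flask's request.headers.
    """
    # Drop Host so http.client fills in the upstream's own host name.
    out_headers = {k: v for k, v in headers if k.lower() != "host"}
    conn = http.client.HTTPConnection(upstream_host, upstream_port, timeout=10)
    try:
        conn.request(method, path, body=body, headers=out_headers)
        upstream = conn.getresponse()
        # getheaders() returns a list of (name, value) tuples.
        return upstream.status, upstream.getheaders(), upstream.read()
    finally:
        conn.close()
```

The returned status, headers, and body could then be handed to a Response object, after filtering out headers that should not be passed through.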


Thanks, I hacked something up this afternoon. Having all sorts of problems with cookies, though, since httplib doesn't handle them particularly well. Unfortunately I think I will need to modify the response to do some simple URL rewriting (ie, to
In my case there was just one cookie to catch, so a regex did the job to parse it; that was a lot easier to set up than Python's cookie libs.
Could you provide a link to your implementation, or the code itself in the body of the answer?
Here is a related SO answer with actual implementation.
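
On the cookie trouble mentioned in the comments above: if a regex is not enough, the standard library's http.cookies module can parse Set-Cookie values. A minimal sketch, assuming the Set-Cookie header values have already been collected into a list of strings; parse_set_cookie is a hypothetical helper name.

```python
from http.cookies import SimpleCookie


def parse_set_cookie(header_values):
    """Map cookie name -> (value, path) for each Set-Cookie header value."""
    parsed = {}
    for raw in header_values:
        # One SimpleCookie per header keeps attributes from mixing between cookies.
        jar = SimpleCookie()
        jar.load(raw)
        for name, morsel in jar.items():
            parsed[name] = (morsel.value, morsel["path"])
    return parsed
```

For example, parse_set_cookie(["session=abc123; Path=/; HttpOnly"]) gives {"session": ("abc123", "/")}.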
Joe Shaw

My original plan was for the public-facing URL to be something like http://www.example.com/admin/myapp proxying to http://myapp.internal.example.com/. Down that path leads madness.

Most webapps, particularly self-hosted ones, assume that they're going to be running at the root of an HTTP server and do things like reference other files by absolute path. To work around this, you have to rewrite URLs all over the place: Location headers and HTML, JavaScript, and CSS files.

I did write a Flask proxy blueprint which did this, and while it worked well enough for the one webapp I really wanted to proxy, it was not sustainable. It was a big mess of regular expressions.
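
To give a flavor of the kind of rewriting involved, here is a deliberately small sketch; it is not the code from that blueprint. It assumes the proxied app is mounted under a prefix such as /admin/myapp, and prefix_absolute_paths is a made-up name. It only handles quoted href, src, and action attributes.

```python
import re

# Matches href="/..." or src='/...' but not protocol-relative "//host/..." URLs.
_ABSOLUTE_ATTR = re.compile(
    r'''(?P<attr>\b(?:href|src|action)=)(?P<quote>["'])/(?!/)''')


def prefix_absolute_paths(html, prefix):
    """Prepend prefix (e.g. '/admin/myapp') to root-relative URLs in attributes."""
    return _ABSOLUTE_ATTR.sub(
        lambda m: f"{m.group('attr')}{m.group('quote')}{prefix}/", html)
```

This misses url(...) references in CSS, URLs built in JavaScript, and unquoted attributes, each of which would need its own pattern. That is how the pile of regular expressions grows.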

In the end, I set up a new virtual host in nginx and used its own proxying. Since both were at the root of the host, URL rewriting was mostly unnecessary. (And what little was necessary, nginx's proxy module handled.) The webapp being proxied to does its own authentication which is good enough for now.


Some illustration to "I set up a new virtual host" would be nice.
Definitely your last paragraph. Flask's strength is not as a proxy, so when possible, it's preferable to avoid using it as one. The only valid reason I can think of is that some application logic like authentication or authorization is necessary and not supported by the other application.