Making Proxied Requests with Prefect

HTTP(S) Proxies

Proxies are programs that sit between a client and a server and relay network requests on the client’s behalf. They are useful in a variety of privacy and security contexts and are typically found in the locked-down networks of large enterprises. Let’s see how to use them with Prefect.

Setting up a proxy

If you don’t already have a proxy to connect to, it’s easy to set one up. We’ll use mitmproxy, a popular open source proxy written in Python. You can install it directly, but we’ll use Docker to keep things encapsulated. Pull the image, create a directory to hold mitmproxy’s configuration and certificates, and run the container,

docker pull mitmproxy/mitmproxy

mkdir ~/.mitmproxy

docker run --rm -it -v ~/.mitmproxy:/home/mitmproxy/.mitmproxy -p 8080:8080 mitmproxy/mitmproxy

In the resulting window we’ll be able to inspect requests as they hit the proxy. Leave this window running and open up another terminal. We can now start making requests,

# This will NOT pass through the proxy
curl http://httpbin.org/uuid

# This WILL pass through the proxy
# (for plain HTTP, curl only reads the lowercase http_proxy variable)
http_proxy=http://localhost:8080/ curl http://httpbin.org/uuid

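Making HTTP requests is great, but we want to be secure and use HTTPS. mitmproxy intercepts HTTPS traffic and re-signs it with its own certificate authority (CA), so for the request to succeed we have to tell curl that it’s ok to trust that CA. mitmproxy wrote its certificates to ~/.mitmproxy when it started, so we pass the CA certificate from that directory to our curl command,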

HTTPS_PROXY=localhost:8080 curl --cacert ~/.mitmproxy/mitmproxy-ca-cert.pem https://httpbin.org/uuid

Note: curl supports a --proxy argument that we could have supplied instead of setting the proxy environment variables, but lots of tools have standardized around these environment variables.

Proxies with Prefect

Now that we have a local proxy and know a bit about them, let’s see how Prefect comes into play. When making proxied requests in Prefect you’re mostly delegating the underlying details to your networking library. For this example, we’ll use httpx.

There are two ways to declare your proxies,

  1. By setting HTTP_PROXY, HTTPS_PROXY, or ALL_PROXY as environment variables
  2. By passing the proxy configuration directly to an httpx.Client instance (the proxies argument in older httpx releases; proxy or mounts in newer ones)
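To get a feel for the first option, here is a small sketch of how a Python process sees these variables, using only the standard library’s urllib.request.getproxies(). It illustrates the environment variable convention in general and is not a description of httpx internals. The port 9090 is a made-up value used only to show that a lowercase variable wins over its uppercase twin.

```python
import os
from urllib.request import getproxies

# Assumed address of the local mitmproxy container started earlier.
PROXY_URL = "http://localhost:8080"

# Clear any pre-existing settings so the example is deterministic.
for name in ("HTTP_PROXY", "http_proxy", "HTTPS_PROXY", "https_proxy"):
    os.environ.pop(name, None)

# Setting the variable in-process has the same effect as
# `HTTPS_PROXY=http://localhost:8080 python script.py` on the command line.
os.environ["HTTPS_PROXY"] = PROXY_URL
proxies = getproxies()
print(proxies.get("https"))  # http://localhost:8080

# A lowercase variable takes precedence over the uppercase one.
# Port 9090 is hypothetical and only illustrates the precedence rule.
os.environ["https_proxy"] = "http://localhost:9090"
print(getproxies().get("https"))  # http://localhost:9090
```

The keys of the returned dict are URL schemes, so a value under "https" applies to https:// requests only.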

Let’s write a basic flow and try the first option,

import httpx
from prefect import task, flow, get_run_logger

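@task
def fetch_uuid(protocol):
    # httpx picks up HTTP_PROXY / HTTPS_PROXY / ALL_PROXY from the environment by default
    resp = httpx.get(f"{protocol}://httpbin.org/uuid")
    resp.raise_for_status()  # fail the task on an error response
    return resp.json()["uuid"]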

@flow()
def log_uuid(protocol):
    uuid = fetch_uuid(protocol)
    get_run_logger().info(uuid)

if __name__ == "__main__":
    log_uuid("http")

Save this as proxy.py. Running the flow with HTTP_PROXY set sends the request through the proxy, and it shows up in the mitmproxy window,

HTTP_PROXY=localhost:8080 python proxy.py

This is great, but again, we want to be safe and use HTTPS. Fortunately for us, we can set another environment variable, SSL_CERT_FILE, to point at the CA certificate of our proxy so that it is trusted. Using the code above, but changing log_uuid("http") to log_uuid("https"), we can now make the proxied HTTPS request,

HTTPS_PROXY=localhost:8080 SSL_CERT_FILE=~/.mitmproxy/mitmproxy-ca-cert.pem python proxy.py
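The second option from the list above, configuring the client directly, is not shown in this post. The following is a minimal sketch under some assumptions: a recent httpx release where the keyword is proxy (older releases used proxies), the mitmproxy container from earlier listening on localhost:8080, and its CA certificate sitting in ~/.mitmproxy. The names proxy_client_kwargs and fetch_uuid_via_proxy are hypothetical helpers, not part of httpx or Prefect.

```python
import ssl
from pathlib import Path

# Assumptions: local mitmproxy on port 8080, CA certificate in ~/.mitmproxy.
PROXY_URL = "http://localhost:8080"
CA_CERT = Path.home() / ".mitmproxy" / "mitmproxy-ca-cert.pem"


def proxy_client_kwargs(proxy_url, ca_cert=None):
    """Build keyword arguments for httpx.Client (hypothetical helper)."""
    kwargs = {"proxy": proxy_url}
    if ca_cert is not None:
        # Trust only the given CA file; the file must exist at this point.
        kwargs["verify"] = ssl.create_default_context(cafile=str(ca_cert))
    return kwargs


def fetch_uuid_via_proxy(proxy_url=PROXY_URL, ca_cert=CA_CERT):
    """Fetch a UUID over HTTPS through an explicitly configured proxy."""
    # Third-party import kept local so the helper above needs only the stdlib.
    import httpx

    with httpx.Client(**proxy_client_kwargs(proxy_url, ca_cert)) as client:
        resp = client.get("https://httpbin.org/uuid")
        resp.raise_for_status()
        return resp.json()["uuid"]
```

Calling fetch_uuid_via_proxy() from inside the fetch_uuid task would replace the bare httpx.get call. The trade-off is that the proxy address lives in code rather than in the environment, which is convenient for a single flow but less so when the same code has to run on networks with different proxies.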
