abhinavsingh / proxy.py

💫 Ngrok FRP Alternative • ⚡ Fast • 🪶 Lightweight • 0️⃣ Dependency • 🔌 Pluggable • 😈 TLS interception • 🔒 DNS-over-HTTPS • 🔥 Poor Man's VPN • ⏪ Reverse & ⏩ Forward • 👮🏿 "Proxy Server" framework • 🌐 "Web Server" framework • ➵ ➶ ➷ ➠ "PubSub" framework • 👷 "Work" acceptor & executor framework

Home Page: https://abhinavsingh.com/proxy-py-a-lightweight-single-file-http-proxy-server-in-python/


Replay from response cache

rthill91 opened this issue · comments

This is essentially a follow-up to #319.

A solution was merged into the cache-server branch but, as far as I can tell, has never made its way into develop or a release. Is this a dead/forgotten feature, or are there still plans to see it merged?

Thanks

There are 2 separate things:

  1. Cache server for production usage
  2. Cache replay only during tests

AFAIK, 2) can be achieved and already exists in some form. Sorry, I haven't tested or used it myself in a long time. See

def vcr(self) -> Generator[None, None, None]:
and
def test_proxy_vcr(self) -> None:

You might also be interested in the cache responses plugin, which can cache everything it sees passing through it (e.g. txt, mp4, json, xml, jpg, etc.).

Finally, using a cache server in production is another story altogether. The cache-server branch contains code for a production cache server. Imagine Squid. The issue with a full-fledged cache server is cache header management: if not done correctly, browsers/clients will behave unexpectedly. This has also been confirmed by other users on the cache threads.
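To illustrate why cache header management is tricky, here is a minimal freshness check that only honors Cache-Control max-age, no-store and no-cache. This is an illustrative sketch, not proxy.py code; a real cache must also handle Expires, Vary, private, and ETag/Last-Modified revalidation.

```python
import re

def is_fresh(cache_control: str, stored_at: float, now: float) -> bool:
    """Decide whether a cached response may be served without revalidation."""
    if 'no-store' in cache_control or 'no-cache' in cache_control:
        return False  # response must not be served from cache as-is
    m = re.search(r'max-age=(\d+)', cache_control)
    if m is None:
        return False  # no freshness lifetime given; require revalidation
    # Fresh only while the stored response's age is within max-age.
    return (now - stored_at) <= int(m.group(1))
```

Getting any of these directives wrong is exactly how browsers/clients end up behaving unexpectedly.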

My hope was to use proxy.py to cache all responses during an end-to-end testing session, and then to be able to re-use those responses for subsequent test runs.

Essentially what vcrpy does, except my requests come from multiple processes. As near as I can tell, the cache responses plugin records responses but has no way of replaying them later. The cache-server branch does, but it hasn't been touched in some time.

It's also entirely possible I've just misunderstood how something works, but my basic setup was

proxy --plugins proxy.plugin.CacheResponsesPlugin
curl -x "http://localhost:8899" "https://www.google.com"
# disconnect network
curl -x "http://localhost:8899" "https://www.google.com"
# curl hangs, I would expect proxy.py to replay cached response here
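What's being asked for here is, conceptually, a replay step on top of the recording the plugin already does. A minimal sketch of that idea follows; the cache key scheme and on-disk layout are assumptions for illustration, not what CacheResponsesPlugin actually uses.

```python
import hashlib
from pathlib import Path
from typing import Optional

def cache_key(method: bytes, url: bytes) -> str:
    # Derive a stable filename from the request line.
    return hashlib.sha256(method + b' ' + url).hexdigest()

def store(cache_dir: Path, method: bytes, url: bytes, response: bytes) -> None:
    # Record: write the raw response bytes under the request's key.
    cache_dir.mkdir(parents=True, exist_ok=True)
    (cache_dir / cache_key(method, url)).write_bytes(response)

def replay(cache_dir: Path, method: bytes, url: bytes) -> Optional[bytes]:
    # Replay: serve the recorded bytes if present, else signal a miss.
    path = cache_dir / cache_key(method, url)
    return path.read_bytes() if path.exists() else None
```

With something like this wired into a plugin, the second curl above could be answered from disk instead of hanging.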

@rthill91 You are correct. Looking at the code here, I realize the VCR facility only enables the cache responses plugin, which by itself is not responsible for replaying the responses.

Ref

@contextlib.contextmanager
def vcr(self) -> Generator[None, None, None]:
    try:
        CacheResponsesPlugin.ENABLED.set()
        yield
    finally:
        CacheResponsesPlugin.ENABLED.clear()
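For reference, the gating pattern above can be exercised standalone. The stub below re-creates it with a stand-in class (not the real CacheResponsesPlugin) to show what the context manager guarantees: recording is on only inside the block, and always switched off afterwards, even on error.

```python
import contextlib
import threading
from typing import Generator

class FakeCachePlugin:
    # Stand-in for CacheResponsesPlugin: a class-level Event acts as
    # the global "recording enabled" flag.
    ENABLED = threading.Event()

@contextlib.contextmanager
def vcr() -> Generator[None, None, None]:
    try:
        FakeCachePlugin.ENABLED.set()    # recording on inside the block
        yield
    finally:
        FakeCachePlugin.ENABLED.clear()  # off afterwards, even on error

with vcr():
    assert FakeCachePlugin.ENABLED.is_set()
assert not FakeCachePlugin.ENABLED.is_set()
```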

Can you point to the code in the cache-server branch that is responsible for replaying the cache? Sorry, I haven't visited that branch myself in a while. It should be relatively easy to pull just this use case out of the cache-server branch.

Irrespective of how we achieve it, here is what needs to be done:

  1. The CacheResponses plugin must surface the path to the cache file per request.
  2. This can easily be done using the logging context hook that proxy plugins can optionally implement. For example, see how the ProgramName plugin exposes its result here:
     def on_access_log(self, context: Dict[str, Any]) -> Optional[Dict[str, Any]]:
         context.update({'client_ip': self.program_name})
         return context
  3. Once the CacheResponses plugin has surfaced the artifacts, our TestCase can then simply read/replay responses out of these cached files.
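Steps 1 and 2 above can be sketched as follows. Only the on_access_log signature comes from the thread; the SketchCachePlugin name and the cache_file_path key are assumptions for illustration.

```python
from typing import Any, Dict, Optional

class SketchCachePlugin:
    """Hypothetical plugin that remembers where it wrote the cache file
    and surfaces that path via the on_access_log context hook."""

    def __init__(self) -> None:
        self.cache_file_path: Optional[str] = None

    def on_access_log(self, context: Dict[str, Any]) -> Optional[Dict[str, Any]]:
        # Surface the artifact path so a TestCase can later read/replay it.
        context.update({'cache_file_path': self.cache_file_path})
        return context

plugin = SketchCachePlugin()
plugin.cache_file_path = '/tmp/proxy-cache/abc123.txt'
log_context = plugin.on_access_log({'client_ip': '127.0.0.1'})
```

A TestCase could then collect cache_file_path values from access-log contexts and feed the recorded responses back on subsequent runs.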

This will still require some code to pull off end-to-end. Let me know if these pointers help. Try this strategy locally and see if it works for you; if it does, I'll be happy to include the necessary bits in the library itself.