stoplightio / prism

Turn any OpenAPI2/3 and Postman Collection file into an API server with mocking, transformations and validations.

Home Page: https://stoplight.io/open-source/prism

Allow Binary Requests To Succeed in Prism Proxy Mode

travisgosselin opened this issue · comments

There are varying behaviors I have noticed when working with multipart/form-data request bodies behind the prism proxy. First off, I understand that according to your documentation you do not support multipart/form-data or binary files:
https://docs.stoplight.io/docs/prism/1593d1470e4df-concepts#content-negotiation

I'd like to understand what "support" means specifically for proxy mode? For example, does it mean that you do not validate or log any discrepancies in these request/response bodies, but still forward the bodies through? Or does it mean that the proxy is expected to completely fail if it sees a content type of multipart/form-data?

From my testing, I have noticed the following:

V4.14.1 of the Prism Proxy

  • When making a request with an xlsx file it succeeds and proxies the file through (though it doesn't validate the schema as I would expect). When passing large csv or txt files it also works and proxies it along when executing the local CLI on a Windows OS.
  • When executing the same parameters via Docker container stoplight/prism:4.14.1, the exact same series of requests fail:
/usr/src/prism/node_modules/split2/index.js:44
      push(this, this.mapper(list[i]))
                      ^
SyntaxError: Unexpected token � in JSON at position 259901
    at Transform.parse [as mapper] (<anonymous>)
    at Transform.transform [as _transform] (/usr/src/prism/node_modules/split2/index.js:44:23)
    at Transform._read (/usr/src/prism/node_modules/readable-stream/lib/_stream_transform.js:166:10)
    at Transform._write (/usr/src/prism/node_modules/readable-stream/lib/_stream_transform.js:155:83)
    at doWrite (/usr/src/prism/node_modules/readable-stream/lib/_stream_writable.js:390:139)
    at writeOrBuffer (/usr/src/prism/node_modules/readable-stream/lib/_stream_writable.js:381:5)
    at Transform.Writable.write (/usr/src/prism/node_modules/readable-stream/lib/_stream_writable.js:302:11)
    at Socket.ondata (node:internal/streams/readable:754:22)
    at Socket.emit (node:events:513:28)
    at addChunk (node:internal/streams/readable:315:12)

I suspect there is a platform-specific reason this works on Windows but not in the Linux-based container provided. Is that the case?

V5.3.1 of the Prism Proxy

When making a request with an xlsx file it will fail with the following error and the proxy will shut down even while executing locally on the same Windows platform:

Recv failure: Connection was reset

Passing other csv or txt payloads seems to work as long as the payload contains no special characters (such as %); those cause the prism proxy to crash with the same error. It seems no tagged container exists yet for testing this version on Docker, but I did try the "master" tag and it failed in a similar way.

What is Supported?

Even though the documentation notes there is no support for multipart/form-data, I do see this recently completed merge that is noted for release in v5.2.0: #2321. I am not sure whether it is intended to support the use cases I describe above or whether it is something more targeted.

Your guidance in indicating if multipart/form-data in part or in whole can be used to pass through the proxy would be appreciated (even if just not validated would be fine), including expectations on usage on different platforms (windows vs linux).

Thanks for the question @travisgosselin - I'm getting clarification from the team so we can update the docs to be more clear.

Thinking about future enhancements... is your use case primarily xlsx, csv, and txt uploads on linux?

Thanks for the quick response @ryotrellim.
Yes I'd say that xlsx and csv are our primary use cases of content we hoped to have as multipart uploads. I was hoping that if content-type in the spec was set to binary it wouldn't matter the specific extension or file types (agnostic). We will have some image uploads coming soon too for example.

One more question @travisgosselin:

When making a request with an xlsx file it succeeds and proxies the file through (though it doesn't validate the schema as I would expect)

Can you say more about what validation you expect? Do you have an example OAS? AFAIK, we haven't historically validated non-JSON objects.

hmmm... the way I wrote that, I think it can be interpreted as the opposite of what I was saying. I'll try to clarify:

Original Intent

When making a request with an xlsx file it succeeds and proxies the file through (though it doesn't validate the schema as I would expect)

I intended to indicate that the proxy actually worked just fine with large xlsx files, and that it does not validate the multipart payloads against the schema at all. That is expected, as I didn't think it should validate that type of content. The intent was not to indicate that the schema should be validated.

Further Consideration

At this point, I'd be happy with support of just passing the multipart payload through without any validation. But, there is an opportunity I think to validate the multipart form names for binary formatted data, and potentially deeper validation for the rest of the data. Consider an endpoint like this in OAS:

/users/imports:
    post:
      summary: "Queues a Bulk User Import"
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: "#/components/schemas/User"
          multipart/form-data:
            schema:
              type: "object"
              properties:
                profileImage:
                  type: "string"
                  format: "binary"

In the above example, I can send a CURL request and indicate the profileImage named in the multipart upload:

curl -X POST -H "Content-Type: multipart/form-data" -F "profileImage=@LocalFilePath" /users/imports

However, I noticed when testing with Prism, that the name within the "schema" is not validated... in this case profileImage. So the following request is completely valid and does not raise any validation warnings:

curl -X POST -H "Content-Type: multipart/form-data" -F "image=@LocalFilePath" /users/imports

Of course, the server itself may fail if it is looking for a file with the particular name profileImage. But generally speaking, if I pass any named binary information, I would expect the name to be one defined in the OAS.
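
To make the name check concrete, here is a minimal sketch of what I have in mind. It is not Prism code: the function name undeclared_part_names is hypothetical, and it assumes the multipart schema lists its fields under "properties", as OpenAPI 3 expects. It only compares the part names sent in the request with the names the schema declares.

```python
from typing import Iterable, List


def undeclared_part_names(schema: dict, part_names: Iterable[str]) -> List[str]:
    """Return the multipart part names that the schema does not declare."""
    # Hypothetical helper: only names are compared, part contents are ignored.
    declared = set(schema.get("properties", {}))
    return [name for name in part_names if name not in declared]


# Assumed schema, mirroring the multipart/form-data entry of the endpoint above.
multipart_schema = {
    "type": "object",
    "properties": {
        "profileImage": {"type": "string", "format": "binary"},
    },
}

print(undeclared_part_names(multipart_schema, ["profileImage"]))
print(undeclared_part_names(multipart_schema, ["image"]))
```

With the two curl requests above, the first call would report nothing and the second would report the part named image, which could then surface as a validation warning.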

Additionally, non-binary data could be validated as a future enhancement potentially. Further validation or attempt to validate could be determined based on contentType specification as well. My reference for the OAS usage of multipart is here: https://swagger.io/docs/specification/describing-request-body/multipart-requests/

However, at this time, my hope and necessary MVP is simply for the proxy to pass multipart/form-data through without any validation. That would keep validation working on this endpoint for the other content types instead of forcing me to bypass it entirely; the example above, for instance, also accepts application/json as normal.
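
A rough sketch of that MVP behaviour, under stated assumptions: check_then_forward and require_name are hypothetical names, this is not how Prism is implemented, and the JSON validator is a stand-in for real schema validation. The point is only that a non-JSON body is never decoded or parsed, just handed on byte for byte.

```python
import json
from typing import Callable, List, Tuple


def check_then_forward(
    content_type: str,
    body: bytes,
    validate_json: Callable[[object], List[str]],
) -> Tuple[bytes, List[str]]:
    """Return the body to forward plus any validation problems found."""
    # Drop parameters such as "; boundary=..." before comparing media types.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != "application/json":
        # Binary or multipart payload: forward untouched, report nothing.
        return body, []
    try:
        document = json.loads(body)
    except ValueError as error:  # covers malformed JSON and undecodable bytes
        return body, ["invalid JSON: {}".format(error)]
    return body, validate_json(document)


def require_name(document: object) -> List[str]:
    """Stand-in validator (assumption): the JSON body must have a 'name' key."""
    if isinstance(document, dict) and "name" in document:
        return []
    return ["missing 'name'"]


binary_payload = b"PK\x03\x04\x00\xff%not-json"
forwarded, problems = check_then_forward(
    "multipart/form-data; boundary=abc", binary_payload, require_name
)
print(forwarded == binary_payload, problems)
print(check_then_forward("application/json", b"{}", require_name)[1])
```

The body is returned unchanged in every branch, so validation problems are only reported and never block the upload.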

This might not really be related to multipart, but more to the handling of binary files in general; see #2349.

The fix for this has been released in version 5.3.2.