Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upload image issue #464

Closed
dfloresgonz opened this issue Jul 27, 2018 · 53 comments
Closed

Upload image issue #464

dfloresgonz opened this issue Jul 27, 2018 · 53 comments

Comments

@dfloresgonz
Copy link

dfloresgonz commented Jul 27, 2018

My image seems to be corrupted when uploaded to server, It only works with .txt files.

I'm using sls 1.27, EspressJS and multiparty
Is there a way to upload images without any problems?

Thanks.

function uploadImage(req, res) {
    let multiparty = require('multiparty');
    let fs = require('fs');
    let form = new multiparty.Form();
    form.parse(req, function(err, fields, files) {
        console.log(files);
        //uploading file...
        })).then(result => {
            res.status(200).send({ msj : 'Images were uploaded'});
        }).catch(err => {
            res.status(500).send(err);
        });
    });
}
const serverless = require('serverless-http');
const express = require('express')
const app = express()
require('./middlewares/authenticated');

var bodyParser = require('body-parser');
var cors       = require('cors');

app.use(cors({'origin' : '*'}));
app.use(bodyParser.urlencoded({ extended : false }) )
   .use(bodyParser.json());
service: my-backend

provider:
  name: aws
  runtime: nodejs6.10
  stage: prod
  region: us-east-1

plugins:
  - serverless-offline
  - serverless-s3-local

functions:
  app:
    handler: index.handler
    events:
      - http: ANY /
      - http: 'ANY {proxy+}'
  upload:
    handler: index.handler
    events:
      - http: 'POST /api/registro/uploadImage/'
@evangow
Copy link

evangow commented Sep 4, 2018

@dfloresgonz did you ever get this working? I'm running into the same issue.

@dfloresgonz
Copy link
Author

@evangow I read that sls offline doesnt support binary types so it will never work, but only offline. I used base64 to make it work on local env.

@evangow
Copy link

evangow commented Sep 4, 2018

@dfloresgonz Are you converting it to base64 on the client side, then sending it to the server? Or, are you using something like the the gist linked below to parse the event and pass it onto serverless-http?

Would you happen to have a code sample or gist you could share if you're doing it on the server side? I think I could probably figure it out on the client side, but I've been banging my head against the wall for hours looking through issues trying to sort this out.

Parser Gist: https://gist.github.com/lteacher/9ef1c7bc5908418b30a18719521ff3c7#file-parsers-js-L12-L41
^ Found in this issue: #230

@janwirth
Copy link

same here using

curl -X POST -F 'file=@/home/jan/figub/pass.mp3;type=audio/mpeg' localhost:3333/upload -H "Content-Type: audio/mpeg" --verbose -H "Accept: audio/mpeg"

@janwirth
Copy link

With both serverless-offline-python and serverless-offline on a nodejs environment the body contains payload which is ~30% larger than the original file. A diff between the binary files showed perfect identity up until two thirds - where in the resulting file a bunch of data is appended.

@malnor
Copy link

malnor commented Dec 3, 2018

Same here, cannot get it working, we looked through all the other code time after time without understanding what was happening. It would be good to at least throw something if it is not supported - took us more than a day to find this problem.

@dherault
Copy link
Owner

dherault commented Jan 8, 2019

We would gladly accept a PR on this. It shouldn't be much, just a Hapijs tweaking.

@janwirth
Copy link

janwirth commented Jan 8, 2019

FYI - I chose a different architecture. The lambda now just requests a temporary URL from my file bucket. The file bucket can be hidden behind an internal (transparent) redirect.

@arnas
Copy link
Contributor

arnas commented Feb 15, 2019

I have encountered this issue and after extensive investigation, I found that https://github.com/dherault/serverless-offline/blob/master/src/index.js#L490 causes the issue.

If I am trying to upload an image via multipart/form-data it is converted toString('binary') ( which seems is deprecated). Then it's passed to express (in my case), the request goes through https://www.npmjs.com/package/multer, which is decoded incorrectly and the image becomes corrupted. But if I commented out that toString call, express with its middleware correctly handles the raw buffer.

It was added with #394 as a workaround? As there behavior that binary data was converted to a utf8 string.

At least from initial testing, it seems that serverless offline works fine without 490 line. Maybe @dherault and @lteacher @daniel-cottone could comment if it's still required?

@lteacher
Copy link
Contributor

If the line is removed then i assume it would work, because the line does some magic check to ensure the toString('utf8') doesnt happen for multipart/form-data. So if it didn't ever do that the encoding detect thing wouldn't be needed.

Actually I haven't been using serverless for 2 years but we still have a service running that parses multi-part form data, though it was using an old branch.... I did switch to the latest serverless-offline just now and found that it worked ok still.

So regarding the above question, @arnas I also removed the line to see and as expected it worked just fine for me. That wasn't too surprising though because the check added was because of the toString('utf8') and that was to fix another issue so I have no idea if that issue would be a problem when removing that line. See #224

I also spent a bit of time checking out what the whole problem is around here and it was painful... I guess at the end of the day the issue seems to come down to the encoding (since I wasn't able to get an exact answer with code links to provide). So probably for most of the above, its something like... an issue because the parsing magic is done on pipe and the encoding has been set to binary string and its not a buffer when it comes to your logic.

For sure its just the above referenced line that converts to string, so if its removed and it doesn't break whatever that old issue was then it would be ok. Otherwise you could try setting the encoding, there is some check that happens for if it is an actual buffer so it would probably still work both online and in offline...

So summary try locally setting the encoding... In this gist the write function is called and given the encoding. In the examples from above the magical middleware you are all using is doing it in the pipe so try maybe setting req.setEncoding('binary') before passing it to the parse... which is a random guess since i didn't test it.

@arnas
Copy link
Contributor

arnas commented Feb 15, 2019

Well, it seems that this issue only appears for images, my colleague has tested and for example, uploading .txt works fine. It's a bit strange but from my finding, only images are affected.

@lteacher I have checked the pr #224 and at least form the first look seems that toString() is unnecessary as to prevent hapis.js from parsing payload you can simply set payload to parse to false.
And I am assuming if hapi parse is set to false function toString() should do nothing as payload already is a string.

@lteacher
Copy link
Contributor

lteacher commented Feb 16, 2019

@arnas sorry I dont get exactly what you mean with the last sentence.

Problem History

  • app - preventing hapi from parsing payloads; resolves #177. #224 added the toString for whatever reason. This change breaks everyone who has data for upload etc as it converts to 'utf8'.
  • Payload conversion converts data to utf8 #230 added function to ensure that just for multipart/form-data the encoding will be chosen that doesn't destroy the data, which is then 'binary'.
  • In this issue thread, people are piping the req data without any set encoding but the data was converted to string, so that wont work. To resolve it without changing here they need to set the encoding as it will find its not a buffer and process as a string and it will not be the correct encoding as it will default to utf8 (side note, I saw in the multiparty code that it will throw an error if you try to set encoding)

Fixing in this repo

So to resolve the issue based on what you said above then the best thing is to not do any toString and remove this logic but you might need to fix that issue from #224 some other way, maybe with that parse option or whatever

Fixing original author issues (assuming the fix is not done here in rep)

  • Dont use multiparty.
  • When using multer set the req encoding
  • When using formidable pass the encoding as an option
  • When using something else i dunno, check the source code and find out how to set the encoding

Also for original author if using wrapping packages with this kind of thing like the file data etc you need to check closely everywhere on the packages you are using like this doc over here. As I recall (but might have changed over last 2 years) you can only get binary content to AWS lambda functions in base64 anyway (but of course even the base64 string is destroyed if it is encoded incorrectly to utf8).

@arnas I assume that you have success already on AWS and that you use something like serverless-http.

@arnas
Copy link
Contributor

arnas commented Feb 16, 2019

@lteacher I have update the wording of my previous comment last sentence.

For fixing repo I believe that would be the best way. I am hoping that removing toString while hapi payload parse is to false will be enough. I will look into it if I have time.

@lteacher I am using serverless-http, but I havent deployed it to aws as I amasuming it will work without much problem as atleast from documentation aws api gateway supports that https://aws.amazon.com/about-aws/whats-new/2016/11/binary-data-now-supported-by-api-gateway/ .

@lteacher
Copy link
Contributor

lteacher commented Feb 18, 2019

@arnas oh well should be np if using that serverless-http

Actually I was thinking of this again for some reason and I had another look and noticed that the parse option is actually used already over here. However, I actually didn't notice before when i added that detectEncoding but that payload is just needed as a string for the JSON.parse in there. There is some createLambdaProxyContext which uses the payload but someone haxed something in to not parse if its not a string there (using rawPayload).

Theres a velocity template method too seems to assume the payload is json so dunno about that since sometimes its not already, otherwise could be able to just even move that stringifying stuff into the relevant scope where its needed which is like if it needs to be parsed per last reference?

@arnas
Copy link
Contributor

arnas commented Feb 22, 2019

@lteacher I have a bit of debugging and from the first look it seems that this issues is not only from serverless-offline side.
Basically I have created plain handler.js which uses serverless-http to transfer requests to express and multer to deal with multipart data. https://gist.github.com/arnas/d8fed4b78ff2940a3b390e754090cea9
Without changing any dependencies I have confirmed that txt files were encoded and decoded correctly, but there was an issue with png files. They have grown about 30 percentage bigger. (I think somebody mentioned that).
Also, I have confirmed that if I encode buffer and decode it afterward via Buffer(binaryString, 'binary') everything is fine. But as we are giving serverless-http string it uses Buffer.from(binaryString, 'utf-8') and corrupts file unless it was a text file :/ https://github.com/dougmoscrop/serverless-http/blob/c70957b47ac66e363c7179a6e0a4cd8c6d78a2ee/lib/request.js#L15.

tl;dr the bug happens because serverless-http wrongly encodes request.

Suggestions for solution

  • I haven't looked into these velocity tempaltes and I am not sure what are they for, but I believe that encoding to binary string are not required. So I would suggest to remove them and assume that the lambda handler will be able to handle raw buffer.

  • Wait till and if serverless-http will fix it from their side. There were some issues with binary files Image/png is not serving properly by serverless-http dougmoscrop/serverless-http#50, but i will raise a new one.

I could create a pr for a first point if this approch is acceptable with lib maintainers.

Side note rawPayload is always equal to https://github.com/dherault/serverless-offline/blob/master/src/index.js#L490

@lteacher
Copy link
Contributor

lteacher commented Feb 23, 2019

Hey @arnas I think that you are missing some setup on AWS side. If you want to use binary content you need to do some different setup per these docs: https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-payload-encodings.html

Thats why I posted this thing from serverless-http

I knew about that Buffer.from(binaryString, 'utf-8') when i last looked and posted my response but I didnt post it here because its not relevant if the AWS setup is correct. That is because on AWS side if you are using binary content it should be coming base64 encoded and there is a flag its called isBase64Encoded. Actually I looked in the serverless-http code lol and I saw this flag so I assumed they used the flag but.... actually now I see they detect it or something so instead you have to provide some option? to set binary: true

If you look here in this gist you can see the actual property that your aws lambda-proxy binary content enabled handler will receive on the event. That is not a magical value I added for the gist, sadly that is an AWS thing that gets attached to the event.

I actually don't use serverless-http so dunno about that option. I didn't think it would be nice to include express or koa etc inside the this stuff. I have a separate package I created to make it nice for adding parsing etc, then I use the exact gist I posted above for parsing before the file handler.

In summary I think you can change your AWS setup to correctly handle binary content, then you are going to start recieving base64 encoded blobs in serverless-http. Then you need to be sure that serverless-http is expecting it to be isBase64Encoded. After that it should already work on AWS. Then to get serverless-offline working is where the actual issues are left because it doesn't do the base64 encoding magic that the AWS stuff is doing so thats all this entire issue so far.

@hqnarrate
Copy link

I'm running into this issue using serverless-http + serverless-offline and multer-s3 for handing multipart/form-data when uploading images.

@lteacher do you have a recommendation for getting serverless-offline to correctly handle this? My service works when deployed on AWS through api-gateway. For local development it is not.. Referring to your last comment above: "Then to get serverless-offline working is where the actual issues are left because it doesn't do the base64 encoding magic that the AWS stuff is doing so thats all this entire issue so far."

@lteacher
Copy link
Contributor

@hqnarrate Sorry im just not up to date on this issue any more, there seems to be a PR #784 that was working to address this somewhat through the config options.

Looking back over the comments it does seem like there is a way to refactor the offending line out but I just don't have any time to investigate that at the moment and I don't use any of the packages mentioned in these issues except serverless-offline so I can't really test any resolution.

Maybe @cmuto09 can give an update on where the PR referenced above is at, it looks like it needs rebasing at minimum.

@dnalborczyk
Copy link
Collaborator

dnalborczyk commented Oct 25, 2019

@hqnarrate @lteacher a bit of a problem is that @cmuto09 PR depends on the https://www.npmjs.com/package/serverless-apigw-binary plugin. but it seems serverless supports this now as well.

@hqnarrate
Copy link

@dnalborczyk - i was not clear. My service is working fine when deployed to AWS environment by setting the apiGateway.binaryMediaTypes correctly. It is when I'm running on offline mode with 'serverless offline start', my uploads would become corrupted.

@dnalborczyk
Copy link
Collaborator

dnalborczyk commented Oct 25, 2019

@hqnarrate sorry, I think I haven't been clear. 😄 I knew what you meant. The links above are pointers for me (or anyone else getting to it) for the implementation.

@pvsvamsi
Copy link

@dnalborczyk Any update or workarounds on this issue?

@abinhho
Copy link

abinhho commented Apr 3, 2020

My solution: https://stackoverflow.com/a/61003498/9585130

Just add this to serverless.yml.

provider:
  apiGateway:
    binaryMediaTypes:
      - '*/*'

And I'm uing "aws-serverless-express-binary": "^1.0.1"

@kennyhyun
Copy link

kennyhyun commented Jul 23, 2020

I had a similar issue with using serverless-offline + aws-serverless-express + multer

It said Error: Unexpected end of multipart data

image

whereas serverless-http has no problem.

I made a simple example, upload-to-serverless-offline
And I will try to figure out anything need to be fixed.


I could confirm that aws-serverless-express-binary is working as @abinhho mentioned here

@AkshayRawatt
Copy link

did anyone found the solution on this issue as uploading a form-data with pdf attached, on using multer , getting buffer size increased where as when i run same on node work's properly i.e without serverless.

@federicobarera
Copy link

I can confirm the following setup does not work in serverless offline but works once deployed to AWS.
Related to multipart uploads

"apollo-server": "^3.1.2",
"apollo-server-express": "^3.1.2",
"apollo-server-lambda": "^3.1.2",
"graphql-upload": "^12.0.0",
"serverless-offline": "^8.0.0",
provider:
  name: aws
  runtime: nodejs14.x
  apiGateway:
    binaryMediaTypes:
      - '*/*'
plugins:
  - serverless-offline
  - serverless-webpack

@deniapps
Copy link

"apollo-server-express": "^3.1.2",
"apollo-server-lambda": "^3.1.2",

How do you use "apollo-server-express": "^3.1.2" and "apollo-server-lambda": "^3.1.2" together?

I saw your comments on this thread: jaydenseric/graphql-upload#155 (comment)
I thought you used 'express' as @vaunus suggested:

const handler = apolloServer.createHandler({
  expressAppFromMiddleware(middleware) {
    const app = express()
    app.use(graphqlUploadExpress())
    app.use(middleware)
    return app
  }
})

@federicobarera
Copy link

Please disregard apollo-server-express . We are hosting apollo server on both self-hosted express server and as a lambda. apollo-server-lambda is built on top of apollo-server-express so lots of modules are compatible.

For the sake of this issue apollo-server-express is irrelevant. I am hosting as mentioned from vaunus in the other topic.
Due to this issue with serverless-offline and file uploads we are using the self hosted version for local dev

@UmerMIB
Copy link

UmerMIB commented Oct 8, 2021

I can confirm the following setup does not work in serverless offline but works once deployed to AWS. Related to multipart uploads

"apollo-server": "^3.1.2",
"apollo-server-express": "^3.1.2",
"apollo-server-lambda": "^3.1.2",
"graphql-upload": "^12.0.0",
"serverless-offline": "^8.0.0",
provider:
  name: aws
  runtime: nodejs14.x
  apiGateway:
    binaryMediaTypes:
      - '*/*'
plugins:
  - serverless-offline
  - serverless-webpack

Thank you so much bro it works for me, I was stuck in this issue for couple of WEEKS

@kirankjolly
Copy link

I can confirm the following setup does not work in serverless offline but works once deployed to AWS. Related to multipart uploads

"apollo-server": "^3.1.2",
"apollo-server-express": "^3.1.2",
"apollo-server-lambda": "^3.1.2",
"graphql-upload": "^12.0.0",
"serverless-offline": "^8.0.0",
provider:
  name: aws
  runtime: nodejs14.x
  apiGateway:
    binaryMediaTypes:
      - '*/*'
plugins:
  - serverless-offline
  - serverless-webpack

Are you sure there will no way to work in serverless offline. Might be I am also having same issue https://stackoverflow.com/questions/69810751/image-file-is-not-viewing-uploaded-by-s3

@mateo2181
Copy link

@UmerMIB can you share your handler to run Graphql with AWS Lambda? I'm using apollo-server-lambda and it's working but I couldn't run the mutation to save images in S3 correctly. I have that config with apiGateway in serverless.yml too.

@UmerMIB
Copy link

UmerMIB commented Nov 17, 2021

@UmerMIB can you share your handler to run Graphql with AWS Lambda? I'm using apollo-server-lambda and it's working but I couldn't run the mutation to save images in S3 correctly. I have that config with apiGateway in serverless.yml too.

import { initApi } from './bootstrap';
import config from './config';
import { graphqlUploadExpress } from 'graphql-upload';
import Express, { Application } from 'express';
const { ApolloServer } = require('apollo-server-lambda');

let server;
const createHandler = async () => {
    console.log('Serverless', 'Initializing...');
    const schema = await initApi();

    // Create Lambda compatible GraphQL Server
    console.log('Serverless', 'Creating Server...');
    server = new ApolloServer({
        schema,
        playground: {
            endpoint: `/${config.env}/graphql`,
        },
    });

    console.log('Serverless', 'server', server);

    return server.createHandler({
        expressAppFromMiddleware(middleware) {
            const app: Application = Express();
            app.use(graphqlUploadExpress());
            app.use(middleware);

            return app;
        },
        expressGetMiddlewareOptions: {
            cors: {
                origin: '*',
                credentials: false,
            },
        },
    });
};

exports.handler = async (event, context, callback) => {
    const lambda = await createHandler();
    // Warmup event will keep lambda and db connection live
    if (event.source === 'serverless-plugin-warmup') {
        console.log('---------- Warmup Event ----------')
        return {}
    }
    return lambda(event, context, () => {});
};

The schema

export const initApi = async () => {
    try {
        // Use {force: true} to reset DB
        //TODO: must be shift in deployement step. Sync with all api call
        // await sequelize.sync();
        // console.log("Sync completed.");

        // Testing connection
        await sequelize.authenticate();
        console.log('Connection has been established successfully.');
    } catch (error) {
        console.error('Unable to connect to the database:', error);
    }

    // Builds the GraphQL Schema
    if (!schema) {
        console.log("Schema doesn't exist, generating it");
        schema = await buildSchema({
            resolvers: [
                XXX
            ],
            emitSchemaFile: false,
            validate: false,
            scalarsMap: [{ type: Date, scalar: GraphQLISODateTimeOrString }],
        });
        console.log(schema);
    }

    return schema;
};

@mateo2181 is it enough code to share?

@mateo2181
Copy link

Thanks so much @UmerMIB! Yes, actually my handler is quite similar to that one. I think the problem is with serverless. I'm testing with Firecamp to be able to send images, I send two variables (file and petId) but when I check the logs in CloudWatch I see:
body: 'LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS1hYjY1N2Y4MzE5ODFjYjU0D......'

I have my serverless with the params that @kirankjolly comments above.

@mateo2181
Copy link

mateo2181 commented Nov 17, 2021

Screenshot from 2021-11-17 07-59-47

This is part of the request I see in CloudWatch.

I always get a timeout. (that I set in serverless.yml in 30 sec).

@alex88
Copy link

alex88 commented Jan 15, 2022

Is there a workaround on this?

@federicobarera
Copy link

For whomever is still messing around with this, small advice, is not worth it. You will shortly realize that lambdas have 6MB upload limits, so you better start to look into alternatives. This dude put together a good article: https://theburningmonk.com/2020/04/hit-the-6mb-lambda-payload-limit-heres-what-you-can-do/

@alex88
Copy link

alex88 commented Jan 16, 2022

Why is it not worth it, my customer has to upload an excel that I have to process, so instead of having a simple function receive a few hundred kB files, I have to deal with s3, send him a presigned url or hook into s3... Also considering that's something that works just fine on AWS, it just doesn't work locally

@g0dzillaa
Copy link

any update on this?

This is still a valid issue.

@kennyhyun
Copy link

BTW, I was patching node_modules/serverless-offline/dist/events/http/HttpServer.js

for my own purpose

-      request.payload = request.payload && request.payload.toString(encoding);
+      if (request.payload && encoding !== 'binary') {
+        request.payload = request.payload.toString(encoding);
+      }

@obaqueiro
Copy link

@kennyhyun patch also worked for me. Thanks!

@alvincrisuy
Copy link

@kennyhyun this works for me as well!

@C4n4rd0
Copy link

C4n4rd0 commented Jan 18, 2024

BTW, I was patching node_modules/serverless-offline/dist/events/http/HttpServer.js

for my own purpose

-      request.payload = request.payload && request.payload.toString(encoding);
+      if (request.payload && encoding !== 'binary') {
+        request.payload = request.payload.toString(encoding);
+      }

Can you create a PR for this to be merged?

@C4n4rd0
Copy link

C4n4rd0 commented Jan 19, 2024

In fact, I think the best is to add a section in the serverless-offline configuration to let the user self define how the payload should be encoded based on the content type.

@kennyhyun
Copy link

basically, this package is mainly for testing serverless (for example lambda) locally. so this package would be enough to support usual way for serverless functions.

I believe it is discouraged to send any binary data to serverless functions.
if that payload was not that big, base64 encoding should be an alternative.

I was doing this in a dodgy way to use this package in some special case, not only for testing. if someone need this, I think it should take their own risk. so I won't send a PR.

I agree with this BTW

For whomever is still messing around with this, small advice, is not worth it. You will shortly realize that lambdas have 6MB upload limits, so you better start to look into alternatives. This dude put together a good article: https://theburningmonk.com/2020/04/hit-the-6mb-lambda-payload-limit-heres-what-you-can-do/

@corentindesfarges
Copy link

corentindesfarges commented Apr 30, 2024

BTW, I was patching node_modules/serverless-offline/dist/events/http/HttpServer.js

for my own purpose

-      request.payload = request.payload && request.payload.toString(encoding);
+      if (request.payload && encoding !== 'binary') {
+        request.payload = request.payload.toString(encoding);
+      }

Thank you, works perfectly in v13.3.3!

Edit : This PR appears to fix the issue. :-)

@DorianMazur
Copy link
Collaborator

Fixed by #1776

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.