
API: Extremely Poor Docker Resource Utilization Efficiency #2730

Open
palisadoes opened this issue Dec 2, 2024 · 81 comments
Labels: bug (Something isn't working) · good first issue (Good for newcomers)

@palisadoes
Contributor

Describe the bug

We run a demonstration instance of Talawa-API on a GoDaddy VPS server running Ubuntu. It has the following resources:

  1. 1 core
  2. 2 GB of RAM
  3. 40 GB of disk

Other information:

  1. The demo instance is intended to create an evaluation environment for new GitHub contributors and users alike as they decide to use Talawa. The DB of the demo instance gets reset every day.
  2. Talawa API runs natively on this VPS server with acceptable performance with one user. The load average is approximately 1, which is the target value for a system with only 1 core.
  3. When Talawa API runs on the server using Docker, the load average reaches 130 and the swap process is the top CPU consumer. The system is so overloaded that only one SSH session at a time is achievable.

The purpose of this issue is to find ways to tune all Talawa-API Dockerfile and app configurations to lower its CPU and RAM utilization by at least 75%.

  1. With the current Docker performance, very few developers or end users will want to try Talawa themselves.
  2. This has been a recurring issue with Talawa API. The poor performance threatens the success of our current MongoDB-based MVP.

To Reproduce
Steps to reproduce the behavior:

  1. Run Talawa-API on a system
  2. See excessive resource utilization (the commands sketched below are one way to observe it)
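
For anyone reproducing this, a quick way to watch the utilization while the API runs (a sketch; the container name talawa-api is an assumption, adjust it to your compose service name):

```bash
# Per-container CPU and memory usage (container name is an assumption):
docker stats talawa-api

# Host-side load average and swap activity:
uptime
vmstat 5
```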

Expected behavior

  1. Acceptable usage information such that it can run easily on a mid-range laptop without impacting its performance

Actual behavior

  1. Poor performance

Screenshots

image

Additional details
Add any other context or screenshots about the feature request here.

Potential internship candidates

Please read this if you are planning to apply for a Palisadoes Foundation internship

@palisadoes palisadoes added the bug Something isn't working label Dec 2, 2024
@github-actions github-actions bot added feature request unapproved Unapproved for Pull Request labels Dec 2, 2024
@prayanshchh
Contributor

Can you please assign me? I want to work on this issue, but I will need guidance.

@varshith257
Member

This is mostly related to reducing the Docker image size.

@varshith257 varshith257 removed the unapproved Unapproved for Pull Request label Dec 3, 2024
@palisadoes palisadoes changed the title Extremely Poor Docker Resource Utilization Efficiency API: Extremely Poor Docker Resource Utilization Efficiency Dec 4, 2024
@palisadoes palisadoes added good first issue Good for newcomers and removed feature request labels Dec 4, 2024
@prayanshchh
Contributor

prayanshchh commented Dec 6, 2024

Different ways to approach this issue:

1. Multi-Stage Builds
Using a multi-stage build can help separate the build and runtime environments, ensuring that only production-ready artifacts are included in the final image. This can be achieved by:

- Installing dependencies and building the application in the first stage.
- Copying only the necessary files (e.g., dist, node_modules) into a minimal runtime stage.
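
A minimal sketch of such a multi-stage Dockerfile (the stage layout, the dist/index.js entry point, and the npm run build script are assumptions, not the project's actual setup):

```dockerfile
# Stage 1: install all dependencies and build the app
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build   # assumed build script

# Stage 2: minimal runtime with production artifacts only
FROM node:20-alpine
WORKDIR /app
ENV NODE_ENV=production
COPY package*.json ./
RUN npm ci --omit=dev
COPY --from=builder /app/dist ./dist
CMD ["node", "dist/index.js"]   # assumed entry point
```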

2. Optimizing Base Images
Switching to optimized base images can dramatically reduce size:

Baseline Image (Full Node.js): ~900 MB
Using Multi-Stage with Slim: ~400–500 MB
Using Multi-Stage with Alpine: ~250–300 MB
With Distroless: ~150–200 MB

3. Using Compression Tools
Tools like docker-slim can further compress the final image by analyzing and stripping unused dependencies and files:
With docker-slim: ~100–150 MB.
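
As a rough illustration, docker-slim is pointed at an existing image and emits a minified copy tagged &lt;image&gt;.slim (the image tag here is an assumption; newer releases rename the binary to slim):

```bash
# Analyze the image and strip files it never touches at runtime.
# --http-probe=false skips the default HTTP probing of the container.
docker-slim build --http-probe=false talawa-api:latest
```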

Please suggest a method that doesn't impact compatibility with the codebase.

@palisadoes
Contributor Author

@prayanshchh

Please investigate the best solution and propose it after testing on your system. It's not just RAM, but also ways to reduce the CPU overhead.

@prayanshchh
Contributor

alright sir

@vasujain275
Contributor

@palisadoes @prayanshchh

The main problem I found with the API is that we have to run it in dev mode in the production Docker environment because our build process for the Talawa API is broken, so we can't use npm run start. If we resolve the build issue, we can drastically improve the performance and security of the Docker container.

I think @varshith257 also tried to solve the build process issue a few months back; any updates on that?

@palisadoes
Contributor Author

Would this PR by @adithyanotfound provide any insights?

@palisadoes
Contributor Author

@vasujain275 Why do you say the build process is broken? Can you create an issue for someone else to try to fix it?

@prayanshchh
Contributor

Would this PR by @adithyanotfound provide any insights?

Yes, this helps. I will start my work on this in two days; I have end-semester exams.

@prayanshchh
Contributor

I am unassigning myself from this issue due to lack of progress.

@prayanshchh prayanshchh removed their assignment Dec 14, 2024
@PurnenduMIshra129th

@palisadoes Please assign me.

@PurnenduMIshra129th

@palisadoes What is the load average when the API runs without Docker, i.e., what is the baseline performance? I need this because I want to focus only on improving the Docker performance. If that number isn't available, I will have to use a profiler to measure what the exact issue is: whether it is related to the Docker container, or to unoptimized queries in the code.

@PurnenduMIshra129th

PurnenduMIshra129th commented Dec 17, 2024

@palisadoes For now I have limited the CPU and memory usage, added a multi-stage build, and used a lightweight base image. But I think this will only handle up to a certain number of users. To handle load effectively, can I use Kubernetes or another service, so it scales the pods as load increases, reducing CPU usage and improving performance? If not, does the VPS where the container is hosted provide such a mechanism? One more doubt: how do I put more load on this API, given that at testing time I am the only user?
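
For reference, a sketch of the kind of limits described above in a compose file (the service name and values are assumptions; depending on your Compose version these keys may instead live under deploy.resources.limits):

```yaml
services:
  talawa-api:
    # Cap the container so a runaway process can't saturate the single-core host.
    cpus: "1.0"
    mem_limit: 1g
    memswap_limit: 1g   # no extra swap beyond the memory limit
```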

@vasujain275
Contributor

vasujain275 commented Dec 17, 2024

@palisadoes For now I have limited the CPU and memory usage, added a multi-stage build, and used a lightweight base image. But I think this will only handle up to a certain number of users. To handle load effectively, can I use Kubernetes or another service, so it scales the pods as load increases, reducing CPU usage and improving performance? If not, does the VPS where the container is hosted provide such a mechanism?

  1. We don't need k8s.
  2. Multi-stage builds and a lightweight base image will not help; we already have multi-stage builds with Alpine images. The main issue is our build process.
  3. @palisadoes Due to my end-semester exams right now I am not able to create that GraphQL build error issue that is the main performance blocker on this. I will get to it in 2-3 days once my exams end. Sorry for the delay.
  4. I think we should close the Docker-performance-related issues, as they create unnecessary confusion. Our Docker images are well optimized. The main issue is that we are running our API in dev mode in them; once the build is fixed we can modify the Dockerfiles to see the performance improvements.

@PurnenduMIshra129th

PurnenduMIshra129th commented Dec 17, 2024

I don't understand what you mean by a build-related issue. Are you saying that unnecessary node modules or something similar end up in the build when the Docker image is built, and that they are causing the issue? I need further clarity. Also, you commented above that you are not able to run npm run start; it is working fine for me, because the API service starts.

@palisadoes
Contributor Author

@palisadoes For now I have limited the CPU and memory usage, added a multi-stage build, and used a lightweight base image. But I think this will only handle up to a certain number of users. To handle load effectively, can I use Kubernetes or another service, so it scales the pods as load increases, reducing CPU usage and improving performance? If not, does the VPS where the container is hosted provide such a mechanism?

1. We don't need k8s.

2. Multi-stage builds and a lightweight base image will not help; we already have multi-stage builds with Alpine images. The main issue is our build process.

3. @palisadoes Due to my end-semester exams right now I am not able to create that GraphQL build error issue that is the main performance blocker on this. I will get to it in 2-3 days once my exams end. Sorry for the delay.

4. I think we should close the Docker-performance-related issues, as they create unnecessary confusion. Our Docker images are well optimized. The main issue is that we are running our API in dev mode in them; once the build is fixed we can modify the Dockerfiles to see the performance improvements.

OK.

@PurnenduMIshra129th

@palisadoes I ran a load test against the server with and without Docker, configured for a 30-second duration at 2 req/sec, i.e., 60 requests in 30 seconds. In that scenario both setups had an equal success rate. But when I ran the same test for the same duration at a higher rate, 5 req/sec (150 requests in 30 seconds), the server without Docker performed slightly better. Either way, the server can't handle 150 requests in 30 seconds: many requests stay in processing and never complete, and only 40 of them succeeded. For a small user base on a low-end host (say 50 to 60 requests per 60 seconds on a mediocre device with 4 GB of RAM and 4 cores), Docker will handle the load easily if Talawa-API reduces its excessive CPU work; even with a CPU limit it will cope, though with some slowness. What do you say?
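
For repeatable numbers, a load-test configuration along those lines could look like this (a sketch using Artillery; the port 4000 and the /graphql path are assumptions):

```yaml
# load-test.yml — 2 req/sec for 30 s (60 requests), as in the first run above
config:
  target: "http://localhost:4000"
  phases:
    - duration: 30
      arrivalRate: 2   # raise to 5 for the 150-request run
scenarios:
  - flow:
      - post:
          url: "/graphql"
          json:
            query: "{ __typename }"
```

Run it with npx artillery run load-test.yml and compare the success rates with and without Docker.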

@palisadoes
Contributor Author

@PurnenduMIshra129th please coordinate with @vasujain275

There appear to be multiple causes. The application is clearly overusing resources.

Here is additional information.

@PurnenduMIshra129th

@vasujain275 Yes, you are correct: the build process is broken. After building, it does not work properly. Also, when I try to run npm run prod it fails with multiple errors. Do you have any thoughts on this? Should we be using import instead of require?

@palisadoes
Contributor Author

The PR was merged. We now need to:

  1. Deploy the API and Admin instances
  2. Determine the best develop / production strategy for the API

@PurnenduMIshra129th

@palisadoes OK, working on it.

@PurnenduMIshra129th

@palisadoes I have also done this for develop-postgres, so what should I do now? Should I make a PR for the develop branch or for develop-postgres?

@PurnenduMIshra129th

PurnenduMIshra129th commented Dec 27, 2024

@adithyanotfound Why is npm run prod not working? Any suggestions?
image
The file it tries to find is not present.

@adithyanotfound

adithyanotfound commented Dec 27, 2024

@adithyanotfound Why is npm run prod not working? Any suggestions?

@PurnenduMIshra129th Please run npm run generate:ssl-private-key to generate certs before running npm run prod.
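
In sequence, that is:

```bash
npm run generate:ssl-private-key   # creates the self-signed certs that prod mode expects
npm run prod
```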

@palisadoes
Contributor Author

@adithyanotfound Why is npm run prod not working? Any suggestions?

@PurnenduMIshra129th Please run npm run generate:ssl-private-key to generate certs before running npm run prod.

@adithyanotfound

  1. Is this documented? If not, please open a PR to do so.
  2. I'm assuming that this was done in the GitHub Action pull-request.yml file. Was this done?

@adithyanotfound

@adithyanotfound

  1. Is this documented? If not, please open a PR to do so.
  2. I'm assuming that this was done in the GitHub Action pull-request.yml file. Was this done?
  1. I'll update the docs.
  2. Yes, it was done in the GitHub Actions pull-request.yml file.

@palisadoes
Contributor Author

@palisadoes I have also done this for develop-postgres, so what should I do now? Should I make a PR for the develop branch or for develop-postgres?

Please work on this so that it's done correctly.

Take a look at the develop-postgres branch and see if it's appropriate to make any adjustments. You'll need to coordinate with @xoldd as he's working on it behind the scenes.

@PurnenduMIshra129th

@palisadoes ok

@palisadoes
Contributor Author

This will be an interesting issue. Get ready for the migration!

@PurnenduMIshra129th

@palisadoes One doubt: in the develop branch there is no need to maintain two separate files, docker-compose.dev and docker-compose.prod; one is sufficient. Can we delete one, write the logic in a single file, and provide common commands so it all starts? I will make two pull requests: one for develop and one for develop-postgres.
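
One way to fold both environments into a single file is Compose profiles (a sketch; the service names and build targets are assumptions):

```yaml
services:
  api-dev:
    build:
      context: .
      target: development   # assumed dev stage in the Dockerfile
    profiles: ["dev"]

  api-prod:
    build:
      context: .
      target: production    # assumed prod stage
    profiles: ["prod"]
```

docker compose --profile dev up or docker compose --profile prod up would then select the environment from the same file.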

@palisadoes
Contributor Author

  1. Only one docker file in develop.
    1. Update INSTALLATION.md
  2. The develop-postgres is a complete rewrite of the API. There is already a docker file there.
    1. compose.yml
    2. https://github.com/PalisadoesFoundation/talawa-api/tree/develop-postgres
  3. No work needs to be done on develop-postgres

@PurnenduMIshra129th

PurnenduMIshra129th commented Jan 1, 2025

@palisadoes Yes, I want to do the same for the develop branch. If that is not required, then I will change the prod build process and its compose file to add limit restrictions. It will work the same.

@PurnenduMIshra129th

See these screenshots: after removing some unused node modules there is a massive decrease in the prod image size, which will be useful going forward.
image

@palisadoes
Contributor Author

Should we add a GitHub workflow test that searches for unused packages in package.json, if it would improve the long-term health of the code? Your explanation gives the impression we should, to reduce the Docker image size.
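
A sketch of what such a workflow could look like, using depcheck as one possible tool (the tool choice, file name, and Node version are assumptions):

```yaml
# .github/workflows/unused-dependencies.yml
name: Unused dependency check
on: [pull_request]
jobs:
  depcheck:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - name: Report unused packages listed in package.json
        run: npx depcheck   # exits non-zero when unused dependencies are found
```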

@PurnenduMIshra129th

@palisadoes Yes, we can do that. It will be very helpful and effective at reducing the image size.

@palisadoes
Contributor Author

Please add that workflow to your PR

@PurnenduMIshra129th

@palisadoes ok will do.

@PurnenduMIshra129th

@palisadoes The issue I am facing is also an important one, related to .dockerignore. What is happening is that .dockerignore is not actually being honored during the build. As a result, videos, the document section, and all the static images that are not required end up in our Docker images, which increases the image size; that is issue no. 1. The second issue is that after removing the dev dependencies from production the code stops working, which it shouldn't. Can I open issues for both of these? And, as a feature request, should I open an issue to create a workflow for finding unnecessary node modules or packages to remove?
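
Note that .dockerignore must sit in the root of the build context to take effect, which may be the cause here. For reference, a sketch of entries along the lines described (the paths are illustrative assumptions):

```
# .dockerignore — paths are illustrative
node_modules
.git
videos/
images/
docs/
*.log
```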

@PurnenduMIshra129th

@palisadoes Please check the PR and tell me what I have to change.

@varshith257
Member

varshith257 commented Jan 11, 2025

It will be a mess if we don't differentiate between dev and prod environments; that's the reason separate dev and prod Docker files were introduced. I haven't gone through the whole conversation, but I want to keep them restricted to their purposes, so we have two Docker files:

  1. The dev image needs to stay what it is for in the codebase; don't try to reduce its size or swap things out, or it will turn into a rabbit hole for working locally in dev mode. The dev image should also be the standard Docker image, not one of the slim variants, as those will not be helpful here.

  2. For a cloud instance or deployment we use the prod image (currently the dev image has been used, due to build issues in the prod environment, but that is unrelated to optimizing dev). Here you can experiment with whatever optimizations you like: multi-stage builds, alpine/slim images, etc. If we can get the prod environment to build, the image is not going to take more than 200 MB, which is the best so far.

@varshith257
Member

@vasujain275, @SiddheshKukade, @varshith257

  1. Is there any way we can get a non-docker instance of the API and Admin apps running using the develop branch?
  2. We'll also need the API to reset its database every 24 hours as we originally planned
  3. The code will need to be updated and the apps restarted whenever there is an update as originally planned

We want to feature this as part of our GSoC 2025 organization application to help us get selected again. It's really important.

Yes, we have ways to do that. Can we move this discussion to the #maintainers channel?

@vasujain275
Contributor

@varshith257 I am almost done deploying the Docker instance of the API on the VPS; the only thing left to fix is Caddy. Do you have any idea why we are using self-signed certificates for localhost? They are causing problems with Caddy in the Docker setup.

@varshith257
Member

I have mentioned the solution and its root cause clearly in a Slack discussion. I have tagged you there.

@vasujain275
Contributor

@varshith257 Following up on the discussion in Slack.

@PurnenduMIshra129th

@varshith257 What exactly do you want? Can you clarify the points again? If the problem is that you want separate dev and prod images, then I can revert the changes, no problem; my changes will still be there. Just tell me the points I need to work on.
