Fatal error in Airnode feed #207
Comments
The error message doesn't tell much :/ I couldn't find related issues for In regards to Github URLs we need to make sure that the production ones are immutable. (I've assigned both of us on this issue for now) |
Unfortunately, we don't have any metrics. |
For debugging purposes, I activated It's charged extra, so it wouldn't be enabled by default, but it's easy to enable in the CF template:
"AppCluster": {
  "Type": "AWS::ECS::Cluster",
  "Properties": {
    "ClusterName": "AirnodeFeedCluster-<SOME_ID>",
+   "ClusterSettings": [
+     {
+       "Name": "containerInsights",
+       "Value": "enabled"
+     }
+   ]
  }
} |
We should be able to see at least CPU and memory usage, but when I was stress testing the container it was able to handle the load so I'd be surprised if it was caused by this. Hopefully, we will be able to reproduce it again with more insights. |
The same error occurred in TwelveData's deployment too. |
Thanks, for reference this is the error in Grafana. The service seems operational again after AWS restart, so I suspect there is some memory leak. I will try to reproduce it and fix it. |
Btw. it seems that the message in Grafana is trimmed. E.g. the error message pasted in this issue contains more information and suggests a race condition inside Node.js. |
Happened again with Finage. |
I created an issue in the Node.js repo (nodejs/node#51652) and hope someone responds. One idea would be to try migrating to a different Node.js image (or version). In particular, there are some suggestions to use the Slim package instead of Alpine. |
It happened to coinpaprika too. One possibly useful piece of information: it happens with the Airnode feeds that include more data feeds. |
I wonder if this will happen with a configuration that excludes the Grafana log shipping stuff |
A good idea. AFAIK they use different C libraries 👌 This definitely looks like a runtime issue. |
I'm currently trying to recreate this issue by simulating lots of feeds. |
As already mentioned, this doesn't seem to be related to memory. I managed to limit RAM for a local airnode-feed and make it crash with an out-of-memory error, and the error looks different:
You can reproduce this by using this script in "dev": "NODE_OPTIONS=--max-old-space-size=100 nodemon --ext ts,js,json,env --exec \"pnpm ts-node src/index.ts\"", |
I've been running it locally (docker images built from With so many feeds I've noticed there's a bottleneck when running post-processing, so it could be useful to maybe put that logic in a worker thread in future. |
So it's been running locally now with 15 000 feeds for about 4 hours and it hasn't died - so it may be something specific to AWS or the RAM allocation (which affects processor resources). I'll try with reduced RAM. |
It ran overnight with 15000 feeds and a reduced-speed CPU to try and simulate resource constraints. It still didn't crash, so I'm thinking even more that this may be specific to AWS. I'm now running it with fuzzed responses from the data-provider API: randomly every 3rd API response is corrupted and every 2nd response is delayed by 0 to 3000 ms. I'll let it run like this for a few hours. |
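The fuzzing setup described above can be sketched roughly as follows. This is a minimal illustration, not the code that was actually run: `FuzzOptions`, `planFuzz`, `corruptBody` and `withFuzzing` are hypothetical names, and "every 3rd" / "every 2nd" are interpreted here as probabilities of 1/3 and 1/2.

```typescript
// Hypothetical sketch: none of these names come from the airnode-feed codebase.
type Fetcher = () => Promise<string>;

interface FuzzOptions {
  corruptProbability: number; // e.g. 1 / 3
  delayProbability: number; // e.g. 1 / 2
  maxDelayMs: number; // e.g. 3000
}

interface FuzzPlan {
  delayMs: number;
  corrupt: boolean;
}

// Decide how a single response should be tampered with.
// `random` is injectable so that the decision can be made deterministic in tests.
function planFuzz(options: FuzzOptions, random: () => number = Math.random): FuzzPlan {
  const delayMs = random() < options.delayProbability ? Math.floor(random() * options.maxDelayMs) : 0;
  const corrupt = random() < options.corruptProbability;
  return { delayMs, corrupt };
}

// Truncate the payload to half its length; a JSON object cut this way no longer parses.
function corruptBody(body: string): string {
  return body.slice(0, Math.floor(body.length / 2));
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Wrap a fetcher so that some of its responses are delayed and/or corrupted.
function withFuzzing(fetcher: Fetcher, options: FuzzOptions, random: () => number = Math.random): Fetcher {
  return async () => {
    const plan = planFuzz(options, random);
    if (plan.delayMs > 0) await sleep(plan.delayMs);
    const body = await fetcher();
    return plan.corrupt ? corruptBody(body) : body;
  };
}
```

Wrapping the mock data-provider handler with something like `withFuzzing(handler, { corruptProbability: 1 / 3, delayProbability: 1 / 2, maxDelayMs: 3000 })` would approximate the test described in the comment.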
It's been running for three days, 15k feeds, some fuzzing and it's still running, no crashes, so... this is a hard bug to trace 😆 |
It happened again in TwelveData's Airnode feed. |
My local instance eventually crashed because it ran out of log space (400 GB), so I haven't been able to recreate this locally. Upgrading to Node 20 may help. |
Let's close this one, otherwise it's going to remain on the board forever. After crashing, the service restarts so we are not affected much by this as of now. We've tried using a different Node image (didn't help) and upgraded Node version (not confirmed whether it helps). |
Has this happened again with the updated Node version? Just curious. |
I think it did |
API providers' current deployments are |
Nodary Airnode feed process failed with this error. Because we read the config from a raw GitHub URL and change the deployment file's location (from candidate-deployments to active-deployments) after the deployment, CF tried to redeploy the app but wget kept throwing a 404 error. As a solution, I will update the CF EntryPoint to try the candidate-deployments path first and, if that fails, try active-deployments. @Siegrift for visibility.
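The fallback logic proposed above can be illustrated with a short sketch. The real fix lives in the CF EntryPoint and uses wget; the TypeScript below only shows the "try candidate-deployments, then active-deployments" order. The function names, the URL layout and the injected `download` callback are all assumptions made for illustration.

```typescript
// Hypothetical sketch: the actual change is a wget-based CF EntryPoint, not this code.
type Download = (url: string) => Promise<string>;

// Build the two locations to try, in order. The `<base>/<directory>/<file>` layout is an assumption.
function candidateUrls(baseUrl: string, fileName: string): string[] {
  return ['candidate-deployments', 'active-deployments'].map((dir) => `${baseUrl}/${dir}/${fileName}`);
}

// Try each URL in order and return the first successful download.
// If every location fails (e.g. with a 404), throw an error listing all failures.
async function downloadWithFallback(urls: string[], download: Download): Promise<string> {
  const errors: string[] = [];
  for (const url of urls) {
    try {
      return await download(url);
    } catch (error) {
      errors.push(`${url}: ${String(error)}`);
    }
  }
  throw new Error(`All locations failed:\n${errors.join('\n')}`);
}
```

With this ordering, a redeploy triggered after the file has moved would fail on the first location and succeed on the second instead of looping on a 404.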