guv daemon dying in-flight #70

Closed
randomsock opened this issue Jul 9, 2015 · 9 comments

Comments

@randomsock

This is different from #64, and more serious, because it's happened randomly on 2 different installations so far and taken everything down.

Node 0.12.4
guvnor 3.5.12

From /var/log/guvnor/guvnor.error.log at the time it disappeared according to our monitor: (both incidents same)

{ date: 'Thu Jul 09 2015 03:58:11 GMT+0200 (CEST)',
  process:
   { pid: 32696,
     uid: 0,
     gid: 3017,
     cwd: '/var/run/guvnor',
     execPath: '/usr/local/bin/node',
     version: 'v0.12.4',
     argv:
      [ '/usr/local/bin/node',
        '/usr/local/lib/node_modules/guvnor/lib/daemon/index.js' ],
     memoryUsage: { rss: 177487872, heapTotal: 132897792, heapUsed: 20059776 } },
  os:
   { loadavg: [ 0.0615234375, 0.0869140625, 0.02880859375 ],
     uptime: 4191812 },
  trace:
   [ { column: 3,
       file: '_stream_writable.js',
       function: 'afterWrite',
       line: 361,
       method: null,
       native: false },
     { column: 7,
       file: '_stream_writable.js',
       function: 'onwrite',
       line: 352,
       method: null,
       native: false },
     { column: 5,
       file: '_stream_writable.js',
       function: 'TLSSocket.WritableState.onwrite',
       line: 105,
       method: 'WritableState.onwrite',
       native: false },
     { column: 12,
       file: 'net.js',
       function: 'WriteWrap.afterWrite',
       line: 787,
       method: 'afterWrite',
       native: false } ],
  stack:
   [ 'TypeError: object is not a function',
     '    at afterWrite (_stream_writable.js:361:3)',
     '    at onwrite (_stream_writable.js:352:7)',
     '    at TLSSocket.WritableState.onwrite (_stream_writable.js:105:5)',
     '    at WriteWrap.afterWrite (net.js:787:12)' ],
  level: 'error',
  message: 'uncaughtException: object is not a function',
  timestamp: '2015-07-09T01:58:11.523Z' }
@achingbrain
Member

Hmm, not great. dnode is wrapped in a TLS socket so that could be why it's coming from the TLSSocket stream.

Do you know what was happening on the server when it crashed?

@randomsock
Author

Nothing in particular. It was 3am local time, so pretty much dead zone even for our international users.

I've checked through various other logs and they all appear normal, if quiet. Nginx for example reverse-proxies to local services and was seeing nothing but our automated monitoring at the time, then the upstream to guvnor services suddenly stopped accepting (Connection Refused, as you'd expect).

@randomsock
Author

Any update on this? It's still happening, only occasionally but enough to put pressure on us to write a job to monitor and restart guv, which kinda defeats the object to some degree.

Again, nothing unusual happening or reported at the time, just that same trace in guvnor.error.log.
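
As a stopgap, a monitoring job of the kind mentioned above could be sketched roughly as follows. This is only an illustration under assumptions: it presumes the daemon's pid is already known and that you supply your own restart function. `isAlive` and `checkAndRestart` are hypothetical helpers, not part of guvnor.

```javascript
// Hypothetical watchdog helpers; none of this is guvnor API.

// Returns true if a process with the given pid exists.
// Signal 0 performs the existence/permission check without sending a signal.
function isAlive(pid) {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM means the process exists but we may not signal it.
    return err.code === 'EPERM';
  }
}

// Calls the caller-supplied restart function only when the process is gone.
// Returns true if a restart was triggered, false otherwise.
function checkAndRestart(pid, restart) {
  if (isAlive(pid)) {
    return false;
  }
  restart();
  return true;
}
```

Running `checkAndRestart` from cron or a `setInterval` loop would cover the "daemon silently disappeared" case, though it only works around the crash rather than fixing it.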

@Removed-5an

Same issue:

{
   "date":"Sat Oct 10 2015 09:28:41 GMT+0200 (CEST)",
   "process":{
      "pid":8683,
      "uid":0,
      "gid":1001,
      "cwd":"/run/guvnor",
      "execPath":"/home/sanity/.nvm/versions/node/v4.1.1/bin/node",
      "version":"v4.1.1",
      "argv":[
         "/home/sanity/.nvm/versions/node/v4.1.1/bin/node",
         "/home/sanity/.nvm/versions/node/v4.1.1/lib/node_modules/guvnor/lib/daemon/index.js"
      ],
      "memoryUsage":{
         "rss":44298240,
         "heapTotal":25778944,
         "heapUsed":15580496
      }
   },
   "os":{
      "loadavg":[
         1.3232421875,
         1.60498046875,
         1.6416015625
      ],
      "uptime":726480
   },
   "trace":[
      {
         "column":3,
         "file":"_stream_writable.js",
         "function":"afterWrite",
         "line":346,
         "method":null,
         "native":false
      },
      {
         "column":7,
         "file":"_stream_writable.js",
         "function":"onwrite",
         "line":337,
         "method":null,
         "native":false
      },
      {
         "column":5,
         "file":"_stream_writable.js",
         "function":"TLSSocket.WritableState.onwrite",
         "line":89,
         "method":"WritableState.onwrite",
         "native":false
      },
      {
         "column":12,
         "file":"net.js",
         "function":"WriteWrap.afterWrite",
         "line":772,
         "method":"afterWrite",
         "native":false
      }
   ],
   "stack":[
      "TypeError: object is not a function",
      "    at afterWrite (_stream_writable.js:346:3)",
      "    at onwrite (_stream_writable.js:337:7)",
      "    at TLSSocket.WritableState.onwrite (_stream_writable.js:89:5)",
      "    at WriteWrap.afterWrite (net.js:772:12)"
   ],
   "level":"error",
   "message":"uncaughtException: object is not a function",
   "timestamp":"2015-10-10T07:28:41.116Z"
}

@vuza

vuza commented Nov 30, 2015

The same thing happened to me twice (on two servers) in the last 5 days. Nothing remarkable was happening on the servers when the errors occurred.

Any help would be appreciated.

Regards,
Marlon

{  
   "date":"Mon Nov 30 2015 09:19:47 GMT+0100 (CET)",
   "process":{  
      "pid":19109,
      "uid":0,
      "gid":1006,
      "cwd":"/run/guvnor",
      "execPath":"/usr/local/bin/node",
      "version":"v0.12.7",
      "argv":[  
         "/usr/local/bin/node",
         "/usr/local/lib/node_modules/guvnor/lib/daemon/index.js"
      ],
      "memoryUsage":{  
         "rss":88379392,
         "heapTotal":72061696,
         "heapUsed":11517336
      }
   },
   "os":{  
      "loadavg":[  
         0.0029296875,
         0.0146484375,
         0.04541015625
      ],
      "uptime":2836570
   },
   "trace":[  
      {  
         "column":3,
         "file":"_stream_writable.js",
         "function":"afterWrite",
         "line":361,
         "method":null,
         "native":false
      },
      {  
         "column":7,
         "file":"_stream_writable.js",
         "function":"onwrite",
         "line":352,
         "method":null,
         "native":false
      },
      {  
         "column":5,
         "file":"_stream_writable.js",
         "function":"TLSSocket.WritableState.onwrite",
         "line":105,
         "method":"WritableState.onwrite",
         "native":false
      },
      {  
         "column":12,
         "file":"net.js",
         "function":"WriteWrap.afterWrite",
         "line":787,
         "method":"afterWrite",
         "native":false
      }
   ],
   "stack":[  
      "TypeError: object is not a function",
      "    at afterWrite (_stream_writable.js:361:3)",
      "    at onwrite (_stream_writable.js:352:7)",
      "    at TLSSocket.WritableState.onwrite (_stream_writable.js:105:5)",
      "    at WriteWrap.afterWrite (net.js:787:12)"
   ],
   "level":"error",
   "message":"uncaughtException: object is not a function",
   "timestamp":"2015-11-30T08:19:47.081Z"
}

@randomsock
Author

@5an1ty @vuza - use PM2 instead, works well and is properly supported

@alanshaw
Member

Wow, when open source goes bad. I'm pretty sure that wasn't necessary @randomsock.

Could this possibly be the issue with Node.js that was fixed very recently:
https://nodejs.org/en/blog/vulnerability/december-2015-security-releases/#cve-2015-8027-denial-of-service-vulnerability

@randomsock
Author

No offence intended @alanshaw, but with no sign of any real development in months despite issues outstanding, including almost all of my own dating back to April, I kinda figured this project was stale. In the meantime, PM2 has at last matured and is now production grade - remembering that we originally ditched PM2 out of frustration in favour of guvnor that showed so much promise. My comment is entirely substantiated and was intended only as a suggestion to the others who have a job to do and are still concerned about guvnor's stability. If you can demonstrate that that Node issue is the cause of this particular problem then that's great, but there's still the bigger picture.

@achingbrain
Member

@randomsock I appreciate your frustration with this issue, however I've been unable to replicate it so it's been rather hard to fix.

The issue seems to be that the socket connection has lost its context somewhere deep inside node core. It craps out invoking cb, which is stored in the state variable, so it sounds similar to the bug that @alanshaw linked to. If you could try upgrading node and see if you still have the problem, that would be helpful.
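
To make the failure mode above concrete, here is a deliberately simplified sketch. It is not the actual `_stream_writable.js` source. It only mimics the pattern of stashing a completion callback on a state object and calling it later. The names `afterWriteSketch` and `guardedAfterWriteSketch` and the shape of `state` are assumptions for illustration.

```javascript
// Simplified illustration only; NOT the real Node.js stream implementation.

// Invokes the callback stored on the state object once a write has finished.
// If state.writecb has been clobbered with a non-function, the call throws
// a TypeError, which becomes an uncaughtException if nothing catches it.
function afterWriteSketch(state) {
  const cb = state.writecb;
  state.writecb = null;
  cb();
}

// Defensive variant: checks the stored value before calling it.
// Returns true if the callback ran, false if it was missing or invalid.
function guardedAfterWriteSketch(state) {
  const cb = state.writecb;
  state.writecb = null;
  if (typeof cb !== 'function') {
    return false;
  }
  cb();
  return true;
}
```

Calling `afterWriteSketch({ writecb: {} })` throws a `TypeError`, which would match the kind of trace in the logs above if something has replaced the stored callback with a plain object. The guarded variant shows why the fix would have to live in node core rather than in guvnor itself.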

Failing that, if you could attempt to provide some way to replicate the problem I will gladly look at it - if you provide a PR to fix the problem I will definitely merge it. This is open source - it's what we make it and everyone's time is voluntary. We all have jobs - if you are concerned about guvnor's stability and it's affecting your ability to do your job then please donate some time and help out by looking into the issue because, you know, given enough eyeballs all bugs are shallow and all that.
