[BUG] ws all_events api disconnects immediately #59183

Closed
dbellavista opened this issue Dec 21, 2020 · 6 comments · Fixed by #63062
Labels
Bug (broken, incorrect, or confusing behavior)
Regression (The issue is a bug that breaks functionality known to work in previous releases.)
severity-high (2nd top severity, seen by most users, causes major problems)

Comments

@dbellavista
Contributor

dbellavista commented Dec 21, 2020

Description
The websocket API all_events instantly disconnects after establishing the ws connection.

Setup

# Salt master configuration
file_roots:
  base:
    - /opt/states
log_level: debug
sharedsecret: ssecret
external_auth:
  sharedsecret:
    sharedsecret1:
      - .*
      - '@wheel'
      - '@runner'
      - '@jobs'
# Salt api configuration
sharedsecret: ssecret
log_level: debug
external_auth:
  sharedsecret:
    sharedsecret1:
      - .*
      - '@wheel'
      - '@runner'
      - '@jobs'

rest_tornado:
    port: 8100
    backlog: 128
    ssl_crt: /etc/pki/tls/certs/localhost.crt
    ssl_key: /etc/pki/tls/certs/localhost.key
    debug: False
    disable_ssl: False
    webhook_disable_auth: False
    cors_origin: null
    websockets: True
# Start the daemons
salt-master -l debug -d -c /etc/salt/master
salt-api -l debug -d -c /etc/salt/api

Steps to Reproduce the behavior
I'm following the API guide to log in, obtain the token, and then open the websocket with:

const Websocket = require('ws');
const source = new Websocket(
  'wss://localhost:8100/all_events/58bb55a47ce22ccd42367d6f31ebf7ba0543b2b67eb3afb27ec65040096a48ce',
  {rejectUnauthorized: false}
);
source.onopen = function () {
  console.log("Opened");
  source.send('websocket client ready');
};
source.onerror = function (e) {
  console.log('OnError', e);
};
source.onmessage = function (e) {
  console.log('OnData', e);
};
source.onclose = function (e) {
  console.log('OnClose', e.type, e.code, e.wasClean, e.reason);
};
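
The login step that produces the token in the URL above is not shown in the report. A minimal sketch of it follows, assuming the rest_tornado `/login` endpoint accepts JSON credentials and returns the token under `return[0].token`. The helper names (`allEventsUrl`, `extractToken`, `login`) are made up for illustration, and the credentials in the usage note are inferred from the sharedsecret configuration above rather than taken from the reporter's script.

```javascript
const https = require('https');

// Build the all_events websocket URL for a session token.
function allEventsUrl(host, port, token) {
  return `wss://${host}:${port}/all_events/${encodeURIComponent(token)}`;
}

// Pull the token out of a parsed /login response body.
// Assumed shape: {"return": [{"token": "...", ...}]}
function extractToken(body) {
  const entry = body && Array.isArray(body.return) ? body.return[0] : undefined;
  if (!entry || typeof entry.token !== 'string') {
    throw new Error('login response did not contain a token');
  }
  return entry.token;
}

// POST credentials to /login and resolve with the session token.
function login(host, port, creds) {
  const payload = JSON.stringify(creds);
  const options = {
    host,
    port,
    path: '/login',
    method: 'POST',
    rejectUnauthorized: false, // self-signed certificate, as in the report
    headers: {
      'Content-Type': 'application/json',
      'Accept': 'application/json',
      'Content-Length': Buffer.byteLength(payload),
    },
  };
  return new Promise((resolve, reject) => {
    const req = https.request(options, (res) => {
      let data = '';
      res.setEncoding('utf8');
      res.on('data', (chunk) => { data += chunk; });
      res.on('end', () => {
        try {
          resolve(extractToken(JSON.parse(data)));
        } catch (err) {
          reject(err);
        }
      });
    });
    req.on('error', reject);
    req.end(payload);
  });
}
```

With the configuration above, a call would presumably look like `login('localhost', 8100, {username: 'sharedsecret1', password: 'ssecret', eauth: 'sharedsecret'}).then((token) => new Websocket(allEventsUrl('localhost', 8100, token), {rejectUnauthorized: false}))`.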

Expected behavior
The websocket should remain open. Instead I immediately get:

$ node test.js
Opened
OnClose close 1005 true
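
For reading the close line above: in RFC 6455, code 1005 is a reserved value meaning the close frame carried no status code, and `wasClean` being true suggests the closing handshake completed rather than the connection simply dropping (which would typically show up as 1006). The helper below is a small sketch for making `onclose` output easier to read; the name `describeClose` is hypothetical and not part of the `ws` package.

```javascript
// Close codes defined in RFC 6455, section 7.4.1.
const CLOSE_CODES = {
  1000: 'Normal Closure',
  1001: 'Going Away',
  1002: 'Protocol Error',
  1003: 'Unsupported Data',
  1005: 'No Status Received',
  1006: 'Abnormal Closure',
  1007: 'Invalid Frame Payload Data',
  1008: 'Policy Violation',
  1009: 'Message Too Big',
  1010: 'Mandatory Extension',
  1011: 'Internal Error',
  1015: 'TLS Handshake',
};

// Turn a close event's code and wasClean flag into a readable summary.
function describeClose(code, wasClean) {
  const name = CLOSE_CODES[code] || 'Unknown';
  const how = wasClean
    ? 'closing handshake completed'
    : 'connection dropped without a closing handshake';
  return `${code} ${name} (${how})`;
}
```

It could be called from the existing handler as `console.log('OnClose', describeClose(e.code, e.wasClean));`.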

Salt api log:

2020-12-21 13:45:38,193 [salt.loaded_140319559971024.int.netapi.rest_tornado.saltnado_websockets:323 ][DEBUG   ][430] In the websocket get method
2020-12-21 13:45:38,193 [tornado.access                                                         :2064][INFO    ][430] 101 GET /all_events/58bb55a47ce22ccd42367d6f31ebf7ba0543b2b67eb3afb27ec65040096a48ce (172.23.0.1) 0.63ms
2020-12-21 13:45:38,197 [salt.loaded_140319559971024.int.netapi.rest_tornado.saltnado_websockets:348 ][DEBUG   ][430] Got websocket message websocket client ready
2020-12-21 13:45:38,197 [salt.loaded_140319559971024.int.netapi.rest_tornado.saltnado_websockets:362 ][INFO    ][430] Error! Ending server side websocket connection. Reason = 
2020-12-21 13:45:38,198 [salt.loaded_140319559971024.int.netapi.rest_tornado.saltnado_websockets:377 ][DEBUG   ][430] In the websocket close method

Versions Report

salt --versions-report
Salt Version:
           Salt: 3002.1
 
Dependency Versions:
           cffi: Not Installed
       cherrypy: unknown
       dateutil: 2.7.3
      docker-py: Not Installed
          gitdb: 2.0.6
      gitpython: 3.0.7
         Jinja2: 2.10.1
        libgit2: Not Installed
       M2Crypto: Not Installed
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.6.2
   mysql-python: Not Installed
      pycparser: Not Installed
       pycrypto: Not Installed
   pycryptodome: 3.6.1
         pygit2: Not Installed
         Python: 3.8.5 (default, Jul 28 2020, 12:59:40)
   python-gnupg: 0.4.5
         PyYAML: 5.3.1
          PyZMQ: 18.1.1
          smmap: 2.0.5
        timelib: Not Installed
        Tornado: 4.5.3
            ZMQ: 4.3.2
 
System Versions:
           dist: ubuntu 20.04 focal
         locale: utf-8
        machine: x86_64
        release: 5.9.14-100.fc32.x86_64
         system: Linux
        version: Ubuntu 20.04 focal

Additional context

It seems that when first entering the main loop in

the get_event call is resolved immediately with a TimeoutException because the request is finished. It could be that when replying with the websocket headers at

self.handler.finish()

the request is inadvertently finished, which then closes the connection?
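
The hypothesis above can be illustrated with a toy model. This is not Salt's or Tornado's code: `getEvent`, `serveEvents`, the `finished` flag, and `TimeoutError` are all hypothetical stand-ins, written only to show how an event lookup that fails as soon as the request is marked finished would make the serving loop close the socket on its first iteration.

```javascript
class TimeoutError extends Error {}

// Hypothetical stand-in for the event lookup: fails at once when the
// request is already marked finished, otherwise returns the next queued
// event (or null when nothing is queued yet).
function getEvent(request, queue) {
  if (request.finished) {
    throw new TimeoutError('request already finished');
  }
  return queue.length > 0 ? queue.shift() : null;
}

// Hypothetical serving loop: forward events until the lookup fails,
// then close. Returns a log of what it did instead of doing real I/O.
function serveEvents(request, queue) {
  const log = [];
  for (;;) {
    let event;
    try {
      event = getEvent(request, queue);
    } catch (err) {
      log.push(`close (${err.message})`);
      break;
    }
    if (event === null) {
      log.push('idle (waiting for more events)');
      break;
    }
    log.push(`send ${JSON.stringify(event)}`);
  }
  return log;
}
```

In this model, `serveEvents({finished: true}, [{tag: 'a'}])` closes without sending anything, which matches the shape of the API log above, while `serveEvents({finished: false}, [{tag: 'a'}])` sends the event and then waits.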

dbellavista added the Bug (broken, incorrect, or confusing behavior) label Dec 21, 2020
@dbellavista
Contributor Author

dbellavista commented Dec 21, 2020

By the way, it works fine on bionic, 2019.2

salt --versions-report # bionic 2019.2
Salt Version:
           Salt: 2019.2.8
 
Dependency Versions:
           cffi: Not Installed
       cherrypy: unknown
       dateutil: 2.6.1
      docker-py: Not Installed
          gitdb: 2.0.3
      gitpython: 2.1.8
          ioflo: Not Installed
         Jinja2: 2.10
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.5.6
   mysql-python: Not Installed
      pycparser: Not Installed
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: Not Installed
         Python: 3.6.9 (default, Apr 18 2020, 01:56:04)
   python-gnupg: 0.4.1
         PyYAML: 3.12
          PyZMQ: 16.0.2
           RAET: Not Installed
          smmap: 2.0.3
        timelib: Not Installed
        Tornado: 4.4.3
            ZMQ: 4.2.5
 
System Versions:
           dist: Ubuntu 18.04 bionic
         locale: ANSI_X3.4-1968
        machine: x86_64
        release: 5.9.14-100.fc32.x86_64
         system: Linux
        version: Ubuntu 18.04 bionic

sagetherage assigned Ch3LL and dwoz and unassigned Ch3LL Jan 4, 2021
@sagetherage
Contributor

@dbellavista OK, so it is a regression from v2019 to 3002.1. Did it work on any version in between?

sagetherage added the Regression (The issue is a bug that breaks functionality known to work in previous releases.) label Jan 28, 2021
@dbellavista
Contributor Author

Hi, I just tried with 3000 and 3001 on Ubuntu 18.04 and it didn't work.

@dbellavista
Contributor Author

Just one question for now. Since the 2019 release has been removed from the repo, is there an alternative way to use the WS APIs?
They are part of our production workflow and we have to upgrade Salt in a few weeks.

@sagetherage
Contributor

v2019 has been moved to https://archive.repo.saltproject.io/. As for whether the WS APIs work there: you have seen the issues we have had since we vendored Tornado (#50699 (comment)), but it should work. Since we don't support this version, we aren't testing this specific use case, but we would be interested to know if it works for you.

@dbellavista
Contributor Author

Thanks @sagetherage for the repo link, I totally missed it!

And also thank you for reminding me of that issue. I tried with

pip3 install tornado==4.4.3
mv /usr/lib/python3/dist-packages/salt/ext/tornado /usr/lib/python3/dist-packages/salt/ext/tornado_old
cp -r /usr/local/lib/python3.8/dist-packages/tornado /usr/lib/python3/dist-packages/salt/ext/tornado

and it works as expected.

sagetherage assigned dwoz and unassigned garethgreenaway Feb 9, 2021
sagetherage added the Silicon (v3004.0 Release code name) label and removed the needs-triage label Feb 9, 2021
sagetherage added this to the Silicon milestone Feb 9, 2021
sagetherage added the severity-high (2nd top severity, seen by most users, causes major problems) and phase-plan labels Feb 9, 2021
dwoz modified the milestones: Silicon, Phosphorus Aug 30, 2021
dwoz added the Phosphorus (v3005.0 Release code name and version) label and removed the Silicon (v3004.0 Release code name) label Aug 30, 2021
dwoz removed the Phosphorus (v3005.0 Release code name and version) label Apr 18, 2022
dwoz removed this from the Phosphorus v3005.0 milestone Apr 18, 2022
dwoz added this to the Sulphur v3006.0 milestone Apr 18, 2022
Ch3LL assigned MKLeb and unassigned dwoz Nov 9, 2022