ESP freezes when losing client connection #54

Closed
Laxilef opened this issue Dec 21, 2023 · 1 comment
Labels: bug, enhancement

Laxilef commented Dec 21, 2023

Hi
I noticed something strange. If a client suddenly disconnects from the telnet server, the ESP goes into an endless loop. At the same time, the watchdog does not trigger and the ESP does not reboot.

The ESP recovers if I:

  1. forcibly disconnect the ESP client on the Wi-Fi router
  2. reconnect to the telnet server

It feels like the code is stuck somewhere in an infinite loop that calls yield(), because the watchdog timer is not firing.
This is easy to reproduce:

  1. Use ESPTelnet and PubSubClient
  2. Every 5 seconds, write something to an MQTT topic and to telnet
  3. Connect to the telnet server
  4. Disable Wi-Fi on the telnet client without terminating the session in the telnet client
  5. Observe that the MQTT client on the ESP has stopped publishing messages

I tested this on both ESP8266 and ESP32, and the behavior is the same. If I end the session from the telnet client, the problem does not occur.
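
The periodic write from step 2 can be sketched as a small helper. This is only an illustration of the reproduction setup, not code from the project: the `heartbeat` function, the `test/heartbeat` topic, and the two `Fake*` types are hypothetical names, and the fakes exist only so the helper can be compiled and exercised on a host machine.

```cpp
#include <cstdint>

// Sends one telnet line and one MQTT message whenever interval_ms has elapsed.
// Telnet needs println(const char*); Mqtt needs publish(const char*, const char*).
// Returns true when a heartbeat was sent on this call.
template <typename Telnet, typename Mqtt>
bool heartbeat(Telnet& telnet, Mqtt& mqtt, uint32_t now_ms,
               uint32_t& last_ms, uint32_t interval_ms = 5000) {
    if (now_ms - last_ms < interval_ms) {
        return false;  // not due yet
    }
    last_ms = now_ms;
    telnet.println("tick");                  // telnet write
    mqtt.publish("test/heartbeat", "tick");  // hypothetical topic name
    return true;
}

// Host-side stand-ins (hypothetical) that only count calls.
struct FakeTelnet {
    int lines = 0;
    void println(const char*) { ++lines; }
};

struct FakeMqtt {
    int messages = 0;
    bool publish(const char*, const char*) { ++messages; return true; }
};
```

On a device, the same helper would presumably be called from `loop()` as `heartbeat(telnet, mqtt, millis(), lastMs)`, with an ESPTelnet instance and a PubSubClient instance in place of the fakes; as far as I know both accept these calls, but that is an assumption worth checking against the library versions in use.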

It looks like the TCP connection is stuck, apparently waiting for a response from the client. I didn't see any bugs in your code: performKeepAliveCheck should disconnect the client, but something prevents it from doing so.

Do you have any ideas?

Laxilef commented Dec 21, 2023

While investigating, I found that the freeze occurs after calling Stream::write(). It is caused by the timeout in ClientContext::_write_from_source():

    size_t _write_from_source(const char* ds, const size_t dl)
    {
        assert(_datasource == nullptr);
        assert(!_send_waiting);
        _datasource = ds;
        _datalen = dl;
        _written = 0;
        _op_start_time = millis();
        do {
            if (_write_some()) {
                _op_start_time = millis();
            }

            if (_written == _datalen || _is_timeout() || state() == CLOSED) {
                if (_is_timeout()) {
                    DEBUGV(":wtmo\r\n");
                }
                _datasource = nullptr;
                _datalen = 0;
                break;
            }

            _send_waiting = true;
            // will resume on timeout or when _write_some_from_cb or _notify_error fires
            esp_delay(_timeout_ms, [this]() { return this->_send_waiting; });
            _send_waiting = false;
        } while(true);

        if (_sync)
            wait_until_acked();

        return _written;
    }
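
A rough host-side model of that loop may help show why the caller stalls while the watchdog stays quiet. It is a simplified sketch, not the core's code: `WriteModel` and its members are hypothetical, the 5000 ms timeout is an assumed value, time is simulated in 1 ms slices, and the timeout test uses `>=` so the model blocks for exactly one timeout period. It also assumes that the wait yields on every slice, which would explain why the watchdog is kept fed.

```cpp
#include <cstddef>
#include <cstdint>

// Host-side model only: none of these names belong to the real core API.
struct WriteModel {
    uint32_t now_ms = 0;         // simulated millis()
    uint32_t timeout_ms = 5000;  // assumed write timeout
    uint32_t wdt_feeds = 0;      // how often the wait yielded (feeding the watchdog)
    bool peer_acks = false;      // false models a client that vanished silently

    // Stand-in for _write_some(): data is accepted only while the peer answers.
    size_t write_some(size_t remaining) { return peer_acks ? remaining : 0; }

    // Stand-in for esp_delay(): waits in 1 ms slices and yields on each one.
    void delay_yielding(uint32_t ms) {
        for (uint32_t i = 0; i < ms; ++i) {
            ++now_ms;
            ++wdt_feeds;
        }
    }

    // Same shape as the loop above: returns the number of bytes written.
    size_t write(size_t len) {
        size_t written = 0;
        uint32_t op_start = now_ms;
        for (;;) {
            size_t n = write_some(len - written);
            if (n > 0) {
                written += n;
                op_start = now_ms;  // progress restarts the timeout window
            }
            bool timed_out = (now_ms - op_start) >= timeout_ms;
            if (written == len || timed_out) {
                break;
            }
            delay_yielding(timeout_ms);  // caller is blocked here, but still yielding
        }
        return written;
    }
};
```

In this model a write to a silent peer returns 0 bytes only after the full timeout has elapsed, and the wait yields the whole time. If every telnet write pays that cost, the rest of `loop()` (including the MQTT client) would get very little time to run, which matches the observed freeze without a watchdog reset.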

I'll check a few things and open a PR.
