Intermittent PyPortal crashes #1
Had one hang inside
PyPortal had been up for over 18612 seconds (a bit over 5 hours) at the time of the hang, with 10 second refreshes on the status text, and 120 second refreshes on the image. Had already added garbage collection calls after each status refresh, which had no effect on this hang. Hang occurred while communicating with Adafruit.io image conversion service. Did not see evidence of any response in the debugging logs. Normally, I get I assume that there would be a socket timeout due to |
Had a second issue where the adafruit.io service correctly calculates the BMP size, but the
This one occurred after roughly 7000 seconds of uptime. The previous |
ok so the second thing is fixed (well, at least it gives you an error) |
Further requests that failed:
Further requests that succeeded:
|
Around 2019-04-13 19:19:07.745 -0500, saw no response from Adafruit.io, but got an 'ESP32 not responding' with a timeout instead of an infinite hang. The exception raised by PyPortal was handled correctly, and after a total of 7 'ESP32 not responding' messages (both for Adafruit.io and my own server), things went back to normal. Around 2019-04-13 21:33:48.848 -0500, got no response from Adafruit.io, no timeouts, and an infinite hang (up until 2019-04-14 08:11 -0500):
Looks like the Adafruit.io service did make a request of my web server, and got a 200 response and the entire image, so whatever failed had to have been after the image download, and before the converted image would have been returned.
As of 2019-04-14 09:49:22.174 -0500, moved my original image from a work web server to an offsite VPS. Will leave the stress test running the rest of the weekend to see if hangs are limited to the work web server or any security hardware at the campus perimeter. |
Still having intermittent hangs that appear to be independent of the backend server and of whether adafruit.io is involved. With PyPortal
With PyPortal
So response headers never get printed.
Waiting to see which of these print on a hang. |
dya know what is the lowest level line it hangs at? |
Not yet. Haven't seen it hang after adding the debugging lines. I end up leaving it plugged in on one or another Mac at home or at work and leaving |
11.3 hours of uninterrupted use at home, going to try again at work. |
4.5 hours uninterrupted at work (missed carrying it in yesterday morning), >14 hours uninterrupted at home. Either the bug was external to the PyPortal, unexpectedly fixed in a dependency, or doesn't show up on the original source |
Finally got a hang about an hour ago. Got the
|
After adding additional debugging print statements, finally got another hang last night. Failing logs:
From

    print("Data written to socket")
    line = sock.readline()
    #print(line)
    line = line.split(None, 2)
    status = int(line[1])
    reason = ""
    if len(line) > 2:
        reason = line[2].rstrip()
    print("Reading lines from socket")

From

    def readline(self):
        """Attempt to return as many bytes as we can up to but not including '\r\n'"""
        print("Socket readline")
        while b'\r\n' not in self._buffer:
            # there's no line already in there, read some more
            avail = min(_the_interface.socket_available(self._socknum), MAX_PACKET)
            if avail:
                self._buffer += _the_interface.socket_read(self._socknum, avail)
        firstline, self._buffer = self._buffer.split(b'\r\n', 1)
        gc.collect()
        return firstline

Only lines possibly causing an infinite loop/hang are the while loop in
|
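For illustration, here is a minimal sketch of how the `while` loop in `readline` could be bounded with a timeout, so that a socket that never delivers `\r\n` raises an exception instead of spinning forever. This is not the library's actual code: the `TimeoutSocket` and `FakeInterface` classes, the `timeout` parameter, and the `MAX_PACKET` value are assumptions made only so the example runs on its own.

```python
import time

MAX_PACKET = 4000  # assumed value, for illustration only


class TimeoutSocket:
    """Hypothetical stand-in for the socket class quoted above."""

    def __init__(self, interface, socknum, timeout=3.0):
        self._iface = interface
        self._socknum = socknum
        self._timeout = timeout  # seconds to wait without any new data
        self._buffer = b""

    def readline(self):
        """Return bytes up to, but not including, CRLF; raise on timeout."""
        stamp = time.monotonic()
        while b"\r\n" not in self._buffer:
            avail = min(self._iface.socket_available(self._socknum), MAX_PACKET)
            if avail:
                self._buffer += self._iface.socket_read(self._socknum, avail)
                stamp = time.monotonic()  # progress was made, restart the clock
            elif time.monotonic() - stamp > self._timeout:
                raise RuntimeError("Timed out waiting for a line from the socket")
        firstline, self._buffer = self._buffer.split(b"\r\n", 1)
        return firstline


class FakeInterface:
    """Test double: serves a fixed byte string, then reports no data."""

    def __init__(self, data):
        self._data = data

    def socket_available(self, socknum):
        return len(self._data)

    def socket_read(self, socknum, size):
        chunk, self._data = self._data[:size], self._data[size:]
        return chunk
```

With this shape, the caller sees a `RuntimeError` when the peer goes silent and can handle it like any other failed request.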
No further hangs yet, but I've replaced my ad hoc debugging statements with passing |
Got my hang a bit ago. With all the ESP debugging enabled, a working request:
And a failing request:
where the |
Updated to ESP32 firmware 1.3.0 to see if that makes any difference. |
yea looks like it just 'hangs' - we should detect this and do a hard reset |
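One way this "detect and reset" idea might look, as a rough sketch only: count consecutive failed requests and trigger a reset callback once a threshold is reached. The `FailureCounter` class, its `max_failures` parameter, and the injected `hard_reset` callable are hypothetical names for this example. On a real board the callback could be something like CircuitPython's `microcontroller.reset()`. It only helps if the hang surfaces as an exception, for example via a read timeout.

```python
class FailureCounter:
    """Hypothetical helper: reset after too many consecutive failed requests."""

    def __init__(self, hard_reset, max_failures=3):
        self._hard_reset = hard_reset      # callable that resets the hardware
        self._max_failures = max_failures  # consecutive failures tolerated
        self.failures = 0

    def attempt(self, request):
        """Run request(); return its result, or None if it raised."""
        try:
            result = request()
        except (RuntimeError, ValueError, OSError) as err:
            self.failures += 1
            print("Request failed:", err)
            if self.failures >= self._max_failures:
                self.failures = 0
                self._hard_reset()
            return None
        self.failures = 0  # any success clears the streak
        return result
```

Passing the reset action in as a callable keeps the sketch testable on a desktop Python and leaves the choice between a soft reload and a hard reset to the caller.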
I'm assuming this is the same root cause as in Fixed infinite loop when socket readline fails. |
At times, the PyPortal will crash with part of a background displayed. Currently stress-testing the existing code with more frequent wget and fetch calls to see what exceptions are thrown and how best to handle them (including soft reboots if things go completely sideways).