Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v5.5 - e2guardian exiting unexpectedly #794

Closed
opoplawski opened this issue Feb 2, 2024 · 10 comments
Closed

v5.5 - e2guardian exiting unexpectedly #794

opoplawski opened this issue Feb 2, 2024 · 10 comments

Comments

@opoplawski
Copy link
Contributor

We are seeing e2guardian exiting unexpectedly a couple times a day on a few of our machines. e.g.:

Feb 02 08:41:34 e2guardian[733458]: hw1516: SSL Error: error:1408F10B:SSL routines:ssl3_get_record:wrong version number at:
Feb 02 08:41:36 systemd[1]: e2guardian.service: Succeeded.
Feb 02 08:41:37 systemd[1]: e2guardian.service: Service RestartSec=100ms expired, scheduling restart.
Feb 02 08:41:37 systemd[1]: e2guardian.service: Scheduled restart job, restart counter is at 21.
Feb 02 08:41:37 systemd[1]: Stopped E2guardian Web filtering.
Feb 02 08:41:37 systemd[1]: Starting E2guardian Web filtering...

I don't think the SSL Error message is related - they seem "normal". Nothing else is getting logged. There is no indication of a segfault or similer.

Seeing with 5.5.4, and git 83afd5a

@opoplawski
Copy link
Contributor Author

Apparently it is receiving a SIGABRT:

770438 stat("/etc/pki/tls/certs/5806f4dd.0", 0x145f2f40f410) = -1 ENOENT (No such file or directory)
770438 rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0
770438 rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [HUP USR1 PIPE TERM], 8) = 0
770438 gettid()                         = 770438
770438 tgkill(769313, 770438, SIGABRT)  = 0
770438 rt_sigprocmask(SIG_SETMASK, [HUP USR1 PIPE TERM], NULL, 8) = 0
770438 --- SIGABRT {si_signo=SIGABRT, si_code=SI_TKILL, si_pid=769313, si_uid=0} ---
769313 <... rt_sigtimedwait resumed> <unfinished ...>) = ?
771325 +++ killed by SIGABRT +++

But not sure why that's not yielding a core dump.

@philipianpearce
Copy link
Contributor

philipianpearce commented Feb 5, 2024 via email

@opoplawski
Copy link
Contributor Author

Yeah, I've done everything I can to enable core dumps. But it seems like a core dump isn't even triggered, not that the dump is suppressed. I'm completely baffled by this behavior.

openssl-1.1.1k-12.el8_9.x86_64

This is strace output, so I don't know what code it is in.

# cat /usr/lib/systemd/system/e2guardian.service
[Unit]
Description=E2guardian Web filtering
After=network.target

[Service]
TasksMax=infinity
LimitNOFILE=65535
LimitSTACK=infinity:infinity
Type=forking
IgnoreSIGPIPE=no
GuessMainPID=no
#ExecStart=/usr/bin/strace -f --quiet -e '!read,write,poll,sendto,umask,getpid,fcntl,futex,shutdown,getsockopt,setsockopt,accept,fstat' -o /tmp/e2guardian.strace /usr/sbin/e2guardian
ExecStart=/usr/sbin/e2guardian
ExecReload=/usr/sbin/e2guardian -r
UMask=027

[Install]
WantedBy=multi-user.target

@opoplawski
Copy link
Contributor Author

It does seem SSL related though. While it isn't crashing every time, all the exits seem to have a connection to a whatsapp.net address as the last entry in requests.log:

 https://3.33.221.48:443 hw1185: PROXY   CONNECT 0
       0       -       1       0       - 
https://3.33.221.48:443 hw333: PROXY    CONNECT 0
       0       -       1       0       - 
https://3.33.252.61:443 hw1429: PROXY   CONNECT 0
       0       -       1       0       -

@opoplawski
Copy link
Contributor Author

I've managed to get a backtrace with gdb:

Thread 558 "e2guardian" received signal SIGABRT, Aborted.
[Switching to Thread 0x14e8b78bc700 (LWP 677343)]
0x000014e949cd2acf in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x000014e949cd2acf in raise () from /lib64/libc.so.6
#1  0x000014e949ca5ea5 in abort () from /lib64/libc.so.6
#2  0x0000557c687ba3d8 in std::__replacement_assert (
    __file=__file@entry=0x557c688e0f40 "/usr/include/c++/8/bits/stl_deque.h", 
    __line=__line@entry=1463, 
    __function=__function@entry=0x557c688e1f80 <std::deque<String, std::allocator<String> >::front()::__PRETTY_FUNCTION__> "std::deque<_Tp, _Alloc>::reference std::deque<_Tp, _Alloc>::front() [with _Tp = String; _Alloc = std::allocator<String>; std::deque<_Tp, _Alloc>::reference = String&]", 
    __condition=__condition@entry=0x557c688e0f68 "__builtin_expect(!this->empty(), true)")
    at /usr/include/c++/8/x86_64-redhat-linux/bits/c++config.h:2409
#3  0x0000557c687f3f1c in std::deque<String, std::allocator<String> >::front (
    this=0x14e8b78a80c0) at /usr/include/c++/8/bits/stl_deque.h:1461
#4  HTTPHeader::checkheader (this=0x14e8b78a80c0, allowpersistent=true) at HTTPHeader.cpp:957
#5  0x0000557c687f5190 in HTTPHeader::in (this=this@entry=0x14e8b78a80c0, 
    sock=sock@entry=0x14e91c53f810, allowpersistent=allowpersistent@entry=true)
    at HTTPHeader.cpp:1894
#6  0x0000557c687cdcbf in ConnectionHandler::handleConnection (this=0x14e8b78bbb70, 
    peerconn=..., ip=..., ismitm=true, proxysock=..., 
    dystat=@0x557c68b1b018: 0x557c68b1c3c0 <dstat>) at ConnectionHandler.cpp:675
#7  0x0000557c687d4496 in ConnectionHandler::goMITM (this=0x14e8b78bbb70, checkme=..., 
    proxysock=..., peerconn=..., persistProxy=@0x14e8b78b0cad: false, 
    authed=@0x14e8b78b0cac: true, persistent_authed=@0x14e8b78bbc11: true, ip=..., 
    dystat=@0x557c68b1b018: 0x557c68b1c3c0 <dstat>, clientip="10.10.30.231", transparent=false)
    at ConnectionHandler.cpp:2622
#8  0x0000557c687d102a in ConnectionHandler::handleConnection (this=0x14e8b78bbb70, 
    peerconn=..., ip=..., ismitm=false, proxysock=..., 
    dystat=@0x557c68b1b018: 0x557c68b1c3c0 <dstat>) at ConnectionHandler.cpp:1223
#9  0x0000557c687db4f7 in ConnectionHandler::handlePeer (this=this@entry=0x14e8b78bbb70, 
    peerconn=..., ip=..., dystat=@0x557c68b1b018: 0x557c68b1c3c0 <dstat>, 
    lc_type=lc_type@entry=1) at ConnectionHandler.cpp:570
#10 0x0000557c68811d97 in handle_connections (tindex=<optimized out>) at FatController.cpp:397
#11 0x000014e94a6a5b23 in execute_native_thread_routine () from /lib64/libstdc++.so.6
#12 0x000014e94b52f1ca in start_thread () from /lib64/libpthread.so.0
#13 0x000014e949cbde73 in clone () from /lib64/libc.so.6
(gdb) up
#1  0x000014e949ca5ea5 in abort () from /lib64/libc.so.6
(gdb) up
#2  0x0000557c687ba3d8 in std::__replacement_assert (
    __file=__file@entry=0x557c688e0f40 "/usr/include/c++/8/bits/stl_deque.h", 
    __line=__line@entry=1463, 
    __function=__function@entry=0x557c688e1f80 <std::deque<String, std::allocator<String> >::front()::__PRETTY_FUNCTION__> "std::deque<_Tp, _Alloc>::reference std::deque<_Tp, _Alloc>::front() [with _Tp = String; _Alloc = std::allocator<String>; std::deque<_Tp, _Alloc>::reference = String&]", 
    __condition=__condition@entry=0x557c688e0f68 "__builtin_expect(!this->empty(), true)")
    at /usr/include/c++/8/x86_64-redhat-linux/bits/c++config.h:2409
2409    /usr/include/c++/8/x86_64-redhat-linux/bits/c++config.h: No such file or directory.
(gdb) up
#3  0x0000557c687f3f1c in std::deque<String, std::allocator<String> >::front (
    this=0x14e8b78a80c0) at /usr/include/c++/8/bits/stl_deque.h:1461
1461    /usr/include/c++/8/bits/stl_deque.h: No such file or directory.
(gdb) up
#4  HTTPHeader::checkheader (this=0x14e8b78a80c0, allowpersistent=true) at HTTPHeader.cpp:957
957         if (header.front().after("HTTP/").startsWith("1.1")) {
(gdb) print header
$1 = std::deque with 0 elements
(gdb) print header.front()
$2 = {<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >> = " \031\224", <No data fields>}
(gdb) print header.front().after("HTTP/")
$3 = {<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >> = "", <No data fields>}
(gdb) print header.front().after("HTTP/").startsWith("1.1")
Invalid cast.

So perhaps we are not handling corrupt header information safely.

@opoplawski
Copy link
Contributor Author

Maybe your header size check here (https://github.com/e2guardian/e2guardian/blob/v5.5/src/HTTPHeader.cpp#L875) needs to get extended? Or you need intermediate checks in your if (header.front().after("HTTP/").startsWith("1.1")) statement. Latter probably safer.

I'm compiling with -D_GLIBCXX_ASSERTIONS which is the default for Fedora packages. See https://gcc.gnu.org/onlinedocs/libstdc++/manual/using_macros.html

@opoplawski
Copy link
Contributor Author

Hmm, I thought that re-enabling dbshowheader() would give some clues as the what the headers where, but I just see [141B blob data].

philipianpearce added a commit that referenced this issue Feb 6, 2024
@philipianpearce
Copy link
Contributor

@opoplawski - I think I might have a fix in latest v5.5 - can you test?

It looks as if whatsapp is faking the first header line but not completing the headers with a blank line. I.e. Using https channel but not running http protocol across it.
The HTTPHeader.in() function had a check for a minimum number of header lines of 1 when it should have been 2 (at least first line plus a blank line). This resulted in the first line being pop-ed off the deque as it was also the last line so the condition in checkheader was using a null String pointer.

The effect of this will be that whatapp calls will be dropped.

Thanks for all the traces - really helpful!

@opoplawski
Copy link
Contributor Author

So far so good. I don't really have a trigger, but it's been running for a while with no more crashes despite whatsapp still running.

@opoplawski
Copy link
Contributor Author

Just to follow up - haven't seen any more crashes with this in production.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants