Bug #757
Updated by Tobias Brunner almost 11 years ago
We encountered a strongSwan server crash on a production system with more than 300 users online.
Here is the ipsec frontend output:
<pre>
/var/log/strongswan# ipsec start --nofork
Starting strongSwan 5.2.1dr1 IPsec [starter]...
charon (12295) started after 40 ms
*** buffer overflow detected ***: /usr/lib/ipsec/charon terminated
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x37)[0x7fc6a44c2e67]
/lib/x86_64-linux-gnu/libc.so.6(+0x109d60)[0x7fc6a44c1d60]
/lib/x86_64-linux-gnu/libc.so.6(+0x10ae1e)[0x7fc6a44c2e1e]
/usr/lib/ipsec/libradius.so.0(+0x3002)[0x7fc69ede2002]
/usr/lib/ipsec/libradius.so.0(+0x3731)[0x7fc69ede2731]
/usr/lib/ipsec/plugins/libstrongswan-eap-radius.so(+0x3a97)[0x7fc69efe9a97]
/usr/lib/ipsec/plugins/libstrongswan-xauth-eap.so(+0xe2c)[0x7fc69e9d9e2c]
/usr/lib/ipsec/libcharon.so.0(+0x51cb1)[0x7fc6a49e5cb1]
/usr/lib/ipsec/libcharon.so.0(+0x482e3)[0x7fc6a49dc2e3]
/usr/lib/ipsec/libcharon.so.0(+0x26faf)[0x7fc6a49bafaf]
/usr/lib/ipsec/libcharon.so.0(+0x214a7)[0x7fc6a49b54a7]
/usr/lib/ipsec/libstrongswan.so.0(+0x2b733)[0x7fc6a4e3b733]
/usr/lib/ipsec/libstrongswan.so.0(+0x3aea0)[0x7fc6a4e4aea0]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a)[0x7fc6a477ee9a]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fc6a44ac31d]
======= Memory map: ========
</pre>
We looked at the RADIUS settings in strongswan.conf and found something interesting:
If we set sockets = 20 and monitor UDP socket usage with ss -s, the count rises from 20 to 40 as user load increases and then stops.
If we set sockets = 1000, ss -s shows UDP usage rising from 1000 to 1040, and then strongSwan crashes.
From reading the code, the sockets parameter sizes a socket pool for the RADIUS client and acts as a maximum: requests beyond that limit have to wait until a socket that is in use is freed.
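To make our reading of the pool behaviour concrete, here is a minimal Python sketch of a fixed-size pool in which callers block once every socket is in use. It only illustrates the semantics described above and is not the libradius implementation; the class name @SocketPool@, its methods, and the @factory@ parameter are hypothetical names chosen for this example.

```python
import queue
import socket


def _udp_socket():
    """Default factory: one unconnected UDP socket (assumed transport for RADIUS)."""
    return socket.socket(socket.AF_INET, socket.SOCK_DGRAM)


class SocketPool:
    """Hypothetical fixed-size pool: `size` is a hard upper bound on sockets."""

    def __init__(self, size, factory=_udp_socket):
        # All sockets are created up front; the queue holds the idle ones.
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(factory())

    def acquire(self, timeout=None):
        # Blocks while every socket is in use. With a timeout, raises
        # queue.Empty if no socket is released in time.
        return self._idle.get(timeout=timeout)

    def release(self, sock):
        # Hand the socket back so that a waiting caller can proceed.
        self._idle.put(sock)
```

With @SocketPool(20)@, a 21st concurrent request would wait in @acquire()@ until some other request calls @release()@, which matches the limit we expected from the sockets setting.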
How does this affect charon's stability?
For crashes like this one, do you have a good way to debug them, or any suggestions for improving stability under high load?