Project

General

Profile

Bug #816

Updated by Tobias Brunner over 5 years ago

Hi,

I observed a race condition in charon between threads while it processed an IKE_SA_INIT request. The race is probably triggered by system overload. The IKE_SA_INIT was retransmitted by the remote side one time. The end result is that charon processed both the original and retransmitted IKE_SA_INIT, so the tunnel (IKE_SA) was deleted locally while the remote side thinked it was up. Please see log excerpt below for details.

<pre>
---> first IKE_SA_INIT received
Nov 15 12:06:10 localhost charon: 01[NET] received packet => 432 bytes @ 0x7fc6662c33b0

---> IKE_SA_INIT handled by thread 22
Nov 15 12:06:12 localhost charon: 22[NET] received packet: from 2dec::1001[500] to 2dec::1002[500] (432 bytes)
Nov 15 12:06:12 localhost charon: 22[ENC] parsed IKE_SA_INIT request 0 [ SA KE No N(NATD_S_IP) N(NATD_D_IP) ]

---> IKE_SA_INIT response sent out (note high processing time due to system overload):
Nov 15 12:06:17 localhost charon: 03[NET] sending packet: from 2dec::1002[500] to 2dec::1001[500]

---> retransmitted IKE_SA_INIT coming in (this was sent by remote side before it got the response, and then it was sitting in local socket input queue):
Nov 15 12:06:17 localhost charon: 01[NET] received packet => 432 bytes @ 0x7fc6662c33b0

---> In the meantime the remote side received the IKE_SA_INIT response and sent an IKE_AUTH request.
---> IKE_AUTH coming in:
Nov 15 12:06:19 localhost charon: 01[NET] received packet => 444 bytes @ 0x7fc6662c33b0

---> IKE_AUTH is processed:
Nov 15 12:06:20 localhost charon: 21[NET] received packet: from 2dec::1001[500] to 2dec::1002[500] (444 bytes)
Nov 15 12:06:20 localhost charon: 21[ENC] parsed IKE_AUTH request 1 [ IDi IDr AUTH SA TSi TSr N(MULT_AUTH) N((40960)) ]

---> IKE_AUTH response sent back:
Nov 15 12:06:30 localhost charon: 21[ENC] generating IKE_AUTH response 1 [ IDr AUTH SA TSi TSr ]
Nov 15 12:06:30 localhost charon: 03[NET] sending packet: from 2dec::1002[500] to 2dec::1001[500]

---> retransmitted IKE_SA_INIT is processed -- this should not happen! the order has now changed.
Nov 15 12:06:30 localhost charon: 11[NET] received packet: from 2dec::1001[500] to 2dec::1002[500] (432 bytes)

---> As a result the local tunnel is cleared, and the other side will just see a response and think everything is OK.
Nov 15 12:06:30 localhost charon: 11[IKE] received message ID 0, expected 2. Ignored
Nov 15 12:06:30 localhost charon: 11[IKE] IKE_SA tunnel_autotest_ipv6_0[38] state change: ESTABLISHED => DESTROYING
</pre>



I took some looks at the code and interpret the logs like this: The second IKE_SA_INIT was scheduled to thread 11, but before this thread had a chance to check out the SA, the other thread 21 checked out the SA and started to process the IKE_AUTH message. So even though the jobs were scheduled in the right order, they were scheduled to 2 different threads, and nothing prevented the thread processing the IKE_AUTH job to start before the IKE_SA_INIT handler thread. This might happen under heavy overload only, but seems like a real problem nevertheless.

Another related issue can be the exception with IKE_SA_INIT messages in task_manager_v2.c after the "received message ID %d, expected %d. Ignored" printout. Without this check, what is not present in strongswan 4, this case would probably work fine.

Can you provide an opinion or fix/workaround for this issue?

Regards,
Imre Keri

Back