Project

General

Profile

Bug #853

Issue with rekeying and updown script

Added by Andreas Braun about 5 years ago. Updated about 4 years ago.

Status:
Closed
Priority:
Normal
Category:
libcharon
Target version:
Start date:
20.02.2015
Due date:
Estimated time:
Affected version:
5.2.1
Resolution:
Fixed

Description

Hello.

We are having a big issue regarding IPsec connection stability when it comes to rekey collisions and the updown script. We are using the firewall distribution IPFire, which uses Strongswan 5.2.1 for IPsec connectivity.

The problem is, when a rekey collision occurs, the updown script gets executed with the client-down function, which removes iptable firewall rules. This in turn prevents any further data from going through the tunnel. IPsec connections will remain. Routes also remain fine. Only manually restarting the tunnel helps in this situation. Setting different rekey times on each side to prevent a collision did not help, because that gave us other problems.

These are the connection settings from ipsec.conf (both sides use equal settings):

version 2

conn %default
        keyingtries=%forever

include /etc/ipsec.user.conf (which is empty)

conn VpnEssenFrankfurt
        left=xxx.xxx.xxx.178
        leftsubnet=192.168.241.0/24
        leftfirewall=yes
        lefthostaccess=yes
        right=xxx.xxx.xxx.254
        rightsubnet=10.2.34.0/24
        leftid="xxx.xxx.xxx.178" 
        rightid="xxx.xxx.xxx.254" 
        ike=aes128-sha-modp2048,aes128-sha-modp1536,aes128-sha-modp1024
        esp=aes128-sha1-modp2048,aes128-sha1-modp1536,aes128-sha1-modp1024
        keyexchange=ikev2
        ikelifetime=8h
        keylife=1h
        compress=yes
        dpdaction=restart
        dpddelay=120
        dpdtimeout=30
        authby=secret
        auto=start

rekeymargin, rekeyfuzz are not set, so they should be at default values.

I also created a test setup using VMs with the same software. To reproduce rekey collisions and investigate a little further I changed the following settings:

keylife=1m
rekeymargin=50s
rekeyfuzz=0%

Before rekey collision:

Chain IPSECFORWARD (1 references)
target     prot opt source               destination
RETURN     all  --  10.1.34.0/24         10.2.34.0/24         policy match
dir in pol ipsec reqid 1 proto 50
MARK       all  --  10.2.34.0/24         10.1.34.0/24         policy match
dir out pol ipsec reqid 1 proto 50 MARK set 0x32

Chain IPSECINPUT (1 references)
target     prot opt source               destination
ACCEPT     esp  --  10.1.100.178         10.1.100.254
ACCEPT     ah   --  10.1.100.178         10.1.100.254
ACCEPT     4    --  10.1.100.178         10.1.100.254
ACCEPT     4    --  10.1.100.178         10.1.100.254         policy match
dir in pol ipsec reqid 1 proto 50
RETURN     all  --  10.1.34.0/24         10.2.34.0/24         policy match
dir in pol ipsec reqid 1 proto 50
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            udp dpt:500
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            udp dpt:4500
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            udp dpt:500
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            udp dpt:4500

Chain IPSECOUTPUT (1 references)
target     prot opt source               destination
MARK       all  --  10.2.34.0/24         10.1.34.0/24         policy match
dir out pol ipsec reqid 1 proto 50 MARK set 0x32

After rekey collision:

Chain IPSECFORWARD (1 references)
target     prot opt source               destination

Chain IPSECINPUT (1 references)
target     prot opt source               destination
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            udp dpt:500
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            udp dpt:4500
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            udp dpt:500
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            udp dpt:4500

Chain IPSECOUTPUT (1 references)
target     prot opt source               destination

Here are some log entries when it happens:

09:50:49 charon:  15[IKE] detected CHILD_REKEY collision with CHILD_REKEY 
09:50:49 charon:  15[ENC] generating CREATE_CHILD_SA response 19 [ N(IPCOMP_SUP) SA No KE TSi TSr ] 
09:50:49 charon:  15[NET] sending packet: from 10.1.100.254[4500] to 10.1.100.178[4500] (476 bytes) 
09:50:49 charon:  11[NET] received packet: from 10.1.100.178[4500] to 10.1.100.254[4500] (476 bytes) 
09:50:49 charon:  11[ENC] parsed CREATE_CHILD_SA response 5 [ N(IPCOMP_SUP) SA No KE TSi TSr ] 
09:50:49 charon:  11[IKE] received IPCOMP_SUPPORTED notify 
09:50:49 charon:  11[CHD]   using AES_CBC for encryption 
09:50:49 charon:  11[CHD]   using HMAC_SHA1_96 for integrity 
09:50:49 charon:  11[CHD] adding inbound ESP SA 
09:50:49 charon:  11[CHD]   SPI 0xce1d4d5e, src 10.1.100.178 dst 10.1.100.254 
09:50:49 charon:  11[KNL] adding SAD entry with SPI 0000cb33 and reqid {1}  (mark 0/0x00000000) 
09:50:49 charon:  11[KNL]   using compression algorithm IPCOMP_DEFLATE 
09:50:49 charon:  11[KNL] adding SAD entry with SPI ce1d4d5e and reqid {1}  (mark 0/0x00000000) 
09:50:49 charon:  11[KNL]   using encryption algorithm AES_CBC with key size 128 
09:50:49 charon:  11[KNL]   using integrity algorithm HMAC_SHA1_96 with key size 160 
09:50:49 charon:  11[KNL]   using replay window of 32 packets 
09:50:49 charon:  11[CHD] adding outbound ESP SA 
09:50:49 charon:  11[CHD]   SPI 0xca7c0116, src 10.1.100.254 dst 10.1.100.178 
09:50:49 charon:  11[KNL] adding SAD entry with SPI 000015f8 and reqid {1}  (mark 0/0x00000000) 
09:50:49 charon:  11[KNL]   using compression algorithm IPCOMP_DEFLATE 
09:50:49 charon:  11[KNL] adding SAD entry with SPI ca7c0116 and reqid {1}  (mark 0/0x00000000) 
09:50:49 charon:  11[KNL]   using encryption algorithm AES_CBC with key size 128 
09:50:49 charon:  11[KNL]   using integrity algorithm HMAC_SHA1_96 with key size 160 
09:50:49 charon:  11[KNL]   using replay window of 32 packets 
09:50:49 charon:  11[KNL] policy 10.2.34.0/24 === 10.1.34.0/24 out  (mark 0/0x00000000) already exists, increasing refcount 
09:50:49 charon:  11[KNL] policy 10.1.34.0/24 === 10.2.34.0/24 in  (mark 0/0x00000000) already exists, increasing refcount 
09:50:49 charon:  11[KNL] policy 10.1.34.0/24 === 10.2.34.0/24 fwd  (mark 0/0x00000000) already exists, increasing refcount 
09:50:49 charon:  11[KNL] policy 10.2.34.0/24 === 10.1.34.0/24 out  (mark 0/0x00000000) already exists, increasing refcount 
09:50:49 charon:  11[KNL] updating policy 10.2.34.0/24 === 10.1.34.0/24 out  (mark 0/0x00000000) 
09:50:49 charon:  11[KNL] policy 10.1.34.0/24 === 10.2.34.0/24 in  (mark 0/0x00000000) already exists, increasing refcount 
09:50:49 charon:  11[KNL] updating policy 10.1.34.0/24 === 10.2.34.0/24 in  (mark 0/0x00000000) 
09:50:49 charon:  11[KNL] policy 10.1.34.0/24 === 10.2.34.0/24 fwd  (mark 0/0x00000000) already exists, increasing refcount 
09:50:49 charon:  11[KNL] updating policy 10.1.34.0/24 === 10.2.34.0/24 fwd  (mark 0/0x00000000) 
09:50:49 charon:  11[KNL] getting a local address in traffic selector 10.2.34.0/24 
09:50:49 charon:  11[KNL] using host 10.2.34.1 
09:50:49 charon:  11[KNL] using 10.1.100.178 as nexthop to reach 10.1.100.178/32 
09:50:49 charon:  11[KNL] 10.1.100.254 is on interface red0 
09:50:49 charon:  11[IKE] CHILD_SA VpnFrankfurtEssenTest{1} established with SPIs ce1d4d5e_i ca7c0116_o and TS 10.2.34.0/24 === 10.1.34.0/24  
09:50:49 charon:  11[IKE] CHILD_SA VpnFrankfurtEssenTest{1} established with SPIs ce1d4d5e_i ca7c0116_o and TS 10.2.34.0/24 === 10.1.34.0/24  
09:50:49 charon:  11[IKE] CHILD_SA rekey collision won, deleting old child 
09:50:49 charon:  11[IKE] reinitiating already active tasks 
09:50:49 charon:  11[IKE]   CHILD_REKEY task 
09:50:49 charon:  11[KNL] querying SAD entry with SPI c78986b4  (mark 0/0x00000000) 
09:50:49 charon:  11[KNL] querying SAD entry with SPI c5afd499  (mark 0/0x00000000) 
09:50:49 charon:  11[IKE] closing CHILD_SA VpnFrankfurtEssenTest{1} with SPIs c78986b4_i (0 bytes) c5afd499_o (0 bytes) and TS 10.2.34.0/24 === 10.1.34.0/24  
09:50:49 charon:  11[IKE] closing CHILD_SA VpnFrankfurtEssenTest{1} with SPIs c78986b4_i (0 bytes) c5afd499_o (0 bytes) and TS 10.2.34.0/24 === 10.1.34.0/24  
09:50:49 charon:  11[IKE] sending DELETE for ESP CHILD_SA with SPI c78986b4 
09:50:49 charon:  11[ENC] generating INFORMATIONAL request 6 [ D ] 
09:50:49 charon:  11[NET] sending packet: from 10.1.100.254[4500] to 10.1.100.178[4500] (76 bytes) 
09:50:49 charon:  13[NET] received packet: from 10.1.100.178[4500] to 10.1.100.254[4500] (76 bytes) 
09:50:49 charon:  13[ENC] parsed INFORMATIONAL request 20 [ D ] 
09:50:49 charon:  13[IKE] received DELETE for ESP CHILD_SA with SPI c4a8cd3a 
09:50:49 charon:  13[KNL] querying SAD entry with SPI c950c19c  (mark 0/0x00000000) 
09:50:49 charon:  13[KNL] querying SAD entry with SPI c4a8cd3a  (mark 0/0x00000000) 
09:50:49 charon:  13[IKE] closing CHILD_SA VpnFrankfurtEssenTest{1} with SPIs c950c19c_i (0 bytes) c4a8cd3a_o (0 bytes) and TS 10.2.34.0/24 === 10.1.34.0/24  
09:50:49 charon:  13[IKE] closing CHILD_SA VpnFrankfurtEssenTest{1} with SPIs c950c19c_i (0 bytes) c4a8cd3a_o (0 bytes) and TS 10.2.34.0/24 === 10.1.34.0/24  
09:50:49 charon:  13[IKE] sending DELETE for ESP CHILD_SA with SPI c950c19c 
09:50:49 charon:  13[IKE] CHILD_SA closed 
09:50:49 charon:  13[CHD] running updown script: 2>&1 PLUTO_VERSION='1.1' PLUTO_VERB='down-client' PLUTO_CONNECTION='VpnFrankfurtEssenTest' PLUTO_INTERFACE='red0' PLUTO_REQID='1' PLUTO_PROTO='esp' 
PLUTO_UNIQUEID='1' PLUTO_ME='10.1.100.254' PLUTO_MY_ID='10.1.100.254' PLUTO_MY_CLIENT='10.2.34.0/24' PLUTO_MY_PORT='0' PLUTO_MY_PROTOCOL='0' PLUTO_PEER='10.1.100.178' PLUTO_PEER_ID='10.1.100.178' 
PLUTO_PEER_CLIENT='10.1.34.0/24' PLUTO_PEER_PORT='0' PLUTO_PEER_PROTOCOL='0' PLUTO_IPCOMP='1' /usr/libexec/ipsec/_updown 

Other host:

09:50:47 charon:  10[IKE] detected CHILD_REKEY collision with CHILD_REKEY 
09:50:47 charon:  10[ENC] generating CREATE_CHILD_SA response 5 [ N(IPCOMP_SUP) SA No KE TSi TSr ] 
09:50:47 charon:  10[NET] sending packet: from 10.1.100.178[4500] to 10.1.100.254[4500] (476 bytes) 
09:50:47 charon:  13[NET] received packet: from 10.1.100.254[4500] to 10.1.100.178[4500] (476 bytes) 
09:50:47 charon:  13[ENC] parsed CREATE_CHILD_SA response 19 [ N(IPCOMP_SUP) SA No KE TSi TSr ] 
09:50:48 charon:  13[IKE] CHILD_SA VpnEssenFrankfurtTest{5} established with SPIs c4a8cd3a_i c950c19c_o and TS 10.1.34.0/24 === 10.2.34.0/24  
09:50:48 charon:  13[IKE] CHILD_SA VpnEssenFrankfurtTest{5} established with SPIs c4a8cd3a_i c950c19c_o and TS 10.1.34.0/24 === 10.2.34.0/24  
09:50:48 charon:  13[IKE] CHILD_SA rekey collision lost, deleting rekeyed child 
09:50:48 charon:  13[IKE] closing CHILD_SA VpnEssenFrankfurtTest{5} with SPIs c4a8cd3a_i (0 bytes) c950c19c_o (0 bytes) and TS 10.1.34.0/24 === 10.2.34.0/24  
09:50:48 charon:  13[IKE] closing CHILD_SA VpnEssenFrankfurtTest{5} with SPIs c4a8cd3a_i (0 bytes) c950c19c_o (0 bytes) and TS 10.1.34.0/24 === 10.2.34.0/24  
09:50:48 charon:  13[IKE] sending DELETE for ESP CHILD_SA with SPI c4a8cd3a 
09:50:48 charon:  13[ENC] generating INFORMATIONAL request 20 [ D ] 
09:50:48 charon:  13[NET] sending packet: from 10.1.100.178[4500] to 10.1.100.254[4500] (76 bytes) 
09:50:48 charon:  03[NET] received packet: from 10.1.100.254[4500] to 10.1.100.178[4500] (76 bytes) 
09:50:48 charon:  03[ENC] parsed INFORMATIONAL request 6 [ D ] 
09:50:48 charon:  03[IKE] received DELETE for ESP CHILD_SA with SPI c78986b4 
09:50:48 charon:  03[IKE] closing CHILD_SA VpnEssenFrankfurtTest{5} with SPIs c5afd499_i (0 bytes) c78986b4_o (0 bytes) and TS 10.1.34.0/24 === 10.2.34.0/24  
09:50:48 charon:  03[IKE] closing CHILD_SA VpnEssenFrankfurtTest{5} with SPIs c5afd499_i (0 bytes) c78986b4_o (0 bytes) and TS 10.1.34.0/24 === 10.2.34.0/24  
09:50:48 charon:  03[IKE] sending DELETE for ESP CHILD_SA with SPI c5afd499 
09:50:48 charon:  03[IKE] CHILD_SA closed 
09:50:48 charon:  03[IKE] detected CHILD_REKEY collision with CHILD_DELETE 
09:50:48 charon:  03[ENC] generating INFORMATIONAL response 6 [ D ] 
09:50:48 charon:  03[NET] sending packet: from 10.1.100.178[4500] to 10.1.100.254[4500] (76 bytes) 
09:50:48 charon:  05[NET] received packet: from 10.1.100.254[4500] to 10.1.100.178[4500] (76 bytes) 
09:50:48 charon:  05[ENC] parsed INFORMATIONAL response 20 [ D ] 
09:50:48 charon:  05[IKE] received DELETE for ESP CHILD_SA with SPI c950c19c 
09:50:48 charon:  05[IKE] CHILD_SA closed 

Although I am not a programmer, I dug around in the source code and might have found a relevant place. Maybe it's not working right here.

File: child_delete.c

static status_t destroy_and_reestablish(private_child_delete_t *this)
[...]
        /* signal child down event if we are not rekeying */
        if (!this->rekeyed)
        {
            charon->bus->child_updown(charon->bus, child_sa, FALSE);
        }
[...]

If you need any more information, I will be glad to provide. Thank you!


Related issues

Has duplicate Issue #959: Invalid call to ipsec _updown after session rekey when there is a rekey collisionClosed17.05.2015

Associated revisions

Revision 7f2a20a4 (diff)
Added by Tobias Brunner about 4 years ago

child-rekey: Suppress updown event when deleting redundant CHILD_SAs

When handling a rekey collision we might have to delete an already
installed redundant CHILD_SA (or expect the other peer to do so).
We don't want to trigger updown events for these as neither do we do
so for successfully rekeyed CHILD_SAs.

Fixes #853.

History

#1 Updated by Martin Willi almost 5 years ago

  • Has duplicate Issue #959: Invalid call to ipsec _updown after session rekey when there is a rekey collision added

#2 Updated by Tom Rymes over 4 years ago

Are there any updates on this issue? We were bitten by this yesterday and I am looking to get confirmation that this is/is-not a problem with Strongswan. This has been discussed a little on the IPFire forum as well: http://forum.ipfire.org/viewtopic.php?f=27&t=12628

#3 Updated by Tobias Brunner over 4 years ago

  • Tracker changed from Issue to Bug
  • Status changed from New to Feedback

Please try the patch in the 853-updown-rekey-collision branch. It tries to suppress the updown event during rekeying in case an installed redundant CHILD_SA is deleted.

#4 Updated by Tom Rymes over 4 years ago

With help from The IPFire development team, I have been able to test the patch and it does appear to have fixed the issue, as I have had a tunnel up for nearly 24-hours and hundreds of collisions that is still working. Thank you!

#5 Updated by Andreas Braun over 4 years ago

Tom Rymes wrote:

With help from The IPFire development team, I have been able to test the patch and it does appear to have fixed the issue, as I have had a tunnel up for nearly 24-hours and hundreds of collisions that is still working. Thank you!

Hey Tom, good to see some progress on this topic. Can you tell me how to test the patch with IPFire?

Thank you.

#6 Updated by Tom Rymes over 4 years ago

Andreas: I opened Bug 10998 on the IPFire Bugzilla here: https://bugzilla.ipfire.org/show_bug.cgi?id=10998

Michael posted a patched version of Strongswan there, along with instructions to install.

#7 Updated by Tobias Brunner over 4 years ago

  • Category set to libcharon
  • Target version set to 5.4.0

With help from The IPFire development team, I have been able to test the patch and it does appear to have fixed the issue, as I have had a tunnel up for nearly 24-hours and hundreds of collisions that is still working. Thank you!

Great, thanks for testing. I'll line this up for the next release.

#8 Updated by Tobias Brunner about 4 years ago

  • Status changed from Feedback to Closed
  • Assignee set to Tobias Brunner
  • Resolution set to Fixed

Also available in: Atom PDF