Project

General

Profile

Bug #1421

IKEv1: "deleting redundant CHILD_SA" is triggering updown event

Added by Daniel Gollub over 4 years ago. Updated about 2 years ago.

Status:
Closed
Priority:
Normal
Category:
libcharon
Target version:
Start date:
20.04.2016
Due date:
Estimated time:
Affected version:
5.3.5
Resolution:
Fixed

Description

I spotted that with charon (strongswan 5.3.5) the deletion of redundant CHILD_SA triggers updown events:

Apr 20 10:04:41 vm-apr12vti-3 charon[21015]: 08[KNL] received a XFRM_MSG_EXPIRE 
Apr 20 10:04:41 vm-apr12vti-3 charon[21015]: 08[KNL] creating rekey job for CHILD_SA ESP/0xce28dadf/10.10.2.3
Apr 20 10:04:41 vm-apr12vti-3 charon[21015]: 08[MGR] checkout IKE_SA
Apr 20 10:04:41 vm-apr12vti-3 charon[21015]: 08[MGR] IKE_SA peer-10.10.2.2-tunnel-vti[2] successfully checked out
Apr 20 10:04:41 vm-apr12vti-3 charon[21015]: 08[IKE] <peer-10.10.2.2-tunnel-vti|2> deleting redundant CHILD_SA peer-10.10.2.2-tunnel-vti{15}
Apr 20 10:04:41 vm-apr12vti-3 charon[21015]: 08[IKE] <peer-10.10.2.2-tunnel-vti|2> queueing QUICK_DELETE task
Apr 20 10:04:41 vm-apr12vti-3 charon[21015]: 08[IKE] <peer-10.10.2.2-tunnel-vti|2> activating new tasks
Apr 20 10:04:41 vm-apr12vti-3 charon[21015]: 08[IKE] <peer-10.10.2.2-tunnel-vti|2>   activating QUICK_DELETE task
Apr 20 10:04:41 vm-apr12vti-3 charon[21015]: 08[KNL] <peer-10.10.2.2-tunnel-vti|2> querying SAD entry with SPI ce28dadf  (mark 2415919105/0xffffffff)
Apr 20 10:04:41 vm-apr12vti-3 charon[21015]: 05[JOB] watcher got notification, rebuilding
Apr 20 10:04:41 vm-apr12vti-3 charon[21015]: 05[JOB] watcher going to poll() 5 fds
Apr 20 10:04:41 vm-apr12vti-3 charon[21015]: 08[KNL] <peer-10.10.2.2-tunnel-vti|2> querying SAD entry with SPI c5b8d457  (mark 2415919105/0xffffffff)
Apr 20 10:04:41 vm-apr12vti-3 charon[21015]: 08[IKE] <peer-10.10.2.2-tunnel-vti|2> closing CHILD_SA peer-10.10.2.2-tunnel-vti{15} with SPIs ce28dadf_i (0 bytes) c5b8d457_o (1344 bytes) and TS 0.0.0.0/0 === 0.0.0.0/0
Apr 20 10:04:41 vm-apr12vti-3 custom-updown-script[21145]: UPDOWN: down-client: 10.10.2.2 0.0.0.0/0 == 10.10.2.2 -- 10.10.2.3 == 0.0.0.0/0

Commit 7f2a20a4f45dc1a1806aa4fd2ef39ce21d586ba4 is suppressing updown events for IKEv2. Should updown events also suppressed for IKEv1 in case of deletion of an redundant CHILD_SAs?

Associated revisions

Revision a01eb5e4 (diff)
Added by Tobias Brunner over 4 years ago

ikev1: Don't call updown hook etc. when deleting redundant CHILD_SAs

Fixes #1421.

History

#1 Updated by Tobias Brunner over 4 years ago

  • Tracker changed from Issue to Bug
  • Description updated (diff)
  • Status changed from New to Feedback
  • Target version set to 5.5.0

Commit 7f2a20a4f45dc1a1806aa4fd2ef39ce21d586ba4 is suppressing updown events for IKEv2. Should updown events also suppressed for IKEv1 in case of deletion of an redundant CHILD_SAs?

Yes, makes sense. I guess we could set the CHILD_SA to state CHILD_REKEYED before queuing the quick-delete task. I did so in the 1421-ikev1-delete-redundant branch.

#2 Updated by Daniel Gollub over 4 years ago

Tobias Brunner wrote:

Commit 7f2a20a4f45dc1a1806aa4fd2ef39ce21d586ba4 is suppressing updown events for IKEv2. Should updown events also suppressed for IKEv1 in case of deletion of an redundant CHILD_SAs?

Yes, makes sense. I guess we could set the CHILD_SA to state CHILD_REKEYED before queuing the quick-delete task. I did so in the 1421-ikev1-delete-redundant branch.

Verified the proposed fix on that branch. updown/down event is gone with that patch in place.

Thanks!

#3 Updated by Tobias Brunner over 4 years ago

  • Status changed from Feedback to Closed
  • Assignee set to Tobias Brunner
  • Resolution set to Fixed

#4 Updated by Daniel Gollub about 2 years ago

I just run in a different issue which makes me wonder if NOT calling updown hooks on redundant CHILD_SAs is actually the right thing.

The custom updown script is doing some reference counting and that counting got off, due to the suppressed updown event since one of the CHILD_SA got marked as redundant.
I just hit this with IKEv1 (which is obviously not the right thing either. But I assume the behavior of updown should be sort of consistent)

Artificial test environment with very low lifetime, IKEv1, introducing overlapping CHILD_SAs:

i: Jun 12 22:00:44 R1 charon[5258]: 12[IKE] <peer-1.1.1.7-tunnel-0|1> CHILD_SA peer-1.1.1.7-tunnel-0{1} established with SPIs cf1ea216_i c6096812_o and TS 192.168.68.0/24 === 192.168.70.0/24
child-updown:up, refcount:1

i: Jun 12 22:00:45 R1 charon[5258]: 15[IKE] <peer-1.1.1.7-tunnel-0|1> CHILD_SA peer-1.1.1.7-tunnel-0{2} established with SPIs c5776796_i cc6f3811_o and TS 192.168.68.0/24 === 192.168.70.0/24
child-updown:up, refcount:2

r: Jun 12 22:00:56 R1 charon[5258]: 04[IKE] <peer-1.1.1.7-tunnel-0|2> detected rekeying of CHILD_SA peer-1.1.1.7-tunnel-0{1}
r: Jun 12 22:00:56 R1 charon[5258]: 14[IKE] <peer-1.1.1.7-tunnel-0|2> CHILD_SA peer-1.1.1.7-tunnel-0{3} established with SPIs c516f834_i cb6ec996_o and TS 192.168.68.0/24 === 192.168.70.0/24
no refcount update, due to "detected rekeying" 

r: Jun 12 22:00:59 R1 charon[5258]: 04[IKE] <peer-1.1.1.7-tunnel-0|1> CHILD_SA peer-1.1.1.7-tunnel-0{4} established with SPIs ce287bbb_i c96c2fe4_o and TS 192.168.68.0/24 === 192.168.70.0/24
child-updown: up, refcount:3

i: Jun 12 22:01:03 R1 charon[5258]: 10[KNL] creating rekey job for CHILD_SA ESP/0xc516f834/2.2.2.104
i: Jun 12 22:01:03 R1 charon[5258]: 10[IKE] <peer-1.1.1.7-tunnel-0|2> deleting redundant CHILD_SA peer-1.1.1.7-tunnel-0{3}
no refcount update, due to "detected rekeying" 

i: closing CHILD_SA peer-1.1.1.7-tunnel-0{4} with SPIs ce287bbb_i (0 bytes) c96c2fe4_o (0 bytes) and TS 192.168.68.0/24 === 192.168.70.0/24
child-updown: down, refcount:2

r: Jun 12 22:01:16 R1 charon[5258]: 13[IKE] <peer-1.1.1.7-tunnel-0|2> detected rekeying of CHILD_SA peer-1.1.1.7-tunnel-0{2}
r: Jun 12 22:01:16 R1 charon[5258]: 10[IKE] <peer-1.1.1.7-tunnel-0|2> CHILD_SA peer-1.1.1.7-tunnel-0{5} established with SPIs c8f06b88_i c7439a43_o and TS 192.168.68.0/24 === 192.168.70.0/24
no refcount update, due to "detected rekeying" 

r: Jun 12 22:02:15 R1 charon[5258]: 08[IKE] <peer-1.1.1.7-tunnel-0|2> detected rekeying of CHILD_SA peer-1.1.1.7-tunnel-0{5}
r: Jun 12 22:02:15 R1 charon[5258]: 05[IKE] <peer-1.1.1.7-tunnel-0|2> CHILD_SA peer-1.1.1.7-tunnel-0{6} established with SPIs c6847e5c_i c8981cbc_o and TS 192.168.68.0/24 === 192.168.70.0/24
no refcount update, due to "detected rekeying" 

i: Jun 12 22:02:31 R1 charon[5258]: 05[IKE] <peer-1.1.1.7-tunnel-0|2> CHILD_SA peer-1.1.1.7-tunnel-0{7} established with SPIs caec7f41_i c4b80951_o and TS 192.168.68.0/24 === 192.168.70.0/24
child-updown: down, refcount:3

i: Jun 12 22:02:49 R1 charon[5258]: 04[IKE] <peer-1.1.1.7-tunnel-0|2> CHILD_SA peer-1.1.1.7-tunnel-0{8} established with SPIs ccd00bee_i c0d351ad_o and TS 192.168.68.0/24 === 192.168.70.0/24
child-updown: down, refcount:4

i: Jun 12 22:03:19 R1 charon[5258]: 15[IKE] <peer-1.1.1.7-tunnel-0|2> closing CHILD_SA peer-1.1.1.7-tunnel-0{1} with SPIs cf1ea216_i (0 bytes) c6096812_o (0 bytes) and TS 192.168.68.0/24 === 192.168.70.0/24
no refcount updown. already rekeyed.

i: Jun 12 22:03:19 R1 charon[5258]: 15[IKE] <peer-1.1.1.7-tunnel-0|2> closing CHILD_SA peer-1.1.1.7-tunnel-0{2} with SPIs c5776796_i (0 bytes) cc6f3811_o (0 bytes) and TS 192.168.68.0/24 === 192.168.70.0/24
no refcount updown. already rekeyed.

i: Jun 12 22:03:19 R1 charon[5258]: 15[IKE] <peer-1.1.1.7-tunnel-0|2> closing CHILD_SA peer-1.1.1.7-tunnel-0{3} with SPIs c516f834_i (0 bytes) cb6ec996_o (0 bytes) and TS 192.168.68.0/24 === 192.168.70.0/24
no refcount updown. already rekeyed -> redundant

i: Jun 12 22:03:19 R1 charon[5258]: 15[IKE] <peer-1.1.1.7-tunnel-0|2> closing CHILD_SA peer-1.1.1.7-tunnel-0{5} with SPIs c8f06b88_i (0 bytes) c7439a43_o (0 bytes) and TS 192.168.68.0/24 === 192.168.70.0/24
no refcount updown. already rekeyed.

i: Jun 12 22:03:19 R1 charon[5258]: 15[IKE] <peer-1.1.1.7-tunnel-0|2> closing CHILD_SA peer-1.1.1.7-tunnel-0{6} with SPIs c6847e5c_i (0 bytes) c8981cbc_o (0 bytes) and TS 192.168.68.0/24 === 192.168.70.0/24
child-updown: down, refcount:3

i: Jun 12 22:03:19 R1 charon[5258]: 15[IKE] <peer-1.1.1.7-tunnel-0|2> closing CHILD_SA peer-1.1.1.7-tunnel-0{7} with SPIs caec7f41_i (0 bytes) c4b80951_o (0 bytes) and TS 192.168.68.0/24 === 192.168.70.0/24
child-updown: down, refcount:2

i: Jun 12 22:03:19 R1 charon[5258]: 15[IKE] <peer-1.1.1.7-tunnel-0|2> closing CHILD_SA peer-1.1.1.7-tunnel-0{8} with SPIs ccd00bee_i (0 bytes) c0d351ad_o (0 bytes) and TS 192.168.68.0/24 === 192.168.70.0/24
child-updown: down, refcount:1

Rekeying associations:
1 => 3
2 => 5 => 6
4 up/down without getting rekeyed
7 up/down without getting rekeyed
8 up/down without getting rekeyed

charon.delete_rekeyed is not set, but should be also not relevant to updown triggers.

In this artificial sample I guess the expected behavior should be that during termination, the actual CHILD_SA deletion of: 8, 7, 6, 4 and 3 (the redundant one) should result in an updown DOWN event.

Originally when I reported this for IKEv1, I was under the impression that there should be no IKE protocol version specific difference. I wonder if IKEv2 and 7f2a20a4f45dc1a1806aa4fd2ef39ce21d586ba4 are facing the same problem (trying to reproduce that for IKEv2 now).

#5 Updated by Tobias Brunner about 2 years ago

Rekeying associations:
1 => 3
2 => 5 => 6
4 up/down without getting rekeyed
7 up/down without getting rekeyed
8 up/down without getting rekeyed

Looks like there are redundant CHILD_SAs in the first place here (1 and 2), which might negatively affect this. And what version did you try? There were some changes to this since you originally created the ticket (e.g. e873544080).

I wonder if IKEv2 and 7f2a20a4f45dc1a1806aa4fd2ef39ce21d586ba4 are facing the same problem (trying to reproduce that for IKEv2 now).

No, IKEv2 has proper CHILD_SA rekeying with collision handling. We know which CHILD_SA is rekeyed (it's noted in the CREATE_CHILD_SA request) and how potential collisions (if both peers rekey the same CHILD_SA concurrently) have to be resolved (i.e. which peer deletes which of the CHILD_SAs).

#6 Updated by Daniel Gollub about 2 years ago

Yes, there is initially a redundant CHILD_SA, caused by some custom daemon interfacing with VICI. Which is a flaw for sure. I wonder if in some corner cases (collisions, re-transmits combined with collisions) something like this could happen with well-working VICI/stroke applications.

The commit message of e873544080 looks promising.

This testing was performed on strongswan 5.5.1 (+patches). I have 5.6.3 builds available as well, will try to reproduce the original problem wit that.

Thanks for the pointer on e873544080 and the clarification on the expected IKEv2 behavior.

Also available in: Atom PDF