Issue #422

xfrm policy install failure

Added by Piyush Patel almost 6 years ago. Updated about 3 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: kernel-interface
Affected version: 5.1.0
Resolution: No feedback

Description

This was observed on 5.1.0. I noticed log messages of the following type:

2013-09-27 09:24:10 05[CFG] unable to install policy 192.168.65.198/32 === 10.20.0.1/32 out (mark 0/0x00000000) for reqid 4, the same policy for reqid 1 exists

I was able to narrow it down to the following scenario:

- user on device#1 makes connection#1
- device#1 goes away without disconnecting
- same user connects on device#2
- device#2 disconnects
- user goes back on device#1 and makes connection#2
- device#1 goes away without disconnecting
- same user connects on device#2 and I see the above mentioned logs.

I'm guessing that because device#1 doesn't disconnect cleanly, its xfrm policies linger. It seems the IP assigned to device#1 gets recycled, and when device#2 receives that IP the xfrm policy can't be installed.
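
The suspected failure can be modelled in a few lines. The following Python sketch is a toy model and not strongSwan code: `PolicyTable`, `install` and `PolicyConflict` are hypothetical names, and the rule that a selector can be owned by only one reqid is an assumption inferred from the log message above.

```python
class PolicyConflict(Exception):
    """Raised when a selector is already owned by a different reqid."""


class PolicyTable:
    """Toy model: each (src, dst, direction) selector is owned by one reqid."""

    def __init__(self):
        self._owner = {}

    def install(self, src, dst, direction, reqid):
        key = (src, dst, direction)
        owner = self._owner.get(key)
        if owner is not None and owner != reqid:
            raise PolicyConflict(
                f"unable to install policy {src} === {dst} {direction} "
                f"for reqid {reqid}, the same policy for reqid {owner} exists"
            )
        self._owner[key] = reqid

    def remove(self, src, dst, direction):
        self._owner.pop((src, dst, direction), None)


table = PolicyTable()

# device#1 is assigned 10.20.0.1 and disappears without a clean teardown,
# so its policy is never removed.
table.install("192.168.65.198/32", "10.20.0.1/32", "out", reqid=1)

# The recycled address is later handed to device#2 under a different reqid.
try:
    table.install("192.168.65.198/32", "10.20.0.1/32", "out", reqid=4)
    conflict = None
except PolicyConflict as exc:
    conflict = str(exc)

print(conflict)
```

If the stale policy were removed (here via `remove`) before the address is reassigned, the second `install` would succeed in this model.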

I'm not sure whether this is a bug or whether there is a configuration option that resolves the issue. At some level the system knows enough to recycle the IP address, so it could in theory clean up the corresponding policy as well.

I'm attaching the logs I captured on strongSwan. I've captured each connection and teardown, each followed by the output of the ipsec status, ipsec leases and ip xfrm policy commands. You can search for "=========" to find them.

The configuration on the server is:
config setup
    uniqueids=never

# test
conn test
    authby=xauthpsk
    xauth=server
    xauth_identity=test
    keyexchange=ikev1 #######
    left=%defaultroute #######
    right=%any
    rightsourceip=10.20.0.0/24 #######
    auto=add #########

The configuration on the client is:
config setup
    uniqueids = never

conn test
    authby=xauthpsk
    keyexchange=ikev1
    xauth=client
    xauth_identity=test ##################
    left=%defaultroute
    leftsourceip=%config ##################
    #right=172.16.144.27
    right=192.168.65.198
    auto=add ###################

policy_error.log (41.9 KB) Piyush Patel, 27.09.2013 20:33

Associated revisions

Revision 85489d39
Added by Martin Willi over 4 years ago

Merge branch 'reqid-alloc'

With these changes, charon dynamically allocates reqids for CHILD_SAs. This
allows the reuse of reqids for identical policies, and basically allows multiple
CHILD_SAs with the same selectors. As reqids do not uniquely define a CHILD_SA,
a new unique identifier for CHILD_SAs is introduced, and the kernel backends
use a proto/dst/SPI tuple to identify CHILD_SAs.

charon-tkm is not yet updated and expires are actually broken with this merge.
As some significant refactorings are required, this is fixed using a separate
merge.

References #422, #431, #463.
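
To illustrate why a reqid alone no longer identifies a CHILD_SA after this merge, here is a minimal Python sketch. It is not taken from the strongSwan sources: `ChildSaKey` and `SaTable` are hypothetical names, and the addresses and SPI values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChildSaKey:
    """Hypothetical lookup key mirroring the proto/dst/SPI tuple."""
    proto: str  # e.g. "esp"
    dst: str    # destination address of the SA
    spi: int    # security parameter index


class SaTable:
    """Toy SA table: SAs are keyed by the tuple, the reqid is just an attribute."""

    def __init__(self):
        self._reqid_by_key = {}

    def add(self, key, reqid):
        if key in self._reqid_by_key:
            raise ValueError(f"duplicate SA {key}")
        self._reqid_by_key[key] = reqid

    def keys_for_reqid(self, reqid):
        return [k for k, r in self._reqid_by_key.items() if r == reqid]


sas = SaTable()
# Two SAs for identical selectors share reqid 1 but differ in their SPI.
sas.add(ChildSaKey("esp", "192.0.2.10", 0xC0000001), reqid=1)
sas.add(ChildSaKey("esp", "192.0.2.10", 0xC0000002), reqid=1)

shared = sas.keys_for_reqid(1)
print(len(shared))
```

Looking up by reqid returns both entries, whereas each tuple still names exactly one SA.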

Revision 94eb09ac
Added by Martin Willi over 4 years ago

Merge branch 'reqid-alloc'


History

#1 Updated by Martin Willi almost 6 years ago

Hi,

Security Associations (1 up, 0 connecting):
        test[3]: ESTABLISHED 10 seconds ago, 192.168.65.198[192.168.65.198]...192.168.65.208[192.168.65.208]
        test{1}:  INSTALLED, TUNNEL, ESP SPIs: c8a9e7a9_i c3327e82_o
        test{1}:   192.168.65.198/32 === 10.20.0.1/32
        test{3}:  INSTALLED, TUNNEL, ESP SPIs: cb058e4c_i c14ff77d_o
        test{3}:   192.168.65.198/32 === 10.20.0.2/32

The problem arises from the fact that we don't know whether the client is re-authenticating the ISAKMP_SA or creating a new one. As the connection is coming from the same host, we have to assume that this is a re-authentication attempt. Therefore we migrate the old Quick Mode to the new ISAKMP_SA.

Because the client is creating a Quick Mode for the same selectors, the policies clash and installation fails.

While we could check for any virtual IPs in the selectors before taking over Quick Modes during ISAKMP_SA re-authentication, this is very difficult: that migration process is (and must be) triggered before the Mode Config exchange.

Another idea would be to use the INITIAL_CONTACT notify, but it seems that we currently don't send it as a client.

Maybe we'd have to rethink reqid allocation and reuse them more aggressively to avoid this kind of issue. Not sure yet.

Regards
Martin

#2 Updated by Piyush Patel almost 6 years ago

Thanks for the update. I should have realized that hooking the policy cleanup into the place where the virtual IP is released would not be prudent.

#3 Updated by Tobias Brunner almost 6 years ago

  • Status changed from New to Feedback
  • Assignee set to Martin Willi

#4 Updated by Martin Willi over 4 years ago

> Maybe we'd have to rethink reqid allocation and reuse them more aggressively to avoid this kind of issue. Not sure yet.

With the referenced merge of the reqid-alloc branch to master, reqids get allocated globally for the negotiated selectors. This allows identical Quick Modes between hosts (and is actually used for overlapping IKEv2 reauthentication). It likely fixes your policy conflict issues; feedback welcome.

Regards
Martin
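
The idea of allocating reqids globally per set of negotiated selectors, as described in comment #4, can be sketched roughly as follows. This is an illustrative Python model only and not charon's implementation: `ReqidAllocator`, `alloc` and `release` are hypothetical names, and the reference counting is an assumption about how shared reqids might be tracked.

```python
class ReqidAllocator:
    """Toy allocator: identical selector sets share one reference-counted reqid."""

    def __init__(self, first=1):
        self._next = first
        self._entries = {}  # selector key -> [reqid, refcount]

    @staticmethod
    def _key(local_ts, remote_ts):
        # Order of the traffic selectors within a set does not matter here.
        return (frozenset(local_ts), frozenset(remote_ts))

    def alloc(self, local_ts, remote_ts):
        key = self._key(local_ts, remote_ts)
        entry = self._entries.get(key)
        if entry is None:
            entry = [self._next, 0]
            self._next += 1
            self._entries[key] = entry
        entry[1] += 1
        return entry[0]

    def release(self, local_ts, remote_ts):
        key = self._key(local_ts, remote_ts)
        entry = self._entries[key]
        entry[1] -= 1
        if entry[1] == 0:
            # Last user gone; this sketch does not recycle the number itself.
            del self._entries[key]


allocator = ReqidAllocator()
first = allocator.alloc(["192.168.65.198/32"], ["10.20.0.1/32"])
second = allocator.alloc(["192.168.65.198/32"], ["10.20.0.1/32"])
other = allocator.alloc(["192.168.65.198/32"], ["10.20.0.2/32"])
print(first, second, other)
```

In this model a second Quick Mode for the same selectors receives the same reqid as the first, so a policy check keyed on reqid would no longer see a conflict.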

#5 Updated by Tobias Brunner about 3 years ago

  • Category set to kernel-interface
  • Status changed from Feedback to Closed
  • Resolution set to No feedback
