Issue #3633

Connection retries whenever a problem occurs

Added by Clément Speybrouck 5 days ago. Updated 4 days ago.

Status: Closed
Priority: Normal
Category: configuration
Affected version: 5.7.2
Resolution: No change required

Description

Hi !

I am installing servers for clients, and I need these servers to stay connected to my IPSEC server no matter what happens between the client and the server.
A reboot of the physical server should bring back the tunnel; if the connection is lost (e.g. because of a firewall), the client should time out and retry connecting; if my server has a problem, the client should notice it and try to reconnect.

I encountered one problem: in the Fortigate I can disconnect a client by bringing down the Phase 2 selectors for a particular client but my IPSEC client doesn't try to recreate the tunnel, making it inaccessible.

A bit of context: I'm using strongswan as the IPSEC Client on Debian 10 and a Fortigate as my IPSEC server.

Here's my ipsec.conf:

config setup

conn client1
    keyexchange=ikev2
    type=tunnel
    left=%any
    leftauth=pubkey
    leftsourceip=%config
    leftcert=/etc/ssl/certs/ipsec_client.crt
    right=XX.XX.XX.XX
    rightsubnet=10.0.0.0/8
    rightcert=/etc/ssl/certs/ipsec_server.crt
    closeaction=restart
    keyingtries=%forever
    auto=route
    dpdaction=clear
    dpddelay=5s
    dpdtimeout=10s

include /var/lib/strongswan/ipsec.conf.inc

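I found a workaround with a bash script which is called every minute:

#!/bin/bash
# $1: address of a host behind the tunnel, used as a reachability probe.
if ! /usr/bin/ping -c1 "$1" &>/dev/null
then
  # Probe failed: tear down the connection and reload the configuration.
  /usr/sbin/ipsec down client1
  /usr/sbin/ipsec reload
  # Send traffic again after the reload.
  /usr/bin/ping -c1 "$1" &>/dev/null
  exit 1
fi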

Can you tell me whether what I'm trying to do is feasible with strongSwan, or whether I should keep my workaround?

Thank you !

dpd_probe.png (159 KB) Clément Speybrouck, 20.11.2020 10:31
log_fortinet.png (230 KB) Clément Speybrouck, 20.11.2020 10:31
rekeying.PNG (289 KB) Clément Speybrouck, 20.11.2020 12:14

History

#1 Updated by Tobias Brunner 5 days ago

  • Status changed from New to Feedback

> I encountered one problem: in the Fortigate I can disconnect a client by bringing down the Phase 2 selectors for a particular client but my IPSEC client doesn't try to recreate the tunnel, making it inaccessible.

Does that not trigger a DELETE to the client, and there the closeaction? Or does that just not do anything on the IKE layer and is about the same as removing the IPsec SAs and policies in the Linux kernel via ip xfrm (i.e. without notifying the IKE daemon)? If so, this can't be detected on anything but the upper layers (like via ICMP).

Why do you call reload? (It's just generally a bad idea.)

#2 Updated by Clément Speybrouck 5 days ago

I'm not skilled with IPSEC, so here's the log after I bring down phase2:

Nov 19 16:11:20 server charon: 16[NET] received packet: from IP_SERVER[4500] to IP_CLIENT[4500] (80 bytes)
Nov 19 16:11:20 server charon: 16[ENC] parsed INFORMATIONAL request 0 [ D ]
Nov 19 16:11:20 server charon: 16[IKE] received DELETE for ESP CHILD_SA with SPI ca09d63a
Nov 19 16:11:20 server charon: 16[IKE] closing CHILD_SA client1{38} with SPIs cdd26491_i (168 bytes) ca09d63a_o (1008 bytes) and TS 10.16.91.2/32 === 10.0.0.0/8
Nov 19 16:11:20 server charon: 16[IKE] sending DELETE for ESP CHILD_SA with SPI cdd26491
Nov 19 16:11:20 server charon: 16[IKE] CHILD_SA closed
Nov 19 16:11:20 server charon: 16[IKE] establishing CHILD_SA client1{39} reqid 30
Nov 19 16:11:20 server charon: 16[ENC] generating CREATE_CHILD_SA request 2 [ SA No TSi TSr ]
Nov 19 16:11:20 server charon: 16[NET] sending packet: from IP_CLIENT[4500] to IP_SERVER[4500] (256 bytes)
Nov 19 16:11:20 server charon: 16[ENC] generating INFORMATIONAL response 0 [ D ]
Nov 19 16:11:20 server charon: 16[NET] sending packet: from IP_CLIENT[4500] to IP_SERVER[4500] (80 bytes)
Nov 19 16:11:20 server charon: 13[NET] received packet: from IP_SERVER[4500] to IP_CLIENT[4500] (80 bytes)
Nov 19 16:11:20 server charon: 13[ENC] parsed CREATE_CHILD_SA response 2 [ N(NO_PROP) ]
Nov 19 16:11:20 server charon: 13[IKE] received NO_PROPOSAL_CHOSEN notify, no CHILD_SA built
Nov 19 16:11:20 server charon: 13[IKE] failed to establish CHILD_SA, keeping IKE_SA

I am using "reload" but I can also use "restart". These are the only ways to recreate the tunnel after bringing down phase2.

#3 Updated by Tobias Brunner 5 days ago

> I'm not skilled with IPSEC, so here's the log after I bring down phase2:

OK, so a DELETE is sent. The problem is that the peer has some issue with the CREATE_CHILD_SA request that re-creates the CHILD_SA. Do you have a log of the peer?

Note that, in general, using closeaction with trap policies could lead to duplicates (if traffic concurrently hits the trap policy, the acquire from the kernel will trigger another CHILD_SA). It might be better to just let traffic (i.e. ICMP) initiate the SA again after it got closed by the peer.

> I am using "reload" but I can also use "restart". These are the only ways to recreate the tunnel after bringing down phase2.

Why? What's the problem otherwise?

#4 Updated by Clément Speybrouck 5 days ago

I don't have logs of my server at that moment, I'm sorry.

So I commented out "closeaction"; the log is different, but I still can't ping one of my assets.

Nov 19 16:47:41 server charon: 15[IKE] received DELETE for ESP CHILD_SA with SPI ca09d63d
Nov 19 16:47:41 server charon: 15[IKE] closing CHILD_SA client1{2} with SPIs ce26595e_i (168 bytes) ca09d63d_o (336 bytes) and TS 10.16.91.2/32 === 10.0.0.0/8
Nov 19 16:47:41 server charon: 15[IKE] sending DELETE for ESP CHILD_SA with SPI ce26595e
Nov 19 16:47:41 server charon: 15[IKE] CHILD_SA closed
Nov 19 16:47:41 server charon: 15[ENC] generating INFORMATIONAL response 0 [ D ]
Nov 19 16:47:41 server charon: 15[NET] sending packet: from IP_CLIENT[4500] to IP_SERVER[4500] (80 bytes)
Nov 19 16:49:14 server charon: 08[KNL] creating delete job for CHILD_SA ESP/0x00000000/IP_SERVER
Nov 19 16:49:14 server charon: 08[JOB] CHILD_SA ESP/0x00000000/IP_SERVER not found for delete
Nov 19 16:50:30 server charon: 13[KNL] creating acquire job for policy IP_CLIENT/32[udp/55491] === 10.13.2.1/32[udp/1025] with reqid {1}
Nov 19 16:50:30 server charon: 13[CFG] ignoring acquire, connection attempt pending

But because I need to have an always-up tunnel, wouldn't it be better to have:

auto=start
closeaction=restart
dpdaction=restart

Well my wording might be misleading: when I bring down the phase 2, the tunnel is still seen as established:

Status of IKE charon daemon (strongSwan 5.7.2, Linux 4.19.0-12-amd64, x86_64):
  uptime: 7 minutes, since Nov 19 16:46:17 2020
  malloc: sbrk 1617920, mmap 0, used 660368, free 957552
  worker threads: 11 of 16 idle, 5/0/0/0 working, job queue: 0/0/0/0, scheduled: 2
  loaded plugins: charon aes rc2 sha2 sha1 md5 mgf1 random nonce x509 revocation constraints pubkey pkcs1 pkcs7 pkcs8 pkcs12 pgp dnskey sshkey pem openssl fips-prf gmp agent xcbc hmac gcm attr kernel-netlink resolve socket-default connmark stroke updown counters
Listening IP addresses:
  IP_CLIENT
Connections:
      client1:  %any...IP_SERVER  IKEv2
      client1:   local:  [C=X, ST=X, L=X, O=X, OU=X, CN=X] uses public key authentication
      client1:    cert:  "C=X, ST=X, L=X, O=X, OU=X, CN=X" 
      client1:   remote: [C=X, ST=X, L=X, O=X, OU=X, CN=X] uses public key authentication
      client1:    cert:  "C=X, ST=X, L=X, O=X, OU=X, CN=X" 
      client1:   child:  dynamic === 10.0.0.0/8 TUNNEL
Routed Connections:
      client1{1}:  ROUTED, TUNNEL, reqid 1
      client1{1}:   IP_CLIENT/32 === 10.0.0.0/8
Security Associations (1 up, 0 connecting):
      client1[1]: ESTABLISHED 7 minutes ago, IP_CLIENT[C=X, ST=X, L=X, O=X, OU=X, CN=X]...IP_SERVER[C=X, ST=X, L=X, O=X, OU=X, CN=X]
      client1[1]: IKEv2 SPIs: 056de5439b32986c_i* c24fcdbad3f30db4_r, public key reauthentication in 2 hours
      client1[1]: IKE proposal: AES_CBC_256/HMAC_SHA2_256_128/PRF_HMAC_SHA2_256/MODP_2048

#5 Updated by Tobias Brunner 5 days ago

> I don't have logs of my server at that moment, I'm sorry.

Makes debugging this difficult. Does CHILD_SA rekeying work (you can manually initiate one via ipsec stroke rekey client1{})? If not, then the peer might expect a DH exchange for CHILD_SAs (which is not relevant for the CHILD_SA created with the IKE_AUTH exchange, see ExpiryRekey).

> So I commented out "closeaction"; the log is different, but I still can't ping one of my assets.
> [...]

That looks weird. If a CHILD_SA was successfully established for a previous acquire, you should not see the last message. However, if that failed with NO_PROPOSAL_CHOSEN that might explain it.
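
A minimal sketch of how the client log could be checked for that rejection. It only searches for the NO_PROPOSAL_CHOSEN line that charon logged in note 2; `child_sa_rejected` is a hypothetical helper name, and the log path in the usage comment is an assumption that depends on the local syslog setup.

```shell
#!/bin/bash
# Hypothetical helper (not a strongSwan command): succeeds if the charon log
# text read from stdin shows the peer rejecting a CHILD_SA proposal.
child_sa_rejected() {
  grep -q 'received NO_PROPOSAL_CHOSEN notify'
}

# Example call (the log path is an assumption):
#   child_sa_rejected < /var/log/syslog && echo "peer rejected the CHILD_SA proposal"
```

Reading from stdin keeps the check independent of where the daemon actually logs.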

> But because I need to have an always-up tunnel, wouldn't it be better to have:
> [...]

You could do that, but if fatal errors occur (e.g. authentication or proposal/traffic selector failures) the SA won't get recreated (as is the case here if the peer rejects the CREATE_CHILD_SA request). You'd need a script to avoid that (e.g. via vici). And without trap policies, you also have to install drop policies or firewall rules to avoid traffic leaks when no SA/policies exist.

> Well my wording might be misleading: when I bring down the phase 2, the tunnel is still seen as established:
> [...]

That's only the IKE_SA ([]); the CHILD_SAs would be listed below that ({}). You can see example output in the test results.
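
As a rough illustration of that distinction, the sketch below counts CHILD_SA lines (`name{N}:`) that are in the INSTALLED state and ignores IKE_SA lines (`name[N]:`). It assumes status text in the layout shown in this thread; `count_installed_child_sas` is a hypothetical helper, not part of strongSwan.

```shell
#!/bin/bash
# Hypothetical helper: reads `ipsec statusall`-style text on stdin and prints
# the number of CHILD_SA lines ("name{N}:") whose state is INSTALLED.
# IKE_SA lines ("name[N]:") and routed entries (ROUTED) do not match.
count_installed_child_sas() {
  # grep -c prints 0 but exits non-zero when nothing matches; "|| true" hides that status.
  grep -cE '^[[:space:]]*[^[:space:]{]+[{][0-9]+[}]:[[:space:]]+INSTALLED' || true
}

# Example call:
#   ipsec statusall | count_installed_child_sas
```

A result of 0 while the IKE_SA is still listed as ESTABLISHED corresponds to the situation described above: the IKE_SA is up, but there is no CHILD_SA to carry traffic.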

#6 Updated by Clément Speybrouck 4 days ago

Attached to this message (log_fortinet.png) is the log of the Fortigate when I bring down phase 2. (The missing IP addresses are my client's IP.)

A keepalive packet is sent at a regular interval, keeping the tunnel up. Moreover, I saw that at some point the Fortigate sends a DPD probe (dpd_probe.png).

I have issued "ipsec stroke rekey client1{}" but it does nothing.

> You could do that, but if fatal errors occur (e.g. authentication or proposal/traffic selector failures) the SA won't get recreated (as is the case here if the peer rejects the CREATE_CHILD_SA request). You'd need a script to avoid that (e.g. via vici). And without trap policies, you also have to install drop policies or firewall rules to avoid traffic leaks when no SA/policies exist.

Alright, I'll keep my current configuration then.

#7 Updated by Tobias Brunner 4 days ago

> Attached to this message (log_fortinet.png) is the log of the Fortigate when I bring down phase 2. (The missing IP addresses are my client's IP.)

The delete is not the interesting part. What matters is why the peer returns an error notify when the client tries to recreate the CHILD_SA with a CREATE_CHILD_SA request.

> A keepalive packet is sent at a regular interval, keeping the tunnel up. Moreover, I saw that at some point the Fortigate sends a DPD probe (dpd_probe.png).

Note that keepalives (len=1) are only used to keep NAT mappings alive if a device is behind a NAT. They are dropped by the other end (on Linux actually by the kernel, the IKE daemon never sees them). That's different from DPDs, which are actually acked by the peer (they are empty INFORMATIONAL messages that have to be replied to). However, DPDs only ensure that an IKE_SA exists; if you delete CHILD_SAs, the DPDs are completely unaffected (hence you need actual upper-layer protocols to ensure you have a usable IPsec tunnel).
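
The upper-layer check mentioned here could be sketched roughly as follows. `tunnel_usable` is a hypothetical helper, not a strongSwan command: it runs an arbitrary probe command a few times and treats the tunnel as usable only if one attempt succeeds. The address in the usage comment is a placeholder for a host behind the tunnel.

```shell
#!/bin/bash
# Hypothetical helper: returns 0 if the probe command succeeds within the
# given number of attempts, 1 otherwise.
# $1 = number of attempts, remaining arguments = probe command to run.
tunnel_usable() {
  attempts=$1
  shift
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # The exit status of the probe decides the result; its output is discarded.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
  done
  return 1
}

# Example call (10.0.0.1 is a placeholder address):
#   tunnel_usable 3 ping -c1 10.0.0.1 || echo "tunnel unusable"
```

With a routed connection (auto=route), the probe traffic itself matches the trap policy, so a failed probe may also be what initiates the CHILD_SA again.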

> I have issued "ipsec stroke rekey client1{}" but it does nothing.

There is no output by the tool. Did you check the log? It should initiate a CHILD_SA rekeying.

#8 Updated by Clément Speybrouck 4 days ago

> The delete is not the interesting part. What matters is why the peer returns an error notify when the client tries to recreate the CHILD_SA with a CREATE_CHILD_SA request.

Unfortunately, I don't have any more logs, from either the client or the server.

> There is no output by the tool. Did you check the log? It should initiate a CHILD_SA rekeying.

Here's the log:

Nov 20 10:11:24 server charon: 11[CFG] received stroke: rekey 'client1{}'

And that's it. I have no logs on the server.

#9 Updated by Tobias Brunner 4 days ago

> Here's the log:
> [...]

Nothing more? Like establishing CHILD_SA client1{.} reqid .? Is there actually a CHILD_SA (see ipsec statusall) before you use that command? Is the name correct?

#10 Updated by Clément Speybrouck 4 days ago

Protocol:
  1. Connect the tunnel
    Security Associations (1 up, 0 connecting):
          client1[1]: ESTABLISHED 6 seconds ago, IP_CLIENT[XXX]...IP_SERVER[XXX]
          client1[1]: IKEv2 SPIs: ce7a1892c1228d98_i* 605affcb87a6ca83_r, public key reauthentication in 2 hours
          client1[1]: IKE proposal: AES_CBC_256/HMAC_SHA2_256_128/PRF_HMAC_SHA2_256/MODP_2048
          client1{2}:  INSTALLED, TUNNEL, reqid 2, ESP in UDP SPIs: cf471e88_i ca09d9be_o
          client1{2}:  AES_CBC_256/HMAC_SHA2_256_128, 0 bytes_i, 252 bytes_o (3 pkts, 3s ago), rekeying in 44 minutes
          client1{2}:   10.16.91.2/32 === 10.0.0.0/8
    
  2. Phase 2 down
    Security Associations (1 up, 0 connecting):
          client1[1]: ESTABLISHED 3 minutes ago, IP_CLIENT[XXX]...IP_SERVER[XXX]
          client1[1]: IKEv2 SPIs: ce7a1892c1228d98_i* 605affcb87a6ca83_r, public key reauthentication in 2 hours
          client1[1]: IKE proposal: AES_CBC_256/HMAC_SHA2_256_128/PRF_HMAC_SHA2_256/MODP_2048
    
  3. Execute "ipsec stroke rekey client1{}"
    Nov 20 10:43:16 server charon: 13[CFG] received stroke: rekey 'client1{}'
    Nov 20 10:43:57 server charon: 15[NET] received packet: from IP_SERVER[4500] to IP_CLIENT[4500] (80 bytes)
    Nov 20 10:43:57 server charon: 15[ENC] parsed INFORMATIONAL request 3 [ ]
    Nov 20 10:43:57 server charon: 15[ENC] generating INFORMATIONAL response 3 [ ]
    Nov 20 10:43:57 server charon: 15[NET] sending packet: from IP_CLIENT[4500] to IP_SERVER[4500] (80 bytes)
    Nov 20 10:44:57 server charon: 14[NET] received packet: from IP_SERVER[4500] to IP_CLIENT[4500] (80 bytes)
    Nov 20 10:44:57 server charon: 14[ENC] parsed INFORMATIONAL request 4 [ ]
    Nov 20 10:44:57 server charon: 14[ENC] generating INFORMATIONAL response 4 [ ]
    

    The INFORMATIONAL requests are continuously received, even without the rekeying.

#11 Updated by Tobias Brunner 4 days ago

> 1. Connect the tunnel
>    [...]
> 2. Phase 2 down
>    [...]
> 3. Execute "ipsec stroke rekey client1{}"

No, no, you have to do that before closing the CHILD_SA. You can't rekey a CHILD_SA that doesn't exist.

> The INFORMATIONAL requests are continuously received, even without the rekeying.

Yes, as I said, DPD has nothing to do with the CHILD_SAs.

#12 Updated by Clément Speybrouck 4 days ago

My apologies, I didn't understand.
So here are the logs:

Client's log:

Nov 20 11:09:30 server charon: 16[IKE] establishing CHILD_SA client1{3} reqid 2
Nov 20 11:09:30 server charon: 16[ENC] generating CREATE_CHILD_SA request 2 [ N(REKEY_SA) SA No TSi TSr ]
Nov 20 11:09:30 server charon: 16[NET] sending packet: from IP_CLIENT[4500] to IP_SERVER[4500] (272 bytes)
Nov 20 11:09:30 server charon: 07[NET] received packet: from IP_SERVER[4500] to IP_CLIENT[4500] (80 bytes)
Nov 20 11:09:30 server charon: 07[ENC] parsed CREATE_CHILD_SA response 2 [ N(NO_PROP) ]
Nov 20 11:09:30 server charon: 07[IKE] received NO_PROPOSAL_CHOSEN notify, no CHILD_SA built
Nov 20 11:09:30 server charon: 07[IKE] failed to establish CHILD_SA, keeping IKE_SA
Nov 20 11:09:30 server charon: 07[IKE] CHILD_SA rekeying failed, trying again in 9 seconds
Nov 20 11:09:39 server charon: 08[IKE] establishing CHILD_SA client1{4} reqid 2
Nov 20 11:09:39 server charon: 08[ENC] generating CREATE_CHILD_SA request 3 [ N(REKEY_SA) SA No TSi TSr ]
Nov 20 11:09:39 server charon: 08[NET] sending packet: from IP_CLIENT[4500] to IP_SERVER[4500] (272 bytes)
Nov 20 11:09:39 server charon: 09[NET] received packet: from IP_SERVER[4500] to IP_CLIENT[4500] (80 bytes)
Nov 20 11:09:39 server charon: 09[ENC] parsed CREATE_CHILD_SA response 3 [ N(NO_PROP) ]
Nov 20 11:09:39 server charon: 09[IKE] received NO_PROPOSAL_CHOSEN notify, no CHILD_SA built
Nov 20 11:09:39 server charon: 09[IKE] failed to establish CHILD_SA, keeping IKE_SA
Nov 20 11:09:39 server charon: 09[IKE] CHILD_SA rekeying failed, trying again in 13 seconds

Server log attached (rekeying.PNG)

#13 Updated by Tobias Brunner 4 days ago

> Server log attached (rekeying.PNG)

Great, that confirms the issue I mentioned above (#3633#note-5). The peer is expecting a DH exchange (you can see its proposal listing two DH groups but not MODP_NONE or a second proposal without DH group). So please configure esp=aes256-sha256-modp2048 in order to send a proposal the peer will accept. That should fix rekeying and also the recreation of the CHILD_SA via closeaction=restart (but note that there could still be the issue of creating duplicate SAs if trap policies are also used, so maybe better use ICMPs to initiate it again).
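
A sketch of how the suggested proposal might be added to the conn section from the description. Only the esp line is new; closeaction=restart is repeated from the original configuration for context, and all other options are assumed to stay as they were.

```
# Only the lines relevant to the fix are shown; the other options of
# conn client1 from the description stay unchanged.
conn client1
    esp=aes256-sha256-modp2048
    closeaction=restart
```

The modp2048 suffix puts a DH group into the ESP proposal, which is what the peer expects in CREATE_CHILD_SA exchanges according to the explanation above.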

#14 Updated by Clément Speybrouck 4 days ago

Alright, everything restarts as wanted! Thank you very much!

#15 Updated by Tobias Brunner 4 days ago

  • Category set to configuration
  • Status changed from Feedback to Closed
  • Assignee set to Tobias Brunner
  • Resolution set to No change required

Clément Speybrouck wrote:

> Alright, everything restarts as wanted! Thank you very much!
