StrongSwan-sourced DPDs coming from wrong IP address
Background: We have a StrongSwan-based VPN gateway to which multiple remote road-warrior endpoints are connected. The Gateway is set up so that the external-facing interface has two IP addresses: one that is unique in the system, and one that is shared with a backup Gateway in a failover cluster. (If one node fails, ownership of the shared IP transfers to the remaining node.) "ip addr" shows the following for the interface when it's active:
inet 10.49.1.2/24 brd 10.49.1.255 scope global eth0
inet 10.49.1.10/24 scope global secondary eth0
The Gateway is behind a NATting firewall, which translates the private 10.49.1.10 address to a world-routable address. 10.49.1.2 is not translated.
The 10.49.1.10 address is the one that StrongSwan is configured to listen on, as set in the "left" parameter of the relevant connection profile in ipsec.conf.
When we connect, everything seems to be fine using the secondary address. StrongSwan lists it in its "listening on" list, and remote endpoints are able to connect to it. "ipsec statusall" lists the 10.49.1.10 address as the left endpoint of the IKE SAs. If the remote endpoint sends a DPD packet to the secondary address, it successfully receives a response sourced from that address.
However, we're seeing that when the gateway sends a DPD packet of its own, the source IP is the primary address (10.49.1.2) instead of the expected 10.49.1.10. This is causing problems, because the incorrectly-sourced DPD packet can't get past the intervening firewall and NAT step, so the remote endpoint never receives it. If no traffic comes from the endpoint before the timeout, the SA is cleared and the endpoint is disconnected.
Thanks in advance for any help you can give.
#1 Updated by Brian Pruss over 8 years ago
From the logs, it appears that StrongSwan is using the global routing table to source DPD packets instead of the SA's source IP. I found an entry in charon.log under KNL saying "getting address to reach <remote endpoint address>", and then several entries later saying that it's sending packets from 10.49.1.2.
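For anyone wanting to check what the kernel itself would choose, here is a minimal sketch that pulls the source address out of "ip route get" output. The helper name route_src and the peer address 203.0.113.5 are placeholders of mine, not taken from the logs, and this shows only the kernel's choice, which is not necessarily the exact lookup charon performs.

```shell
# Hypothetical helper: print the address that follows the "src" keyword
# in `ip route get` output read from stdin. Prints nothing if no src is shown.
route_src() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "src") { print $(i + 1); exit } }'
}

# On the gateway (203.0.113.5 stands in for a remote endpoint address):
#   ip route get 203.0.113.5 | route_src
```

If this prints the primary address rather than the shared one, the routing table is steering locally originated packets to the primary address.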
#3 Updated by Tobias Brunner over 8 years ago
- Status changed from New to Feedback
> I found an entry in charon.log under KNL saying "getting address to reach <remote endpoint address>", and then several entries later saying that it's sending packets from 10.49.1.2.
Before that message you should see why it did the lookup. In general, charon uses the addresses it already knows, unless a route, interface, or address change triggers a new route/source address lookup. Since 5.0.1 charon tries to keep the current source address if possible. In earlier releases, though, it simply used the first address that seemed usable (according to the routes returned by the kernel), which was not necessarily the same one. I suppose it's this behavior that causes problems here. If upgrading is not an option for you, tweaking the routing tables might work (using the charon.ignore_routing_tables option in strongswan.conf).
#4 Updated by Brian Pruss over 8 years ago
Thanks for the help you've provided. Just to put some closure on the issue, we ended up changing the routes to point the src attribute to the shared external address. This required some coordination with the clustering configuration, as the route can't be set when the specified source IP doesn't exist on the system.
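A rough sketch of that kind of route change, for reference. The next hop 10.49.1.1, the choice of the default route, and the function names are assumptions for illustration only; the actual routes and the clustering hooks will differ. The guard reflects the constraint mentioned above: the route is only touched when the shared address is currently present on the interface.

```shell
# Assumed values; adjust to the real setup.
SHARED_IP=10.49.1.10   # shared external address from the "ip addr" output
GATEWAY=10.49.1.1      # assumed next hop, not stated in this thread
DEV=eth0

# Succeeds if the given IPv4 address appears in `ip -4 -o addr show`
# output read from stdin.
has_addr() {
    grep -qF "inet $1/"
}

# Set the preferred source address on the default route, but only
# while this node owns the shared address.
set_route_src() {
    if ip -4 -o addr show dev "$DEV" | has_addr "$SHARED_IP"; then
        ip route replace default via "$GATEWAY" dev "$DEV" src "$SHARED_IP"
    else
        echo "$SHARED_IP not present on $DEV; leaving route unchanged" >&2
        return 1
    fi
}
```

In a failover cluster, something like set_route_src would need to run after the node takes ownership of the shared address, and the original route would need to be restored when ownership is released.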