Bug #3095

updown plugin and PLUTO_INTERFACE selection

Added by Luka Logar about 1 month ago. Updated 28 days ago.

Status: Feedback
Priority: Normal
Category: libcharon
Target version:
Start date:
Due date:
Estimated time:
Affected version: 5.8.0
Resolution:

Description

Hi,

I have strongSwan running on a fairly typical multihomed system (with lan and wan interfaces), using a configuration with

local_address=<WAN IP>
serving the roadwarriors.

Everything works as expected for connections originating from the wan interface/network. However, if I establish an IPsec connection from a host connected to the lan interface (which of course is not a normal use case), the wrong PLUTO_INTERFACE (e.g. wan) is passed to the updown script, and the firewall rules do not pass the traffic, as it is actually coming in on the lan interface.

I think the interface lookup should be based not on the connection's local_address but on the interface that serves the source address of the connection request?

This proof of concept patch seems to fix this issue:

--- a/src/libcharon/plugins/updown/updown_listener.c
+++ b/src/libcharon/plugins/updown/updown_listener.c
@@ -288,7 +288,8 @@
              config->get_name(config));
     if (up)
     {
-        if (charon->kernel->get_interface(charon->kernel, me, &iface))
+        host_t *true_me = charon->kernel->get_source_addr(charon->kernel, other, NULL);
+        if (charon->kernel->get_interface(charon->kernel, true_me ? true_me : me, &iface))
         {
             cache_iface(this, child_sa->get_reqid(child_sa), iface);
         }
@@ -296,6 +297,7 @@
         {
             iface = NULL;
         }
+        DESTROY_IF(true_me);
     }
     else
     {

History

#1 Updated by Tobias Brunner about 1 month ago

  • Tracker changed from Issue to Bug
  • Category set to libcharon
  • Status changed from New to Feedback
  • Assignee set to Tobias Brunner
  • Target version set to 5.8.1

What you get there is not really the "true" local address, because the kernel is perfectly fine with using any local address on any interface (i.e. internal clients actually connect to the WAN IP; if they didn't, the daemon would reject the connection attempt due to the configured local IP).

But using the interface the traffic will presumably take makes sense. I pushed something to the 3095-updown-iface branch. Let me know if that works for you.

Note that using the local interface in firewall rules installed via updown script won't work for mobile clients that can move from a private to a public IP (via MOBIKE) as the updown script will not be called when such a change happens.

#2 Updated by Luka Logar about 1 month ago

Hi Tobias,

Your change does not seem to work:

[DBG] charon->kernel->get_nexthop(other=10.0.x.x, -1, me=193.77.y.y, &iface=pppoe-wan) = 213.250.z.z

It still returns the wrong (wan) interface, whereas omitting the me parameter yields the correct result (the same as my patch using get_source_addr()):
[DBG] charon->kernel->get_nexthop(other=10.0.x.x, -1, me=NULL, &iface=br-lan) = 10.0.y.y

#3 Updated by Tobias Brunner about 1 month ago

Hm, what does the routing table on your host look like?

#4 Updated by Luka Logar about 1 month ago

Nothing special:

# ip route
default via 213.250.xx.xx dev pppoe-wan proto static
10.0.0.0/24 dev br-lan proto kernel scope link src 10.0.0.254
213.250.xx.xx dev pppoe-wan proto kernel scope link src 193.77.yy.yy

#5 Updated by Tobias Brunner about 1 month ago

I see. Our custom route lookup tries to find a route with the given source address (or a route that lists an interface with that address), even if it isn't the best match on the destination. So I guess the route installed in table 220 lists the WAN interface too. It might work correctly if you switched to the kernel's native route lookup by setting charon.plugins.kernel-netlink.fwmark to an arbitrary negated mark, e.g. !42.
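For reference, the setting mentioned above could be placed in strongswan.conf roughly as follows. This is only a sketch: the mark value 42 is arbitrary, and it assumes nothing else on the host already uses that mark.

```
charon {
    plugins {
        kernel-netlink {
            # negated mark; per the comment above, this switches
            # route lookups to the kernel's native lookup
            fwmark = !42
        }
    }
}
```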

We could theoretically leave the source address unspecified; however, that could result in the wrong interface being returned on multi-homed hosts with e.g. two routes to the same destination via different interfaces and a tunnel pinned to a specific interface via local_address.
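The difference between the two lookups can be illustrated with a small toy model. This is not strongSwan's implementation, only a simplified sketch of the behaviour described above: when a source address is given, only routes whose interface holds that address are considered; without one, the longest matching prefix wins. The type route_t, the helpers lookup() and via(), and all addresses are hypothetical; the addresses are placeholders from documentation ranges standing in for the masked ones in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Build a host-order IPv4 address from four octets. */
#define IP(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Placeholder addresses, NOT the masked ones from the thread. */
#define WAN_ADDR IP(198, 51, 100, 10) /* stands in for 193.77.y.y  */
#define WAN_GW   IP(203, 0, 113, 1)   /* stands in for 213.250.x.x */
#define LAN_ADDR IP(10, 0, 0, 254)
#define CLIENT   IP(10, 0, 0, 50)     /* stands in for 10.0.x.x    */

/* One routing-table entry of the toy model. */
typedef struct {
    uint32_t dst;        /* destination network */
    int prefix_len;      /* 0 for the default route */
    const char *iface;   /* outgoing interface */
    uint32_t iface_addr; /* address configured on that interface */
} route_t;

/* Mirrors the shape of the `ip route` output posted in comment #4. */
static const route_t example_table[] = {
    { 0,                0,  "pppoe-wan", WAN_ADDR },
    { IP(10, 0, 0, 0),  24, "br-lan",    LAN_ADDR },
    { WAN_GW,           32, "pppoe-wan", WAN_ADDR },
};
#define EXAMPLE_ROUTES (sizeof(example_table) / sizeof(example_table[0]))

/* Does route r cover destination dst? */
static int matches(const route_t *r, uint32_t dst)
{
    uint32_t mask = r->prefix_len ? ~(uint32_t)0 << (32 - r->prefix_len) : 0;
    return (dst & mask) == (r->dst & mask);
}

/* Longest-prefix match over the table. If src is non-zero, routes whose
 * interface does not hold src are skipped (the "pinned" lookup). */
static const route_t *lookup(const route_t *tbl, size_t n, uint32_t dst, uint32_t src)
{
    const route_t *best = NULL;
    assert(tbl != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!matches(&tbl[i], dst))
            continue;
        if (src && tbl[i].iface_addr != src)
            continue;
        if (!best || tbl[i].prefix_len > best->prefix_len)
            best = &tbl[i];
    }
    return best;
}

/* True if r is a route that leaves via the named interface. */
static int via(const route_t *r, const char *iface)
{
    return r != NULL && strcmp(r->iface, iface) == 0;
}
```

In this model, lookup(example_table, EXAMPLE_ROUTES, CLIENT, WAN_ADDR) selects the default route via pppoe-wan, while passing 0 as the source selects the br-lan route. That mirrors the two get_nexthop() debug lines in comment #2, and it also shows why dropping the source entirely would ignore a tunnel pinned to a specific interface.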

#6 Updated by Luka Logar 30 days ago

Here are the contents of table 220 (10.4.0.1 is the virtual IP assigned to the (Win10) client):

# ip route show table 220
10.4.0.1 via 10.0.x.x dev br-lan proto static

However, with charon.plugins.kernel-netlink.fwmark set to !42, the correct interface is selected:
[DBG] charon->kernel->get_nexthop(other=10.0.x.x, -1, me=193.77.y.y, &iface=br-lan) = 10.0.z.z

But changing the lookup method via the fwmark setting, as you suggested, looks a little "hacky" to me?

#7 Updated by Tobias Brunner 29 days ago

> Here are the contents of table 220 (10.4.0.1 is the virtual IP assigned to the (Win10) client):

Hm, that's strange. It is literally the same call (see source:src/libcharon/plugins/kernel_netlink/kernel_netlink_ipsec.c#L2631). Is this with the fwmark option set or without?

> But changing the lookup method via the fwmark setting, as you suggested, looks a little "hacky" to me?

It's not. It's actually the much better way to do this (in terms of quality and performance), but since we don't know how marks are used on the system, we can't make it the default.

#8 Updated by Luka Logar 29 days ago

Tobias Brunner wrote:

> > Here are the contents of table 220 (10.4.0.1 is the virtual IP assigned to the (Win10) client):
>
> Hm, that's strange. It is literally the same call (see source:src/libcharon/plugins/kernel_netlink/kernel_netlink_ipsec.c#L2631). Is this with the fwmark option set or without?

Sorry, this was the routing table with fwmark enabled. The routing table with fwmark disabled:

# ip route show table 220
10.4.0.1 via 213.250.xx.xx dev pppoe-wan proto static

> > But changing the lookup method via the fwmark setting, as you suggested, looks a little "hacky" to me?
>
> It's not. It's actually the much better way to do this (in terms of quality and performance), but since we don't know how marks are used on the system, we can't make it the default.

I was referring to the fact that setting the fwmark parameter changes the lookup method, which cannot be inferred from its name (and possibly also has some side effects, see below)?

I experimented a bit more with the settings and noticed that when I set fwmark (to !42), traffic between the host connected through the VPN and other hosts on the same subnet (10.0.0.0/24) becomes very unreliable. It turns out that in this case the IPsec gateway generates ICMP redirect messages, which cause all kinds of problems. If I disable ICMP redirects through /proc/sys/net/ipv4/conf/*/send_redirects, everything is back to normal. But why are they generated in one case (when using fwmark) and not in the other (without fwmark)?

#9 Updated by Tobias Brunner 28 days ago

> Sorry, this was the routing table with fwmark enabled. The routing table with fwmark disabled:

I see.

> I was referring to the fact that setting the fwmark parameter changes the lookup method, which cannot be inferred from its name

I agree that it's not that intuitive. The option does have other uses (it was originally added with 5.1.1 for kernel-libipsec, the modified route lookup came later with 5.3.3), but this "side-effect" is documented (at least on the wiki).

> (and possibly also has some side effects, see below)?

While it is a side-effect of the option, it's not the option itself that causes it, but the changed route in table 220.

> I experimented a bit more with the settings and noticed that when I set fwmark (to !42), traffic between the host connected through the VPN and other hosts on the same subnet (10.0.0.0/24) becomes very unreliable. It turns out that in this case the IPsec gateway generates ICMP redirect messages, which cause all kinds of problems. If I disable ICMP redirects through /proc/sys/net/ipv4/conf/*/send_redirects, everything is back to normal. But why are they generated in one case (when using fwmark) and not in the other (without fwmark)?

If the client's remote traffic selector is 0.0.0.0/0 (i.e. it routes all traffic through the tunnel), this is to be expected, as it will include local traffic (you can avoid that by excluding the local LAN from the VPN, e.g. via the bypass-lan plugin). So the kernel will see e.g. return packets addressed to the virtual IP come in on the same interface its routing table says they should go out of, which causes the ICMP redirects (there are other situations where these can occur, see ForwardingAndSplitTunneling).
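If excluding the LAN from the tunnel is not an option, the workaround from comment #8 (disabling ICMP redirects) could be made persistent with a sysctl fragment along these lines. This is a sketch, assuming a Linux gateway whose LAN interface is br-lan as in the routing table above; where the fragment goes (e.g. a file under /etc/sysctl.d/) depends on the distribution.

```
# The kernel sends redirects on an interface if either the "all" or the
# per-interface value is enabled, so both need to be cleared.
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.conf.br-lan.send_redirects = 0
```

The default entry only affects interfaces created after it is applied, which is why br-lan is also listed explicitly.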
