Hello,
Please allow me to ask a question regarding ARP monitoring. I have the following setup: on one side there is a MikroTik router with two LTE interfaces, and on the other side a Debian Stretch server running strongSwan. Between the router and the server run two EoIP over IKEv2/IPsec tunnels, and on both sides the EoIP interfaces are enslaved to a logical device in round-robin mode (a bond on the router's side, a team on the server's side). On the Debian server I have built a custom kernel, 4.19.0-eoip, with the EoIP driver from https://github.com/bbonev/eoip .
While both LTE interfaces on the router are up, everything works without problems:
root@eoip:/home/ipsec# cat /etc/team0.conf
{
    "device": "team0",
    "runner": {"name": "roundrobin"},
    "link_watch": {
        "name": "arp_ping",
        "interval": 100,
        "missed_max": 30,
        "source_host": "10.50.1.1",
        "target_host": "10.50.1.2"
    },
    "ports": {"eoip57": {}, "eoip58": {}}
}
root@eoip:/home/ipsec#
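For completeness, this is roughly how I apply the config after editing it (a sketch using standard teamd options; the path is from my setup):

```shell
# Stop the running teamd instance for team0 (if any), then start it
# again daemonized with the config file above; -g enables debug
# logging, matching the "debug level: 1" shown by teamdctl.
teamd -k -t team0 || true
teamd -d -g -f /etc/team0.conf
```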
root@eoip:/home/ipsec# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet 10.57.10.1/32 brd 10.57.10.1 scope global lo
       valid_lft forever preferred_lft forever
    inet 10.58.10.1/32 brd 10.58.10.1 scope global lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:56:63:09 brd ff:ff:ff:ff:ff:ff
    inet 10.17.1.55/24 brd 10.17.1.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fe56:6309/64 scope link
       valid_lft forever preferred_lft forever
3: eoip57@NONE: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master team0 state UNKNOWN group default qlen 1000
    link/ether ea:49:65:b5:ca:ce brd ff:ff:ff:ff:ff:ff
4: eoip58@NONE: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master team0 state UNKNOWN group default qlen 1000
    link/ether ea:49:65:b5:ca:ce brd ff:ff:ff:ff:ff:ff
5: team0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 46:86:7c:5b:7c:53 brd ff:ff:ff:ff:ff:ff
    inet 10.50.1.1/24 scope global team0
       valid_lft forever preferred_lft forever
    inet6 fe80::4486:7cff:fe5b:7c53/64 scope link
       valid_lft forever preferred_lft forever
root@eoip:/home/ipsec#
root@eoip:/home/ipsec# ipsec status
Security Associations (2 up, 0 connecting):
  test-lte58[24]: ESTABLISHED 9 seconds ago, 10.17.1.55[10.17.1.55]...37.48.60.236[test-lte58.cz]
  test-lte58{3}:  INSTALLED, TUNNEL, reqid 3, ESP in UDP SPIs: cb2d2ce6_i 0815d80d_o
  test-lte58{3}:   10.58.10.1/32 === 10.58.10.2/32
  test-lte57[2]: ESTABLISHED 6 minutes ago, 10.17.1.55[10.17.1.55]...37.48.35.104[test-lte57.cz]
  test-lte57{2}:  INSTALLED, TUNNEL, reqid 2, ESP in UDP SPIs: cb1681bb_i 08519092_o
  test-lte57{2}:   10.57.10.1/32 === 10.57.10.2/32
root@eoip:/home/ipsec# ping 10.50.1.2
PING 10.50.1.2 (10.50.1.2) 56(84) bytes of data.
64 bytes from 10.50.1.2: icmp_seq=1 ttl=64 time=39.8 ms
64 bytes from 10.50.1.2: icmp_seq=2 ttl=64 time=37.8 ms
64 bytes from 10.50.1.2: icmp_seq=3 ttl=64 time=41.7 ms
64 bytes from 10.50.1.2: icmp_seq=4 ttl=64 time=38.4 ms
64 bytes from 10.50.1.2: icmp_seq=5 ttl=64 time=45.2 ms
^C
--- 10.50.1.2 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4015ms
rtt min/avg/max/mdev = 37.808/40.609/45.252/2.689 ms
root@eoip:/home/ipsec#
When one of the LTE interfaces on the router's side is disabled, the corresponding EoIP over IKEv2/IPsec tunnel goes down and arping over the disabled tunnel gets no response:
root@eoip:/home/ipsec# ipsec status
Security Associations (1 up, 0 connecting):
  test-lte57[2]: ESTABLISHED 20 minutes ago, 10.17.1.55[10.17.1.55]...37.48.35.104[test-lte57.cz]
  test-lte57{2}:  INSTALLED, TUNNEL, reqid 2, ESP in UDP SPIs: cb1681bb_i 08519092_o
  test-lte57{2}:   10.57.10.1/32 === 10.57.10.2/32
root@eoip:/home/ipsec# arping 10.50.1.2 -I eoip57
ARPING 10.50.1.2 from 10.57.10.1 eoip57
Unicast reply from 10.50.1.2 [02:8F:77:A3:CA:D8] 70.984ms
Unicast reply from 10.50.1.2 [02:8F:77:A3:CA:D8] 58.267ms
Unicast reply from 10.50.1.2 [02:8F:77:A3:CA:D8] 51.903ms
^CSent 4 probes (1 broadcast(s))
Received 3 response(s)
root@eoip:/home/ipsec# arping 10.50.1.2 -I eoip58
ARPING 10.50.1.2 from 10.57.10.1 eoip58
^CSent 12 probes (12 broadcast(s))
Received 0 response(s)
root@eoip:/home/ipsec# arping 10.50.1.2 -I team0
ARPING 10.50.1.2 from 10.50.1.1 team0
Unicast reply from 10.50.1.2 [02:8F:77:A3:CA:D8] 37.101ms
Unicast reply from 10.50.1.2 [02:8F:77:A3:CA:D8] 37.653ms
^CSent 5 probes (1 broadcast(s))
Received 2 response(s)
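To see whether any frames still arrive on the supposedly dead port, one can capture ARP on each tunnel port separately (a diagnostic sketch; interface names from my setup). My understanding of teamd.conf(5) is that with validate_active/validate_inactive unset, teamd does not check that an incoming ARP packet is actually a reply to its own probe, so stray ARP traffic on the port may keep resetting the missed counter:

```shell
# Watch ARP frames on the port whose tunnel is down; if probes leave
# here unanswered but other ARP traffic still arrives, the arp_ping
# watch may count the port as alive anyway.
tcpdump -eni eoip58 arp
```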
My problem is that both ports of the team0 interface stay up:
root@eoip:/home/ipsec# teamdctl team0 state view -v
setup:
  runner: roundrobin
  kernel team mode: roundrobin
  D-BUS enabled: no
  ZeroMQ enabled: no
  debug level: 1
  daemonized: yes
  PID: 536
  PID file: /var/run/teamd/team0.pid
ports:
  eoip57
    ifindex: 3
    addr: 82:ae:24:87:56:9c
    ethtool link: 0mbit/halfduplex/up
    link watches:
      link summary: up
      instance[link_watch_0]:
        name: arp_ping
        link: up
        down count: 0
        source host: 10.50.1.1
        target host: 10.50.1.2
        interval: 100
        missed packets: 0/30
        validate_active: no
        validate_inactive: no
        send_always: no
        initial wait: 0
  eoip58
    ifindex: 4
    addr: 82:ae:24:87:56:9c
    ethtool link: 0mbit/halfduplex/up
    link watches:
      link summary: up
      instance[link_watch_0]:
        name: arp_ping
        link: up
        down count: 0
        source host: 10.50.1.1
        target host: 10.50.1.2
        interval: 100
        missed packets: 1/30    <- here it is sometimes 1, sometimes 0
        validate_active: no
        validate_inactive: no
        send_always: no
        initial wait: 0
root@eoip:/home/ipsec#
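To keep an eye on those counters over time without the full dump, I pull just the per-port "missed packets" lines out of the teamdctl output; a small sketch (the awk script is my own, fed an inlined sample here so it runs standalone — normally the input would be piped from `teamdctl team0 state view -v`):

```shell
# Print "<port> <missed>/<max>" for each port section of a
# `teamdctl ... state view -v` dump read from stdin.
parse_missed() {
  awk '
    /^  [a-z0-9]+$/   { port = $1 }       # two-space-indented port name
    /missed packets:/ { print port, $3 }  # e.g. "eoip58 1/30"
  '
}

# Inlined sample of the relevant dump lines, for demonstration:
parse_missed <<'EOF'
ports:
  eoip57
    link watches:
        missed packets: 0/30
  eoip58
    link watches:
        missed packets: 1/30
EOF
```

This prints one line per port, e.g. `eoip57 0/30`, which is convenient to run in a watch loop while toggling the LTE interfaces.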
root@eoip:/home/ipsec# ping 10.50.1.2
PING 10.50.1.2 (10.50.1.2) 56(84) bytes of data.
64 bytes from 10.50.1.2: icmp_seq=3 ttl=64 time=43.8 ms
64 bytes from 10.50.1.2: icmp_seq=5 ttl=64 time=37.2 ms
64 bytes from 10.50.1.2: icmp_seq=9 ttl=64 time=55.1 ms
64 bytes from 10.50.1.2: icmp_seq=11 ttl=64 time=47.7 ms
^C
--- 10.50.1.2 ping statistics ---
11 packets transmitted, 4 received, 63% packet loss, time 10145ms
rtt min/avg/max/mdev = 37.248/46.010/55.179/6.493 ms
root@eoip:/home/ipsec#
When the second LTE interface is also disabled, both ports go down:
root@eoip:/home/ipsec# teamdctl team0 state view -v
setup:
  runner: roundrobin
  kernel team mode: roundrobin
  D-BUS enabled: no
  ZeroMQ enabled: no
  debug level: 1
  daemonized: yes
  PID: 535
  PID file: /var/run/teamd/team0.pid
ports:
  eoip57
    ifindex: 3
    addr: 2e:c3:04:11:56:04
    ethtool link: 0mbit/halfduplex/up
    link watches:
      link summary: down
      instance[link_watch_0]:
        name: arp_ping
        link: down
        down count: 1
        source host: 10.50.1.1
        target host: 10.50.1.2
        interval: 100
        missed packets: 36/30
        validate_active: no
        validate_inactive: no
        send_always: no
        initial wait: 0
  eoip58
    ifindex: 4
    addr: 2e:c3:04:11:56:04
    ethtool link: 0mbit/halfduplex/up
    link watches:
      link summary: down
      instance[link_watch_0]:
        name: arp_ping
        link: down
        down count: 1
        source host: 10.50.1.1
        target host: 10.50.1.2
        interval: 100
        missed packets: 37/30
        validate_active: no
        validate_inactive: no
        send_always: no
        initial wait: 0
root@eoip:/home/ipsec#
Could you please help me configure ARP monitoring properly?
Thank you in advance.
Petr
Hello,
With the latest version, 1.29, I have tested the above-mentioned configuration with all possible combinations of the validate_active, validate_inactive and send_always parameters, and the result is the same: when one EoIP tunnel is disabled, both ports stay in the up state whenever missed_max is greater than 1. The only way to get a port to go down after a disconnection is to set missed_max=1, but such a setting is very sensitive.
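For reference, the timing implied by these settings: as I understand teamd.conf(5), an arp_ping port is only reported down after missed_max consecutive probe intervals without a reply, so the worst-case detection time is roughly interval * missed_max:

```shell
# Worst-case failover detection time for the arp_ping link watch,
# assuming it is interval (ms) * missed_max consecutive misses.
interval_ms=100
detect() { echo "$(( interval_ms * $1 )) ms"; }

detect 30   # my configured missed_max: 3000 ms before a port is marked down
detect 1    # the only value that works for me: 100 ms, hence very sensitive
```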
root@eoip:/home/ipsec# teamdctl team0 state -v
setup:
  runner: roundrobin
  kernel team mode: roundrobin
  D-BUS enabled: no
  ZeroMQ enabled: no
  debug level: 1
  daemonized: yes
  PID: 531
  PID file: /var/run/teamd/team0.pid
ports:
  eoip57
    ifindex: 3
    addr: ce:9f:8b:0f:b2:12
    ethtool link: 0mbit/halfduplex/up
    link watches:
      link summary: up
      instance[link_watch_0]:
        name: arp_ping
        link: up
        down count: 1
        source host: 10.50.1.1
        target host: 10.50.1.2
        interval: 100
        missed packets: 0/1
        validate_active: no
        validate_inactive: no
        send_always: no
        initial wait: 0
  eoip58
    ifindex: 4
    addr: ce:9f:8b:0f:b2:12
    ethtool link: 0mbit/halfduplex/up
    link watches:
      link summary: down
      instance[link_watch_0]:
        name: arp_ping
        link: down
        down count: 2
        source host: 10.50.1.1
        target host: 10.50.1.2
        interval: 100
        missed packets: 112/1
        validate_active: no
        validate_inactive: no
        send_always: no
        initial wait: 0
root@eoip:/home/ipsec#
The following messages sometimes appear in syslog:
Jan 1 20:18:55 eoip teamd_team0[531]: some periodic function calls missed (1)
Jan 1 20:18:55 eoip teamd_team0[531]: some periodic function calls missed (1)
Jan 1 20:18:57 eoip teamd_team0[531]: some periodic function calls missed (1)
Jan 1 20:18:57 eoip teamd_team0[531]: some periodic function calls missed (1)
Jan 1 20:19:15 eoip teamd_team0[531]: some periodic function calls missed (1)
Jan 1 20:19:15 eoip teamd_team0[531]: some periodic function calls missed (1)
Jan 1 20:19:17 eoip teamd_team0[531]: Added loop callback: usock_acc_conn, 0x5595bb068c70
Jan 1 20:19:17 eoip teamd_team0[531]: usock: calling method "ConfigDump"
Jan 1 20:19:17 eoip teamd_team0[531]: usock: calling method "ConfigDumpActual"
Jan 1 20:19:17 eoip teamd_team0[531]: usock: calling method "StateDump"
Jan 1 20:19:17 eoip teamd_team0[531]: Removed loop callback: usock_acc_conn, 0x5595bb068c70
Jan 1 20:19:21 eoip teamd_team0[531]: some periodic function calls missed (1)
Jan 1 20:19:21 eoip teamd_team0[531]: some periodic function calls missed (1)
Jan 1 20:19:21 eoip teamd_team0[531]: some periodic function calls missed (2)
Jan 1 20:19:21 eoip teamd_team0[531]: some periodic function calls missed (2)
Jan 1 20:19:22 eoip teamd_team0[531]: some periodic function calls missed (1)
Jan 1 20:19:22 eoip teamd_team0[531]: some periodic function calls missed (1)
Jan 1 20:19:28 eoip teamd_team0[531]: some periodic function calls missed (1)
Jan 1 20:19:28 eoip teamd_team0[531]: some periodic function calls missed (1)
May I ask where I am making a mistake, and whether there is a solution that would allow using a missed_max value greater than 1?
Thank you for your help, and I wish you a happy New Year.
Petr
libteam@lists.fedorahosted.org