Hi,
I probably have an overly complicated setup here so I'm open to any suggestions.
I have two OVPN clients connecting to two OVPN servers, which we want to use in an activebackup/sticky config:
client - tap0 -> server - tap0 (sticky)
client1 - tap1 -> server1 - tap1
I'm using teamd with the activebackup runner, sticky ports and the ethtool link watch, plus OpenVPN up/down scripts that use /sbin/ip to set the tap interfaces up or down when connectivity is established or lost.
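Roughly what I mean by the up/down scripts; this is a simplified sketch with a made-up path, not the scripts verbatim:

#!/bin/sh
# Hypothetical /etc/openvpn/link-state.sh, referenced from the OVPN configs
# via "up", "down" and "script-security 2". OpenVPN passes the tap device
# name as $1 and sets $script_type to "up" or "down".
case "$script_type" in
    up)   /sbin/ip link set dev "$1" up ;;
    down) /sbin/ip link set dev "$1" down ;;
esac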
I have also tested with iptables simulating network connectivity loss (simply dropping egress packets for the OpenVPN ports 1194/1195 from the client guest).
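For the record, the simulated outage is just something along these lines on the client guest (assuming UDP here; adjust protocol/ports to whatever the instances actually use):

iptables -I OUTPUT -p udp --dport 1194 -j DROP
iptables -I OUTPUT -p udp --dport 1195 -j DROP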
It works if I stop the client or server side, and it does indeed honour the sticky tap0 interface upon returning.
If I try the same with arp_ping, it does fail over to tap1 as intended, but it never returns to the sticky tap0.
The tap interfaces are persistent and created at boot. I noticed we lose the IPs on the tap interfaces, but they continue to work as slaves under team0.
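For reference, the persistent taps are pre-created at boot along these lines (a sketch, not the exact commands from my boot setup):

ip tuntap add dev tap0 mode tap
ip tuntap add dev tap1 mode tap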
Forgive my lack of understanding here, just presenting my picture to smarter people asking for help.
Thank you.
Versions:
CentOS 7 x86_64 guest (libvirt/qemu) on an Ubuntu 16.04.2 LTS hypervisor
OpenVPN: 2.4.3
teamd: 1.25
libteam: 1.25
The config dumps further down are from my working setup, but if I change to:
TEAM_CONFIG='{"runner":{"name":"activebackup"},"link_watch":{"name":"arp_ping","interval":1000,"missed_max":10,"source_host":"192.168.122.132","target_host":"192.168.122.3"}}'
on client and:
TEAM_CONFIG='{"runner":{"name":"activebackup"},"link_watch":{"name":"arp_ping","interval":1000,"missed_max":10,"source_host":"192.168.122.3","target_host":"192.168.122.132"}}'
on the server, things break.
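To be explicit about what that produces, I believe the merged client config with arp_ping ends up looking like this (hand-assembled from the snippets above and the dumps below rather than taken from a fresh config dump, so treat it as a sketch):

{
    "device": "team0",
    "link_watch": {
        "name": "arp_ping",
        "interval": 1000,
        "missed_max": 10,
        "source_host": "192.168.122.132",
        "target_host": "192.168.122.3"
    },
    "ports": {
        "tap0": {
            "prio": 100,
            "sticky": true
        },
        "tap1": {
            "prio": -10
        }
    },
    "runner": {
        "name": "activebackup"
    }
}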
Running tcpdump on the server, ARP requests initially come through on tap0, switch to tap1 when the link goes down, but never return to tap0 when the link is restored.
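The captures were nothing fancy, roughly:

tcpdump -n -e -i tap0 arp
tcpdump -n -e -i tap1 arp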
In /var/log/messages on the client side I can see the active port being changed, but it never gets changed back to the sticky port.
Client log:
Aug 22 18:34:37 untieclient teamd: tap1: arp_ping-link went up.
Aug 22 18:34:37 untieclient teamd: Current active port: "tap0" (ifindex "4", prio "100").
Aug 22 18:34:38 untieclient teamd: tap0: Missed 11 replies (max 10).
Aug 22 18:34:38 untieclient teamd: tap0: arp_ping-link went down.
Aug 22 18:34:38 untieclient teamd: Current active port: "tap0" (ifindex "4", prio "100").
Aug 22 18:34:38 untieclient teamd: Clearing active port "tap0".
Aug 22 18:34:38 untieclient teamd: Found best port: "tap1" (ifindex "7", prio "-10").
Aug 22 18:34:38 untieclient teamd: Changed active port to "tap1".
Aug 22 18:35:27 untieclient teamd: <ifinfo_list>
Aug 22 18:35:27 untieclient teamd: 8: team0: ce:55:dd:31:fb:54: 0
Aug 22 18:35:27 untieclient teamd: 7: tap1: ce:55:dd:31:fb:54: 8
Aug 22 18:35:27 untieclient teamd: *4: tap0: ce:55:dd:31:fb:54: 8
Aug 22 18:35:27 untieclient teamd: </ifinfo_list>
Aug 22 18:35:27 untieclient teamd: <port_list>
Aug 22 18:35:27 untieclient teamd: 7: tap1: up 10Mbit FD
Aug 22 18:35:27 untieclient teamd: *4: tap0: down 0Mbit HD
Aug 22 18:35:27 untieclient teamd: </port_list>
Aug 22 18:38:49 untieclient teamd: <ifinfo_list>
Aug 22 18:38:49 untieclient teamd: 8: team0: ce:55:dd:31:fb:54: 0
Aug 22 18:38:49 untieclient teamd: 7: tap1: ce:55:dd:31:fb:54: 8
Aug 22 18:38:49 untieclient teamd: *4: tap0: ce:55:dd:31:fb:54: 8
Aug 22 18:38:49 untieclient teamd: </ifinfo_list>
Aug 22 18:38:49 untieclient teamd: <port_list>
Aug 22 18:38:49 untieclient teamd: 7: tap1: up 10Mbit FD
Aug 22 18:38:49 untieclient teamd: *4: tap0: up 10Mbit FD
Aug 22 18:38:49 untieclient teamd: </port_list>
Both client and server sides show link summary as down for tap0:
setup:
  runner: activebackup
ports:
  tap1
    link watches:
      link summary: up
      instance[link_watch_0]:
        name: arp_ping
        link: up
        down count: 0
  tap0
    link watches:
      link summary: down
      instance[link_watch_0]:
        name: arp_ping
        link: down
        down count: 1
runner:
  active port: tap1
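(If I have the teamdctl syntax right, the active port can also be queried directly with "teamdctl team0 state item get runner.active_port".)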
Client VM:
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:71:4e:b9 brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.132/24 brd 192.168.122.255 scope global dynamic eth0
       valid_lft 3519sec preferred_lft 3519sec
    inet6 fe80::5054:ff:fe71:4eb9/64 scope link
       valid_lft forever preferred_lft forever
4: tap0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master team0 state UP qlen 100
    link/ether ce:55:dd:31:fb:54 brd ff:ff:ff:ff:ff:ff
    inet 172.20.35.10/30 brd 172.20.35.11 scope global tap0
       valid_lft forever preferred_lft forever
    inet6 fe80::cc55:ddff:fe31:fb54/64 scope link
       valid_lft forever preferred_lft forever
6: team0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP qlen 1000
    link/ether ce:55:dd:31:fb:54 brd ff:ff:ff:ff:ff:ff
    inet 192.168.18.10/30 brd 192.168.18.11 scope global team0
       valid_lft forever preferred_lft forever
    inet6 fe80::cc55:ddff:fe31:fb54/64 scope link
       valid_lft forever preferred_lft forever
7: tap1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master team0 state UNKNOWN qlen 100
    link/ether ce:55:dd:31:fb:54 brd ff:ff:ff:ff:ff:ff
    inet 172.20.36.10/30 brd 172.20.36.11 scope global tap1
       valid_lft forever preferred_lft forever
    inet6 fe80::589c:71ff:fe84:f27d/64 scope link
       valid_lft forever preferred_lft forever
Server VM:
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:96:37:5d brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.3/24 brd 192.168.122.255 scope global dynamic eth0
       valid_lft 3024sec preferred_lft 3024sec
    inet6 fe80::5054:ff:fe96:375d/64 scope link
       valid_lft forever preferred_lft forever
4: tap0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master team0 state UP qlen 100
    link/ether 32:ba:b3:21:0c:01 brd ff:ff:ff:ff:ff:ff
    inet 172.20.35.9/30 brd 172.20.35.11 scope global tap0
       valid_lft forever preferred_lft forever
    inet6 fe80::30ba:b3ff:fe21:c01/64 scope link
       valid_lft forever preferred_lft forever
5: tap1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master team0 state UP qlen 100
    link/ether 32:ba:b3:21:0c:01 brd ff:ff:ff:ff:ff:ff
    inet 172.20.36.9/30 brd 172.20.36.11 scope global tap1
       valid_lft forever preferred_lft forever
    inet6 fe80::30ba:b3ff:fe21:c01/64 scope link
       valid_lft forever preferred_lft forever
6: team0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP qlen 1000
    link/ether 32:ba:b3:21:0c:01 brd ff:ff:ff:ff:ff:ff
    inet 192.168.18.9/30 brd 192.168.18.11 scope global team0
       valid_lft forever preferred_lft forever
    inet6 fe80::30ba:b3ff:fe21:c01/64 scope link
       valid_lft forever preferred_lft forever
Client config:
[root@untieclient ~]# teamdctl team0 config dump
{
    "device": "team0",
    "link_watch": {
        "delay_down": 20000,
        "name": "ethtool"
    },
    "mcast_rejoin": {
        "count": 1
    },
    "notify_peers": {
        "count": 1
    },
    "ports": {
        "tap0": {
            "prio": 100,
            "sticky": true
        },
        "tap1": {
            "prio": -10
        }
    },
    "runner": {
        "name": "activebackup"
    }
}
Server config:
[root@untieserver ~]# teamdctl team0 config dump
{
    "device": "team0",
    "link_watch": {
        "delay_down": 20000,
        "name": "ethtool"
    },
    "mcast_rejoin": {
        "count": 1
    },
    "notify_peers": {
        "count": 1
    },
    "ports": {
        "tap0": {
            "prio": 100,
            "sticky": true
        },
        "tap1": {
            "prio": -10
        }
    },
    "runner": {
        "name": "activebackup"
    }
}
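For completeness, the per-port prio/sticky settings above come in via the port ifcfg files, along these lines (a sketch; exact file contents trimmed):

/etc/sysconfig/network-scripts/ifcfg-tap0: TEAM_PORT_CONFIG='{"prio": 100, "sticky": true}'
/etc/sysconfig/network-scripts/ifcfg-tap1: TEAM_PORT_CONFIG='{"prio": -10}'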