January 2020 - libteam - Fedora Mailing-Lists

Re: [PATCH] teamd: Disregard current state when considering port enablement

by Jiri Pirko

Wed, Nov 13, 2019 at 02:26:47PM CET, petrm(a)mellanox.com wrote:
>On systems where carrier is gained very quickly, there is a race between
>teamd and the kernel that sometimes leads to all team slaves being stuck in
>enabled=false state.
>
>When a port is enslaved to a team device, the kernel sends a netlink
>message marking the port as enabled. teamd's lb_event_watch_port_added()
>calls team_set_port_enabled(false), because link is down at that point. The
>kernel responds with a message marking the port as disabled. At this point,
>there are two outstanding messages: the initial one marking port as
>enabled, and the second one marking it as disabled. teamd has not processed
>either of these.
>
>Next teamd gets the netlink message that sets enabled=true, and updates its
>internal cache accordingly. If at this point ethtool link-watch wakes up,
>teamd considers (in teamd_port_check_enable()) enabling the port. After
>consulting the cache, it concludes the port is already up, and neglects to
>do so. Only then does teamd get the netlink message informing it of setting
>enabled=false.
>
>The problem is that the teamd cache is not synchronous with respect to the
>kernel state. If the carrier takes a while to come up (as is normally the
>case), this is not a problem, because teamd catches up quickly enough. But
>this may not always be the case, and particularly on a simulated system,
>the carrier is gained almost immediately.
>
>Fix this by not suppressing the enablement message.
>
>Signed-off-by: Petr Machata <petrm(a)mellanox.com>

applied. Thanks!
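The ordering described in the quoted commit message can be replayed with a small model. This is only a sketch: `struct model`, `kernel_set_enabled()`, `process_one_message()` and `run_race()` are hypothetical names invented for illustration, not teamd or kernel code, and the model assumes the kernel queues one notification per request.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PENDING 8

/* Hypothetical model: kernel state plus notifications the daemon has not read yet. */
struct model {
	bool kernel_enabled;        /* authoritative state held by the kernel */
	bool cache_enabled;         /* daemon's view, updated only when a message is read */
	bool pending[MAX_PENDING];  /* unread notifications, oldest first */
	int head, tail;
};

/* A request to the kernel changes its state and queues a notification. */
static void kernel_set_enabled(struct model *m, bool val)
{
	assert(m->tail < MAX_PENDING);
	m->kernel_enabled = val;
	m->pending[m->tail++] = val;
}

/* The daemon reads one notification and updates its cache. */
static void process_one_message(struct model *m)
{
	if (m->head < m->tail)
		m->cache_enabled = m->pending[m->head++];
}

/* Replays the sequence from the commit message; returns the final kernel state. */
static bool run_race(bool always_send)
{
	struct model m = { 0 };

	kernel_set_enabled(&m, true);   /* enslave: kernel marks the port enabled */
	kernel_set_enabled(&m, false);  /* port-added hook disables it, link is down */
	process_one_message(&m);        /* only the first message is read: cache says enabled */

	/* Link watch now reports carrier; the daemon wants the port enabled. */
	if (always_send || !m.cache_enabled)
		kernel_set_enabled(&m, true);

	while (m.head < m.tail)         /* drain the remaining notifications */
		process_one_message(&m);
	return m.kernel_enabled;
}
```

In this model `run_race(false)` leaves the port disabled because the stale cache suppresses the request, while `run_race(true)` ends with the port enabled.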


Re: [patch libteam] poll instead of select

by Jiri Pirko

Tue, Jan 28, 2020 at 01:11:00AM CET, jerome99(a)internet.lu wrote:
>The select function cannot be used in application if the application has
>already more than 1024 open files. The select will crash if a file
>descriptor greater than or equal to 1023 is monitored.

Okay, how can we come close to that?

>
>Signed-off-by: Jerome Freilinger <jerome99(a)internet.lu>
>---
> libteamdctl/cli_usock.c | 25 +++++++++----------------
> 1 file changed, 9 insertions(+), 16 deletions(-)
>
>diff --git a/libteamdctl/cli_usock.c b/libteamdctl/cli_usock.c
>index 0dc97ae..431b12d 100644
>--- a/libteamdctl/cli_usock.c
>+++ b/libteamdctl/cli_usock.c
>@@ -25,6 +25,7 @@
> #include <sys/socket.h>
> #include <unistd.h>
> #include <teamdctl.h>
>+#include <poll.h>
> #include "teamdctl_private.h"
> #include "../teamd/teamd_usock_common.h"
>
>@@ -79,26 +80,18 @@ static int cli_usock_send(int sock, char *msg)
> 	return 0;
> }
>
>-#define WAIT_SEC	(TEAMDCTL_REPLY_TIMEOUT / 1000)
>-#define WAIT_USEC	(TEAMDCTL_REPLY_TIMEOUT % 1000 * 1000)
>-
> static int cli_usock_wait_recv(int sock)
> {
>-	fd_set rfds;
>-	int fdmax;
>-	int ret;
>-	struct timeval tv;
>+	struct pollfd fds[1];
>+
>+	fds[0].fd = sock;
>+	fds[0].events = POLLIN;
>+	fds[0].revents = 0;
>+	int ret = poll(fds, 1, TEAMDCTL_REPLY_TIMEOUT);
>
>-	tv.tv_sec = WAIT_SEC;
>-	tv.tv_usec = WAIT_USEC;
>-	FD_ZERO(&rfds);
>-	FD_SET(sock, &rfds);
>-	fdmax = sock + 1;
>-	ret = select(fdmax, &rfds, NULL, NULL, &tv);
>-	if (ret == -1)
>-		return -errno;
>-	if (!FD_ISSET(sock, &rfds))
>+	if (ret == 0)
> 		return -ETIMEDOUT;
>+	else if (ret < 0)
>+		return -errno;
> 	return 0;
> }
>
>--
>2.20.1
>
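For readers unfamiliar with the limit being discussed: select() works on an fd_set, and POSIX leaves FD_SET() undefined for descriptors at or above FD_SETSIZE (commonly 1024), whereas poll() takes the descriptor number directly. The following is a minimal standalone sketch of the same wait-with-timeout pattern, not the libteamdctl code; `wait_readable()` and `demo_pipe()` are hypothetical names used only here.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Wait until fd is readable or timeout_ms elapses.
 * Returns 0 when an event is pending, -ETIMEDOUT on timeout, -errno on error. */
static int wait_readable(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
	int ret;

	assert(fd >= 0);  /* poll() silently ignores negative descriptors */
	ret = poll(&pfd, 1, timeout_ms);
	if (ret < 0)
		return -errno;
	if (ret == 0)
		return -ETIMEDOUT;
	return 0;
}

/* Exercise wait_readable() on a pipe: empty first, then with one byte queued.
 * Returns 0 when both calls behave as described above. */
static int demo_pipe(void)
{
	int p[2];
	int before, after;

	if (pipe(p) != 0)
		return -errno;
	before = wait_readable(p[0], 10);  /* nothing written yet */
	if (write(p[1], "x", 1) != 1) {
		close(p[0]);
		close(p[1]);
		return -EIO;
	}
	after = wait_readable(p[0], 10);   /* one byte is pending */
	close(p[0]);
	close(p[1]);
	return (before == -ETIMEDOUT && after == 0) ? 0 : -1;
}
```

The sketch does not retry on EINTR; a caller that installs signal handlers would probably want to loop on that error.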


[libteam PATCH] teamd/lacp: fix segfault due to NULL pointer dereference

by Hangbin Liu

If we set a team0 link down in lacp mode, the call chain is:

- lacp_port_agg_unselect()
  - lacp_switch_agg_lead()
    - teamd_log_dbg()

The new_agg_lead in lacp_switch_agg_lead() may be NULL, so we get a NULL
pointer dereference, because the new teamd_log_dbg() is called with
new_agg_lead->ctx.

Fix it by using agg_lead->ctx, which is safe as we already reference it in
function lacp_switch_agg_lead().

Fixes: f32310b9a5cc ("libteam: wapper teamd_log_dbg with teamd_log_dbgx")
Signed-off-by: Hangbin Liu <liuhangbin(a)gmail.com>
---
 teamd/teamd_runner_lacp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/teamd/teamd_runner_lacp.c b/teamd/teamd_runner_lacp.c
index 7d940b3..ec01237 100644
--- a/teamd/teamd_runner_lacp.c
+++ b/teamd/teamd_runner_lacp.c
@@ -634,7 +634,7 @@ static void lacp_switch_agg_lead(struct lacp_port *agg_lead,
 	struct teamd_port *tdport;
 	struct lacp_port *lacp_port;

-	teamd_log_dbg(new_agg_lead->ctx, "Renaming aggregator %u to %u",
+	teamd_log_dbg(agg_lead->ctx, "Renaming aggregator %u to %u",
 		      lacp_agg_id(agg_lead), lacp_agg_id(new_agg_lead));
 	if (lacp->selected_agg_lead == agg_lead)
 		lacp->selected_agg_lead = new_agg_lead;
--
2.19.2
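The pattern behind this fix can be shown in isolation: when one of two pointers may be NULL, take the shared context from the one that is known to be valid. The types and helpers below (`struct ctx`, `struct port`, `agg_id_of()`, `format_rename()`) are hypothetical stand-ins, not the teamd structures, and the NULL-tolerant id helper is an assumption made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the daemon context and a LACP port. */
struct ctx {
	int debug;
};

struct port {
	struct ctx *ctx;
	unsigned int agg_id;
};

/* Assumed to tolerate NULL and report 0 for "no aggregator". */
static unsigned int agg_id_of(const struct port *p)
{
	return p ? p->agg_id : 0;
}

/* Formats the rename message into buf. The context comes from agg_lead,
 * which the caller guarantees is non-NULL; new_agg_lead may be NULL.
 * Returns the message length, or 0 when debug output is disabled. */
static int format_rename(const struct port *agg_lead,
			 const struct port *new_agg_lead,
			 char *buf, size_t len)
{
	assert(agg_lead != NULL && agg_lead->ctx != NULL);
	if (agg_lead->ctx->debug < 1)
		return 0;
	return snprintf(buf, len, "Renaming aggregator %u to %u",
			agg_id_of(agg_lead), agg_id_of(new_agg_lead));
}
```

Calling `format_rename(&lead, NULL, buf, sizeof(buf))` is safe in this sketch because nothing is read through the second pointer before the NULL check in `agg_id_of()`.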


[libteam PATCH] teamd: fix build error in expansion of macro teamd_log_dbgx

by Hangbin Liu

With gcc 8.3 I got the following build error:

In file included from teamd_dbus.c:33:
teamd_dbus.c: In function 'teamd_dbus_init':
teamd.h:54:2: error: expected expression before 'if'
  if (val <= ctx->debug)   \
  ^~
teamd.h:57:37: note: in expansion of macro 'teamd_log_dbgx'
 #define teamd_log_dbg(ctx, args...) teamd_log_dbgx(ctx, 1, ##args)
                                     ^~~~~~~~~~~~~~
teamd_dbus.c:507:2: note: in expansion of macro 'teamd_log_dbg'
  teamd_log_dbg(ctx, "dbus: connected to %s with name %s", id,
  ^~~~~~~~~~~~~

Fix it by adding parentheses and braces around the content.

Fixes: f32310b9a5cc ("libteam: wapper teamd_log_dbg with teamd_log_dbgx")
Signed-off-by: Hangbin Liu <liuhangbin(a)gmail.com>
---
 teamd/teamd.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/teamd/teamd.h b/teamd/teamd.h
index 469b769..fb2872e 100644
--- a/teamd/teamd.h
+++ b/teamd/teamd.h
@@ -51,8 +51,7 @@
 #define teamd_log_info(args...) daemon_log(LOG_INFO, ##args)

 #define teamd_log_dbgx(ctx, val, args...)	\
-	if (val <= ctx->debug)			\
-		daemon_log(LOG_DEBUG, ##args)
+	({ if (val <= ctx->debug) daemon_log(LOG_DEBUG, ##args); })

 #define teamd_log_dbg(ctx, args...) teamd_log_dbgx(ctx, 1, ##args)

--
2.19.2
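A macro that expands to a bare `if` is fragile wherever the expansion is not a plain statement. The patch wraps the body in a GNU statement expression, `({ ... })`, which can also appear where an expression is expected. The sketch below shows a different, portable idiom, `do { ... } while (0)`, which only covers the statement case but also avoids the dangling-else problem. `LOG_DBGX`, `log_msg()`, `emitted` and `demo()` are hypothetical names, not the teamd macros.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

struct ctx {
	int debug;
};

static int emitted;  /* counts messages that passed the level check */

static void log_msg(const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	fputc('\n', stderr);
	emitted++;
}

/* Level-gated logging wrapped in do/while(0): the expansion is exactly one
 * statement, so it composes with unbraced if/else at the call site. */
#define LOG_DBGX(ctx, val, ...)				\
	do {						\
		if ((val) <= (ctx)->debug)		\
			log_msg(__VA_ARGS__);		\
	} while (0)

/* Returns how many of the calls below were emitted at the given debug level. */
static int demo(int debug_level)
{
	struct ctx c = { .debug = debug_level };

	emitted = 0;
	if (debug_level >= 0)
		LOG_DBGX(&c, 1, "level %d message", 1);
	else
		LOG_DBGX(&c, 2, "not reached for non-negative levels");
	LOG_DBGX(&c, 2, "level %d message", 2);
	return emitted;
}
```

With a bare-`if` macro, the `else` in `demo()` would bind to the macro's own `if` instead of the outer one; the wrapper prevents that.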


Query regarding ports arp monitoring

by petr wozniak

Hello,

Please allow me to ask the following question regarding ARP monitoring.

I have the following configuration: on one side is a Mikrotik router with two LTE interfaces, and on the other is a Debian Stretch server with Strongswan. Between the router and the server are two EoIP over IKEv2/IPsec tunnels, and on both sides the EoIP interfaces are put into logical devices in round-robin mode (a bond on the router's side and a team on the server's side). On the Debian server I have created kernel 4.19.0-eoip with the EoIP driver from https://github.com/bbonev/eoip .

When both LTE interfaces on the router are up, everything works without problems:

root@eoip:/home/ipsec# cat /etc/team0.conf
{
    "device": "team0",
    "runner": {"name": "roundrobin"},
    "link_watch": {
        "name": "arp_ping",
        "interval": 100,
        "missed_max": 30,
        "source_host": "10.50.1.1",
        "target_host": "10.50.1.2"
    },
    "ports": {"eoip57": {}, "eoip58": {}}
}root@eoip:/home/ipsec#

root@eoip:/home/ipsec# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet 10.57.10.1/32 brd 10.57.10.1 scope global lo
       valid_lft forever preferred_lft forever
    inet 10.58.10.1/32 brd 10.58.10.1 scope global lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:56:63:09 brd ff:ff:ff:ff:ff:ff
    inet 10.17.1.55/24 brd 10.17.1.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fe56:6309/64 scope link
       valid_lft forever preferred_lft forever
3: eoip57@NONE: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master team0 state UNKNOWN group default qlen 1000
    link/ether ea:49:65:b5:ca:ce brd ff:ff:ff:ff:ff:ff
4: eoip58@NONE: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master team0 state UNKNOWN group default qlen 1000
    link/ether ea:49:65:b5:ca:ce brd ff:ff:ff:ff:ff:ff
5: team0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 46:86:7c:5b:7c:53 brd ff:ff:ff:ff:ff:ff
    inet 10.50.1.1/24 scope global team0
       valid_lft forever preferred_lft forever
    inet6 fe80::4486:7cff:fe5b:7c53/64 scope link
       valid_lft forever preferred_lft forever

root@eoip:/home/ipsec#
Security Associations (2 up, 0 connecting):
  test-lte58[24]: ESTABLISHED 9 seconds ago, 10.17.1.55[10.17.1.55]...37.48.60.236[test-lte58.cz]
  test-lte58{3}: INSTALLED, TUNNEL, reqid 3, ESP in UDP SPIs: cb2d2ce6_i 0815d80d_o
  test-lte58{3}: 10.58.10.1/32 === 10.58.10.2/32
  test-lte57[2]: ESTABLISHED 6 minutes ago, 10.17.1.55[10.17.1.55]...37.48.35.104[test-lte57.cz]
  test-lte57{2}: INSTALLED, TUNNEL, reqid 2, ESP in UDP SPIs: cb1681bb_i 08519092_o
  test-lte57{2}: 10.57.10.1/32 === 10.57.10.2/32

root@eoip:/home/ipsec# ping 10.50.1.2
PING 10.50.1.2 (10.50.1.2) 56(84) bytes of data.
64 bytes from 10.50.1.2: icmp_seq=1 ttl=64 time=39.8 ms
64 bytes from 10.50.1.2: icmp_seq=2 ttl=64 time=37.8 ms
64 bytes from 10.50.1.2: icmp_seq=3 ttl=64 time=41.7 ms
64 bytes from 10.50.1.2: icmp_seq=4 ttl=64 time=38.4 ms
64 bytes from 10.50.1.2: icmp_seq=5 ttl=64 time=45.2 ms
^C
--- 10.50.1.2 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4015ms
rtt min/avg/max/mdev = 37.808/40.609/45.252/2.689 ms
root@eoip:/home/ipsec#

When one of the LTE interfaces on the router's side is disabled, one EoIP over IKEv2/IPsec tunnel goes down and arping on the disabled tunnel gets no response:

root@eoip:/home/ipsec# ipsec status
Security Associations (1 up, 0 connecting):
  test-lte57[2]: ESTABLISHED 20 minutes ago, 10.17.1.55[10.17.1.55]...37.48.35.104[test-lte57.cz]
  test-lte57{2}: INSTALLED, TUNNEL, reqid 2, ESP in UDP SPIs: cb1681bb_i 08519092_o
  test-lte57{2}: 10.57.10.1/32 === 10.57.10.2/32

root@eoip:/home/ipsec# arping 10.50.1.2 -I eoip57
ARPING 10.50.1.2 from 10.57.10.1 eoip57
Unicast reply from 10.50.1.2 [02:8F:77:A3:CA:D8] 70.984ms
Unicast reply from 10.50.1.2 [02:8F:77:A3:CA:D8] 58.267ms
Unicast reply from 10.50.1.2 [02:8F:77:A3:CA:D8] 51.903ms
^CSent 4 probes (1 broadcast(s))
Received 3 response(s)

root@eoip:/home/ipsec# arping 10.50.1.2 -I eoip58
ARPING 10.50.1.2 from 10.57.10.1 eoip58
^CSent 12 probes (12 broadcast(s))
Received 0 response(s)

root@eoip:/home/ipsec# arping 10.50.1.2 -I team0
ARPING 10.50.1.2 from 10.50.1.1 team0
Unicast reply from 10.50.1.2 [02:8F:77:A3:CA:D8] 37.101ms
Unicast reply from 10.50.1.2 [02:8F:77:A3:CA:D8] 37.653ms
^CSent 5 probes (1 broadcast(s))
Received 2 response(s)

My problem is that both ports of the team0 interface stay up:

root@eoip:/home/ipsec# teamdctl team0 state view -v
setup:
  runner: roundrobin
  kernel team mode: roundrobin
  D-BUS enabled: no
  ZeroMQ enabled: no
  debug level: 1
  daemonized: yes
  PID: 536
  PID file: /var/run/teamd/team0.pid
ports:
  eoip57
    ifindex: 3
    addr: 82:ae:24:87:56:9c
    ethtool link: 0mbit/halfduplex/up
    link watches:
      link summary: up
      instance[link_watch_0]:
        name: arp_ping
        link: up
        down count: 0
        source host: 10.50.1.1
        target host: 10.50.1.2
        interval: 100
        missed packets: 0/30
        validate_active: no
        validate_inactive: no
        send_always: no
        initial wait: 0
  eoip58
    ifindex: 4
    addr: 82:ae:24:87:56:9c
    ethtool link: 0mbit/halfduplex/up
    link watches:
      link summary: up
      instance[link_watch_0]:
        name: arp_ping
        link: up
        down count: 0
        source host: 10.50.1.1
        target host: 10.50.1.2
        interval: 100
        missed packets: 1/30    <-- this is sometimes 1, sometimes 0
        validate_active: no
        validate_inactive: no
        send_always: no
        initial wait: 0
root@eoip:/home/ipsec#

root@eoip:/home/ipsec# ping 10.50.1.2
PING 10.50.1.2 (10.50.1.2) 56(84) bytes of data.
64 bytes from 10.50.1.2: icmp_seq=3 ttl=64 time=43.8 ms
64 bytes from 10.50.1.2: icmp_seq=5 ttl=64 time=37.2 ms
64 bytes from 10.50.1.2: icmp_seq=9 ttl=64 time=55.1 ms
64 bytes from 10.50.1.2: icmp_seq=11 ttl=64 time=47.7 ms
^C
--- 10.50.1.2 ping statistics ---
11 packets transmitted, 4 received, 63% packet loss, time 10145ms
rtt min/avg/max/mdev = 37.248/46.010/55.179/6.493 ms
root@eoip:/home/ipsec#

When the second LTE interface is also disabled, both ports go down:

root@eoip:/home/ipsec# teamdctl team0 state view -v
setup:
  runner: roundrobin
  kernel team mode: roundrobin
  D-BUS enabled: no
  ZeroMQ enabled: no
  debug level: 1
  daemonized: yes
  PID: 535
  PID file: /var/run/teamd/team0.pid
ports:
  eoip57
    ifindex: 3
    addr: 2e:c3:04:11:56:04
    ethtool link: 0mbit/halfduplex/up
    link watches:
      link summary: down
      instance[link_watch_0]:
        name: arp_ping
        link: down
        down count: 1
        source host: 10.50.1.1
        target host: 10.50.1.2
        interval: 100
        missed packets: 36/30
        validate_active: no
        validate_inactive: no
        send_always: no
        initial wait: 0
  eoip58
    ifindex: 4
    addr: 2e:c3:04:11:56:04
    ethtool link: 0mbit/halfduplex/up
    link watches:
      link summary: down
      instance[link_watch_0]:
        name: arp_ping
        link: down
        down count: 1
        source host: 10.50.1.1
        target host: 10.50.1.2
        interval: 100
        missed packets: 37/30
        validate_active: no
        validate_inactive: no
        send_always: no
        initial wait: 0
root@eoip:/home/ipsec#

May I ask for your help in configuring ARP monitoring properly?

Thank you in advance.

Petr
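To make the "missed packets: N/30" counters in the output above easier to read, here is a rough model of a miss counter with a `missed_max` threshold. It is a sketch under assumptions, not the teamd implementation: `struct lw_state`, `lw_tick()` and `silent_ticks_until_down()` are hypothetical names, and the model assumes one probe per interval, a counter reset on any reply, and link down once the counter exceeds `missed_max`.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical per-port link-watch state. */
struct lw_state {
	unsigned int missed;      /* consecutive intervals without a reply */
	unsigned int missed_max;  /* threshold taken from the configuration */
	bool link_up;
};

/* Called once per interval with whether a reply was seen since the last probe. */
static void lw_tick(struct lw_state *lw, bool reply_seen)
{
	if (reply_seen) {
		lw->missed = 0;
		lw->link_up = true;
		return;
	}
	lw->missed++;
	if (lw->missed > lw->missed_max)
		lw->link_up = false;
}

/* Number of silent intervals before the modelled link is reported down. */
static unsigned int silent_ticks_until_down(unsigned int missed_max)
{
	struct lw_state lw = { .missed = 0, .missed_max = missed_max, .link_up = true };
	unsigned int ticks = 0;

	assert(missed_max < UINT_MAX);  /* otherwise the counter could never exceed it */
	while (lw.link_up) {
		lw_tick(&lw, false);
		ticks++;
	}
	return ticks;
}
```

Under these assumptions, `missed_max` 30 with a 100 ms interval means a port is reported down after 31 silent intervals, roughly 3.1 seconds. A counter that keeps returning to 0 or 1, as on eoip58 above, would mean that something is still being counted as a reply on that port.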
