December 2019 - libteam - Fedora Mailing-Lists

Re: [PATCH] teamd: Disregard current state when considering port enablement

by Jiri Pirko

Wed, Nov 13, 2019 at 02:26:47PM CET, petrm(a)mellanox.com wrote:
>On systems where carrier is gained very quickly, there is a race between
>teamd and the kernel that sometimes leads to all team slaves being stuck in
>enabled=false state.
>
>When a port is enslaved to a team device, the kernel sends a netlink
>message marking the port as enabled. teamd's lb_event_watch_port_added()
>calls team_set_port_enabled(false), because link is down at that point. The
>kernel responds with a message marking the port as disabled. At this point,
>there are two outstanding messages: the initial one marking port as
>enabled, and the second one marking it as disabled. teamd has not processed
>either of these.
>
>Next teamd gets the netlink message that sets enabled=true, and updates its
>internal cache accordingly. If at this point ethtool link-watch wakes up,
>teamd considers (in teamd_port_check_enable()) enabling the port. After
>consulting the cache, it concludes the port is already up, and neglects to
>do so. Only then does teamd get the netlink message informing it of setting
>enabled=false.
>
>The problem is that the teamd cache is not synchronous with respect to the
>kernel state. If the carrier takes a while to come up (as is normally the
>case), this is not a problem, because teamd catches up quickly enough. But
>this may not always be the case, and particularly on a simulated system,
>the carrier is gained almost immediately.
>
>Fix this by not suppressing the enablement message.
>
>Signed-off-by: Petr Machata <petrm(a)mellanox.com>

applied. Thanks!
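The ordering problem described in the quoted commit message can be replayed in a small model. This is only a sketch under stated assumptions: the struct and function names below are hypothetical and do not exist in libteam, the kernel is reduced to a single boolean, and netlink is reduced to a FIFO of pending notifications. It shows why trusting the cache loses the enable request, and why always sending it does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: none of these names exist in libteam. */
#define QUEUE_MAX 8

struct model {
	bool kernel_enabled;      /* authoritative state in the kernel */
	bool cache_enabled;       /* the daemon's possibly stale view */
	bool queue[QUEUE_MAX];    /* notifications not yet processed */
	size_t head, tail;
};

/* The kernel applies a change at once and queues a notification. */
static void kernel_set_enabled(struct model *m, bool enabled)
{
	assert(m->tail < QUEUE_MAX);
	m->kernel_enabled = enabled;
	m->queue[m->tail++] = enabled;
}

/* The daemon processes one pending notification and updates its cache. */
static void daemon_process_one(struct model *m)
{
	if (m->head < m->tail)
		m->cache_enabled = m->queue[m->head++];
}

/* Link came up: enable the port, optionally trusting the cache. */
static void daemon_link_up(struct model *m, bool trust_cache)
{
	if (trust_cache && m->cache_enabled)
		return;         /* old behaviour: "already enabled" */
	kernel_set_enabled(m, true);
}

/* Replays the sequence from the commit message; returns final kernel state. */
bool replay_race(bool trust_cache)
{
	struct model m = { 0 };

	kernel_set_enabled(&m, true);    /* port enslaved: kernel enables it */
	kernel_set_enabled(&m, false);   /* daemon disables it, link is down */
	daemon_process_one(&m);          /* cache now says enabled=true */
	daemon_link_up(&m, trust_cache); /* link watch wakes up here */
	while (m.head < m.tail)          /* remaining messages arrive */
		daemon_process_one(&m);
	return m.kernel_enabled;
}
```

In this model, replay_race(true) leaves the port disabled in the kernel, while replay_race(false) leaves it enabled, which matches the behaviour the patch is after.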

3 years, 7 months


[libteam PATCH] teamd/lacp: fix segfault due to NULL pointer dereference

by Hangbin Liu

If we set the team0 link down in lacp mode, the call chain is:

  lacp_port_agg_unselect()
    - lacp_switch_agg_lead()
      - teamd_log_dbg()

The new_agg_lead passed to lacp_switch_agg_lead() may be NULL, so we get
a NULL pointer dereference when the new teamd_log_dbg() reads
new_agg_lead->ctx.

Fix it by using agg_lead->ctx, which is safe because agg_lead is already
dereferenced in lacp_switch_agg_lead().

Fixes: f32310b9a5cc ("libteam: wapper teamd_log_dbg with teamd_log_dbgx")
Signed-off-by: Hangbin Liu <liuhangbin(a)gmail.com>
---
 teamd/teamd_runner_lacp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/teamd/teamd_runner_lacp.c b/teamd/teamd_runner_lacp.c
index 7d940b3..ec01237 100644
--- a/teamd/teamd_runner_lacp.c
+++ b/teamd/teamd_runner_lacp.c
@@ -634,7 +634,7 @@ static void lacp_switch_agg_lead(struct lacp_port *agg_lead,
 	struct teamd_port *tdport;
 	struct lacp_port *lacp_port;
 
-	teamd_log_dbg(new_agg_lead->ctx, "Renaming aggregator %u to %u",
+	teamd_log_dbg(agg_lead->ctx, "Renaming aggregator %u to %u",
 		      lacp_agg_id(agg_lead), lacp_agg_id(new_agg_lead));
 	if (lacp->selected_agg_lead == agg_lead)
 		lacp->selected_agg_lead = new_agg_lead;
-- 
2.19.2
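The idea behind the fix, taking the context from the pointer that is known to be valid, can be sketched in isolation. The types and helpers below are hypothetical stand-ins, not the teamd structures, and the sketch assumes that the ID helper tolerates a NULL port by returning 0; whether the real lacp_agg_id() does so is not stated in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the daemon's structures. */
struct ctx { int debug; };
struct port { struct ctx *ctx; unsigned int agg_id; };

/* Assumption of this sketch: 0 means "no aggregator", so NULL is accepted. */
static unsigned int agg_id(const struct port *p)
{
	return p ? p->agg_id : 0;
}

/*
 * Format the rename message. The context is taken only from old_lead,
 * which the caller guarantees to be non-NULL; new_lead may be NULL.
 * Returns the message length, or 0 if debugging is off.
 */
int format_rename(char *buf, size_t len,
		  const struct port *old_lead, const struct port *new_lead)
{
	assert(old_lead != NULL);
	if (old_lead->ctx->debug < 1)
		return 0;
	return snprintf(buf, len, "Renaming aggregator %u to %u",
			agg_id(old_lead), agg_id(new_lead));
}
```

Calling format_rename() with a NULL new_lead never touches new_lead->ctx, so the dereference that crashed in the original code path cannot occur here.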

4 years, 3 months


[libteam PATCH] teamd: fix build error in expansion of macro teamd_log_dbgx

by Hangbin Liu

With gcc 8.3 I got the following build error:

In file included from teamd_dbus.c:33:
teamd_dbus.c: In function 'teamd_dbus_init':
teamd.h:54:2: error: expected expression before 'if'
  if (val <= ctx->debug)   \
  ^~
teamd.h:57:37: note: in expansion of macro 'teamd_log_dbgx'
 #define teamd_log_dbg(ctx, args...) teamd_log_dbgx(ctx, 1, ##args)
                                     ^~~~~~~~~~~~~~
teamd_dbus.c:507:2: note: in expansion of macro 'teamd_log_dbg'
  teamd_log_dbg(ctx, "dbus: connected to %s with name %s", id,
  ^~~~~~~~~~~~~

Fix it by adding parentheses and braces around the content.

Fixes: f32310b9a5cc ("libteam: wapper teamd_log_dbg with teamd_log_dbgx")
Signed-off-by: Hangbin Liu <liuhangbin(a)gmail.com>
---
 teamd/teamd.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/teamd/teamd.h b/teamd/teamd.h
index 469b769..fb2872e 100644
--- a/teamd/teamd.h
+++ b/teamd/teamd.h
@@ -51,8 +51,7 @@
 #define teamd_log_info(args...) daemon_log(LOG_INFO, ##args)
 
 #define teamd_log_dbgx(ctx, val, args...)	\
-	if (val <= ctx->debug)			\
-		daemon_log(LOG_DEBUG, ##args)
+	({ if (val <= ctx->debug) daemon_log(LOG_DEBUG, ##args); })
 
 #define teamd_log_dbg(ctx, args...) teamd_log_dbgx(ctx, 1, ##args)
 
-- 
2.19.2
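The ({ ... }) form used in the patch is a GNU C statement expression, supported by gcc and clang but not part of ISO C. The sketch below shows why it helps when a macro has to expand somewhere an expression is required, and contrasts it with the common do { } while (0) idiom, which only works as a statement. The names log_dbgx, log_dbgx_stmt, sink and struct ctx are hypothetical and only mirror the shape of the teamd macro; the extra parentheses around the macro arguments are an addition of this sketch.

```c
#include <assert.h>
#include <stdio.h>

struct ctx { int debug; };   /* hypothetical stand-in for the daemon context */

static int emitted;          /* counts messages that passed the filter */

static void sink(const char *msg)
{
	emitted++;
	fprintf(stderr, "%s\n", msg);
}

/* GNU statement expression: usable where an expression is required. */
#define log_dbgx(ctx, val, msg) \
	({ if ((val) <= (ctx)->debug) sink(msg); })

/* Statement-only alternative; it cannot appear inside an expression. */
#define log_dbgx_stmt(ctx, val, msg) \
	do { if ((val) <= (ctx)->debug) sink(msg); } while (0)

/* Returns how many of the two messages were emitted at this debug level. */
int demo(int debug_level)
{
	struct ctx c = { debug_level };
	int marker;

	emitted = 0;
	/* Comma expression: a macro expanding to a bare `if` fails here. */
	marker = (log_dbgx(&c, 1, "level 1"), 1);
	assert(marker == 1);
	log_dbgx_stmt(&c, 2, "level 2");
	return emitted;
}
```

With this sketch, demo(0) emits nothing, demo(1) emits only the level 1 message, and demo(2) emits both.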

4 years, 3 months


Query regarding ports arp monitoring

by petr wozniak

Hello,

Please allow me to ask the following question regarding ARP monitoring.

I have the following configuration: on one side is a Mikrotik router with two LTE interfaces, and on the other side is a Debian Stretch server with Strongswan. Between the router and the server are two EoIP over IKEv2/IPsec tunnels, and on both sides the EoIP interfaces are put into logical devices in round-robin mode (a bond on the router side and a team on the server side). On the Debian server I have built kernel 4.19.0-eoip with the EoIP driver from https://github.com/bbonev/eoip .

When both LTE interfaces on the router are up, everything works without problems:

root@eoip:/home/ipsec# cat /etc/team0.conf
{
    "device": "team0",
    "runner": {"name": "roundrobin"},
    "link_watch": {
        "name": "arp_ping",
        "interval": 100,
        "missed_max": 30,
        "source_host": "10.50.1.1",
        "target_host": "10.50.1.2"
    },
    "ports": {"eoip57": {}, "eoip58": {}}
}

root@eoip:/home/ipsec# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet 10.57.10.1/32 brd 10.57.10.1 scope global lo
       valid_lft forever preferred_lft forever
    inet 10.58.10.1/32 brd 10.58.10.1 scope global lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:56:63:09 brd ff:ff:ff:ff:ff:ff
    inet 10.17.1.55/24 brd 10.17.1.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fe56:6309/64 scope link
       valid_lft forever preferred_lft forever
3: eoip57@NONE: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master team0 state UNKNOWN group default qlen 1000
    link/ether ea:49:65:b5:ca:ce brd ff:ff:ff:ff:ff:ff
4: eoip58@NONE: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master team0 state UNKNOWN group default qlen 1000
    link/ether ea:49:65:b5:ca:ce brd ff:ff:ff:ff:ff:ff
5: team0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 46:86:7c:5b:7c:53 brd ff:ff:ff:ff:ff:ff
    inet 10.50.1.1/24 scope global team0
       valid_lft forever preferred_lft forever
    inet6 fe80::4486:7cff:fe5b:7c53/64 scope link
       valid_lft forever preferred_lft forever

root@eoip:/home/ipsec#
Security Associations (2 up, 0 connecting):
  test-lte58[24]: ESTABLISHED 9 seconds ago, 10.17.1.55[10.17.1.55]...37.48.60.236[test-lte58.cz]
  test-lte58{3}:  INSTALLED, TUNNEL, reqid 3, ESP in UDP SPIs: cb2d2ce6_i 0815d80d_o
  test-lte58{3}:   10.58.10.1/32 === 10.58.10.2/32
  test-lte57[2]: ESTABLISHED 6 minutes ago, 10.17.1.55[10.17.1.55]...37.48.35.104[test-lte57.cz]
  test-lte57{2}:  INSTALLED, TUNNEL, reqid 2, ESP in UDP SPIs: cb1681bb_i 08519092_o
  test-lte57{2}:   10.57.10.1/32 === 10.57.10.2/32

root@eoip:/home/ipsec# ping 10.50.1.2
PING 10.50.1.2 (10.50.1.2) 56(84) bytes of data.
64 bytes from 10.50.1.2: icmp_seq=1 ttl=64 time=39.8 ms
64 bytes from 10.50.1.2: icmp_seq=2 ttl=64 time=37.8 ms
64 bytes from 10.50.1.2: icmp_seq=3 ttl=64 time=41.7 ms
64 bytes from 10.50.1.2: icmp_seq=4 ttl=64 time=38.4 ms
64 bytes from 10.50.1.2: icmp_seq=5 ttl=64 time=45.2 ms
^C
--- 10.50.1.2 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4015ms
rtt min/avg/max/mdev = 37.808/40.609/45.252/2.689 ms

When one of the LTE interfaces on the router side is disabled, one EoIP over IKEv2/IPsec tunnel goes down and arping on the disabled tunnel gets no response:

root@eoip:/home/ipsec# ipsec status
Security Associations (1 up, 0 connecting):
  test-lte57[2]: ESTABLISHED 20 minutes ago, 10.17.1.55[10.17.1.55]...37.48.35.104[test-lte57.cz]
  test-lte57{2}:  INSTALLED, TUNNEL, reqid 2, ESP in UDP SPIs: cb1681bb_i 08519092_o
  test-lte57{2}:   10.57.10.1/32 === 10.57.10.2/32

root@eoip:/home/ipsec# arping 10.50.1.2 -I eoip57
ARPING 10.50.1.2 from 10.57.10.1 eoip57
Unicast reply from 10.50.1.2 [02:8F:77:A3:CA:D8]  70.984ms
Unicast reply from 10.50.1.2 [02:8F:77:A3:CA:D8]  58.267ms
Unicast reply from 10.50.1.2 [02:8F:77:A3:CA:D8]  51.903ms
^CSent 4 probes (1 broadcast(s))
Received 3 response(s)

root@eoip:/home/ipsec# arping 10.50.1.2 -I eoip58
ARPING 10.50.1.2 from 10.57.10.1 eoip58
^CSent 12 probes (12 broadcast(s))
Received 0 response(s)

root@eoip:/home/ipsec# arping 10.50.1.2 -I team0
ARPING 10.50.1.2 from 10.50.1.1 team0
Unicast reply from 10.50.1.2 [02:8F:77:A3:CA:D8]  37.101ms
Unicast reply from 10.50.1.2 [02:8F:77:A3:CA:D8]  37.653ms
^CSent 5 probes (1 broadcast(s))
Received 2 response(s)

My problem is that both ports of the team0 interface stay up:

root@eoip:/home/ipsec# teamdctl team0 state view -v
setup:
  runner: roundrobin
  kernel team mode: roundrobin
  D-BUS enabled: no
  ZeroMQ enabled: no
  debug level: 1
  daemonized: yes
  PID: 536
  PID file: /var/run/teamd/team0.pid
ports:
  eoip57
    ifindex: 3
    addr: 82:ae:24:87:56:9c
    ethtool link: 0mbit/halfduplex/up
    link watches:
      link summary: up
      instance[link_watch_0]:
        name: arp_ping
        link: up
        down count: 0
        source host: 10.50.1.1
        target host: 10.50.1.2
        interval: 100
        missed packets: 0/30
        validate_active: no
        validate_inactive: no
        send_always: no
        initial wait: 0
  eoip58
    ifindex: 4
    addr: 82:ae:24:87:56:9c
    ethtool link: 0mbit/halfduplex/up
    link watches:
      link summary: up
      instance[link_watch_0]:
        name: arp_ping
        link: up
        down count: 0
        source host: 10.50.1.1
        target host: 10.50.1.2
        interval: 100
        missed packets: 1/30    <- here it is sometimes 1, sometimes 0
        validate_active: no
        validate_inactive: no
        send_always: no
        initial wait: 0

root@eoip:/home/ipsec# ping 10.50.1.2
PING 10.50.1.2 (10.50.1.2) 56(84) bytes of data.
64 bytes from 10.50.1.2: icmp_seq=3 ttl=64 time=43.8 ms
64 bytes from 10.50.1.2: icmp_seq=5 ttl=64 time=37.2 ms
64 bytes from 10.50.1.2: icmp_seq=9 ttl=64 time=55.1 ms
64 bytes from 10.50.1.2: icmp_seq=11 ttl=64 time=47.7 ms
^C
--- 10.50.1.2 ping statistics ---
11 packets transmitted, 4 received, 63% packet loss, time 10145ms
rtt min/avg/max/mdev = 37.248/46.010/55.179/6.493 ms

When the second LTE interface is also disabled, both ports go down:

root@eoip:/home/ipsec# teamdctl team0 state view -v
setup:
  runner: roundrobin
  kernel team mode: roundrobin
  D-BUS enabled: no
  ZeroMQ enabled: no
  debug level: 1
  daemonized: yes
  PID: 535
  PID file: /var/run/teamd/team0.pid
ports:
  eoip57
    ifindex: 3
    addr: 2e:c3:04:11:56:04
    ethtool link: 0mbit/halfduplex/up
    link watches:
      link summary: down
      instance[link_watch_0]:
        name: arp_ping
        link: down
        down count: 1
        source host: 10.50.1.1
        target host: 10.50.1.2
        interval: 100
        missed packets: 36/30
        validate_active: no
        validate_inactive: no
        send_always: no
        initial wait: 0
  eoip58
    ifindex: 4
    addr: 2e:c3:04:11:56:04
    ethtool link: 0mbit/halfduplex/up
    link watches:
      link summary: down
      instance[link_watch_0]:
        name: arp_ping
        link: down
        down count: 1
        source host: 10.50.1.1
        target host: 10.50.1.2
        interval: 100
        missed packets: 37/30
        validate_active: no
        validate_inactive: no
        send_always: no
        initial wait: 0

Could you please help me configure ARP monitoring properly?

Thank you in advance.
Petr

4 years, 3 months


[libteam PATCH] teamd: update ctx->hwaddr after setting ctx->ifindex to new hwaddr

by Hangbin Liu

When we add the first slave to the team device, we set the device at
ctx->ifindex to the new hwaddr in teamd_event_watch_port_added() -
teamd_hwaddr_check_change(), but we do not update ctx->hwaddr. As a
result, the first added slave is later set back to the team's initial
hwaddr, e.g. in the following functions, which reset the tdport's hwaddr
based on ctx->hwaddr:

  lacp_port_set_mac()
  lb_event_watch_port_added()
  ab_hwaddr_policy_same_all_port_added()

Fix it by updating ctx->hwaddr when setting ctx->ifindex to the new
hwaddr.

Note: teamd_set_hwaddr() is not considered, as it sets
ctx->hwaddr_explicit = true.

Signed-off-by: Hangbin Liu <liuhangbin(a)gmail.com>
---
 teamd/teamd.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/teamd/teamd.c b/teamd/teamd.c
index 6c47312..9622da1 100644
--- a/teamd/teamd.c
+++ b/teamd/teamd.c
@@ -867,7 +867,7 @@ static int teamd_add_ports(struct teamd_context *ctx)
 static int teamd_hwaddr_check_change(struct teamd_context *ctx,
 				     struct teamd_port *tdport)
 {
-	const char *hwaddr;
+	char *hwaddr;
 	unsigned char hwaddr_len;
 	int err;
 
@@ -885,6 +885,8 @@ static int teamd_hwaddr_check_change(struct teamd_context *ctx,
 		teamd_log_err("Failed to set team device hardware address.");
 		return err;
 	}
+	ctx->hwaddr = hwaddr;
+	ctx->hwaddr_len = hwaddr_len;
 	return 0;
 }
 
-- 
2.19.2
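The stale-cache effect described in this commit message can be illustrated with a small model. This is a sketch under stated assumptions: struct dev, struct ctx and the three functions are hypothetical, addresses are fixed at six bytes, and the cache is updated by copying bytes, whereas the patch itself assigns the pointer and length in the context.

```c
#include <assert.h>
#include <string.h>

#define HWADDR_LEN 6

/* Hypothetical model of the daemon context and its devices. */
struct dev { unsigned char hwaddr[HWADDR_LEN]; };
struct ctx {
	struct dev team;                     /* the team device itself */
	unsigned char hwaddr[HWADDR_LEN];    /* cached copy of team.hwaddr */
};

/* First port added: the team device adopts the port's address. */
void team_adopt_port_hwaddr(struct ctx *ctx, const struct dev *port,
			    int update_cache)
{
	assert(ctx != NULL && port != NULL);
	memcpy(ctx->team.hwaddr, port->hwaddr, HWADDR_LEN);
	if (update_cache)   /* the step the patch adds */
		memcpy(ctx->hwaddr, port->hwaddr, HWADDR_LEN);
}

/* Runner hook: force the port to the address recorded in the cache. */
void runner_port_added(const struct ctx *ctx, struct dev *port)
{
	memcpy(port->hwaddr, ctx->hwaddr, HWADDR_LEN);
}

/* Returns 1 if the port keeps its own address after both steps. */
int port_keeps_hwaddr(int update_cache)
{
	struct ctx ctx = { .team = { { 0x02, 0, 0, 0, 0, 0x01 } },
			   .hwaddr = { 0x02, 0, 0, 0, 0, 0x01 } };
	struct dev port = { { 0x02, 0, 0, 0, 0, 0xaa } };
	const struct dev original = port;

	team_adopt_port_hwaddr(&ctx, &port, update_cache);
	runner_port_added(&ctx, &port);
	return memcmp(port.hwaddr, original.hwaddr, HWADDR_LEN) == 0;
}
```

Without the cache update, the runner hook writes the team's initial address back onto the port; with it, the port and the team device end up with the same, new address.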

4 years, 4 months


[libteam PATCH 0/6] move all teamd_log_dbg to teamd_log_dbgx

by Hangbin Liu

Hi Jiri,

I'm not sure whether I should split the patch or just post a single one.
Please tell me if you feel the commit messages are repetitive and want
only one patch.

Recently some users reported that they started to see debug messages in
their syslogs even with daemon_verbosity_level = LOG_INFO and without the
-g option. Actually this issue has been there from the beginning: users
would see the debug messages if they ran teamd with the -d option.

Most users did not notice this because they use libteam via
NetworkManager, and NetworkManager runs libteam in the foreground. But
after commit e47d5db53873 ("teamd: add an option to force log output to
stdout, stderr or syslog"), NetworkManager sets TEAM_LOG_OUTPUT=syslog in
the environment. At the same time, libdaemon does not filter log levels
when syslog is used (see function daemon_logv in libdaemon). So all users
suddenly see the debug messages and find them annoying.

Here is the quote for daemon_set_verbosity() from libdaemon/dlog.h:

"""
Allows to decide which messages to output on standard output/error
streams. All messages are logged to syslog and this setting does
not influence that.
"""

We should not limit how our user (NM) uses libteam, and libdaemon
intentionally does not filter logs when syslog is used, so we had better
filter the debug messages ourselves, via the -g option.

So I would prefer to move all teamd_log_dbg to teamd_log_dbgx. After
that, users can decide for themselves whether to enable debug output
with the -g option.

Hangbin Liu (6):
  teamd/teamd.c: move teamd_log_dbg to teamd_log_dbgx
  teamd/teamd_runner_activebackup.c: move teamd_log_dbg to
    teamd_log_dbgx
  teamd/teamd_balancer.c: move teamd_log_dbg to teamd_log_dbgx
  teamd/teamd_runner_lacp.c: move teamd_log_dbg to teamd_log_dbgx
  teamd/teamd_link_watch.c: move teamd_log_dbg to teamd_log_dbgx
  teamd: move teamd_log_dbg to teamd_log_dbgx

 teamd/teamd.c                     | 40 ++++++++++++++--------------
 teamd/teamd_balancer.c            | 21 ++++++++-------
 teamd/teamd_dbus.c                |  6 ++---
 teamd/teamd_hash_func.c           |  2 +-
 teamd/teamd_link_watch.c          | 14 +++++-----
 teamd/teamd_lw_arp_ping.c         | 12 ++++-----
 teamd/teamd_lw_ethtool.c          |  4 +--
 teamd/teamd_lw_nsna_ping.c        |  2 +-
 teamd/teamd_lw_psr.c              | 12 ++++-----
 teamd/teamd_lw_tipc.c             |  8 +++---
 teamd/teamd_per_port.c            |  6 ++---
 teamd/teamd_runner_activebackup.c | 18 ++++++-------
 teamd/teamd_runner_lacp.c         | 44 +++++++++++++++----------------
 teamd/teamd_usock.c               | 12 ++++-----
 teamd/teamd_zmq.c                 | 10 +++----
 15 files changed, 107 insertions(+), 104 deletions(-)

-- 
2.19.2
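The approach argued for in this cover letter, filtering in the application because the logging backend forwards everything, can be sketched as follows. All names are hypothetical: backend_log() stands in for a backend that does not filter by level (as the letter describes for libdaemon's syslog path), and struct ctx assumes the debug level is a plain integer raised by -g.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

struct ctx { int debug; };   /* hypothetical; mirrors the -g debug level */

static int forwarded;        /* messages handed to the logging backend */

/* Stand-in for a backend that never filters by level. */
static void backend_log(const char *fmt, va_list ap)
{
	forwarded++;
	vfprintf(stderr, fmt, ap);
	fputc('\n', stderr);
}

/* Application-side gate: drop the message unless debug is high enough. */
void log_dbgx(const struct ctx *ctx, int level, const char *fmt, ...)
{
	va_list ap;

	assert(ctx != NULL && fmt != NULL);
	if (level > ctx->debug)
		return;
	va_start(ap, fmt);
	backend_log(fmt, ap);
	va_end(ap);
}

/* Logs one message at level 1 and one at level 2; returns how many passed. */
int count_forwarded(int debug)
{
	struct ctx c = { debug };

	forwarded = 0;
	log_dbgx(&c, 1, "port %s added", "eth0");
	log_dbgx(&c, 2, "verbose detail %d", 42);
	return forwarded;
}
```

Because the gate sits in front of the backend, the result is the same whether the backend writes to stderr or to syslog: with a debug level of 0 nothing is forwarded at all.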

4 years, 4 months

