Wed, Nov 13, 2019 at 02:26:47PM CET, petrm(a)mellanox.com wrote:
>On systems where carrier is gained very quickly, there is a race between
>teamd and the kernel that sometimes leads to all team slaves being stuck in
>When a port is enslaved to a team device, the kernel sends a netlink
>message marking the port as enabled. teamd's lb_event_watch_port_added()
>calls team_set_port_enabled(false), because link is down at that point. The
>kernel responds with a message marking the port as disabled. At this point,
>there are two outstanding messages: the initial one marking port as
>enabled, and the second one marking it as disabled. teamd has not processed
>either of these.
>Next teamd gets the netlink message that sets enabled=true, and updates its
>internal cache accordingly. If at this point ethtool link-watch wakes up,
>teamd considers (in teamd_port_check_enable()) enabling the port. After
>consulting the cache, it concludes the port is already up, and neglects to
>do so. Only then does teamd get the netlink message informing it of setting
>The problem is that the teamd cache is not synchronous with respect to the
>kernel state. If the carrier takes a while to come up (as is normally the
>case), this is not a problem, because teamd caches up quickly enough. But
>this may not always be the case, and particularly on a simulated system,
>the carrier is gained almost immediately.
>Fix this by not suppressing the enablement message.
>Signed-off-by: Petr Machata <petrm(a)mellanox.com>
When we add the first slave to team port, we will update ctx->ifindex
with new hwaddr in function
But we didn't update the ctx->hwaddr, which will cause the first added
slave set to team's init hwaddr again later. e.g. in the following functions
The tdport's hwaddr will be reset based on ctx->hwaddr. Fix it by updating
ctx->hwaddr when set ctx->ifindex to new hwaddr.
Note: function teamd_set_hwaddr() is not considered as it will set
ctx->hwaddr_explicit = true.
Signed-off-by: Hangbin Liu <liuhangbin(a)gmail.com>
teamd/teamd.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/teamd/teamd.c b/teamd/teamd.c
index 6c47312..9622da1 100644
@@ -867,7 +867,7 @@ static int teamd_add_ports(struct teamd_context *ctx)
static int teamd_hwaddr_check_change(struct teamd_context *ctx,
struct teamd_port *tdport)
- const char *hwaddr;
+ char *hwaddr;
unsigned char hwaddr_len;
@@ -885,6 +885,8 @@ static int teamd_hwaddr_check_change(struct teamd_context *ctx,
teamd_log_err("Failed to set team device hardware address.");
+ ctx->hwaddr = hwaddr;
+ ctx->hwaddr_len = hwaddr_len;
I'm not sure if I should split the patch or just post one directly. Please
tell me if you feel the commit message are repeated and want only one patch.
Recently some users reported that they start to see debug messages in their
syslogs even with daemon_verbosity_level = LOG_INFO and without -g option.
Actually this issue is there at the begining, the user would see the debug
messages if they run teamd with -d option. The reason that most users did
not notice this is because they are using libteam via NetworkManager, and
NetworkManager run libteam in frontend.
But after commit e47d5db53873 ("teamd: add an option to force log
output to stdout, stderr or syslog"), NetworkManager will set
TEAM_LOG_OUTPUT=syslog in the environment. At the same time libdaemon
does not filter log levels if we use syslog(see function daemon_logv in
libdaemon). Then all the users would see the debug messages suddenly and
And here is the quote for daemon_set_verbosity() from libdaemon/dlog.h
Allows to decide which messages to output on standard output/error
streams. All messages are logged to syslog and this setting does
not influence that.
Since we should not limit how our user(NM) used libteam. And libdaemon
is intend to not filter logs if use syslog. We'd better filter the
debug message ourselves, like via -g option. So I would prefer to
move all teamd_log_dbg to teamd_log_dbgx. After that, the user could
decide whether to enable debug or not by themselves with -g option.
Hangbin Liu (6):
teamd/teamd.c: move teamd_log_dbg to teamd_log_dbgx
teamd/teamd_runner_activebackup.c: move teamd_log_dbg to
teamd/teamd_balancer.c: move teamd_log_dbg to teamd_log_dbgx
teamd/teamd_runner_lacp.c: move teamd_log_dbg to teamd_log_dbgx
teamd/teamd_link_watch.c: move teamd_log_dbg to teamd_log_dbgx
teamd: move teamd_log_dbg to teamd_log_dbgx
teamd/teamd.c | 40 ++++++++++++++--------------
teamd/teamd_balancer.c | 21 ++++++++-------
teamd/teamd_dbus.c | 6 ++---
teamd/teamd_hash_func.c | 2 +-
teamd/teamd_link_watch.c | 14 +++++-----
teamd/teamd_lw_arp_ping.c | 12 ++++-----
teamd/teamd_lw_ethtool.c | 4 +--
teamd/teamd_lw_nsna_ping.c | 2 +-
teamd/teamd_lw_psr.c | 12 ++++-----
teamd/teamd_lw_tipc.c | 8 +++---
teamd/teamd_per_port.c | 6 ++---
teamd/teamd_runner_activebackup.c | 18 ++++++-------
teamd/teamd_runner_lacp.c | 44 +++++++++++++++----------------
teamd/teamd_usock.c | 12 ++++-----
teamd/teamd_zmq.c | 10 +++----
15 files changed, 107 insertions(+), 104 deletions(-)
I have a few questions regarding the mixing of port-speeds in link aggregates and how it is handled by teamd. It's about how teamd is fulfilling the requirement that a Link Aggregation Group should consists of N parallel instances of full duplex point-to-point links operating at the same data rate.
The most obvious case is when a link in a LAG has its speed changed (either manually by configuration or automatically by autoneg). In the current setup of teamd that we have (lacp runner), a link that has its speed lowered will still be a part of the aggregate, a set with three 100 MBit links and one 10 Mbit link will still be aggregated together (same applies for mixing with GBit ports as well of course).
My questions are:
Q1. Is there a way to configure teamd to automatically disable ports/links of lower port speed, e.g. through "agg_select_policy"?
Q2. If the answer to Q1 is no, I am hopeful about the option to use dynamic keys (see my test (3) below). Is there a reason for this not being a part of teamd? Maybe some side-effects that I have missed?
Q3. Or is it so that this problem is generally avoided manually, i.e. it is a implicit requirement to manually see to it that links with different speeds are never configured to be part of the same LAG?
Q4. Or is there a totally different solution to this that I have overlooked completely?
Some background to the experiments I have done:
I have tested three main approaches to see if I can get the low speed 10 Mbit link to be disabled from the LAG:
1. I have tested with setting the parameter "agg_select_policy" to "bandwidth" for the lacp runner in the hope that agg_lead would be set differently and automatically disable the low speed 10 Mbit link. But it does not.
2. I wrote a patch that actively disables ports that have a lower speed than the highest in the LAG. It calls a evaluation function from for example lacp_port_link_update and lacpdu_recv to iteratate over all links and disable those that of lower speed. The code is located here (branch wmo/mismatch commit 4b1eca47624c6dc39a59efebcc6480ed5b33f47b):
I'm not entirely happy with this patch, but is food for the discussion though.
3. I have tried to see if operational key could be a solution. As 802.1ax states in clause 5.6.2 "Dynamic allocation of operational Keys":
"In the course of normal operation a port can dynamically change its operating characteristics (e.g., data rate, full or half duplex operation). It is permissible (and appropriate) for the operational Key value associated with such a port to change with the corresponding changes in the operating characteristics of the link, so that the operational Key value always correctly reflects the aggregation capability of the link. Operational Key changes that reflect such dynamic changes in the operating characteristics of a link may be made by either System without restriction."
Since it seems that teamd does not actively set key on the outgoing lacp frames, I have tried out code that sets lacp_port->actor.key in lacp_port_link_update() based on the speed of the port. This in turn means that lacp_ports_aggregable at the partner will in fact rule out lower priority links/ports as expected. It seems to work very well in fact. Is this a plausible way forward?
Thanks for your time, and all feedback is appreciated.
Best Regard, Leif