Re: [PATCH] teamd: Disregard current state when considering port
enablement
by Jiri Pirko
Wed, Nov 13, 2019 at 02:26:47PM CET, petrm(a)mellanox.com wrote:
>On systems where carrier is gained very quickly, there is a race between
>teamd and the kernel that sometimes leads to all team slaves being stuck in
>enabled=false state.
>
>When a port is enslaved to a team device, the kernel sends a netlink
>message marking the port as enabled. teamd's lb_event_watch_port_added()
>calls team_set_port_enabled(false), because link is down at that point. The
>kernel responds with a message marking the port as disabled. At this point,
>there are two outstanding messages: the initial one marking port as
>enabled, and the second one marking it as disabled. teamd has not processed
>either of these.
>
>Next teamd gets the netlink message that sets enabled=true, and updates its
>internal cache accordingly. If at this point ethtool link-watch wakes up,
>teamd considers (in teamd_port_check_enable()) enabling the port. After
>consulting the cache, it concludes the port is already up, and neglects to
>do so. Only then does teamd get the netlink message informing it of setting
>enabled=false.
>
>The problem is that the teamd cache is not synchronous with respect to the
>kernel state. If the carrier takes a while to come up (as is normally the
>case), this is not a problem, because teamd caches up quickly enough. But
>this may not always be the case, and particularly on a simulated system,
>the carrier is gained almost immediately.
>
>Fix this by not suppressing the enablement message.
>
>Signed-off-by: Petr Machata <petrm(a)mellanox.com>
applied. Thanks!
2 years, 9 months
[PATCH] Skip setting the same hwaddr to a lag port if not needed
by Pavel Shirshov
Avoid setting the same mac address to a LAG port,
if the LAG port already has the right one.
Signed-off-by: Pavel Shirshov <pavel.contrib(a)gmail.com>
---
teamd/teamd.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/teamd/teamd.c b/teamd/teamd.c
index 8cdc16d..f955b19 100644
--- a/teamd/teamd.c
+++ b/teamd/teamd.c
@@ -837,7 +837,14 @@ static int teamd_set_hwaddr(struct teamd_context *ctx)
err = -EINVAL;
goto free_hwaddr;
}
- err = team_hwaddr_set(ctx->th, ctx->ifindex, hwaddr, hwaddr_len);
+
+ if (memcmp(hwaddr, ctx->hwaddr, hwaddr_len))
+ err = team_hwaddr_set(ctx->th, ctx->ifindex, hwaddr, hwaddr_len);
+ else {
+ err = 0;
+ teamd_log_dbg(ctx, "Skip setting same hwaddr string: \"%s\".", hwaddr_str);
+ }
+
if (!err)
ctx->hwaddr_explicit = true;
free_hwaddr:
--
2.7.4
3 years, 1 month
[PATCH] Send LACP PDU right after the Actor state has been changed
by Pavel Shirshov
According to the LACP standard "LACP daemon must send LACP PDU
packets with updates when the Actor's state changes or when it is
apparent from the Partner's LACPDUs that the Partner does not know
the Actor's current state."
The current libteam implementation sends periodic updates only, so
in LACP slow rate mode the update will be transmitted within 30 seconds
which doesn't follow the LACP standard.
To fix the issue:
1. Revert the patch:
b2de61b39696 ("Fix sending duplicate LACP frames at the start of establishing a logical channel.")
2. Recalculate LACP actor state before the comparison what the LACP
Partner thinks about the LACP Actor state.
Signed-off-by: Pavel Shirshov <pavel.contrib(a)gmail.com>
---
teamd/teamd_runner_lacp.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/teamd/teamd_runner_lacp.c b/teamd/teamd_runner_lacp.c
index ec01237..11d02f1 100644
--- a/teamd/teamd_runner_lacp.c
+++ b/teamd/teamd_runner_lacp.c
@@ -996,8 +996,7 @@ static int lacp_port_set_state(struct lacp_port *lacp_port,
return err;
lacp_port_actor_update(lacp_port);
- if (lacp_port->periodic_on)
- return 0;
+
return lacpdu_send(lacp_port);
}
@@ -1110,9 +1109,10 @@ static int lacpdu_recv(struct lacp_port *lacp_port)
if (err)
return err;
+ lacp_port_actor_update(lacp_port);
+
/* Check if the other side has correct info about us */
- if (!lacp_port->periodic_on &&
- memcmp(&lacpdu.partner, &lacp_port->actor,
+ if (memcmp(&lacpdu.partner, &lacp_port->actor,
sizeof(struct lacpdu_info))) {
err = lacpdu_send(lacp_port);
if (err)
--
2.7.4
3 years, 1 month
[PATCH] Make all netlink socket RCVBUF sizes configurable
by Pavel Shirshov
libteam often losses netlink messages on linux devices
with multiple network ports.
Read custom RCVBUF size from environment variable
TEAM_EVENT_BUFSIZE for all netlink sockets used by
libteam.
Signed-off-by: Pavel Shirshov <pavel.contrib(a)gmail.com>
---
libteam/libteam.c | 28 ++++++++++++++--------------
1 file changed, 14 insertions(+), 14 deletions(-)
diff --git a/libteam/libteam.c b/libteam/libteam.c
index 9c9c93a..2a9053b 100644
--- a/libteam/libteam.c
+++ b/libteam/libteam.c
@@ -600,12 +600,24 @@ int team_init(struct team_handle *th, uint32_t ifindex)
return -errno;
}
- err = nl_socket_set_buffer_size(th->nl_sock, NETLINK_RCVBUF, 0);
+ env = getenv("TEAM_EVENT_BUFSIZE");
+ if (env) {
+ eventbufsize = strtol(env, NULL, 10);
+ /* ignore other errors, libnl forces minimum 32k and
+ * too large values are truncated to system rmem_max
+ */
+ if (eventbufsize < 0)
+ eventbufsize = 0;
+ } else {
+ eventbufsize = NETLINK_RCVBUF;
+ }
+
+ err = nl_socket_set_buffer_size(th->nl_sock, eventbufsize, 0);
if (err) {
err(th, "Failed to set buffer size of netlink sock.");
return -nl2syserr(err);
}
- err = nl_socket_set_buffer_size(th->nl_sock_event, NETLINK_RCVBUF, 0);
+ err = nl_socket_set_buffer_size(th->nl_sock_event, eventbufsize, 0);
if (err) {
err(th, "Failed to set buffer size of netlink event sock.");
return -nl2syserr(err);
@@ -640,18 +652,6 @@ int team_init(struct team_handle *th, uint32_t ifindex)
nl_cli_connect(th->nl_cli.sock_event, NETLINK_ROUTE);
nl_socket_set_nonblocking(th->nl_cli.sock_event);
- env = getenv("TEAM_EVENT_BUFSIZE");
- if (env) {
- eventbufsize = strtol(env, NULL, 10);
- /* ignore other errors, libnl forces minimum 32k and
- * too large values are truncated to system rmem_max
- */
- if (eventbufsize < 0)
- eventbufsize = 0;
- } else {
- eventbufsize = NETLINK_RCVBUF;
- }
-
err = nl_socket_set_buffer_size(th->nl_cli.sock_event, eventbufsize, 0);
if (err) {
err(th, "Failed to set cli event socket buffer size.");
--
2.7.4
3 years, 1 month
[PATCH] Send LACP PDU right after Actor state has been changed
by Pavel Shirshov
Fix a bug in libteam LACP protocol implementation.
According to the LACP standard "LACP daemon must send LACP PDU
packets with updates "when the Actor’s state changes or when it is
apparent from the Partner’s LACPDUs that the Partner does not know
the Actor’s current state."
The current libteam implementation sends periodic updates only, so
in LACP rate slow mode the update will be sent within 30 seconds
which doesn't follow the LACP standard.
To fix the issue:
1. The following patch was reverted:
https://github.com/jpirko/libteam/commit/b2de61b39696c9158e725a691aee5a6f...
2. LACP actor state recalculated before the comparison what the LACP
partner thinks about the LACP actor state.
Signed-off-by: Pavel Shirshov <pavel.contrib(a)gmail.com>
---
teamd/teamd_runner_lacp.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/teamd/teamd_runner_lacp.c b/teamd/teamd_runner_lacp.c
index ec01237..11d02f1 100644
--- a/teamd/teamd_runner_lacp.c
+++ b/teamd/teamd_runner_lacp.c
@@ -996,8 +996,7 @@ static int lacp_port_set_state(struct lacp_port *lacp_port,
return err;
lacp_port_actor_update(lacp_port);
- if (lacp_port->periodic_on)
- return 0;
+
return lacpdu_send(lacp_port);
}
@@ -1110,9 +1109,10 @@ static int lacpdu_recv(struct lacp_port *lacp_port)
if (err)
return err;
+ lacp_port_actor_update(lacp_port);
+
/* Check if the other side has correct info about us */
- if (!lacp_port->periodic_on &&
- memcmp(&lacpdu.partner, &lacp_port->actor,
+ if (memcmp(&lacpdu.partner, &lacp_port->actor,
sizeof(struct lacpdu_info))) {
err = lacpdu_send(lacp_port);
if (err)
--
2.7.4
3 years, 1 month
[PATCH] Fix ifinfo_link_with_port race condition with newlink
by Pavel Shirshov
When a member port is enslaved into a port channel
immediately after the port channel was created,
it is possible to get member port ifinfo structure
not initialized for the member port because of a race
condition.
The race condition here occurs because order of
following events is not strict:
- adding the member port to the port channel;
- creating ifinfo structure for the member port.
The error message "Failed to link port with ifinfo" is
thrown when a member port is tried to be added to the
team handler's port list before ifinfo structure was
initialized.
To fix this situation ifinfo_find_create() is used
to search member ports ifinfo structure in
ifinfo_link_with_port().
Signed-off-by: Shuotian Cheng <shuche(a)microsoft.com>
Signed-off-by: Pavel Shirshov <pavel.contrib(a)gmail.com>
---
libteam/ifinfo.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/libteam/ifinfo.c b/libteam/ifinfo.c
index 46d56a2..a15788b 100644
--- a/libteam/ifinfo.c
+++ b/libteam/ifinfo.c
@@ -453,7 +453,10 @@ int ifinfo_link_with_port(struct team_handle *th, uint32_t ifindex,
{
struct team_ifinfo *ifinfo;
- ifinfo = ifinfo_find(th, ifindex);
+ if (port)
+ ifinfo = ifinfo_find_create(th, ifindex);
+ else
+ ifinfo = ifinfo_find(th, ifindex);
if (!ifinfo)
return -ENOENT;
if (ifinfo->linked)
--
2.7.4
3 years, 2 months
[PATCH] Don't return an error when timerfd socket return 0
by Pavel Shirshov
It is possible to read 0 bytes from timerfd descriptor
despite the fact that descriptor notified poll() that
it has data.
It is possible to see such behaviour on some hardware
platforms.
Solve this by treating such situation as normal.
Signed-off-by: Pavel Shirshov <pavel.contrib(a)gmail.com>
---
teamd/teamd.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/teamd/teamd.c b/teamd/teamd.c
index e035ac5..8cdc16d 100644
--- a/teamd/teamd.c
+++ b/teamd/teamd.c
@@ -265,6 +265,10 @@ static int handle_period_fd(int fd)
teamd_log_err("read() failed.");
return -errno;
}
+ if (ret == 0) {
+ teamd_log_warn("read() for timer_fd returned 0.");
+ return 0;
+ }
if (ret != sizeof(uint64_t)) {
teamd_log_err("read() returned unexpected number of bytes.");
return -EINVAL;
--
2.7.4
3 years, 2 months