[ISSUE] Failed to find "enabled" option.
by Xin Long
I found this error in the "loadbalance" and "lacp" runners when adding ports.
It is caused by trying to set the "enabled" option in .port_link_changed()
or .port_changed().
When a new port is added, the first 'port changed event' is processed
before the TEAM_CMD_OPTIONS_GET command, which is where all
the options are synchronized from the kernel.
This means the 'enabled' option does not exist yet when port_link_changed
is called for the first 'port changed event'. lb_event_watch_port_link_changed
and lacp_event_watch_port_changed both call teamd_port_check_enable
to set the 'enabled' option, and that triggers this error.
I'm not sure why teamd_port_check_enable needs to check whether the
'enabled' option exists. I checked the ab runner's .port_link_changed():
it just sets the option by calling team_set_port_enabled(), without
checking for the 'enabled' option first.
Can we just use team_set_port_enabled to set it directly in
.port(_link)_changed, or improve teamd_port_check_enable
to avoid this error?
Thanks.
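A rough sketch of the second idea (making the check tolerant of an option that has not been synchronized yet) might look like the following. Everything here is a stand-in: struct sketch_port, sketch_set_enabled() and sketch_port_check_enable() are hypothetical names for illustration and are not the real teamd or libteam types and functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a team port; not the real teamd/libteam type. */
struct sketch_port {
	bool options_synced; /* true once the kernel options have been fetched */
	bool enabled;        /* current value of the "enabled" option */
};

/* Stand-in for a direct set, in the spirit of team_set_port_enabled(). */
static int sketch_set_enabled(struct sketch_port *port, bool enable)
{
	port->enabled = enable;
	return 0;
}

/*
 * Tolerant check-then-set helper: if the option has not been synchronized
 * yet, skip the comparison and set the value directly instead of failing.
 */
static int sketch_port_check_enable(struct sketch_port *port,
				    bool should_enable, bool should_disable)
{
	assert(port != NULL);
	if (!port->options_synced) {
		if (should_enable)
			return sketch_set_enabled(port, true);
		if (should_disable)
			return sketch_set_enabled(port, false);
		return 0;
	}
	/* Option is known: only touch it when the value actually changes. */
	if (should_enable && !port->enabled)
		return sketch_set_enabled(port, true);
	if (should_disable && port->enabled)
		return sketch_set_enabled(port, false);
	return 0;
}
```

With this shape, the first 'port changed event' would take the direct-set branch, and later events would go through the usual comparison.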
3 years, 8 months
[PATCH] libteam: set netlink event socket as non-blocking
by Beniamino Galvani
In some situations it was observed that recvmsg() blocks even if
epoll_wait() reports the netlink socket as ready:
select(15, [3 9 10 14], [], [], NULL) = 1 (in [9])
epoll_wait(9, [{EPOLLIN, {u32=8, u64=8}}], 2, -1) = 1
recvmsg(8, ### blocked
(9 is the epoll fd, 8 the nl_cli event socket).
This is probably caused by a kernel bug; however, it seems more
robust anyway to set the socket as non-blocking to avoid problems like
this.
Signed-off-by: Beniamino Galvani <bgalvani(a)redhat.com>
---
libteam/libteam.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/libteam/libteam.c b/libteam/libteam.c
index 0065a7f..9c9c93a 100644
--- a/libteam/libteam.c
+++ b/libteam/libteam.c
@@ -638,6 +638,7 @@ int team_init(struct team_handle *th, uint32_t ifindex)
nl_socket_modify_cb(th->nl_cli.sock_event, NL_CB_VALID,
NL_CB_CUSTOM, cli_event_handler, th);
nl_cli_connect(th->nl_cli.sock_event, NETLINK_ROUTE);
+ nl_socket_set_nonblocking(th->nl_cli.sock_event);
env = getenv("TEAM_EVENT_BUFSIZE");
if (env) {
--
2.20.1
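The effect of the patch above can be illustrated with plain POSIX calls rather than libnl. This is only a sketch: set_nonblocking() and recv_would_block() are hypothetical helper names, and an ordinary socket stands in for the netlink event socket. On a non-blocking socket with nothing queued, recv() returns immediately with EAGAIN/EWOULDBLOCK instead of hanging.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Put a descriptor into non-blocking mode; returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
	int flags;

	assert(fd >= 0);
	flags = fcntl(fd, F_GETFL, 0);
	if (flags == -1)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/*
 * Try to read from a socket.
 * Returns 1 if recv() reported "would block" instead of hanging,
 * 0 if data was read (or EOF was seen), -1 on any other error.
 */
static int recv_would_block(int fd)
{
	char buf[64];
	ssize_t n = recv(fd, buf, sizeof(buf), 0);

	if (n >= 0)
		return 0;
	return (errno == EAGAIN || errno == EWOULDBLOCK) ? 1 : -1;
}
```

A caller that sees the "would block" result can simply return to its select()/epoll_wait() loop and wait for the next readiness notification.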
4 years
Re: [patch libteam v1] libteam: double NETLINK_RCVBUF to fix -ENOMEM error
by Jiri Pirko
Tue, Apr 23, 2019 at 01:42:13AM CEST, jmaxwell37(a)gmail.com wrote:
>v1: Change 96k to 192k in the comments
>
>We are seeing the following errors on some systems configured for LACP:
>
>eth0: Failed to set "priority".
>Loop callback failed with: Cannot allocate memory
>Failed loop callback: libteam_events, 0x
>
>The slave is then rejected and does not become part of the team. We debugged
>this down to -ENOMEM netlink error getting returned by nl_recvmsgs() called
>by send_and_recv(). Doubling the buffer size fixed the problem.
>
>Signed-off-by: Jon Maxwell <jmaxwell37(a)gmail.com>
applied, thanks!
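For readers unfamiliar with receive-buffer sizing, here is a minimal sketch of requesting a larger buffer with plain setsockopt(). It is not the libteam change itself: request_rcvbuf() and SKETCH_RCVBUF_SIZE are hypothetical names, and the 192 KiB figure is taken only from the "96k to 192k" note in the changelog. The kernel may adjust or cap the requested value, so the helper reads the effective size back.

```c
#include <assert.h>
#include <sys/socket.h>

/* Illustrative size only: 192 KiB, double the 96k mentioned in the changelog. */
#define SKETCH_RCVBUF_SIZE (192 * 1024)

/*
 * Request a receive buffer of `size` bytes and return the size the kernel
 * reports afterwards, or -1 on error. The reported size can differ from
 * the requested one because the kernel may adjust or cap it.
 */
static int request_rcvbuf(int fd, int size)
{
	int effective = 0;
	socklen_t len = sizeof(effective);

	assert(size > 0);
	if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &size, sizeof(size)) == -1)
		return -1;
	if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &effective, &len) == -1)
		return -1;
	return effective;
}
```

A larger buffer gives the receiver more room to absorb bursts of messages before the kernel starts reporting allocation failures to the reader.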
4 years
[PATCHv3] teamd: add a default value 1000 for link_watch.interval
by Xin Long
From: Hangbin Liu <liuhangbin(a)gmail.com>
There is no default value for link_watch.interval, so if a user
forgets to set this parameter, teamd fails to init the ports' priv
and eventually exits.
e.g.
teamd -g -c '{"runner":{"name":"activebackup"},
"link_watch":{"name":"arp_ping","target_host":"198.51.100.1"}}'
teamdctl team0 port add p5p1
teamdctl team0 port add p5p2
teamd debug log shows:
p5p2: Got link watch from global config.
p5p2: Using sticky "0".
Failed to get "interval" link-watch option.
Failed to load options.
Failed to init port priv.
Callback named "lw_periodic" not found.
Callback named "lw_socket" not found.
Loop callback failed with: Invalid argument
Failed loop callback: libteam_events, 0x5624c28b9410
select() failed.
Exiting...
Removed loop callback: usock_acc_conn, 0x5624c28bab60
Removed loop callback: usock, 0x5624c28b9410
Removed loop callback: workq, 0x5624c28b9410
Removed loop callback: libteam_events, 0x5624c28b9410
Removed loop callback: daemon, 0x5624c28b9410
Failed: Bad file descriptor
Fix it by adding a default value for link_watch.interval.
v2: update default value to 1000, as Jamie Bainbridge suggested.
v3: fix the changelog to pass checkpatch.pl.
Reported-by: LiLiang <liali(a)redhat.com>
Reviewed-by: Jamie Bainbridge <jamie.bainbridge(a)gmail.com>
Signed-off-by: Hangbin Liu <liuhangbin(a)gmail.com>
---
man/teamd.conf.5 | 10 ++++++++++
teamd/teamd_lw_psr.c | 11 ++++++++---
2 files changed, 18 insertions(+), 3 deletions(-)
diff --git a/man/teamd.conf.5 b/man/teamd.conf.5
index 5b0f3e9..9090b4a 100644
--- a/man/teamd.conf.5
+++ b/man/teamd.conf.5
@@ -308,6 +308,11 @@ Default:
.TP
.BR "link_watch.interval "| " ports.PORTIFNAME.link_watch.interval " (int)
Value is a positive number in milliseconds. It is the interval between ARP requests being sent.
+.RS 7
+.PP
+Default:
+.BR "1000"
+.RE
.TP
.BR "link_watch.init_wait "| " ports.PORTIFNAME.link_watch.init_wait " (int)
Value is a positive number in milliseconds. It is the delay between link watch initialization and the first ARP request being sent.
@@ -371,6 +376,11 @@ Default:
.TP
.BR "link_watch.interval "| " ports.PORTIFNAME.link_watch.interval " (int)
Value is a positive number in milliseconds. It is the interval between sending NS packets.
+.RS 7
+.PP
+Default:
+.BR "1000"
+.RE
.TP
.BR "link_watch.init_wait "| " ports.PORTIFNAME.link_watch.init_wait " (int)
Value is a positive number in milliseconds. It is the delay between link watch initialization and the first NS packet being sent.
diff --git a/teamd/teamd_lw_psr.c b/teamd/teamd_lw_psr.c
index c0772db..ad6e56b 100644
--- a/teamd/teamd_lw_psr.c
+++ b/teamd/teamd_lw_psr.c
@@ -28,6 +28,7 @@
*/
static const struct timespec lw_psr_default_init_wait = { 0, 1 };
+#define LW_PSR_DEFAULT_INTERVAL 1000
#define LW_PSR_DEFAULT_MISSED_MAX 3
#define LW_PERIODIC_CB_NAME "lw_periodic"
@@ -77,9 +78,13 @@ static int lw_psr_load_options(struct teamd_context *ctx,
int tmp;
err = teamd_config_int_get(ctx, &tmp, "@.interval", cpcookie);
- if (err) {
- teamd_log_err("Failed to get \"interval\" link-watch option.");
- return -EINVAL;
+ if (!err) {
+ if (tmp < 0) {
+ teamd_log_err("\"interval\" must not be negative number.");
+ return -EINVAL;
+ }
+ } else {
+ tmp = LW_PSR_DEFAULT_INTERVAL;
}
teamd_log_dbg("interval \"%d\".", tmp);
ms_to_timespec(&psr_ppriv->interval, tmp);
--
2.1.0
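The fallback logic in the patch above can be reduced to a small stand-alone sketch. sketch_resolve_interval() and SKETCH_DEFAULT_INTERVAL are hypothetical names; a NULL pointer stands in for "the option is missing from the config", which in the real code is signalled by teamd_config_int_get() returning an error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_DEFAULT_INTERVAL 1000 /* milliseconds, as in the patch */

/*
 * Resolve the link-watch interval. `configured` points at the user's value,
 * or is NULL when the option is missing from the config.
 * Returns 0 and stores the result in *out, or -EINVAL for a negative value.
 */
static int sketch_resolve_interval(const int *configured, int *out)
{
	assert(out != NULL);
	if (configured == NULL) {
		/* Option not set: fall back to the default instead of failing. */
		*out = SKETCH_DEFAULT_INTERVAL;
		return 0;
	}
	if (*configured < 0)
		return -EINVAL;
	*out = *configured;
	return 0;
}
```

Only an explicitly configured negative value is rejected; a missing option is no longer treated as an error.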
4 years