Tue, Mar 22, 2022 at 04:10:20PM CET, pauldsmith(a)microsoft.com wrote:
>Jiri,
>
>Any more thoughts on the proposed patch below and the issue that caused it? As far as I can see, it's not made it into the source code yet.
Please resend the patch with extended patch description. Thanks!
>
>Thanks,
>Paul DS.
>
>-----Original Message-----
>From: Paul D.Smith
>Sent: 30 November 2021 17:11
>To: Jiri Pirko <jiri(a)resnulli.us>
>Cc: libteam(a)lists.fedorahosted.org
>Subject: RE: [EXTERNAL] Re: [patch libteam] ifindex 0 is invalid so do not process it.
>Importance: Low
>
>Jiri,
>
>Just checking in with you that the explanation below for the small fix makes sense. Anything more that I need to explain to you or do you have an alternative suggestion?
>
>Regards,
>Paul DS
>
>-----Original Message-----
>From: Paul D.Smith
>Sent: 09 November 2021 16:51
>To: Jiri Pirko <jiri(a)resnulli.us>
>Cc: libteam(a)lists.fedorahosted.org
>Subject: RE: [EXTERNAL] Re: [patch libteam] ifindex 0 is invalid so do not process it.
>
>The issue relates to the discussion we had earlier. The issue is that without this patch, the initial creation of the object tries to find the name of the interface with ifindex == 0. However n the underlying libraries that ae used to identify this name, there is an explicit assert that the ifindex cannot be zero.
>
>Now the ifindex is explicitly set later in the object creation and then the resolution to the interface name can succeed which if why this patch just catches ifindex=0 (the default) and does no attempt to store the ifindex and resolve to an interface.
>
>I have copied the explicit flow from our earlier e-mail below
>
>Regards,
>Paul DS
>
>---- The stack that I hit on a CentOS 7 system --- I start from the simplest of python scripts:
>
># -- Start script
>from team.core import Team
>
>Team("eth0")
># -- End script
>
>When I run this on CentOS7 I get the following error:
>
>[build@PC6663 external]$ python script.py APPLICATION BUG: route/link.c:1233:rtnl_link_build_get_request: ifindex or name must be specified
>python: route/link.c:1233: rtnl_link_build_get_request: Assertion `0' failed.
>Aborted
>
>So I traced this error back through the Python and source code (adding debugging print()s to see where we failed). This is the stack
>
>+ script.py tries to create the Team()
> A Team() object, as defined in core.py around line 337, is created.
> First the __init__() function calls capi.team_alloc()
> Next there is a call to the superclass init:
> super(Team, self).__init__(th)
> Note that this does not pass the teamdev (name string) and in fact it doesn't matter what string I pass when initializing the Team() object; the result is the same.
>+ The superclass __init__() is that for the class TeamNetDevice() which
>+ is defined in core.py, around line 85
> First the th which was just created is stored
> Next the th is passed to TeamNeDeviceIndexNameConverter()
> Finally we store the ifindex but here we note two things:
> 1/ The subclass didn't pass an ifindex so the default of 0 is used
> 2/ ifindex is actually a setter method defined slightly below.
>+ The ifindex setter first stores the value if the ifindex into object
>+ field _ifindex
> Now self._conv.dev_ifname() is called on the ifindex (0)
>+ dev_ifname() is defined in core.py, around line 71
> First the dev_id is passed to an if/else arm which allows a string
> It is this that allows my "fix" to work since my bad patch passes a string which causes this function to return at this point and avoid the error normally hit below
> The dev_id passed in was the parent's ifindex which is an int of value 0 so we store it in local field ifindex
> The code then calls capi.team_ifindex2ifname() passing the ifindex, again 0.
>+ team_ifindex2ifname() is defined in libteam.c, around line 1535 and
>+ call rtnl_link_get_kernel() passing the ifindex (0) and NULL as the
>+ possible name
> At this point we leave libteam and venture into libnl.
>+ rtnl_link_get_kernel() is defined in link.c, around line 1165
> rntl_link_get_kernel() calls rtnl_link_build_get_request() passing the ifindex (0) and name (NULL)
>+ rntl_build_get_request() is defined in link.c, around line 1116 and in the version I have contains this code:
>
>int rtnl_link_build_get_request(int ifindex, const char *name,
> struct nl_msg **result) {
> struct ifinfomsg ifi;
> struct nl_msg *msg;
>
> if (ifindex <= 0 && !name) {
> APPBUG("ifindex or name must be specified");
> return -NLE_MISSING_ATTR;
> }
>...
>
>Here is the precise error that I see and we can understand why - we passed the "name" as NULL and the "ifindex" is 0.
>
>---
>
>----Original Message-----
>From: Jiri Pirko <jiri(a)resnulli.us>
>Sent: 09 November 2021 16:03
>To: Paul D.Smith <pauldsmith(a)microsoft.com>
>Cc: libteam(a)lists.fedorahosted.org
>Subject: [EXTERNAL] Re: [patch libteam] ifindex 0 is invalid so do not process it.
>
>Thu, Oct 21, 2021 at 06:29:10PM CEST, pauldsmith(a)microsoft.com wrote:
>>From 89914e67018bdb8dfed5db1f7ddf7f39c7d1a493 Mon Sep 17 00:00:00 2001
>>From: "Paul D.Smith" <paul.d.smith(a)metaswitch.com>
>>Date: Thu, 21 Oct 2021 17:28:01 +0100
>>Subject: [patch libteam] ifindex zero cannot identify interface.
>>
>
>This is very short patch description. Could you please describe why this change is needed, what actual issue does this cause? Thanks!
>
>
>>---
>> binding/python/team/core.py | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>>diff --git a/binding/python/team/core.py b/binding/python/team/core.py
>>index 54161bf..b2fc5e8 100644
>>--- a/binding/python/team/core.py
>>+++ b/binding/python/team/core.py
>>@@ -101,7 +101,8 @@ class TeamNetDevice(object):
>> @ifindex.setter
>> def ifindex(self, ifindex):
>> self._ifindex = ifindex
>>- self.ifname = self._conv.dev_ifname(ifindex)
>>+ if ifindex:
>>+ self.ifname = self._conv.dev_ifname(ifindex)
>>
>> def get_hwaddr(self):
>> err, hwaddr = capi.team_hwaddr_get(self._th, self.ifindex, 6)
>>--
>>2.33.0.721.g106298f7f9