> On 6 Sep 2019, at 10:12, Peter Robinson <pbrobinson(a)gmail.com> wrote:
>
> On Fri, Sep 6, 2019 at 9:58 AM Tim Coote <tim+fedoraproject.org(a)coote.org> wrote:
>> On Wed, Sep 04, 2019 at 02:41:49PM -0700, Troy Dawson wrote:
>> 2 – Of those packages that need to be in the IoT image, which ones do you think are the top priority? And what parts do you think can be trimmed? These are from previous discussions.
>> anaconda-core – move flatpak from -core to -gui [3]
>> initial-setup – pulls in … everything. Can it be trimmed down?
>> Presumably `podman` is high priority.
>> why?
>> I’ve heard mention that there’s support for lxc images as an approach to packaging code within Fedora, but I’ve not managed to identify what the argument is. It makes sense in the context of a multiple machine environment, where there’s value in hiding the machine specifics, but in the context of IoT, the typical deployment is one machine with many cooperating processes and adding in an extra layer of indirection makes testing and operations more complex, while removing some of the dependency management/control of a package manager.
>>
> There's a bunch of points here so:
>
> lxc images in Fedora? That's a new one to me and I've never had any discussion around that with anyone in the context of IoT.
My mistake. I meant podman containers.
>
> In terms of a multi-machine environment, that's exactly what IoT is. I've spoken with customers who want tens or hundreds of thousands of machines; in one case there was even a desire for millions. At that scale you do need to hide the specifics of the machine, because there will be multiple versions of applications running, multiple versions of HW, etc., so the ability to manage each app or app stack independently, on a lifecycle that's separate from the underlying OS, is critical. If we need to patch the OS for something like Spectre, or a WiFi or Bluetooth flaw, you don't necessarily want to have to impact the application stack to do that.
>
> Ultimately, pretty much all the feedback I've had is that running things in containers makes things easier for teams: managing the dependencies independently for each app is easier, and operations are easier because each application can be looked at independently. Each app team is free to make the decision that is best for that team, independent of the underlying BaseOS and independent of the HW.
My last IoT system’s around 1M computers. Initially, I thought that the issue being addressed was to create pools of containers that demand can be spread across, but I think that the desire is to support h/w and o/s variation in the field.
I’ve looked at this problem in large enterprises in some depth. IoT emphasises the same challenges.
At scale, the support costs dominate the economics because of the explosion of combinations of component versions. The only way that I’ve found to keep a lid on the Incident and Problem Management costs is to keep as much as possible the same; otherwise the regression test corpus and the cost of Problem reproduction become horrid. It does mean driving the CD pipeline hard to keep up, but you’ve got to do that anyway to keep abreast of fixes for exploits.
>> I’d love to understand the rationale for using containers in this context better, as I’m concerned that I’ve missed something. Are there any pointers?
>>
> It allows isolation. Isolation of the applications from each other and from the HW. A lot of IoT use cases want to run multiple independent applications on a single device, and if one of those gets compromised, the impact it has can be mitigated so that it does not affect the underlying HW or other applications running alongside it. Add to that, it allows applications to be upgraded on different lifecycles to the underlying BaseOS or each other.
As a general rule, in the early stages of an IoT system, when the problem and the solution are not well understood, there is a high rate of deployment of new s/w versions. If these are going all the way to the edge, I’d expect a lot of instability.
I think that I understand the problem better now. Thanks. I’m less sure that I believe the analysis/solution: let’s see what works :-)