[fedora-arm] Hardware Crypto Offload on Kirkwood (SheevaPlug)

Gordan Bobic gordan at bobich.net
Wed May 25 14:45:01 UTC 2011


omalleys at msu.edu wrote:
> Quoting Gordan Bobic <gordan at bobich.net>:
> 
>> Peter Robinson wrote:
>>>>> Interesting question; not sure.
>>>> I think it is an important one to answer, and sooner rather than later.
>>>> This is particularly important to the ARM community since a lot of
>>>> (most?) ARMs seem to have a crypto co-processor of some description
>>>> (Freescales, Kirkwood and Tegra definitely seem to have this, I haven't
>>>> checked others, but since these are the three classes of devices I own,
>>>> that's 3 out of 3 - I don't think it's luck/coincidence).
>>>
>>> Remember this needs to be upstream in Fedora and other projects and
>>> then Fedora ARM will get it by default. Work upstream, propose it as a
>>> Fedora 16 feature and do the work.
>>>
>>> I don't think it's particularly important to ARM. It certainly helps on
>>> ARM due to its low processing power, but it's relevant on all platforms,
>>> and most platforms have some form of crypto offload built into the core
>>> CPU nowadays. Hell, even the AMD Geode processor used in the XO has it.
>>
>> Don't confuse crypto offload with crypto instructions. The new x86 stuff
>> has crypto instructions that make AES churn faster, but that doesn't
>> leave the CPU free to do other things while this is happening.
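
(To make the instructions-vs-offload distinction concrete: AES-NI is just
new instructions that the CPU executes inline, so the core is busy for
their duration. A minimal sketch - not a real AES implementation, there is
no key schedule, just one round via the intrinsic; build with gcc -maes:)

/* One AES round via AES-NI: ordinary synchronous CPU work. Nothing is
 * handed off to run in the background and interrupt back later. */
#include <stdio.h>
#include <wmmintrin.h>   /* AES-NI intrinsics */

int main(void)
{
    __m128i state = _mm_set1_epi32(0x01020304);   /* dummy data block */
    __m128i rkey  = _mm_set1_epi32(0x0a0b0c0d);   /* dummy round key */
    unsigned char out[16];

    /* The core stalls here until the round is done - no offload. */
    state = _mm_aesenc_si128(state, rkey);

    _mm_storeu_si128((__m128i *)out, state);
    for (int i = 0; i < 16; i++)
        printf("%02x", out[i]);
    printf("\n");
    return 0;
}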
>>
>> My big concern is that the pursued solution will be one size fits x86,
>> as often happens (also why the same packages need ARM patches in
>> consecutive releases).
> 
> From what I was reading in the kernel notes last night, the kernel 
> crypto bits are optimized for multiple cores (multithreaded) and 
> SSE3.1-enabled CPUs.
> 
> Given the kernel support for crypto, shouldn't this be in the kernel also?
> 
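It already is, to an extent: since 2.6.38 the kernel crypto API is exposed
to userspace through AF_ALG sockets, so a library can hand a buffer to
whatever driver the kernel has registered (mv_cesa on Kirkwood, a software
fallback elsewhere). A minimal sketch, assuming a kernel built with
CONFIG_CRYPTO_USER_API_HASH:

/* Hash a buffer through the kernel crypto API via AF_ALG. The kernel
 * picks the highest-priority "sha1" implementation it has registered. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/if_alg.h>

int main(void)
{
    struct sockaddr_alg sa = {
        .salg_family = AF_ALG,
        .salg_type   = "hash",
        .salg_name   = "sha1",
    };
    const char msg[] = "hello";
    unsigned char digest[20];           /* SHA-1 digest length */

    int tfm = socket(AF_ALG, SOCK_SEQPACKET, 0);
    if (tfm < 0 || bind(tfm, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
        perror("AF_ALG");
        return 1;
    }

    int op = accept(tfm, NULL, 0);      /* one operation instance */
    if (op < 0 || write(op, msg, strlen(msg)) < 0 ||
        read(op, digest, sizeof(digest)) < 0) {
        perror("AF_ALG op");
        return 1;
    }

    for (size_t i = 0; i < sizeof(digest); i++)
        printf("%02x", digest[i]);
    printf("\n");

    close(op);
    close(tfm);
    return 0;
}
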
>>>>>> 2) More abstraction (an OpenSSL->NSS shim library) means more bloat,
>>>>>> more context switching and less performance. Is that really the way
>>>>>> forward? I mean _really_?
>>>>> For bulk crypto operations an extra call via a shim probably doesn't
>>>>> matter.  For some signature operations it might.
>>>>>
>>>>> It seems like a clean solution from the point of view of application
>>>>> developers, though.
>>>> The other thing that needs to be considered is added complexity and
>>>> security. I would imagine that since there is an abstraction layer, it
>>>> introduces additional scope for exploits (buffer overruns, stack
>>>> smashing, etc.) Is this shim library going to also be FIPS 
>>>> certified? If
>>>> not then the improved security aspect of NSS vs. OpenSSL comes a lot
>>>> closer to pure marketing rhetoric (maybe that's where it's at at the
>>>> moment anyway, I don't claim to be an expert on the subject).
>>>
>>> Read the details mentioned on the advantage of NSS over OpenSSL for
>>> FIPS certification on the consolidation wiki.
>>
>> Are you talking about this sentence?
>>
>> "NSS, in contrast, allows all applications to inherit the NSS FIPS
>> validation status by following some simple rules detailed in the NSS
>> security policy document."
>>
>> I find it's a bit vague and opaque. I don't see how you could make the
>> certification hereditary.
> 
> Think of something like DNSSEC, which hands your machine a certificate, 
> then add something for your actual credentials, like another cert on a 
> smartcard, Kerberos and/or OpenID, then add, say, a VPN to the equation...
> 
> If you start looking at this scenario, then most traffic is going to be 
> encrypted between all servers and all clients, which makes it increasingly 
> important that hardware support for crypto is enabled. This is why crypto 
> accelerators are pretty standard: they speed up the client systems.
> 
> My question is: on the ARM platform, is everyone using the same crypto 
> hardware accelerator, or are they all different (which would be my 
> assumption)?
> 
> This whole scenario is similar to the FPU, SIMD/MIMD, GPU processing and 
> crypto issues... I -wish- there were an easy, modular way to say: okay, we 
> have this function, let's use it.
> 
> The whole question of whether /dev/crypto is the right way to go reminds 
> me of the /dev/random vs. /dev/urandom argument from way back when we 
> used a PRNG.
> 
> To me this means you need /dev/crypto (or something similar) for -all- 
> devices, regardless of whether there is hardware support, and all 
> implementations need to use it. The optimizations only have to be done 
> once (but can be overridden, as some algorithms are faster than others at 
> specific tasks for the same type of encryption), and the scheduling is 
> taken care of at the kernel level or at a library level, but through a 
> unified interface so that NSS, OpenSSL, etc. can all use it efficiently; 
> it also needs to be extensible to allow for future expansion.
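
For what it's worth, the closest thing to that today is probably the
out-of-tree cryptodev module (which provides /dev/crypto) plus OpenSSL's
ENGINE API; NSS would need its own glue. A rough sketch of an application
opting in, assuming OpenSSL was built with the cryptodev engine and the
module is actually loaded:

/* Ask OpenSSL to route crypto through the cryptodev engine if present.
 * Build with: gcc engine_test.c -lcrypto */
#include <stdio.h>
#include <openssl/engine.h>

int main(void)
{
    ENGINE_load_builtin_engines();

    ENGINE *e = ENGINE_by_id("cryptodev");
    if (!e || !ENGINE_init(e)) {
        fprintf(stderr, "cryptodev engine not available, software it is\n");
        return 1;
    }

    /* Everything OpenSSL can delegate now goes through this engine. */
    ENGINE_set_default(e, ENGINE_METHOD_ALL);
    printf("using engine: %s\n", ENGINE_get_name(e));

    ENGINE_finish(e);
    ENGINE_free(e);
    return 0;
}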
> 
> Is that what we are aiming for?

It's an interesting approach, but I think it would be inefficient on 
hardware without hardware crypto acceleration. I don't think there is 
any inherent benefit in having all crypto done in kernelspace. Pushing 
it through the kernel is only useful if the kernel can hand it off to 
some underlying hardware and not worry about it until it gets back an 
interrupt saying the data is crypted. If there's no underlying hardware, 
context switching to the kernel will just make things slower.
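
On the Kirkwood boxes the first thing to check is whether the kernel 
actually has anything to hand the work off to. A quick sketch that just 
dumps the registered driver names from /proc/crypto; with the Marvell CESA 
driver (mv_cesa) loaded you would expect entries like mv-cbc-aes, otherwise 
it is all software fallbacks:

/* Print the "driver" lines from /proc/crypto to see what the kernel
 * crypto API currently has registered. */
#include <stdio.h>
#include <string.h>

int main(void)
{
    FILE *f = fopen("/proc/crypto", "r");
    char line[256];

    if (!f) {
        perror("/proc/crypto");
        return 1;
    }
    while (fgets(line, sizeof(line), f))
        if (strncmp(line, "driver", 6) == 0)
            fputs(line, stdout);

    fclose(f);
    return 0;
}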

I like the uniformity of the idea (and how thin this would make the 
userspace crypto libs), but all that skinniness of the userspace layer 
would mean bloat in the kernel, which is worse. It would also make 
things non-portable. So, sadly, I think userspace will have to continue 
to bring its own implementation for the cases where hardware isn't there 
to help.

Unless I'm misunderstanding what you were describing?

Gordan

