urandom vs haveged

Fri Mar 30 12:55:11 UTC 2012

On Monday, March 26, 2012 03:56:43 PM Chris Murphy wrote:
> Performance:
> 
> dd if=/dev/zero		~56MB/s		CPU < 10%
> dd if=/dev/urandom	~12MB/s		CPU 99%
> haveged			~54MB/s		CPU < 25%
> 
> 
> The dd relative values are consistent with kernels in Fedora 16. However
> these tests were done with 3.3.0-1. The questions are:
> 
> Is the urandom performance expected?

I get this:

# dd if=/dev/zero of=/dev/null 
4775272+0 records in
4775271+0 records out
2444938752 bytes (2.4 GB) copied, 4.12342 s, 593 MB/s

# dd if=/dev/urandom of=/dev/null 
118512+0 records in
118511+0 records out
60677632 bytes (61 MB) copied, 8.0117 s, 7.6 MB/s

That is on a quad-core machine running a 2.6.35.14 kernel.

I would say this is somewhat expected: /dev/zero only has to fill the buffer 
with zeros, while /dev/urandom stirs in system entropy and hashes it before 
letting it out. 
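
Note that dd with no bs= argument reads 512 bytes at a time, which the record 
counts above confirm (2444938752 bytes / 4775271 records = 512), so part of 
the cost in both runs is system call overhead. A rough sketch for comparing 
the two devices at the same, explicit block size follows. It is written in 
Python for illustration only and is not how the figures above were produced; 
the function name read_throughput and the 8 MB sample size are arbitrary 
choices.

```python
import time


def read_throughput(path, total_bytes=8 * 1024 * 1024, block_size=512):
    """Read up to total_bytes from path in block_size chunks; return MB/s.

    Hypothetical helper for illustration. buffering=0 makes every read()
    a real system call, similar to what dd does with bs=block_size.
    """
    remaining = total_bytes
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as src:
        while remaining > 0:
            chunk = src.read(min(block_size, remaining))
            if not chunk:  # end of file; does not happen on these devices
                break
            remaining -= len(chunk)
    elapsed = time.perf_counter() - start
    return (total_bytes - remaining) / elapsed / 1e6


if __name__ == "__main__":
    for device in ("/dev/zero", "/dev/urandom"):
        for block_size in (512, 65536):
            rate = read_throughput(device, block_size=block_size)
            print(f"{device} bs={block_size}: {rate:.1f} MB/s")
```

If the gap between the two devices stays large at the bigger block size, the 
time is going into generating the random bytes rather than into read() 
overhead.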

> What is the quality of pseudo-random data produced by urandom vs haveged?

The quality of urandom is very good. It's studied every couple of years for 
Common Criteria purposes. Haveged, on the other hand, is never used in Common 
Criteria evaluations and its properties are largely unknown. From its home page:

HAVEGE (HArdware Volatile Entropy Gathering and Expansion) is a user-level 
software unpredictable random number generator for general-purpose computers 
that exploits these modifications of the internal volatile hardware states as a 
source of uncertainty

Unpredictable means someone needs to do a lot of study to determine if there are 
predictable cycles to it. Does it have scheduler artifacts in its numbers? What 
if the hardware it's using is not available during system installation? Does it 
work on all platforms? Does it do any conditioning of its entropy sources? Does 
it quality check its numbers before sending them out?
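
To make "quality check its numbers" concrete, here is a minimal sketch of one 
such check: the monobit test as specified in the original FIPS 140-2 
statistical tests, which takes 20,000 bits and requires the count of one bits 
to fall strictly between 9725 and 10275. This is an illustration only. Nothing 
here claims that haveged or the kernel runs this particular test, and the name 
monobit_ok is made up for the example.

```python
import os

SAMPLE_BYTES = 2500          # 20,000 bits, the FIPS 140-2 sample size
LOW, HIGH = 9725, 10275      # exclusive bounds on the number of one bits


def monobit_ok(sample):
    """Return True if a 2500-byte sample passes the monobit test."""
    if len(sample) != SAMPLE_BYTES:
        raise ValueError("monobit test needs exactly 2500 bytes")
    ones = sum(bin(byte).count("1") for byte in sample)
    return LOW < ones < HIGH


if __name__ == "__main__":
    print("urandom sample passes:", monobit_ok(os.urandom(SAMPLE_BYTES)))
    # A stuck-at-zero source is caught...
    print("all zeros passes:", monobit_ok(b"\x00" * SAMPLE_BYTES))
    # ...but a perfectly predictable alternating pattern is not.
    print("0xAA pattern passes:", monobit_ok(b"\xaa" * SAMPLE_BYTES))
```

The last case is the point: a check like this only catches gross failures such 
as a stuck source. Passing it says nothing about whether the output is 
unpredictable, which is why the questions above need real study.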

> If the qualities are similar, or haveged's is better, is there anything
> that can be done to improve urandom's performance?

Possibly. I don't know if anyone has looked at making it faster or studied where 
the bottleneck is. It does produce very high quality numbers and it's well 
studied, though. That has always been the prime focus.

Something else I'd like to mention is that during system installation there is 
very little system entropy. There is no saved seed to prime the generators with. 
(LiveCDs have the same problem.) I have a feeling that the randomness of the 
numbers is not what you would expect. If you have a mouse attached and are doing 
a graphical install, then waving the mouse around will make sure you have 
entropy. But if you don't have a mouse and are doing a text or kickstart 
install, you need to find a way to get keystrokes involved. If you can think of a 
key that has no effect on any questions in the install, hit it a bunch of times. 
If you have a kickstart, put something in the script requiring typing a bunch of 
keystrokes and throw them away.

If encrypted disks are being created at install time, Anaconda might want to 
measure entropy before creating the keys and optionally allow you to add 
keystrokes, wave the mouse around, or start up rngd to gather entropy from a 
TPM chip or the RDRAND instruction.
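
As a sketch of what "measure entropy" could look like: the kernel exposes its 
current entropy estimate, in bits, in /proc/sys/kernel/random/entropy_avail. 
The code below reads it and waits until it reaches a threshold. This is not 
Anaconda code; the function names, the 256-bit threshold, and the timeout are 
assumptions chosen for the example.

```python
import time

ENTROPY_AVAIL = "/proc/sys/kernel/random/entropy_avail"


def entropy_available(path=ENTROPY_AVAIL):
    """Return the kernel's entropy estimate in bits."""
    with open(path) as f:
        return int(f.read().strip())


def enough_entropy(min_bits=256, path=ENTROPY_AVAIL):
    """True if the estimate is at least min_bits (threshold is an assumption)."""
    return entropy_available(path) >= min_bits


def wait_for_entropy(min_bits=256, timeout=60.0, poll=0.5, path=ENTROPY_AVAIL):
    """Poll until enough entropy is available or timeout seconds pass.

    Returns True on success, False on timeout. A caller could prompt for
    keystrokes or mouse movement, or start rngd, while this is waiting.
    """
    deadline = time.monotonic() + timeout
    while not enough_entropy(min_bits, path):
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
    return True


if __name__ == "__main__":
    print("entropy estimate:", entropy_available(), "bits")
    if not wait_for_entropy(timeout=5.0):
        print("still low on entropy; ask the user for some keystrokes")
```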

-Steve