Analysis of CVE-2013-4441: pwgen Phonemes mode has heavy bias and is enabled by default

Jan Rusnacko jrusnack at redhat.com
Sun Jul 27 19:39:15 UTC 2014


Hello,

this is what I found when analysing the phonemes bias in pwgen. I haven't forwarded this to upstream yet. Comments appreciated.

Analysis of CVE-2013-4441
-------------------------
pwgen [1] has been reported to generate biased pronounceable passwords, an issue which was assigned CVE-2013-4441.
See original report: http://www.openwall.com/lists/oss-security/2012/01/22/6

The bias that was reported does not concern the probability distribution of characters in pronounceable passwords (for example, the analysis at https://www.miknet.net/security/how-random-password-generators-can-fail/ is completely irrelevant), since they naturally must have some bias towards vowels to be pronounceable. The bias is rather in the overall distribution of passwords: given a set of N pronounceable passwords generated by pwgen, certain passwords have substantially more occurrences (the original report mentions they are as much as 137 times more likely to be generated than they should be).

To illustrate the bias, I disabled the length check and recompiled pwgen to generate 2-character pronounceable passwords. I generated 10 million of them and counted their occurrences, along with the increase/decrease with respect to the expected count (see attached). From the stats it is clear that bigrams starting with a vowel followed by a consonant are 73% less likely to be generated than expected. On the other hand, diphthongs starting with a vowel are a whopping 850% more likely to be generated.
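
The percentages in the attached stats appear to be each bigram's deviation from a uniform share, that is, the 10 million samples divided by the number of distinct bigrams observed (229 in the first attachment). A minimal sketch of that calculation, assuming this reading of "expected count"; the name deviation_percent is made up for illustration and is not part of pwgen:

```c
#include <assert.h>

/*
 * Percentage by which an observed count deviates from the count a
 * uniform generator would produce.
 * Assumption: "expected" means total_samples / distinct_outcomes.
 * deviation_percent is an illustrative name, not a pwgen function.
 */
double deviation_percent(long observed, long total_samples, long distinct_outcomes)
{
    assert(total_samples > 0 && distinct_outcomes > 0);
    double expected = (double)total_samples / (double)distinct_outcomes;
    return ((double)observed / expected - 1.0) * 100.0;
}
```

For example, deviation_percent(11336, 10000000, 229) gives about -74.04 and deviation_percent(416161, 10000000, 229) gives about 853.01, which agrees with the "et" and "ai" lines of the first attachment.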

The code makes use of the array elements, whose entries hold characters or diphthongs along with flags:

struct pw_element elements[] = {
	{ "a",	VOWEL },
	{ "ae", VOWEL | DIPTHONG },
	{ "ah",	VOWEL | DIPTHONG },
	{ "ai", VOWEL | DIPTHONG },
	{ "b",  CONSONANT },
	...

Looking at the code (pwgen-2.06/pw_phonemes.c), there are several places which contribute to the bias:

		if (should_be == CONSONANT) {
			should_be = VOWEL;
		} else { /* should_be == VOWEL */
			if ((prev & VOWEL) ||
			     (flags & DIPTHONG) ||
			     (pw_number(10) > 3))
			 	should_be = CONSONANT;
			 else
			 	should_be = VOWEL; 
		}

The variable should_be indicates whether the element being added is a VOWEL or a CONSONANT. If it is a VOWEL, there is a possibility that should_be will again be assigned VOWEL. This means pairs like oo, ae, ai, ie can be generated when program flow goes through this branch. The bad news is that pairs like oo, ae, ai, ie etc. are also diphthongs, which means there are two ways of generating them and a higher likelihood that they will end up in a password.
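
The branch that allows two vowels in a row can be isolated as a small pure function. The sketch below restates the condition from the excerpt and counts how many of the ten possible rolls leave should_be at VOWEL after a single vowel at the start of a password. It assumes pw_number(10) returns a uniform integer from 0 to 9; the flag values are placeholders chosen for the example and not necessarily those in pwgen's header, and the function names are hypothetical:

```c
#include <assert.h>

/* Placeholder flag values for this sketch (assumed, not taken from pwgen). */
enum { CONSONANT = 0x1, VOWEL = 0x2, DIPTHONG = 0x4 };

/*
 * Same decision as the excerpt above, with the random roll passed in so
 * the function is deterministic. prev holds the flags of the previous
 * element, flags those of the element just added.
 */
int next_should_be(int should_be, int prev, int flags, int roll)
{
    if (should_be == CONSONANT)
        return VOWEL;
    if ((prev & VOWEL) || (flags & DIPTHONG) || roll > 3)
        return CONSONANT;
    return VOWEL;
}

/*
 * Count the rolls (0..9, assumed uniform) after which a vowel may follow
 * a single vowel that starts the password (prev == 0).
 */
int rolls_allowing_second_vowel(void)
{
    int count = 0;
    for (int roll = 0; roll < 10; roll++)
        if (next_should_be(VOWEL, 0, VOWEL, roll) == VOWEL)
            count++;
    return count;
}
```

Under these assumptions 4 of the 10 rolls keep should_be at VOWEL, so a single leading vowel is followed by another vowel about 40% of the time.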

		should_be = pw_number(2) ? VOWEL : CONSONANT;

This is line 72 of pw_phonemes.c, where should_be is initialized before the first character of the password is generated. There are fewer vowels than consonants, yet the initialization splits the chance 50:50, so for odd-length passwords there is a heavy bias towards generating a vowel-starting password.

Another bias comes from the fact that diphthongs are essentially treated as single characters, and since they are of length 2, they are more likely to stay in the password before it is cut off at the maximum size. This bias is not accounted for in any way. Together with the condition mentioned above, this bias towards diphthongs creates another one: digrams starting with a vowel are less likely (-73 %) than digrams starting with a consonant (-8 %).
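
The three effects described above can be combined into a small closed-form model of the 2-character experiment. The sketch below is one reading of the algorithm, not code from pwgen. It assumes the initial 50:50 split, a uniform choice among the elements of the required class, a 4-in-10 chance of a second vowel (rolls 0 to 3 of pw_number(10)), and that only single characters fit in the second position. The element counts are inferred from the bigrams that occur in the attached stats, and all names are hypothetical:

```c
#include <assert.h>

/* Element counts inferred from the attached stats (assumptions). */
enum {
    SINGLE_VOWELS = 5,         /* a e i o u */
    VOWEL_DIPHTHONGS = 8,      /* ae ah ai ee ei ie oh oo */
    SINGLE_CONSONANTS = 20,
    CONSONANT_DIPHTHONGS = 5   /* ch ph qu sh th */
};

/* Probability of one specific bigram in each class. */
struct bigram_model {
    double vowel_consonant;   /* e.g. "ab" */
    double vowel_vowel;       /* e.g. "aa", not itself a diphthong */
    double consonant_vowel;   /* e.g. "ba" */
    double consonant_diph;    /* e.g. "ch" */
    double diph_plus_vc;      /* "ah", "oh": diphthong or vowel + consonant */
    double diph_plus_vv;      /* "ae", "oo", ...: diphthong or vowel + vowel */
};

struct bigram_model model_two_char(void)
{
    const double p_start = 0.5;                /* initial 50:50 split */
    const double p_second_vowel = 4.0 / 10.0;  /* rolls 0..3 out of 0..9 */
    const double v = SINGLE_VOWELS + VOWEL_DIPHTHONGS;
    const double c = SINGLE_CONSONANTS + CONSONANT_DIPHTHONGS;
    const double vowel_diph = p_start / v;     /* diphthong picked directly */
    struct bigram_model m;

    m.vowel_consonant = p_start / v * (1.0 - p_second_vowel) / SINGLE_CONSONANTS;
    m.vowel_vowel     = p_start / v * p_second_vowel / SINGLE_VOWELS;
    m.consonant_vowel = p_start / c / SINGLE_VOWELS;
    m.consonant_diph  = p_start / c;
    /* Bigrams reachable in two ways get the sum of both paths. */
    m.diph_plus_vc    = vowel_diph + m.vowel_consonant;
    m.diph_plus_vv    = vowel_diph + m.vowel_vowel;
    return m;
}
```

Multiplied by 10 million, this model predicts roughly 11,500, 30,800, 40,000, 200,000, 396,000 and 415,000 occurrences for the six classes, which is close to the counts in the first attachment (for example ab, aa, ba, ch, ah and ae).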

To even out the bias, I removed the diphthongs ah, oh and qu, removed the complicated condition and added a dice throw to even out the diphthong bias (patch and stats attached). The bias with the patch is much lower (from 85 % for vowel diphthongs to -48 % for consonant diphthongs) and perhaps fixes the CVE. But here comes the catch: the patch actually decreases the security of pwgen. By removing three diphthongs and removing the weird conditional allowing two vowels in a row, it effectively decreases the number of passwords that can possibly be generated. Just for digraphs it decreases the password space from 229 to 209. For longer passwords this grows fast and surpasses the bias in generated passwords by orders of magnitude.

For fun I also generated 10 million pronounceable digraphs (yeah!) with apg [2] (stats attached). The manpage claims it is based on NIST's FIPS-181 standard, and its bias seems much smoother, but not too small either. In fact it might be interesting to conduct thorough research into whether pwgen's passwords are more biased than apg's.

For completeness, there are methods for generating pronounceable passwords with provably no bias (I think [3] is an example, but unfortunately it seems to be patented).

TL;DR
The algorithm in pwgen will inherently produce biased passwords. A fix to mitigate the bias will inevitably decrease the password space by orders of magnitude, which lowers security much more than removing the bias improves it. Also, the apg password generator based on FIPS-181 has a comparably big bias, so it is debatable whether CVE-2013-4441 should be valid at all.

[1] http://sourceforge.net/projects/pwgen/
[2] http://www.adel.nursat.kz/apg/
[3] http://www.google.com/patents/US5588056
-- 
Jan Rusnacko, Red Hat Product Security

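-------------- next part --------------
Stats for unpatched pwgen (columns: bigram, occurrences in 10 million samples, deviation from expected count):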
et	11336	-74.04 %
oj	11275	-74.18 %
ab	11421	-73.85 %
ac	11671	-73.27 %
ad	11536	-73.58 %
af	11532	-73.59 %
ag	11544	-73.56 %
aj	11465	-73.75 %
ak	11645	-73.33 %
al	11620	-73.39 %
am	11585	-73.47 %
an	11539	-73.58 %
ap	11653	-73.31 %
ar	11449	-73.78 %
as	11624	-73.38 %
at	11433	-73.82 %
av	11512	-73.64 %
aw	11444	-73.79 %
ax	11460	-73.76 %
ay	11494	-73.68 %
az	11447	-73.79 %
eb	11690	-73.23 %
ec	11534	-73.59 %
ed	11417	-73.85 %
ef	11559	-73.53 %
eg	11528	-73.60 %
eh	11452	-73.77 %
ej	11358	-73.99 %
ek	11562	-73.52 %
el	11474	-73.72 %
em	11569	-73.51 %
en	11780	-73.02 %
ep	11596	-73.45 %
er	11577	-73.49 %
es	11516	-73.63 %
ev	11559	-73.53 %
ew	11385	-73.93 %
ex	11594	-73.45 %
ey	11487	-73.69 %
ez	11526	-73.61 %
ib	11719	-73.16 %
ic	11399	-73.90 %
id	11452	-73.77 %
if	11529	-73.60 %
ih	11511	-73.64 %
ij	11552	-73.55 %
ik	11656	-73.31 %
il	11601	-73.43 %
im	11485	-73.70 %
in	11624	-73.38 %
ip	11694	-73.22 %
ir	11512	-73.64 %
is	11471	-73.73 %
it	11379	-73.94 %
iv	11516	-73.63 %
iw	11384	-73.93 %
ix	11515	-73.63 %
iy	11686	-73.24 %
iz	11557	-73.53 %
ob	11515	-73.63 %
oc	11557	-73.53 %
od	11485	-73.70 %
of	11582	-73.48 %
og	11426	-73.83 %
ok	11449	-73.78 %
ol	11648	-73.33 %
om	11699	-73.21 %
on	11742	-73.11 %
op	11633	-73.36 %
or	11533	-73.59 %
os	11619	-73.39 %
ov	11709	-73.19 %
ow	11729	-73.14 %
ox	11426	-73.83 %
oy	11574	-73.50 %
oz	11492	-73.68 %
ub	11499	-73.67 %
uc	11540	-73.57 %
ud	11698	-73.21 %
uf	11433	-73.82 %
ug	11669	-73.28 %
uh	11431	-73.82 %
uj	11554	-73.54 %
uk	11476	-73.72 %
ul	11559	-73.53 %
um	11660	-73.30 %
un	11500	-73.66 %
up	11601	-73.43 %
ur	11492	-73.68 %
us	11508	-73.65 %
ut	11483	-73.70 %
uv	11409	-73.87 %
uw	11565	-73.52 %
ux	11724	-73.15 %
uy	11516	-73.63 %
uz	11779	-73.03 %
ig	11796	-72.99 %
ot	11802	-72.97 %
ea	30301	-30.61 %
ia	30525	-30.10 %
ui	30567	-30.00 %
aa	30765	-29.55 %
ao	30887	-29.27 %
au	30719	-29.65 %
eo	30630	-29.86 %
eu	30805	-29.46 %
ii	30953	-29.12 %
io	30812	-29.44 %
iu	30611	-29.90 %
oa	30927	-29.18 %
oe	30817	-29.43 %
oi	30732	-29.62 %
ou	30576	-29.98 %
ua	30856	-29.34 %
ue	30657	-29.80 %
uu	30642	-29.83 %
uo	31100	-28.78 %
bo	39577	-9.37 %
co	39430	-9.71 %
ga	39675	-9.14 %
gi	39715	-9.05 %
ka	39735	-9.01 %
na	39600	-9.32 %
ra	39622	-9.27 %
ro	39709	-9.07 %
si	39380	-9.82 %
to	39692	-9.11 %
vo	39736	-9.00 %
wo	39589	-9.34 %
yi	39610	-9.29 %
ba	39994	-8.41 %
be	39953	-8.51 %
bi	40142	-8.07 %
cu	39952	-8.51 %
da	39853	-8.74 %
de	39913	-8.60 %
do	39760	-8.95 %
du	40091	-8.19 %
fa	39819	-8.81 %
fe	39869	-8.70 %
fi	39789	-8.88 %
fo	39877	-8.68 %
fu	40012	-8.37 %
ge	40028	-8.34 %
gu	39952	-8.51 %
ha	40046	-8.29 %
hi	39920	-8.58 %
ho	40004	-8.39 %
hu	39777	-8.91 %
ja	40167	-8.02 %
ji	40058	-8.27 %
ju	39804	-8.85 %
ke	39994	-8.41 %
ki	39824	-8.80 %
ku	40015	-8.37 %
la	40115	-8.14 %
le	40050	-8.29 %
li	40038	-8.31 %
lu	39910	-8.61 %
ma	39778	-8.91 %
me	40132	-8.10 %
mi	40142	-8.07 %
ne	39974	-8.46 %
ni	39827	-8.80 %
no	40018	-8.36 %
nu	39996	-8.41 %
pa	39911	-8.60 %
pe	39996	-8.41 %
pi	40018	-8.36 %
po	40074	-8.23 %
pu	40119	-8.13 %
sa	39896	-8.64 %
se	39889	-8.65 %
su	39951	-8.51 %
ti	39901	-8.63 %
va	39866	-8.71 %
ve	39995	-8.41 %
vi	40140	-8.08 %
vu	39891	-8.65 %
wa	40011	-8.37 %
we	40019	-8.36 %
wi	40086	-8.20 %
wu	39986	-8.43 %
xa	39773	-8.92 %
xe	40168	-8.02 %
xi	40095	-8.18 %
xo	39785	-8.89 %
xu	39917	-8.59 %
ya	40116	-8.13 %
ye	39945	-8.53 %
yo	40171	-8.01 %
yu	40069	-8.24 %
zi	39746	-8.98 %
zo	39883	-8.67 %
bu	40343	-7.61 %
ca	40206	-7.93 %
ce	40291	-7.73 %
ci	40217	-7.90 %
di	40346	-7.61 %
go	40392	-7.50 %
he	40278	-7.76 %
je	40193	-7.96 %
jo	40459	-7.35 %
ko	40232	-7.87 %
lo	40269	-7.78 %
mo	40242	-7.85 %
mu	40248	-7.83 %
re	40247	-7.83 %
ri	40257	-7.81 %
ru	40295	-7.72 %
so	40231	-7.87 %
ta	40251	-7.82 %
te	40208	-7.92 %
tu	40341	-7.62 %
za	40265	-7.79 %
ze	40229	-7.88 %
ch	199969	357.93 %
ph	200096	358.22 %
sh	200050	358.12 %
th	200077	358.18 %
qu	200449	359.03 %
oh	396031	806.91 %
ah	396534	808.07 %
ae	414253	848.64 %
ie	414869	850.05 %
ee	415376	851.21 %
oo	415616	851.76 %
ei	415839	852.27 %
ai	416161	853.01 %
-------------- next part --------------
A non-text attachment was scrubbed...
Name: phonemes.patch
Type: text/x-patch
Size: 2033 bytes
Desc: not available
URL: <http://lists.fedoraproject.org/pipermail/security-team/attachments/20140727/ee424938/attachment.bin>
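-------------- next part --------------
Stats for patched pwgen (columns: bigram, occurrences in 10 million samples, deviation from expected count):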
th	24281	-49.25 %
ch	24682	-48.41 %
ph	24523	-48.75 %
sh	24521	-48.75 %
ab	44400	-7.20 %
ag	44474	-7.05 %
aj	44333	-7.34 %
ak	44344	-7.32 %
at	44474	-7.05 %
eg	44421	-7.16 %
er	44301	-7.41 %
ez	44424	-7.15 %
ic	44456	-7.09 %
ih	44474	-7.05 %
in	44156	-7.71 %
iy	44423	-7.15 %
of	44494	-7.01 %
og	44418	-7.16 %
om	44463	-7.07 %
on	44448	-7.10 %
ox	44412	-7.18 %
oy	44454	-7.09 %
ug	44487	-7.02 %
uj	44415	-7.17 %
uk	44497	-7.00 %
um	44189	-7.64 %
up	44427	-7.15 %
ut	44489	-7.02 %
ux	44103	-7.82 %
ac	44656	-6.67 %
ad	44694	-6.59 %
af	44642	-6.70 %
ah	44604	-6.78 %
al	44729	-6.51 %
am	44573	-6.84 %
an	44642	-6.70 %
ap	44768	-6.43 %
as	44631	-6.72 %
av	44904	-6.15 %
aw	44789	-6.39 %
ay	44778	-6.41 %
az	44743	-6.49 %
eb	44841	-6.28 %
ec	44615	-6.75 %
ed	44511	-6.97 %
ef	44611	-6.76 %
eh	44607	-6.77 %
ek	44556	-6.88 %
el	44672	-6.63 %
em	44813	-6.34 %
en	44819	-6.33 %
ep	44606	-6.77 %
es	44919	-6.12 %
et	44569	-6.85 %
ev	44598	-6.79 %
ew	44613	-6.76 %
ex	44726	-6.52 %
ey	44559	-6.87 %
ib	44826	-6.31 %
id	44618	-6.75 %
if	44541	-6.91 %
ig	44720	-6.53 %
ij	44532	-6.93 %
ik	44550	-6.89 %
im	44518	-6.96 %
ip	44572	-6.84 %
ir	44761	-6.45 %
is	44655	-6.67 %
it	44784	-6.40 %
iv	44851	-6.26 %
iw	44602	-6.78 %
ix	44505	-6.98 %
iz	44613	-6.76 %
ob	44747	-6.48 %
oc	44636	-6.71 %
od	44638	-6.70 %
ok	44675	-6.63 %
ol	44819	-6.33 %
op	44813	-6.34 %
or	44570	-6.85 %
ot	44615	-6.75 %
ov	44654	-6.67 %
ow	44651	-6.68 %
oz	44839	-6.28 %
ub	44626	-6.73 %
uc	44566	-6.86 %
ud	44815	-6.33 %
uf	44629	-6.72 %
uh	44814	-6.34 %
ul	44530	-6.93 %
un	44637	-6.71 %
ur	44706	-6.56 %
us	44840	-6.28 %
uv	44569	-6.85 %
uw	44541	-6.91 %
uy	44611	-6.76 %
uz	44867	-6.23 %
ar	45044	-5.86 %
ax	44981	-5.99 %
ej	45120	-5.70 %
il	45020	-5.91 %
oh	45028	-5.89 %
oj	45050	-5.84 %
os	44989	-5.97 %
ce	48664	1.71 %
fe	48795	1.98 %
fi	48729	1.85 %
hi	48697	1.78 %
hu	48616	1.61 %
je	48788	1.97 %
ki	48473	1.31 %
no	48556	1.48 %
nu	48772	1.94 %
pi	48567	1.51 %
ra	48533	1.44 %
su	48682	1.75 %
ze	48734	1.86 %
ba	49149	2.72 %
be	48952	2.31 %
bi	49107	2.64 %
bo	48981	2.37 %
bu	49107	2.64 %
ca	49103	2.63 %
ci	49077	2.57 %
co	49114	2.65 %
cu	49122	2.67 %
da	49131	2.69 %
di	48920	2.24 %
du	49040	2.50 %
fa	48825	2.05 %
fo	49219	2.87 %
fu	48868	2.14 %
ga	49082	2.58 %
ge	48860	2.12 %
gi	49269	2.97 %
go	49111	2.64 %
gu	49000	2.41 %
ha	49009	2.43 %
he	49088	2.60 %
ho	49007	2.43 %
ja	49234	2.90 %
ji	49239	2.91 %
jo	49225	2.88 %
ju	49061	2.54 %
ka	49229	2.89 %
ku	49088	2.60 %
la	49066	2.55 %
le	48950	2.31 %
li	48848	2.09 %
lo	49146	2.72 %
lu	48911	2.23 %
ma	49086	2.59 %
me	49018	2.45 %
mi	49186	2.80 %
mo	48955	2.32 %
mu	48873	2.15 %
na	48993	2.40 %
ne	48931	2.27 %
ni	49072	2.56 %
pa	49213	2.86 %
pe	49264	2.96 %
po	48832	2.06 %
pu	48943	2.29 %
re	48967	2.34 %
ri	49023	2.46 %
ro	48997	2.41 %
ru	49087	2.59 %
se	49050	2.52 %
so	49067	2.55 %
ta	49058	2.53 %
te	49240	2.91 %
ti	49166	2.76 %
va	49081	2.58 %
ve	49147	2.72 %
vi	49028	2.47 %
vo	48908	2.22 %
vu	49058	2.53 %
wa	48941	2.29 %
we	49056	2.53 %
wi	49007	2.43 %
wu	48875	2.15 %
xa	49138	2.70 %
xi	49250	2.93 %
xo	48849	2.10 %
xu	48982	2.37 %
ya	48938	2.28 %
ye	48813	2.02 %
yi	48893	2.19 %
yo	48927	2.26 %
yu	49245	2.92 %
zo	49238	2.91 %
de	49506	3.47 %
do	49634	3.74 %
ke	49318	3.08 %
ko	49403	3.25 %
sa	49298	3.03 %
si	49357	3.16 %
to	49652	3.77 %
tu	49282	3.00 %
wo	49302	3.04 %
xe	49384	3.21 %
za	49337	3.12 %
zi	49281	3.00 %
ee	88901	85.81 %
ei	88779	85.55 %
ae	89359	86.76 %
ai	89262	86.56 %
ie	89016	86.05 %
oo	89804	87.69 %
-------------- next part --------------
Stats for apg (columns: bigram, occurrences in 10 million samples, deviation from expected count):
uz	3138	-94.01 %
oz	3657	-93.02 %
ux	3144	-93.99 %
az	4090	-92.19 %
ex	4187	-92.00 %
ez	4169	-92.04 %
ox	3718	-92.90 %
ax	4270	-91.84 %
ix	4353	-91.69 %
iz	4501	-91.40 %
zu	6152	-88.25 %
za	10464	-80.01 %
zi	10456	-80.03 %
zo	10380	-80.17 %
um	18297	-65.05 %
ul	18537	-64.59 %
up	18592	-64.49 %
om	21924	-58.13 %
op	21985	-58.01 %
ol	22042	-57.90 %
ou	22056	-57.87 %
eu	24543	-53.12 %
ub	24449	-53.30 %
ug	24524	-53.16 %
uk	24592	-53.03 %
us	24548	-53.11 %
al	24871	-52.50 %
am	24774	-52.68 %
ap	24964	-52.32 %
au	24644	-52.93 %
el	24909	-52.42 %
em	24875	-52.49 %
ep	24774	-52.68 %
uf	24699	-52.82 %
uj	24735	-52.76 %
uv	24632	-52.95 %
im	26585	-49.22 %
ip	26394	-49.59 %
su	26294	-49.78 %
il	26954	-48.52 %
ob	29144	-44.33 %
of	29303	-44.03 %
ok	29146	-44.33 %
os	29256	-44.12 %
ov	29198	-44.23 %
og	29368	-43.91 %
oj	29454	-43.74 %
ow	29559	-43.54 %
oy	29379	-43.89 %
un	30483	-41.78 %
ur	30811	-41.15 %
ut	30943	-40.90 %
af	32954	-37.06 %
ak	32979	-37.01 %
as	32877	-37.20 %
av	32794	-37.36 %
ay	32667	-37.61 %
ef	32679	-37.58 %
ej	32745	-37.46 %
ew	32935	-37.09 %
ab	33015	-36.94 %
ag	33089	-36.80 %
aj	33016	-36.94 %
aw	33088	-36.80 %
eb	33131	-36.72 %
eg	33118	-36.74 %
ek	33152	-36.68 %
es	33377	-36.25 %
ev	33104	-36.77 %
ey	33102	-36.78 %
ig	35546	-32.11 %
iv	35515	-32.17 %
sy	35186	-32.79 %
ib	35727	-31.76 %
if	35617	-31.97 %
ij	35850	-31.53 %
ik	36078	-31.09 %
is	35734	-31.75 %
oi	36557	-30.18 %
oa	36756	-29.80 %
on	37094	-29.15 %
oo	36926	-29.47 %
or	36769	-29.77 %
ot	36901	-29.52 %
uc	37138	-29.07 %
ud	37038	-29.26 %
ai	41037	-21.62 %
an	41196	-21.32 %
ar	41114	-21.47 %
at	41111	-21.48 %
ei	41306	-21.11 %
en	41126	-21.45 %
eo	41274	-21.17 %
er	40954	-21.78 %
et	41065	-21.57 %
ea	41540	-20.66 %
lu	42837	-18.18 %
hu	43113	-17.65 %
ku	43094	-17.69 %
mu	43072	-17.73 %
oc	43735	-16.47 %
si	43883	-16.18 %
so	43745	-16.45 %
in	44503	-15.00 %
io	44494	-15.02 %
ir	44120	-15.73 %
it	44131	-15.71 %
od	43982	-15.99 %
sa	44056	-15.85 %
ia	44643	-14.73 %
wu	48112	-8.11 %
py	48383	-7.59 %
ac	49318	-5.80 %
ad	49485	-5.48 %
ec	49340	-5.76 %
ed	49582	-5.30 %
ee	49693	-5.09 %
fu	49962	-4.57 %
gu	50047	-4.41 %
bu	51676	-1.30 %
ic	53497	2.18 %
id	53735	2.63 %
tu	54318	3.75 %
my	57015	8.90 %
hy	57381	9.60 %
ly	57465	9.76 %
po	60711	15.96 %
pa	60810	16.15 %
pi	60921	16.36 %
wy	64766	23.70 %
yu	66473	26.96 %
cu	66652	27.31 %
ju	66573	27.15 %
vu	66897	27.77 %
by	69145	32.07 %
ru	71119	35.84 %
ho	71670	36.89 %
la	71669	36.89 %
li	71692	36.93 %
ma	71571	36.70 %
ha	71856	37.25 %
hi	71783	37.11 %
ka	71917	37.36 %
ki	72212	37.92 %
lo	71984	37.49 %
mi	72104	37.72 %
mo	72011	37.54 %
ty	72032	37.58 %
ko	72484	38.44 %
du	72845	39.13 %
wa	81434	55.54 %
wo	81156	55.01 %
wi	82183	56.97 %
go	83208	58.93 %
fa	83669	59.81 %
fi	83372	59.24 %
fo	83435	59.36 %
ga	83674	59.82 %
nu	83413	59.32 %
gi	83841	60.14 %
ba	86022	64.30 %
bo	86198	64.64 %
bi	86678	65.56 %
cy	88427	68.90 %
ta	90406	72.68 %
ti	90383	72.63 %
to	90201	72.28 %
ry	95243	81.91 %
dy	96728	84.75 %
vi	110961	111.94 %
ci	111030	112.07 %
co	111302	112.59 %
jo	111232	112.45 %
va	111400	112.77 %
vo	111337	112.65 %
ya	111351	112.68 %
yo	111413	112.80 %
ca	111537	113.04 %
ja	111686	113.32 %
ji	111851	113.64 %
yi	111799	113.54 %
ra	118659	126.64 %
ri	119705	128.64 %
ro	119739	128.70 %
da	120809	130.75 %
di	121147	131.39 %
do	121157	131.41 %
ye	133639	155.25 %
na	138708	164.93 %
ni	138899	165.30 %
no	139342	166.14 %

