FESCO request to revert password confirmation change in F22

Sun Mar 8 01:35:48 UTC 2015

On 7 March 2015 at 15:33, Mike Pinkerton <pselists at mindspring.com> wrote:

>
> On 7 Mar 2015, at 15:52, Stephen John Smoogen wrote:
>
>
>>
>> On 7 March 2015 at 11:53, Mike Pinkerton <pselists at mindspring.com> wrote:
>>
>> On 7 Mar 2015, at 10:41, Björn Persson wrote:
>>
>> Mike Pinkerton wrote:
>> On 6 Mar 2015, at 23:49, Adam Williamson wrote:
>> On Fri, 2015-03-06 at 23:09 +0100, Björn Persson wrote:
>> I hope https://xkcd.com/936/ will be among the inputs to that
>> discussion.
>>
>> I'm fond of noting that pwquality has not yet blacklisted any variant
>> of correcthorsebatterystaple. I've been using correcthorse as my stock
>> anaconda testing password, since the strength check has been
>> enforced...
>>
>> It won't stand up to a combinator attack:
>>
>> <https://www.schneier.com/blog/archives/2013/06/a_really_good_a.html>
>>
>> It's not entirely clear, but I guess you mean that a two-word
>> combination like "correct horse" won't stand up. That appears to be
>> true. A four-word phrase is an entirely different matter. Each
>> additional word increases the complexity exponentially, so doubling the
>> number of words squares the number of possible combinations.
>>
>> The "combinator" attack that is described in the Ars Technica article
>> that Bruce Schneier quotes in the above link appears to be an attack that
>> tries combinations of multiple words from one or more of the attacker's
>> word lists.  Certainly adding more words to the pass-phrase would make that
>> more difficult.  As I don't know the current state of the art in password
>> cracking, I don't know whether attackers typically limit their attacks to
>> only two words, or extend to three or more words.
>>
>>
>> They limit it to 1-2 words because it takes a LONG time to crack
>> SHA512crypt passwords. You can do on average 32k -> 128k hash crypt checks
>> per second per password. A two word dictionary of diceware would have
>> 2^25.85 passwords in it. A single system is going to take 256 seconds on 2
>> words. Add in 3 words (2^38.775) and it is 24 days. Add in a 4th word and
>> it is 544 years. Add in a 5th word and it is 4.5 million years.
>>
>
>
> Apparently Diceware's creator is not as confident as you -- he now
> recommends more than 5 words.
>
> <http://arstechnica.com/information-technology/2014/
> 03/diceware-passwords-now-need-six-random-words-to-thwart-hackers/>
>
> Perhaps improvements in graphics cards have changed the calculus in recent
> years.
>
>
Yes and no.

1) He has always wanted to make sure that an attack would take billions of
years even for the US government. Thus his threat level is the 100 billion
dollar cluster, for which yes, 6 or 7 words would be needed, if not 8. Your
password of completely random characters will also need to be a lot
longer.

2) He is also aware that most of the hacks out there have not been
SHA512crypt but MD5sum/SHAsum/NT password breaches. If you are lucky they
used md5crypt or the original sha1crypt. Those are formats where, yes,
millions of guesses per second can be made in an offline attack. If you
have no control over how the password is stored then using 4 or 5 words is
not enough.

3) Yes, graphics cards improve with more cores, but they do not increase
word size as often because there really isn't much need other than cracking
large passwords (bitcoin, which is the primary use for video cards, doesn't
get faster with a larger word so it isn't something people will pay for).
Without a larger word size the various code for doing a SHA512crypt gets
slow.

Neither of the first two items is something that general users of Linux
need to deal with. If you have to worry about that sort of attack then you
are going to need a lot more work than a 100+ bit entropy password.
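
A minimal Python sketch of the arithmetic behind points like these, assuming
the standard 7776-word diceware list and the roughly 128k guesses per second
for SHA512crypt quoted earlier in the thread. The function names are made up
for illustration and the rate is an assumption, not a benchmark:

```python
import math

DICEWARE_WORDS = 7776         # size of the standard diceware list (6**5)
GUESSES_PER_SECOND = 2 ** 17  # assumed: ~128k sha512crypt guesses/sec
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def passphrase_bits(words, list_size=DICEWARE_WORDS):
    """Entropy in bits of `words` words picked uniformly at random."""
    return words * math.log2(list_size)


def worst_case_seconds(bits, rate=GUESSES_PER_SECOND):
    """Seconds needed to try every possibility at `rate` guesses/sec."""
    return 2 ** bits / rate


def random_chars_needed(bits, alphabet_size):
    """Length of a uniformly random string with at least `bits` bits."""
    return math.ceil(bits / math.log2(alphabet_size))


for n in range(2, 9):
    bits = passphrase_bits(n)
    years = worst_case_seconds(bits) / SECONDS_PER_YEAR
    print(f"{n} words: {bits:6.2f} bits, {years:.3g} years worst case, "
          f"same as {random_chars_needed(bits, 26)} random [a-z] chars")
```

By this estimate 8 diceware words clear 100 bits, and 5 words are about as
strong as 14 random lowercase letters.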

>
>  While writing this up I went and checked that the whole thing is outlined
>> point for point in wikipedia
>> http://en.wikipedia.org/wiki/Password_strength
>>
>> To estimate the time just do the following:
>>
>> $15,000 computer -> 128k/sec = 2^17. Let's assume Moore's law comes in and
>> we have 2^20 by 2020.
>>
>> Take the bits of entropy and subtract 17 (for 2^17 guesses per second);
>> that gives you the worst case in seconds as a power of two. I believe the
>> real rate may be 4 times that, so subtract 19 currently for one system and
>> 29 for a cluster of 1024 computers (so 15 million dollars).
>>
>> 2 words is going to be (25.85-19) 115 seconds for one system and 0.1
>> seconds for the big ass cluster.
>> 3 words is going to be (38.78-19) about 250 hours. < 1 day for the cluster.
>> 4 words is going to be (51.70-19) 221 years. < 1 year for the cluster.
>> 5 words is going to be (64.63-19) 1.7 million years. < 1700 years for the
>> cluster (or 1.7 years for a 15 billion dollar investment).
>>
>> To get equivalent strength from say an all lower case password you are
>> going to need 14 [a-z] characters.
>>
>> Now here is the funny thing. All that speed to get 128k is if the
>> password is less than around 12 characters for most cracking software due
>> to the way the hardware and algorithms have been optimized. If the string
>> is longer than that the hardware drops in speed by orders of magnitude. So
>> correctstaple is actually going to take longer than I said. In fact all the
>> numbers I put for 3+ words are probably going to be 10-100 times longer.
>>
>
>
> All of this assumes that the attacker is trying to brute force the entire
> string -- character by character.  In the Ars Technica article I linked to
> in my previous message, the attackers did not try to brute force anything
> over 6 characters.  Instead, they used other strategies, including the
> combinator strategy that would have broken correcthorse.
>
>
I am going to assume that your definition of brute force is a, b, c, d, e,
f,... all the way to ~~~~~~ . That is 95^6 = 735,091,890,625 things to test.

Second of all, they were testing md5sum passwords. That is a format on which
you can do hundreds of millions of attempts per second on a standard video
card. The speed difference between md5sum and md5crypt is 3-4 orders of
magnitude. The speed difference between md5sum and sha512crypt is much
greater.
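
To put rough numbers on that, here is a small Python sketch. The 10^8
guesses per second for md5sum is an assumed low end of "hundreds of
millions", and the md5crypt rate just applies the upper end of the 3-4
orders of magnitude mentioned above. Neither is a measurement, and the
function names are made up:

```python
PRINTABLE_ASCII = 95  # space through '~'

# Assumed rates, in guesses per second (not benchmarks).
MD5SUM_RATE = 10 ** 8
MD5CRYPT_RATE = MD5SUM_RATE // 10 ** 4


def keyspace(length, alphabet=PRINTABLE_ASCII):
    """Number of strings of exactly `length` characters."""
    return alphabet ** length


def keyspace_up_to(length, alphabet=PRINTABLE_ASCII):
    """Number of strings of 1 to `length` characters."""
    return sum(alphabet ** n for n in range(1, length + 1))


def days_to_exhaust(length, guesses_per_second):
    """Days to try every string of exactly `length` characters."""
    return keyspace(length) / guesses_per_second / 86400


for length in (6, 7, 8):
    print(f"{length} chars: {keyspace(length):,} candidates, "
          f"md5sum {days_to_exhaust(length, MD5SUM_RATE):.3g} days, "
          f"md5crypt {days_to_exhaust(length, MD5CRYPT_RATE):.3g} days")
```

Each extra character multiplies the work by 95, which is why exhaustive
search stops being practical so quickly.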

Third, the thing they took advantage of was that people are lazy. If a
password set were from fedoraproject.org then I would start testing with

fedora
fedoraproject
redhat
linux
password
letmein
foobar
correct
correcthorse
correcthorsebattery
correcthorsebatterystaple
smartass
abcdefghijklmnopqrstuvwxyz
qwertyuiopasdfghjklzxcvbnm

as my main words. I would then test with capitalization and then add in the
most common combinators: 4kids, 4life, 123456, other Linux websites
(slashdot, lwn.net), plus various prefixes and other items. It is still
bruteforcing, it just isn't a,b,c,d,e,f bruteforcing. The reason they
stopped at 6 characters of ASCII is that you can do that on md5sum in a
couple of hours (7 characters takes weeks, 8 takes months?). On md5crypt
doing a brute force takes a couple of years (3 when I did it last.. if
Moore's law is in place for CUDA that should be 1 year now). On sha512crypt
it was close to a decade (2 years now?). [Red Hat Linux and Fedora have
never used md5sum. They used md5crypt for many years and now use
sha512crypt]
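
The ordering described above (core words first, then capitalization, then
common combinators) could be sketched in Python roughly like this. The word
and suffix lists are a small sample taken from this message, and the
function names are hypothetical; real cracking tools use far larger rule
sets:

```python
BASE_WORDS = ["fedora", "redhat", "linux", "password", "correcthorse"]
SUFFIXES = ["4kids", "4life", "123456"]


def case_variants(word):
    """lower, Capitalized and UPPER forms, without duplicates."""
    return list(dict.fromkeys([word, word.capitalize(), word.upper()]))


def candidates(words, suffixes):
    """Yield guesses in stages: plain words, case variants, then suffixes."""
    words = list(words)
    seen = set()
    stages = (
        (w for w in words),
        (v for w in words for v in case_variants(w)),
        (v + s for w in words for v in case_variants(w) for s in suffixes),
    )
    for stage in stages:
        for guess in stage:
            if guess not in seen:  # never test the same string twice
                seen.add(guess)
                yield guess


guesses = list(candidates(BASE_WORDS, SUFFIXES))
print(len(guesses), "guesses, starting with", guesses[:3])
```

The list grows as words x case variants x suffixes, which stays tiny
compared with trying every possible character string.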

So why did they pick up so many passwords? Because a TON of people just add
1, !, 123, 123456, and maybe the popular TV show of the time (bigbang is
popular). You aren't testing all the combinations, you figure out the most
common combinations and use those. The search space is completely different
from a properly run diceware (or similar passphrase generator). It is also
different from a completely random 12 character string. It is more like a
6 character string with a high likelihood of 123456 added to the end. And
that is how most of those passwords were gotten. (I repeated the experiment
after the article was out because I was assessing how hard it would be for
Fedoraproject passwords to be 'gotten'. It took 3 months to go over 10,000
possible passwords. Of those, only people who chose easily guessed passwords
were found (the ones listed above as core words). Now if you had
redhat987654 as your password it would have taken me years to get there
linearly.)

>
>  There are 2 caveats.
>>
>> 1) Once again, Adam was being sarcastic. He knows the password isn't any
>> good because well he TOLD everyone what it was. He was making fun of the
>> fact that libpwquality does not blacklist it.. which means that
>> correctstaple is the new password of choice (when the old one might have
>> been 123456)
>>
>
>
> I saw your first note that Adam was being sarcastic -- although it
> probably doesn't matter what password he is using for offline testing of
> Anaconda and release candidates.
>
> I was responding to Björn Persson's suggestion that, in discussions of
> password quality, correcthorsebatterystaple would be an example of a safe
> password.  My point is that, if attackers are using strategies other than
> brute forcing, which the Ars Technica article suggests is the case, then
> constructing long passwords out of known words is probably not a safe
> strategy.
>
>
The problem with your conjecture is that there is a vast difference between
choosing 2 words yourself and using a computer to choose 2 words. If
you choose two words you are most likely going to choose ones with some
sort of association. Or you will use your language's grammar to do some sort
of association. That cuts down the search space incredibly. [A two word
space, instead of having say 60466176 combinations, will only have 100-1000.]
red hat, red beer, red car, red tomato, red dress. blue hat, blue boat,
blue car, blue coat. If the passwords are really randomly selected via
diceware or similar tools, then that number jumps back to 60 million (with a
mean time of 1/2 of however long it would take to do 60 million.)
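
For comparison, a sketch of machine-chosen words in Python using the
standard `secrets` module. SAMPLE_WORDS is a tiny stand-in list made up for
illustration; a real diceware list has 7776 entries, which is where
60466176 (7776^2) comes from:

```python
import secrets

# Hypothetical stand-in list; a real diceware list has 7776 words.
SAMPLE_WORDS = ["red", "hat", "beer", "tomato", "boat", "coat", "staple",
                "horse", "battery", "correct", "linux", "dress"]


def random_passphrase(wordlist, count):
    """Pick `count` words independently and uniformly at random."""
    return " ".join(secrets.choice(wordlist) for _ in range(count))


def search_space(list_size, count):
    """Number of equally likely passphrases of `count` words."""
    return list_size ** count


print(random_passphrase(SAMPLE_WORDS, 4))
print(f"2 diceware words: {search_space(7776, 2):,} combinations, "
      f"{search_space(7776, 2) // 2:,} guesses on average")
```

Because every word is drawn independently, "red tomato" is exactly as likely
as "red hat", so an attacker gains nothing from word associations.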

> Because the word lists used by attackers are lists of strings that they
> have scraped from various sources -- human language dictionaries, password
> strings found in previous attacks, passwords publicized by Adam on mailing
> lists, strings constructed on patterns (e.g., "7kids", "8kids"), etc. -- a
> string that one would normally think of as four words --
> correcthorsebatterystaple -- once it has been discovered as a password once
> and added to the attacker's word list, becomes only one word for all future
> cracking attempts.
>
>
>  2) This is always true http://xkcd.com/538/
>>
>> And finally. If one were to take the top 1 million known passwords as the
>> dictionary.. then each word would have about 20 bits of entropy. A password
>> generator that outputted stuff like
>>
>> 123456 password trustn01 letmein1
>>
>> would take 256 or more times longer to brute force crack than using diceware.
>> Actually that sounds like a nice project to add to my EN_RN translation
>> project.
>>
>
>
> Except that the attackers aren't brute forcing long passwords.
> Apparently, they can successfully crack a ridiculously high percentage (90%
> in the Ars Technica experiment) in the space of a day using other
> techniques.
>
> How much entropy does "rastafarianestablishmentarian" have?  With the
> techniques attackers are using, I doubt it would take even one hour to
> crack it.
>

It depends on a lot of things. If the password is stored as an md5sum then
it will probably take a week or so because I have to work my way through
the Oxford dictionary and put all the words that match in a definition
together. If you had put a word in the second part that wasn't associated
with rastafarian... it would take a lot longer.

Rastafarian - Oxford Dictionaries
www.oxforddictionaries.com/us/.../Rastafarian
OxfordDictionaries.com
Rastafarians have distinctive codes of behaviour and dress, including the
... Darien, disciplinarian, egalitarian, equalitarian, establishmentarian,
fruitarian,  ...

Then it is going to depend on the tools I use. For a long time most of the
fast password crackers did not check anything after 15 characters for
md5sum, and they tend to do this for some other formats too.
Thus rastafarianestablishmentarian would only be checked as rastafarianesta
and would never match. This has been fixed in most tools, but there is
still a catch: you are then using the glibc version of the code rather than
the CUDA version, so you are at CPU speed instead of video card speed. CPU
speed on md5sum is in the low millions and md5crypt and sha512crypt are down
in the thousands.
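
A small Python illustration of the truncation problem, assuming a
hypothetical cracker that only hashes the first 15 characters of each
candidate (the `max_len` parameter is invented for the example, not taken
from any real tool):

```python
import hashlib


def md5_hex(text):
    """Plain unsalted MD5 of `text`, as an md5sum-style hex digest."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def cracker_matches(candidate, stored_hash, max_len=None):
    """True if the (possibly truncated) candidate hashes to stored_hash."""
    tested = candidate if max_len is None else candidate[:max_len]
    return md5_hex(tested) == stored_hash


password = "rastafarianestablishmentarian"
stored = md5_hex(password)

print("truncated guess:", password[:15])
print("15-char limit matches:", cracker_matches(password, stored, max_len=15))
print("no limit matches:", cracker_matches(password, stored))
```

Even with the right words in the dictionary, the truncating tool hashes a
different string and so never sees a match.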

At that point, you have to be very smart in your forcing, spending a lot of
time ahead looking for word associations. It would take weeks even with
fast md5sum to go through a brute force of 2 word combinations in the Oxford
dictionary. It would take days if you have already worked out that word X
is going to be associated with words A,B,C,D,E,F.... People like the
experts in that article have done a LOT of homework on human psychology to
make smart guesses. They are also the first to tell you that all that goes
out the window if the passwords are truly randomly put together.

> That was my point.
>
>

>
> --
> Mike
>
>
>

-- 
Stephen J Smoogen.