Cracking Password Condensates... without violence

Before anything else, I recommend reading the presentation I gave at OSSIR in May 2019:

Summary of the presentation

Without repeating the complete content of the presentation, here are the main messages.

Breaking condensates is useful

Breaking password condensates (that is, password hashes) helps you protect yourself by identifying the weakest passwords. For several months, cybercriminals have been attacking Office 365 on a massive scale with two main techniques:

  • "Password spraying", which consists of trying a few predictable passwords (Enterprise2020, Password123...) on a large number of accounts, so as to stay below the per-account lockout threshold for failed attempts (often 5);
  • "Password reuse", which consists of trying passwords found in public data leaks on the associated accounts. If Martin DUPONT from Entreprise had a LinkedIn account with the password "Maman007", then following the leak of the LinkedIn database in 2012 there is a good chance that attackers will try to compromise his Office 365 account by logging in with his login and the password "Maman007".

Sébastien Mériot from OVH gave a very good presentation on the subject at the CORI&IN 2020 conference: Data Leakage & Credential Stuffing. If you could not attend, here is a quick summary:

Breaking condensates can also be useful during a penetration test, for example in an Active Directory environment, by recovering password condensates with the Responder tool or by performing Kerberoasting, which works quite well 😉. For details on Kerberoasting, I recommend this excellent blog post in French by Pixis:

It can be done well for not much money

For a modest cost (at the scale of a company or a criminal organization), it is possible to buy hardware capable of testing tens of billions of passwords per second to break condensates.

The results are always interesting

In general, in companies that have never carried out this type of exercise and/or do not have strong security awareness, between 60% and 80% of the passwords are recovered.

Recently, I worked on this subject for a company and recovered the passwords of 85% of its 80,000 active accounts, including:

  • 1,900 passwords from the famous "World's Top 10 Worst Passwords" (123456, azerty, password, 12345, 123123...);
  • 2,200 passwords containing only the company name, alone or with the year as a suffix;
  • 3,140 passwords available in public data leaks ("password reuse");
  • 25,638 passwords from public dictionaries like CrackStation.

So yes, it is not good, yes, it is unpleasant to discover, and yes, it can be scary, but it is better to find out through this kind of audit and launch a remediation plan than to turn a blind eye and let cybercriminals exploit these weaknesses.


For the conclusions, I refer you to my presentation, to which I will just add: if you use Office 365, enable strong two-factor authentication, it is essential. I hear almost every day about compromised accounts from my customers, contacts, prospects, friends...

My modus operandi

When I recover condensates that I want to break, I generally follow the same modus operandi, which I improve with each iteration.

All of this can of course be improved and criticized...

1 - Search for condensates in my own database

I keep a small database associating passwords with their condensates (NTLM and 160-bit SHA-1), collected from public postings such as Pastebin... If the condensates to be broken are neither NTLM nor 160-bit SHA-1, I move on to the next step.
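To illustrate this lookup step, here is a minimal Python sketch; it is not the actual database. It assumes the known passwords fit in memory and handles SHA-1 only: an NTLM condensate is MD4 over the UTF-16LE encoding of the password, and MD4 is not always available in Python's hashlib, so it is left out. The function names and sample passwords are hypothetical.

```python
import hashlib


def sha1_hex(password: str) -> str:
    """Return the SHA-1 condensate of a password as lowercase hex."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def build_lookup(known_passwords):
    """Map each condensate to the password that produced it."""
    return {sha1_hex(p): p for p in known_passwords}


def lookup(targets, table):
    """Return {condensate: password} for every target present in the table."""
    found = {}
    for condensate in targets:
        key = condensate.strip().lower()  # normalise case and whitespace
        if key in table:
            found[key] = table[key]
    return found


# Hypothetical sample data, for illustration only.
table = build_lookup(["123456", "azerty", "password"])
targets = [sha1_hex("azerty").upper(), "0" * 40]
print(lookup(targets, table))
```

Anything found this way costs no computation at all, which is why it makes sense as a first step.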

2 - Specific dictionary

I build a dictionary specific to my target, from:

  • The Wikipedia of the target company, in most languages;
  • The website of the company, its brands, its subsidiaries, its parent company... with the CeWL tool (which I hate because it is developed in Ruby) or manually;
  • Press articles about the company, its brands... ;
  • Facebook accounts of the company, its brands... rather manually but it's quite fast;
  • Twitter accounts of the company, its brands... with the twofi tool when I can make it work, otherwise manually (almost as fast).

I concatenate all this raw data, build a few compound words or expressions by hand, and that gives me a first dictionary.
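The concatenation step can be sketched as follows. This is an illustrative Python example under simple assumptions (a token is a run of letters and digits, short tokens are dropped); the regular expression, the `min_len` threshold and the sample text are not from the original workflow.

```python
import re

# Runs of Unicode letters and digits (underscore excluded).
TOKEN_RE = re.compile(r"[^\W_]+")


def extract_words(raw_text, min_len=3):
    """Return the unique tokens of raw_text, in first-seen order."""
    seen = set()
    words = []
    for token in TOKEN_RE.findall(raw_text):
        if len(token) >= min_len and token not in seen:
            seen.add(token)
            words.append(token)
    return words


# Hypothetical scraped text, for illustration only.
sample = "Entreprise SA, founded in Lyon. Entreprise makes SuperWidget 3000!"
print(extract_words(sample))
```

Case is deliberately preserved here, on the assumption that case variants are produced later by the derivation rules.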

Tool: hashcat with this dictionary and a set of 3 million derivation rules that I maintain over time (those provided by default with hashcat are already very good).
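To give an idea of what a derivation rule does, here is a tiny Python interpreter for a small subset of hashcat's rule functions (`:` no-op, `l` lowercase, `u` uppercase, `c` capitalize, `r` reverse, `d` duplicate, `$X` append X, `^X` prepend X). It is a simplified sketch for illustration, not a replacement for hashcat's rule engine, and the example rules are made up.

```python
def apply_rule(rule, word):
    """Apply a rule made of a few hashcat-style functions to a word."""
    i = 0
    while i < len(rule):
        op = rule[i]
        if op in (" ", ":"):      # separator or no-op
            pass
        elif op == "l":           # lowercase everything
            word = word.lower()
        elif op == "u":           # uppercase everything
            word = word.upper()
        elif op == "c":           # capitalize first letter, lowercase the rest
            word = word[:1].upper() + word[1:].lower()
        elif op == "r":           # reverse
            word = word[::-1]
        elif op == "d":           # duplicate the whole word
            word = word + word
        elif op in ("$", "^"):    # append / prepend the next character
            i += 1
            if i >= len(rule):
                raise ValueError(f"rule {rule!r} ends after {op!r}")
            word = word + rule[i] if op == "$" else rule[i] + word
        else:
            raise ValueError(f"unsupported rule function {op!r}")
        i += 1
    return word


# Hypothetical rules applied to one dictionary word.
for rule in [":", "c", "u", "c $2 $0 $2 $0", "c $!"]:
    print(rule, "->", apply_rule(rule, "entreprise"))
```

Each rule turns every dictionary word into one extra candidate, so a dictionary of N words combined with R rules yields up to N × R candidates.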

Duration: a few minutes

3 - All passwords of up to 7 characters

The technique is simple: test every possible password from 1 to 7 characters long.

Tool: hashcat

Duration: within 15 minutes on 2 RTX 2080 graphics cards for NTLM condensates

3 bis - All passwords of 8 characters

If I have time, I do the same thing as before, but with every possible password of exactly 8 characters.

Tool: hashcat

Duration: within 20 to 22h on 2 RTX 2080 graphics cards for NTLM condensates
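The gap between 15 minutes and about 21 hours follows directly from the size of the search space. The short Python calculation below makes this explicit. It assumes a charset of the 95 printable ASCII characters (the charset actually used is not stated) and takes 21 hours as the midpoint of the quoted range; both are assumptions.

```python
CHARSET_SIZE = 95  # assumption: all printable ASCII characters, space included


def keyspace(charset_size, min_len, max_len):
    """Number of candidates of length min_len..max_len over the charset."""
    return sum(charset_size ** n for n in range(min_len, max_len + 1))


def implied_rate(candidates, seconds):
    """Candidates per second needed to exhaust the keyspace in that time."""
    return candidates / seconds


up_to_7 = keyspace(CHARSET_SIZE, 1, 7)
exactly_8 = keyspace(CHARSET_SIZE, 8, 8)

print(f"1 to 7 characters: {up_to_7:.3e} candidates")
print(f"8 characters:      {exactly_8:.3e} candidates")
print(f"rate implied by 15 minutes: {implied_rate(up_to_7, 15 * 60):.2e}/s")
print(f"rate implied by 21 hours:   {implied_rate(exactly_8, 21 * 3600):.2e}/s")
```

Under these assumptions, both durations correspond to roughly 8 × 10^10 candidates per second, which is in line with the "tens of billions of passwords per second" mentioned earlier. Adding one character multiplies the work by 95, which is why exhaustive search stops being practical soon after 8 characters.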

4 - InsidePro dictionary

I use the 31 MB InsidePro dictionary (after removing the entries it shares with CrackStation, see the next step)

Tool: hashcat and my 3 million derivation rules

Duration: a few tens of minutes on 2 RTX 2080 graphics cards for NTLM condensates

5 - CrackStation dictionary

I use the CrackStation dictionary (downloadable from their website), which I first cleaned, sorted and deduplicated (# sort -u | awk 'length($0) > 4 && length($0) < 41' )
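For readers who prefer a script to a shell pipeline, here is a rough Python equivalent of that cleanup: remove duplicates, keep only entries of 5 to 40 characters, and sort. It is a sketch, not an exact clone: it holds the whole list in memory, counts Unicode code points rather than whatever awk counts under the current locale, and sorts by code point rather than by locale. The function name and sample entries are hypothetical.

```python
def clean_wordlist(lines, min_len=5, max_len=40):
    """Deduplicate, keep entries of min_len..max_len characters, and sort."""
    unique = set()
    for line in lines:
        word = line.rstrip("\r\n")  # strip the line ending only
        if min_len <= len(word) <= max_len:
            unique.add(word)
    return sorted(unique)


# Hypothetical sample entries, for illustration only.
raw = ["abcd\n", "hello\n", "hello\n", "a" * 41 + "\n", "zebra1\n", "apple\n"]
print(clean_wordlist(raw))
```

With a real file, the `lines` argument can simply be an open file object, since iterating over it yields one line at a time.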

Tool: hashcat and my 3 million derivation rules

Duration: within 24h on 2 RTX 2080 graphics cards for NTLM condensates

6 - Personal dictionary

For a while now I have been building a password dictionary from all the data leaks I can get hold of. To date it weighs 43 GB. It is not exhaustive (I have neither the time nor the criminal networks to recover everything 😊) but it complements the previous dictionaries.

Tool: hashcat and my 3 million derivation rules

Duration: between 3 and 4 days on 2 RTX 2080 graphics cards for NTLM condensates

7 - Obvious masks

I created a list of masks representing passwords that can be considered classic, for example all words of 8 letters starting with a capital letter and followed by 4 digits.
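As an illustration, the example above can be written as a hashcat mask and its size computed in a few lines of Python. The sketch assumes the reading "one capital letter, 7 lower-case letters, then 4 digits" and supports only the built-in classes ?l, ?u, ?d, ?s and ?a; the function name is hypothetical.

```python
# Sizes of some hashcat built-in charsets.
CLASS_SIZES = {"l": 26, "u": 26, "d": 10, "s": 33, "a": 95}


def mask_keyspace(mask):
    """Number of candidates generated by a mask (literals count as 1)."""
    total = 1
    i = 0
    while i < len(mask):
        if mask[i] == "?":
            if i + 1 >= len(mask):
                raise ValueError("mask ends with a lone '?'")
            cls = mask[i + 1]
            if cls != "?":              # '??' is a literal question mark
                total *= CLASS_SIZES[cls]
            i += 2
        else:
            i += 1                      # literal character
    return total


mask = "?u" + "?l" * 7 + "?d" * 4
print(mask, mask_keyspace(mask))
```

This single mask covers 26^8 × 10^4 candidates, about 2 × 10^15, so a handful of such masks is enough to fill several hours of computation.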

Tool: hashcat

Duration: between 4 and 6 hours on 2 RTX 2080 graphics cards for NTLM condensates

8 - Custom masks

With the previously recovered passwords and a short list of interesting words (brand name, product name, chemical formula...), I build a dictionary from which I generate masks for hashcat. I developed a small Python tool for that, which I can provide only on request because... it is not exactly clean 😉.

For example, I define the following 4 groups: 'Company', 4 digits, 2 special characters and 2 digits. From these 4 groups, I generate all possible combinations by incrementing the length of each element, which gives the following masks (?d = digit, ?s = special character):

  • Company ?d ?s ?d
  • Company ?d ?s ?s ?d => here the number of special characters has varied
  • Company ?d ?s ?d ?d => we start again, with 2 trailing digits instead of one
  • Company ?d ?s ?s ?d ?d => again with 2 special characters
  • ...
  • Company ?d ?d ?d ?d ?s ?s ?d ?d => this combination ends with every group at its full length
  • ?d Company ?s ?d => a new combination that starts not with the company name but with the second group, of 1 to 4 digits
  • ...
  • ?d ?d ?d ?d Company ?s ?s ?d ?d => end of this combination

Tool: hashcat

Duration: I limit myself to combinations leading to passwords of 10 to 12 characters in order not to exceed one day of calculation in total
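The generation described above can be approximated with the following Python sketch. It is not the tool mentioned in this section, only an illustration under stated assumptions: every ordering of the groups is tried, each group grows from 1 to its maximum length, and identical masks are merged. The group definitions, function names and the 10 to 12 character filter mirror the example in the text; masks are written without spaces, as hashcat expects.

```python
from itertools import permutations, product

# Each group is (token, min_repeat, max_repeat). A literal such as "Company"
# appears once; a class such as "?d" is repeated between min and max times.
GROUPS = [
    ("Company", 1, 1),
    ("?d", 1, 4),
    ("?s", 1, 2),
    ("?d", 1, 2),
]


def candidate_length(order, counts):
    """Length of the passwords a mask produces (a '?x' class counts as 1)."""
    total = 0
    for (token, _, _), n in zip(order, counts):
        total += n if token.startswith("?") else len(token) * n
    return total


def generate_masks(groups, min_total=None, max_total=None):
    """Return the sorted, deduplicated masks for all orders and lengths."""
    masks = set()
    for order in permutations(groups):
        ranges = [range(lo, hi + 1) for _, lo, hi in order]
        for counts in product(*ranges):
            length = candidate_length(order, counts)
            if min_total is not None and length < min_total:
                continue
            if max_total is not None and length > max_total:
                continue
            masks.add("".join(tok * n for (tok, _, _), n in zip(order, counts)))
    return sorted(masks)


all_masks = generate_masks(GROUPS)
limited = generate_masks(GROUPS, min_total=10, max_total=12)
print(len(all_masks), "masks in total,", len(limited), "for 10 to 12 characters")
```

The set is what removes redundant masks, for instance when the two digit groups end up next to each other and produce the same run of `?d`.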

9 - Imitation masks

Here again, I use the previously recovered passwords, but this time to generate the masks that would have found them. This method is particularly effective.

If, for example, I have the password "Cuckoo2020!", I deduce the following mask (?u = upper case, ?l = lower case, ?d = digit, ?s = special character): ?u ?l ?l ?l ?l ?l ?d ?d ?d ?d ?s

I developed a small Python tool for this, which I can provide only on request because... it is, again, not exactly clean 😉.
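The deduction itself is easy to sketch in Python. The example below is an illustration, not the script mentioned above: it maps each character to a hashcat class, skips passwords containing characters outside those four classes, and ranks masks by how many recovered passwords they match. The function names and sample passwords are hypothetical.

```python
import string
from collections import Counter

# hashcat's ?s class: ASCII punctuation plus the space character.
SPECIALS = set(string.punctuation + " ")


def password_to_mask(password):
    """Return the hashcat mask matching a password, or None if unsupported."""
    parts = []
    for ch in password:
        if ch in string.ascii_uppercase:
            parts.append("?u")
        elif ch in string.ascii_lowercase:
            parts.append("?l")
        elif ch in string.digits:
            parts.append("?d")
        elif ch in SPECIALS:
            parts.append("?s")
        else:
            return None  # e.g. accented letters, not covered by these classes
    return "".join(parts)


def rank_masks(passwords):
    """Return (mask, count) pairs, most frequent mask first."""
    counts = Counter()
    for password in passwords:
        mask = password_to_mask(password)
        if mask:
            counts[mask] += 1
    return counts.most_common()


print(password_to_mask("Cuckoo2020!"))
print(rank_masks(["Cuckoo2020!", "Banana1999?", "azerty"]))
```

Ranking by frequency is one way to decide which masks to run first when the total breaking time is capped.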

Tool: hashcat

Duration: between 12 and 24h; as this script is a bit more advanced than the previous one, I cap the breaking time at 1 day here

10 - Found passwords + derivation rules

Here again, I use the previously recovered passwords and I simply use them as a new dictionary.
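One possible way to build that dictionary is to extract the plaintexts from hashcat's results. The Python sketch below assumes result lines of the form `condensate:password` with a condensate that contains no colon (as for NTLM), and assumes that some plaintexts may be stored as `$HEX[...]`; both points should be checked against your hashcat version and settings. The function name and the placeholder condensates are hypothetical.

```python
import re

# Plaintexts that hashcat wrote in hexadecimal form, e.g. $HEX[613a62].
HEX_RE = re.compile(r"^\$HEX\[((?:[0-9a-fA-F]{2})*)\]$")


def plaintexts(result_lines):
    """Return the unique recovered passwords, in first-seen order."""
    seen = set()
    words = []
    for line in result_lines:
        line = line.rstrip("\r\n")
        if ":" not in line:
            continue
        _, plain = line.split(":", 1)  # the password itself may contain ':'
        match = HEX_RE.match(plain)
        if match:
            plain = bytes.fromhex(match.group(1)).decode("utf-8", errors="replace")
        if plain and plain not in seen:
            seen.add(plain)
            words.append(plain)
    return words


# Placeholder condensates (not real NTLM values), for illustration only.
sample = [
    "0" * 32 + ":Cuckoo2020!\n",
    "1" * 32 + ":$HEX[" + "a:b".encode().hex() + "]\n",
    "2" * 32 + ":Cuckoo2020!\n",
]
print(plaintexts(sample))
```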

Tool: hashcat and my 3 million derivation rules

Duration: less than 5 minutes on 2 RTX 2080 graphics cards for NTLM condensates

All this can be improved and I still need to:

  • Connect the scripts of phases 7, 8 and 9 to eliminate redundant masks;
  • Make my scripts more professional and usable by someone other than myself 😃.

If you have been following along, you will have noticed that overall it is always about the same passwords, and that I can hardly come up with a password that has not already been found before, or something close to it. This is true, and that is why I regularly update my own dictionary with new passwords actually found in the wild.

But finding a password that has never been used anywhere before, is not based on a common word and has a good length... that is very difficult, so you know what to do 😉.
