De-Anonymizing User Accounts Through Password Correlation

Executive Summary

We demonstrate that many users of anonymous accounts who are savvy enough to choose complex passwords, but who still reuse their regular password for the anonymous account, can be de-anonymized using nothing more than the limited credential leaks available to the public.

This is demonstrated against Tormail, a now-defunct anonymous email service that provided a high degree of anonymity for its users. Using an old but recently published email/password dump, we were able to de-anonymize, with a high degree of certainty, more than 16% of the 1019 Tormail accounts found in it. This was done by finding Tormail accounts with sufficiently complex passwords and linking them to non-anonymous email accounts that used the same or similar passwords.

Introduction

If a password is rare enough, then it uniquely identifies the person who uses it. If a person uses the same unique password with multiple accounts, then that password can be used as a digital fingerprint to link those accounts. This is not a new principle, but based on the number of vulnerable accounts there seems to be a widespread lack of awareness of just how important it is to use a different password when creating an anonymous account.

In order to meaningfully de-anonymize anyone we need a large set of plaintext and/or hashed credentials and a set of email accounts that are presumed to be anonymous. Because of the historical significance of Tormail and the difficulty of de-anonymizing these accounts, we are going to look at the practical relevance of this technique by seeing how many Tormail accounts we can de-anonymize using the recently discovered 1.4 Billion Email/Password Compilation.

Starting Small With 100 Random Tormail Credentials

To kick off the analysis, 100 random Tormail credentials were selected from the dump. We tried to de-anonymize each of them manually by searching for other entries that contained the password as a case-insensitive substring. Each of the search results was manually inspected to protect against error and to make sure that the correlation was real. The results were:

15 Successfully De-Anonymized: Fifteen high-complexity passwords were correlated with a Tormail account and a small number of non-Tormail accounts. We verified through manual inspection that each of these passwords was random enough that most people would consider it a unique identifier. Here are some examples of these passwords (slightly obfuscated for ethical reasons):
  • gelunbac@m@c@2011*
  • 9thaeGener8ion
  • mlC3Erwg
20 Probably De-Anonymized: Twenty medium-complexity passwords were correlated with a Tormail account and a small number of non-Tormail accounts. However, the passwords were not complex enough, or were too common, to be considered a clear de-anonymization. In these cases it is plausible that a small number of people who are not the owner of the Tormail account are also using the password. Here are some examples (slightly obfuscated for ethical reasons):
  • blue1tie
  • jed1983
  • patriotswin15

65 Remaining Anonymous: These passwords did not clearly correlate with other accounts - they were either too common or were only associated with their corresponding Tormail account.
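
The substring search used in this manual pass can be sketched in a few lines of Python. This is only a minimal illustration, not the tooling actually used: the function name `find_password_matches`, the `(email, password)` tuple layout, and the sample records are all assumptions made up for the example.

```python
from typing import Iterable, List, Tuple

Credential = Tuple[str, str]  # (email, password); layout is an assumption


def find_password_matches(
    target_password: str, credentials: Iterable[Credential]
) -> List[Credential]:
    """Return non-Tormail entries whose password contains target_password,
    compared as a case-insensitive substring."""
    needle = target_password.lower()
    return [
        (email, password)
        for email, password in credentials
        if needle in password.lower() and "@tormail" not in email.lower()
    ]


# Made-up records for illustration only.
sample = [
    ("anon@tormail.org", "Xk7#example"),
    ("jdoe@example.com", "xk7#EXAMPLE1"),
    ("other@example.net", "password"),
]

matches = find_password_matches("Xk7#example", sample)
print(matches)
```

Every hit returned by a search like this still needs the manual inspection described above, since a substring match on a short or common password says very little.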

    Widening the Net

    After looking through all the randomly sampled accounts manually, we derived the following heuristic to quickly surface Tormail accounts that could probably be de-anonymized. We then applied it to the full credential set to surface all of the entries that matched the following conditions:
    • Email address contains "@tormail" as a substring
    • The password meets at least one of the following conditions:
      • Length of at least 10 characters
      • Contains at least three of the following four character types: lower-case, upper-case, digit, symbol
    • At most 15 non-Tormail email addresses have a matching password.
    The threshold of 15 may seem high, but in many cases it was necessary to account for the same username appearing under several domains - jdoe@gmail.com, jdoe@hotmail.com, jdoe@mail.ru, and so on.
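
The conditions above can be expressed as a short filter. The following Python sketch is one possible reading, under several assumptions that are not stated in the original: "symbol" is taken to mean ASCII punctuation, "matching" is taken to mean an exact password match, and an account with zero non-Tormail matches is skipped because there is nothing to link it to. The names `character_classes`, `is_complex`, and `flag_candidates` are hypothetical, and the records are made up.

```python
import string
from collections import defaultdict


def character_classes(password):
    """Count how many of the four character types appear in the password."""
    return sum([
        any(c.islower() for c in password),
        any(c.isupper() for c in password),
        any(c.isdigit() for c in password),
        any(c in string.punctuation for c in password),  # assumption: symbol = ASCII punctuation
    ])


def is_complex(password):
    """At least 10 characters long, or at least three of the four character types."""
    return len(password) >= 10 or character_classes(password) >= 3


def flag_candidates(credentials, max_matches=15):
    """Return (tormail_email, password, matching_emails) for each flagged account."""
    # Index non-Tormail addresses by exact password (assumption: exact match).
    by_password = defaultdict(set)
    for email, password in credentials:
        if "@tormail" not in email.lower():
            by_password[password].add(email.lower())

    flagged = []
    for email, password in credentials:
        if "@tormail" not in email.lower() or not is_complex(password):
            continue
        others = by_password.get(password, set())
        # Assumption: at least one match is required, since zero matches link to nothing.
        if 1 <= len(others) <= max_matches:
            flagged.append((email, password, sorted(others)))
    return flagged


# Made-up records for illustration only.
sample = [
    ("anon1@tormail.org", "Tr0ub4dor&3x"),
    ("jdoe@example.com", "Tr0ub4dor&3x"),
    ("jdoe@example.net", "Tr0ub4dor&3x"),
    ("anon2@tormail.org", "qwerty"),
    ("someone@example.com", "qwerty"),
    ("anon3@tormail.org", "Uniq9!pw"),
]

candidates = flag_candidates(sample)
print(candidates)
```

In this toy data only the first Tormail account is flagged: the second has a password that fails the complexity test, and the third has a complex password that appears nowhere else.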

    This heuristic flagged 191 out of 1019 Tormail accounts present in the dataset as being likely candidates for de-anonymization - a surprisingly large 19%. All of these could probably be de-anonymized with a little additional information, but for the purpose of this article we are only considering Tormail accounts that can be de-anonymized using nothing more than the chosen credential set.

    To be thorough, we manually went through all 191 hits and removed entries where the password contained common words or where it was associated with too many unique account names. An obfuscated example of a password that was removed at this stage is 123Jenny123.

    In the final count, 167 out of 1019 Tormail accounts were de-anonymized with high likelihood using this technique - a total of 16% of all Tormail accounts present in the data set.

    Conclusion

    A surprisingly large number of Tormail users were vulnerable to de-anonymization through password correlation using existing publicly available data sets. This is likely due to a general lack of awareness of the privacy implications of reusing an existing password when creating an anonymous account.

    Additional Insights

    There has been some interesting research lately on attacking anonymity by correlating secret information such as passwords - see On the Privacy Impacts of Publicly Leaked Password Databases by Olivier Heen and Christoph Neumann for an interesting albeit very academic read on the subject.

    After looking through so many passwords we noticed some specific points that are worth calling out where a password gives away details about the user without necessarily correlating them to another account:
    • Using real initials and full year of birth as a password (e.g. jwd1974)
    • Using full date of birth in a password (e.g. YYYYMMDD or something of that sort)
    • Using a real name or non-anonymous username with a number on the end (e.g. JohnDoe1)
    • Using an anonymous account name as password on a regular account
    • Copying and pasting a regular password twice as an anonymous password
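
Some of the patterns listed above can be spotted from the password alone, which makes them easy to check for before choosing a password for an anonymous account. The sketch below is a rough illustration only: the function name `leaked_details`, the labels it returns, and the specific rules (two or three letters plus a year starting with 19 or 20, eight digits that parse as a date, a string of at least eight characters made of two identical halves) are assumptions chosen for the example. The patterns involving a real name or an account name are not covered, because they cannot be recognised without outside knowledge of the user.

```python
import re
from datetime import datetime


def leaked_details(password):
    """Return labels for self-revealing patterns found in the password."""
    findings = []

    # Two or three letters followed by a four-digit year, e.g. initials plus birth year.
    if re.fullmatch(r"[A-Za-z]{2,3}(19|20)\d{2}", password):
        findings.append("initials_and_year")

    # Eight consecutive digits that parse as a valid YYYYMMDD date.
    for match in re.finditer(r"\d{8}", password):
        try:
            datetime.strptime(match.group(), "%Y%m%d")
        except ValueError:
            continue
        findings.append("full_date")
        break

    # The same string repeated twice (each half at least four characters).
    half, remainder = divmod(len(password), 2)
    if remainder == 0 and half >= 4 and password[:half] == password[half:]:
        findings.append("doubled_password")

    return findings


for candidate in ["jwd1974", "19740315", "hunter2hunter2", "correcthorse"]:
    print(candidate, leaked_details(candidate))
```

A check like this only catches the most mechanical cases. It says nothing about whether the password is reused elsewhere, which remains the main risk described in this article.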
