PIN analysis

A good friend of mine, Ian, recently forwarded me an internet joke. The headline was something like:

“All credit card PIN numbers in the World leaked”

The body of the message simply said 0000 0001 0002 0003 0004

Ian’s messages made me chuckle. Then, later the same day, I read this XKCD cartoon. The merging of these two humorous topics created the seed for this article.

 

I love Randall’s work. My favorite, to date, is this one. I have a signed copy of it on my office wall.

Like many of his creations, this cartoon is excellent at bifurcating readers; people read it, then either smile and chuckle, or stare blankly at it followed by a “Huh? I don’t get it!” comment. Then you explain it, and get a reply “Yeeaaaaaa…no, I still don’t get it!”

Esoteric humor in action.

You can be cool and buy his signed artwork too.

 

What is the least common PIN number?

There are 10,000 possible 4-digit PIN codes that can be formed from the digits 0-9. Out of these ten thousand codes, which is the least commonly used?

Which of these pin codes is the least predictable?

Which of these pin codes is the most predictable?

If you were given the task of trying to crack a random credit card by repeatedly trying PIN codes, what order should you try guessing to maximize your chances of selecting the correct number in the shortest time?

If you had to make a prediction about what the least commonly used 4-digit PIN is, what would be your guess?

This tangentially relates to the XKCD cartoon. In Randall’s cartoon, the perpetrator’s plan backfired because his selected license plate was so unique that it was very memorable. What is the least memorable license plate? Ask any spy you know (snigger) what the best way to blend into a crowd is. Their answer will be to not stand out, to appear “normal”, and to not be notable in any way.

People are notoriously bad at generating random passwords. I hope this article will scare you into being a little more careful in how you select your next PIN number.

Are you curious about what the least commonly used PIN number might be?

How about the most popular?

Read on …

DISCLAIMER

This article is not intended to be a hacker bible, or to be used as a utility, resource, or tool to help would-be thieves perform nefarious actions. I will only disclose data sufficient to make my points, and will try to avoid giving specific data outside of the obvious examples. I do not want to be an enabler for script-kiddies. Please do not email me asking for the database I used; if you do, you will be wasting your time as I’m not going to respond. I’m not going to sell, donate or release the source data – don’t ask!

Source

Obviously, I don’t have access to a credit card PIN number database. Instead I’m going to use a proxy. I’m going to use data condensed from released/exposed/discovered password tables and security breaches.

Soap Box – Password Database Exposures

Over the years, there have been numerous password table security breaches: Some very high profile, some low profile, but all embarrassing (and many exceedingly expensive; both in direct fines and indirect loss of business through erosion of trust and reputation).

Fool me once, well, no, even that’s not really acceptable, but fool me twice … I’ll go even further: Any developer who stores the password table of their database in clear text should be so mortified by this lack of security that they should not be sleeping at night until they fix it. Ignoring the fact that you should never have ever coded it this way, you have an obligation to learn from these past breaches.

If you work for a company and know that your customer database is “protected” by such lightweight security then run, don’t walk, to your CEO/President’s office, pound on the door and insist (s)he puts out a mandate to fix the matter with extreme prejudice. Don’t leave until you get an affirmative response. Badger, badger then badger them again. Make yourself a proverbial thorn in their side.

I’m not trying to sell my services as a consultant here (though if you are interested, my rates are very reasonable compared to the cost of legal defense, potential FTC sanctions, class action suits, shareholder backlash, fines, loss of reputation and business …) There are plenty of security experts in the industry who can help you (if you need help filtering them and don’t have referrals, someone who has CISSP qualifications is a good place to start).

Bottom line: Security strengthens with layers, and the simple application of encryption on your database table can help protect your customers’ data if this table is exposed. It does not defend against all possible attacks, but it does nothing but good things. What possible reason is there to store things in clear-text?
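As a minimal sketch of what "not clear-text" can look like: the article speaks of encryption, while the example below swaps in a different, commonly used technique, salted key derivation with PBKDF2 from Python's standard library. The function names and the iteration count are assumptions made for illustration, not anything taken from the breached systems or from the author.

```python
import hashlib
import hmac
import secrets

# Assumed parameters for illustration only; choose values suited to your hardware.
ITERATIONS = 200_000
SALT_BYTES = 16


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_key) to store in place of the clear-text password."""
    salt = secrets.token_bytes(SALT_BYTES)  # fresh random salt per password
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key


def verify_password(password: str, salt: bytes, stored_key: bytes) -> bool:
    """Re-derive the key from the candidate password and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, stored_key)


salt, key = hash_password("1234")
print(verify_password("1234", salt, key))  # correct password
print(verify_password("1111", salt, key))  # wrong password
```

Note that this is only one layer: a four-digit password has just 10,000 candidates, so even a slow derivation can be brute-forced quickly if the table leaks.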

Back to the data

By combining the exposed password databases I’ve encountered, and filtering the results to just those rows that are exactly four digits long (characters 0-9 only), I obtained a database of all the four-digit combinations that people have used as their account passwords.

Given that users have a free choice for their password, if users select a four digit password to their online account, it’s not a stretch to use this as a proxy for four digit PIN codes.
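The filtering step described above might be sketched as follows. This is not the author's actual tooling; the sample rows are made up, and `four_digit_counts` is a hypothetical helper name.

```python
import re
from collections import Counter

# [0-9] is used instead of \d so that only ASCII digits are accepted.
FOUR_DIGITS = re.compile(r"[0-9]{4}")


def four_digit_counts(passwords):
    """Count only the passwords that are exactly four ASCII digits."""
    return Counter(p for p in passwords if FOUR_DIGITS.fullmatch(p))


# Made-up sample rows for illustration, not taken from any real breach.
sample = ["1234", "hunter2", "0000", "1234", "12345", "12a4", "2580", " 1234"]

counts = four_digit_counts(sample)
total = sum(counts.values())

# Print each surviving code with its count and its share of the filtered rows.
for pin, n in counts.most_common():
    print(f"{pin}  {n}  {100 * n / total:.1f}%")
```

Only the rows `1234` (twice), `0000` and `2580` survive the filter; everything longer, shorter, or containing non-digits is dropped.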

The Data

I was able to find almost 3.4 million four digit passwords. Every single one of the 10,000 combinations of digits from 0000 through to 9999 was represented in the dataset.

The most popular password is  1234 

… it’s staggering how popular this password appears to be. Utterly staggering at the lack of imagination …

… nearly 11% of the 3.4 million passwords are  1234  !!!

The next most popular 4-digit PIN in use is  1111  with over 6% of passwords being this.

In third place is  0000  with almost 2%.

A table of the top 20 found passwords is shown below. A staggering 26.83% of all passwords could be guessed by attempting these 20 combinations!

(Statistically, with 10,000 possible combinations, if passwords were uniformly randomly distributed, we would expect these twenty passwords to account for just 0.2% of the total, not the 26.83% encountered)

Looking more closely at the top few records, all the usual suspects are present  1111   2222   3333  9999  as well as  1212  and (snigger)  6969 .

It’s not a surprise to see patterns like  1122  and  1313  occurring high up in the list, nor  4321  or  1010 .

 2001  makes an appearance at #19.  1984  follows not far behind in position #26, and James Bond fans may be interested to know  0007  is found between the two of them in position #23 (another variant  0070  follows not much further behind at #28).

Rank    PIN     Freq
#1      1234    10.713%
#2      1111    6.016%
#3      0000    1.881%
#4      1212    1.197%
#5      7777    0.745%
#6      1004    0.616%
#7      2000    0.613%
#8      4444    0.526%
#9      2222    0.516%
#10     6969    0.512%
#11     9999    0.451%
#12     3333    0.419%
#13     5555    0.395%
#14     6666    0.391%
#15     1122    0.366%
#16     1313    0.304%
#17     8888    0.303%
#18     4321    0.293%
#19     2001    0.290%
#20     1010    0.285%
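The 26.83% and 0.2% figures quoted earlier can be checked directly from the top-20 table. The snippet below is just that arithmetic, with the frequencies copied from the table; the variable names are my own.

```python
# Frequencies (in percent) of the top 20 PINs, copied from the table above.
TOP_20 = {
    "1234": 10.713, "1111": 6.016, "0000": 1.881, "1212": 1.197, "7777": 0.745,
    "1004": 0.616, "2000": 0.613, "4444": 0.526, "2222": 0.516, "6969": 0.512,
    "9999": 0.451, "3333": 0.419, "5555": 0.395, "6666": 0.391, "1122": 0.366,
    "1313": 0.304, "8888": 0.303, "4321": 0.293, "2001": 0.290, "1010": 0.285,
}

observed = sum(TOP_20.values())
# Under a uniform distribution, 20 of 10,000 equally likely codes.
expected_uniform = 100 * len(TOP_20) / 10_000

print(f"observed share of top 20:  {observed:.2f}%")
print(f"share if uniformly random: {expected_uniform:.1f}%")
print(f"over-representation:       about {observed / expected_uniform:.0f}x")
```

The twenty listed frequencies sum to 26.83%, roughly 134 times the 0.2% that a uniform distribution would give.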

The first “puzzling” password I encountered was  2580  in position #22. What is the significance of these digits? Why should so many people select this code to make it appear so high up the list?

Then I realized that 2580 is straight down the middle of a telephone keypad!

(Interestingly, this is very compelling evidence supporting the hypothesis that a 4-digit password list is a great proxy for a PIN number database. If you look at the numeric keypad on a PC keyboard you’ll see that 2580 is slightly more awkward to type on the PC than on a phone, because the order of the keys on a keyboard is inverted. Cash machines and other terminals that take credit cards use phone-style numeric pads. It appears that many people have an easy to type/remember PIN number for their credit card and are re-using the same four digits for their online passwords, where the "straight down the middle" mnemonic no longer applies).
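To make the keypad argument concrete, here is a small sketch that traces 2580 on both layouts. The grids are simplified assumptions (on most real PC keypads the 0 key is wider and sits at the bottom left), and the helper names are hypothetical.

```python
# Each string is one row of keys, top row first; spaces are empty positions.
PHONE_PAD = ["123", "456", "789", " 0 "]
# Simplified PC numeric keypad; the wide 0 key is placed at the bottom left.
PC_NUMPAD = ["789", "456", "123", "0  "]


def key_path(layout, code):
    """Return the (row, column) position of each digit of code on the layout."""
    pos = {
        ch: (r, c)
        for r, row in enumerate(layout)
        for c, ch in enumerate(row)
        if ch.isdigit()
    }
    return [pos[d] for d in code]


def is_straight_down(path):
    """True if every key is in the same column, one row below the previous key."""
    return all(
        c2 == c1 and r2 == r1 + 1
        for (r1, c1), (r2, c2) in zip(path, path[1:])
    )


print(is_straight_down(key_path(PHONE_PAD, "2580")))  # phone-style pad
print(is_straight_down(key_path(PC_NUMPAD, "2580")))  # PC numeric keypad
```

On the phone-style pad the four keys form one unbroken vertical line; on the PC layout the path runs upward through 2, 5, 8 and then jumps to the bottom row for the 0.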

(Another fascinating piece of trivia is that people seem to prefer even numbers over odd, and codes like 2468 occur higher than an odd-number equivalent, such as 1357).

Cumulative Frequency

As noted above, the more popular password selections dominate the frequency tables. The most popular PIN code of  1234  is more popular than the lowest 4,200 codes combined!

That's right, you might be able to crack over 10% of all codes with one guess! Expanding this, you could get 20% by using just five numbers!

Below is a cumulative frequency graph:

Statistically, one third of all codes can be guessed by trying just 61 distinct combinations!

The 50% cumulative chance threshold is passed at just 426 codes (far less than the 5,000 that a uniformly random distribution would predict). Paranoid yet?
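The cumulative figures come from sorting the codes by popularity and keeping a running total. A rough sketch of that calculation follows; it uses only the top-20 frequencies published in this article, so it can confirm the one-guess and five-guess claims but not the 61-code or 426-code thresholds, which need the full dataset. The function names are illustrative.

```python
from itertools import accumulate
from math import ceil

# Top-20 frequencies in percent, in rank order, as published in this article.
TOP_20_FREQ = [
    10.713, 6.016, 1.881, 1.197, 0.745, 0.616, 0.613, 0.526, 0.516, 0.512,
    0.451, 0.419, 0.395, 0.391, 0.366, 0.304, 0.303, 0.293, 0.290, 0.285,
]


def guesses_needed(freqs_desc, target_pct):
    """Fewest guesses, most popular code first, whose combined share reaches
    target_pct. Returns None if the supplied frequencies never reach it."""
    for n, covered in enumerate(accumulate(freqs_desc), start=1):
        if covered >= target_pct:
            return n
    return None


def uniform_guesses_needed(target_pct, n_codes=10_000):
    """Guesses needed for the same coverage if every code were equally likely."""
    return ceil(n_codes * target_pct / 100)


for target in (10, 20, 50):
    print(target, guesses_needed(TOP_20_FREQ, target), uniform_guesses_needed(target))
```

With this data, 10% coverage takes 1 guess and 20% takes 5, against 1,000 and 2,000 guesses under a uniform distribution. The 50% target returns `None` here because the top 20 alone only cover 26.83%.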

Bottom of the pile

OK, we've investigated the most frequently used PINs and found they tend to be predictable and easy to remember. Let's turn for a second to the bottom of the pile.

What are the least "interesting" (least used) PINS?

In my dataset the answer is 8068, with just 25 occurrences in 3.4 million (this equates to 0.000744%, far, far fewer than a uniformly random distribution would predict, and more than four orders of magnitude behind the most popular choice).
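The numbers for 8068 can be put in context with a little arithmetic. The inputs below are the figures quoted in this article; the derived values (implied dataset size, uniform expectation, ratio to 1234) are my own calculations, not data from the source.

```python
from math import log10

# Figures quoted in the article.
occurrences_8068 = 25
share_8068_pct = 0.000744
share_1234_pct = 10.713
n_codes = 10_000

# Dataset size implied by 25 occurrences being 0.000744% of the total.
implied_total = occurrences_8068 / (share_8068_pct / 100)

# What a uniform distribution would give every code.
uniform_share_pct = 100 / n_codes
uniform_count = implied_total / n_codes

# How far the least popular code sits behind the most popular one.
ratio = share_1234_pct / share_8068_pct

print(f"implied dataset size:       {implied_total:,.0f}")
print(f"uniform expectation:        {uniform_share_pct}% (about {uniform_count:.0f} occurrences)")
print(f"1234 vs 8068:               about {ratio:,.0f} to 1")
print(f"log10 of that ratio:        {log10(ratio):.2f}")
```

The implied dataset size of roughly 3.36 million is consistent with "almost 3.4 million", a uniform spread would give each code about 336 occurrences rather than 25, and 1234 outnumbers 8068 by roughly 14,400 to 1.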

Listed below are the least popular 4-digit passwords encountered (positions #9980 through #10000).

Warning: Now that we’ve learned that, historically, 8068 is (was?) the least commonly used 4-digit PIN, please don’t go out and change yours to this! Hackers can read too! They will also be promoting 8068 up their attempt trees in order to catch people who read this (or similar) articles.

Read up on the Nash Equilibrium.

Rank    PIN     Freq
#9980   8557    0.001191%
#9981   9047    0.001161%
#9982   8438    0.001161%
#9983   0439    0.001161%
#9984   9539    0.001161%
#9985   8196    0.001131%
#9986   7063    0.001131%
#9987   6093    0.001131%
#9988   6827    0.001101%
#9989   7394    0.001101%
#9990   0859    0.001072%
#9991   8957    0.001042%
#9992   9480    0.001042%
#9993   6793    0.001012%
#9994   8398    0.000982%
#9995   0738    0.000982%
#9996   7637    0.000953%
#9997   6835    0.000953%
#9998   9629    0.000953%
#9999   8093    0.000893%
#10000  8068    0.000744%

Memorable Years

Many of the high frequency PIN numbers can be interpreted as years, e.g.  1967   1956   1937  … It appears that many people use a year of birth (or possibly an anniversary) as their PIN. This will certainly help them remember their code, but it greatly increases its predictability.

Just look at the stats: Every single  19??  combination can be found in the top fifth of the dataset!

Below is a plot of this in graphical format. In this chart, each yellow line represents a PIN number that starts  19?? 

If all the passwords were uniformly distributed, there should be no significant difference between the frequency of occurrence of, for instance,  1972  and any other PIN ending in seventy two  ??72 . However, as we shall see, this is not the case at all.

1972 occurs in ordinal position #76 (with a frequency of 0.099363%). Here’s a histogram of the frequencies of all ??72 PINs.

You can clearly see the spike at  1972  (with smaller spikes at  7272  and  1472 )

If you calculate the ratio of the peak of 1972 to the average of all the other ??72 PINs, you get a ratio of 22:1
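The ratio described above can be sketched as a small function. The frequency table below is entirely made up to exercise the code (the real per-PIN data is not published here), and `prefix_ratio` is a hypothetical name.

```python
def prefix_ratio(freq, prefix, suffix):
    """Ratio of the frequency of prefix+suffix to the mean frequency of the
    other 99 PINs ending in the same two digits. Missing PINs count as 0."""
    target = prefix + suffix
    others = [f"{i:02d}{suffix}" for i in range(100) if f"{i:02d}" != prefix]
    mean_others = sum(freq.get(p, 0.0) for p in others) / len(others)
    if mean_others == 0:
        raise ValueError("no occurrences of any other PIN with this suffix")
    return freq.get(target, 0.0) / mean_others


# Made-up frequencies (percent): a flat background for every ??72 code,
# plus an artificial spike at 1972.
toy = {f"{i:02d}72": 0.004 for i in range(100)}
toy["1972"] = 0.1

print(f"{prefix_ratio(toy, '19', '72'):.1f}")  # spike relative to the background
```

With this toy data the spike is about 25 times the background. Running the same function for every suffix from 00 to 99 with the prefix fixed at 19 would give the values for a 19XX versus ??XX chart.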

PINs starting with 19?? are much more likely to occur. Of course, it’s not just 1972. Here is a plot of the ratio of 19 to non-19 for all hundred combinations. Along the x-axis are all the combinations of the last two digits XX, and for each of these the ratio of the 19XX frequency to the average of all the other ??XX frequencies has been calculated. Here’s the chart:

It's a pretty good approximation of a demographic chart (as suggested by the red dashed trend line), which would probably allow a fair estimation of the ages (years of birth) of the people using the various websites. (Of course, hackers can invert this strategy and use the age of a target to help guess that user's PIN. Looking at this graph, this might give them up to a 40x advantage!)

Just about all the ratios are above 1.0. The notable exceptions are ??34 and ??00 (which are easy to explain, since the massive popularity of 1234 and 0000 dwarfs 1934 and 1900 respectively). Similarly, 33, 44, 55, 66 … are lower than expected, as the quad codes like 3333 mask out even the 1933 boost.

There are also spikes in the graph corresponding to the popular PINS of  1919   1984  and  1999 

Patterns in data