Python: What’s the largest MD5 hash you can find?

Doorknob's blog post "What’s the largest MD5 hash you can find?" inspired me to get my hands dirty and learn some Python. After playing around with the code from that post, I decided to rewrite the whole thing and make it more flexible, mainly for practice.

This is what the code does:

  • It generates a random 8-character string (called the alphabet), or uses a static one if set, and computes all of its unique permutations

  • It keeps the permutation with the largest MD5 hash (the hex digest is interpreted as an integer, so hashes can be compared as numbers)

  • Unless a static alphabet is set, this is repeated DEFAULT_CYCLES times. At the end, the winning string is printed.
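
To make the steps above concrete, here is a minimal sketch on a tiny fixed alphabet. The helper name `md5_as_int` and the sample string `'abc'` are only illustrative assumptions; they are not part of the program further down.

```python
from hashlib import md5
from itertools import permutations


def md5_as_int(text):
    # An MD5 digest is 128 bits; reading its hex form in base 16 gives an integer.
    return int(md5(text.encode('utf-8')).hexdigest(), 16)


# All orderings of a small sample alphabet (hypothetical input, 3! = 6 strings).
candidates = {''.join(p) for p in permutations('abc')}

# Keep the candidate whose hash is numerically largest.
winner = max(candidates, key=md5_as_int)
print(winner, hex(md5_as_int(winner)))
```

Because the hashes are plain integers, `max` with a `key` function is enough to pick the winner in this small case.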

It ended up with many functions, and I'm not sure this is the best way to go. This is quite a simple task, yet it takes almost 50 lines of code. However, I'm afraid it will become less readable if I try to "compress" it or lump some of the functions together.

So, how could this code be improved and/or shortened? This is the first time I'm writing Python, so please be gentle. :)

from itertools import permutations
from hashlib import md5
from random import choice
from string import ascii_uppercase, ascii_lowercase, digits

# Number of alphabets to test. Ignored if using a static alphabet.
DEFAULT_CYCLES = 256

# Length of 8 recommended. Set to empty string for random values.
STATIC_ALPHABET = ''

def get_alphabet():
    return STATIC_ALPHABET if STATIC_ALPHABET != '' else ''.join(choice(ascii_uppercase + ascii_lowercase + digits) for _ in range(8))

def number_of_cycles():
    return DEFAULT_CYCLES if STATIC_ALPHABET == '' else 1

def numeric_md5(input_string):
    return int(md5(input_string.encode('utf-8')).hexdigest(), 16)

def unique_permutations(plain_string):
    return set([''.join(p) for p in permutations(plain_string)])

def string_with_largest_hash(plain_strings, current_leader):
    largest_hash = 0
    leading_plaintext = ''
    for current_string in plain_strings:
        current_hash = numeric_md5(current_string)
        if current_hash > largest_hash:
            largest_hash = current_hash
            leading_plaintext = current_string
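    # An empty leader means "no leader yet"; without this check MD5('') would
    # compete as a real candidate and could win against a short static alphabet.
    old_leader_hash = numeric_md5(current_leader) if current_leader else -1
    return leading_plaintext if largest_hash > old_leader_hash else current_leader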

def find_largest_md5():
    cycles = number_of_cycles()
    print('Running {} cycles, alphabet: {}'.format(cycles, (STATIC_ALPHABET + ' (static)' if STATIC_ALPHABET != '' else 'random')))
    
    leading_string = ''
    for i in range(cycles):
        current_alphabet = get_alphabet()
        jumbled_strings = unique_permutations(current_alphabet)
        leading_string = string_with_largest_hash(jumbled_strings, leading_string)
    print('Largest MD5: {} from plaintext {}'.format(hex(numeric_md5(leading_string)), leading_string))

if __name__ == '__main__':
    find_largest_md5()
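
For a sense of scale, the following rough sketch estimates how many hashes a default run computes. The names `ALPHABET_LENGTH` and `CYCLES` are assumptions made for this illustration (the length 8 is hard-coded in `get_alphabet`, and 256 mirrors `DEFAULT_CYCLES`). The figure is an upper bound: a random alphabet with repeated characters yields fewer unique permutations.

```python
from math import factorial

ALPHABET_LENGTH = 8   # length of each random alphabet (assumed from get_alphabet)
CYCLES = 256          # mirrors DEFAULT_CYCLES

# A string of n distinct characters has n! permutations.
per_cycle = factorial(ALPHABET_LENGTH)
total = per_cycle * CYCLES
print(per_cycle, total)
```

That works out to at most 40,320 hashes per cycle and about 10.3 million for a full default run.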