~kravietz/django-fast-common-passwords-validator

A faster drop-in replacement for Django built-in CommonPasswordValidator


clone

read-only
https://git.sr.ht/~kravietz/django-fast-common-passwords-validator
read/write
git@git.sr.ht:~kravietz/django-fast-common-passwords-validator


#FastCommonPasswordValidator

A faster drop-in replacement for Django's built-in CommonPasswordValidator. With the default password list it gives a 4x lookup speedup and 30% memory savings, and the gains should be even larger with bigger password lists.

It validates whether a password appears in a list of common passwords. By default it uses a built-in list of 20k common passwords (lowercased and deduplicated) by Royce Williams. If called with a file name, it loads passwords from that file, one per line, and uses them for subsequent checks.
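As a rough sketch, enabling the validator in a Django project could look like the fragment below. `AUTH_PASSWORD_VALIDATORS` with `NAME` and `OPTIONS` is standard Django configuration. The dotted import path, the `password_list_path` option name (borrowed from Django's own CommonPasswordValidator) and the file path are assumptions made for illustration, so check the package source for the real names.

```python
# settings.py (fragment)
AUTH_PASSWORD_VALIDATORS = [
    {
        # Hypothetical dotted path: the real module name may differ.
        "NAME": "fast_password_validation.FastCommonPasswordValidator",
        # Optional. The option name is an assumption that mirrors Django's
        # built-in CommonPasswordValidator; omit OPTIONS to use the default list.
        "OPTIONS": {"password_list_path": "/path/to/your/password-list"},
    },
]
```

Leaving out `OPTIONS` should fall back to the built-in 20k list described above.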

#Why?

The original class loads a static list of 20k passwords into memory and scans through it each time it's called, which is far from optimal. From the Django maintainers' point of view it has one advantage: it does not require any extra dependencies. That was the main reason the class was included in the default Django distribution, while this one wasn't and is available as an extra module.

#Compiling your own password list

Initialize a new Bloom filter from your data:

from bloom_filter import BloomFilter
import pathlib

approx_number_of_lines = 20_000 # or whatever your file has
bloom = BloomFilter(max_elements=approx_number_of_lines, error_rate=0.001)

with pathlib.Path('mypasswords.txt').open() as f:
    for line in f:
        line = line.strip()
        # skip blank lines and comment lines
        if line and not line.startswith('#'):
            bloom.add(line)

# test if it works
'password77' in bloom # should be True if it is in your list
'PLWmV6Zh3viv' in bloom # should be False (but see the section on false positives below)

Then dump it to a file using the pickle module:

import pickle
# pickle writes bytes, so open the file in binary write mode
with open('myawesomepasswords.dat', 'wb') as f:
    pickle.dump(bloom, f)

#False positives

A Bloom filter is a probabilistic data structure. By default the filter is configured for a 0.001 (0.1%) error rate, which means that out of every 1000 passwords that are not in the original list, it will on average falsely report 1 as "common". It never misses a password that is in the list. In practical applications it's not really a hill to die on, and it might actually bump your users' respect for your prophetic skills.
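To get a feel for that error rate, here is a minimal, self-contained sketch of a Bloom filter that measures false positives empirically. `ToyBloomFilter` and `measure_false_positive_rate` are hypothetical names written for this illustration. This is not the implementation used by the bloom_filter package, and it uses the textbook sizing formulas and SHA-256 based double hashing as assumptions.

```python
import hashlib
import math


class ToyBloomFilter:
    """Illustrative Bloom filter (not the bloom_filter package's code)."""

    def __init__(self, max_elements, error_rate):
        # Textbook sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        self.num_bits = math.ceil(
            -max_elements * math.log(error_rate) / math.log(2) ** 2
        )
        self.num_hashes = max(1, round(self.num_bits / max_elements * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item):
        # Double hashing: derive k bit positions from two halves of one digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


def measure_false_positive_rate(n_listed=20_000, n_probes=50_000, error_rate=0.001):
    bloom = ToyBloomFilter(max_elements=n_listed, error_rate=error_rate)
    for i in range(n_listed):
        bloom.add(f"listed-{i}")
    # Every added item must be found: a Bloom filter has no false negatives.
    assert all(f"listed-{i}" in bloom for i in range(n_listed))
    # Probe strings that were never added; every hit is a false positive.
    hits = sum(f"unlisted-{i}" in bloom for i in range(n_probes))
    return hits / n_probes


if __name__ == "__main__":
    print(f"observed false positive rate: {measure_false_positive_rate():.4%}")
```

The observed rate varies from run to run with the probe strings chosen, but it should land in the neighbourhood of the configured error rate rather than exactly on it.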