zxcvbn-hs

Password strength estimation based on zxcvbn.

LTS Haskell 22.44:	0.3.6
Stackage Nightly 2023-12-26:	0.3.6
Latest on Hackage:	0.3.6


MIT licensed and maintained by Peter Jones

This version can be pinned in stack with: zxcvbn-hs-0.3.6@sha256:1efd047082e3b7c6f9ca994814ef712080b6050adf9235bfc81e51570d744ddc,4887

Module documentation for 0.3.6

Text
- Text.Password
  - Text.Password.Strength



Password Strength Estimation

What?

This is a native Haskell implementation of the zxcvbn password strength estimation algorithm as it appears in the 2016 USENIX Security paper and presentation (with some small modifications).

Why?

The zxcvbn algorithm is a major improvement over traditional password strength estimators. Instead of counting the occurrence of special characters, mixed case characters, numeric digits, etc., zxcvbn analyzes a plain text password and estimates the number of guesses that an attacker would need to make in order to crack it.

How?

A plain text password is broken into a list of substrings called tokens and each token is analyzed as follows:

Checking membership in a password or word frequency dictionary
Reversing the token and testing it against said dictionaries
Decoding l33t speak and testing the result against said dictionaries
Determining whether the token forms a pattern on a keyboard (e.g., “asdfgt”, “poiuy”, “aSw2@”)
Comparing the code points of the characters to see whether they form a sequence (e.g., “13579”, “abcde”, “zyx”)
Attempting to parse the token as a date with or without separators (e.g., “1013”, “2011-01-01”, “23/01/19”, “012319”)
Searching for adjacent tokens that are identical (i.e., repeating patterns)
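
To make a few of these checks concrete, here is a minimal sketch of three of them (reversed lookup, l33t decoding, and code point sequences) using only base. It is an illustration, not the library's implementation: the names sampleDictionary, inDictionary, matchesReversed, leetCandidates, decodeLeet, matchesLeet, and sequenceDelta are hypothetical, the dictionary is a tiny stand-in, and the l33t table is a partial assumption.

```haskell
import Data.Char (ord, toLower)

-- | Tiny stand-in dictionary (assumption); real frequency lists are far larger.
sampleDictionary :: [String]
sampleDictionary = ["password", "dragon", "monkey"]

-- | Case-insensitive membership test against the stand-in dictionary.
inDictionary :: String -> Bool
inDictionary token = map toLower token `elem` sampleDictionary

-- | Reverse the token and test it against the dictionary.
matchesReversed :: String -> Bool
matchesReversed = inDictionary . reverse

-- | Plain letters that one character might stand for (hypothetical, partial table).
leetCandidates :: Char -> [Char]
leetCandidates c = case c of
  '4' -> "a"
  '@' -> "a"
  '3' -> "e"
  '0' -> "o"
  '1' -> "il"
  '$' -> "s"
  '5' -> "s"
  '7' -> "t"
  _   -> [c]

-- | Every possible decoding of a token; the list monad enumerates all
-- combinations of per-character candidates.
decodeLeet :: String -> [String]
decodeLeet = mapM leetCandidates

-- | True when at least one decoding is a dictionary word.
matchesLeet :: String -> Bool
matchesLeet = any inDictionary . decodeLeet

-- | The constant, non-zero step between code points when the token is a
-- sequence of at least three characters, otherwise Nothing.
sequenceDelta :: String -> Maybe Int
sequenceDelta token =
  case zipWith (-) (drop 1 points) points of
    (d : ds) | d /= 0, length token >= 3, all (== d) ds -> Just d
    _ -> Nothing
  where
    points = map ord token
```

With these definitions, sequenceDelta "13579" evaluates to Just 2 and sequenceDelta "zyx" to Just (-1), while matchesLeet "p4$$w0rd" and matchesReversed "drowssap" are both True.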

Each possible interpretation of a token is given an estimated number of guesses and then the entire password is scored based on the weakest path, that is, the combination of interpretations that needs the fewest guesses.
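
The idea of a weakest path can be sketched as a small dynamic program. This is a deliberately simplified model and not the library's scoring function: it minimizes the plain product of per-token guesses, whereas the paper's objective contains additional terms. The Match type, weakestPath, and the brute-force cost of 10 guesses per unexplained character are all assumptions made for illustration.

```haskell
-- | One interpretation of the substring [matchStart, matchEnd) of a password.
-- Hypothetical type for this sketch only.
data Match = Match
  { matchStart   :: Int      -- inclusive start index
  , matchEnd     :: Int      -- exclusive end index
  , matchGuesses :: Integer  -- estimated guesses for this interpretation
  } deriving (Show, Eq)

-- | Assumed cost of guessing a single character that no matcher explains.
bruteForcePerChar :: Integer
bruteForcePerChar = 10

-- | Fewest total guesses needed to cover a password of the given length,
-- choosing for every prefix the cheapest way to reach its end.
weakestPath :: Int -> [Match] -> Integer
weakestPath n matches = last table
  where
    -- table !! k holds the cheapest cost of covering the first k characters.
    -- The list is lazy, so each entry may refer to earlier entries.
    table = map best [0 .. n]

    best 0 = 1
    best k = minimum (viaBruteForce : viaMatches)
      where
        viaBruteForce = (table !! (k - 1)) * bruteForcePerChar
        viaMatches =
          [ (table !! matchStart m) * matchGuesses m
          | m <- matches
          , matchEnd m == k
          , matchStart m >= 0
          , matchStart m < k
          ]
```

For a ten-character password, weakestPath 10 [Match 0 8 2, Match 8 10 5, Match 0 10 500] picks the two cheap tokens (2 × 5 = 10 guesses) over the single expensive interpretation (500 guesses).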

Usage

A complete example can be found in the example/Main.hs file. That said, it’s pretty easy to use:

{-# LANGUAGE OverloadedStrings #-}

import Text.Password.Strength (score, strength, en_US)
import Data.Time.Clock (getCurrentTime, utctDay)

main :: IO ()
main = do
  -- The date matcher needs to know the current year.
  refDay <- utctDay <$> getCurrentTime

  -- OverloadedStrings lets this literal be used as a Text value.
  let password = "password1234567"
      guesses  = score en_US refDay password

  print guesses -- Number of estimated guesses (18)
  print (strength guesses) -- Sum type describing the password strength (Risky)

Demo App

If you want to play with an interactive demo, take a look at the zxcvbn-ws repository.

Customization

You’ll most likely want to add custom words to the frequency dictionaries: for example, the name of your application, your domain name, and any personal information you have on the customer. Doing so lowers the score of any password that uses such information.

The Text.Password.Strength.Config module defines the addCustomFrequencyList function which can be used to easily add words to the frequency dictionary.
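
Why custom words lower a score can be shown with a small model, assuming that a dictionary match costs roughly as many guesses as the word's rank in its frequency list. This sketch uses only base and does not call the library: FrequencyList, addCustomWords, and bestRank are hypothetical names, and the real addCustomFrequencyList may differ in its types and behaviour.

```haskell
import Data.Char (toLower)
import Data.List (elemIndex)

-- | Words ordered from most to least common (hypothetical representation).
type FrequencyList = [String]

-- | Add a list of custom words as its own frequency list. Because the list
-- is short, every custom word gets a low rank and is cheap to guess.
addCustomWords :: [String] -> [FrequencyList] -> [FrequencyList]
addCustomWords ws lists = map (map toLower) ws : lists

-- | Best (lowest) 1-based rank of a token across all lists, if it is listed.
bestRank :: [FrequencyList] -> String -> Maybe Int
bestRank lists token =
  case [ r + 1 | list <- lists, Just r <- [elemIndex needle list] ] of
    [] -> Nothing
    rs -> Just (minimum rs)
  where
    needle = map toLower token
```

In this model a token such as "AcmeCorp" has no rank in the stock lists, so it would only fall to the more expensive matchers. After addCustomWords ["AcmeCorp"] it has rank 1, which makes any password built around it much cheaper to guess.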

Localization

Unlike other implementations of the zxcvbn algorithm, this version fully supports localization. It’s easy to augment or completely replace the frequency dictionaries and keyboard layouts. Tools are provided to compile simple text files into the data types required by this library.

However, like the other implementations, the default configuration is heavily biased towards United States English, hence its name: en_US.

Included in the default configuration are:

30,000 most frequently used passwords according to Mark Burnett
30,000 most frequently used words in US movies and television shows
30,000 most frequently used words in Wikipedia English articles
Top 10,000 surnames
Top 4,275 female names
Top 1,219 male names
QWERTY keyboard layout
Number pad keyboard layout

Existing Localization Packages

zxcvbn-dvorak: Dvorak keyboard layout

Performance

It takes approximately 1.5 ms to process a 30-character password. Performance degrades as the length of the password increases (e.g., a 60-character password clocks in at 13.54 ms).

You probably want to limit the number of characters you send through the score function using something like Text.take 100 in order to prevent a malicious user from slowing down your application.
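
A minimal sketch of such a limit, assuming the 100-character cutoff mentioned above is acceptable for your application. The names maxPasswordLength and truncatePassword are hypothetical helpers and are not part of the library.

```haskell
import qualified Data.Text as T

-- | Upper bound on how many characters are analyzed (assumed cutoff).
maxPasswordLength :: Int
maxPasswordLength = 100

-- | Keep only the leading characters so that analysis time stays bounded
-- no matter how long the submitted input is.
truncatePassword :: T.Text -> T.Text
truncatePassword = T.take maxPasswordLength
```

The truncated value would then be passed to score in place of the raw input. Only the copy used for strength estimation should be shortened; the password that gets hashed and stored should remain the full input.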

Most of the time is currently spent decoding and testing l33t speak. If you want to work on improving the performance, I suggest you generate a profile using the benchmark tool.

Changes

Revision History

0.3.6 (September 11, 2023)

Bump tasty bounds

0.3.4 (August 13, 2023)

Bump optparse-applicative (#20)

0.3.3 (August 2, 2023)

Bump #17: aeson >=1.3 && <2.2 (latest: 2.2.0.0); hedgehog >=0.6 && <1.3 (latest: 1.3)

0.3.2 (June 22, 2023)

Bump version numbers #14
Fix minor hlint error
Add ghc 9.6 support

0.3.1 (May 26, 2022)

Bump version numbers

0.3.0.0 (October 29, 2020)

Added a ToJSON instance to the Score type.
Minor releases:
- Version 0.3.1 (May 26, 2022): Update dependency bounds

0.2.1.0 (September 25, 2019)

Export the entire HasConfig class so external code can use the config lens.

0.2.0.0 (September 12, 2019)

Make it possible for external projects to use the code generation tool (zxcvbn-tools)
Due to the binary-orphans -> binary-instances rename this package now only compiles under nixpkgs-unstable.

0.1.0.0 (September 10, 2019)

Initial (unreleased) version