philnash / pwned

😱 An easy, Ruby way to use the Pwned Passwords API.

Home Page: https://rubygems.org/gems/pwned/

Suggestion/Request First Class Support for continuous checking of passwords once they were sent.

terracatta opened this issue · comments

This gem is fantastic.

One use-case I am interested in that doesn't seem to be supported out of the box is saving the necessary hash prefix/suffix in the Rails DB so we can periodically check the pwned status of a user's password after it has been set (without storing enough).

Any first-class support for this use-case would be great.

I'm not sure that's a great idea. Storing the hash would mean storing an insecure sha1 hash of the password, even if you are also storing a strong bcrypt (or other good hashing algorithm) version. The Pwned Passwords API uses sha1 hashes because it doesn't matter if they are brute forced, there's a full database of plain text versions behind the API which you can download anyway. If you store unsalted sha1 hashes of your users' passwords you're asking for trouble!

One suggestion is to check users' passwords when they log in and you have the plain text password to hand. That way you can also take them through a reset password flow as they log in if their password has been pwned.

If you store unsalted sha1 hashes of your users' passwords you're asking for trouble!

I had a poor understanding of the k-Anonymity model before I wrote this issue, for some reason I didn't think we had to store the full sha1 hash. Now that I understand it, I completely agree with your assessment.
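For readers following along, here is a minimal sketch of the k-Anonymity range lookup, written against the public `https://api.pwnedpasswords.com/range/` endpoint with only the standard library. The method name `pwned_count` and the injectable `fetcher` parameter are choices made for this sketch, not the gem's internals.

```ruby
require "digest"
require "net/http"
require "uri"

# Default fetcher: asks the range endpoint for every hash suffix that
# shares the given 5-character prefix.
RANGE_FETCHER = lambda do |prefix|
  Net::HTTP.get(URI("https://api.pwnedpasswords.com/range/#{prefix}"))
end

# Returns how many times the password appears in the data set (0 if absent).
# `fetcher` is injectable so the lookup can be exercised without the network.
def pwned_count(password, fetcher: RANGE_FETCHER)
  sha1 = Digest::SHA1.hexdigest(password).upcase
  prefix = sha1[0, 5]     # the only part sent to the API
  suffix = sha1[5..-1]    # compared locally against the response

  # Each response line has the form "SUFFIX:COUNT".
  fetcher.call(prefix).each_line do |line|
    candidate, count = line.strip.split(":")
    return count.to_i if candidate == suffix
  end
  0
end
```

The prefix alone cannot decide a match. The local comparison needs the suffix as well, and prefix plus suffix is the full unsalted SHA-1, which is why a later re-check would mean storing that hash.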

One suggestion is to check users' passwords when they log in and you have the plain text password to hand

Yes this is a much better idea, thanks!

Yeah, to be fair I had written, "sounds like a good idea, I'll try to find time for it" before I realised too 😅

Facebook does something like this, in a way that it believes is safe/secure:

Upon finding a collection of email addresses and passwords, the system uses an automated process to check them against the social network's user database. Facebook says that doesn't mean it has copies of people's passwords in plain text, though: it encrypts or hashes stolen passwords first before comparing them to similarly encrypted log-in details.

So does Google:

Any time Google hits a match, it notifies you that a specific set of credentials is public and unsafe and that you should probably change the password.

... Google is doing all of this by comparing your encrypted credentials with an encrypted list of compromised credentials

  1. Why couldn't/shouldn't other sites do the same?

  2. Would this be within scope of this project?

  3. And would this even be possible to accomplish using the haveibeenpwned API?

Unfortunately, it doesn't look like the haveibeenpwned API provides a good way to do this. The only way to look up passwords is by SHA-1 hash, which we can't safely be caching for users. We can search for all pastes for an email/account, but that's not enough either, because we'd want to detect when a user's password was compromised anywhere by anyone, not just when paired with the specific email address they happen to be using for "our" app.

I don't understand, then, why Google at least apparently only sends password matches if the username also matches?

Chrome first sends an encrypted, 3-byte hash of your username to Google, where it is compared to Google's list of compromised usernames. If there's a match, your local computer is sent a database of every potentially matching username and password in the bad credentials list,

And although there's a way to retrieve all breaches, I don't see a similar API for listing all pastes. But if we could get a list of new pastes, then for every paste, we could filter the set to just those where paste["DataClasses"].include?("Passwords").
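A rough sketch of that filter, applied to breach records since that is the list the API exposes. It assumes each record is a parsed JSON object with a `DataClasses` array, as in the HIBP breach model. The sample data and the helper name `breaches_with_passwords` are made up for illustration.

```ruby
require "json"

# Keeps only the records whose "DataClasses" list mentions passwords.
# Records without the field are treated as not containing passwords.
def breaches_with_passwords(breaches)
  breaches.select { |b| Array(b["DataClasses"]).include?("Passwords") }
end

# Made-up sample in the shape of the breach model, for illustration only.
sample = JSON.parse(<<~JSON)
  [
    {"Name": "ExampleOne", "DataClasses": ["Email addresses", "Passwords"]},
    {"Name": "ExampleTwo", "DataClasses": ["Email addresses"]}
  ]
JSON

puts breaches_with_passwords(sample).map { |b| b["Name"] }.inspect
# prints ["ExampleOne"]
```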

It seems like the only way to securely implement this feature would be to download the entire contents of each new paste, and then rehash all passwords from the paste with the same hashing algorithm used for user passwords (bcrypt?) so that they can be checked for matches against the actual hashed user passwords. Would that work?

Seems like this feature would require an offline dataset. We can download their entire password list. But:

  • It's big (11.1GB)
  • It's not updated regularly (last update was 2019-07-14), which is not the same as continually crawling pastes like Facebook does.

This idea should probably be explored in a separate project...

Looks like https://github.com/DanielHeath/has_unpublished_password uses an offline dataset (instead of HIBP API), but only a subset of it, which obviously wouldn't give you the ability to respond in a timely manner to new pastes like Facebook does.

https://www.csoonline.com/article/3266607/1-4b-stolen-passwords-are-free-for-the-taking-what-we-know-now.html

Download the billions of breached passwords and blacklist them all. Attackers have a copy; so should you.

Hey @TylerRick, good thoughts here. The idea is that even if you check a password hasn't been breached on sign up, it could still be breached in the future, making it insecure.

This gem helps with that because you can use it when a user signs in as well as when they sign up. That's the only other time the site should have the unhashed password. This is supported by the devise-pwned_password plugin, for example.
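As a sketch of what a sign-in check could look like outside of Devise: the `sign_in` method, its return symbols and the injectable `checker` are hypothetical names for this example. The only gem call used is `Pwned.pwned?`. Failing open when the API is unreachable is a design assumption here, not a recommendation from the thread.

```ruby
# Default check: delegates to the pwned gem. The gem is required lazily so
# that a different checker can be injected (e.g. in tests) without it loaded.
GEM_CHECKER = lambda do |password|
  require "pwned"
  Pwned.pwned?(password)
end

# Hypothetical sign-in step. `user` is anything that responds to
# `authenticate(password)`, as has_secure_password models do in Rails.
# Returns :invalid, :ok or :reset_required.
def sign_in(user, password, checker: GEM_CHECKER)
  return :invalid unless user.authenticate(password)

  begin
    checker.call(password) ? :reset_required : :ok
  rescue StandardError
    # Assumption: fail open, so an API outage does not lock users out.
    :ok
  end
end
```

A controller could route `:reset_required` into the reset password flow mentioned earlier in the thread, while the plain text password is still only in memory.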

has_unpublished_password appears to have the same goals as this gem, and can only check using the original plain text password. It differs in that it uses the top 11,000,000 passwords and a bloom filter to make it available without the API request. I believe GitHub, for example, actually downloaded the entire HIBP password set and made it available as an internal service to avoid the 3rd party API call too.
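To illustrate the bloom filter idea in general terms, here is a minimal sketch. It is not has_unpublished_password's implementation; the class, its parameters and the double-hashing scheme are assumptions chosen to keep the example short.

```ruby
require "digest"

# Minimal Bloom filter: never reports a false negative, but may report
# a false positive at a rate that depends on `bits` and `hashes`.
class BloomFilter
  def initialize(bits:, hashes:)
    @bits = bits
    @hashes = hashes
    @field = Array.new(bits, false)
  end

  def add(item)
    positions(item).each { |i| @field[i] = true }
    self
  end

  # true  => possibly in the set (may be a false positive)
  # false => definitely not in the set
  def include?(item)
    positions(item).all? { |i| @field[i] }
  end

  private

  # Derives `@hashes` bit positions from one SHA-256 digest using
  # double hashing: position_k = (h1 + k * h2) mod bits.
  def positions(item)
    h1, h2 = Digest::SHA256.digest(item).unpack("Q>Q>")
    (0...@hashes).map { |k| (h1 + k * h2) % @bits }
  end
end

filter = BloomFilter.new(bits: 10_000, hashes: 4)
%w[123456 password qwerty].each { |pw| filter.add(pw) }
filter.include?("password") # => true
```

The trade-off is that a small bit field can stand in for millions of passwords without an API request, at the price of occasionally rejecting a password that was never in the list.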

You could certainly build something that periodically took the full HIBP plain text password set, and hashed it in the same way that your user's passwords are hashed and checked for matches, but I would say that this is beyond the scope of this gem.
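A rough sketch of that kind of offline scan, assuming a list of plain text candidate passwords is already available. PBKDF2 from the standard library stands in for bcrypt so the example has no gem dependencies; `StoredUser`, `hash_password` and `compromised_user_ids` are hypothetical names, and the iteration count is illustrative only.

```ruby
require "openssl"
require "securerandom"

ITERATIONS = 10_000 # illustrative; real deployments tune this

# Stand-in for the application's password hash (PBKDF2 here, not bcrypt).
def hash_password(password, salt)
  OpenSSL::KDF.pbkdf2_hmac(password, salt: salt, iterations: ITERATIONS,
                           length: 32, hash: OpenSSL::Digest.new("SHA256"))
end

StoredUser = Struct.new(:id, :salt, :digest)

# Returns the ids of users whose stored digest matches any leaked password.
# Each user has a different salt, so every candidate must be re-hashed per
# user: the work grows as users x leaked passwords.
def compromised_user_ids(users, leaked_passwords)
  users.select do |user|
    leaked_passwords.any? { |pw| hash_password(pw, user.salt) == user.digest }
  end.map(&:id)
end
```

Because a deliberately slow hash is run once per user per candidate, the cost of a full scan scales with the product of the two list sizes, which is the main reason this sits outside a small API wrapper.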

Some other points:

I don't understand, then why Google at least apparently only sends password matches if username also matches?

Seems like Google are crawling to find out if accounts are compromised, not just passwords leaked in a breach. This would be to avoid credential stuffing attacks against Google accounts specifically.

Facebook does something like this. [...] Why couldn't/shouldn't other sites do the same?

Other sites could absolutely spend time crawling for new lists of breached passwords and then hashing and comparing against their own. However, this is an intensive process and I'm sure both Facebook and Google have a team dedicated to account security with functions like this. There's also an amount of processing power that would have to be dedicated to hashing and comparing each password in a breach to each user's password hash (you have salted your hashes, right? Or using bcrypt there is a generated salt). Ultimately it comes down to how much time and money you can dedicate to keeping your users' accounts secure versus how important it is to do so. Google and Facebook accounts sit at the centre of many users' internet lives and would be devastating to lose. Same for most email accounts.

Before embarking on the effort to scrape the web for new password breaches and compare against your entire user database you also need to consider the ROI. The beauty of the pwned passwords API and this, and other, implementations of it is that you can get a good improvement in your account security with comparatively little engineering effort.

I'd be interested to hear if you do try to build something that can efficiently ingest a password set and check against a database of users. Let me know if you start something!

Good points! I agree on all points.

One of the drawbacks of waiting until someone signs in again to check their password is that a user may simply stay signed in for a long time without signing out. I suppose that could be an argument in favor of limiting the maximum duration of a session or remember-me token, but as far as user experience goes, I always find it annoying when I was signed in and a website arbitrarily signs me out without telling me why.

I doubt I will get a chance to develop this idea further personally, but like you, I would be quite interested if anyone else ever develops a library/tool that does this (which you summed up nicely):

build something that periodically took the full HIBP plain text password set, and hashed it in the same way that your user's passwords are hashed and checked for matches

Regarding my question about Google's Password Checkup, I think you are right. I also just recently ran across some clarification about that:

  • Advice that avoids fatigue: We designed Password Checkup to only alert you when all of the information necessary to access your account has fallen into the hands of an attacker. We won’t bother you about outdated passwords you’ve already reset or merely weak passwords like “123456”. We only generate an alert when both your current username and password appear in a breach, as that poses the greatest risk.