Infuriating fact: if a service has maximum password length limits (lower than 1000 characters), they’re reversibly storing your password and if they’re that lazy it’s probably plain text
reversibly?
Yeah, you’d better not store users’ passwords in plain text, or encrypted in a way that can be decrypted. Instead you store a (salted) hash of the password. When a user logs in, you hash the password they typed in and compare it against the hash in your database.
What is a hash? Think of it like the crossfoot (digit sum) of a number:
Let’s say you have the number 69: its crossfoot is 6 + 9 = 15. But if someone steals this crossfoot, they can’t know which number it came from. It could just as well be 78 or 87.
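To make the crossfoot analogy concrete, here is a tiny sketch. The function name `crossfoot` is just my label for the digit sum used above; it is a toy stand-in for a hash, not something you would ever use for real passwords.

```python
def crossfoot(n: int) -> int:
    """Digit sum of a non-negative integer (the toy "hash" from the example)."""
    return sum(int(digit) for digit in str(n))


# Three different "passwords" all end up with the same toy hash,
# so seeing the hash alone does not tell you which one was used.
for candidate in (69, 78, 87):
    print(candidate, "->", crossfoot(candidate))
```

All three lines print 15, which is exactly why the stolen crossfoot alone doesn’t reveal the original number.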
Dumb question: isn’t it irrelevant to the malicious party whether it’s 78 or 87 in your example, because the login only checks the hash anyway? Won’t both numbers successfully log in?
It’s actually a really good question. What you’re describing is called a collision: if different inputs produce the same hash, any of them will successfully log in.
This is why some standard hash functions get deprecated and replaced once someone finds collisions. MD5, which was used a lot to hash passwords and files, is considered insecure because of how easily people can find collisions.
In the example yes.
In the real world, finding an input that produces the right hash output isn’t easy. And because a lot of users reuse passwords (don’t do it, but people do), a list of emails and passwords gives you an incredibly lazy and easy way to compromise accounts on other sites.
Reminds me of a funny moment in my IT internship. Ahead of an audit, one of the sysadmins came over and said “yeah so I pulled all of the department password hashes to check for weak/compromised accounts and noticed one person has the same sysadmin and user password hash”, and my boss went “wait everyone doesn’t do that?” After realizing they’d outed themselves, they turned bright red and changed their admin password.
With a hash, it’s difficult to find an input that results in this specific hashed password. Think of it like this: you have a biiig prime number and you multiply it by another. That’s easy, but it’s way harder to do it backwards and factorize a large composite number (this is just for illustration). Similarly, trying to find a password that works based on the hashed one is way more difficult than hashing the password in the first place.
In addition to what others have said: the “salted” part is very relevant for storing.
There aren’t soooo many different hashing algorithms people use. So, let’s simplify the hashing again with the crossfoot example.
Let’s say 60% of websites use this one algorithm (crossfoot) for storing your password, and someone steals the password “hashes” (and the login / email). I could run a program that creates a list of the crossfoots of all numbers from 1 to 100000.
This would give me an easy lookup table for finding the “real” number behind those hashes. (Those tables exist. Look up “rainbow tables”)
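A rough sketch of that precomputed table, still with the toy crossfoot “hash”. The names `crossfoot` and `build_lookup` are made up for illustration, and real rainbow tables use a more compact chain structure than this plain dictionary, so treat it as the idea only.

```python
from collections import defaultdict


def crossfoot(n: int) -> int:
    """Digit sum, standing in for a real hash function."""
    return sum(int(digit) for digit in str(n))


def build_lookup(limit: int) -> dict[int, list[int]]:
    """Map every toy hash to all the numbers from 1 to limit that produce it."""
    table: defaultdict[int, list[int]] = defaultdict(list)
    for n in range(1, limit + 1):
        table[crossfoot(n)].append(n)
    return dict(table)


table = build_lookup(100_000)
# Every stolen "hash" of 15 can now be answered instantly with working inputs.
print(table[15][:5])
```

The table is built once and can then be reused against every site that uses the same unsalted algorithm.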
Buuuut what if I use a little bit of salt (and pepper pepper pepper) before doing my hashing / crossfooting?
Let’s use the pw “69” again and use a salt with a random number “420” and add them all together:
6 + 9 + 420 = 435
This hash wouldn’t be in my previously mentioned lookup table. Use a different salt for every user and at least the lookup problem isn’t such a big problem anymore.
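For a less toy-like picture of salting, here is a minimal sketch using only Python’s standard library (`hashlib.pbkdf2_hmac`, `os.urandom`, `hmac.compare_digest`). The function names and the iteration count are my own assumptions for the example, not a recommendation from anyone in this thread; a real service should pick its algorithm and work factor deliberately.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor, chosen only to keep the example fast


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest); the salt is random per user and stored next to the digest."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash the login attempt with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)


# Two users with the same password get different salts, hence different digests.
salt_a, digest_a = hash_password("69")
salt_b, digest_b = hash_password("69")
print(digest_a != digest_b)
print(verify_password("69", salt_a, digest_a))
```

Because each user has their own salt, an attacker would have to build a separate lookup table per user, which defeats the point of precomputing one.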
This was super helpful 🙏🏼 sent me down a whole other rabbit hole of learning
i was more wondering why a length limit implies anything about how they’re storing the password. once they receive the password they’re free to hash it any which way they want
random memory—yahoo back in the day used to hash the password in the browser before sending it to the server, but TLS made that unnecessary i guess
In the least bad case, they encrypt the password instead of hashing it, making it possible to decrypt the password.
In the most common case, they store the password in plaintext, so there isn’t even any encryption to be reversed.
Nope. No point in storing > 256 or even 128 chars for a password anyway. Useless storage wasted. Also it doesn’t really mean they store the password badly in the server.
A hashed password is always the same length though, is it not?
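A quick way to check the fixed-length point, using SHA-256 from the standard library. Plain SHA-256 is only used here to show the output size (on its own it is too fast to be a good password hash), and `digest_length` is a made-up helper name.

```python
import hashlib


def digest_length(password: str) -> int:
    """Length in hex characters of the SHA-256 digest of the given password."""
    return len(hashlib.sha256(password.encode("utf-8")).hexdigest())


# A 2-character password and a 1000-character one take the same space once hashed.
for password in ("69", "correct horse battery staple", "x" * 1000):
    print(len(password), "chars ->", digest_length(password), "hex chars")
```

Every line reports 64 hex characters, so the storage cost of the hash doesn’t depend on how long the password was.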
The length limit is mostly for the user’s sake - companies don’t want people to set their passwords to 30+ character ones that they keep forgetting and call their tech support to reset.
That’s really really really annoying, as someone who has a good, strong brain-based password algorithm and hates it when websites forbid my strong password forcing me to make an exception.
Ignoring that they must be hashed to be acceptable and that it’s not possible for 1000 characters of text to add up to a waste of storage worth mentioning in pretty much any environment, it’s literally impossible for a 128 character password limit to be beneficial in any way.
A limit below that demonstrably lowers security by a huge margin.
Ok but are 15 characters too much?
I’ve seen 14-char limits, which are NOT reasonable
there is at least one bank that I know of with a 12 character limit
There’s a major bank in Australia that limited passwords to six characters. Exactly six. No more, no less. The passwords were also case-insensitive.
Yikes, how do banks, of all things, have such low password limits…
They may just base their limit on one or a few block sizes of the hash function.
That sounds incredibly unlikely. I would bet good money that 99% of password length limits are not based on concrete limits. Things like “100 should be enough 🤷” must be way more common.
I doubt 1% of programmers are aware of their hash’s block size. It is also probably irrelevant, since after the first round everything is fixed size anyway.
Couldn’t it just be that they’re using something like bcrypt, which won’t take any chars above its limit into account (knowing that there’s a limit will pretty much never matter to a user, but why obscure the fact)? What does it even mean to store it reversibly? Just because they have a char limit doesn’t mean they are encrypting the password. It could just be some frontend shenanigans as well.
Fun fact: Lemmy instances cap at 60. They’re not storing reversibly; they’re just using bcrypt, and rather than pre-hashing the password before bcrypt like most bcrypt users do, they just truncate to 60.
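For anyone wondering what “pre-hashing before bcrypt” looks like: bcrypt itself only considers the first 72 bytes of its input, so one common workaround is to hash the password first and feed the fixed-size result to bcrypt. The sketch below shows only the pre-hash step with the standard library; `prehash_for_bcrypt` is a hypothetical helper name, and the actual bcrypt call is left out so the example doesn’t depend on a third-party package.

```python
import base64
import hashlib

BCRYPT_MAX_BYTES = 72  # bcrypt ignores input beyond this many bytes


def prehash_for_bcrypt(password: str) -> bytes:
    """SHA-256 the password, then base64-encode so the result has no NUL bytes."""
    digest = hashlib.sha256(password.encode("utf-8")).digest()
    return base64.b64encode(digest)


long_a = "x" * 100 + "a"
long_b = "x" * 100 + "b"

# Plain truncation makes these two passwords indistinguishable...
print(long_a.encode()[:BCRYPT_MAX_BYTES] == long_b.encode()[:BCRYPT_MAX_BYTES])
# ...while the pre-hash keeps them apart and always fits under the limit.
print(prehash_for_bcrypt(long_a) != prehash_for_bcrypt(long_b))
print(len(prehash_for_bcrypt(long_a)))
```

The base64 step is there because raw digest bytes can contain a zero byte, which some bcrypt implementations treat as the end of the input.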
60 makes sense, 14 does not