
How is something like this even possible?

It just sounds so incompetent.



Here's one potential scenario:

Over-zealous developers think it's appropriate to log all function calls with their parameters at trace level, or a framework with the same opinion automatically applies such tracing and logging across the whole code-base.
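Roughly what that kind of blanket tracing can look like, as a minimal Python sketch (the decorator and function names are made up for illustration):

    import functools
    import logging

    logger = logging.getLogger(__name__)

    def trace(func):
        """Hypothetical trace decorator: log every call with its raw arguments."""
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # At trace/debug level this dumps the parameters verbatim -- secrets included.
            logger.debug("TRACE %s args=%r kwargs=%r", func.__name__, args, kwargs)
            return func(*args, **kwargs)
        return wrapper

    @trace
    def authenticate(username, password):
        # The password is now sitting in the debug log in plaintext.
        ...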

No-one notices because no-one uses trace level logging, until one day another developer is tearing their hair out because they can't reproduce a bug that is only occurring on live. It's an urgent bug that needs resolution ASAP. So this developer turns on the trace level logging and eventually finds and resolves their bug.

Being the careful person they are, they turn off the logging and go away happy.

Meanwhile they've unknowingly produced a few gigabytes of log outputs which happen to include plaintext passwords.

That's just one of many different scenarios where people acting in 'good faith' can still produce bad outcomes. That is why a "PUNISH THEM!" attitude to this kind of incident is not helpful.


HTTPS form submissions should be encrypted while the data travels between the user's computer and the server, but the server will still need to decrypt them to perform the hashing. It's possible, and probably even common, for inexperienced or forgetful developers to add request logging for debugging or diagnosing service outages without adding extra logic to scrub sensitive fields.
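A minimal sketch of that failure mode and the scrubbing it usually lacks; the logger, field names, and deny-list here are all made up for illustration:

    import json
    import logging

    logger = logging.getLogger("requests")

    SENSITIVE_FIELDS = {"password", "token", "secret"}  # hypothetical deny-list

    def log_request(path, body):
        """Log an incoming request body for debugging.

        The naive version logs everything, plaintext password included;
        the scrubbed version redacts known-sensitive keys before logging.
        """
        # Naive: logger.info("POST %s %s", path, json.dumps(body))  # leaks the password
        scrubbed = {k: "***" if k in SENSITIVE_FIELDS else v for k, v in body.items()}
        logger.info("POST %s %s", path, json.dumps(scrubbed))

    log_request("/login", {"username": "alice", "password": "hunter2"})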


seriously? pretty easily. somebody probably left a debug log message in place or something. guaranteed that this happens all the time and most people don't report it.


I doubt anyone left something that logged the plaintext password. No reasonable architecture necessitates holding onto a plaintext password for more than one line of code.

One possibility is an HTTP server on the request path after TLS termination. But then why is an HTTP server logging the request body?

My guess would be some sort of instrumentation process was blindly reading data in memory without distinguishing what the data was, but produced logs that incidentally included passwords.


In my experience, I've seen both of the following scenarios:

POST request comes in from the client. Full URL and request body are logged. Sometimes simply for troubleshooting, sometimes for security reasons (e.g., wanting to know all data coming in so that it's possible to identify security holes after they've been exploited).

POST request comes in from the client. Frontend server makes a GET request to a backend server, and the password ends up in the standard request logs. In one case, I've seen this happen because the developer thought path variables were cool, so every API they wrote looked like /a/b/c/d/e. Sigh.
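A sketch of that second scenario, with hypothetical hosts and paths, showing why path variables put the credential straight into every access log:

    import logging

    logging.basicConfig(level=logging.INFO)
    access_log = logging.getLogger("access")

    def build_backend_url(username, password):
        """Hypothetical frontend-to-backend call that uses path variables for everything."""
        # The credential becomes part of the URL itself...
        return f"https://auth.internal/check/{username}/{password}"

    url = build_backend_url("alice", "hunter2")
    # ...so the standard access log on the backend (and any proxy in between)
    # records something like: GET /check/alice/hunter2 200
    access_log.info("GET %s 200", url)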


As a developer, I can tell you this happens more often than I'd like to admit.

Debug logs are a necessary evil you need to troubleshoot pesky bugs. Unfortunately, some of these debug tools need to be turned on in a live environment to capture those logs. But also unfortunately, we are human: we concentrate on fixing the bug and forget to turn off logging, or we log unnecessary data.


Indeed. This is probably a good reminder for every developer to just go and check through their logs to see what is there. It can be quite a shock sometimes to find how much can get dumped there.
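A quick way to run that check, as a rough sketch; the log directory and patterns are assumptions you'd adapt to your own setup:

    import re
    from pathlib import Path

    # Hypothetical audit: scan application logs for fields that look sensitive.
    PATTERN = re.compile(r"(password|passwd|secret|token)\s*[=:]", re.IGNORECASE)
    LOG_DIR = Path("/var/log/myapp")  # assumed location of your app's logs

    for log_file in LOG_DIR.glob("*.log"):
        for lineno, line in enumerate(log_file.read_text(errors="ignore").splitlines(), 1):
            if PATTERN.search(line):
                print(f"{log_file}:{lineno}: {line.strip()[:120]}")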


Absolutely this happens all the time. I personally have seen it happen twice at two different companies.


I'm curious if anyone has details on using bcrypt/scrypt at scale. Specifically, one way I could see this happening: login requests go to a load balancer that puts the requests on a queue to be picked up and validated by some hasher service, and the queue ends up writing the requests to logs to recover from certain kinds of failures.
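A rough sketch of that hypothesized failure mode; the queue, journal logger, and message shape are all invented for illustration:

    import json
    import logging

    queue_journal = logging.getLogger("queue.journal")

    def enqueue_login(queue, username, password):
        """Hypothetical: the load balancer pushes login requests onto a queue
        so a pool of bcrypt workers can pick them up later.

        If the queue journals messages for crash recovery, the plaintext password
        gets persisted as a side effect, even though nothing "stores" it on purpose.
        """
        message = {"user": username, "password": password}
        queue_journal.info("enqueue %s", json.dumps(message))  # recovery journal == leak
        queue.append(message)

    pending_logins = []
    enqueue_login(pending_logins, "alice", "hunter2")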


Is it insecure to bcrypt/scrypt on the client instead so the server never sees the plaintext password?


From the password compromise side, not really, you're just pushing the cost of hashing to your users (and it will impact mobile users more). There's a similar technique of requiring proof-of-work from the client to combat DDoS.

From the authorization side, there is a threat, because if your table storing hashes is compromised, attackers just have to supply the stored hash to the auth endpoint and they get to log in as anyone.

A combination of hashing on the client side (or immediately once the pw hits the endpoint) with something cheaper followed by a more intense bcrypt/scrypt afterwards might help a bit with the tradeoffs.
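A rough sketch of that split, assuming SHA-256 as the cheap client-side step and hashlib.scrypt standing in for the expensive server-side bcrypt/scrypt (all names are illustrative):

    import hashlib
    import os

    def client_prehash(password: str, username: str) -> bytes:
        """Cheap, deterministic client-side hash (the browser would do the JS equivalent).

        Only this digest crosses the wire, so a leaked request log never contains
        the plaintext password itself.
        """
        return hashlib.sha256(f"{username}:{password}".encode()).digest()

    def server_hash(prehash: bytes, salt: bytes | None = None) -> tuple[bytes, bytes]:
        """Expensive server-side KDF over the client's digest.

        An attacker with a copy of the table still has to crack the scrypt output
        to recover the prehash they would need to present at the login endpoint.
        """
        salt = salt or os.urandom(16)
        return salt, hashlib.scrypt(prehash, salt=salt, n=2**14, r=8, p=1)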


Then anyone with a copy of your DB can log in by sending the hash directly. It’s identical to storing plain text passwords.


You could hash it twice, once at the client and once at the server. So your database would store a hash of a hash.

But without persistence on the client side you wouldn’t be able to do salting in the first hash (where do you store the salt?)


Does it buy you anything at that point? A server-side issue such as this would still log the thing you need to log in, and a client-side issue would just intercept the hashed form or could derive the hashing mechanism from analysing the client.

At most, it would seem to prevent weak passwords from being passed directly to bcrypt, but salting should solve that in a similar way anyway, and anyone brute-forcing a copy of the database can incorporate the same weak hashing logic.


Another comment in this thread mentioned an idea of seeding the hash with a quickly expiring nonce fetched from the server. I think that’s a quite clever approach, similar to CSRF tokens in a sense.

That would effectively create a one time “password” for transmission from browser to database. In a case like this one, where sensitive text transmitted from the client leaked into logs, it would be a non-issue. The sensitive string in the logs is a temporary hash that would be useless shortly after discovery, since it was derived from an expired nonce.

It effectively becomes a real time scrubbing system with 100% coverage, because the passwords are “scrubbed” by design, and do not depend on explicit detection code in some scrubbing mechanism.
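One way such a nonce scheme could be wired up, loosely in the spirit of HTTP Digest auth; this is a hypothetical sketch, not a description of what any particular site does:

    import hashlib
    import hmac
    import os
    import time

    _nonces: dict[str, float] = {}   # hypothetical in-memory nonce store: value -> expiry
    NONCE_TTL = 60                   # seconds the nonce stays valid

    def issue_nonce() -> str:
        """Server hands the client a short-lived, single-use nonce before login."""
        nonce = os.urandom(16).hex()
        _nonces[nonce] = time.time() + NONCE_TTL
        return nonce

    def client_response(nonce: str, password: str) -> str:
        """What the browser sends: an HMAC over a password digest, keyed by the nonce.

        This is the only secret-looking string that can leak into request logs,
        and it is useless once the nonce expires.
        """
        prehash = hashlib.sha256(password.encode()).digest()
        return hmac.new(nonce.encode(), prehash, hashlib.sha256).hexdigest()

    def verify(nonce: str, response: str, stored_prehash: bytes) -> bool:
        """Server recomputes the same HMAC from its stored digest and compares."""
        expiry = _nonces.pop(nonce, 0.0)  # single use
        if time.time() > expiry:
            return False
        expected = hmac.new(nonce.encode(), stored_prehash, hashlib.sha256).hexdigest()
        return hmac.compare_digest(expected, response)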


No, it is just poor oversight. Not incompetence. These could be excellent developers who just overlooked a logging tool.


This happens constantly. You'd be surprised what's recorded in logs...


That's exactly what it is: incompetence, rank incompetence. Something like nine out of ten people getting paid today as professional software "engineers" should be let go. Dr. Margaret Hamilton figured out most of what we need to do to develop reliable software during and after the Apollo 11 mission. She coined the term "software engineering". Unfortunately, her work suffered from bad languaging and languished.

You'll notice you've been downvoted to hell and the comments in reply to yours are apologists and excuses. Not a coincidence.

FizzBuzz



