Millions of rows with a JSON column? Yes, indeed I have. Recently, in fact. When all normalized fields are populated, you're absolutely right. However, that wasn't our dataset: we had entries with sparse keys, due to the nature of the data we were ingesting.
We ended up with multiple partial expression indexes on the JSON column because of the flexibility they provided. Each index stayed relatively small (again, sparse keys), we avoided a boatload of NULL values in our tables, we could absorb new "loose" data from a client without a schema migration every time it popped in, and we got the job done.
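As a minimal sketch of that pattern (table, column, and key names here are all hypothetical, not from our actual schema): a partial expression index only indexes the rows that actually carry the key, which is what keeps it small when the key is sparse.

```sql
-- Hypothetical example: index one sparse key inside a jsonb column.
-- Only rows where 'discount_code' is present land in the index,
-- so sparse keys yield a small index.
CREATE INDEX idx_events_discount_code
    ON events ((payload ->> 'discount_code'))
    WHERE payload ? 'discount_code';

-- A query shaped like the index's expression and predicate can use it:
SELECT *
  FROM events
 WHERE payload ? 'discount_code'
   AND payload ->> 'discount_code' = 'SUMMER24';
```

Note this assumes `jsonb` (the `?` existence operator isn't available on plain `json`).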
In another case, a single GIN index made jsonpath queries trivial, again with loose data.
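Roughly what that looked like, with hypothetical names again: one GIN index over the whole `jsonb` column, which Postgres can then use for jsonpath predicates without a dedicated index per key.

```sql
-- Hypothetical example: one GIN index covering the whole jsonb column.
-- jsonb_path_ops supports the @>, @? and @@ operators.
CREATE INDEX idx_events_payload
    ON events USING GIN (payload jsonb_path_ops);

-- A jsonpath predicate that can use the index:
SELECT *
  FROM events
 WHERE payload @@ '$.status == "active"';
```

The `jsonb_path_ops` operator class trades some generality (it can't serve key-existence queries like `?`) for a smaller, faster index on containment and jsonpath matches.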
I would have loved to have normalized, strict data to work with. In the real world, things can get loose through no fault of my team or even the client. The real world is messy. We try to assert order upon it, but sometimes that just isn't possible on a deadline. JSON makes "loose" data workable without losing excessive amounts of development time.