
It's one of those 'simple at first glance' standards. If you want to confuse someone who advocates for JSON's 'simplicity' as a feature, ask them what happens when their favorite JSON deserializer receives repeated dictionary keys (yes, that's valid JSON per RFC 7159).


There is no standard so simple that humans won't make a mess of it. You think "CSV, comma-separated values, it's right in the name!" And then you write a parser for it and realize it can't read the CSV files Excel generates.
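To make the CSV point concrete, here is a minimal sketch in Python. The sample row is made up for illustration; it is quoted the way spreadsheet exporters commonly quote fields that contain commas or quotes. It compares a split-on-commas "parser" with the standard library's csv module.

```python
import csv
import io

# Hypothetical sample data: one field contains a comma, another contains
# a doubled quote, which is the usual escape for a literal quote.
data = 'id,name,notes\r\n1,"Smith, Jane","said ""hi"""\r\n'

# Naive approach: split every line on commas.
naive = [row.split(",") for row in data.splitlines()]

# Standard-library parser, which understands quoted fields.
parsed = list(csv.reader(io.StringIO(data, newline="")))

print(naive[1])   # the quoted name is torn into two pieces
print(parsed[1])  # three fields, quotes resolved
```

The naive version yields four fields for a three-column row, which is roughly the surprise described above.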

Years ago I was porting an R5RS Scheme app from Guile to other Scheme systems. The entire standard is 50 pages of fairly readable basic English. You would not believe how much the different interpreters and compilers disagree about what a symbol can be, or about under-specified cases such as integer->char (one system goes out of its way to be a dick about it and purposely does not use ASCII, trading pragmatism for pedantry).


  % echo '{"a": 5, "a": 42}' | jq  
  {
    "a": 42
  }
Pretty much what I expected and also what you would get if you saw two instances of the same optional non-repeated field in a protocol buffer message. What were you expecting or wanting?


Here's the kicker: while this might seem sensible to you, some JSON implementations out there will reject duplicate fields (fail the parse completely), and the RFC does not even specify what the correct behavior is (override previous, ignore duplicates, fail entire parse, return non-unique keyed dictionary, something else entirely?).

So while to you and me this behavior might be expected (although I'm still not sure that overriding previous fields is more obvious than ignoring repeated fields) - some library implementers thought differently, and there isn't even an agreed-upon standard. Arguing about this isn't purely academic, either - there have been security vulnerabilities resulting from these differences [1].

[1] - https://justi.cz/security/2017/11/14/couchdb-rce-npm.html
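For illustration, the policies listed above can all be reproduced with one parser. This is a sketch using Python's json module, whose default is last-wins; the helper names reject_duplicates and first_wins are made up for this example and are not part of any library.

```python
import json

DOC = '{"a": 5, "a": 42}'

def reject_duplicates(pairs):
    """Hypothetical strict policy: fail the whole parse on a repeated key."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen

def first_wins(pairs):
    """Hypothetical policy: keep the first occurrence, ignore later ones."""
    result = {}
    for key, value in pairs:
        result.setdefault(key, value)
    return result

last = json.loads(DOC)                                  # default: last wins
first = json.loads(DOC, object_pairs_hook=first_wins)   # first wins
pairs = json.loads(DOC, object_pairs_hook=list)         # keep every pair

try:
    json.loads(DOC, object_pairs_hook=reject_duplicates)
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)

print(last, first, pairs, strict_error)
```

Same bytes, four different answers. Two components that pick different policies for the same document is the kind of mismatch the linked write-up describes.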


Interesting. Protobuf specifies this last-instance-wins behavior, and it can be pretty useful. It allows you to override a field by simply appending a few bytes, without having to re-encode a whole message. JSON I guess doesn't have as much concern for efficiency as protobuf has.
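A rough sketch of the append-to-override trick, using a toy decoder rather than the real protobuf library. It assumes a message whose field 1 is a non-repeated integer encoded as a varint, and it handles nothing else; the function names are made up for illustration.

```python
def read_varint(buf, pos):
    """Read one base-128 varint starting at pos; return (value, new_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7

def decode_varint_fields(buf):
    """Toy decoder: maps field number -> value, varint fields only."""
    fields = {}
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type != 0:
            raise ValueError("this toy decoder only handles varint fields")
        value, pos = read_varint(buf, pos)
        fields[field_number] = value  # a later occurrence overwrites an earlier one
    return fields

original = bytes([0x08, 0x05])            # field 1 = 5
patched = original + bytes([0x08, 0x2A])  # append field 1 = 42, no re-encoding

print(decode_varint_fields(original))
print(decode_varint_fields(patched))
```

The override costs two appended bytes here, and the original encoding is left untouched.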


Their point was likely that implementations vary in their interpretation, which is a bit of a problem for RPC systems.



