Date: Sun, 26 May 2013 09:38:13 +0200
- Random Strings
PHP and JSON: Cut #987
JSON Decoding in PHP 5.2.1 is Broken
As of PHP 5.2.1, json_decode() no longer follows the published standards for JSON-encoded texts.
Why not? For no reason other than the convenience of those ignorant of JSON standards.
Prior to PHP 5.2.1, passing the string 'true' to json_decode() resulted in NULL.
As of PHP 5.2.1, it results in a boolean true.
Nice and handy, perhaps … but a blatant violation of JSON specifications, since 'true' is not a valid JSON encoded text.
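To see the behavior described above on your own install, a minimal sketch like the following decodes a bare literal and the same value wrapped in an array. The variable names are only illustrative, and the result noted for the bare literal assumes PHP 5.2.1 or later.

```php
<?php
// A bare literal: not a valid JSON text per the spec, yet PHP 5.2.1 and
// later decode it anyway.
$bare = json_decode('true');

// The same value wrapped in an array: a valid JSON text.
$wrapped = json_decode('[true]');

var_dump($bare);    // bool(true) on PHP 5.2.1 and later
var_dump($wrapped); // an array holding a single boolean true
```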
A little history
Rather than roll some new JSON interpreter, I chose to leverage the Services_JSON package along with ext/json to build a package that was compatible with ext/json, but usable for those who did not have that extension installed. With compatibility built in, developers could move application code back and forth between systems without having to worry about whether or not the extension was installed — if it was, the application would benefit from the added performance of a native extension. If it wasn’t, everything should work exactly the same way.
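The idea of preferring the native extension and falling back to pure PHP can be sketched roughly as follows. This is not Solar_Json's actual code: portable_json_decode() and example_fallback_decoder() are hypothetical names invented for illustration, and the fallback here is only a stand-in for a real pure-PHP parser.

```php
<?php
/**
 * Hypothetical helper (not part of Solar_Json): decode with ext/json when it
 * is loaded, otherwise hand the text to a caller-supplied pure-PHP decoder.
 */
function portable_json_decode($text, $fallbackDecoder)
{
    if (function_exists('json_decode')) {
        // Native extension available: take the fast path.
        return json_decode($text, true);
    }

    // Extension missing: the fallback is expected to return the same
    // structures the extension would have produced.
    return call_user_func($fallbackDecoder, $text);
}

// Stand-in fallback for illustration only; a real one would parse $text.
function example_fallback_decoder($text)
{
    return null;
}

$data = portable_json_decode('{"a": [1, 2]}', 'example_fallback_decoder');
```

Application code calls the one helper everywhere, so it does not need to know which implementation did the work.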
In the course of my research for the Solar_Json package, I learned a lot about JSON and how it is supposed to behave. JSON.org is a spartan but complete resource about the format, and includes the JSON Checker and a comprehensive JSON test suite. There’s also a link to RFC 4627, which details JSON’s structure in a proposal for the formal application/json media type.
While digging through all this JSON goodness, I came to appreciate ext/json’s strict adherence to JSON’s format. The version of ext/json bundled with PHP 5.2.0 (version 1.2.1) was right on the money in its parsing, and by the time I was done, Solar_Json matched it every step of the way. To ensure ext/json compatibility, I wrote a series of unit tests (26 in all) to ensure that Solar_Json’s “pure PHP” implementation matched ext/json’s output exactly.
It was challenging, but it worked out well. The result was Solar_Json.
JSON Bundled with PHP, Confusion Ensues
All was good, for a while.
(Yesterday, Paul M. Jones re-ran the JSON unit tests I’d written for Solar_Json using PHP 5.2.1 in preparation for a new release of Solar. He mentioned that some of the tests started failing, which sparked this discussion. Thanks, Paul!)
Sure, a couple of people (myself included) didn’t fully understand JSON for a while. I even opened (and quickly closed) a bug about the way I thought certain strings should be decoded by the json_decode() function. Others had the same confusion.
What’s so confusing?
Well, the common thing that people want to do is hand json_decode() a bare value such as 'true'.
The confusing part is that, under PHP 5.2.0, such calls return NULL instead of true. Based on the two bug reports (#38440 and #38680), it’s common for people to expect the output to be a boolean true.
However, if you understand JSON at all, you’ll know that NULL is a perfectly reasonable result, because bare values such as 'true' are not valid JSON texts.
Note that I said “if you understand JSON at all”, you’ll realize that NULL is a perfectly reasonable result when attempting to decode an invalid JSON text.
I should amend that: If you understand JSON at all and actually care about standards and compatibility, you’ll realize that NULL is a perfectly reasonable result of parsing an invalid JSON text.
Section 2 of the standard (RFC 4627) states very plainly:
A JSON text is a serialized object or array.
JSON-text = object / array
Translated, that means that a valid JSON-text is either an object or an array. It’s not a string literal, an integer, a boolean. The list of what a valid JSON-text can be is short. It can be an object. It can be an array. It can be … whoops, that’s it. An object, an array, or it just isn’t JSON.
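For code that wants the strict reading regardless of which PHP version it runs on, one option is a thin wrapper that rejects anything whose top level is not an object or an array. The sketch below is only an illustration: strict_json_decode() is a hypothetical name, not part of ext/json or Solar_Json, and it checks only the first significant character before handing the text to json_decode().

```php
<?php
/**
 * Hypothetical wrapper: decode only texts whose top level is an object or
 * an array, and return null for anything else.
 */
function strict_json_decode($text)
{
    // Skip leading insignificant whitespace: space, tab, line feed, CR.
    $trimmed = ltrim($text, " \t\n\r");

    // A JSON text must begin with "{" (object) or "[" (array).
    if ($trimmed === '' || ($trimmed[0] !== '{' && $trimmed[0] !== '[')) {
        return null;
    }

    // Decode objects as associative arrays for easy comparison.
    return json_decode($text, true);
}

var_dump(strict_json_decode('true'));   // NULL
var_dump(strict_json_decode('"text"')); // NULL
var_dump(strict_json_decode('[true]')); // an array holding a boolean true
```

Because a bare null can never be a valid top-level value under this rule, a NULL return from the wrapper always means the input was rejected or could not be parsed.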
DAMN, that’s inconvenient, you may be thinking. Yep, it is. But, it is what it is. If you don’t like it, submit an RFC to have it changed. That’s the way this crazy thing called the internet works.
Put another way: if you don’t like it, you do not just start making things up. Apparently, enough people unclear on the concept of JSON complained about their lack of understanding that PHP now just does whatever it wants with JSON. Check this out for the details. (And to reiterate, I’m not knocking the people who aren’t clear about JSON. I was one of them too, up until I actually researched how JSON is supposed to behave.)
Imagine if the core team behind every language did that. Hey, if you don’t like the standards, just ignore them! We can explain it away with documentation, right?
The cavalier attitude taken by the PHP internals team on this issue is inexcusable. Yep, cavalier — a colleague who spoke to a member of the PHP internals team about this change confirmed that the break from the JSON spec is deliberate and intentional.
To make matters worse, the version number of ext/json did not change between PHP 5.2.0 and PHP 5.2.1. In both releases, ext/json claims to be at version 1.2.1, despite this significant change.
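Since the reported version cannot distinguish the two behaviors, code that cares has to probe the behavior itself. A minimal sketch, assuming only the standard phpversion() and json_decode() functions; the variable names are made up for illustration:

```php
<?php
// phpversion() with an extension name returns that extension's version
// string, or false if the extension is not loaded.
$jsonVersion = phpversion('json');

// Feature detection: does this build decode a bare literal?
$decodesBareLiterals = (json_decode('true') === true);

if ($jsonVersion === false) {
    echo "ext/json is not loaded\n";
} else {
    echo "ext/json reports version: ", $jsonVersion, "\n";
}

echo "decodes bare literals: ", ($decodesBareLiterals ? "yes" : "no"), "\n";
```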
While some are lobbying to compile the definitive business case for PHP (and I even chimed in and agreed that it was necessary), some PHP internals folks are effectively shooting that effort in the foot by disregarding published standards.
I’ve spent the better part of the last two years defending my choice of PHP 5 as my preferred language, first at Feedster, now at Mashery. With all the buzz about other languages these days, the case for PHP is getting harder to make. Incidents like this will not make the case for PHP any easier.
Is this a big flap over a little thing? That’s certainly one way of looking at it. I see this flagrant disregard for published specs as one more cut toward a death by a thousand cuts.
Talented and notable developers are dropping PHP, or seriously considering other languages. If PHP’s next 10 years are to be as poignant as its first, a significant attitude adjustment is required.