Neil's News

Python JSON

8 December 2009

JSON is a great data interchange format that complements XML quite well. But whereas an entire industry is devoted to XML parsers, encoders, validators, transformers and activists, JSON does not get as much attention. [As evidence, despite this post being about JSON, check out what's being advertised in the right-hand column.] My requirements called for me to merge JSON blocks from several untrusted sources and publish them as a single JSON block. The problem is that if any one source provides syntactically invalid JSON, the merged block becomes unusable by everyone. Thus a JSON validator was needed.

Python has a standard library called (big surprise) json. It offers two functions, read() and write(), which convert JSON to Python and Python to JSON respectively. Passing illegal JSON (such as a missing bracket) raises an error:

>>> import json
>>> json.read('[1, 2')
StopIteration

This is good, but insufficient as a validator. Consider the following case:

>>> import json
>>> json.read('123xx')
123

The trailing garbage is silently ignored. By contrast, when passed to JavaScript's eval() function, '123xx' will throw a SyntaxError.
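For comparison, here is a minimal sketch of a strict check using `json.loads()`, which is the parsing function in the current Python standard library (an assumption on my part: it is not the `read()` call shown above, and the helper name `is_valid_json` is made up for illustration). It treats any parse failure, including trailing characters, as invalid.

```python
import json


def is_valid_json(text):
    """Return True if text parses as one JSON value with nothing left over."""
    try:
        json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return True


print(is_valid_json('[1, 2'))   # missing bracket
print(is_valid_json('123xx'))   # trailing garbage
print(is_valid_json('[1, 2]'))  # well-formed
```

Note that `json.loads()` by default also accepts `NaN` and `Infinity`, which are not legal JSON, so this is still looser than a true validator.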

A more aggressive option is to round-trip the JSON to Python and back. Thus:

>>> import json
>>> json.write(json.read('123xx'))
'123'

Not only is this approach excessively CPU-intensive, but it also introduces new errors, such as lost precision:

>>> import json
>>> json.write(json.read('0.1234567'))
'0.123457'

Moreover, Python's JSON implementation does not follow the spec laid out in RFC 4627. Amongst other things, the top-level JSON text is supposed to be either an array or an object, yet a bare number such as 123 is accepted.
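The RFC 4627 top-level rule can be layered on top of a parser as a simple type check. The sketch below assumes the current standard library's `json.loads()` rather than the `read()` function used in this post; `is_rfc4627_text` is a hypothetical helper name.

```python
import json


def is_rfc4627_text(text):
    """Return True only if text is valid JSON whose top level is an array or object."""
    try:
        value = json.loads(text)
    except ValueError:  # raised for any syntax error, including trailing garbage
        return False
    # JSON arrays decode to list and JSON objects decode to dict.
    return isinstance(value, (list, dict))


print(is_rfc4627_text('123'))         # bare number at the top level
print(is_rfc4627_text('[0.1234567]'))  # array at the top level
```

This still pays the cost of building Python objects, so it addresses the conformance complaint but not the CPU one.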

Thus I found myself writing a JSON validator from scratch. Efficiency and portability were key considerations.
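To give a flavour of what such a validator involves, here is a rough sketch of a recursive-descent checker that walks the text without building any Python objects. It is not the validator I ended up with, just an illustration under a few assumptions: the token patterns are my reading of the RFC 4627 grammar, and every name in it (`validate_json`, `_value`, `_sequence` and so on) is invented for this example.

```python
import re

# Token patterns, assumed from the RFC 4627 grammar.
_WS = re.compile(r'[ \t\n\r]*')
_NUMBER = re.compile(r'-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?')
_STRING = re.compile(r'"(?:[^"\\\x00-\x1f]|\\["\\/bfnrt]|\\u[0-9a-fA-F]{4})*"')
_LITERALS = ('true', 'false', 'null')


class _Invalid(Exception):
    """Raised internally when the text stops matching the grammar."""


def _skip_ws(text, pos):
    return _WS.match(text, pos).end()


def _value(text, pos):
    """Consume one JSON value at pos; return the index just past it."""
    pos = _skip_ws(text, pos)
    ch = text[pos:pos + 1]
    if ch == '[':
        return _sequence(text, pos + 1, ']', _value)
    if ch == '{':
        return _sequence(text, pos + 1, '}', _member)
    for pattern in (_STRING, _NUMBER):
        match = pattern.match(text, pos)
        if match:
            return match.end()
    for word in _LITERALS:
        if text.startswith(word, pos):
            return pos + len(word)
    raise _Invalid(pos)


def _member(text, pos):
    """Consume one "key": value pair inside an object."""
    pos = _skip_ws(text, pos)
    match = _STRING.match(text, pos)
    if not match:
        raise _Invalid(pos)
    pos = _skip_ws(text, match.end())
    if text[pos:pos + 1] != ':':
        raise _Invalid(pos)
    return _value(text, pos + 1)


def _sequence(text, pos, closer, item):
    """Consume comma-separated items up to and including the closing bracket."""
    pos = _skip_ws(text, pos)
    if text[pos:pos + 1] == closer:
        return pos + 1  # empty array or object
    while True:
        pos = _skip_ws(text, item(text, pos))
        ch = text[pos:pos + 1]
        if ch == closer:
            return pos + 1
        if ch != ',':
            raise _Invalid(pos)
        pos += 1


def validate_json(text):
    """Return True if text is a JSON array or object with nothing trailing."""
    start = _skip_ws(text, 0)
    if text[start:start + 1] not in ('[', '{'):
        return False  # RFC 4627: top level must be an array or an object
    try:
        end = _value(text, start)
    except (_Invalid, RecursionError):  # also reject absurdly deep nesting
        return False
    return _skip_ws(text, end) == len(text)
```

Because nothing is decoded, numbers such as 0.1234567 are never rounded; the checker only answers yes or no, and the original text can be merged verbatim.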


Obviously I had no choice but to take this photograph.
