[Melbourne-pm] PostgreSQL and indexing with json types

Tue Oct 9 16:42:41 PDT 2012

On 10/10/12 09:18, Craig Sanders wrote:
> On Thu, Oct 04, 2012 at 09:56:44AM +1000, Toby Corkindale wrote:
>> Hi,
>> Just something interesting I noticed today..
>>
>> As you've probably already seen, PostgreSQL 9.2 added support for
>> storing JSON documents in a 'json' type field.
>
> this is a cool and interesting feature but it will serve to encourage
> even more morons to avoid even basic/minimalist database normalisation
> methods.
>
> i'm getting tired of seeing FOSS programs that use databases just
> stuff JSON or XML or CSV or similar into a text field.
>
> their reasoning seems to be that the data will only ever be used by
> their own web app or via their REST API so there's no need to spend any
> time thinking about or designing the database schema (or maybe there's
> no reasoning and they're just ignorant and stupid).  Data lock-in is no
> less irritating in FOSS apps than it is in proprietary apps.
>
> also, why do something in the database server when you can do a
> half-arsed emulation of it 10,000 times less efficiently in your crappy
> php web app?
>
> pg's support for JSON fields isn't a bad thing - overall, it's a good
> thing...it will mitigate some of the problems - but it won't do anything
> to educate web-dev morons about database design.

Nothing has stopped people putting arbitrary documents into TEXT or 
BYTEA fields before -- I've seen plenty of cases where people use Perl's 
Storable, Python's Pickles, YAML, or custom serialisation, and just dump 
that into a text field in the database.

In one company I worked at, someone had even written a query-creator 
which would build the appropriate LIKE phrase to search within the 
serialised format! (But needless to say, it performed atrociously 
slowly, because you're doing a full table scan every time..)

I agree with you, that this practice is usually indicative of poor 
design -- even if the database offers ways to index and query the 
document natively. (Which is still a big improvement though)

However, I think you're missing a point. Sometimes you are really 
storing a whole document, or at least a large amount of very structured, 
self-contained data. For these cases, JSON storage is quite useful and 
document storage is a lot neater (and performant) than having to write a 
whole heap of code to deparse and reparse it into all the constituent 
components.

-Toby