-->

Postgres with JSON

When designing a web application, one of the most important decision is where to store the data. There are many alternatives, and the most popular (that are free) are PostgreSQL, MySQL and MongoDB. In this blog post I will focus on PostgreSQL (postgres in short) and on its feature of JSON support.

Few definitions first

Relational database - software application that holds its data in a rational model. It means that the data is stored in tables, each table has columns and rows. Each row represents an object, generally it represents a class in the object oriented programing. The columns for each table are determined, and the result is a structured data. The structure of the data is fixed and new records or object should be inserted according to the predefined structure. The main language that is being used is SQL for querying the database and make CRUD (create, update, delete) operations. The popular relational databases are Oracle, SQL Server, MySQL and Postgres.

JSON - JavaScript notation, a standard for data representation. It is originally used for data transmission, but today it has many Usages such as data store, configuration files templating etc. The JSON format is lightweight and allows unstructured data in contrast to other format such as XML. Generally JSON does not have a scheme so there are no validations on its content.

Document oriented database - one of the NoSQL families. This database in the contrary to rational database hold the data in documents that are unstructured, it means that every document can be different. One of the popular implementation for it is MongoDB, that stores it data in a JSON documents.

What to choose?

The decision of choosing the right solution is based on the purpose of the application and its data model. For many years the only solution was relational database, and the application had to be build and adjust itself to it. A lot of computation and hard migration of data had to be done in order to arrange the data in the predefined structure, every upgrade or model change caused a lot of work and there was a lot of compromises that had to be taken in the way.

In the the last couple of decades a lot of NoSQL databases started to show up and begin to gain popularity. Each database has its unique solution and advantages compared to the others. One of the biggest advantages compared to relational databases is that the data can be unstructured with great performances and size efficiency.

So the question that pops to mind is whether to take a NoSQL database that fit perfectly to my data model and leave the relational databases aside.

My experience is that the answer is generally ‘no’. Unless the application is very complicated and contains a lot of different and diverse modules, or it needs a database that can give it advanced analysis capabilities (like Hadoop), you better stay with relational database. There are few reasons for it:

The hybrid solution

Fortunately in the last couple of years some of the relational databases added “NoSQL” capabilities such as storing unstructured data. One of them is postgres that added the JSON column type in its 9.4 version. This column can hold a JSON document such as document oriented databases and enables indexing and complex queries. Now, i can store the data structured and unstructured in the same database and even in the same table. For instance, let’s assume an application that receives a data from an API and its only uses the data to represents it to the user. Instead of migrate the data into a structured table, i can store it as JSON, and not care for changes of the API. I can make operation such as search and filtering on the data with minimum effort. In addition to that I can store some structure data the will accompany the unstructured data such as creation date, fields for managing the data etc. i’m enjoying both worlds. Because most of the rendering is being done by sending JSON to the client, in many cases I can transfer the data ‘As Is’ without any manipulation.

JSON in practice (with rails)

First you need to update your postgres installation to 9.4 or later you can find here a good guide for it.

Create a table with JSON column:

rails generate model Weather city:string forcast:jsonb

The result will be:

class CreateWeathers < ActiveRecord::Migration
  def change
    create_table :weathers do |t|
      t.string :city
      t.jsonb :forecast
      t.timestamps null: false
    end
    add_index  :weathers, :forecast, using: :gin
  end
end

The forecast column type is JSONB which is almost identical to the JSON type but is more efficient for processing but a bit slower in storing. To make thing go fast we will add an index to the column. JSONB indexes are called GIN (Generalized Inverted Index) that makes queries much faster.

In Order to query the JSON data, postgres defined special operators. A full table can be found here. The main ones are:

For this JSON for instance: (from http://openweathermap.org/api

  {"city": {
    "id": 2643743,
    "name": "London",
    "coord": {
      "lon": -0.12574,
      "lat": 51.50853
    },
    "country": "GB",
  },
  "list": [{
    "dt": 1458302400,
    "temp": {
      "day": 8.48,
      "min": 4.81,
      "max": 9.12,
    },
...

Query the weather from all the cities of Great Britain will be:

Weather.where("forecast -> 'city' ->> 'country' = 'GB'")

The forecast -> ‘city’ will retrieve the json of ‘city’, the -» ‘country’ will retrieve the text of ‘country’ and will compare it to ‘GB’.

Conclusion

There are a lot of benefits using NoSQL databases in terms of performance, scale, availability and more. But old good relation database can provide a very good solution for the common web application. Using Postgres with JSON type adds a huge advantage when there is a need to handle unstructured data. It allows us to have the benefits of relational database such as transaction management, constraints and more. In addition, the ability to have good performance working with unstructured data.

Originally posted on Spectory's blog