Code Self Study Forum

How to prettify huge JSON files (jq stream)

I have a large (2 GB), compact (one-line) JSON file that is too big for most tools to process quickly. I thought that if I prettified it, I could process it more quickly with command-line tools.

Loading it into memory and then into MongoDB via a script was taking too long.

Python has a nice JSON formatting tool, but it parses the whole document into memory before writing anything, and that was too slow:

$ cat filename.json | python3 -m json.tool > output.json

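jq has a streaming option (--stream) that parses the input incrementally instead of loading it all at once, and that ended up working better for what I was doing. Note that in this mode jq prints a sequence of [path, value] events rather than the original document re-indented: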

$ cat filename.json | jq --stream '.' > output.json

You can see if it’s working by typing tail -f output.json in another terminal while the command is running.
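If the goal is only to re-indent the file, the same constant-memory idea can be sketched in plain Python. This is not how jq does it; it is a minimal illustration that walks the characters in fixed-size chunks, tracks whether it is inside a string, and inserts newlines and indentation around structural characters. The function name prettify_stream and the chunk size are my own choices, and the code does not validate the JSON.

```python
import io


def prettify_stream(src, dst, indent="    ", chunk_size=1 << 16):
    """Re-indent JSON read from src into dst without loading it all at once.

    Only whitespace is changed; the input is not validated.
    """
    depth = 0             # current nesting level
    in_string = False     # are we inside a "..." literal?
    escaped = False       # was the previous string char a backslash?
    pending_open = False  # just wrote '{' or '[', newline not emitted yet

    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        out = []
        for ch in chunk:
            if in_string:
                # Copy string contents verbatim, watching for the closing quote.
                out.append(ch)
                if escaped:
                    escaped = False
                elif ch == "\\":
                    escaped = True
                elif ch == '"':
                    in_string = False
                continue
            if ch in " \t\r\n":
                continue  # drop existing whitespace outside strings
            if pending_open:
                pending_open = False
                if ch in "}]":
                    # Empty container: keep "{}" or "[]" on one line.
                    depth -= 1
                    out.append(ch)
                    continue
                out.append("\n" + indent * depth)
            if ch in "{[":
                out.append(ch)
                depth += 1
                pending_open = True
            elif ch in "}]":
                depth -= 1
                out.append("\n" + indent * depth + ch)
            elif ch == ",":
                out.append(",\n" + indent * depth)
            elif ch == ":":
                out.append(": ")
            else:
                if ch == '"':
                    in_string = True
                out.append(ch)
        dst.write("".join(out))
    dst.write("\n")


# Small in-memory demo; for a real file, pass open file objects instead.
demo_out = io.StringIO()
prettify_stream(io.StringIO('{"name":"Bilbo","equipment":["sting","ring"]}'), demo_out)
print(demo_out.getvalue(), end="")
```

For a real file you would pass file objects, for example open("filename.json") as src and open("output.json", "w") as dst. Memory use then stays around one chunk regardless of file size.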

jq looks like a useful tool. There are more options listed here:
https://stedolan.github.io/jq/tutorial/

Another jq tip – minify JSON with the -c option:

A data.json file:

{
    "name": "Bilbo",
    "age": 111,
    "equipment": ["mithril shirt", "sting", "ring"]
}

command:

$ jq -c '.' < data.json

output:

{"name":"Bilbo","age":111,"equipment":["mithril shirt","sting","ring"]}
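For comparison, roughly the same minification can be done with Python's standard json module when jq is not available. This is a small sketch for files that fit in memory; the helper name minify is hypothetical, and ensure_ascii=False is passed so that non-ASCII characters are kept as-is rather than escaped.

```python
import json


def minify(text):
    """Return the JSON in text with insignificant whitespace removed."""
    # separators=(",", ":") drops the spaces json.dumps adds by default.
    return json.dumps(json.loads(text), separators=(",", ":"), ensure_ascii=False)


pretty = """{
    "name": "Bilbo",
    "age": 111,
    "equipment": ["mithril shirt", "sting", "ring"]
}"""

print(minify(pretty))
# prints: {"name":"Bilbo","age":111,"equipment":["mithril shirt","sting","ring"]}
```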

an example of where it could be useful:

$ curl -XPOST http://localhost:4444/api/characters \
    -H 'Content-Type: application/json' \
    -d '{"name":"Bilbo","age":111,"equipment":["mithril shirt","sting","ring"]}'
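The same request can be sketched with Python's standard library. The URL is the one from the curl example above; the function name build_character_request is hypothetical. The snippet only builds the request and prints it, since sending it needs a server listening on that port.

```python
import json
import urllib.request


def build_character_request(character, url="http://localhost:4444/api/characters"):
    """Build (but do not send) a POST request with a minified JSON body."""
    body = json.dumps(character, separators=(",", ":")).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_character_request(
    {"name": "Bilbo", "age": 111, "equipment": ["mithril shirt", "sting", "ring"]}
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))

# To actually send it (assumes the API from the curl example is running):
# with urllib.request.urlopen(req) as resp:
#     print(resp.status)
```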
