Sunday, March 20, 2016

Console tools for structured text

The lingua franca of Unix is plain text in plain files. That's why we can do so much with its standard tools, using pipes as combinators.

But some file formats have a concrete structure, and we can extract meaning from that structure. CSV is one example.


jq is like sed for JSON files. It lets you parse, grep, replace, match and join JSON data. For example, with /tmp/issues.json holding the issue list of a GitHub repo:

cat /tmp/issues.json | jq '.[] | select(.labels[].name | in({"S-zendesk": 12})) | {labels: [.labels[].name | match("^A-.*") | .string]}'

This selects the issues that carry the label S-zendesk and picks their A-* labels, which shows which areas have Zendesk issues.
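
To try this kind of filter without real GitHub data, here is a minimal sketch. The sample issues, the area names (A-billing, A-search) and the temporary file are made up for illustration. The filter is an assumed equivalent of the one above, written with any and startswith (both available in jq 1.5 and later), and it goes one step further by counting issues per area.

```shell
# Hypothetical sample: three issues shaped like GitHub issue data,
# trimmed down to the fields the filter needs.
issues=$(mktemp)
cat > "$issues" <<'EOF'
[
  {"number": 1, "labels": [{"name": "S-zendesk"}, {"name": "A-billing"}]},
  {"number": 2, "labels": [{"name": "A-search"}]},
  {"number": 3, "labels": [{"name": "S-zendesk"}, {"name": "A-search"}, {"name": "P-high"}]}
]
EOF

# Keep issues labelled S-zendesk, collect their A-* labels,
# then group identical labels to count issues per area.
areas=$(jq -c '
  [ .[]
    | select(any(.labels[]; .name == "S-zendesk"))
    | .labels[].name
    | select(startswith("A-")) ]
  | group_by(.)
  | map({area: .[0], issues: length})
' "$issues")

echo "$areas"
rm -f "$issues"
```

With the sample above, issue 2 is dropped because it has no S-zendesk label, so each area ends up with a count of one.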


xmlstarlet is the same idea as jq, but for XML: it lets you print and match fields from XML files. For an XML file that holds a list of file entries, each with a title and an id, we can list the titles and ids with:

cat files.xml | xmlstarlet sel -T -t -m 'files/file' -v 'title' -o ' => ' -v 'id' -n


I just discovered dateutils, but it already seems like a very good companion for tail, or for standalone date calculations.


Part of netpbm. Compare and operate on images from the command line. Sucks less (apparently) than ImageMagick.