Short Note: use DuckDB as (pipe) jq replacement

The usage of the jq can be substituted sometimes by DuckDB with a bit cleaner syntax.

For example, to get properties Id and State.Status from a list of active containers in docker one might do this:

➜ docker inspect f42 | jq '.[] | "\(.Id) \(.State.Status)"' -r
f42d97a8670846a737a972329cf44e5b092954ef7e30681f73d70963ad971e43 running

but it might be easier to write a modern SQL JSON:

➜ docker inspect f42 | mjq "select Id, State->>'Status' from JQ"
┌──────────────────────────────────────────────────────────────────┬──────────────────────┐
│                                Id                                │ (State ->> 'Status') │
│                             varchar                              │       varchar        │
├──────────────────────────────────────────────────────────────────┼──────────────────────┤
│ f42d97a8670846a737a972329cf44e5b092954ef7e30681f73d70963ad971e43 │ running              │
└──────────────────────────────────────────────────────────────────┴──────────────────────┘

How does mjq work? It's this function basically:

mjq() {
  local TFILE="/tmp/mjq-$((RANDOM % 100)).json"
  cat >  ${TFILE}
  local SQL="${1//JQ/read_json('${TFILE}')}"
  duckdb -c "${SQL}"
}

I used mjq because that's a prefix for my tools. You can build any SQL you want as arg, and JQ string will be replaced with the read_json output of the file provided as the input.

Further ideas:

  • the file was used since then further debugging can also be done without re-doing the upstream bash command. YMMV

More information about usage of DuckDB and the motivation for this script: https://www.pgrs.net/2024/03/21/duckdb-as-the-new-jq/

More information about DuckDB JSON extension: https://duckdb.org/docs/extensions/json.html