I am a self-professed wannabe keyboard wizard, brought up on amazing text-manipulation tools like grep, sed and awk. Text-based data manipulation can still be fast and powerful, and might save you dozens of lines of code. This post shows how a little knowledge of JSON-parsing tools can go a long way.
Tools
The tools used in this article are as follows:
- AWS CLI
- JMESPath parser
- JQ
- Bash
This combination means some examples below may not map exactly to your particular use case or configuration, so please bear that in mind.
Basic JSON Formatting and Filtering
I looked at basic formatting and filtering in this post about essential AWS CLI skills. As a quick recap, you can use the `--query` command-line parameter to pass a string specifying a query expression to run on the JSON results, and sometimes these expressions can get quite complex.
If you want a bit more power in your querying, it is worth looking at the JQ tool, which also lets you process JSON structures. The good thing about JQ is that it can operate on files as well as STDIN, so you can save your JSON output into a file and run JQ over it again and again. This is definitely faster when writing your queries, and may also save you some AWS request costs.
As a start, JQ is great for simply pretty-printing JSON output from something like a Lambda function: just pipe it into `jq '.'`. This isn't necessary with the AWS CLI, because it already formats its JSON output.
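For example, a minimal sketch (assuming jq is installed):

```shell
# Pretty-print a compact JSON string by piping it through jq '.'
echo '{"ok":true,"items":[1,2]}' | jq '.'
```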
AWS CLI examples
The simplest use of the CLI filter is to print a reduced amount of data so that it is more manageable. This uses JMESPath expressions to process the JSON data in the output.
Here are some basic examples. Note that the query expression is added using the `--query` argument. In this case we will use CloudFormation output:
aws cloudformation describe-stacks --query "<filter-goes-here>"
| Use Case | Command |
|---|---|
| Show only stack ID, name and update time | `"Stacks[*].[StackId, StackName, LastUpdatedTime]"` |
| Show stack name/ID for stacks whose name contains 'foo' | `"Stacks[?StackId.contains(@, 'foo')].[StackId, StackName]"` |
| Show stack name/ID for stacks which have outputs exported and have been updated since Nov 2022 | `"Stacks[?Outputs && LastUpdatedTime>'2022-11'].[StackId, StackName, LastUpdatedTime]"` |
(Note that the final example only works because the date format in `LastUpdatedTime` can be compared as a string - more on that later.)
JQ
If you want to do serious local processing of AWS (or any other) JSON output, you need to get familiar with JQ, an awesome tool which lets you not only filter JSON structures like the AWS CLI does, but also restructure them.
Here are a few basic ways to use it. Note that you can either pipe input into JQ or provide a filename which contains your JSON.
| Use Case | Command |
|---|---|
| Simply pretty-print JSON output | `jq '.'` |
| Get all the sort keys from a DynamoDB query response (assuming they are strings) | `jq '.Items[].SortKey.S'` |
| Put the result into a list | `jq '[.Items[].SortKey.S]'` |
| Only print keys, not values | `jq 'keys[]'` |
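Here is a minimal sketch of those filters, using a hypothetical DynamoDB-style response saved to a file (assuming jq is installed):

```shell
# Hypothetical DynamoDB query response saved locally
cat > items.json <<'EOF'
{"Items":[{"SortKey":{"S":"1675071762"}},{"SortKey":{"S":"1675071862"}}]}
EOF

jq '.Items[].SortKey.S' items.json    # one quoted string per line
jq '[.Items[].SortKey.S]' items.json  # the same values as a JSON list
jq '.Items[0] | keys[]' items.json    # keys of the first item, no values
```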
An Important Note
It should be stressed here that QUOTING IS IMPORTANT. You may have noticed above that for the AWS `--query` expressions I used double-quotes to enclose the whole expression, and single-quotes for literals within it. For JQ you use the opposite, as the manual makes clear. At least in the environment I am using (bash), getting this wrong leads to endless miserable debugging.
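As a quick illustration of the JQ convention (a hypothetical stack name, assuming bash and jq):

```shell
# Single-quote the whole jq program so bash leaves it alone; the
# string literal inside the program then uses double-quotes.
echo '{"name":"foo-stack"}' | jq 'select(.name == "foo-stack") | .name'
# Wrapping the program in double-quotes instead would expose it to
# bash expansion and quoting headaches.
```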
Comparing The Two
In general, the `--query` filters using JMESPath are a little more concise than their JQ alternatives, but in my opinion the sequential nature of JQ's pipes (`|`) is more readable than JMESPath. However, there are some other important differences to consider:
| | CLI `--query` expression | JQ expression |
|---|---|---|
| Usage | Only with AWS CLI commands | With any output or file |
| Output | Only outputs filtered results | Can restructure into new JSON |
| Types | Does not handle dates natively | Handles date conversions |
| Scripting | Expression must be entered on the command line | Expression can be stored in a file with comments |
Below are some common queries I've used, with the CLI query and JQ query side by side. You should note that all the JQ expressions are wrapped in `[]`, because by default JQ does not output a list. The AWS CLI query function does output a list, so the additional `[]` makes the outputs match. For these examples, I am using DynamoDB output which looks something like this:
{
"Items": [
{
"PartitionKey": { "S": "blog/books/2023-01-drive-daniel-pink/" },
"SortKey": { "S": "1675071762" },
"SomeField": { "N": "54" },
// Other Fields...
},
{
"PartitionKey": { "S": "blog/books/2023-01-drive-daniel-pink/" },
"SortKey": { "S": "1675071862" },
"SomeField": { "N": "44" },
// Other Fields...
},
// More items...
]
}
| Example | AWS CLI query expression | JQ expression |
|---|---|---|
| Show all sort keys from DynamoDB output | `Items[*].SortKey.S` | `[.Items[].SortKey.S]` |
| Particular attributes from DynamoDB output | `Items[*].[SortKey.S, SomeField.N]` | `[.Items[] \| [.SortKey.S, .SomeField.N]]` |
| Filter by field value | `Items[?SortKey.S>'1674000000'].SomeField.N` | `[.Items[] \| select(.SortKey.S > "1674000000").SomeField.N]` |
| Filter on string prefix | `Items[?starts_with(SortKey.S, 'TEXT')].SomeField.N` | `[.Items[] \| select(.SortKey.S \| startswith("TEXT")).SomeField.N]` |
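To see the filter-by-value row in action, here is a sketch run against a trimmed version of the sample data above (assuming jq is installed):

```shell
# Keep only items whose SortKey is greater than the cutoff (string
# comparison works here because the keys are fixed-width timestamps).
echo '{"Items":[
  {"SortKey":{"S":"1675071762"},"SomeField":{"N":"54"}},
  {"SortKey":{"S":"1675071862"},"SomeField":{"N":"44"}}
]}' | jq '[.Items[] | select(.SortKey.S > "1674000000").SomeField.N]'
```

Both items pass the filter, so this prints a list containing "54" and "44".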
If you want an example not using the data above, here is one you can run on your CloudFormation stacks right now. Each command (one JMESPath and one JQ) will show the last updated time of only your 'dev'-stage stacks:
bash-5.1$ aws cloudformation describe-stacks --query "Stacks[?contains(Tags[], {Key: 'STAGE', Value: 'dev'})].[StackName,LastUpdatedTime]"
bash-5.1$ aws cloudformation describe-stacks | jq '[.Stacks[] | select(.Tags[] | contains({Key: "STAGE", Value: "dev"})) | [.StackName,.LastUpdatedTime]]'
[
[
"my-sls-stack-dev",
"2023-02-03T07:56:49.848Z"
],
[
"my-www-stack-dev",
"2022-12-20T11:07:21.155Z"
]
]
Getting more complex
Let's do some sorting. Yes, both tools can do that, and they have many other functions built in!
| Example | AWS CLI query expression | JQ expression |
|---|---|---|
| Sort output numerically by field values* | `sort_by(Items[*], &to_number(SomeField.N))[*].[SortKey.S, SomeField.N]` | `[.Items[] \| [.SortKey.S, .SomeField.N]] \| sort_by(.[1] \| tonumber)` |
| Sum fields (e.g. get total page access time) | `sum(map(&to_number(ServiceTime.N), Items[*]))` | `[.Items[].ServiceTime.N \| tonumber] \| add` |
| Perform counting, e.g. number of pages accessed by Mozilla | `length(Items[?AgentString && starts_with(AgentString.S, 'Mozilla')])` | `[.Items[] \| select(.AgentString.S \| startswith("Mozilla"))] \| length` |
* Note the expressions convert the fields to numbers here so as to sort numerically rather than textually
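A minimal sketch of the JQ sort and sum expressions, on hypothetical data (assuming jq is installed):

```shell
# Hypothetical items with numeric fields stored as strings
DATA='{"Items":[{"SortKey":{"S":"a"},"ServiceTime":{"N":"30"}},{"SortKey":{"S":"b"},"ServiceTime":{"N":"4"}}]}'

# Numeric sort: "4" sorts before "30" only after tonumber conversion
echo "$DATA" | jq '[.Items[] | [.SortKey.S, .ServiceTime.N]] | sort_by(.[1] | tonumber)'

# Sum the ServiceTime fields
echo "$DATA" | jq '[.Items[].ServiceTime.N | tonumber] | add'
```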
Wrapping Up
This post shows that it is possible to perform some complicated transformations on JSON output data. The commands above, which you can almost call one-liners, can replace a whole JavaScript or Python function and let you perform complicated ad-hoc (and maybe even regular) tasks with much less development overhead.
In fact, because JQ can read your filter expression from a file (which can also contain comments), complex filters can still turn into one-liners, with JQ as your script interpreter. This also means that you can version-control and track your JQ scripts.
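For instance, a minimal sketch with a hypothetical filter file (assuming jq is installed):

```shell
# A commented jq filter stored in its own file ('#' starts a comment)
cat > sortkeys.jq <<'EOF'
# Collect every sort key into a list
[.Items[].SortKey.S]
EOF

# Run the saved filter against some input with -f
echo '{"Items":[{"SortKey":{"S":"1675071762"}}]}' | jq -f sortkeys.jq
```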
The specifications of these tools are quite similar, and both are available in library form in JavaScript, Python, Go and many other languages.
More Resources
Filtering output from the AWS CLI
This post was adapted from a larger post on my blog. See the full post here