grant horwood

uploading to s3 with bash

s3 is handy and useful, but it would be a lot more handy and useful if we had a shell script for pushing things to buckets; a script that didn’t require a special cli tool. something portable that we could add to existing scripts or throw in a cron job or distribute to our friends. a script that we could tell “put this file in this bucket” and it would just do that. that would be a good thing. let’s build that.

note: for the impatient, this is provided as two functions in a gist.

let’s start with curl

the actual uploading part of our script is going to use curl. curl is great stuff; i’m a big fan of curl. here’s a template for a curl call to put a file into an s3 bucket:

curl -s -X PUT -T "<path to your file>" \
-H "Host: <your bucket name>.s3.amazonaws.com" \
-H "Date: <the date>" \
-H "Content-Type: <the file’s mime type>" \
-H "Authorization: AWS <your aws access key id>:<the calculated signature>" \
https://<your bucket name>.s3-<your bucket’s aws location>.amazonaws.com/<the name of the file to create in the bucket>

as far as curl calls go, this is fairly straightforward: four headers, a url and a source file. we’re using the http PUT method here with the -T argument to ‘transfer’ a file.

let’s look at each of those configuration values:

  • <path to your file>: this is the path to the file on your local disk; the one you want to upload to s3.
  • <your bucket name>: the name of your bucket
  • <your bucket’s aws location>: the location of your s3 bucket, ie us-west-2 or similar.
  • <the date>: the current date in the format that date -R outputs. this is the RFC 5322 format. it looks something like “Mon, 22 Apr 2024 09:32:43 -0600”
  • <the file’s mime type>: the mime type of the file to upload, ie text/html.
  • <your aws access key id>: this is the public key part of your amazon credentials. look in your ~/.aws/credentials file for it. if you don’t have credentials yet, you’ll have to go get them.
  • <the name of the file to create in the bucket>: this is the name of the file you want to be created in the bucket.
  • <the calculated signature>: a signature we calculate to prove to amazon that we actually have permission to do this.

we should already have almost all of this data. the exception, of course, is the ‘calculated signature’. that, as the name implies, we need to calculate.
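
to make the template concrete, here’s what a filled-in call might look like. the bucket, region, file and access key id below are all made-up example values, and the signature placeholder is the value we’re about to calculate:

curl -s -X PUT -T "./testfile.txt" \
-H "Host: mybucket.s3.amazonaws.com" \
-H "Date: Mon, 22 Apr 2024 09:32:43 -0600" \
-H "Content-Type: text/plain" \
-H "Authorization: AWS AKIAIOSFODNN7EXAMPLE:<the calculated signature>" \
https://mybucket.s3-us-west-2.amazonaws.com/testfile.txt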

calculating the signature

the official documentation on how to build the calculated signature is terrible and convoluted. which is a shame, since the process is actually moderately elegant.

the purpose of the signature is to prove two things: that we actually have access to the bucket, and that the signature is unique to this curl request and has not been reused. this is done by constructing a string that contains data about the file we’re uploading and the date, and then creating a hash of that string and signing it with our aws secret key.

let’s look at a sample signature calculation:

s3_resource="<the resource of the file to be created>"
content_type="<the mime type of the file>"
date_value="<the date>"
aws_secret_access_key="<your aws secret key>"

string_to_sign="PUT\n\n${content_type}\n${date_value}\n${s3_resource}"
signature=`echo -en "${string_to_sign}" | openssl sha1 -hmac "${aws_secret_access_key}" -binary | base64`

here’s a quick once-over of the configuration values being used:

  • <the resource of the file to be created> the file we are creating in s3 in the format of /<bucket name>/<file name>. note the leading slash; it’s part of the resource.
  • <the mime type of the file> the mime of the file. it must be the same as the Content-Type header in the curl call.
  • <the date> this is the exact same date string in the same format as we used in the Date header of our curl call. it looks something like “Mon, 22 Apr 2024 09:32:43 -0600”
  • <your aws secret key> this is the secret part of your amazon credentials.

once we have all those configuration values, we’re going to assemble them into a string formatted with some unix line endings that looks something like:

PUT

text/html
Mon, 22 Apr 2024 09:32:43 -0600
/mybucket/testfile.txt

we then take that string and use the openssl command to create a sha1 hmac of it, using our aws secret key as the key. the hash output is set to binary and is then converted to text using base64.
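
plugging in the sample values from above, the calculation might look something like this. the secret key here is a made-up placeholder; with real credentials, the last line prints the base64 signature we need:

string_to_sign="PUT\n\ntext/html\nMon, 22 Apr 2024 09:32:43 -0600\n/mybucket/testfile.txt"
signature=`echo -en "${string_to_sign}" | openssl sha1 -hmac "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY" -binary | base64`
echo "$signature"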

when aws gets our curl request, it’s going to reconstruct this hash using the values of the Content-Type and Date headers we sent and the name of the file we’re uploading. since we sent aws our access key id, it can look up our secret key to confirm the signature. if everything matches, aws knows that we have access to the bucket we’re uploading to and that the signature is unique to this request and isn’t something we re-used or just copied off the internet.

once we have our calculated signature, we can send it as the value of our Authorization header in our curl.

putting it all together

once we know how to calculate the signature and write the curl, we can apply this anywhere we want to.

for convenience, this has been written in a gist.

####
# Configuration
#

# S3 bucket and location
bucket="<your bucket name>"
location="<your aws region, ie. us-west-2>"

# AWS credentials
aws_access_key_id="<your aws access key id>"
aws_secret_access_key="<your aws secret key id>"

# Do not edit below.

####
# Calculated values

# file parts
file_path=$1
file_name=`basename "$file_path"`

# Content-Type header for curl
file_mime=`file --mime-type "${file_path}"`
content_type=`echo "$file_mime" | awk -F ": " '{print $2}'`

# Date in format for header and signature
date_value=`date -R`

# Destination file path on s3 bucket
s3_resource="/${bucket}/`basename \"$file_path\"`"

#### FUNCTION BEGIN
# Build AWS signature for api call
# GLOBALS: 
#   -
# ARGUMENTS: 
#   s3_resource
#   content_type
#   date_value
#   aws_secret_access_key
# OUTPUTS: 
#   null
# RETURN: 
#   String. The signature
### FUNCTION END
function build_sig() {
    s3_resource=$1
    content_type=$2
    date_value=$3
    aws_secret_access_key=$4

    string_to_sign="PUT\n\n${content_type}\n${date_value}\n${s3_resource}"
    signature=`echo -en "${string_to_sign}" | openssl sha1 -hmac "${aws_secret_access_key}" -binary | base64`

    echo "$signature"
}

#### FUNCTION BEGIN
# PUT file to S3 bucket
# GLOBALS: 
#   -
# ARGUMENTS: 
#   file_path
#   bucket
#   location
#   date_value
#   content_type
#   aws_access_key_id
#   signature
# OUTPUTS: 
#   null
# RETURN: 
#   void
### FUNCTION END
function put_s3() {
    file_path=$1
    bucket=$2
    location=$3
    date_value=$4
    content_type=$5
    aws_access_key_id=$6
    signature=$7
    file_name=`basename "$file_path"`

    curl -s -X PUT -T "${file_path}" \
      -H "Host: ${bucket}.s3.amazonaws.com" \
      -H "Date: ${date_value}" \
      -H "Content-Type: ${content_type}" \
      -H "Authorization: AWS ${aws_access_key_id}:${signature}" \
      https://${bucket}.s3-${location}.amazonaws.com/${file_name}
}

# entry point

# Build signature for this API call
signature=`build_sig "$s3_resource" "$content_type" "$date_value" "$aws_secret_access_key"`

# PUT the file to the s3 bucket
put_s3 "$file_path" "$bucket" "$location" "$date_value" "$content_type" "$aws_access_key_id" "$signature"

exit 0
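to use it, fill in the configuration block at the top, then call the script with the path of the file you want to upload as its only argument. the script name here is just an example:

bash s3put.sh /path/to/testfile.txt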

🔎 this post was originally written in the grant horwood technical blog
