You: are a data scientist harvesting GitHub visitor traffic stats for analysis.
I: am thirsty and insecure.
We are both interested in tallying GitHub's unique-visitor logs for all of our repos.
Collecting traffic data
GitHub offers REST endpoints to query referrers, most-popular files, views (total or unique) and clone traffic, and most well-appointed Linux hosts offer the `curl` and `jq` utilities for polling and parsing the returned JSON.
#!/bin/bash
BUFF=$(mktemp)
# Query the GitHub `user` endpoint to get a list of the user's repos.
# This is neater than hard-coding a username or URL into this script.
#
MYREPOS=$(curl -sn https://api.github.com/user | jq -r .repos_url)
# Query $MYREPOS for a list of repository URLs and mutate them into
# `https://api.github.com/repos/$user/$repo/traffic/views`
#
curl -sn "$MYREPOS" | jq -r '.[].url' | while read -r URL
do
    # Query the repo's `views` URL and parse returned JSON
    # for unique-visitor counts by date
    curl -sn "$URL/traffic/views" | jq -c '.views[]' | while read -r view
    do
        VIEWDATE=$(echo "$view" | jq -r .timestamp | awk -F"T" '{ print $1 }')
        VIEWERS=$(echo "$view" | jq .uniques)
        echo "$VIEWDATE, $VIEWERS, $URL"
    done
done > "$BUFF"
# Now we should have a temp file that looks like:
#
# 2019-05-11, 1, https://api.github.com/repos/lbonanomi/go
# 2019-05-12, 1, https://api.github.com/repos/lbonanomi/go
# 2019-05-13, 1, https://api.github.com/repos/lbonanomi/go
#
# Let's leverage `seq` and GNU `date` to make a list of dates for the
# last 10 days and build a CSV
seq 0 9 | tac | while read -r SINCE
do
    DAY=$(date -d "$SINCE days ago" +%Y-%m-%d)
    # datestamp
    printf '%s\t' "$DAY"
    # If there's no-data report a '0'
    grep -q "$DAY" "$BUFF" || printf "0"
    # Normally grep-before-awk makes me psychotic, but I think this is clearer.
    grep "$DAY" "$BUFF" | awk -F"," '{t=t+$2} END{print t}'
done
# Give a hoot and delete temp files when they're done.
rm $BUFF
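If the grep-before-awk step bothers you, the per-date tally can also be done in a single `awk` pass. This is only a sketch of an alternative, not what the script above does: `tally_uniques` is a hypothetical helper name, the dates are passed in by hand rather than generated with `seq` and `date`, and the sample rows are made up to match the temp-file format shown in the comments.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): sum unique-visitor
# counts per date from lines shaped like "2019-05-11, 1, https://...".
tally_uniques() {
    # $1: file in "date, uniques, url" format; remaining args: dates to report
    local file=$1
    shift
    awk -F', ' -v dates="$*" '
        { total[$1] += $2 }                   # accumulate uniques per date
        END {
            n = split(dates, want, " ")
            for (i = 1; i <= n; i++)          # dates with no rows report 0
                printf "%s\t%d\n", want[i], total[want[i]]
        }
    ' "$file"
}

# Sample data standing in for the temp file (repo names and counts are made up).
SAMPLE=$(mktemp)
cat > "$SAMPLE" <<'EOF'
2019-05-11, 1, https://api.github.com/repos/example/alpha
2019-05-11, 2, https://api.github.com/repos/example/beta
2019-05-13, 4, https://api.github.com/repos/example/alpha
EOF

tally_uniques "$SAMPLE" 2019-05-11 2019-05-12 2019-05-13
rm "$SAMPLE"
```

Because missing dates fall out of the array lookup as zero, the separate "no data" check is not needed in this variant.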
Running our script should give us a TSV-formatted count of unique visitors to our personal GitHub presence over the last 10 days:
2019-05-16 5
2019-05-17 9
2019-05-18 1
2019-05-19 2
2019-05-20 0
2019-05-21 2
2019-05-22 2
2019-05-23 8
2019-05-24 3
2019-05-25 4
Wouldn't this look handsome as a spark graph or a line graph?
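For a quick spark graph without leaving the terminal, something like the following might do. It is a rough sketch: `sparkline` is a hypothetical helper name, it assumes a UTF-8 terminal, and it scales every count against the largest one in the input, so it shows shape rather than absolute numbers.

```shell
#!/bin/bash
# Hypothetical helper: turn the second column of "date count" lines
# into a one-line sparkline scaled to the largest count.
sparkline() {
    awk '
        BEGIN { split("▁ ▂ ▃ ▄ ▅ ▆ ▇ █", bar, " ") }
        { v[NR] = $2; if ($2 > max) max = $2 }
        END {
            for (i = 1; i <= NR; i++) {
                # Map 0..max onto bar[1]..bar[8]; an all-zero series stays flat.
                idx = (max > 0) ? int(v[i] * 7 / max) + 1 : 1
                printf "%s", bar[idx]
            }
            printf "\n"
        }
    '
}

# Feed it the sample output shown above.
sparkline <<'EOF'
2019-05-16 5
2019-05-17 9
2019-05-18 1
2019-05-19 2
2019-05-20 0
2019-05-21 2
2019-05-22 2
2019-05-23 8
2019-05-24 3
2019-05-25 4
EOF
```

In practice you would pipe the script's own output into `sparkline` instead of a here-document; `awk` splits on tabs and spaces alike, so the TSV goes in unchanged.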