I write a lot of blog posts on dev.to and really like using Markdown for blogging. But one thing that would make the blogging process even more convenient is having all the image links I've made in previous dev.to posts in one place, so I can reuse images.
So in this tutorial, I will show how we can use Ruby packages to make a CSV file of every image link in your blog posts from the command line, using the dev.to/Forem API.
We will:
- Send an authenticated request to the dev.to API with the standard library net/http package and the Forem API
- Deserialize JSON into objects of a custom class with the standard library json package
- Retrieve the image links using the commonmarker gem
- Serialize objects into CSV files with the standard library csv package
This tutorial assumes you're familiar with JSON, HTTP requests, and the basics of Ruby.
Contacting the dev.to API
The first thing you need in order to talk to a web API is an API client. An API client is some code that sends authenticated HTTP requests to the web API. We can get a client in one of two ways:
- Search for a Ruby Gem of a client for the dev.to/Forem APIs; a lot of people have made code repositories for talking to websites' APIs so that other people can use them.
- Make our own API client from scratch
Since we're only talking to one HTTP endpoint on dev.to's API in this tutorial, we'll go with the second option. So let's jump to dev.to's API documentation!
The endpoint we're talking to is the User's Articles endpoint, which lists all articles belonging to the user calling that endpoint as JSON. Looking at the request and response samples on the right, we will find that:
- The request is a GET request to the endpoint /articles/me
- The response is an array of JSON objects
- Each JSON object in the array contains many fields for all the data representing the article, namely title for the title of the article, and body_markdown for the Markdown of the entire text of the article
- The sample cURL request contacting that endpoint contains the header api-key, which tells dev.to who you are and gives you access to your own account's data.
So we need a client class that can:
- Send an HTTP request to the dev.to API's /articles/me endpoint, with an API key as authentication
- Deserialize the JSON response to Ruby objects, giving us access to the fields we are interested in.
If you're following along, make a folder for your Ruby app and write this code to app.rb in that folder:
require 'net/http'
require 'json'

class DevToClient
  def initialize(api_key)
    @api_key = api_key
  end

  def get_my_articles
    # [TODO] Send HTTP request to /articles/me and deserialize
    # the JSON response
  end
end
We have a DevToClient class with two methods:
- initialize as its constructor. In the constructor, we pass in our API key, and it gets stored in the instance variable @api_key
- get_my_articles, which will send an HTTP request to dev.to's /articles/me endpoint that uses the @api_key and deserialize the response
Now for getting the HTTP response. Looking at the documentation for net/http, there is the method HTTP.get_response for sending a GET request to the URL passed in and getting back an HTTP response.
From the get_response method's documentation, in addition to a URL, we can optionally pass in a hash of request headers (this second argument is available in Ruby 3.0 and later). So we can pass in an api-key header with this code:
def get_my_articles
  Net::HTTP.get_response(
    URI('https://dev.to/api/articles/me'),
    { 'api-key': @api_key }
  )
end
We send the request to dev.to's /api/articles/me endpoint with an api-key header containing the value of the DevToClient's @api_key instance variable.
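If you'd like to see what that request looks like before anything goes over the network, here is a minimal sketch that builds the same GET request by hand with Net::HTTP::Get and sets the header on it. The key value 'placeholder-key' is made up for illustration, and nothing is actually sent:

```ruby
require 'net/http'

# Build (but do not send) a GET request for the "user's articles" endpoint.
uri = URI('https://dev.to/api/articles/me')
request = Net::HTTP::Get.new(uri)

# 'placeholder-key' is a made-up value for illustration, not a real API key.
request['api-key'] = 'placeholder-key'

# Inspect what would be sent.
puts request.method
puts request.path
puts request['api-key']
```

Building the request object yourself is also handy if you later want more control than get_response gives you, but for this tutorial the one-liner is enough.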
To run this, you are going to need to get your own API key for the dev.to API. To do that, first, make a dev.to account if you don't have one already. Then, you can get your API key by following the Authentication instructions on the dev.to API docs.
⚠️WARNING!⚠️ For any web API you are working with, DO NOT share your API key or other forms of authentication with anyone; don't post it online or email it to your friends, and also don't commit it in your code! If someone else gets ahold of your API key, they will be able to impersonate you on that API and access your account data! If you suspect that someone has obtained one of your authentication keys or secrets, you should have that key/secret invalidated and then a fresh API key/secret created to protect your account.
Now, save your API key to the environment variable DEVTO_API_KEY. Then, at the bottom of app.rb, add this code:
api_key = ENV['DEVTO_API_KEY']
puts DevToClient.new(api_key).get_my_articles
Run the code with ruby app.rb and you should see terminal output like this:
#<Net::HTTPOK:0x00007fc20a92f800>
We got back a response of class HTTPOK (which inherits from HTTPSuccess, which in turn inherits from HTTPResponse). Now that we have our response, let's parse it to make Ruby objects for each article.
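One small optional guard: ENV['DEVTO_API_KEY'] quietly returns nil if the variable isn't set, and the request then fails with a less obvious error. Here is a sketch of a helper that fails fast instead. The name read_api_key is my own invention for this example, and it takes the environment as a parameter only so it can be demonstrated with plain hashes:

```ruby
# Hypothetical helper (not part of the tutorial's app.rb): read the API key,
# or stop right away with a clear message if it is missing.
def read_api_key(env = ENV)
  env.fetch('DEVTO_API_KEY') do
    raise KeyError, 'DEVTO_API_KEY is not set; export it before running app.rb'
  end
end

# Demonstrated with plain hashes standing in for the real environment.
puts read_api_key({ 'DEVTO_API_KEY' => 'placeholder-key' })

begin
  read_api_key({})
rescue KeyError => e
  puts e.message
end
```

In app.rb you would call read_api_key with no arguments so it reads from the real ENV.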
JSON parsing in Ruby
In addition to net/http, the Ruby standard library has a json package for serializing and deserializing JSON, and inside that package there is the method JSON.parse. So if we wrote Ruby code like
sloth_json = <<EOF
{
"sci_name": "Bradypus",
"common_name": "Three-toed sloth",
"claw_count": 3
}
EOF
sloth = JSON.parse(sloth_json)
sci_name = sloth["sci_name"]
common_name = sloth["common_name"]
claw_count = sloth["claw_count"]
puts "The #{sci_name} (#{common_name}) has #{claw_count} claws"
and then ran ruby app.rb, we would get output like this:
The Bradypus (Three-toed sloth) has 3 claws
The object returned from JSON.parse is a Ruby hash, with the JSON object's field names becoming the hash's keys as strings.
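Since the keys come back as strings by default, looking one up with a symbol returns nil. If you prefer symbol keys, JSON.parse takes a symbolize_names option. This is a small side-by-side sketch using a trimmed-down version of the sloth JSON; the rest of this tutorial sticks with string keys:

```ruby
require 'json'

sloth_json = '{"sci_name": "Bradypus", "claw_count": 3}'

# Default behavior: keys are strings.
string_keyed = JSON.parse(sloth_json)

# With symbolize_names: true, keys are symbols instead.
symbol_keyed = JSON.parse(sloth_json, symbolize_names: true)

p string_keyed.keys
p symbol_keyed.keys

# A symbol lookup on the string-keyed hash finds nothing.
p string_keyed[:sci_name]
```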
Let's use JSON.parse to return a Ruby object from DevToClient#get_my_articles:
def get_my_articles
  res = Net::HTTP.get_response(
    URI('https://dev.to/api/articles/me'),
    { 'api-key': @api_key }
  )

  if res.code.to_i > 299 || res.code.to_i < 200
    raise "got status code #{res.code}"
  end

  JSON.parse res.body
end
Now, if we got a status code besides 2xx, we raise an error. Otherwise, we return the result of parsing the response body.
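As an alternative to comparing status code numbers, you can lean on the response class hierarchy, since every 2xx response class inherits from Net::HTTPSuccess. Here is a sketch; the helper name success? is made up for this example, and the responses are constructed by hand so nothing is sent over the network:

```ruby
require 'net/http'

# Hypothetical helper: true for any 2xx response, false otherwise.
def success?(res)
  res.is_a?(Net::HTTPSuccess)
end

# Hand-built responses standing in for real ones (HTTP version, code, message).
ok = Net::HTTPOK.new('1.1', '200', 'OK')
missing = Net::HTTPNotFound.new('1.1', '404', 'Not Found')

puts success?(ok)
puts success?(missing)
```

Either check works for this script; the numeric comparison in get_my_articles is just as valid.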
If you then run code like
puts DevToClient.new(api_key).get_my_articles
you will see that the response body was deserialized to an array of Ruby hashes. So we could do something like this:
DevToClient.new(api_key).get_my_articles.each do |article|
  md = article["body_markdown"]
  # now use a Markdown parser to find every image link in
  # the article
end
But what if we wanted a DevToArticle class that handles digging for all the image links, and we wanted to deserialize our JSON to an array of DevToArticles instead of hashes?
Let's start by making a DevToArticle class:
class DevToArticle
  attr_accessor :id, :title, :body_markdown, :url

  def initialize
    # [TODO] Add deserialization logic here
  end

  def get_article_images
    # [TODO] Add Markdown parsing for the article's
    # @body_markdown
  end
end
Since Ruby doesn't know that the JSON it's getting is supposed to be a DevToArticle, calling JSON.parse will return an array of Ruby hashes. So we will need just a bit of extra logic for converting those hashes to DevToArticles.
I wasn't sure how to do this at first; in Go, the main programming language I work with, I would be doing this using code like this:
type DevToArticle struct {
    ID           int    `json:"id"`
    Title        string `json:"title"`
    BodyMarkdown string `json:"body_markdown"`
    URL          string `json:"url"`
}

func (d *DevToClient) GetMyArticles() ([]DevToArticle, error) {
    // get the HTTP response for the "user's articles" API
    // endpoint here, as res

    var articles []DevToArticle
    // decode from the response body, which is what implements io.Reader
    if err := json.NewDecoder(res.Body).Decode(&articles); err != nil {
        return nil, err
    }
    return articles, nil
}
I searched for how to deserialize to a custom class rather than a hash, and after I asked about that on Twitter, Jamie Gaskins told me that there isn't really a standardized way in Ruby to deserialize to a class, but you can give your Ruby class an initialize method that takes in a hash. So based on that advice, in our DevToArticle#initialize method, the deserialization logic would look like this:
def initialize(attributes)
  @id = attributes['id']
  @title = attributes['title']
  @body_markdown = attributes['body_markdown']
  @url = attributes['url']
end
For each field we want an instance variable for, we just pull it out of the attributes hash passed in.
Note, by the way, that this also gives us control of the casing scheme for the deserialized objects. In Ruby, the standard casing for instance variables is snake_case, and that's the casing the Forem API uses, but what if Forem were a camelCase API instead? @body_markdown can still be snake_case even if bodyMarkdown in the hash is camelCase:
@body_markdown = attributes['bodyMarkdown']
Now, to have DevToClient#get_my_articles return an array of DevToArticles instead of an array of hashes, we can do this:
articles = JSON.parse(res.body)
articles.map { |article| DevToArticle.new article }
By passing that block into articles.map, we get back an array of DevToArticles created from each hash in the articles array, so now get_my_articles returns the type we want: an array of DevToArticles. Now let's dig into the Markdown of those articles to find all their image links!
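To try the deserialization step without hitting the API, here is a self-contained sketch that runs the same JSON.parse and map logic over a made-up response body. The sample articles and their example URLs are invented for illustration; a real response contains many more fields:

```ruby
require 'json'

class DevToArticle
  attr_accessor :id, :title, :body_markdown, :url

  # Pull the fields we care about out of the parsed hash.
  def initialize(attributes)
    @id = attributes['id']
    @title = attributes['title']
    @body_markdown = attributes['body_markdown']
    @url = attributes['url']
  end
end

# Made-up sample payload standing in for res.body.
sample_body = <<~JSON
  [
    {"id": 1, "title": "First post", "body_markdown": "Hello!", "url": "https://dev.to/example/first-post"},
    {"id": 2, "title": "Second post", "body_markdown": "Hi again!", "url": "https://dev.to/example/second-post"}
  ]
JSON

# Parse to an array of hashes, then convert each hash to a DevToArticle.
articles = JSON.parse(sample_body).map { |attributes| DevToArticle.new(attributes) }

articles.each { |article| puts "#{article.id}: #{article.title}" }
```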
Markdown parsing with commonmarker
Unlike for HTTP and JSON, the Ruby standard library doesn't have a package for Markdown, so we can either write our own Markdown parser, or use a Markdown-parsing Ruby Gem.
And it turns out that there's a popular Ruby Gem that lets us parse a Markdown file and then walk over its nodes (nodes as in text, links, images, etc): CommonMarker! To get it, first run bundle init to create a Gemfile, then in the Gemfile, add the line
gem "commonmarker", "~> 0.23"
The version constraint is there because this tutorial uses the CommonMarker.render_doc API from the 0.x releases of the gem; the 1.x releases changed the API. Then run bundle install. If CommonMarker installs successfully, you should be able to use it in your Ruby code.
To start, add require 'commonmarker' to the top of app.rb, then in DevToArticle#get_article_images, add this code:
def get_article_images
  doc = CommonMarker.render_doc(@body_markdown, :DEFAULT)
  puts doc
end
If you call that method on an article in app.rb, you will get output like:
#<CommonMarker::Node:0x00007fca8718f228>
indicating that we were able to parse the Markdown in @body_markdown, converting it to a CommonMarker::Node object.
Following this example in the CommonMarker documentation, we can walk over all the nodes in the document with code like this:
def get_article_images
  doc = CommonMarker.render_doc(@body_markdown, :DEFAULT)
  doc.walk do |node|
    puts node.type
  end
end
Now in the do block, we are looking at each Node and seeing what type of node it is. So if you run this code, you might see output like this:
text
code
text
paragraph
image
text
text
paragraph
We're only interested in image nodes, so we'll add an if statement to check the node's type, which according to the CommonMarker documentation is the Ruby symbol :image.
doc.walk do |node|
  if node.type == :image
    # [TODO] retrieve the node's content
  end
end
Now we're only processing image links. And a Markdown image link has two parts: descriptive alt text, which screen reader software reads when viewing images, and the URL of the image. So we need to find ways to get both of those.
Looking at the CommonMarker documentation, the Node method for getting the alt text is to_plaintext, and the Node method for getting the URL of the image is url. So now, we can return the parts of each image link:
def get_article_images
  doc = CommonMarker.render_doc(@body_markdown, :DEFAULT)
  image_links = []

  doc.walk do |node|
    if node.type == :image
      image_links.push [
        node.to_plaintext.delete_suffix("\n"), node.url
      ]
    end
  end

  image_links
end
So now, we have all the data we'll need for serializing to a CSV file!
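If you want to sanity-check the shape of that return value without installing the gem, here is a rough standard-library-only sketch that pulls [alt text, URL] pairs out of a Markdown string with a regular expression. To be clear, this is a different technique from the CommonMarker approach above, not a replacement for it: it only handles the simple inline image form, and it will be fooled by things like image syntax inside code blocks or URLs containing parentheses. The method name and sample Markdown are made up for illustration:

```ruby
# Rough regex for inline images of the form ![alt](url) or ![alt](url "title").
# Capture group 1 is the alt text, capture group 2 is the URL.
INLINE_IMAGE = /!\[([^\]]*)\]\(([^)\s]+)[^)]*\)/

# Hypothetical helper: returns an array of [alt_text, url] pairs.
def rough_image_links(markdown)
  markdown.scan(INLINE_IMAGE)
end

sample_markdown = <<~MD
  Here is a sloth: ![A sloth hanging from a branch](https://example.com/sloth.png)

  And one with a title: ![Sleepy sloth](https://example.com/sleepy.jpg "Zzz")
MD

rough_image_links(sample_markdown).each do |alt, url|
  puts "#{alt} -> #{url}"
end
```

A real parser like CommonMarker understands the document's structure, which is why the tutorial uses it for the actual script.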
Serializing your image links to a CSV file
The Ruby standard library also comes with a csv package for parsing CSV files, or generating them from arrays of data. Each row of our CSV will be one image link, including:
- The alt text of the image link
- The URL of the image
- The ID of the article that the image link came from
- The title of the article that the image link came from
- The URL of the article that the image link came from
So we will want CSV header text like:
Alt Text,Image URL,Article ID,Article Title,Article URL
And for each row in the CSV, we will want a DevToImageLink Ruby class to represent all the data in that row:
class DevToImageLink
  def initialize(article, image_alt, image_url)
    @article = article
    @image_alt = image_alt
    @image_url = image_url
  end

  def to_csv_row
    [@image_alt, @image_url, @article.id, @article.title, @article.url]
  end
end
In the initialize method we pass in the DevToArticle for the image link, and the alt text and URL of the image, to become instance variables. And in the to_csv_row method, all of these fields are put into a Ruby array.
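Here is a quick way to see what to_csv_row produces on its own. Since DevToImageLink only calls id, title, and url on the article, this sketch uses a small Struct as a stand-in for a DevToArticle; the ArticleStub name and the sample values are made up for illustration:

```ruby
class DevToImageLink
  def initialize(article, image_alt, image_url)
    @article = article
    @image_alt = image_alt
    @image_url = image_url
  end

  # Column order matches the CSV header text above.
  def to_csv_row
    [@image_alt, @image_url, @article.id, @article.title, @article.url]
  end
end

# Hypothetical stand-in for DevToArticle with just the fields the row needs.
ArticleStub = Struct.new(:id, :title, :url)

article = ArticleStub.new(42, 'All about sloths', 'https://dev.to/example/all-about-sloths')
link = DevToImageLink.new(article, 'A sloth', 'https://example.com/sloth.png')

row = link.to_csv_row
p row
```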
Heading back to the DevToArticle class, now that we have the DevToImageLink class defined, let's have DevToArticle#get_article_images return an array of DevToImageLinks, rather than an array of arrays:
  if node.type == :image
-   image_links.push [
-     node.to_plaintext.delete_suffix("\n"), node.url
-   ]
+   image_alt = node.to_plaintext.delete_suffix("\n")
+   image_url = node.url
+   image_links.push DevToImageLink.new(self, image_alt, image_url)
  end
Now that that's all set, add require 'csv' to the top of app.rb, and then let's add a top-level get_image_links_csv function that will convert our articles' image links to a CSV.
def get_image_links_csv(api_key)
  CSV.generate do |csv|
    csv << [
      'Alt Text','Image URL','Article ID','Article Title','Article URL'
    ]

    DevToClient.new(api_key).get_my_articles.each do |article|
      article.get_article_images.each do |image_link|
        csv << image_link.to_csv_row
      end
    end
  end
end
The method CSV.generate takes in a block and returns the CSV string generated in that block.
In the first line inside the block, we pass in our CSV headers with the CSV's << method, so they will serve as the first line of the CSV.
Next, we loop over the image links in each of the articles returned by DevToClient#get_my_articles. For each image link, we call the DevToImageLink#to_csv_row method, and then append the returned array to the CSV.
Finally, the return value of CSV.generate is a string in CSV format. So now, we can use that code like this:
puts get_image_links_csv(api_key)
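One nice thing about going through the csv package instead of joining strings with commas yourself is that it takes care of quoting. Here is a small offline sketch with a made-up row whose article title contains a comma; it generates the CSV the same way as get_image_links_csv and then reads it back with CSV.parse to show that the title survives the round trip:

```ruby
require 'csv'

# Made-up row for illustration; note the comma inside the article title.
rows = [
  ['A sloth', 'https://example.com/sloth.png', 42,
   'Sloths, explained', 'https://dev.to/example/sloths-explained']
]

csv_text = CSV.generate do |csv|
  csv << ['Alt Text', 'Image URL', 'Article ID', 'Article Title', 'Article URL']
  rows.each { |row| csv << row }
end

puts csv_text

# Read it back, treating the first line as headers.
table = CSV.parse(csv_text, headers: true)
puts table.first['Article Title']
```

If you'd rather have a file than terminal output, you can redirect the script's output, for example ruby app.rb > image_links.csv (the file name is just an example).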
Using three standard library packages and a gem, we were able to make a convenient script for getting all our dev.to image links and converting them to a CSV. In my next Ruby tutorial, I will be looking at using a gem for giving this script a better user experience so it's easier to search the CSV for the image you want.