Let's be lazy:
Here's a repo with a basic setup that maps from wpgraphql to json-linked-data in the schema.org vocabulary through a type-safe function. If that sounds complicated, here's the long explanation:
Intro to schema.org
To be more easily discoverable by search engines, websites can add structured data to their content. This is how Google finds a site's own Ratings:
But it is also used to make the Web more interactive. Emails can include data about possible Actions, which Gmail turns into this "View Pull Request" button here, so you don't have to open the email, read it and click a link:
The examples above are using a common, open vocabulary, schema.org, so other email-services could understand the structured data in github's email as well. More importantly, we can add data structured in json-ld and other formats to our own web pages, and for example make it easier for people to find an Event we're hosting close to them:
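As a rough illustration of what such structured data looks like in code, here is a minimal sketch that builds a schema.org Event as a plain object. The `EventInput` shape, the function name and all example values are made up for this sketch; only the `@context`, `@type` and property names come from the schema.org vocabulary.

```typescript
// Hypothetical input shape; every value used below is invented for illustration.
interface EventInput {
  name: string;
  startDate: string; // ISO 8601 date-time
  city: string;
}

// Builds a plain object that follows the schema.org Event type.
function buildEventJsonLd(input: EventInput): Record<string, unknown> {
  return {
    "@context": "https://schema.org",
    "@type": "Event",
    name: input.name,
    startDate: input.startDate,
    location: {
      "@type": "Place",
      address: {
        "@type": "PostalAddress",
        addressLocality: input.city,
      },
    },
  };
}

const exampleEvent = buildEventJsonLd({
  name: "TypeScript Meetup",
  startDate: "2030-01-15T18:30:00+01:00",
  city: "Berlin",
});

// The serialized object is what would go into a json-ld script tag.
console.log(JSON.stringify(exampleEvent, null, 2));
```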
I don't know how many readers are hosting events, but I guess it would be easier to let Event hosts fill in the data dynamically, for example with WordPress.
Isn't there a plugin for it?
The short answer is: yes, many. But this article is for typescript developers who work with GraphQL, so we'll only use WordPress as a headless Content Management System (CMS). This article doesn't include a single line of PHP; instead we'll generate typescript from a GraphQL schema. This approach also works without WordPress, but it's easier to start with it.
From WordPress to GraphQL endpoint
The WPGraphQL plugin can be installed quickly from the admin panel of your WordPress installation.
You don't have to leave the admin panel to play around with queries either, because you get a GraphQL IDE right inside WordPress:
That query retrieves all the posts in a given category, a short excerpt and a featured image - ideal to create a preview for a blog post.
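Outside of the IDE, a query like this is sent as a plain HTTP POST request. The following is a minimal sketch under assumptions: the query text is modeled on the description above, and `buildGraphQLRequest` is a hypothetical helper, not part of WPGraphQL. It only builds the request options, so nothing is sent over the network.

```typescript
// Assumed query: posts of one category with title, excerpt and featured image.
const postsByCategoryQuery = `
  query PostsByCategory($category: String) {
    posts(where: { categoryName: $category }) {
      edges {
        node {
          title
          excerpt
          featuredImage {
            node {
              sourceUrl
            }
          }
        }
      }
    }
  }
`;

// Hypothetical helper: builds the options object for fetch().
function buildGraphQLRequest(
  query: string,
  variables: Record<string, unknown> = {}
): { method: string; headers: Record<string, string>; body: string } {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    // GraphQL over HTTP expects the query and its variables in a JSON body
    body: JSON.stringify({ query, variables }),
  };
}

const request = buildGraphQLRequest(postsByCategoryQuery, {
  category: "my-side-projects",
});

// Usage (replace the endpoint with your own):
// const response = await fetch("https://blog.example.com/graphql", request);
```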
From GraphQL endpoint to GraphQL schema
Now this is security-relevant: a graphql API can tell a client about its structure through a so-called introspection query. If you don't already have access to the graphql schema, you can generate it with such a query. An attacker could do the same and use the result to look for vulnerabilities in your API, which is why it's better to have introspection switched off in production. WPGraphQL has it switched off by default, so you need to go into the settings to switch it on:
While it's switched on, we can use standard functionality from the graphql package and node.js to generate the schema and save it locally.
import fs from "fs";
import fetch from "node-fetch";
import {
  getIntrospectionQuery,
  printSchema,
  buildClientSchema,
} from "graphql";

/**
 * runs an introspection query on an endpoint and retrieves its result
 * thanks to this gist:
 * https://gist.github.com/craigbeck/b90915d49fda19d5b2b17ead14dcd6da
 */
async function main() {
  const introspectionQuery = getIntrospectionQuery();
  const response = await fetch("https://blog.example.com/graphql", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ query: introspectionQuery }),
  });
  // the introspection result is in the "data" field of the response
  const { data } = await response.json();
  // build a schema object from the result and print it as schema definition language
  const schema = buildClientSchema(data);
  const outputFile = "./wpgraphql-schema.gql";
  fs.writeFileSync(outputFile, printSchema(schema));
}

// report failures (e.g. an unreachable endpoint) instead of an unhandled rejection
main().catch((error) => {
  console.error(error);
  process.exit(1);
});
Replace blog.example.com with your domain or localhost, wherever your graphql endpoint can be found. To be able to run that script, you need to set "type": "module" in your package.json (because it uses ES module syntax).
And once you have the wpgraphql-schema.gql file, please, switch off introspection.
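If introspection is switched off, or the endpoint is wrong, the response will not contain usable data, and a GraphQL server typically reports the problem in an `errors` array. A minimal sketch of a check that could run before building the schema is shown here; `GraphQLResponse` and `unwrapGraphQLData` are hypothetical names for this sketch, not part of the graphql package.

```typescript
// Shape of a GraphQL response body as described by the GraphQL specification.
interface GraphQLResponse<T> {
  data?: T | null;
  errors?: { message: string }[];
}

// Hypothetical helper: returns the data or throws a descriptive error.
function unwrapGraphQLData<T>(status: number, result: GraphQLResponse<T>): T {
  if (status < 200 || status >= 300) {
    throw new Error(`Request failed with HTTP status ${status}`);
  }
  if (result.errors && result.errors.length > 0) {
    // collect all server-side messages into one error
    throw new Error(result.errors.map((e) => e.message).join("; "));
  }
  if (result.data == null) {
    throw new Error("Response contains no data");
  }
  return result.data;
}

// Example with a made-up successful response:
const okData = unwrapGraphQLData(200, { data: { __schema: {} } });
console.log(Object.keys(okData));
```

In the script above, this would wrap the parsed result of `response.json()` together with `response.status`.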
From GraphQL schema to typescript
With the wpgraphql-schema.gql file we can already create typescript types. For that to work, we'll add GraphQL Code Generator (graphql-codegen) to our project:
yarn add @graphql-codegen/cli @graphql-codegen/typescript @graphql-codegen/typescript-operations -D
The generator needs a little bit of configuration to run. This is done in the codegen.yml file:
overwrite: true
schema: "./wpgraphql-schema.gql"
documents: "src/**/*.graphql"
generates:
  src/generated/wp-graphql.ts:
    plugins:
      - "typescript"
      - "typescript-operations"
The first line tells it to overwrite existing .ts-files, the second one points to the graphql schema we just got from the endpoint, and the third is the glob-pattern or file where our queries, mutations, fragments and subscriptions can be found. If we don't have any "documents", we can still run the generator with:
yarn graphql-codegen --config codegen.yml
and get types for our graphql schema. Generating types for the whole graphql schema is configured with the - "typescript" entry under plugins. Generating types for queries etc. is configured with the - "typescript-operations" entry, which adds types for all "documents" it finds to our .ts-file. You could also generate separate .ts-files, for example to split server-features from client-features like queries and fragments. This is up to you.
We could already use the wpgraphql Post type to create a mapper-function to schema.org/BlogPosting. However, that type is pretty large and we might not always query all fields for a given blog post.
As you can see, we don't need to write documentation on the types from wpgraphql, it gets the documentation from the graphql schema.
To simplify our queries, we can create a graphql fragment like so:
fragment PostPreview on Post {
  title
  excerpt
  featuredImage {
    node {
      sourceUrl
      altText
      description(format: RAW)
      srcSet(size: THUMBNAIL)
    }
  }
  slug
}
and now any time we want to query these exact same fields we can just reference it in a query:
query wpPostPreviewByCategory {
  posts(where: { categoryName: "my-side-projects" }) {
    edges {
      node {
        ...PostPreview
      }
    }
  }
}
That query gets us the posts from a particular category, but only the fields from the fragment for each post. This is great for reusability in graphql-queries, and graphql-codegen generates the type PostPreviewFragment for us, so we can create a mapper-function that consistently works across queries.
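To get a feeling for what the generator produces, here is a hand-written approximation of such a fragment type. It is an assumption for illustration only: the real output in src/generated/wp-graphql.ts will differ in details. The important part is that fields which are nullable in the schema end up as optional, nullable properties.

```typescript
// Hand-written approximation, NOT actual generator output.
type Maybe<T> = T | null;

type PostPreviewFragment = {
  __typename?: "Post";
  title?: Maybe<string>;
  excerpt?: Maybe<string>;
  slug?: Maybe<string>;
  featuredImage?: Maybe<{
    node?: Maybe<{
      sourceUrl?: Maybe<string>;
      altText?: Maybe<string>;
      description?: Maybe<string>;
      srcSet?: Maybe<string>;
    }>;
  }>;
};

// A post without a featured image is a perfectly valid value of this type.
const withoutImage: PostPreviewFragment = {
  title: "Hello",
  slug: "hello",
  featuredImage: null,
};

// Every access has to account for null and undefined.
const displayTitle: string = withoutImage.title ?? "(untitled)";
const imageUrl = withoutImage.featuredImage?.node?.sourceUrl;
console.log(displayTitle, imageUrl);
```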
We can now begin with the mapper-function:
import { PostPreviewFragment } from "./../generated/wp-graphql";
//other imports
export function mapWpPostPreviewToSchemaBlogPost(
  input: PostPreviewFragment,
  wpBaseURL?: string
): BlogPosting {
  // ...curious about wpBaseURL and BlogPosting? We'll use that to build our json-ld
}
From typescript to schema.org
Here's a good example of what a BlogPosting can look like with json-ld and other formats that mix with HTML. The benefit of adding a script-tag with json-ld is that we can generate our website (the unstructured data) in a different way than our publicly available structured data. This gives you more flexibility, especially since it's allowed to add several script-tags with type="application/ld+json" to web pages.
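A minimal sketch of how such script-tags could be rendered from plain objects is shown here. `JsonLdObject` and `toJsonLdScriptTag` are hypothetical names for this sketch. The `<` characters are escaped because WordPress excerpts contain HTML, and a literal `</script>` inside the json would otherwise end the tag early.

```typescript
// Minimal local type for this sketch; in a real project a schema-dts type could be used.
type JsonLdObject = { "@type": string; [key: string]: unknown };

// Hypothetical helper: serializes one object into a json-ld script tag.
function toJsonLdScriptTag(thing: JsonLdObject): string {
  const json = JSON.stringify({ "@context": "https://schema.org", ...thing })
    // "\u003c" is the json escape for "<" and parses back to the same character
    .replace(/</g, "\\u003c");
  return `<script type="application/ld+json">${json}</script>`;
}

// Several objects simply become several script tags.
const tags = [
  toJsonLdScriptTag({ "@type": "BlogPosting", abstract: "<p>Hi</p>" }),
  toJsonLdScriptTag({ "@type": "Event", name: "Meetup" }),
].join("\n");

console.log(tags);
```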
Of course, for the human readers of our website we would want something visual. For our blog post we might display a link, an image, the post's title and a short preview. The machines reading our website don't know which is what. To help them read it properly as a BlogPosting, the equivalent json-ld would look like this:
<script type="application/ld+json">
{
  "@context":"https://schema.org",
  "@type": "BlogPosting",
  "@id": "https://shnyder.com/business-model-canvas-for-metaexplorer",
  "name": "Business Model Canvas for MetaExplorer",
  "abstract": "<p>Before starting out with my plans to turn metaexplorer into a business, I wrote a business plan – and because 2020 showed us what it thinks of plans…</p>\n",
  "image": {
    "@type": "ImageObject",
    "name": "Metaexplorer business model canvas, 2020-12-28",
    "contentUrl": "https://shnyder.com/wp-content/uploads/2020/12/IMG_20201228_234552-scaled.jpg",
    "thumbnailUrl": "https://shnyder.com/wp-content/uploads/2020/12/IMG_20201228_234552-150x150.jpg"
  }
}
</script>
As you can see in the example's "@id" field and image urls, it includes a domain name: my blog's domain. Relative URLs are also possible in json-ld, so on shnyder.com I could use this:
"@id": "/business-model-canvas-for-metaexplorer"
Here on dev.to I would want to include the full domain name. To let our mapping-function include my domain, we add the optional parameter wpBaseURL.
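The two cases can be sketched as a small helper. This is a variation for illustration, not the code used in the mapper: `buildPostId` is a hypothetical function that returns a relative "@id" when no base URL is given, and it also tolerates a trailing slash on the base URL.

```typescript
// Hypothetical helper: builds the "@id" for a post from its slug.
function buildPostId(slug: string, wpBaseURL?: string): string {
  // make sure the path starts with exactly one slash
  const path = `/${slug.replace(/^\/+/, "")}`;
  if (!wpBaseURL) {
    // relative URL, resolved against the page that embeds the json-ld
    return path;
  }
  // strip trailing slashes so we don't end up with "//" in the result
  return `${wpBaseURL.replace(/\/+$/, "")}${path}`;
}

console.log(buildPostId("business-model-canvas-for-metaexplorer"));
console.log(
  buildPostId("business-model-canvas-for-metaexplorer", "https://shnyder.com/")
);
```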
So how do we get the json-ld from the graphql types? The library schema-dts helps us do exactly that, and in a typesafe way. That means we get intellisense on the actual schema.org vocabulary (ontology, even).
You can install it with:
yarn add schema-dts
A naive approach would be to map only the "happy path" from graphql to json-ld, where we always have the data that we want and need. That approach would look like this:
import { PostPreviewFragment } from "./../generated/wp-graphql";
import { BlogPosting } from "schema-dts";
/**
 * mapper-function to create schema.org/BlogPosting(s) for previews from fragments
 * @param input a fragment of a WordPress blog post
 * @param wpBaseURL the base domain of your wordpress installation. Used to add the slug
 */
export function mapWpPostPreviewToSchemaBlogPost(
  input: PostPreviewFragment,
  wpBaseURL?: string
): BlogPosting {
  let featuredImgNode = input.featuredImage.node;
  let thumbnailUrl = featuredImgNode.srcSet
    .split(" ")
    .find((val) => val.startsWith("http"));
  let output: BlogPosting = {
    "@type": "BlogPosting",
    ...(wpBaseURL ? { "@id": `${wpBaseURL}/${input.slug}` } : {}),
    name: input.title,
    abstract: input.excerpt,
    image: {
      "@type": "ImageObject",
      description: featuredImgNode.description,
      name: featuredImgNode.altText,
      contentUrl: featuredImgNode.sourceUrl,
      thumbnailUrl
    }
  };
  return output;
}
And this does compile in a typescript configuration without strict checks. The thing is that we can't always be sure to get a preview image, and other fields might be empty as well. Handling empty fields will prevent errors in the future. So for the mapper function, we need to set compilerOptions.strict to true in our tsconfig.json. If we try to compile again, we'll receive lots of errors such as:
- error TS2533: Object is possibly 'null' or 'undefined'.
and
Type 'null' is not assignable to type 'string | PronounceableTextLeaf | readonly PronounceableText[] | undefined'.
Since we don't want something like "title": undefined in our json-ld, we have to remove the fields from the output-object altogether. The first thought to fix this might be to add lots of if-statements. But with object spread syntax (part of ECMAScript since ES2018) and typescript, we have the option to create the output in a way that looks very much like a nested json-ld-object. And such a nested object is what we want to output. The main difference is that this nested object includes conditions for safe mapping as well:
/**
 * mapper-function to create schema.org/BlogPosting(s) for previews from fragments
 * @param input a fragment of a WordPress blog post
 * @param wpBaseURL the base domain of your wordpress installation. Used to add the slug
 */
export function mapWpPostPreviewToSchemaBlogPost(
  input?: PostPreviewFragment,
  wpBaseURL?: string
): BlogPosting | null {
  if (!input) return null;
  let featuredImgNode = input && input.featuredImage && input.featuredImage.node;
  let thumbnailUrl = featuredImgNode && featuredImgNode.srcSet && featuredImgNode
    .srcSet.split(" ")
    .find((val) => val.startsWith("http"));
  let output: BlogPosting = {
    "@type": "BlogPosting",
    ...(wpBaseURL && { "@id": `${wpBaseURL}/${input.slug}` }),
    ...(input.title && { name: input.title }),
    ...(input && input.excerpt && { abstract: input.excerpt }),
    ...(featuredImgNode && {
      image: {
        "@type": "ImageObject",
        ...(featuredImgNode.description && {
          description: featuredImgNode.description,
        }),
        ...(featuredImgNode.altText && {
          name: featuredImgNode.altText,
        }),
        ...(featuredImgNode.sourceUrl && {
          contentUrl: featuredImgNode.sourceUrl,
        }),
        ...(thumbnailUrl && { thumbnailUrl }),
      },
    }),
  };
  return output;
}
That is 23 lines without null-checks, compared to 32 lines with strict type checks. Plus, this notation scales into nested structures. Not much extra effort, but the reliability has improved greatly. Let's look at the code in detail. First, the output is an object typed BlogPosting, and it has to include "@type": "BlogPosting". Typescript types are lost during compilation, whereas "@type" gets baked into the json:
let output: BlogPosting = { "@type": "BlogPosting" };
The three dots that appear across the function are spread operators. They spread the result of the expression in parentheses () into the surrounding object.
...(input.title && { name: input.title }),
Inside the parentheses there's a logical AND operator (&&), which returns the second operand if the first one is truthy, otherwise the first one. That means if one of our input fields like input.title is undefined, null or an empty string, this is what the spread operator gets to see:
let mytestvar = {...undefined};
console.log(mytestvar);
mytestvar = {...null};
console.log(mytestvar);
mytestvar = {...""};
console.log(mytestvar);
mytestvar = {...false};
console.log(mytestvar);
mytestvar = {...true};
console.log(mytestvar);
//always an object with no fields
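This way of converting is helpful to secure our string-based values, but it's not a magic syntax for API conversion: the check is a truthiness check, so a numeric 0 or a boolean false would be dropped from the output as well.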
However, if we do have an input.title, the second expression { name: input.title } is an object, and its key(s) and value(s) are added to the containing object.
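The behaviour described above can be tried out in isolation. The following sketch compares the truthiness-based spread with a variant that only checks for null and undefined. The `Input` type, both function names and the `wordCount` field are assumptions made for this example and are not part of the mapper in this article.

```typescript
// Hypothetical input with a string field and a numeric field.
interface Input {
  title?: string | null;
  wordCount?: number | null;
}

// Same pattern as in the mapper: falsy values are left out of the output.
function mapTruthy(input: Input): Record<string, unknown> {
  return {
    "@type": "BlogPosting",
    ...(input.title && { name: input.title }),
    ...(input.wordCount && { wordCount: input.wordCount }),
  };
}

// Variant: only null and undefined are left out, "" and 0 are kept.
function mapDefined(input: Input): Record<string, unknown> {
  return {
    "@type": "BlogPosting",
    ...(input.title != null && { name: input.title }),
    ...(input.wordCount != null && { wordCount: input.wordCount }),
  };
}

const edgeCase: Input = { title: "", wordCount: 0 };
console.log(JSON.stringify(mapTruthy(edgeCase)));
console.log(JSON.stringify(mapDefined(edgeCase)));
console.log(JSON.stringify(mapTruthy({ title: "Hello", wordCount: null })));
```

Which variant fits depends on the field: for names and descriptions an empty string is rarely useful in json-ld, while for numbers a 0 can be meaningful.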
Conclusion
Now you have a pipeline to build structured data from your graphql API, independent of which frontend technology you choose. Structured data can have a good effect on SEO, and some of it is displayed as special widgets in Google. With schema-dts you not only avoid typos while you build structured data, your IDE will also help you find the right fields for your @types.
If you're thinking about where to put those conversions in your architecture, maybe ideas like the middleman engine can even help improve the handling of data in your organization as a whole.