You're probably familiar with the QR (Quick Response) Code. Those little boxes of even littler black and white boxes. You've mostly used them to make payments. You've maybe even used one to add someone's contacts to your phone or to follow someone on social media. But what exactly are they?
Not Just an ID
For the longest time, I had thought that a QR Code was some unique combination of black and white squares like some sort of ID. That a QR Code is meant to be just one of a bazillion possible codes. That no other code like the one you're looking at exists. I had assumed that if you wanted a QR Code for yourself, you'd have to get one registered, like how you would get your own unique vehicle number plate registered. I've only recently learnt that it's not the point of a QR Code. A QR Code is unique in the same way that a random piece of text you'd note down on a piece of paper would be unique. So a QR Code is just written text, written in a form that's readable by a computer. You also don't have to get one registered; you can easily generate QR Codes that contain any content you want. Ever since I learnt this, I've played around with QR Codes a bit and I'd like to write about it.
Playing With Codes
If you scan a code using any of your payment apps, they attempt to make a payment. The QR scanner on your social media apps might attempt to follow a person. They don't show you what content is in the code. Therefore, you need a generic QR scanner app that makes no assumptions about the QR Code's intent. It should just scan the code and show you the contents encoded within. And there are two such apps that you likely already have: Google Lens on Android and the iOS camera app on the iPhone (iOS 11 and higher). These apps are a great way to demonstrate just the interpretation of the QR Code.
There are many websites that can generate a QR Code from content you provide. We'll be using www.the-qrcode-generator.com here. In the input box, you'll see a bunch of "data types" to pick from. These are the various formats that your QR Code's content could be in. Select the "Free Text" type to get the most freedom over what content you can put in. Let's put in the text "This is a QR code!".
The generated code looks like this:
When we scan the code using Google Lens, the app decodes it and shows us the contents!
Let's put in some different free text, but this time start it with tel: (for example, tel:123456789).
And we'll see that our scanner prompts us to make a call with this number.
What's Happening Here?
Your QR Code scanner has to interpret the code in two steps:
- Decode the QR Code to get the contents
- Decide what it should do based on the format of the contents (or whether it should do anything at all)
1. Decoding the QR Code
This is the part where the QR Code technology comes in. "QR Code" is a registered trademark of DENSO WAVE INCORPORATED. They invented the QR Code. There's an official site for it where they list out the features of a QR Code, the standards that have been refined over the years, etc. There's also an ISO standard for it. The standard defines the algorithm for encoding text into a QR Code and decoding the code back into text. It also defines the information required in the code like the QR version and error correction. All this information is encoded in dedicated parts of the QR Code. This article gives us very succinct visuals that highlight these parts.
2. Parsing the contents
Once the contents are decoded, the scanner now figures out what it's looking at based on its format. The format, or "data type" of the contents is not bound to the QR Code standards. We're now out of QR Code territory. This step is equivalent to typing out a piece of text and asking the computer what it's looking at. Is it a URL? A date? An email address? Or something else? There are quite a few standards that define these data types. A very popular list of data types is defined in the open-source ZXing library. Stated in its README is:
ZXing ("zebra crossing") is an open-source, multi-format 1D/2D barcode image processing library implemented in Java, with ports to other languages.
Its wiki has a list of all the data types it recognises with some explanation for each one. I had found a link to it in this Stack Overflow thread, so I'm linking that thread here as well in case it helps you out.
In our little experiment above, we encoded the text "This is a QR code!" and "tel:123456789". The first one doesn't follow any certain format, so Google Lens simply showed it back to us. The second one follows the telephone number format and since Google Lens recognises this format, it prompted us to perform some action with this phone number. Notice that if we were to omit the tel: at the beginning of the phone number, Google Lens would see it as just plain text.
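The second step can be sketched in a few lines of Python. This is only an illustration: the function name classify_payload and its labels are made up for this post, and real scanners such as Google Lens or ZXing recognise many more formats with more careful rules.

```python
def classify_payload(text: str) -> str:
    """Guess what kind of content a decoded QR payload holds, based on its prefix."""
    lowered = text.lower()
    if lowered.startswith("tel:"):
        return "phone number"
    if lowered.startswith(("http://", "https://")):
        return "web URL"
    if lowered.startswith("mailto:"):
        return "email address"
    if lowered.startswith("mecard:"):
        return "contact card"
    # Nothing matched: a generic scanner would simply show the text as-is.
    return "plain text"


for payload in ["This is a QR code!", "tel:123456789", "123456789"]:
    print(f"{payload!r} -> {classify_payload(payload)}")
```

The last payload is the same number without its prefix, and it falls through to plain text, which matches what we saw in Google Lens.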
The format most commonly used in QR Codes is the URL. A URL works great because it's short, it can link to literally any web content (or even app content), and the only user action required to use it is a single click. This works great for restaurant menus or ad campaigns. Your phone's operating system could also decide if it should open the URL directly in the appropriate app. This lets you do things like encode a Google Maps link to a specific location and let the user open it directly in the Google Maps app.
A pretty interesting format, though, is the MECARD format for describing a person's contact information. Let's put in the sample text provided on that page
MECARD:N:Owen,Sean;ADR:76 9th Avenue, 4th Floor, New York, NY 10011;TEL:12125551212;EMAIL:srowen@example.com;;
like so:
and Google Lens reads this as:
I wanted to demonstrate these different formats using just the "Free Text" input of the QR Code generator on purpose. It's pretty clear now that the other input modes the generator gives you such as "Contact" are only meant to give you the convenience of not having to type out all that messy syntax yourself. Under the hood though, it's all just text.
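To show that a contact card really is just text, here is a minimal sketch of how a scanner might pull the fields out of the MECARD sample above. The function name parse_mecard is hypothetical, and the sketch makes simplifying assumptions: it ignores MECARD's backslash escaping and keeps only the last value if a field is repeated.

```python
def parse_mecard(text: str) -> dict[str, str]:
    """Split a MECARD string into a {field: value} dictionary (no escape handling)."""
    prefix = "MECARD:"
    if not text.upper().startswith(prefix):
        raise ValueError("not a MECARD payload")
    fields = {}
    for part in text[len(prefix):].split(";"):
        if not part:
            continue  # skip the empty pieces left by the trailing ";;"
        key, _, value = part.partition(":")
        fields[key] = value
    return fields


sample = (
    "MECARD:N:Owen,Sean;ADR:76 9th Avenue, 4th Floor, New York, NY 10011;"
    "TEL:12125551212;EMAIL:srowen@example.com;;"
)
contact = parse_mecard(sample)
print(contact["N"])    # Owen,Sean
print(contact["TEL"])  # 12125551212
```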
Can They Have Emojis?
In most cases, yes. But before we get into how or why, I need to be a little more honest about what is in a QR Code. This section is concerned with step one of a QR Code: "Decoding the QR Code". It's about how we turn the 1s and 0s in a QR to the characters we see in the QR scanner's output.
I mentioned in the previous section that the contents in a QR Code are "all just text". That's only part of the story. In reality, the contents are stored in one of a few "modes", each of which determines the character set the contents can use. Wikipedia has a list of the possible modes:
- Numeric only
- Alphanumeric
- Binary/byte
- Kanji/kana
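As a rough sketch of how an encoder could choose between these modes, the snippet below picks the most restrictive mode that can hold a given text. The function name pick_mode is made up for illustration, Kanji/kana mode is left out, and real encoders can also mix modes within a single code, so treat this as a simplification.

```python
# The 45 characters that QR alphanumeric mode can represent.
ALPHANUMERIC_CHARS = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")


def pick_mode(text: str) -> str:
    """Return the most restrictive mode that can hold every character of text."""
    if text and all(ch in "0123456789" for ch in text):
        return "numeric"
    if text and all(ch in ALPHANUMERIC_CHARS for ch in text):
        return "alphanumeric"
    # Kanji/kana mode is not handled in this sketch; everything else uses bytes.
    return "byte"


for text in ["123456789", "TEL:123456789", "tel:123456789", "This is a QR code!"]:
    print(f"{text!r} -> {pick_mode(text)}")
```

Note that alphanumeric mode has no lowercase letters, so both of the texts we encoded earlier would need binary/byte mode.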
The QR specification says that when you create a QR in binary/byte mode, the character encoding is ISO 8859-1. What is a character encoding? It describes how we're supposed to turn the 1s and 0s into text. ISO 8859-1 is basically ASCII with some more characters added, mostly accented Latin letters and a few symbols. The QR generator we used seems to create the code in binary/byte mode by default. Hence, the scanner detects that the QR is in binary/byte mode and decodes the characters using ISO 8859-1. But ISO 8859-1 doesn't describe how to turn 1s and 0s into emojis. So then how do QR scanners read emojis?
It turns out that in reality, many scanners will decode the QR Code using the UTF-8 character encoding by making use of an extension that the QR specification provides, called Extended Channel Interpretation (ECI). Through this extension, a QR generator can provide a hint about the character encoding used. The answers on this StackOverflow post explain it well. So I guess our QR generator uses UTF-8 encoding through this extension, and this lets us use emojis in the code.
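You can see the difference between the two encodings without a QR Code at all. The snippet below is a small sketch using Python's built-in codecs; the helper name fits_encoding is made up for this post.

```python
def fits_encoding(text: str, encoding: str) -> bool:
    """Report whether every character of text can be written in the given encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


emoji = "\U0001F642"  # the "slightly smiling face" emoji

print(fits_encoding("café", "iso-8859-1"))  # True: accented letters are covered
print(fits_encoding(emoji, "iso-8859-1"))   # False: no emojis in ISO 8859-1
print(len(emoji.encode("utf-8")))           # 4: UTF-8 needs four bytes for it
```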
Nonstandard Standards
Since the format of the contents of a QR Code is independent of the QR Code itself, anybody can come up with their own format and still have a working QR Code. As long as they provide their users an application that could correctly interpret their custom format, there'll be no complaints. And that's exactly what happened. Over the years, as various companies, governments and organisations adopted the use of QR Codes in their applications, they came up with their own formats to meet their use cases. Some have decided to define a schema in a well established format like JSON or XML, while others have gone completely custom. Here's a short list of custom formats used in QR Codes:
- Short Payment Descriptor (SPAYD)
- European Payments Council QR Code
- Mobile deep linking formats on various mobile phone operating systems
- QRIS by Bank Indonesia. A valid QRIS code can be found in the header image of this news article.
- NPCI's Unified Payments Interface (more on this below)
Go ahead and scan the QR Codes shown in those pages, or copy and paste the sample contents into the QR Code generator. Chances are, your generic QR scanning app would not be able to parse the contents to prompt a user action, and so it would simply show the contents to you.
Unified Payments Interface
The Unified Payments Interface (UPI) is the official standard used to make payments across various payment platforms in India. If you're shopping in India and the shop you're in happens to accept payments through a QR Code, that QR Code is most likely UPI compliant. So naturally, I wanted to see what the UPI-formatted contents in a QR Code look like.
If you scan a UPI QR Code near you using a generic QR scanner, you'll find that the decoded content is a deep link that contains the information necessary to make a UPI payment. Unfortunately I couldn't find any official documentation on the structure of this deep link. The closest I could get to some structured documentation was this wiki on GitHub which seems like the results of someone else's research on this same topic. In its most basic form, the UPI deep link looks something like this:
upi://pay?pn=<payee-name>&pa=<payee-vpa>
This is a URL with a upi scheme, a pay user action and some parameters:
- pn: the payee's name
- pa: the payee's virtual payment address (VPA)
If your QR Code scanner recognises URLs that don't start with http or https, the scanner might redirect you straight to the payment page of your default UPI compatible app. Google Lens doesn't seem to do this though.
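Since the deep link is an ordinary URL, Python's standard urllib.parse module can take it apart. This is a minimal sketch based only on the two-parameter form shown above; parse_upi_link is a hypothetical helper, and the payee name and VPA are made-up example values.

```python
from urllib.parse import urlsplit, parse_qs


def parse_upi_link(link: str) -> dict[str, str]:
    """Extract the query parameters from a upi://pay deep link."""
    parts = urlsplit(link)
    if parts.scheme != "upi" or parts.netloc != "pay":
        raise ValueError("not a upi://pay deep link")
    # parse_qs returns a list per key; keep the first value of each.
    return {key: values[0] for key, values in parse_qs(parts.query).items()}


# Example values are invented for illustration.
details = parse_upi_link("upi://pay?pn=Example%20Shop&pa=shop@examplebank")
print(details["pn"])  # Example Shop
print(details["pa"])  # shop@examplebank
```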
upiqr.in is a QR Code generator that will generate a UPI-compliant code given the payee's name and the payee's VPA. Again, all it does is create a UPI deep link from those two values and encode the deep link into a QR Code using the QR Code standard. From this, we can tell that the payee's name and VPA are the minimum information needed to create a valid UPI deep link, and hence, a valid UPI QR Code.
You can also create a UPI-compliant code by manually typing the deep link into www.the-qrcode-generator.com, the generator we used above.
Enter a valid VPA and you'll be able to make a payment through any UPI compatible app using your very own self-generated QR Code! Include the cu (currency) and am (amount) fields in the deep link and you've just created a QR Code that's bound to a specific amount of money!
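Writing the deep link by hand gets fiddly once names contain spaces, so here is a small sketch that assembles it. The helper build_upi_link is hypothetical, the payee values are invented, and the choices of INR as the default currency and two decimal places for the amount are my assumptions rather than something taken from official documentation.

```python
from urllib.parse import urlencode, quote


def build_upi_link(payee_name, payee_vpa, amount=None, currency="INR"):
    """Assemble a upi://pay deep link; am and cu are added only when an amount is given."""
    params = {"pn": payee_name, "pa": payee_vpa}
    if amount is not None:
        params["am"] = f"{amount:.2f}"  # assumption: amounts use two decimal places
        params["cu"] = currency
    # Percent-encode values, but leave "@" readable inside the VPA.
    return "upi://pay?" + urlencode(params, quote_via=quote, safe="@")


print(build_upi_link("Example Shop", "shop@examplebank"))
# upi://pay?pn=Example%20Shop&pa=shop@examplebank
print(build_upi_link("Example Shop", "shop@examplebank", amount=150))
# upi://pay?pn=Example%20Shop&pa=shop@examplebank&am=150.00&cu=INR
```

The resulting string can be pasted into the "Free Text" input of any QR Code generator.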
Unanswered Questions
While trying out different QR Code generators, there was something peculiar I noticed. Different QR generators generate different-looking codes for the same exact content. Here's the QR Code from above, the one with the content "This is a QR code!":
Now check out this one with the same content, but from a different generator, www.qr-code-generator.com.
They clearly look different even though they store the same content! What gives? I'm guessing these generators use different values for parameters like the QR Code version, error correction level, etc. That could explain why they look different. Is there a QR scanner that could show us this metadata? That's something I haven't looked into yet.
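To get a feel for how much those parameters matter, here is a rough sketch. A QR Code of version v is 17 + 4v modules (little squares) wide, and each version can hold fewer bytes as the error correction level goes up from L to H. The capacity numbers below are the commonly published byte-mode figures for versions 1 to 4 and should be checked against the specification before relying on them; the function names are made up, and this is only one possible explanation for the difference between the two generators.

```python
# Byte-mode capacity (in bytes) of the four smallest versions, per error correction level.
BYTE_CAPACITY = {
    1: {"L": 17, "M": 14, "Q": 11, "H": 7},
    2: {"L": 32, "M": 26, "Q": 20, "H": 14},
    3: {"L": 53, "M": 42, "Q": 32, "H": 24},
    4: {"L": 78, "M": 62, "Q": 46, "H": 34},
}


def modules_per_side(version: int) -> int:
    """Width (and height) of a QR Code of the given version, in modules."""
    return 17 + 4 * version


def smallest_version(text: str, level: str) -> int:
    """Find the smallest version in the table that fits text at the given level."""
    size = len(text.encode("utf-8"))
    for version, capacity in sorted(BYTE_CAPACITY.items()):
        if size <= capacity[level]:
            return version
    raise ValueError("needs a version larger than this sketch covers")


for level in "LMQH":
    version = smallest_version("This is a QR code!", level)
    side = modules_per_side(version)
    print(f"level {level}: version {version}, {side}x{side} modules")
```

If these figures are right, our 18-byte text needs a larger grid at the highest error correction level than at the other three, which alone would make two codes look different.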
Inspirations
There are two things I found online a few months ago that inspired me to experiment with QR Codes:
- A YouTube video on embedding an entire video game into a QR Code
- A Twitter thread on using QR Codes for restaurant menus
Because any data (of a reasonable size) can be stored in a QR Code, the possible use cases seem endless. The official site lists the many ways they can be used.
If you're interested, check out this post at my website! It's new and gonna change quite a bit over time. I plan to post notes as well, where I would document my learnings in software and programming.
Top comments (2)
Correct! If you look at the two QR codes, you can spot some similarities between them. The first one is smaller (less accurate) than the second, which is a bit bigger (based on the size of the position marker squares top left, top right, and bottom left).
The higher the QR code's error correction, the more columns and rows are added. Higher error correction also increases the minimum printing size, as you don't want all of the columns & rows to smush together :) That's why, if you're in the UK, the NHS test & trace QR codes are huuuuge (nearly full A4 size) - they need a high level of accuracy, so they need to be printed bigger.
Wow that makes complete sense! So you're saying the NHS QR codes have so many rows and columns that the squares end up being pretty tiny. It would be great to be able to see this metadata.