DEV Community

Arafat Tehsin


Immersive Reader for People with all the abilities

If you love Azure, you must be aware of the Azure Advent Calendar created, managed and run by two fantastic MVPs, Gregor Suttie and Richard Hooper. It means that you will have 75 videos, each with a follow-up blog post on an Azure-related topic. Today is my turn, and it is also the start of another developer series of mine.

By the end of 2019, we've seen almost a decade of disruption in almost every sector of industry, from blockchain-based agriculture to liquid biopsy in healthcare, from smart banking to personalized products. All of them have one thing in common: the determination to make people's lives easier and enable them to achieve more.

You can't neglect the impact of AI across the globe. Even if some friends consider it overrated, overhyped or a mere buzzword, you still expect your bank to detect an anomaly when transactions are not made by you, and you still expect real-time (offline) translation when you're in a non-English-speaking country. Needs like these are the reason AI is here to stay and keep getting better.

developer (dɪveləpəʳ)
noun
Amazing human being making a positive impact on the world.
Similar: You!

The software development industry is growing exponentially, and the demand for new technology stacks becomes more substantial every year. It is evident that if you (as a developer) do not keep up with the trends, you may end up losing the importance of your role. An ample amount of time is always required to learn any component of your stack, and Artificial Intelligence has always been one of them.

Fortunately, with the advent of cloud-native applications and services, most AI services are now available at the endpoint of your software, be it a mobile or web app, a bot, an IoT device and so on.

Azure Cognitive Services are exactly this kind of service: APIs and SDKs that let you embed artificial intelligence in your applications without any in-depth knowledge of AI. These services let you add machine learning capabilities to your applications to solve countless problems, ranging from social and business to environmental and beyond. With them, you can analyze images, listen, speak with a custom voice, understand the context of a sentence and integrate decision making into your software, all without any special machine learning or data science knowledge.

Azure Cognitive Services has SDKs for many frameworks and runtimes, but with the REST API the possibilities are endless, as you can connect an app developed in any language to them.
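As a rough illustration of the REST route, the sketch below builds a language-detection request for the Text Analytics service using nothing but plain JavaScript objects. The resource name `my-resource`, the placeholder key and the helper name `buildDetectLanguageRequest` are assumptions for illustration. The `v2.1` path and the `Ocp-Apim-Subscription-Key` header reflect the API as I understand it at the time of writing, so check the current reference before relying on them.

```javascript
// Builds (but does not send) a Text Analytics language-detection request.
// `endpoint` and `subscriptionKey` come from your own Azure resource.
function buildDetectLanguageRequest(endpoint, subscriptionKey, text) {
  const base = endpoint.replace(/\/+$/, ''); // drop trailing slashes
  return {
    url: base + '/text/analytics/v2.1/languages',
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Ocp-Apim-Subscription-Key': subscriptionKey
      },
      body: JSON.stringify({ documents: [{ id: '1', text: text }] })
    }
  };
}

// Hypothetical usage: any HTTP client in any language can send the same request.
const request = buildDetectLanguageRequest(
  'https://my-resource.cognitiveservices.azure.com/',
  '<your subscription key>',
  'Bonjour tout le monde'
);
// fetch(request.url, request.init).then(r => r.json()).then(console.log);
```

Nothing here is specific to JavaScript: the same URL, headers and JSON body can be sent from any language that can make an HTTP request.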

I will try my best (no promises here!) to accommodate different
languages / runtimes to brief you about each Cognitive Service that can
be used by developers of any stack!

Broadly, these services are divided into 5 distinct categories:

| Decision | Language | Speech | Vision | Web Search |
|---|---|---|---|---|
| Anomaly Detector Preview | Immersive Reader Preview | Speech to Text | Computer Vision | Bing Autosuggest |
| Content Moderator | Language Understanding | Text to Speech | Custom Vision | Bing Custom Search |
| Personalizer | QnA Maker | Speech Translation | Face | Bing Entity Search |
| | Text Analytics | Speech Recognition | Form Recogniser Preview | Bing Image Search |
| | Translator Text | Speaker Recognition Preview | Ink Recogniser Preview | Bing News Search |
| | | | Video Indexer | Bing Spell Check |
| | | | | Bing Video Search |
| | | | | Bing Web Search |

In today's post, we will be talking about the latest Immersive Reader Preview and see how effectively it's bringing a positive change to people of all abilities.

Immersive Reader Preview

The Immersive Reader lets you (as a developer) embed text reading and comprehension capabilities into your apps. It empowers readers of all abilities with features such as translating languages, reading aloud, grammar and highlighting options along with other design elements.

This service currently supports more than 60 languages, and it's continuously being optimized based on inclusion research.

Despite being in Preview, this service not only implements proven techniques to improve reading and comprehension, but a variant of it is also already being used in some schools.

If you want some inspiration, you don't want to miss reading this story about Immersive Reader!

SDK Integration

Currently, there are two SDKs for Immersive Reader, both open source on GitHub:

  1. JavaScript
  2. iOS (Swift CocoaPod)

If you're using the JavaScript SDK (which we will in this post), it launches a web app on top of your existing app (via an iframe). It's so simple that you just have to specify the content, write a few lines of code and voilà, your app is ready with the Immersive Reader experience!

Azure AD Authentication for Cognitive Service

Many Cognitive Services worked, and still work, with just a Subscription Key. Since July 2019, however, a new way of authenticating has been introduced (one already in place for other Microsoft applications such as Dynamics 365), and some of the Cognitive Services support it: every time you make a request to the Cognitive Service, you pass an access token to retrieve the desired response. Now the question is, how do you get this access token? Well, it's already covered in the docs for Immersive Reader.

However, since some readers may not be very familiar with Azure Cloud Shell (CLI), I've provided a visual walkthrough for registering the application in Azure AD and configuring Role-based Access Control (RBAC) for the Immersive Reader Cognitive Service.

The steps are quite simple. First, log in to your Azure Portal and create an Immersive Reader Cognitive Service.

Immersive Reader Creation

Now you have to create a Service Principal (or Service Account / User), which will be used to create the access token that you pass with every Cognitive Services request.

To do that, you need to create an App Registration in Azure Active Directory, set up the Application URI and define roles and scope for your app. Then you can create a Client Secret and save it, as you may not be able to see it in plain text again. The screenshots below, in order, are for your reference.

After setting up the scope, you can configure RBAC (Role-based Access Control) on the Immersive Reader Cognitive Service.

Code Walkthrough

As I mentioned earlier, we will be using the JavaScript SDK in this post, so I will go and create an ASP.NET Core MVC application.

Managing Configuration Secrets

Now, in your Secrets.json file, you just have to provide these values for authentication / token purposes.

{
"TenantId":"<your Azure tenant Id>",
"ClientId":"<your App Registration Client Id>",
"ClientSecret":"<your App Registration Client Secret>",
"Subdomain":"<Your immersive reader cognitive service subdomain>"
}

Then you just have to tweak your existing HomeController.cs file with a little bit of code.
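The controller's job is to exchange the values in Secrets.json for an Azure AD access token. Here is a minimal sketch of that exchange, written in JavaScript (Node.js) rather than the C# of the controller. It assumes the client-credentials request described in the Immersive Reader documentation at the time: a form-encoded POST to `https://login.windows.net/<TenantId>/oauth2/token` with `resource` set to `https://cognitiveservices.azure.com/`. The function names `buildTokenRequest` and `getAccessToken` are hypothetical.

```javascript
// Builds the Azure AD client-credentials token request from the secrets above.
function buildTokenRequest(secrets) {
  const body = new URLSearchParams({
    grant_type: 'client_credentials',
    client_id: secrets.ClientId,
    client_secret: secrets.ClientSecret,
    resource: 'https://cognitiveservices.azure.com/'
  });
  return {
    url: 'https://login.windows.net/' + encodeURIComponent(secrets.TenantId) + '/oauth2/token',
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
      body: body.toString()
    }
  };
}

// Sends the request and returns only the access token string.
// `fetchImpl` defaults to the global fetch (Node.js 18+ or a browser-like runtime).
async function getAccessToken(secrets, fetchImpl = fetch) {
  const { url, init } = buildTokenRequest(secrets);
  const response = await fetchImpl(url, init);
  if (!response.ok) {
    throw new Error('Token request failed with status ' + response.status);
  }
  const json = await response.json();
  return json.access_token;
}
```

Whatever language you use, this exchange belongs on the server: the Client Secret should never be shipped to the browser, only the resulting token.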

You will then need to include a reference to the Immersive Reader JS library:

<script type='text/javascript' src='https://contentstorage.onenote.office.net/onenoteltir/immersivereadersdk/immersive-reader-sdk.0.0.3.js'></script>

After this, you can simply pass your content to your JS function.

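async function handleLaunchImmersiveReader(id) {
    // Collect the title and body text of the selected item from the page (jQuery).
    const data = {
        title: $('#title' + id).text().trim(),
        chunks: [{
            content: $('#text' + id).text().trim(),
            lang: 'en'
        }]
    };

    // Get an access token and the Immersive Reader resource subdomain.
    const token = await getImmersiveReaderTokenAsync();
    const subdomain = await getImmersiveReaderSubdomainAsync();

    // Open the Immersive Reader on top of the current page.
    ImmersiveReader.launchAsync(token, subdomain, data);
}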
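The two helpers called above, `getImmersiveReaderTokenAsync` and `getImmersiveReaderSubdomainAsync`, are not shown in this post. One possible shape for them is sketched here, assuming the MVC app exposes two hypothetical endpoints, `/Home/GetToken` and `/Home/GetSubdomain`, that return the token and the subdomain as plain text. The shared helper `getTextAsync` is also an illustrative name.

```javascript
// Fetches a text value from a same-origin endpoint and fails loudly on HTTP errors.
// `fetchImpl` is injectable so the helper can be exercised without a browser.
async function getTextAsync(url, fetchImpl = fetch) {
  const response = await fetchImpl(url, { credentials: 'same-origin' });
  if (!response.ok) {
    throw new Error('Request to ' + url + ' failed with status ' + response.status);
  }
  return (await response.text()).trim();
}

// Hypothetical endpoint paths; rename them to match your own controller actions.
function getImmersiveReaderTokenAsync(fetchImpl) {
  return getTextAsync('/Home/GetToken', fetchImpl);
}

function getImmersiveReaderSubdomainAsync(fetchImpl) {
  return getTextAsync('/Home/GetSubdomain', fetchImpl);
}
```

Called without arguments, as in the launch function, both helpers fall back to the browser's own `fetch`.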

The launchAsync method of the ImmersiveReader library launches an iframe in your web application.

Basically, it returns a Promise<HTMLDivElement>, which resolves when the Immersive Reader is loaded. The Promise resolves to a div element whose only child is an iframe element that contains the Immersive Reader page.
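Because the call returns a Promise, it can be awaited and wrapped in error handling. In the small sketch below, `launchReader` is a hypothetical wrapper, and the SDK object is passed in as a parameter so the logic does not depend on the global `ImmersiveReader`.

```javascript
// Launches the reader and reports success or failure instead of letting
// a rejected Promise go unhandled.
async function launchReader(sdk, token, subdomain, data) {
  try {
    const container = await sdk.launchAsync(token, subdomain, data);
    // `container` is the div that wraps the Immersive Reader iframe.
    return { ok: true, container: container };
  } catch (error) {
    // Assumption: the rejection carries a human-readable `message`.
    return { ok: false, message: (error && error.message) || String(error) };
  }
}

// Hypothetical usage inside an async function on the page:
// const result = await launchReader(ImmersiveReader, token, subdomain, data);
// if (!result.ok) { alert('Could not open the reader: ' + result.message); }
```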

The Immersive Reader works with plain text, HTML, MathML and .docx
(Microsoft Word document) content. My example uses plain text.
However, should you wish to use the other formats, you just have to
supply the matching mimeType on the chunk.
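In the SDK, `mimeType` is a property of each chunk, next to `content` and `lang`. Here is a minimal sketch for HTML content, where `buildHtmlData` is a hypothetical helper name:

```javascript
// Builds the `data` argument for HTML content instead of plain text.
function buildHtmlData(title, html, lang) {
  return {
    title: title,
    chunks: [{
      content: html,
      lang: lang,
      mimeType: 'text/html' // the plain-text example in this post omits mimeType
    }]
  };
}

// Hypothetical usage:
// const data = buildHtmlData('Geography', '<p>The <b>Nile</b> flows north.</p>', 'en');
// ImmersiveReader.launchAsync(token, subdomain, data);
```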

Please note that you can specify many options and language parameters and pass them to the Immersive Reader to tailor your app's experience. The SDK reference guide has everything!
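Options go in as an optional fourth argument to `launchAsync`. The sketch below uses only three fields that I believe this SDK version supports (`uiLang`, `timeout` and `uiZIndex`); the helper name `buildReaderOptions` and the specific values are assumptions, so confirm the available fields in the SDK reference for the version you load.

```javascript
// Builds an options object for launchAsync.
function buildReaderOptions(uiLang) {
  return {
    uiLang: uiLang,  // language of the reader's own interface, e.g. 'en' or 'fr'
    timeout: 15000,  // milliseconds to wait before launchAsync gives up
    uiZIndex: 1000   // z-index of the iframe overlay
  };
}

// Hypothetical usage:
// ImmersiveReader.launchAsync(token, subdomain, data, buildReaderOptions('fr'));
```

Note that `uiLang` controls the language of the reader's interface, while `lang` on a chunk describes the language of the content itself.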

A few notable features

  1. You can choose a Male or Female voice and set the reading speed.
  2. You can choose the Font Size, Type and Theme.
  3. You can use Grammar options such as Parts of Speech highlighting and Syl‧la‧bles.
  4. You can also use a line-by-line focus and see pictures for some common words.
  5. Lastly, Immersive Reader can translate the whole document as well as word by word, so you don't have to change the language to see what each word means.

Since it's just a JavaScript library, you can use it in any application
or control that supports JS, from your hybrid mobile app to a Power Apps
Component Framework control to anything!

It is absolutely fine if you still have some doubts about the implementation; that's the reason I always keep my open-source demos in a GitHub repo for your reference.

arafattehsin / CognitiveRocket

This repository is mainly used for the R&D of the Azure AI, .NET & Power Platform

Immersive Reader is super simple to integrate, and it enables people of all abilities to read, comprehend and have an immersive experience throughout their class.

Until next time!
