WebSockets are a great way to achieve bi-directional communication between a mobile app and a backend. In December 2018, Amazon's API Gateway service launched serverless support for WebSockets.
Setting it all up is somewhat involved. I recommend starting with the Simple WebSockets Chat backend they provide as a demo. We'll focus on how to talk to it from Android, using Square's familiar OkHttp or JetBrains' newer Ktor.
Pre-requisites
- An active AWS account
- The latest version of the AWS CLI
- wscat, installed via npm
- jq, for filtering the AWS CLI's JSON output
- AWS SAM CLI, installed via Homebrew (if on Mac)
- Android Studio
Setting Up a Backend
Installing the serverless demo app is pretty straightforward.
Check out the chat app:
git clone https://github.com/aws-samples/simple-websockets-chat-app.git
And deploy it to your account, from inside the cloned directory:
cd simple-websockets-chat-app
sam deploy --guided
After a while, that command will finish. You'll be able to see the provisioned resources by inspecting the outputs of the CloudFormation stack (use whatever stack name you entered at the guided prompt):
aws cloudformation describe-stacks \
--stack-name simple-websocket-chat-app \
--query 'Stacks[].Outputs'
It creates a DynamoDB table, three Lambda functions, an AWS IAM role, and an API Gateway with WebSockets support.
After you've glanced over the output to understand what was created, let's zero in on the WebSocket endpoint itself. We'll need to find the URI that our client can use to access it.
aws cloudformation describe-stacks \
--stack-name simple-websocket-chat-app \
--query 'Stacks[].Outputs[]' | \
jq -r '.[]|select(.OutputKey=="WebSocketURI").OutputValue'
This should output something like:
wss://azgu36n0vf.execute-api.us-east-1.amazonaws.com/Prod
Command Line Testing using wscat
Let's quickly test the backend, to make sure it works as we expect. We'll use the wscat command-line utility. The command below will open a long-lived connection to the API Gateway, and we'll be able to send and receive messages at an interactive prompt:
$ wscat -c wss://azgu36n0vf.execute-api.us-east-1.amazonaws.com/Prod
Connected (press CTRL+C to quit)
> {"action": "sendmessage", "data": "foo"}
< foo
The JSON format above is required by the app we deployed. You can change the value of the "data" key ("foo", here), but you can't change anything else.
If you try to pass something else, it doesn't work:
> Hello, how are you?
< {"message": "Forbidden", "connectionId":"Y0bEuc0UIAMCIiA=", "requestId":"Y0bwuGjXIAMFmEg="}
But if you open up multiple terminal windows and connect them all to the endpoint, every one of them will receive each valid message that any of them sends.
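The envelope the demo backend expects is small enough to capture in a helper. Here's a minimal sketch in plain Kotlin, assuming only the two keys shown in the wscat session. The names chatEnvelope and escapeJson are hypothetical, and the hand-rolled escaping is only there so the snippet runs without dependencies (on Android, JSONObject handles it for you):

```kotlin
// Hypothetical helper: escapes characters that would break a JSON string literal.
fun escapeJson(raw: String): String = buildString {
    for (c in raw) {
        when (c) {
            '"' -> append("\\\"")
            '\\' -> append("\\\\")
            '\n' -> append("\\n")
            '\r' -> append("\\r")
            '\t' -> append("\\t")
            // Remaining control characters become \uXXXX escapes.
            else -> if (c < ' ') append("\\u%04x".format(c.code)) else append(c)
        }
    }
}

// Hypothetical helper: wraps a payload in the envelope the demo app routes on.
// Only the value of "data" varies; "action" must stay "sendmessage".
fun chatEnvelope(data: String): String =
    """{"action": "sendmessage", "data": "${escapeJson(data)}"}"""

fun main() {
    println(chatEnvelope("foo"))
}
```

Pasting the printed line into the wscat prompt should behave just like the hand-typed message.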
Calling the API from Android via OkHttp
Now that we know the WebSocket API is working, let's build an Android app to use as a client instead.
WebSockets have been supported in OkHttp since 3.5, which came out all the way back in 2016.
Initializing a WebSocket client is straightforward:
val request = Request.Builder()
    .url("wss://azgu36n0vf.execute-api.us-east-1.amazonaws.com/Prod")
    .build()
val listener = object : WebSocketListener() {
    override fun onMessage(ws: WebSocket, mess: String) {
        // Called asynchronously when messages arrive
    }
}
val ws = OkHttpClient()
    .newWebSocket(request, listener)
To send a message to the API Gateway, all we have to do is:
ws.send(JSONObject()
    .put("action", "sendmessage")
    .put("data", "Hello from Android!")
    .toString())
We can add a button to our UI with ViewBinding, and fire the WebSocket message whenever we click it:
ui.button.setOnClickListener {
    ws.send(JSONObject()
        .put("action", "sendmessage") // the route key must stay "sendmessage"
        .put("data", "Hello from Android!")
        .toString())
}
Since all of the threading is handled inside OkHttp, there really isn't a lot more to it. Save a handle to your view binding and to your WebSocket client when you create your Activity:
class MainActivity : AppCompatActivity() {
    private lateinit var ui: ActivityMainBinding
    private lateinit var ws: WebSocket
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        ui = ActivityMainBinding.inflate(layoutInflater)
        setContentView(ui.root)
        connect() // Wraps the initialization code above and assigns ws
    }
}
Calling the API from Android via Ktor
Ktor is built with the assumption that you're using Coroutines, and managing your own scope/context. This makes it a more flexible tool, but adds some additional complexity.
The basic setup for the tool is going to look like this:
private suspend fun connect(ktor: HttpClient, u: Url) {
    ktor.wss(Get, u.host, u.port, u.encodedPath) {
        // Access to a WebSocket session
    }
}
private /* not suspend */ fun connect() {
    val url = Url("wss://azgu36n0vf.execute-api.us-east-1.amazonaws.com/Prod")
    val ktor = HttpClient(OkHttp) {
        install(WebSockets)
    }
    lifecycleScope.launch(Dispatchers.IO) {
        connect(ktor, url)
    }
}
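The host, port, and path arguments passed to wss are just the pieces of the endpoint URI. As a rough illustration, here's how they break down using the JDK's java.net.URI rather than Ktor's Url, so the snippet runs without any dependencies. Endpoint and parseEndpoint are hypothetical names, and the fallback ports assume the standard defaults for the ws and wss schemes:

```kotlin
import java.net.URI

// Hypothetical holder for the three pieces the connect call needs.
data class Endpoint(val host: String, val port: Int, val path: String)

// Hypothetical helper: splits a WebSocket URI into host, port, and path.
fun parseEndpoint(uri: String): Endpoint {
    val parsed = URI(uri)
    val port = when {
        parsed.port != -1 -> parsed.port   // java.net.URI reports -1 when no port is given
        parsed.scheme == "wss" -> 443      // WebSocket over TLS
        else -> 80                         // plain ws
    }
    return Endpoint(parsed.host, port, parsed.path)
}

fun main() {
    println(parseEndpoint("wss://azgu36n0vf.execute-api.us-east-1.amazonaws.com/Prod"))
}
```

Note that the stage name (/Prod here) is part of the path, so it has to be passed along or the connection will target the wrong route.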
Inside the trailing closure, you have access to an instance of DefaultClientWebSocketSession. It has two important members:
- A ReceiveChannel named incoming, and
- A SendChannel named outgoing.
SendChannel and ReceiveChannel are the inlet and outlet to a Kotlin Channel, which is basically like a suspending queue.
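If channels are new to you, here's a minimal sketch of the "suspending queue" idea using kotlinx.coroutines directly, with no WebSocket involved. The relay function is a hypothetical name for illustration. One coroutine feeds the SendChannel side while the caller drains the ReceiveChannel side:

```kotlin
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical demo: pushes messages through a channel and collects them on the other end.
fun relay(messages: List<String>): List<String> = runBlocking {
    // Default (rendezvous) channel: send() suspends until a receiver takes the element.
    val channel = Channel<String>()
    launch {
        for (m in messages) channel.send(m)
        channel.close() // signals the receiver that no more elements are coming
    }
    val received = mutableListOf<String>()
    for (m in channel) received += m // suspends while empty, ends once the channel is closed
    received
}

fun main() {
    println(relay(listOf("a", "b", "c")))
}
```

In the Ktor session, outgoing plays the role of the sending side and incoming the receiving side, except that the other end of each is the network.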
It's pretty trivial to send and receive some simple messages inside the WebSocket session closure:
ktor.wss(Get, u.host, u.port, u.encodedPath) {
    // Send a message outbound
    val json = JSONObject()
        .put("action", "sendmessage")
        .put("data", "Hello from Android!")
        .toString()
    outgoing.send(Frame.Text(json))
    // Receive an inbound message
    val frame = incoming.receive()
    if (frame is Frame.Text) {
        ui.status.append(frame.readText())
    }
}
However, we're missing a bunch of functionality that we had in our OkHttp solution. Namely:
- We want to send and receive data at the same time, in a loop, and
- We want to get notifications when the connection opens and closes.
Parallel send and receive in Ktor
Our goal here is ultimately to dispatch a stream of messages to the outgoing channel, and to consume a stream of messages from the incoming channel, at the same time. The basic approach we'll use is to launch two bits of work asynchronously, and then wait for both of them to finish:
ktor.wss(Get, u.host, u.port, u.encodedPath) {
    awaitAll(async {
        // Code that will send messages
    }, async {
        // Code that will receive messages
    })
}
We need some stream of events to trigger the sends. Adding a button element to our UI makes sense. But by the time we dispatch events over the WebSocket, we'd like the button clicks to be modeled as a stream of commands.
Let's first create an extension function to map button clicks into a Flow<Unit> (credit to StackOverflow):
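private fun View.clicks(): Flow<Unit> = callbackFlow {
    // trySend replaces offer, which is deprecated in newer kotlinx.coroutines releases
    setOnClickListener { trySend(Unit) }
    // Unregister the listener when the collector goes away
    awaitClose { setOnClickListener(null) }
}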
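To see what callbackFlow is doing without an Android device, here's a rough sketch that swaps the View for a stand-in class. FakeButton, hasListener, and countClicks are all hypothetical names invented for this illustration; only callbackFlow, trySend, awaitClose, take, and toList come from kotlinx.coroutines:

```kotlin
import kotlinx.coroutines.async
import kotlinx.coroutines.channels.awaitClose
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.callbackFlow
import kotlinx.coroutines.flow.take
import kotlinx.coroutines.flow.toList
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.yield

// Hypothetical stand-in for an Android View with a single click listener.
class FakeButton {
    private var listener: (() -> Unit)? = null
    val hasListener: Boolean get() = listener != null
    fun setOnClickListener(l: (() -> Unit)?) { listener = l }
    fun click() { listener?.invoke() }
}

// Same shape as the View extension: bridge a callback into a Flow.
fun FakeButton.clicks(): Flow<Unit> = callbackFlow {
    setOnClickListener { trySend(Unit) }     // each click becomes one element in the flow
    awaitClose { setOnClickListener(null) }  // runs when the collector stops collecting
}

// Hypothetical driver: clicks the button `times` times (must be positive) and counts emissions.
fun countClicks(times: Int): Int = runBlocking {
    val button = FakeButton()
    val collected = async { button.clicks().take(times).toList() }
    // The listener is only registered once collection starts, so wait for it.
    while (!button.hasListener) yield()
    repeat(times) { button.click() }
    collected.await().size
}

fun main() {
    println(countClicks(3))
}
```

The detail worth noticing is that nothing is registered until the flow is collected, and the listener is removed again as soon as collection ends.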
Now, we can listen to a flow of events from a button, map them into the format we need, and send them:
ui.button.clicks()
    .map { click -> "Hello from Android!" }
    .map { message -> JSONObject()
        .put("action", "sendmessage")
        .put("data", message)
        .toString()
    }
    .map { json -> Frame.Text(json) }
    .collect { outgoing.send(it) }
That will work well for the outgoing events. Now, we just need to respond to inbound events, in the second async block.
In this case, there isn't a lot of value to mapping the result. The value we receive over the socket is the contents associated with the "data" key in the messages. So for example, we might get "Hello from Android!":
incoming.consumeEach { frame ->
    if (frame is Frame.Text) {
        ui.status.append(frame.readText())
    }
}
When it's all said and done, we end up with something like this:
ktor.wss(Get, u.host, u.port, u.encodedPath) {
    awaitAll(async {
        // Sender: one frame per button click
        ui.button.clicks()
            .map { click -> JSONObject()
                .put("action", "sendmessage")
                .put("data", "Hello from Android!")
                .toString()
            }
            .map { json -> Frame.Text(json) }
            .collect { outgoing.send(it) }
    }, async {
        // Receiver: append every text frame to the status view
        incoming.consumeEach { frame ->
            if (frame is Frame.Text) {
                ui.status.append(frame.readText())
            }
        }
    })
}
Ktor's lifecycle events
OkHttp allowed us to override callbacks on the WebSocketListener, to get notified of various lifecycle events:
val listener = object : WebSocketListener() {
    override fun onOpen(ws: WebSocket, res: Response) {
        // when WebSocket is first opened
    }
    override fun onClosed(ws: WebSocket, code: Int, reason: String) {
        // when WebSocket is closed
    }
}
Ktor doesn't work like that. They suggest some different approaches to recover those events.
The easiest one to recover is the event when the socket opens. It's just the first thing that happens inside of the wss session closure:
ktor.wss(Get, u.host, u.port, u.encodedPath) {
    ui.status.append("Connected to $u!")
}
To get more insight into the WebSocket termination, we can expand our processing in our receive block:
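incoming.consumeEach { frame ->
    when (frame) {
        is Frame.Text -> { /* as above */ }
        is Frame.Close -> {
            val reason = closeReason.await()!!.message
            ui.status.append("Closing: $reason")
        }
        // Frame is a sealed class, so newer Kotlin versions require the when to be exhaustive
        else -> { /* ignore binary, ping, and pong frames */ }
    }
}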
Catching failures is also fairly easy:
private fun connect() {
    val url = Url("wss://azgu36n0vf.execute-api.us-east-1.amazonaws.com/Prod")
    val client = HttpClient(OkHttp) {
        install(WebSockets)
    }
    lifecycleScope.launch(Dispatchers.IO) {
        try {
            connect(client, url)
        } catch (e: Throwable) {
            val message = "WebSocket failed: ${e.message}"
            ui.status.append(message)
        }
    }
}
Wrapping Up
Well, there you have it: a quick and dirty explanation of how to use OkHttp and Ktor to consume an Amazon API Gateway WebSockets API. 🥳