Many projects in the crypto space want to bring general-purpose computation onchain, and there is much to gain from such an endeavor: verifiability and provenance for everyone. AO set out to make this possible in a decentralized way. Yet such a system lives and dies by the nodes that make up its network; fewer nodes mean less computing power.
AR.IO gateways allow you to run an AO Compute Unit (CU) as a sidecar to reduce the latency of reading data from AO processes. This guide explains what CUs are and how to set one up for your gateway.
Prerequisites
To follow this guide, you need a functional AR.IO gateway.
The system should have at least 8GB of memory to execute the ArNS process; younger/smaller processes also work with 4GB.
Note: This guide assumes that you set up a gateway using my D_D Academy course, but if you used another method, you can copy the code for the .env and docker-compose files.
What is an AO CU?
The CU is a web service that executes AO processes, which are AO’s equivalent to smart contracts. So, the CU is AO’s workhorse. It takes messages and uses them as input for processes, which, in turn, compute outputs the CU can relay back to clients.
When a client sends a message directly to a CU via the dry-run endpoint, the CU computes a result and sends it back as a reply. It won’t store the message or the result on Arweave, which allows your CU to provide read access to AO processes without being part of the testnet.
How does a CU work?
CUs are responsible for executing messages with processes. In contrast, Messaging Units (MUs) use Scheduler Units (SUs) to write messages to Arweave. A dry run only involves CUs, so it’s a good example for this explanation. Figure 1 illustrates this process.
- The client sends a message with a process ID and data to the CU.
- The CU uses the process ID to check if it already has a local memory snapshot for the process.
- If the snapshot exists, the CU loads it.
- If the snapshot doesn’t exist, the CU loads the process module and all previous messages from Arweave via a gateway’s GraphQL endpoint and recreates the snapshot.
- The CU executes the process with the new message as input.
- The CU sends the result of that computation to the client.
Figure 1: AO CU and client interaction
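If it helps to see that flow as code, here is a heavily simplified sketch of the decision the CU makes per request. Every helper function below is hypothetical and only stands in for the steps listed above; this is not the real ao-cu implementation.

// Simplified sketch of the dry-run flow, not the actual ao-cu code.
async function handleDryRun(processId, message) {
  // Check whether a local memory snapshot for the process exists.
  let memory = await loadSnapshotFromCache(processId)

  if (!memory) {
    // No snapshot: load the module and all previous messages from Arweave
    // via a gateway's GraphQL endpoint and replay them to recreate it.
    const processModule = await fetchProcessModule(processId)
    const previousMessages = await fetchPreviousMessages(processId)
    memory = await replayMessages(processModule, previousMessages)
  }

  // Execute the process with the new message and return the result to the client.
  return evaluate(memory, message)
}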
Note: The gateway used by the CU to load the messages doesn’t have to be your own gateway. However, if you want to use your own gateway, ensure it indexes the messages of processes you want to execute.
What are the benefits of running a dedicated CU?
The main benefit right now is improved resilience. If a shared testnet CU goes down, your gateway can continue to resolve ArNS names with its dedicated CU.
Depending on your gateway's available computing resources, a dedicated CU can improve ArNS resolution performance, as a co-located CU has lower latency to your gateway, and you don’t share it with other gateways.
When AO goes mainnet, computations on CUs will incur gas fees, so a dedicated CU might become a way to subsidize those fees so that your users don’t have to pay to use your processes.
Adding a CU to your gateway deployment
The CU running alongside your gateway is available as a Docker container, so you can execute it just like you would with the other services in your gateway cluster.
You must add a new docker-compose.ao.yaml file to define the CU container and update your .env file with the new configuration values.
Note: We will go through this process with the deployment from my D_D Academy course, which uses AWS CodeDeploy for gateway upgrades. If this doesn’t apply to your deployment, you can skip to the “Updating the Docker Compose definitions” section.
Creating a new revision
First, create a new revision by copying your last revision in the revisions directory. The resulting directory should have a path similar to this:
/path/to/your/gateway/code/revisions/r18+cu
I’m using r18 as my base, but you can use any revision from 17 upwards.
Updating the Docker Compose definitions
Next, create a new file at r18+cu/source/docker-compose.ao.yaml with the following content:
services:
  ao-cu:
    image: ghcr.io/permaweb/ao-cu:latest
    restart: on-failure
    environment:
      NODE_CONFIG_ENV: ${NODE_CONFIG_ENV:-development}
      NODE_HEAPDUMP_OPTIONS: ${NODE_HEAPDUMP_OPTIONS:-nosignal}
      DEBUG: ${DEBUG:-*}
      WALLET: ${CU_WALLET:-}
      WALLET_FILE: ${CU_WALLET_FILE:-}
      ALLOW_PROCESSES: ${ALLOW_PROCESSES:-}
      ALLOW_OWNERS: ${ALLOW_OWNERS:-}
      PROCESS_CHECKPOINT_TRUSTED_OWNERS: ${PROCESS_CHECKPOINT_TRUSTED_OWNERS:-fcoN_xJeisVsPXA-trzVAuIiqO3ydLQxM-L4XbrQKzY}
      GATEWAY_URL: ${GATEWAY_URL:-http://envoy:3000}
      UPLOADER_URL: ${UPLOADER_URL:-http://envoy:3000/bundler}
      ARWEAVE_URL: ${ARWEAVE_URL:-}
      GRAPHQL_URL: ${GRAPHQL_URL:-}
      CHECKPOINT_GRAPHQL_URL: ${CHECKPOINT_GRAPHQL_URL:-}
      PORT: ${CU_PORT:-6363}
      ENABLE_METRICS_ENDPOINT: ${ENABLE_METRICS_ENDPOINT:-}
      DB_MODE: ${DB_MODE:-}
      DB_URL: ${DB_URL:-}
      PROCESS_WASM_MEMORY_MAX_LIMIT: ${PROCESS_WASM_MEMORY_MAX_LIMIT:-}
      PROCESS_WASM_COMPUTE_MAX_LIMIT: ${PROCESS_WASM_COMPUTE_MAX_LIMIT:-}
      PROCESS_WASM_SUPPORTED_FORMATS: ${PROCESS_WASM_SUPPORTED_FORMATS:-}
      PROCESS_WASM_SUPPORTED_EXTENSIONS: ${PROCESS_WASM_SUPPORTED_EXTENSIONS:-}
      WASM_EVALUATION_MAX_WORKERS: ${WASM_EVALUATION_MAX_WORKERS:-}
      WASM_BINARY_FILE_DIRECTORY: ${WASM_BINARY_FILE_DIRECTORY:-}
      WASM_MODULE_CACHE_MAX_SIZE: ${WASM_MODULE_CACHE_MAX_SIZE:-}
      WASM_INSTANCE_CACHE_MAX_SIZE: ${WASM_INSTANCE_CACHE_MAX_SIZE:-}
      PROCESS_CHECKPOINT_FILE_DIRECTORY: ${PROCESS_CHECKPOINT_FILE_DIRECTORY:-}
      PROCESS_MEMORY_CACHE_MAX_SIZE: ${PROCESS_MEMORY_CACHE_MAX_SIZE:-}
      PROCESS_MEMORY_CACHE_TTL: ${PROCESS_MEMORY_CACHE_TTL:-}
      PROCESS_MEMORY_CACHE_FILE_DIR: ${PROCESS_MEMORY_CACHE_FILE_DIR:-}
      PROCESS_MEMORY_CACHE_CHECKPOINT_INTERVAL: ${PROCESS_MEMORY_CACHE_CHECKPOINT_INTERVAL:-}
      PROCESS_CHECKPOINT_CREATION_THROTTLE: ${PROCESS_CHECKPOINT_CREATION_THROTTLE:-}
      DISABLE_PROCESS_CHECKPOINT_CREATION: ${DISABLE_PROCESS_CHECKPOINT_CREATION:-}
      EAGER_CHECKPOINT_ACCUMULATED_GAS_THRESHOLD: ${EAGER_CHECKPOINT_ACCUMULATED_GAS_THRESHOLD:-}
      MEM_MONITOR_INTERVAL: ${MEM_MONITOR_INTERVAL:-}
      BUSY_THRESHOLD: ${BUSY_THRESHOLD:-}
      RESTRICT_PROCESSES: ${RESTRICT_PROCESSES:-}
    ports:
      - ${CU_PORT:-6363}:${CU_PORT:-6363}
    networks:
      - ar-io-network
networks:
  ar-io-network:
    external: true
This file will download the CU image and start the CU container, which is reachable inside the Docker network at http://ao-cu:6363.
To tell the gateway about this, add the following lines to your .env file inside the r18+cu/source directory:
CU_WALLET='{"d": "YUbT...'
ARWEAVE_URL=https://arweave.net
GRAPHQL_URL=https://arweave.net/graphql
AO_CU_URL=http://ao-cu:6363
DISABLE_PROCESS_CHECKPOINT_CREATION=true
WASM_MODULE_CACHE_MAX_SIZE=5GiB
PROCESS_MEMORY_CACHE_MAX_SIZE=10GiB
Replace the CU_WALLET value with a dedicated Arweave wallet JWK (i.e., not your gateway or observer wallet).
You can keep using your values if you already have ARWEAVE_URL and GRAPHQL_URL set.
Note: If you use your own gateway as a source for the CU, ensure it unbundles and indexes TXs related to the AO processes you want to execute.
The following environment variables activate unbundling and indexing for AO message TXs:
ANS104_UNBUNDLE_FILTER='{ "or": [ { "isNestedBundle": true }, { "and": [ { "attributes": { "owner_address": "JNC6vBhjHY1EPwV3pEeNmrsgFMxH5d38_LHsZ7jful8" } }, { "tags": [{ "name": "Bundler-App-Name", "value": "AO" }] } ] } ] }'
ANS104_INDEX_FILTER='{ "always": true }'
DISABLE_PROCESS_CHECKPOINT_CREATION prevents your CU from trying to store checkpoints on Arweave. These checkpoints improve performance, but the wallet in your CU needs to pay for them, which can become costly for big snapshots.
The WASM_MODULE_CACHE_MAX_SIZE and PROCESS_MEMORY_CACHE_MAX_SIZE variables ensure the CU doesn’t clutter your disk.
After you update the Docker Compose definitions and configuration, move on to the CodeDeploy scripts.
Note: If your gateway’s resolver doesn’t respond within 500ms, the gateway will fall back to other resolvers. Also, the CU has to recreate the memory snapshot of the ArNS process when it resolves an ArNS name for the first time, which can take quite some time (some gateway operators reported more than 30 minutes).
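Once your CU is deployed (the following sections cover that), you can soften this first-hit delay by warming the CU up yourself, for example by issuing a dry run against the process your gateway uses to resolve ArNS names. This is only an idea sketched with aoconnect, not an official feature; the process ID below is a placeholder you would need to replace with the actual ArNS registry process ID, and the Action tag is illustrative.

import { connect } from "@permaweb/aoconnect"

// Point aoconnect at the freshly deployed CU (replace with your gateway domain).
const cu = connect({ CU_URL: "https://example.com/ao/cu" })

// Any read-only message forces the CU to build the process memory snapshot now,
// instead of when the first visitor resolves an ArNS name.
// "<arns-registry-process-id>" is a placeholder, not a real process ID.
await cu.dryrun({
  process: "<arns-registry-process-id>",
  tags: [{ name: "Action", value: "Info" }],
})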
Updating the CodeDeploy Scripts
CodeDeploy will execute a script at each stage of the deployment. You must update two scripts so the CU starts alongside your gateway.
First, update r18+cu/scripts/3-after-install with the following code:
#!/usr/bin/env bash
set -e

# Store the EC2 instance ID in the gateway's .env file.
export INSTANCE_ID=$(curl http://169.254.169.254/latest/meta-data/instance-id)
echo "INSTANCE_ID=$INSTANCE_ID" >> /opt/ar-io-node/.env

# Register the CU as a systemd service that runs the Docker Compose definition.
cat <<EOF > /etc/systemd/system/ar-io-ao-cu.service
[Unit]
Description=ar-io-ao-cu

[Service]
WorkingDirectory=/opt/ar-io-node
Restart=always
RestartSec=10s
ExecStart=/usr/bin/docker-compose -f /opt/ar-io-node/docker-compose.ao.yaml up
ExecStop=/usr/bin/docker-compose -f /opt/ar-io-node/docker-compose.ao.yaml down
TimeoutSec=60

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable ar-io-ao-cu
This script registers the CU as a systemd service, so the OS knows how to start and stop it.
Next, add the following line to the r18+cu/scripts/4-application-start script to start the CU after CodeDeploy uploads the new Docker Compose definitions and configuration files:
systemctl start ar-io-ao-cu
Deploying the revision
Now that you’ve set everything up, deploy the revision with the following command:
scripts/deploy-revision r18+cu
After the deployment is complete, check the CU using the following URL, replacing example.com with your gateway domain:
https://example.com/ao/cu
If everything went correctly, the response should look similar to this:
{
  "address": "your-cu-wallet-address",
  "timestamp": 1234567890
}
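If you prefer to script this check, for example as an extra validation step after a deployment, a small fetch against the same URL does the job. This is just a sketch; adjust the domain to your gateway.

// Minimal programmatic health check against the CU info endpoint shown above.
// Replace example.com with your gateway domain.
const response = await fetch("https://example.com/ao/cu")
if (!response.ok) throw new Error(`CU not reachable: ${response.status}`)

const info = await response.json()
console.log("CU wallet address:", info.address)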
Using the dedicated CU in your frontend
Now that your CU is running, you can use it with the aoconnect client. It uses a testnet CU by default, but you can configure it to use your new CU and then read data from AO processes with the dryrun, result, and results functions.
To create an aoconnect instance that uses your dedicated CU, use the following code:
import { connect } from "@permaweb/aoconnect"

const readClient = connect({
  CU_URL: "https://example.com/ao/cu",
})

const result = await readClient.dryrun({
  process: "a-process-id",
  data: "some-data",
  tags: [{ name: "Action", value: "A-Read-Only-Action" }],
})
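Beyond dry runs, the same client can fetch the outcomes of messages that were already written to Arweave. The IDs below are placeholders; the options passed to results follow aoconnect’s documented pagination parameters.

// Read the outcome of a single message that has already been processed.
const messageResult = await readClient.result({
  message: "a-message-id",
  process: "a-process-id",
})

// Page through the most recent results of a process.
const latestResults = await readClient.results({
  process: "a-process-id",
  sort: "DESC",
  limit: 25,
})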
Aoconnect will use the default MU to save messages to Arweave, so calls to message and spawn are unaffected by this change and will work as before.
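For completeness, a write through the same client could look like the following sketch. The process ID and Action tag are placeholders, and the signer assumes a browser wallet extension that injects arweaveWallet:

import { connect, createDataItemSigner } from "@permaweb/aoconnect"

// Reads go to your CU; writes still go through the default MU.
const client = connect({ CU_URL: "https://example.com/ao/cu" })

const messageId = await client.message({
  process: "a-process-id",
  signer: createDataItemSigner(globalThis.arweaveWallet), // injected by the wallet extension
  tags: [{ name: "Action", value: "A-Write-Action" }],
  data: "some-data",
})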
Note: If you get errors about a missing processId, check that your load balancer and CDN (e.g., CloudFront or Cloudflare) relay query parameters.
Summary
Thanks to the prebuilt Docker image, running a dedicated CU is easy. Just ensure you have enough memory, and be prepared to wait while big processes load for the first time.
With your own CU, you aren’t reliant on testnet CUs for reading data from AO anymore, and if you host it on a high-end machine, you might even get a solid performance improvement!