DEV Community

Cover image for Optimization of u128 decoding from base62
Roman
Roman

Posted on • Edited on

Optimization of u128 decoding from base62

In the previous article, we looked at u128 to base62 encoding, now we will implement and optimize the inverse operation of decoding u128 from base62, this can be useful, for example, for more compact storage in memory or database.

In order to understand which optimizations can be applied, let's start with a simple, naive implementation:

const BASE62_N: usize = 62;

pub fn base62_to_u128_naive(base62_str: &str) -> Option<u128> {
    let base62_str: [u8; 22] = base62_str.as_bytes().try_into().ok()?;
    let mut n: u128 = 0;
    for digit in &base62_str {
        let number = match digit {
            d if (b'0'..=b'9').contains(d) => d - 48,
            d if (b'A'..=b'Z').contains(d) => d - 55,
            d if (b'a'..=b'z').contains(d) => d - 61,
            _ => return None,
        };
        n = n
            .checked_mul(BASE62_N as u128)
            .and_then(|n| n.checked_add(number as u128))?;
    }
    Some(n)
}
Enter fullscreen mode Exit fullscreen mode

Let's analyze what is happening here, the function signature shows that we accept a string and get an optional u128 number as a result, since in case of incorrect data in the string we will not be able to decode it:

fn base62_to_u128_naive(base62_str: &str) -> Option<u128>
Enter fullscreen mode Exit fullscreen mode

All identifiers must be 22 characters long, confirm this condition:

let base62_str: [u8; 22] = base62_str.as_bytes().try_into().ok()?;
Enter fullscreen mode Exit fullscreen mode

Next, we calculate the value of the number in a loop from the digits of the base62 string:

for digit in &base62_str
Enter fullscreen mode Exit fullscreen mode

We need to convert each digit to decimal, for example "B" would be "11" in decimal, we do this in match, checking each of the possible intervals of allowed characters:

let number = match digit
Enter fullscreen mode Exit fullscreen mode

Now we calculate the number by multiplying by the base 62 and adding the next digit. A 22-character number in base 62 holds more numbers than a 128-bit number, so we need to check for overflow when converting:

n = n.checked_mul(BASE62_N as u128).and_then(|n| n.checked_add(number as u128))?;
Enter fullscreen mode Exit fullscreen mode

We got the result, returning:

Some(n)
Enter fullscreen mode Exit fullscreen mode

Let's look at the benchmark results, what the first version of the function is capable of (CPU i5-4690):

bench_base62_to_u128_naive 1000000: 0.121s 125.75Mib/s, 8240895.48/s
Enter fullscreen mode Exit fullscreen mode

Not bad, the speed is already a little bit faster than the speed of even the optimized encoder from the first article. All due to the fact that there are no expensive division operations during decoding, but we can do better!

Let's start with optimization. We will start by getting rid of match. Instead of checking the intervals and subtracting the offset, we can create an array with all possible variants, the byte representation of the character from the base62 string will be the offset (index), and the value is a digit in decimal:

const BASE62_MAP: [u8; 123] = [
    255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
    255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
    255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 255, 255, 255,
    255, 255, 255, 255, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28,
    29, 30, 31, 32, 33, 34, 35, 255, 255, 255, 255, 255, 255, 36, 37, 38, 39, 40, 41, 42, 43, 44,
    45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61,
];
Enter fullscreen mode Exit fullscreen mode

Now, for conversion, it is enough to take the value by index, for example, BASE62_MAP[b'A'] will be 10.

The next step is to deal with multiplication and addition in a loop. These overflow-checked operations (checked variants) require more instructions than normal unchecked operations. Since overflow is only possible at the last step, we can move its check outside the loop by converting only the first 21 characters in the loop, and the last one after the loop:

for digit in &base62_str[..21] {
        // ...
        n = n * BASE62_N as u128 + number as u128;
}
Enter fullscreen mode Exit fullscreen mode

Another problem here is that the multiplication itself (as well as addition) of 128-bit numbers is more expensive than, for example, multiplying 64-bit numbers, which is easy to see by looking at the generated assembler for both of these operations. In the previous article, we replaced 128-bit division with 64-bit division, but here we will do the same for multiplication and addition. To do this, we will split the original base62 number into several blocks along the digit boundaries, convert each block separately, then collect the entire number from these parts:

3322222222221111111111 -> 33 2222222222 1111111111
                          n3         n2         n1
n = n3 * 62^20 + n2 * 62^10 + n1
Enter fullscreen mode Exit fullscreen mode

So the final optimized version of the decoder, which we got, looks like this:

const BASE62_N: usize = 62;
const BASE62_POW_10: u128 = (BASE62_N as u128).pow(10);
const BASE62_POW_20: u128 = (BASE62_N as u128).pow(20);
const BASE62_MAP: [u8; 123] = [
    255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
    255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
    255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 255, 255, 255,
    255, 255, 255, 255, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28,
    29, 30, 31, 32, 33, 34, 35, 255, 255, 255, 255, 255, 255, 36, 37, 38, 39, 40, 41, 42, 43, 44,
    45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61,
];

pub fn base62_to_u128(base62_str: &str) -> Option<u128> {
    let base62_str: [u8; 22] = base62_str.as_bytes().try_into().ok()?;
    let mut n_section: u64 = 0;
    for digit in &base62_str[12..22] {
        let number = *BASE62_MAP.get(*digit as usize).filter(|it| **it <= 61)?;
        n_section = n_section * BASE62_N as u64 + number as u64;
    }
    let mut n = n_section as u128;
    n_section = 0;
    for digit in &base62_str[2..12] {
        let number = *BASE62_MAP.get(*digit as usize).filter(|it| **it <= 61)?;
        n_section = n_section * BASE62_N as u64 + number as u64;
    }
    n += n_section as u128 * BASE62_POW_10;
    n_section = 0;
    for digit in &base62_str[0..2] {
        let number = *BASE62_MAP.get(*digit as usize).filter(|it| **it <= 61)?;
        n_section = n_section * BASE62_N as u64 + number as u64;
    }
    n.checked_add((n_section as u128).checked_mul(BASE62_POW_20)?)
}
Enter fullscreen mode Exit fullscreen mode

Some clarifications for the code. Here we calculate a number based on blocks of digits (less significant digits at the end of a string):

for digit in &base62_str[12..22]
// ...
for digit in &base62_str[2..12]
// ...
for digit in &base62_str[0..2]
// ..
Enter fullscreen mode Exit fullscreen mode

This line converts a base 62 digit to a decimal number. At the same time, since the data may be incorrect, a check is added that digit is in the allowed range:

let number = *BASE62_MAP.get(*digit as usize).filter(|it| **it <= 61)?;
Enter fullscreen mode Exit fullscreen mode

We make sure that the top 2 digits do not cause overflow:

n.checked_add((n_section as u128).checked_mul(BASE62_POW_20)?)
Enter fullscreen mode Exit fullscreen mode

Let's check the benchmark, was it worth our efforts?

bench_base62_to_u128 10000000: 0.182s 836.24Mib/s, 54803747.40/s
Enter fullscreen mode Exit fullscreen mode

Wow that's impressive! I even had to increase the number of iterations by a factor of 10 to get stable measurements. As a result, the improvement compared to the original version is 6.65 times.

My links

I'm making my perfect todo, note-taking app Heaplist, you can try it here heaplist.app
And I have a twitter @rsk

Top comments (1)

Collapse
 
rsk profile image
Roman

that was fun to research on this topic