Gealber Morales

Posted on • Originally published at gealber.com

Challenge #18

This challenge has more code, but according to its description it is not a big deal. Here is the short description, which you can always find on the original website:

Now this is easy. Keep in mind, the code is 64-bit, because it uses 64-bit value(s). That is why I've omitted code fragments for 32-bit ARM and MIPS. So what does it do?

As always with large listings, I will avoid showing the whole code in this article, because a large block of assembly can cause brain damage. I bet there's a "legit" study about that... academia. Also, you can find the full code example on the original website.

Analysis

As always, let's break this code into blocks in order to understand it better. The first block I'm going to analyze is this one:

    ;; save the 5th and 6th arguments (r8, r9) on the stack,
    ;; then check that the string in rdi has length 36
    sub rsp, 24
    mov QWORD PTR [rsp], r8
    mov QWORD PTR [rsp+8], r9
    call    strlen
    cmp rax, 36
    mov edx, -1
    jne .L28 ;; EXIT PROGRAM

    mov r12, rbx ;; copy rdi string into r12
    xor ebp, ebp ;; ebp = 0
    jmp .L35

I added some annotations to the block. Here we have a call to strlen, the famous C function, to get the length of a string (assuming it's \0 terminated). If the string's length is not 36, we jump to tag .L28, where the function exits, returning -1. So -1 is the failure code.

Another thing to notice here is the presence of a loop between tags .L33 and .L42. What does this loop iterate over? What is the stop condition? It's easy to see that we are iterating over the string we just copied into the r12 register. Take a look at this snippet:

;; START OF A LOOP
;; ------------------>
.L33:
    ;; if char is NOT a hex digit, end program with -1
    movsx   edi, BYTE PTR [r12]
    call    isxdigit
    test    eax, eax
    je  .L37 ;; EXIT PROGRAM WITH -1

;; ... continuation other tags come between these two

.L42:
    ;; check if we reached the terminating '\0'
    cmp BYTE PTR [r12], 0
    jne .L33
    ;; <<<<<<<<<<<<<<<<<<<
    ;; END OF A LOOP
    ;; <<<<<<<<<<<<<<<<<<<

Notice that in tag .L42 we check for the \0 null terminator. Also, at the start of tag .L33 we load one character from the string pointed to by r12. This confirms that we are iterating over the string in r12.

Another point to notice here is the call to isxdigit, which checks whether the current character is a hexadecimal digit. If it is not, we also finish the program with -1. Given that this happens inside a loop, we can infer that each character in the string should be a valid hexadecimal digit; otherwise we exit with failure code -1.

We haven't analyzed everything yet, but we already have an idea: this program is checking whether all the characters in the string are hex. Cool, let's continue.

The loop

The body of the loop gives us some insights as well. For example, tag .L32 contains the loop increments: we have a counter in the ebp register, and we also advance the string pointer in r12, of course.

.L32:
    ;; increase counter and string pointer position
    add ebp, 1
    add r12, 1
    cmp ebp, 37
    je  .L34
.L35:
    cmp ebp, 8
    jne .L43
.L29:
    ;; if the char is a hyphen, continue
    cmp BYTE PTR [r12], 45 ;; 45 is ASCII '-'
    je  .L32
.L37:
    ;; if we get here the program will exit with -1
    mov edx, -1

Another operation here is the comparison of our counter in ebp with 37. Remember that our string's length must be 36, and that C strings end with a null terminator, so the string occupies 37 bytes in total (indices 0 to 36).
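To make the 36 versus 37 distinction concrete, here is a small C sketch. The sample string is just a 36-character string of my choosing, and the names sample, sample_length and sample_storage are hypothetical, not part of the challenge code.

```c
#include <stddef.h>
#include <string.h>

/* A 36-character sample string (illustrative, not from the challenge). */
static const char sample[] = "45ab0c9c-873f-422e-963c-27d13c3fdac9";

/* Number of visible characters, which is what strlen reports. */
size_t sample_length(void)
{
    return strlen(sample);
}

/* Number of bytes the array occupies, including the trailing '\0'. */
size_t sample_storage(void)
{
    return sizeof sample;
}
```

So strlen sees 36 characters, while the valid indices run from 0 to 36, with the terminator sitting at index 36.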

We can also see that tag .L29 checks for the '-' character and, if it finds it, skips it by jumping straight to the increment step. If the character is anything else, execution falls through to tag .L37 and we return -1.

Now there's something interesting here as well: in tag .L35 we compare our counter with 8. If it is equal, we fall through to the '-' check in tag .L29; if it's not, we jump to tag .L43, which performs several more checks on the counter in ebp. These checks can be summarized this way: if the counter is 13, 18 or 23, check for '-' in tag .L29; if the counter is 36, the loop ends. The default case is just to read another character from our string in tag .L33.
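The checks described so far can be sketched in C roughly as follows. This is my own reading of the control flow, not the original source: the name check_layout is hypothetical, and I cast to unsigned char before calling isxdigit, which is the safe way to call it in C even though the assembly sign-extends the byte.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: returns 0 if the layout matches, -1 otherwise. */
int check_layout(const char *s)
{
    /* Same early exit as the strlen / cmp rax, 36 pair. */
    if (strlen(s) != 36)
        return -1;

    for (int i = 0; i < 36; i++) {
        if (i == 8 || i == 13 || i == 18 || i == 23) {
            /* Positions that must hold '-' (the .L29 check). */
            if (s[i] != '-')
                return -1;
        } else if (!isxdigit((unsigned char)s[i])) {
            /* Every other position must be a hex digit (the .L33 check). */
            return -1;
        }
    }
    return 0;
}
```

Since the length was already verified to be 36, the character at index 36 is necessarily the terminator, so this sketch simply stops the loop there instead of testing for it.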

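The first question that comes to mind is: why 13, 18 and 23? That question is easy to answer when you take a look at tag .L34. I'll add it here as well, with some comments. In this part of the code we use strtoul (and strtoull for the last, longest chunk) to convert chunks of the string from their hexadecimal representation to unsigned integers in C. Each conversion stops at the first character that is not a hex digit, which is exactly the '-' at positions 8, 13, 18 and 23, or the final \0. So the string is divided into 5 chunks: characters 0 to 7, 9 to 12, 14 to 17, 19 to 22 and 24 to 35. The results of these conversions are stored through the pointers held in r15, r14 and r13, and through two pointers reloaded from the stack into rcx (the r8 and r9 values saved at the start).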

;; ANALYZING ON 5 STEPS
.L34:
    ;; rbx from 0 to 36
    ;; ------------------------------------------------------------------------------------------------------------------|
    ;;          8                   4                       4                       4                       12           |
    ;; rbx --- rbx + 8| rbx + 9 ---- rbx + 13 | rbx + 14 ---- rbx + 18 | rbx + 19 ---- rbx + 23 | rbx + 24 ---- rbx + 36 |
    ;; ------------------------------------------------------------------------------------------------------------------|

    ;; convert the first chunk of the initial string (pointer in rbx) into a number
    mov edx, 16 ;; 3rd argument, base of number to convert
    xor esi, esi ;; 2nd argument, endptr passed as NULL: we don't need the address of the first unconverted char
    mov rdi, rbx ;; 1st argument
    call    strtoul

    lea rdi, [rbx+9] ;; 1st arg
    mov edx, 16 ;; 3rd arg
    xor esi, esi ;;  2nd arg
    mov DWORD PTR [r15], eax ;; value of number just converted on previous call
    call    strtoul

    lea rdi, [rbx+14] ;; 1st arg
    mov edx, 16 ;; 3rd arg
    xor esi, esi ;; 2nd arg
    mov WORD PTR [r14], ax ;; value of previous
    call    strtoul

    lea rdi, [rbx+19] ;; 1st arg
    mov edx, 16 ;; 3rd arg
    xor esi, esi ;; 2nd arg
    mov WORD PTR [r13+0], ax ;; value of previous call
    call    strtoul

    mov rcx, QWORD PTR [rsp]
    lea rdi, [rbx+24] ;; 1st arg
    mov edx, 16 ;; 3rd arg
    xor esi, esi ;; 2nd arg
    mov WORD PTR [rcx], ax ;; previous call returned
    call    strtoull

    mov rcx, QWORD PTR [rsp+8]
    xor edx, edx ;; edx = 0, the success code (failure paths set it to -1)
    mov QWORD PTR [rcx], rax
    jmp .L28
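Here is a hedged C sketch of what this block seems to do. The struct uuid_parts and the function convert_parts are names I made up for illustration (the assembly writes through five separate pointers instead), and the field widths are taken from the DWORD, WORD and QWORD stores in the listing.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical container for the five converted chunks. */
struct uuid_parts {
    uint32_t a; /* chars 0..7,   stored as DWORD */
    uint16_t b; /* chars 9..12,  stored as WORD  */
    uint16_t c; /* chars 14..17, stored as WORD  */
    uint16_t d; /* chars 19..22, stored as WORD  */
    uint64_t e; /* chars 24..35, stored as QWORD */
};

/* Assumes s has already passed the length, hyphen and hex-digit checks. */
void convert_parts(const char *s, struct uuid_parts *out)
{
    /* Each call stops at the first non-hex character ('-' or '\0'). */
    out->a = (uint32_t)strtoul(s, NULL, 16);
    out->b = (uint16_t)strtoul(s + 9, NULL, 16);
    out->c = (uint16_t)strtoul(s + 14, NULL, 16);
    out->d = (uint16_t)strtoul(s + 19, NULL, 16);
    /* The last chunk has 12 hex digits (48 bits), hence strtoull. */
    out->e = (uint64_t)strtoull(s + 24, NULL, 16);
}
```

For the string 45ab0c9c-873f-422e-963c-27d13c3fdac9, the fields come out as 0x45ab0c9c, 0x873f, 0x422e, 0x963c and 0x27d13c3fdac9.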

Flow

Let's put this into a flow diagram; it will be easier to understand.

Flow Diagram

Looking at the flowchart, what we have is easier to understand. We are iterating over a string with this format: <8 hex digits>-<4 hex digits>-<4 hex digits>-<4 hex digits>-<12 hex digits>. Now we have it!!! This is the format of a UUID, for example 45ab0c9c-873f-422e-963c-27d13c3fdac9.

Formal description

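We are checking whether the provided string has UUID format. If it does, its five hexadecimal fields are converted with strtoul/strtoull and stored through the output pointers; otherwise -1 is returned.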

Conclusion

Quite interesting how things are implemented in assembly.