Azeez Lukman

How to access string values in Go

Go represents strings as read-only sequences of bytes under the hood. This means you can access the individual indexes of a string as you would for a slice variable.

A byte slice is a slice whose element type is byte. You can think of a byte slice as a list of bytes; in a string, those bytes usually represent the UTF-8 encoding of Unicode code points.

Strings are immutable and Unicode compliant, and string literals are UTF-8 encoded.
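To make those two properties concrete, here is a minimal sketch. The helper names byteAndRuneCount and replaceFirstByte are made up for this illustration; they show that len counts bytes rather than characters, and that "changing" a string really means building a new one from a copy.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneCount (hypothetical helper) returns the number of bytes
// and the number of Unicode code points in s.
func byteAndRuneCount(s string) (int, int) {
	return len(s), utf8.RuneCountInString(s)
}

// replaceFirstByte (hypothetical helper) returns a copy of s whose first
// byte is b. Strings are immutable, so the change is made on a []byte copy;
// writing s[0] = b directly would not compile.
func replaceFirstByte(s string, b byte) string {
	if len(s) == 0 {
		return s
	}
	buf := []byte(s) // copies the bytes of the string
	buf[0] = b
	return string(buf)
}

func main() {
	bytes, runes := byteAndRuneCount("Señor")
	fmt.Println(bytes, runes) // 6 5: the ñ takes two bytes

	original := "Hello"
	changed := replaceFirstByte(original, 'J')
	fmt.Println(original, changed) // Hello Jello: the original is untouched
}
```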

Accessing the individual bytes of a string

I mentioned above that a string behaves like a slice of bytes, so we can access every individual byte in a string:

package main

import (
    "fmt"
)

// printBytes prints every byte of s in hexadecimal.
func printBytes(s string) {
    fmt.Printf("Bytes: ")
    for i := 0; i < len(s); i++ {
        fmt.Printf("%x ", s[i])
    }
}

func main() {
    // Named greeting rather than string, to avoid shadowing the built-in type.
    greeting := "Hello String"
    printBytes(greeting)
}

This outputs:

Bytes: 48 65 6c 6c 6f 20 53 74 72 69 6e 67

We print the bytes of the string 'Hello String' by looping from 0 up to len(s). The built-in len function returns the number of bytes in the string, and s[i] gives the byte at each index. The bytes are printed in hexadecimal using the %x format verb.
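As a side note, the fmt package can produce the same hexadecimal dump without an explicit loop. The sketch below uses a hypothetical helper, hexBytes, built on the "% x" verb (the space flag puts a space between bytes when formatting a string in hex).

```go
package main

import "fmt"

// hexBytes (hypothetical helper) formats every byte of s as two hex digits,
// separated by spaces, using the space flag of the %x verb.
func hexBytes(s string) string {
	return fmt.Sprintf("% x", s)
}

func main() {
	fmt.Println("Bytes:", hexBytes("Hello String"))
	// Bytes: 48 65 6c 6c 6f 20 53 74 72 69 6e 67
}
```

The explicit loop is still the better teaching tool, since it shows that s[i] is a byte; the verb is just a convenience once that is understood.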

Accessing individual characters of a string

Let's modify the above program a little bit to print the characters of the string.

package main

import (
    "fmt"
)

func printBytes(s string) {
    fmt.Printf("Bytes: ")
    for i := 0; i < len(s); i++ {
        fmt.Printf("%x ", s[i])
    }
}

func printChars(s string) {
    fmt.Printf("Characters: ")
    for i := 0; i < len(s); i++ {
        fmt.Printf("%c ", s[i])
    }
}

func main() {
    name := "Hello World"
    fmt.Printf("String: %s\n", name)
    printChars(name)
    fmt.Printf("\n")
    printBytes(name)
}

This outputs:

String: Hello World
Characters: H e l l o   W o r l d
Bytes: 48 65 6c 6c 6f 20 57 6f 72 6c 64

The logic remains the same as above, but this time printChars uses the %c format verb, which prints the character for each value instead of its hexadecimal form.

In UTF-8 a code point can occupy more than one byte, so this way of accessing characters is unreliable: it assumes that every code point occupies exactly one byte. A better approach is to use runes, but first let's see how the byte-by-byte version fails on a non-ASCII string:

package main

import (
    "fmt"
)

func printBytes(s string) {
    fmt.Printf("Bytes: ")
    for i := 0; i < len(s); i++ {
        fmt.Printf("%x ", s[i])
    }
}

func printChars(s string) {
    fmt.Printf("Characters: ")
    for i := 0; i < len(s); i++ {
        fmt.Printf("%c ", s[i])
    }
}

func main() {
    testString := "Señor"
    fmt.Printf("String: %s\n", testString)
    printChars(testString)
    fmt.Printf("\n")
    printBytes(testString)
}

This outputs:

String: Señor
Characters: S e Ã ± o r
Bytes: 53 65 c3 b1 6f 72

Notice that the output is wrong: the program prints Ã ± instead of ñ. The Unicode code point of ñ is U+00F1, and its UTF-8 encoding occupies two bytes, c3 and b1. Our loop prints each byte as if it were a separate character, which is only correct when every code point is one byte long.
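To see how many bytes each character really takes, one option is to decode the string one code point at a time with utf8.DecodeRuneInString from the standard library. The helper name runeWidths below is an assumption for this sketch, not something from the program above.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeWidths (hypothetical helper) returns the number of bytes that each
// code point of s occupies in its UTF-8 encoding.
func runeWidths(s string) []int {
	var widths []int
	for i := 0; i < len(s); {
		// DecodeRuneInString decodes the first code point of s[i:]
		// and reports how many bytes it consumed.
		_, size := utf8.DecodeRuneInString(s[i:])
		widths = append(widths, size)
		i += size
	}
	return widths
}

func main() {
	fmt.Println(runeWidths("Señor")) // [1 1 2 1 1]
}
```

The 2 in the third position is the ñ, which is exactly where the byte-by-byte loop printed two wrong characters.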

Rune

A rune represents a single Unicode code point. It is a built-in type in Go, an alias for int32, so a rune is a 32-bit integer whose value is the code point.
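A quick way to convince yourself that a rune is just a number is to print one in several forms. This is only a sketch; describeRune is a hypothetical helper, while utf8.RuneLen and the %c, %U verbs come from the standard library.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describeRune (hypothetical helper) shows a rune as a character, as a
// Unicode code point, and with the number of bytes its UTF-8 encoding needs.
func describeRune(r rune) string {
	return fmt.Sprintf("%c %U %d", r, r, utf8.RuneLen(r))
}

func main() {
	var r rune = 'ñ' // a rune literal is written in single quotes
	fmt.Println(describeRune(r))   // ñ U+00F1 2
	fmt.Println(describeRune('S')) // S U+0053 1
	fmt.Println(r == 0xF1)         // true: the rune is the integer 241
}
```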

package main

import (
    "fmt"
)

func printBytes(s string) {
    fmt.Printf("Bytes: ")
    for i := 0; i < len(s); i++ {
        fmt.Printf("%x ", s[i])
    }
}

func printChars(s string) {
    fmt.Printf("Characters: ")
    runes := []rune(s)
    for i := 0; i < len(runes); i++ {
        fmt.Printf("%c ", runes[i])
    }
}

func main() {
    testString := "Señor"
    fmt.Printf("String: %s\n", testString)
    printChars(testString)
    fmt.Printf("\n")
    printBytes(testString)
}

This outputs:

String: Señor
Characters: S e ñ o r
Bytes: 53 65 c3 b1 6f 72

In this example, the string is converted to a slice of runes using []rune. We then loop over it and display the characters. This works because each rune holds a whole code point, no matter how many bytes its UTF-8 encoding occupies.

Accessing specific characters in a string

Now we have seen how to access all the characters of a string. Let's see how we can access individual indexes of the string. Because a string can be indexed like a slice of bytes, we can read the value at a specific index directly, as we would for a slice or an array, without looping through the string or converting it to a slice of runes. Keep in mind that an index selects a byte, so this gives whole characters only for single-byte (ASCII) text.

package main

import (
    "fmt"
)

func main() {
    testString := "Hello String"
    fmt.Println(testString[2])
    fmt.Println(testString[1])
    fmt.Println(testString[4])
}

This outputs:

108
101
111

This prints the byte value at each specified index. For ASCII characters like these, the byte value is the same as the Unicode code point.

Trying to access an index that is negative, or equal to or larger than your string's length, causes a run-time panic with an index out of range error, since the index falls outside the available range of your declared string.
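If the index comes from user input or a calculation, it can be worth checking it before indexing. The following is a minimal sketch of one way to do that; byteAt is a hypothetical helper, not a standard library function.

```go
package main

import "fmt"

// byteAt (hypothetical helper) returns the byte at index i of s.
// The second result is false when i is outside the valid range,
// in which case indexing s[i] directly would panic.
func byteAt(s string, i int) (byte, bool) {
	if i < 0 || i >= len(s) {
		return 0, false
	}
	return s[i], true
}

func main() {
	testString := "Hello String"

	if b, ok := byteAt(testString, 4); ok {
		fmt.Println(b) // 111
	}

	if _, ok := byteAt(testString, 50); !ok {
		fmt.Println("index 50 is out of range")
	}
}
```

Returning a boolean alongside the value follows the same "comma ok" shape Go uses for map lookups, so callers can decide how to handle a bad index instead of crashing.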

That was swift: all we did was declare the string and specify the index we would like to access. This is not quite what we want, though; we still need to access the actual character and not its numeric value.

To get the character, we convert the value to a string using the built-in conversion string():

package main

import (
    "fmt"
)

func main() {
    testString := "Hello String"
    fmt.Println(string(testString[2]))
    fmt.Println(string(testString[1]))
    fmt.Println(string(testString[4]))
}

This outputs:

l
e
o
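This byte-based conversion is fine for ASCII text, but it runs into the same ñ problem we saw earlier. As a sketch of one alternative, assuming you want the n-th character rather than the n-th byte, you can index into a rune slice instead. The helper name charAt is made up for this example.

```go
package main

import "fmt"

// charAt (hypothetical helper) returns the n-th character of s, counting
// code points rather than bytes. The second result is false when n is out
// of range.
func charAt(s string, n int) (string, bool) {
	runes := []rune(s) // decodes the whole string into code points
	if n < 0 || n >= len(runes) {
		return "", false
	}
	return string(runes[n]), true
}

func main() {
	testString := "Señor"

	// Indexing the string selects a single byte of the two-byte ñ.
	fmt.Println(string(testString[2])) // Ã

	// Indexing the rune slice selects the whole character.
	if c, ok := charAt(testString, 2); ok {
		fmt.Println(c) // ñ
	}
}
```

Converting to []rune copies and decodes the entire string, so for long strings or repeated lookups it may be better to convert once and reuse the slice.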

A simple program to check if a string begins with a lower case letter or an upper case letter

Using our knowledge of accessing string values, we are going to write a small Go program that reports whether a string begins with a lower-case or an upper-case letter.

Declare the package and a function that checks whether the string has a lower-case letter at the beginning.

There is no need to perform any checks if the parameter is an empty string, so the function checks for that first and returns false if it is empty.

Next is the actual work. Bytes are numbers, so they can be compared with character literals using the ordinary comparison operators. Here we check whether the first byte of the string parameter falls within the range of lower-case ASCII letters, 'a' to 'z'.

package main

// fmt is used by the main function added at the end of this section.
import "fmt"

// startsWithLowerCase reports whether the string has a lower-case letter at the beginning.
func startsWithLowerCase(str string) bool {
    if len(str) == 0 {
        return false
    }
    c := str[0]
    return 'a' <= c && c <= 'z'
}

The startsWithUpperCase function also compares the first byte of the string parameter against a range, but this time it is the range of capital letters, 'A' to 'Z'. Add this function to your program:

// startsWithUpperCase reports whether the string has an upper-case letter at the beginning.
func startsWithUpperCase(str string) bool {
    if len(str) == 0 {
        return false
    }
    c := str[0]
    return 'A' <= c && c <= 'Z'
}

It's time to wrap up and test our program. Declare the main function; inside it, declare your test string and call both functions, passing testString as the argument. We want to report our results properly, so we use fmt.Printf to format the report and print it to the console.

func main() {
    testString := "Hello String"
    fmt.Printf("'%s' begins with upper-case letter? %t \n",
        testString,
        startsWithUpperCase(testString))
    fmt.Printf("'%s' begins with lower-case letter? %t \n",
        testString,
        startsWithLowerCase(testString))
}

This outputs:

'Hello String' begins with upper-case letter? true
'Hello String' begins with lower-case letter? false

Cooool, right? You have just created an enterprise-grade program. Yes, startsWithLowerCase is the same logic used in Go's time package to avoid matching strings like "Month" when looking for "Mon".
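The byte comparison only recognises the ASCII letters a to z and A to Z, which is all the time package needs. If your strings may begin with letters such as ñ or Ñ, one possible variation is to decode the first rune and ask the unicode package about it. The function names below are hypothetical, chosen to sit next to the ones above.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// startsWithLowerRune (hypothetical variant) reports whether the first
// code point of s is a lower-case letter in any script.
func startsWithLowerRune(s string) bool {
	// For an empty string DecodeRuneInString returns utf8.RuneError,
	// which is not a letter, so the result is false.
	r, _ := utf8.DecodeRuneInString(s)
	return unicode.IsLower(r)
}

// startsWithUpperRune (hypothetical variant) reports whether the first
// code point of s is an upper-case letter in any script.
func startsWithUpperRune(s string) bool {
	r, _ := utf8.DecodeRuneInString(s)
	return unicode.IsUpper(r)
}

func main() {
	for _, s := range []string{"Hello String", "ñandú", "Ñandú", ""} {
		fmt.Printf("%q lower? %t upper? %t\n",
			s, startsWithLowerRune(s), startsWithUpperRune(s))
	}
}
```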

Conclusion

With this deep dive on accessing the values held in Go strings, you're ready to take over the world. But before that, remember there's only one way to learn to develop Go programs: write a lot of code. Keep coding, and taking over the world is only a matter of time.

Thank you for reading. I'm Azeez Lukman, and this is a developer's journey of building something awesome every day. Let's meet on Twitter, LinkedIn, GitHub and anywhere else @robogeeek95

Top comments (4)

Andrew Pillar • Edited

In this example, the string is converted to a slice of runes using []rune. We then loop over it and display the characters.

Iterating over a string will normally give you the runes within that string, so the conversion wouldn't be necessary, see this playground snippet as an example play.golang.org/p/LkdB8zO4Cu_d.

Also from Effective Go,

For strings, the range does more work for you, breaking out individual Unicode code points by parsing the UTF-8.

Azeez Lukman

You're right Andrew. Thanks for pointing that out.

In addition to that, there's an even better way to access runes in a string:

package main

import "fmt"

func main() {
    testString := "Señor"
    // range decodes the UTF-8 and yields each rune with its starting byte index.
    for index, r := range testString {
        fmt.Printf("rune '%c' at index  %d\n", r, index)
    }
}

which gives:

rune 'S' at index  0
rune 'e' at index  1
rune 'ñ' at index  2
rune 'o' at index  4
rune 'r' at index  5
 
Andrew Pillar

Yeah, %c would format it as a char, so you can avoid casting it to a string like I did with my fmt.Println call.

Kishan B

Very cool #TIL