10.1. String Basics

Strings are sequences of characters, and they are used in a wide range of programming applications. Julia provides extensive functionality for working with strings and characters, including full support for Unicode. However, Julia also works efficiently with plain ASCII characters and strings, which we will focus on here to keep the presentation shorter.

10.1.1. Characters

ASCII is a 7-bit character set containing 128 characters. It contains the digits 0 to 9, the upper- and lower-case English letters from A to Z, and some special characters.

The numbers 0 to 31, and also 127, are used for so-called control characters, for example the line feed (number 10) and the carriage return (number 13). The numbers 32 to 126 define the standard printable characters (the first character below, corresponding to number 32, is the space character):

 !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~

Julia defines the type Char for representing characters, and you can create one using single quotes:

c = 'Q'
'Q': ASCII/Unicode U+0051 (category Lu: Letter, uppercase)

You can also define the character directly using its number:

c_and = Char(38)
'&': ASCII/Unicode U+0026 (category Po: Punctuation, other)

and you can find the number of a Char by converting to Int:

c_at = Int('@')
64
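
As a small illustration of these two conversions, the following sketch (the variable names are chosen only for this example) checks that Char and Int undo each other, and rebuilds the line of standard characters shown earlier from the numbers 32 to 126:

```julia
# Char and Int convert back and forth between a character and its number
c = 'Q'
n = Int(c)                  # 81, that is 0x51
c_back = Char(n)            # 'Q' again

# Build the printable ASCII characters (numbers 32 to 126) as one string
printable = join(Char(i) for i in 32:126)
println(printable)
println(length(printable))  # 95 characters
```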

The control characters can be created using a backslash notation, for example the newline (line feed) character:

c_LF = '\n'
'\n': ASCII/Unicode U+000A (category Cc: Other, control)
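
Other control characters have similar escapes. The short sketch below (variable names are illustrative only) prints the numbers behind three common ones, and shows that the line feed '\n' and the carriage return '\r' are distinct characters:

```julia
# Common control characters and their ASCII numbers
newline = '\n'    # line feed
cr      = '\r'    # carriage return
tab     = '\t'    # horizontal tab

for c in (newline, cr, tab)
    # repr shows the character in its escaped form instead of printing it
    println(repr(c), " has number ", Int(c))
end
```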

You can also do comparisons and a limited amount of arithmetic with Char values (from Julia documentation):

julia> 'A' < 'a'
true

julia> 'A' <= 'a' <= 'Z'
false

julia> 'A' <= 'X' <= 'Z'
true

julia> 'x' - 'a'
23
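
The same arithmetic can be combined into small utilities. The sketch below is only an illustration: to_upper_ascii is a hypothetical helper written for this example (Julia already provides the built-in uppercase function for real use):

```julia
# Adding an integer to a Char gives another Char
next_letter = 'a' + 1                 # 'b'

# Subtracting two Chars gives the integer distance between them
position = 'x' - 'a' + 1              # 24: 'x' is the 24th letter

# Hypothetical helper: convert a lower-case ASCII letter to upper case,
# leaving every other character unchanged
to_upper_ascii(c) = 'a' <= c <= 'z' ? c - 'a' + 'A' : c

println(next_letter)
println(position)
println(to_upper_ascii('q'))          # Q
println(to_upper_ascii('!'))          # !
```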

10.1.2. String creation

A string can be created from a sequence of characters using double quotes:

str1 = "Hello world!\nJulia is fun\n"
"Hello world!\nJulia is fun\n"

Note that the string is displayed using the backslash syntax for the control characters, but when you print it they are interpreted as intended (in this case, each \n starts a new line):

print(str1)
Hello world!
Julia is fun

Since the double quote and the backslash characters have special meanings, as well as the dollar character, you need to put an extra backslash in front of them if used in a string:

str2 = "I \"have\" \$50, A\\b\n"
print(str2)
I "have" $50, A\b
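
Each backslash pair in the literal produces a single character in the resulting string. A minimal sketch to confirm this, using a string made up for this example:

```julia
str = "A\\b \$5 \"x\""
println(str)            # A\b $5 "x"
println(length(str))    # 10: each escaped pair counts as one character
println(str[2])         # \
```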

Strings can also be created using triple quotes, which is convenient for multiple lines. In this case, the double quote character does not need the extra backslash:

str3 = """
    This is some multi-line text.
    A newline is inserted for each line break.
    Double quotes can be used as-is: "
    But backslash and dollar still need extra backslashes: \\, \$
    Indentation common to all lines, including the final triple-quote line, is removed.
"""
print(str3)
    This is some multi-line text.
    A newline is inserted for each line break.
    Double quotes can be used as-is: "
    But backslash and dollar still need extra backslashes: \, $
    Indentation common to all lines, including the final triple-quote line, is removed.
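
To see how the closing triple-quote affects indentation, the following sketch compares two literals that differ only in where the closing delimiter is placed. The names kept and stripped are chosen for this example, and the code is assumed to be written at the top level, without extra indentation around it:

```julia
# Closing triple-quote at the left margin: the 4-space indentation is kept
kept = """
    first
      second
"""

# Closing triple-quote indented by 4 spaces: 4 spaces are removed from each line
stripped = """
    first
      second
    """

println(repr(kept))      # "    first\n      second\n"
println(repr(stripped))  # "first\n  second\n"
```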

10.1.3. String concatenation

You can concatenate multiple strings by passing them to the string function. An alternative syntax is to use the multiplication * operator between the strings:

str4a = "Hello"
str4b = "World"
str4ab = string(str4a, " ", str4b, "\n")  # Concatenation
str5ab = str4a * " " * str4b * "\n"       # Same thing

print(str4ab)
print(str5ab)
Hello World
Hello World
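
One practical difference between the two forms is that string converts non-string arguments for you, while * accepts only strings and characters. A brief sketch (the values are arbitrary), which also shows join for concatenating a collection of strings with a separator:

```julia
n = 3
s1 = string("n = ", n, ", half = ", n / 2)   # non-string arguments are converted
println(s1)                                   # n = 3, half = 1.5

# The * operator only accepts strings and characters, so convert explicitly
s2 = "n = " * string(n)
println(s2)                                   # n = 3

# join concatenates a collection of strings with a separator
words = ["Hello", "World"]
s3 = join(words, " ")
println(s3)                                   # Hello World
```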

Alternatively, Julia allows interpolation into string literals using the $ symbol. This is often a more natural syntax for string concatenation:

str6ab = "$str4a $str4b\n"
print(str6ab)
Hello World

Interpolation also accepts general expressions after the $ sign. These must be enclosed in parentheses unless the expression is just a single variable name. The expressions are evaluated and converted to strings:

vec = rand(3)
println("Random vector: $vec")
println("sin of 45 degrees = $(sind(45))")
Random vector: [0.00895262685300413, 0.771799934191051, 0.8719362139443655]
sin of 45 degrees = 0.7071067811865476
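
Since the random vector above changes on every run, here is a deterministic sketch of the same rules, with made-up values. It also shows that parentheses are needed when a variable name is followed directly by more letters:

```julia
n = 3
items = ["a", "b", "c"]

s1 = "$n items"                  # single variable name, no parentheses needed
s2 = "$(n)rd item"               # parentheses separate n from the text "rd"
s3 = "last = $(items[end])"      # indexing expression
s4 = "total = $(n * 2 + 1)"      # arithmetic expression

println(s1)   # 3 items
println(s2)   # 3rd item
println(s3)   # last = c
println(s4)   # total = 7
```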

Another convenient notation is the power operator ^ with string and integer arguments, which concatenates multiple copies of the string:

"12345 "^9
"12345 12345 12345 12345 12345 12345 12345 12345 12345 "

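A typical use of this operator is to build separators or padding whose length depends on other text. The following is a small sketch with an arbitrary title string. The repeat function is the function form of the same operation:

```julia
title = "Results"
underline = "-" ^ length(title)   # one dash per character of the title
println(title)
println(underline)                # -------

# repeat gives the same result as the ^ operator
println(repeat("ab", 3))          # ababab
```
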
10.1.4. String comparison

You can lexicographically compare strings using the standard comparison operators (from Julia documentation):

julia> "abracadabra" < "xylophone"
true

julia> "abracadabra" == "xylophone"
false

julia> "Hello, world." != "Goodbye, world."
true

julia> "1 + 2 = 3" == "1 + 2 = $(1 + 2)"
true
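
Lexicographic comparison goes character by character using the character numbers, which has some consequences worth seeing once. The strings in this sketch are arbitrary examples:

```julia
# Upper-case letters have smaller numbers than lower-case ones
println("Zebra" < "apple")                        # true, since 'Z' (90) < 'a' (97)

# A string that is a prefix of another compares as smaller
println("abc" < "abcd")                           # true

# Comparing lower-cased copies ignores case
println(lowercase("Zebra") < lowercase("apple"))  # false

# sort uses the same ordering
sorted = sort(["banana", "Apple", "cherry"])
println(sorted)                                   # ["Apple", "banana", "cherry"]
```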

10.1.5. String indexing

Julia strings behave in many ways like a 1D array of characters. (Beware of non-ASCII Unicode characters, however: they occupy more than one byte, so the valid indices might not be consecutive.) For example, you can extract a single character from a string by indexing with an integer:

str = "abcdefghij"
str[7]
'g': ASCII/Unicode U+0067 (category Ll: Letter, lowercase)

You can extract a substring by indexing with a range:

str[7:end]
"ghij"

Note that integer indexing returns a character, but range indexing returns a string. This means e.g. that indexing with a range of length 1 returns a single character as a string:

str[7:7]
"g"

Alternatively, it is possible to create a view into a string using the type SubString:

sub = SubString(str, 7, 10)
"ghij"

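The three forms of indexing return three different types, which the following sketch makes explicit by printing them:

```julia
str = "abcdefghij"

println(typeof(str[7]))      # Char
println(typeof(str[7:7]))    # String
sub = SubString(str, 7, 10)
println(typeof(sub))         # SubString{String}

# A SubString compares equal to an ordinary string with the same characters
println(sub == str[7:end])   # true
```
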
The number of characters in a string can be found with the length function. For ASCII strings, this can be used to loop over the characters:

for i = 1:length(str)
    println("Character #$i = '$(str[i])'")
end
Character #1 = 'a'
Character #2 = 'b'
Character #3 = 'c'
Character #4 = 'd'
Character #5 = 'e'
Character #6 = 'f'
Character #7 = 'g'
Character #8 = 'h'
Character #9 = 'i'
Character #10 = 'j'
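
It is also possible to iterate over the characters directly, without indexing. The sketch below shows this, and then uses a string with one non-ASCII character (written with the \u escape) to illustrate the earlier warning about non-consecutive indices. The example strings are chosen only for illustration:

```julia
str = "abc"
for c in str                        # iterate over the characters directly
    println(c)
end
for (i, c) in enumerate(str)        # running count together with the character
    println("Character #$i = '$(c)'")
end

# With a non-ASCII character the valid indices are no longer consecutive
s = "na\u00efve"                    # "naïve"; the 'ï' takes two bytes in UTF-8
println(length(s))                  # 5 characters
println(ncodeunits(s))              # 6 bytes
println(collect(eachindex(s)))      # [1, 2, 3, 5, 6]
```

For such strings, looping with eachindex(s) or directly over the characters avoids the invalid index 4.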

Strings in Julia are immutable, which means you cannot change the content of a string after it is created:

str[4] = 'A'    # Error - cannot change strings
MethodError: no method matching setindex!(::String, ::Char, ::Int64)

Stacktrace:
 [1] top-level scope
   @ In[18]:1
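
To get the effect of changing a character, you instead create a new string. Two possible ways are sketched below (the names new1, new2, and chars are chosen for this example): concatenating slices, or going through a mutable vector of characters:

```julia
str = "abcdefghij"

# Build a new string from the slices around the position to change
new1 = str[1:3] * "A" * str[5:end]

# Or go through a mutable vector of characters
chars = collect(str)      # Vector{Char}
chars[4] = 'A'
new2 = String(chars)

println(new1)             # abcAefghij
println(new2)             # abcAefghij
println(str)              # abcdefghij (the original is unchanged)
```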

10.1.6. Example: Check if string is a palindrome

A palindrome is a sequence of characters which reads the same backward as forward. Using array operations and string comparisons, it is trivial to check if a string is a palindrome:

function is_palindrome(str)
    return str[end:-1:1] == str
end

strings = ["racecar", "Sit on a Potato Pan, Otis", "sitonapotatopanotis"]
for str in strings
    println("\"$str\": ", is_palindrome(str))
end
"racecar": true
"Sit on a Potato Pan, Otis": false
"sitonapotatopanotis": true

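The second test string fails only because of its capital letters, spaces, and comma. As a possible extension, the sketch below defines a hypothetical variant, is_palindrome_normalized, which converts to lower case and keeps only the letters before comparing. It uses the built-in reverse function instead of indexing with a reversed range:

```julia
# Hypothetical variant that ignores case, spaces, and punctuation
function is_palindrome_normalized(str)
    cleaned = filter(isletter, lowercase(str))   # keep letters only
    return cleaned == reverse(cleaned)
end

println(is_palindrome_normalized("Sit on a Potato Pan, Otis"))  # true
println(is_palindrome_normalized("Hello world"))                # false
```
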
However, we can practice recursion, string indexing, and substrings by writing the following recursive version of the function:

function is_palindrome_recursive(str)
    if length(str) <= 1
        return true
    elseif str[1] == str[end]
        return is_palindrome_recursive(str[2:end-1])
    else
        return false
    end
end

for str in strings
    println("\"$str\": ", is_palindrome_recursive(str))
end
"racecar": true
"Sit on a Potato Pan, Otis": false
"sitonapotatopanotis": true