10.1. String Basics
Strings are sequences of characters used to represent text, one of the most fundamental types of data. Julia has powerful, first-class support for working with strings.
A key feature of Julia is its excellent handling of Unicode characters, which allows it to represent text from virtually any language in the world (e.g., "Hello, Γειά σου, नमस्ते, こんにちは, Привіт"). For simplicity in this introduction, we will focus on the ASCII character set, which covers English letters, numbers, and common symbols.
10.1.1. Characters
A string is a sequence of individual characters. The simplest character set is ASCII (American Standard Code for Information Interchange), a 7-bit code that maps the 128 integers 0 through 127 to characters. This set includes the digits, uppercase and lowercase English letters, and punctuation.
Codes 0-31 and code 127 (delete) are non-printable control characters (for example, tab is code 9 and newline is code 10). Codes 32-126 are the standard printable characters you see below (code 32 is the space character):
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
In Julia, individual characters have the type Char and are created using single quotes.
# Single quotes denote a single character, not a string.
c = 'Q'
'Q': ASCII/Unicode U+0051 (category Lu: Letter, uppercase)
You can also create a Char from its underlying ASCII integer code.
# The ASCII code for the ampersand '&' is 38.
c_and = Char(38)
'&': ASCII/Unicode U+0026 (category Po: Punctuation, other)
Conversely, you can get the integer code for a Char by converting it to an Int.
# The '@' symbol corresponds to ASCII code 64.
c_at = Int('@')
64
Special control characters are written with an escape sequence: a backslash followed by one or more characters. The most common is \n, the newline character.
# `\n` represents a single newline character.
c_newline = '\n'
'\n': ASCII/Unicode U+000A (category Cc: Other, control)
Because characters are backed by integers, you can perform comparisons and simple arithmetic on them. These operations work on the characters’ underlying integer codes.
# 'A' (65) is less than 'a' (97)
julia> 'A' < 'a'
true
# This checks if 'a' is between 'A' and 'Z'. It isn't.
julia> 'A' <= 'a' <= 'Z'
false
# This checks if 'X' is between 'A' and 'Z'. It is.
julia> 'A' <= 'X' <= 'Z'
true
# This calculates the distance between 'x' and 'a' in the alphabet.
# Int('x') - Int('a') = 120 - 97 = 23
julia> 'x' - 'a'
23
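The same arithmetic can be put to work in small utilities. The following is a minimal sketch, not a library API: `alphabet_position` and `shift_letter` are hypothetical helper names, and both assume their input is a lowercase ASCII letter.

```julia
# Hypothetical helper: position of a lowercase ASCII letter in the alphabet (a = 1).
alphabet_position(c::Char) = c - 'a' + 1

# Hypothetical helper: shift a lowercase ASCII letter by k places, wrapping after 'z'.
# Subtracting two Chars gives an Int; adding an Int to a Char gives a Char.
shift_letter(c::Char, k::Integer) = 'a' + mod((c - 'a') + k, 26)

println(alphabet_position('x'))   # 24
println(shift_letter('y', 3))     # b
println('7' - '0')                # 7 (the numeric value of a digit character)
```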
10.1.2. Creating Strings
Strings are created from a sequence of characters using double quotes.
# A string containing two lines, separated by the newline character `\n`.
str1 = "Hello world!\nJulia is fun\n"
"Hello world!\nJulia is fun\n"
When you display a string variable, Julia shows its literal representation, with special characters written as escape sequences like \n. The string itself contains a real newline character, so functions like print() and println() write it out as a line break.
# The `\n` characters are interpreted as line breaks.
print(str1)
Hello world!
Julia is fun
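As a quick check that an escape sequence is stored as a single character, consider the following minimal sketch (the variable name `s` is arbitrary):

```julia
s = "Hi\n"

println(length(s))        # 3: 'H', 'i', and a single newline character
println(s[end] == '\n')   # true
println(repr(s))          # "Hi\n" (the literal form, with quotes and the escape)
```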
Because ", \, and $ have special meanings, you must escape them with a preceding backslash to include them as literal characters in a string.
# Use \" for a double quote, \\ for a backslash, and \$ for a dollar sign.
str2 = "I \"have\" \$50, and the path is C:\\temp\n"
print(str2)
I "have" $50, and the path is C:\temp
For multi-line text, triple quotes ("""...""") are very convenient. Within a triple-quoted string, you don’t need to escape double quotes, and newlines are preserved automatically.
str3 = """
This is some multi-line text.
Newlines are automatically inserted.
Double quotes can be used freely: "See?"
But you still need to escape backslashes and dollar signs: \\ and \$
The least-indented line (the closing triple-quote line counts too)
determines how much leading whitespace is removed from each line.
"""
print(str3)
This is some multi-line text.
Newlines are automatically inserted.
Double quotes can be used freely: "See?"
But you still need to escape backslashes and dollar signs: \ and $
The least-indented line (the closing triple-quote line counts too)
determines how much leading whitespace is removed from each line.
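The whitespace stripping is easiest to see with an indented literal. In this small sketch, `usage_text` is a hypothetical function name chosen for illustration; the content lines and the closing delimiter share four spaces of indentation, which Julia removes.

```julia
function usage_text()
    # All lines, including the closing delimiter, start with 4 spaces,
    # so 4 spaces are stripped from each line. Deeper indentation is kept.
    return """
    Usage:
      tool [options]
    """
end

println(usage_text() == "Usage:\n  tool [options]\n")   # true
```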
10.1.3. Combining and Repeating Strings
You can combine (or concatenate) strings using the string function or, more commonly, the * operator.
str_a = "Hello"
str_b = "World"
# Method 1: The `string` function can join any number of arguments.
str_c = string(str_a, " ", str_b, "!\n")
# Method 2: The `*` operator provides a more concise syntax.
str_d = str_a * " " * str_b * "!\n"
print(str_c)
print(str_d)
Hello World!
Hello World!
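One practical difference between the two methods: `string` converts non-string arguments for you, while `*` accepts only strings and characters. The sketch below also uses `join`, which is handy when the pieces are already in a collection; the variable names are arbitrary.

```julia
n = 42
words = ["alpha", "beta", "gamma"]

# `string` converts each argument to text before joining.
println(string("n = ", n))      # n = 42

# With `*`, a number must be converted explicitly first.
println("n = " * string(n))     # n = 42

# `join` concatenates the elements of a collection with a delimiter.
println(join(words, ", "))      # alpha, beta, gamma
```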
10.1.3.1. String Interpolation
The most idiomatic and powerful way to build strings is interpolation, which embeds the value of a variable or expression directly into a string literal using the $ symbol. This is often the clearest and most convenient syntax.
# Julia automatically inserts the values of `str_a` and `str_b`.
interpolated_str = "$str_a $str_b\n"
print(interpolated_str)
Hello World
For more complex expressions, wrap the code in parentheses $(...).
vec = rand(3)
println("A random vector: $vec")
# The expression inside the parentheses is evaluated first.
println("The sine of 45 degrees is $(sind(45)).")
A random vector: [0.04279953052982588, 0.8772905022364308, 0.7360836181754195]
The sine of 45 degrees is 0.7071067811865476.
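A few more interpolation patterns, as an illustrative sketch with arbitrary variable names: mixing plain variables with computed expressions, rounding a value inside `$(...)`, and placing a literal dollar sign next to an interpolated value.

```julia
a, b = 3, 4
ratio = 2 / 3

println("$a + $b = $(a + b)")                   # 3 + 4 = 7
println("ratio ≈ $(round(ratio, digits=3))")    # ratio ≈ 0.667
println("price: \$$(a * b)")                    # price: $12
```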
You can also repeat a string multiple times using the power ^ operator.
# Repeats the string 8 times.
"Na "^8 * "Batman!"
"Na Na Na Na Na Na Na Na Batman!"
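A common use of repetition is drawing a rule that matches the width of some text. This is a small sketch with arbitrary names; `repeat` is the function form of the `^` operator for strings.

```julia
title = "String Basics"
underline = "="^length(title)   # one '=' per character in the title

println(title)
println(underline)

println("ab"^3 == repeat("ab", 3))   # true
```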
10.1.4. String Comparison
You can compare strings lexicographically (i.e., in dictionary order) using the standard comparison operators.
julia> "apple" < "banana" # 'a' comes before 'b'
true
julia> "Zebra" < "apple" # Uppercase 'Z' comes before lowercase 'a' in ASCII
true
julia> "Hello" == "Hello" # Checks for exact equality
true
julia> "1 + 2 = 3" == "1 + 2 = $(1 + 2)" # Interpolation happens before comparison
true
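Sorting uses the same ordering, so uppercase words sort ahead of lowercase ones. The sketch below, with an arbitrary word list, shows the default order, one possible way to ignore case by sorting on the lowercased form, and `cmp`, which reports the ordering as -1, 0, or 1.

```julia
words = ["banana", "Zebra", "apple"]

println(sort(words))                  # ["Zebra", "apple", "banana"]
println(sort(words, by=lowercase))    # ["apple", "banana", "Zebra"]
println(cmp("apple", "banana"))       # -1, because "apple" sorts first
```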
10.1.5. String Indexing and Slicing
You can access parts of a string using array-like indexing. For ASCII strings, this is straightforward: the character at position i has index i.
(Note: Julia string indices are byte positions. In a Unicode string containing characters that use multiple bytes, the valid indices are therefore not consecutive. This is a more advanced topic, but it’s good to be aware of!)
To get a single character, use an integer index:
str = "abcdefghij"
# Get the character at the 7th position.
c = str[7]
println("The character is '$c' and its type is $(typeof(c)).")
The character is 'g' and its type is Char.
To extract a new string (a substring), use a range of indices:
# Get the substring from the 7th position to the end.
s = str[7:end]
println("The substring is \"$s\" and its type is $(typeof(s)).")
The substring is "ghij" and its type is String.
Notice the key difference: indexing with an integer returns a Char, while indexing with a range returns a String.
# A range of length one (7:7) produces a string containing one character.
str[7:7]
"g"
For performance-critical code where you want to avoid creating a new copy of the string data, you can create a SubString, which is a view into the original string.
# This creates a view into `str` from index 7 to 10 without allocating a new string.
sub = SubString(str, 7, 10)
"ghij"
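A short sketch of how a SubString behaves next to an ordinary slice (the name `owned` is arbitrary): the two compare equal by content but have different types, and `String(...)` turns the view into an independent copy when one is needed.

```julia
str = "abcdefghij"
sub = SubString(str, 7, 10)

println(typeof(sub))         # SubString{String}
println(sub == str[7:10])    # true: same characters, different types

owned = String(sub)          # an independent String with its own data
println(typeof(owned))       # String
```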
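You can find the number of characters in a string with the length function. For an ASCII string, the valid indices are exactly 1:length(str), so you can loop over them directly.
for i in 1:length(str)
    println("Character #$i is '$(str[i])'")
end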
Character #1 is 'a'
Character #2 is 'b'
Character #3 is 'c'
Character #4 is 'd'
Character #5 is 'e'
Character #6 is 'f'
Character #7 is 'g'
Character #8 is 'h'
Character #9 is 'i'
Character #10 is 'j'
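For text that may contain non-ASCII characters, looping over `1:length(str)` is no longer safe. The sketch below uses a short Greek string as an assumed example to show the difference between the character count and the byte count, and two ways of iterating that work for any string.

```julia
s = "αβγ"

println(length(s))               # 3 characters
println(ncodeunits(s))           # 6 bytes: each of these letters takes 2 bytes in UTF-8
println(collect(eachindex(s)))   # [1, 3, 5] (the valid indices)
println(isvalid(s, 2))           # false: index 2 falls inside 'α'

# Iterating over the string yields its characters without any index arithmetic.
for c in s
    println(c)
end

# `pairs` yields each valid index together with its character.
for (i, c) in pairs(s)
    println("Index $i holds '$c'")
end
```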
10.1.5.1. Strings are Immutable
An important property of Julia strings is that they are immutable. This means you cannot change a string after it has been created. If you need to modify a string, you must create a new one.
str[4] = 'A'
MethodError: no method matching setindex!(::String, ::Char, ::Int64)
The function `setindex!` exists, but no method is defined for this combination of argument types.
Stacktrace:
[1] top-level scope
@ In[18]:1
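Two possible ways to get the intended result by building a new string are sketched here, assuming the ASCII string from the earlier examples: slicing around the position to change, or using `replace` with a pattern. The name `new_str` is arbitrary.

```julia
str = "abcdefghij"

# Assemble a new string from the parts before and after index 4.
new_str = str[1:3] * "A" * str[5:end]
println(new_str)                     # abcAefghij
println(str)                         # abcdefghij (the original is unchanged)

# `replace` also returns a new string rather than modifying `str`.
println(replace(str, "d" => "A"))    # abcAefghij
```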
10.1.6. Example: Palindrome Checker
A palindrome is a word or phrase that reads the same forwards and backwards, like “racecar” or “madam”.
Thanks to Julia’s powerful slicing syntax, writing a function to check ASCII strings for palindromes is remarkably simple.
function is_palindrome(str)
    # `str[end:-1:1]` creates a reversed copy of the string.
    # We then simply check if the reversed string is equal to the original.
    return str == str[end:-1:1]
end
strings = ["racecar", "hello", "sitonapotatopanotis", "(())", ")(()"]
for str in strings
    println("Is \"$str\" a palindrome? $(is_palindrome(str))")
end
Is "racecar" a palindrome? true
Is "hello" a palindrome? false
Is "sitonapotatopanotis" a palindrome? true
Is "(())" a palindrome? false
Is ")(()" a palindrome? true
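Phrases are usually judged as palindromes after ignoring case, spaces, and punctuation. One way this could be handled is sketched below; `is_palindrome_phrase` is a hypothetical name, and the choice to keep only letters is an assumption rather than a fixed rule. It uses the built-in `reverse`, which also works for non-ASCII strings.

```julia
function is_palindrome_phrase(text)
    # Keep only the letters and lowercase them (an assumed normalization).
    cleaned = lowercase(filter(isletter, text))
    # `reverse` returns a new string with the characters in reverse order.
    return cleaned == reverse(cleaned)
end

println(is_palindrome_phrase("A man, a plan, a canal: Panama"))   # true
println(is_palindrome_phrase("Hello, world"))                     # false
```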
This is also a great opportunity to practice recursion. A recursive palindrome checker works by this logic:
Base Case: Any string with 0 or 1 characters is a palindrome.
Recursive Step: A longer string is a palindrome if its first and last characters match, and the substring between them is also a palindrome.
function is_palindrome_recursive(str)
    # Base case: empty or single-character strings are palindromes.
    if length(str) <= 1
        return true
    # Recursive step: check if outer characters match and inner string is a palindrome.
    elseif str[1] == str[end]
        return is_palindrome_recursive(str[2:end-1])
    # If outer characters don't match, it's not a palindrome.
    else
        return false
    end
end
for str in strings
    println("Is \"$str\" a palindrome? (recursive check): $(is_palindrome_recursive(str))")
end
Is "racecar" a palindrome? (recursive check): true
Is "hello" a palindrome? (recursive check): false
Is "sitonapotatopanotis" a palindrome? (recursive check): true
Is "(())" a palindrome? (recursive check): false
Is ")(()" a palindrome? (recursive check): true
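Each recursive call above slices out a new string. A possible variant, sketched here under the assumption of ASCII input, keeps the same base case and recursive step but passes two indices instead of a copy. The function name `is_palindrome_indices` and the parameters `lo` and `hi` are hypothetical.

```julia
function is_palindrome_indices(str, lo=firstindex(str), hi=lastindex(str))
    # Base case: the indices have met or crossed, so 0 or 1 characters remain.
    lo >= hi && return true
    # If the outer characters differ, the string is not a palindrome.
    str[lo] == str[hi] || return false
    # Recursive step: move both indices inward (valid for ASCII strings).
    return is_palindrome_indices(str, lo + 1, hi - 1)
end

for s in ["racecar", "hello", "(())", ")(()", ""]
    println("Is \"$s\" a palindrome? $(is_palindrome_indices(s))")
end
```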