Strings and Characters

Strings and characters are used as follows:

Strings are collections of characters.
Strings have the type String and characters have the type Character.
Strings can be used to work with text in a Unicode-compliant way.
Strings are immutable.

String and character literals are enclosed in double quotation marks ("):


let someString = "Hello, world!"

String literals may contain escape sequences. An escape sequence starts with a backslash (\):

\0: Null character
\\: Backslash
\t: Horizontal tab
\n: Line feed
\r: Carriage return
\": Double quotation mark
\': Single quotation mark
\u: A Unicode scalar value, written as \u{x}, where x is a 1–8 digit hexadecimal number, which needs to be a valid Unicode scalar value (i.e., in the range 0 to 0xD7FF and 0xE000 to 0x10FFFF inclusive).


// Declare a constant which contains two lines of text
// (separated by the line feed character `\n`), and ends
// with a thumbs up emoji, which has code point U+1F44D (0x1F44D).
//
let thumbsUpText =
    "This is the first line.\nThis is the second line with an emoji: \u{1F44D}"

The type Character represents a single, human-readable character. Characters are extended grapheme clusters, which consist of one or more Unicode scalars.

For example, the single character ü can be represented in several ways in Unicode. First, it can be represented by a single Unicode scalar value ü ("LATIN SMALL LETTER U WITH DIAERESIS", code point U+00FC). Second, the same single character can be represented by two Unicode scalar values: u ("LATIN SMALL LETTER U", code point U+0075), and "COMBINING DIAERESIS" (code point U+0308). The combining Unicode scalar value is applied to the scalar before it, which turns a u into a ü.

Still, both variants represent the same human-readable character ü:


let singleScalar: Character = "\u{FC}"
// `singleScalar` is `ü`
let twoScalars: Character = "\u{75}\u{308}"
// `twoScalars` is `ü`

Another example where multiple Unicode scalar values are rendered as a single, human-readable character is a flag emoji. These emojis consist of two "REGIONAL INDICATOR SYMBOL LETTER" Unicode scalar values:


// Declare a constant for a string with a single character, the emoji
// for the Canadian flag, which consists of two Unicode scalar values:
// - REGIONAL INDICATOR SYMBOL LETTER C (U+1F1E8)
// - REGIONAL INDICATOR SYMBOL LETTER A (U+1F1E6)
//
let canadianFlag: Character = "\u{1F1E8}\u{1F1E6}"
// `canadianFlag` is `🇨🇦`
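The difference between characters, scalars, and bytes can be sketched with the length and utf8 fields described later on this page. This is a minimal illustration, not a definitive reference: the script wrapper, the variable name flag, and the assertion messages are assumptions made for the example.

```cadence
access(all) fun main() {
    // A string holding only the Canadian flag emoji from the example above.
    let flag = "\u{1F1E8}\u{1F1E6}"

    // Both regional indicator scalars form one grapheme cluster,
    // so the string is expected to contain a single character.
    assert(flag.length == 1, message: "expected one character")

    // Each of the two scalars takes four bytes in UTF-8.
    assert(flag.utf8.length == 8, message: "expected eight UTF-8 bytes")
}
```

The point of the sketch is that length counts human-readable characters, while utf8 exposes the underlying encoded bytes.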

String fields and functions

Strings have several built-in fields and functions:

let length: Int

Returns the number of characters in the string as an integer.

let example = "hello"

// Find the number of elements of the string.
let length = example.length
// `length` is `5`

let utf8: [UInt8]

The byte array of the UTF-8 encoding.

let flowers = "Flowers \u{1F490}"
let bytes = flowers.utf8
// `bytes` is `[70, 108, 111, 119, 101, 114, 115, 32, 240, 159, 146, 144]`

view fun concat(_ other: String): String

Concatenates the string other to the end of the original string, but does not modify the original string. This function creates a new string whose length is the sum of the lengths of the string the function is called on and the string given as a parameter.

let example = "hello"
let new = "world"

// Concatenate the new string onto the example string and return the new string.
let helloWorld = example.concat(new)
// `helloWorld` is now `"helloworld"`

view fun slice(from: Int, upTo: Int): String

Returns a string slice of the characters in the given string from the start index from up to, but not including, the end index upTo. This function creates a new string whose length is upTo - from. It does not modify the original string. If either parameter is out of the bounds of the string, or the indices are invalid (from > upTo), the function fails.

let example = "helloworld"

// Create a new slice of part of the original string.
let slice = example.slice(from: 3, upTo: 6)
// `slice` is now `"low"`

// Run-time error: Out of bounds index (`upTo` exceeds the string length), the program aborts.
let outOfBounds = example.slice(from: 2, upTo: 11)

// Run-time error: Invalid indices, the program aborts.
let invalidIndices = example.slice(from: 2, upTo: 1)

view fun decodeHex(): [UInt8]

Returns an array containing the bytes represented by the given hexadecimal string.

The given string must only contain hexadecimal characters and must have an even length. If the string is malformed, the program aborts.

let example = "436164656e636521"

example.decodeHex() // is `[67, 97, 100, 101, 110, 99, 101, 33]`

view fun toLower(): String

Returns a string where all uppercase letters are replaced with lowercase letters.

let example = "Flowers"

example.toLower() // is `flowers`

view fun replaceAll(of: String, with: String): String

Returns a string where all occurrences of the string of are replaced with the string with. If of is empty, it matches at the beginning of the string and after each UTF-8 sequence, yielding k+1 replacements for a string of length k.

let example = "abababa"

example.replaceAll(of: "a", with: "o") // is `obobobo`

view fun split(separator: String): [String]

Returns the variable-sized array of strings created by splitting the receiver string on the separator.

let example = "hello world"

example.split(separator: " ") // is `["hello", "world"]`

The String type also provides the following functions:

view fun String.encodeHex(_ data: [UInt8]): String

Returns a hexadecimal string for the given byte array.

let data = [1 as UInt8, 2, 3, 0xCA, 0xDE]

String.encodeHex(data) // is `"010203cade"`

view fun String.join(_ strings: [String], separator: String): String

Returns the string created by joining the array of strings with the provided separator.

let strings = ["hello", "world"]
String.join(strings, separator: " ") // is "hello world"
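The functions above pair up naturally: split with String.join, and String.encodeHex with decodeHex. The following is a minimal sketch of both round trips, using only functions documented on this page. The script wrapper, variable names, and assertion messages are assumptions made for the example.

```cadence
access(all) fun main() {
    // Split a sentence into words, then join the words back together.
    let sentence = "hello world"
    let words = sentence.split(separator: " ")
    let rejoined = String.join(words, separator: " ")
    assert(rejoined == sentence, message: "split and join should round-trip")

    // Encode bytes as a hexadecimal string, decode the string back into
    // bytes, and encode again. The two hexadecimal strings should match.
    let data: [UInt8] = [1, 2, 3, 0xCA, 0xDE]
    let hex = String.encodeHex(data)
    let hexAgain = String.encodeHex(hex.decodeHex())
    assert(hexAgain == hex, message: "encodeHex and decodeHex should round-trip")
}
```

The hexadecimal check compares strings rather than byte arrays so that the sketch relies only on string equality.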

Strings are also indexable, returning a Character value.


let str = "abc"
let c = str[0] // is the Character "a"

view fun String.fromUTF8(_ input: [UInt8]): String?

Attempts to convert a UTF-8 encoded byte array into a String. This function returns nil if the byte array contains invalid UTF-8, such as incomplete codepoint sequences or undefined graphemes.

For a given string s, String.fromUTF8(s.utf8) is equivalent to wrapping s up in an optional.
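A minimal sketch of both outcomes, assuming a script context. The variable names, the assertion messages, and the choice of 0xFF as an invalid byte are illustrative assumptions. 0xFF was chosen because it never appears in well-formed UTF-8.

```cadence
access(all) fun main() {
    let original = "Flowers \u{1F490}"

    // Decoding the string's own UTF-8 bytes should yield the string again,
    // wrapped in an optional.
    let roundTrip: String? = String.fromUTF8(original.utf8)
    let decoded = roundTrip ?? panic("expected valid UTF-8")
    assert(decoded == original, message: "round trip should preserve the string")

    // A lone 0xFF byte is not valid UTF-8, so decoding is expected to return nil.
    let invalid: String? = String.fromUTF8([0xFF as UInt8])
    assert(invalid == nil, message: "invalid UTF-8 should not decode")
}
```

Because the result is optional, callers have to handle the nil case explicitly, for example with the nil-coalescing operator as shown here.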

Character fields and functions

Character values can be converted into String values using the toString function:

view fun toString(): String

Returns the string representation of the character.

let c: Character = "x"

c.toString() // is "x"

view fun String.fromCharacters(_ characters: [Character]): String

Builds a new String value from an array of Characters. Because Strings are immutable, this operation makes a copy of the input array.

let rawUwU: [Character] = ["U", "w", "U"]
let uwu: String = String.fromCharacters(rawUwU) // "UwU"

let utf8: [UInt8]

The byte array of the UTF-8 encoding.

let a: Character = "a"
let a_bytes = a.utf8 // `a_bytes` is `[97]`

let bouquet: Character = "\u{1F490}"
let bouquet_bytes = bouquet.utf8 // `bouquet_bytes` is `[240, 159, 146, 144]`
