Learn Rust Coding · Lesson

UTF-8 and Chars

How Rust strings store Unicode text, and how byte length differs from character count.

Rust Strings Are UTF-8

Every Rust String and &str is encoded as UTF-8. This means text can include any Unicode character, not just ASCII.

fn main() {
    // \u{e9} is the escape for U+00E9 (e-acute), so s holds "café".
    let s = String::from("caf\u{e9}");
    println!("{}", s);
}
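To make the UTF-8 guarantee concrete, here is a minimal sketch that looks at the bytes behind the same string and then builds a String back from raw bytes. The helper name utf8_bytes is made up for this illustration; as_bytes and String::from_utf8 are standard library calls.

```rust
/// Hypothetical helper: copies out the raw UTF-8 bytes behind a string slice.
fn utf8_bytes(s: &str) -> Vec<u8> {
    s.as_bytes().to_vec()
}

fn main() {
    let s = String::from("caf\u{e9}");

    // 'c', 'a', 'f' are one byte each; U+00E9 is stored as 0xC3 0xA9.
    println!("{:?}", utf8_bytes(&s)); // [99, 97, 102, 195, 169]

    // Going the other way, String::from_utf8 checks that the bytes are valid UTF-8.
    let ok = String::from_utf8(vec![0x63, 0x61, 0x66, 0xC3, 0xA9]);
    println!("{}", ok.is_ok()); // true

    // The byte 0xFF never appears in valid UTF-8, so this is rejected.
    let bad = String::from_utf8(vec![0xFF]);
    println!("{}", bad.is_err()); // true
}
```

Because invalid bytes are rejected when the String is built, code that receives a String or &str can rely on it holding well-formed UTF-8.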

What Is UTF-8?

UTF-8 is a variable-width encoding: each Unicode character (code point) takes one to four bytes. ASCII characters take a single byte, accented letters such as é take two, and most emoji take four.

fn main() {
    let ascii = "A";          // 1 byte
    let accent = "\u{e9}";    // 2 bytes (e-acute)
    // len() returns the length in bytes, not the number of characters.
    println!("{} bytes vs {} bytes", ascii.len(), accent.len());
}
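Since len() counts bytes, it can differ from the number of characters. The sketch below compares the two for strings of each encoded width. The helper byte_and_char_counts is a hypothetical name introduced here; chars() and char::len_utf8 come from the standard library.

```rust
/// Hypothetical helper: returns (length in bytes, number of chars).
fn byte_and_char_counts(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    // One sample for each UTF-8 width: 1, 2, 3, and 4 bytes.
    let samples = ["A", "\u{e9}", "\u{20ac}", "\u{1f600}"];
    for text in samples {
        let (bytes, chars) = byte_and_char_counts(text);
        println!("{}: {} byte(s), {} char(s)", text, bytes, chars);
    }

    // A whole word: "café" is 4 chars but 5 bytes.
    let (bytes, chars) = byte_and_char_counts("caf\u{e9}");
    println!("{} bytes, {} chars", bytes, chars);

    // A single char can report its own encoded width.
    println!("{}", '\u{e9}'.len_utf8()); // 2
}
```

Note that a char is one Unicode scalar value, which is not always what a reader sees as one character: a letter followed by a combining accent is two chars.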
