CS208 Introduction to Computer Systems
Wednesday, 8 April 2026

+ Today
    - characters
    - character encodings
    - string functions

+ Recap: N-bit two's complement integer representation
    - how to recognize >= 0 vs. < 0 (negative if leftmost bit = 1)
    - negate N: complement, then add 1
    - negate N: "what do I have to add to this number to get 0?"
    - integer types in C (exact sizes are compiler-dependent)
        char  :  8 bits / 1 byte
        short : 16 bits / 2 bytes
        int   : 32 bits / 4 bytes
        long  : 64 bits / 8 bytes
    - Intel machines are little endian, so if you have
        int x = 0x1234ABCD;
      then x is stored in memory with the CD byte first and the 12 byte last

+ String functions to know
    - strlen
    - strcmp
    - strcpy, strncpy
    - strcat, strncat
    - be aware of the mess that is strlcpy, strscpy, stpcpy, ...

+ Characters
    - https://sandbox.jeffondich.com/encoder
    - Standards mapping characters to integers
        ASCII
        ISO-8859-1
        Unicode
    - Unicode "codepoint": the integer that represents one character
        U+0041  <-> A
        U+00E9  <-> é
        U+2191  <-> ↑
        U+1F60A <-> 😊

+ Lab (25 minutes or so, plus debrief)

+ Character encodings
    - given a codepoint, how am I going to store it as a byte sequence?
      (in memory or in a file)
    - note: "encode" != "encrypt"
    - why is an "encoding" different from a "codepoint"?
    - easy encodings
        - UTF-16LE   ↑ <-> U+2191 <-> 91 21 [2-byte sequence]
        - UTF-16BE   ↑ <-> U+2191 <-> 21 91 [2-byte sequence]

+ Friday: UTF-8
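
The byte orders in the recap and in the UTF-16 examples can be checked with a few lines of C. This is a minimal sketch, not code from the course: the function names bytes_in_memory, utf16le_encode_bmp, and utf16be_encode_bmp are made up for illustration, and the two encoders only handle codepoints that fit in a single 16-bit unit (U+0000 through U+FFFF, excluding the surrogate range), so a codepoint like U+1F60A is out of scope here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the bytes of a 32-bit integer exactly as they
 * sit in memory, lowest address first. On a little-endian machine the
 * least significant byte comes out first. */
void bytes_in_memory(uint32_t x, unsigned char out[4]) {
    memcpy(out, &x, sizeof x);
}

/* Hypothetical helper: UTF-16LE for a codepoint that fits in one 16-bit
 * unit. Low byte first, then high byte. */
void utf16le_encode_bmp(uint16_t cp, unsigned char out[2]) {
    assert(cp < 0xD800 || cp > 0xDFFF);   /* surrogates are not characters */
    out[0] = (unsigned char)(cp & 0xFF);
    out[1] = (unsigned char)(cp >> 8);
}

/* Hypothetical helper: UTF-16BE for the same range of codepoints.
 * High byte first, then low byte. */
void utf16be_encode_bmp(uint16_t cp, unsigned char out[2]) {
    assert(cp < 0xD800 || cp > 0xDFFF);   /* surrogates are not characters */
    out[0] = (unsigned char)(cp >> 8);
    out[1] = (unsigned char)(cp & 0xFF);
}
```

Assuming a little-endian machine, bytes_in_memory(0x1234ABCD, buf) should leave CD AB 34 12 in buf. The encoders do not depend on the machine's byte order because they pick the bytes out with shifts and masks: utf16le_encode_bmp(0x2191, buf) should give 91 21, and utf16be_encode_bmp(0x2191, buf) should give 21 91.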