Fun with bits and character encodings
Starter code: more-bits-package.tar
Upload via Moodle as: more-bits.tar
Goals
- Practice bit operations (~, |, &, ^, <<, >>) in C
- Dig into the details of the UTF-8 character encoding
- Practice writing your own tests
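As a warm-up for the operators listed above, here is a minimal sketch of the usual single-bit idioms. Every name in it (`set_bit`, `clear_bit`, `toggle_bit`, `get_bit`, `print_bits8`, `check_bit_helpers`) is hypothetical and chosen for illustration; none of them come from the starter package or the assignment header.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helpers for illustration only; not part of the starter package. */

/* Set bit n (0 = least significant): OR with a one-bit mask. */
uint8_t set_bit(uint8_t x, unsigned n)
{
    return (uint8_t)(x | (1u << n));
}

/* Clear bit n: AND with the inverted mask. */
uint8_t clear_bit(uint8_t x, unsigned n)
{
    return (uint8_t)(x & ~(1u << n));
}

/* Flip bit n: XOR with the mask. */
uint8_t toggle_bit(uint8_t x, unsigned n)
{
    return (uint8_t)(x ^ (1u << n));
}

/* Read bit n: shift it down to position 0, then mask off everything else. */
unsigned get_bit(uint8_t x, unsigned n)
{
    return (x >> n) & 1u;
}

/* Print the 8 bits of x, most significant first (handy when debugging). */
void print_bits8(const char *label, uint8_t x)
{
    printf("%s = ", label);
    for (int i = 7; i >= 0; i--) {
        putchar(((x >> i) & 1u) ? '1' : '0');
    }
    putchar('\n');
}

/* Small self-check on the sample byte 0xCA (binary 1100 1010). */
void check_bit_helpers(void)
{
    assert(set_bit(0xCA, 0) == 0xCB);    /* 1100 1011 */
    assert(clear_bit(0xCA, 1) == 0xC8);  /* 1100 1000 */
    assert(toggle_bit(0xCA, 7) == 0x4A); /* 0100 1010 */
    assert(get_bit(0xCA, 3) == 1);
    assert(get_bit(0xCA, 2) == 0);
}
```

Note the casts back to `uint8_t`: C promotes small integer types to `int` before applying `~`, `<<`, and the other operators, so without the cast the result can carry extra high bits. Calling `print_bits8("x", 0xCA)` from your own main.c prints `x = 11001010`.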
Rubric
2 - to_upper correctness
2 - to_lower correctness
3 - middle_bits correctness
3 - to_utf8 correctness
2 - from_utf8 correctness
3 - code quality
Note that "correctness" in this rubric includes "meets all documented specifications". So, for example, if you don't follow the requirement "to_upper may only use bitwise operations on the chars in s" in the documentation for to_upper, that comes out of your "to_upper correctness" score.
Your assignment
- Implement the five functions documented in the bits.h file in more-bits-package.tar.
- Put your implementations in more-bits.c (a starter version of which is provided in the package).
- Do not change more-bits.h.
- Submit your file(s) as more-bits.tar via Moodle.
The only file you are required to submit is more-bits.c.
You may include your own main.c and Makefile
(starters of which are included in more-bits-package.tar), but we will use
our own main.c and Makefile to do our testing and grading.
A little advice
- Read the function definitions in bits.h carefully. They are there to help you!
- Start with to_lower, to_upper, and middle_bits. These don’t require any understanding of UTF-8, and just involve bit operations.
- For help experimenting with UTF-8, you can use my UTF-8 encoding tool.
- For to_utf8 and from_utf8, start by getting single-byte codepoints to work, then two-byte codepoints, etc. Worry about cleaning up your duplicated code after you have everything working.
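The advice to work up from single-byte to multi-byte codepoints follows the structure of UTF-8 itself: the codepoint's range decides how many bytes are used, and the high bits of the first byte announce that count. The sketch below only classifies lengths; it is not an implementation of to_utf8 or from_utf8, and the names `utf8_encoded_length`, `utf8_sequence_length`, and `check_utf8_lengths` are hypothetical, not taken from the assignment header. It also assumes you do not need to reject surrogate codepoints (U+D800 through U+DFFF); check the documented specifications for what your functions must do with invalid input.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of bytes UTF-8 uses for a codepoint,
   or 0 if the value is beyond the Unicode range (above U+10FFFF).
   Surrogates are not treated specially in this sketch. */
int utf8_encoded_length(uint32_t codepoint)
{
    if (codepoint <= 0x7F)     return 1; /* 0xxxxxxx */
    if (codepoint <= 0x7FF)    return 2; /* 110xxxxx 10xxxxxx */
    if (codepoint <= 0xFFFF)   return 3; /* 1110xxxx 10xxxxxx 10xxxxxx */
    if (codepoint <= 0x10FFFF) return 4; /* 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
    return 0;
}

/* Hypothetical helper: classify a byte by its high bits.
   Returns the sequence length if the byte is a lead byte, or 0 if it
   is a continuation byte (10xxxxxx) or not a valid lead byte. */
int utf8_sequence_length(uint8_t lead)
{
    if ((lead & 0x80) == 0x00) return 1; /* 0xxxxxxx */
    if ((lead & 0xE0) == 0xC0) return 2; /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0) return 3; /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0) return 4; /* 11110xxx */
    return 0;
}

/* Self-check using one codepoint from each length class. */
void check_utf8_lengths(void)
{
    assert(utf8_encoded_length(0x41) == 1);     /* U+0041 LATIN CAPITAL LETTER A */
    assert(utf8_encoded_length(0xE9) == 2);     /* U+00E9 LATIN SMALL LETTER E WITH ACUTE */
    assert(utf8_encoded_length(0x20AC) == 3);   /* U+20AC EURO SIGN */
    assert(utf8_encoded_length(0x1F600) == 4);  /* U+1F600 GRINNING FACE */
    assert(utf8_encoded_length(0x110000) == 0); /* out of range */

    assert(utf8_sequence_length(0x41) == 1);
    assert(utf8_sequence_length(0xC3) == 2); /* first byte of U+00E9 */
    assert(utf8_sequence_length(0xE2) == 3); /* first byte of U+20AC */
    assert(utf8_sequence_length(0xF0) == 4); /* first byte of U+1F600 */
    assert(utf8_sequence_length(0xA9) == 0); /* continuation byte */
}
```

Each `if` pairs a mask with an expected pattern: the mask keeps only the prefix bits, and the comparison checks that they match. The same mask-and-compare idea carries over when you later separate prefix bits from payload bits.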