2.1.4 Representation of data in computer systems


Every symbol on a computer is stored as a binary code. The computer matches each code to its symbol and displays the correct one. The list of codes and their matching characters is called the character set.
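
As a rough illustration of how a symbol is matched to a code, the Python sketch below uses the built-in ord and chr functions to convert between a character and its numeric code. The variable names are chosen for illustration only.

```python
# Look up the numeric code for a character and show it in binary.
symbol = "A"
code = ord(symbol)            # numeric code for the character
binary = format(code, "08b")  # the same code written as 8 binary digits

print(symbol, code, binary)   # A 65 01000001

# Go the other way: from a code back to its character.
print(chr(65))                # A
```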

ASCII stands for American Standard Code for Information Interchange. It uses 7 bits per character, which gives 128 codes in total (0 to 127), including the null character and other non-printing control codes. Extended ASCII uses 8 bits to provide 256 characters. The main limitation of ASCII is that it does not include the characters needed for most other languages, or many mathematical and specialist symbols.
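
The link between the number of bits and the number of characters can be checked with a small calculation. The sketch below is only an illustration: code_count and fits_in_ascii are hypothetical helper names, not part of any standard library.

```python
def code_count(bits):
    """Number of distinct codes that can be made with the given number of bits."""
    return 2 ** bits

ascii_codes = code_count(7)      # 7-bit ASCII
extended_codes = code_count(8)   # 8-bit extended ASCII

def fits_in_ascii(text):
    """Return True if every character has a code in the 7-bit range 0 to 127."""
    return all(ord(ch) < 128 for ch in text)

print(ascii_codes, extended_codes)   # 128 256
print(fits_in_ascii("Hello"))        # True
print(fits_in_ascii("café"))         # False, because é is outside 7-bit ASCII
```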

A 16-bit Unicode encoding provides 65,536 possible codes. A 32-bit encoding has room for over 4 billion codes, although the Unicode standard itself only defines code points from 0 to 0x10FFFF (1,114,112 in total). Unicode has characters for many languages and for many specialist uses. The first 128 Unicode codes are identical to ASCII, so ASCII is a subset of Unicode.
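
The following Python sketch illustrates these points, on the assumption that Python 3 is used, where every string is a sequence of Unicode characters. The sample characters and variable names are chosen for illustration only.

```python
# How many codes fit in 16 bits and in 32 bits.
print(2 ** 16)   # 65536
print(2 ** 32)   # 4294967296

# Characters from outside ASCII still have Unicode codes.
euro_code = ord("€")
pi_code = ord("π")
print(euro_code, pi_code)   # 8364 960

# ASCII codes keep the same meaning in Unicode:
# decoding the bytes as ASCII and looking the codes up as
# Unicode characters gives the same text.
codes = [72, 105]
from_ascii = bytes(codes).decode("ascii")
from_unicode = "".join(chr(c) for c in codes)
print(from_ascii, from_unicode)   # Hi Hi
```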