For those who prefer a visual representation of the process:
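Unicode: A standard that assigns each character a numeric code point, normally written in hex (for example, U+6C50). The code point represents the character itself.
UTF-8: One scheme for writing (encoding) those Unicode code points as bytes and reading (decoding) them back.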
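To make the distinction concrete, here is a minimal Python sketch (the variable names are only illustrative). It looks up the code point of a character, encodes it to UTF-8 bytes, and decodes those bytes back:

```python
# A character, its Unicode code point, and its UTF-8 bytes.
ch = "汐"

code_point = ord(ch)               # the Unicode code point, as an integer
print(f"U+{code_point:04X}")       # U+6C50

encoded = ch.encode("utf-8")       # writing: code point -> bytes
print(encoded.hex(" "))            # e6 b1 90

decoded = encoded.decode("utf-8")  # reading: bytes -> code point
print(decoded == ch)               # True
```

The code point is a single number, while the UTF-8 form is a sequence of bytes whose length depends on the size of that number.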
UTF can be UTF-8, UTF-16 or UTF-32.
Now let's check how a Unicode code point is encoded in UTF-8:
A Chinese character: 汐
Unicode value of 汐 in hex: U+6C50
Convert 6C50 to binary: 01101100 01010000
Position of data bits: 0110 110001 010000
Position of header bits: 1110 10 10
Encode 6C50 as UTF-8: 11100110 10110001 10010000
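The same steps can be written out with bit operations. This is a sketch that only covers the three-byte case shown above; the function name `encode_three_byte` is made up for illustration, and the result is compared against Python's built-in encoder:

```python
def encode_three_byte(code_point: int) -> bytes:
    """Encode a code point in U+0800..U+FFFF as three UTF-8 bytes."""
    if not 0x0800 <= code_point <= 0xFFFF:
        raise ValueError("this sketch only handles the three-byte range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogates are not valid in UTF-8")

    # Split the 16 data bits into groups of 4, 6 and 6 bits.
    high = code_point >> 12          # top 4 bits
    mid = (code_point >> 6) & 0x3F   # middle 6 bits
    low = code_point & 0x3F          # bottom 6 bits

    # Prepend the header bits: 1110 on the lead byte,
    # 10 on each of the two continuation bytes.
    return bytes([0b11100000 | high, 0b10000000 | mid, 0b10000000 | low])


result = encode_three_byte(0x6C50)
print(" ".join(f"{b:08b}" for b in result))  # 11100110 10110001 10010000
print(result == "汐".encode("utf-8"))         # True
```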
UTF-8 can have any of the following formats:
Link to the editable image is here.
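As a rough sketch of those formats: UTF-8 uses one to four bytes, chosen by the size of the code point, and each format has its own header bits. The function below is hypothetical (named `utf8_encode` for illustration only) and is checked against Python's built-in encoder rather than meant to replace it:

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one Unicode code point using the four UTF-8 byte formats."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not an encodable Unicode code point")

    if code_point <= 0x7F:       # 0xxxxxxx
        return bytes([code_point])
    if code_point <= 0x7FF:      # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point <= 0xFFFF:     # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


# One sample character per format, compared with the built-in encoder.
for ch in ("A", "é", "汐", "😀"):
    assert utf8_encode(ord(ch)) == ch.encode("utf-8")
    print(ch, len(utf8_encode(ord(ch))), "byte(s)")
```

The number of leading 1 bits in the first byte tells a decoder how many bytes the character occupies, and every continuation byte starts with 10.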