Unicode
: is a Coded Character Set - a mapping between a set of abstract characters and a set of integers (code points).

| Code Point | Character | Description |
|---|---|---|
| U+0041 | A | Latin Capital Letter A (https://symbl.cc/en/0041/) |
| U+0042 | B | Latin Capital Letter B (https://symbl.cc/en/0042/) |
| ... | | |
| U+005A | Z | Latin Capital Letter Z (https://symbl.cc/en/005A/) |
| ... | | |
| U+5301 | 匁 | CJK ideograph: Japanese unit of weight (1/1000 of a kan) (https://symbl.cc/en/5301/) |
| ... | | |
| U+1F525 | 🔥 | Fire emoji (https://symbl.cc/en/1F525-fire-emoji/) |

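
To make the character-to-code-point mapping concrete, here is a minimal Java sketch that walks a string by code point. The class name `CodePointDemo` and the helper `toUPlus` are hypothetical names chosen for illustration; the rest uses standard `java.lang` methods.

```java
public class CodePointDemo {
    // Hypothetical helper: formats a code point in the usual U+XXXX notation
    // (at least four hex digits).
    static String toUPlus(int codePoint) {
        return String.format("U+%04X", codePoint);
    }

    public static void main(String[] args) {
        // Build the string from code points so the source file stays pure ASCII:
        // A, Z, the CJK ideograph U+5301, and the fire emoji U+1F525.
        int[] codePoints = {0x41, 0x5A, 0x5301, 0x1F525};
        String text = new String(codePoints, 0, codePoints.length);

        // A Java String stores UTF-16 code units, so length() counts chars,
        // not characters: the emoji needs two chars (a surrogate pair).
        System.out.println("chars: " + text.length());
        System.out.println("code points: " + text.codePointCount(0, text.length()));

        // Iterate by code point rather than by char.
        text.codePoints().forEach(cp ->
            System.out.println(toUPlus(cp)
                + " uses " + Character.charCount(cp) + " char(s), name: "
                + Character.getName(cp)));
    }
}
```

Iterating with `codePoints()` rather than `charAt()` matters as soon as the text contains characters outside the Basic Multilingual Plane, such as the emoji in the last table row.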
UTF-8
: is a Character-Encoding Scheme - a mapping between one or more coded character sets (Unicode code points) and a set of octet (eight-bit byte) sequences.

| Character | Code Point | Bytes (Hex) | Reference |
|---|---|---|---|
| A | U+0041 | 41 | https://symbl.cc/en/0041/ |
| B | U+0042 | 42 | https://symbl.cc/en/0042/ |
| ... | | | |
| Z | U+005A | 5A | https://symbl.cc/en/005A/ |
| ... | | | |
| 匁 | U+5301 | E5 8C 81 | https://symbl.cc/en/5301/ |
| ... | | | |
| 🔥 | U+1F525 | F0 9F 94 A5 | https://symbl.cc/en/1F525-fire-emoji/ |

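
The byte sequences in the table can be reproduced in Java by encoding each code point with `StandardCharsets.UTF_8`. This is only a sketch: the class name `Utf8Demo` and the helper `utf8Hex` are hypothetical names used for illustration.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Demo {
    // Hypothetical helper: encodes one code point as UTF-8 and returns
    // the bytes as space-separated uppercase hex.
    static String utf8Hex(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            if (hex.length() > 0) {
                hex.append(' ');
            }
            // Mask to 0..255 because Java bytes are signed.
            hex.append(String.format("%02X", b & 0xFF));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        System.out.println(utf8Hex(0x41));    // 41
        System.out.println(utf8Hex(0x5A));    // 5A
        System.out.println(utf8Hex(0x5301));  // E5 8C 81
        System.out.println(utf8Hex(0x1F525)); // F0 9F 94 A5

        // Decoding goes the other way: bytes back to the code point.
        byte[] encoded = {(byte) 0xE5, (byte) 0x8C, (byte) 0x81};
        String decoded = new String(encoded, StandardCharsets.UTF_8);
        System.out.println(String.format("U+%04X", decoded.codePointAt(0))); // U+5301
    }
}
```

Passing the charset explicitly to `getBytes` and to the `String` constructor avoids depending on the platform default encoding, which is not UTF-8 on every Java installation.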
Refer to the documentation of the Java Charset class (java.nio.charset.Charset), which uses this terminology.
If you're interested in Unicode and UTF-8 in Java, my video may help.