Unicode

From Resonite Wiki
Revision as of 17:44, 17 April 2024 by PJB (talk | contribs) (Add some info about encodings, since we have stuff mentioning that.)

Unicode is the international standard used to represent text on almost all modern computer systems. Resonite also uses Unicode, so various ProtoFlux nodes and concepts have functionality related to it.

Encodings

Unicode defines several encodings, each specifying how text is represented as bytes in memory, over the network and in files. Different systems and programming languages use different encodings by default, but they all represent the same underlying sequence of code points.
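As a rough illustration of the point above, the following Python sketch encodes one string in three encodings. The sample string and variable names are made up for this example, and the little-endian variants are an arbitrary choice made to avoid a byte order mark in the output.

```python
# The same text, written with escapes so the source file's own encoding does not matter:
# "h", U+00E9 (e with acute accent), "llo", space, U+1F600 (grinning face emoji).
text = "h\u00e9llo \U0001F600"

utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")  # little-endian, no byte order mark
utf32 = text.encode("utf-32-le")  # little-endian, no byte order mark

# The byte counts differ between encodings...
print(len(utf8), len(utf16), len(utf32))  # prints: 11 16 28

# ...but decoding each byte sequence yields the same text again.
round_trips = [
    utf8.decode("utf-8"),
    utf16.decode("utf-16-le"),
    utf32.decode("utf-32-le"),
]
print(all(decoded == text for decoded in round_trips))  # prints: True
```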

UTF-8

UTF-8 is the encoding most commonly seen in text files and over the network.[1] It is a variable-length encoding: a single code point is represented as 1 to 4 bytes of data.
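The 1 to 4 byte range can be checked with a short Python sketch. The four sample characters are chosen only for illustration, one for each possible length.

```python
# One sample character for each possible UTF-8 length (illustrative choices).
samples = {
    "A": "U+0041 Latin capital letter A",
    "\u00e9": "U+00E9 Latin small letter e with acute",
    "\u20ac": "U+20AC euro sign",
    "\U0001F600": "U+1F600 grinning face emoji",
}

# Map each character to the number of bytes its UTF-8 encoding takes.
utf8_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, description in samples.items():
    print(f"{description}: {utf8_lengths[ch]} byte(s)")
```

Plain ASCII characters take a single byte, which is why UTF-8 is compact for mostly-English text.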

UTF-16

UTF-16 is used by Resonite itself to represent Strings and chars. Most code points in everyday text lie in the "Basic Multilingual Plane" and are represented as a single 16-bit value (a char). Like UTF-8, however, UTF-16 is variable length: code points outside the Basic Multilingual Plane (such as most emoji) require 2 chars, known as a "surrogate pair".
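The surrogate pair arithmetic can be sketched in Python as follows. This is only an illustration of the standard UTF-16 calculation, not Resonite code; the helper names `to_surrogate_pair` and `utf16_units` are hypothetical.

```python
from typing import List, Tuple


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a code point outside the Basic Multilingual Plane into two 16-bit units."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need a surrogate pair")
    offset = code_point - 0x10000        # 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits go into the high surrogate
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits go into the low surrogate
    return high, low


def utf16_units(text: str) -> List[int]:
    """Return the 16-bit code units of text, using Python's own UTF-16 encoder."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


high, low = to_surrogate_pair(0x1F600)   # U+1F600, grinning face emoji
print(hex(high), hex(low))               # prints: 0xd83d 0xde00

# The manual calculation matches the encoder, and "A" still needs only one unit.
print(utf16_units("\U0001F600") == [high, low])  # prints: True
print(len(utf16_units("A")))                     # prints: 1
```

Because each 16-bit unit corresponds to one char, a single emoji like this one occupies two chars in a UTF-16 string.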


UTF-32

UTF-32 represents every code point as a single 32-bit value. Unlike UTF-8 and UTF-16, it is a fixed-length encoding, but it wastes a large amount of space for most text.
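A small Python sketch of this trade-off, using a made-up sample string and a hypothetical helper `code_point_at`: fixed width makes it trivial to find the n-th code point, at the cost of four bytes per character.

```python
sample = "Hello, world!"  # 13 code points, all ASCII

as_utf8 = sample.encode("utf-8")
as_utf32 = sample.encode("utf-32-le")  # little-endian, no byte order mark

# Every code point takes exactly 4 bytes in UTF-32, versus 1 byte each here in UTF-8.
print(len(as_utf8), len(as_utf32))  # prints: 13 52


def code_point_at(data: bytes, index: int) -> int:
    """Read the code point at a given position directly from UTF-32-LE bytes."""
    start = 4 * index  # fixed width: no scanning needed to find the offset
    return int.from_bytes(data[start:start + 4], "little")


print(chr(code_point_at(as_utf32, 7)))  # prints: w
```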

  1. At least in the west.