Unicode

From Resonite Wiki
Revision as of 13:11, 10 September 2025 by Kisaragi marine (talk | contribs) (Tentatively added to Category:Data structure since UTFs define internal byte-level layouts, but might be better under a dedicated encoding category (Follow-up to previous edit), add "References" section (this edit).)

Unicode is the international standard used to represent text on almost all modern computer systems. Resonite uses Unicode as well, so various ProtoFlux nodes and concepts have functionality related to it.

Encodings

Unicode defines several encodings that specify how text is represented as bytes in memory, over the network, and in files. Different systems and programming languages use different encodings by default, but they all represent the same underlying sequence of code points.

UTF-8

UTF-8 is the encoding most commonly seen in text files and over the network.[1] It is a variable-length encoding: a single code point is represented as 1 to 4 bytes of data.
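
As a rough illustration of the variable length, the following Python sketch counts how many bytes a few code points need in UTF-8. Python is used here only for demonstration and is not a Resonite API; the helper name `utf8_length` and the sample characters are made up for this example.

```python
def utf8_length(ch: str) -> int:
    """Return the number of bytes one code point occupies in UTF-8."""
    return len(ch.encode("utf-8"))


# Sample code points, written as escapes so the file encoding does not matter:
# U+0041 (A), U+00E9 (e with acute), U+20AC (euro sign), U+1F600 (an emoji).
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]
utf8_lengths = [utf8_length(ch) for ch in samples]

for ch, n in zip(samples, utf8_lengths):
    print(f"U+{ord(ch):04X} takes {n} byte(s) in UTF-8")
```

The four samples were chosen so that each falls in a different length class, from 1 byte for ASCII up to 4 bytes for a code point outside the Basic Multilingual Plane.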

UTF-16

UTF-16 is used by Resonite itself to represent Strings and chars. Most code points are represented as a single 16-bit value (a char). However, like UTF-8, it is a variable-length encoding: code points outside the "Basic Multilingual Plane" (such as emojis) require 2 chars, which together form a "surrogate pair".
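
To make the surrogate pair idea concrete, here is a minimal Python sketch (not Resonite or ProtoFlux code) that splits text into 16-bit UTF-16 units and computes a surrogate pair by hand using the standard UTF-16 arithmetic. The function names `utf16_units` and `to_surrogate_pair` are hypothetical helpers for this example.

```python
def utf16_units(text: str) -> list:
    """Return the 16-bit code units of text when encoded as UTF-16."""
    data = text.encode("utf-16-le")  # little-endian, no byte order mark
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def to_surrogate_pair(code_point: int) -> tuple:
    """Split a code point above U+FFFF into its (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points outside the BMP need a surrogate pair")
    offset = code_point - 0x10000       # 20-bit value
    high = 0xD800 + (offset >> 10)      # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits
    return high, low


print([hex(u) for u in utf16_units("A")])           # one unit
print([hex(u) for u in utf16_units("\U0001F600")])  # two units (surrogate pair)
print([hex(u) for u in to_surrogate_pair(0x1F600)])
```

One practical consequence, assuming a system that measures string length in 16-bit units, is that a single emoji can count as two chars.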

UTF-32

UTF-32 represents every code point as a single 32-bit value. Unlike UTF-8 and UTF-16, it is a fixed-length encoding, but it wastes a large amount of space for most text.
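
The space trade-off can be seen by encoding the same text in all three encodings and comparing byte counts. This is a small Python sketch for illustration only; `encoded_sizes` is a hypothetical helper, and the little-endian variants are used so that no byte order mark is added to the counts.

```python
def encoded_sizes(text: str) -> dict:
    """Return the encoded size of text in bytes for each encoding."""
    encodings = ("utf-8", "utf-16-le", "utf-32-le")
    return {name: len(text.encode(name)) for name in encodings}


ascii_sizes = encoded_sizes("Hello")       # five ASCII letters
emoji_sizes = encoded_sizes("\U0001F600")  # one code point outside the BMP

print(ascii_sizes)
print(emoji_sizes)
```

For plain ASCII text, UTF-32 uses four times as many bytes as UTF-8, while for a single code point outside the Basic Multilingual Plane all three encodings happen to need 4 bytes.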

References

  1. At least in the west.