Tuesday, November 4, 2008

What is the definition of UTF-8?

FAQ - UTF-8, UTF-16, UTF-32 & BOM
UTF-8 is the byte-oriented encoding form of Unicode. For details of its definition, see Section 2.5, “Encoding Forms”, and Section 3.9, “Unicode Encoding Forms”, in the Unicode Standard. See, in particular, Table 3-5, UTF-8 Bit Distribution, and Table 3-6, Well-formed UTF-8 Byte Sequences, which give succinct summaries of the encoding form. Also see the sample code that implements conversions between UTF-8 and other encoding forms.
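
To make the bit distribution concrete, here is a minimal Python sketch of how a single code point could be turned into UTF-8 bytes. It is an illustration written for this post, not the sample code the FAQ refers to; the function name encode_utf8 is hypothetical, and the ranges follow the commonly documented one- to four-byte layout, with surrogates and values above U+10FFFF rejected.

```python
def encode_utf8(code_point: int) -> bytes:
    """Encode one Unicode code point as UTF-8 (illustrative sketch only)."""
    # Code points outside U+0000..U+10FFFF are not valid Unicode scalar values.
    if code_point < 0 or code_point > 0x10FFFF:
        raise ValueError("code point out of Unicode range")
    # Surrogates (U+D800..U+DFFF) must not appear in well-formed UTF-8.
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded")

    if code_point <= 0x7F:
        # 1 byte: 0xxxxxxx
        return bytes([code_point])
    if code_point <= 0x7FF:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([
            0xC0 | (code_point >> 6),
            0x80 | (code_point & 0x3F),
        ])
    if code_point <= 0xFFFF:
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([
            0xE0 | (code_point >> 12),
            0x80 | ((code_point >> 6) & 0x3F),
            0x80 | (code_point & 0x3F),
        ])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([
        0xF0 | (code_point >> 18),
        0x80 | ((code_point >> 12) & 0x3F),
        0x80 | ((code_point >> 6) & 0x3F),
        0x80 | (code_point & 0x3F),
    ])
```

A quick way to check the sketch is to compare it with Python's built-in encoder, for example encode_utf8(0x20AC) == "€".encode("utf-8").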
