<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://toaq.me/index.php?action=history&amp;feed=atom&amp;title=Unicode</id>
	<title>Unicode - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://toaq.me/index.php?action=history&amp;feed=atom&amp;title=Unicode"/>
	<link rel="alternate" type="text/html" href="https://toaq.me/index.php?title=Unicode&amp;action=history"/>
	<updated>2026-05-03T12:11:53Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.45.3</generator>
	<entry>
		<id>https://toaq.me/index.php?title=Unicode&amp;diff=1954&amp;oldid=prev</id>
		<title>Laqme at 23:47, 27 May 2024</title>
		<link rel="alternate" type="text/html" href="https://toaq.me/index.php?title=Unicode&amp;diff=1954&amp;oldid=prev"/>
		<updated>2024-05-27T23:47:42Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 23:47, 27 May 2024&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l1&quot;&gt;Line 1:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&#039;&#039;&#039;Unicode&#039;&#039;&#039; is a &lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;system &lt;/del&gt;that assigns numeric codes to writing systems from all over the world.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&#039;&#039;&#039;Unicode&#039;&#039;&#039; is a &lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;text encoding standard &lt;/ins&gt;that assigns numeric codes to writing systems from all over the world.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;This page compiles some facts about Unicode that Toaqists may find useful to know.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;This page compiles some facts about Unicode that Toaqists may find useful to know.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;

&lt;/table&gt;</summary>
		<author><name>Laqme</name></author>
	</entry>
	<entry>
		<id>https://toaq.me/index.php?title=Unicode&amp;diff=1952&amp;oldid=prev</id>
		<title>Laqme: Initial article</title>
		<link rel="alternate" type="text/html" href="https://toaq.me/index.php?title=Unicode&amp;diff=1952&amp;oldid=prev"/>
		<updated>2024-05-27T23:42:47Z</updated>

		<summary type="html">&lt;p&gt;Initial article&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;&amp;#039;&amp;#039;&amp;#039;Unicode&amp;#039;&amp;#039;&amp;#039; is a system that assigns numeric codes to writing systems from all over the world.&lt;br /&gt;
&lt;br /&gt;
This page compiles some facts about Unicode that Toaqists may find useful to know.&lt;br /&gt;
&lt;br /&gt;
== Unicode vs. UTF-8 ==&lt;br /&gt;
Unicode, the &amp;#039;&amp;#039;&amp;#039;standard&amp;#039;&amp;#039;&amp;#039;, dictates (for example) that the character {{t|ꝡ}} is represented by the number 42849, or A761 in hexadecimal.&lt;br /&gt;
&lt;br /&gt;
These &amp;quot;codepoint numbers&amp;quot; are usually written as U+ followed by the hexadecimal number: U+A761.&lt;br /&gt;
&lt;br /&gt;
UTF-8, an &amp;#039;&amp;#039;&amp;#039;encoding&amp;#039;&amp;#039;&amp;#039;, dictates how to encode that number across bytes in a file: it says U+A761 is &amp;lt;code&amp;gt;ea 9d a1&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
There are other encodings of Unicode, but they are not as commonly used. For example, in big-endian UTF-16 (UTF-16BE), the encoding of U+A761 is simply &amp;lt;code&amp;gt;a7 61&amp;lt;/code&amp;gt;; little-endian UTF-16 swaps the two bytes.&lt;br /&gt;
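&lt;br /&gt;
The numbers above can be checked with a short Python sketch. It is only an illustration: the variable names are made up for this example, and it assumes Python 3.8 or later for the separator argument of bytes.hex.&lt;br /&gt;

```python
# U+A761 is the code point discussed above.
ch = chr(0xA761)

codepoint = ord(ch)                             # the code point as a decimal number
label = f"U+{codepoint:04X}"                    # conventional U+ notation
utf8_hex = ch.encode("utf-8").hex(" ")          # bytes of the UTF-8 encoding
utf16be_hex = ch.encode("utf-16-be").hex(" ")   # big-endian UTF-16
utf16le_hex = ch.encode("utf-16-le").hex(" ")   # little-endian UTF-16

print(codepoint, label)   # 42849 U+A761
print(utf8_hex)           # ea 9d a1
print(utf16be_hex)        # a7 61
print(utf16le_hex)        # 61 a7
```

The code point stays the same in every case; only the byte sequence depends on the encoding.&lt;br /&gt;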
&lt;br /&gt;
== Combining characters and normalization ==&lt;br /&gt;
A letter with a diacritic like {{t|é}} can be represented as a &amp;#039;&amp;#039;&amp;#039;precomposed&amp;#039;&amp;#039;&amp;#039; form (&amp;lt;code&amp;gt;é&amp;lt;/code&amp;gt; U+00E9 {{small caps|latin small letter e with acute}}) or as a sequence of a base letter and combining characters (&amp;lt;code&amp;gt;e&amp;lt;/code&amp;gt; U+0065 {{small caps|latin small letter e}} and &amp;lt;code&amp;gt;◌́&amp;lt;/code&amp;gt; U+0301 {{small caps|combining acute accent}}).&lt;br /&gt;
&lt;br /&gt;
Unicode text may be &amp;#039;&amp;#039;&amp;#039;normalized&amp;#039;&amp;#039;&amp;#039; to smooth over these differences: either by precomposing everything as much as possible (normalization form C or NFC) or by decomposing everything into combining characters (normalization form D or NFD).&lt;br /&gt;
&lt;br /&gt;
Normalization also pins down the &amp;#039;&amp;#039;order&amp;#039;&amp;#039; of combining characters: marks below the letter, such as underdots (canonical combining class 220), come before marks above it, such as hats (class 230). The string &amp;lt;code&amp;gt;é + underdot&amp;lt;/code&amp;gt; has NFC &amp;lt;code&amp;gt;ẹ + acute&amp;lt;/code&amp;gt; and NFD &amp;lt;code&amp;gt;e + underdot + acute&amp;lt;/code&amp;gt;.&lt;br /&gt;
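&lt;br /&gt;
As a minimal sketch, the example above can be reproduced with the unicodedata module from the Python standard library. The helper name codepoints is hypothetical and exists only for this illustration.&lt;br /&gt;

```python
import unicodedata

# Precomposed e-acute (U+00E9) followed by a combining dot below (U+0323).
s = "\u00e9\u0323"

nfc = unicodedata.normalize("NFC", s)
nfd = unicodedata.normalize("NFD", s)

def codepoints(text):
    """Return the code points of text as a list of U+XXXX labels."""
    return [f"U+{ord(c):04X}" for c in text]

print(codepoints(nfc))  # ['U+1EB9', 'U+0301']
print(codepoints(nfd))  # ['U+0065', 'U+0323', 'U+0301']
```

In both forms the underdot ends up before the acute accent, even though the input had them the other way around.&lt;br /&gt;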
&lt;br /&gt;
=== Dotless {{t|ı}} and normalization ===&lt;br /&gt;
The letter {{t|í}} decomposes into &amp;lt;code&amp;gt;i + acute&amp;lt;/code&amp;gt;, not &amp;lt;code&amp;gt;ı + acute&amp;lt;/code&amp;gt;. Placing diacritics on a dotless &amp;lt;code&amp;gt;ı&amp;lt;/code&amp;gt; may produce wrong-looking results. Compare:&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot; &lt;br /&gt;
! Letter&lt;br /&gt;
! NFC&lt;br /&gt;
! NFD&lt;br /&gt;
|-&lt;br /&gt;
| style=&amp;quot;text-align:center;&amp;quot; | {{t|ı}} || &amp;lt;code&amp;gt;ı&amp;lt;/code&amp;gt; || &amp;lt;code&amp;gt;ı&amp;lt;/code&amp;gt; &lt;br /&gt;
|-&lt;br /&gt;
| style=&amp;quot;text-align:center;&amp;quot; | {{t|î}} || &amp;lt;code&amp;gt;î&amp;lt;/code&amp;gt; || &amp;lt;code&amp;gt;i + circumflex&amp;lt;/code&amp;gt; &lt;br /&gt;
|-&lt;br /&gt;
| style=&amp;quot;text-align:center;&amp;quot; | {{t|ı̣}} || &amp;lt;code&amp;gt;ı + underdot&amp;lt;/code&amp;gt; || &amp;lt;code&amp;gt;ı + underdot&amp;lt;/code&amp;gt; &lt;br /&gt;
|-&lt;br /&gt;
| style=&amp;quot;text-align:center;&amp;quot; | {{t|ị̂}} || &amp;lt;code&amp;gt;ị + circumflex&amp;lt;/code&amp;gt; || &amp;lt;code&amp;gt;i + underdot + circumflex&amp;lt;/code&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
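&lt;br /&gt;
The rows of this table can be checked with a small Python sketch, again assuming the standard unicodedata module. The helper labels and the constant names are hypothetical, chosen for this example.&lt;br /&gt;

```python
import unicodedata

def labels(text):
    """Return the code points of text as U+XXXX labels."""
    return [f"U+{ord(c):04X}" for c in text]

ACUTE = "\u0301"       # combining acute accent
CIRCUMFLEX = "\u0302"  # combining circumflex accent
UNDERDOT = "\u0323"    # combining dot below

# Precomposed i-acute (U+00ED) decomposes to the dotted i (U+0069).
nfd_i_acute = labels(unicodedata.normalize("NFD", "\u00ed"))

# Dotless i (U+0131) plus acute has no precomposed form, so NFC leaves it alone.
nfc_dotless_acute = labels(unicodedata.normalize("NFC", "\u0131" + ACUTE))

# Dotted i with circumflex and underdot: NFC reorders and composes the underdot.
nfc_i_both = labels(unicodedata.normalize("NFC", "i" + CIRCUMFLEX + UNDERDOT))

print(nfd_i_acute)        # ['U+0069', 'U+0301']
print(nfc_dotless_acute)  # ['U+0131', 'U+0301']
print(nfc_i_both)         # ['U+1ECB', 'U+0302']
```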
&lt;br /&gt;
=== Precomposed tone–underdot combos ===&lt;br /&gt;
Not all Toaq tone–underdot combos have precomposed characters. This table shows combos with a single precomposed character in green, and combos whose NFC form still contains a combining character in red:&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable toaq&amp;quot; style=&amp;quot;text-align: center;&amp;quot;&lt;br /&gt;
!&lt;br /&gt;
! {{done|1}}&lt;br /&gt;
! {{done|2}}&lt;br /&gt;
! {{done|3}}&lt;br /&gt;
! {{done|4}}&lt;br /&gt;
|-&lt;br /&gt;
! a&lt;br /&gt;
| style=&amp;quot;background-color: lightgreen;&amp;quot; |    ạ&lt;br /&gt;
| style=&amp;quot;background-color: lightpink;&amp;quot;  | &amp;#039;&amp;#039;&amp;#039;ạ́&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
| style=&amp;quot;background-color: lightpink;&amp;quot;  | &amp;#039;&amp;#039;&amp;#039;ạ̈&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
| style=&amp;quot;background-color: lightgreen;&amp;quot; |    ậ&lt;br /&gt;
|-&lt;br /&gt;
! u&lt;br /&gt;
| style=&amp;quot;background-color: lightgreen;&amp;quot; |    ụ&lt;br /&gt;
| style=&amp;quot;background-color: lightpink;&amp;quot;  | &amp;#039;&amp;#039;&amp;#039;ụ́&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
| style=&amp;quot;background-color: lightpink;&amp;quot;  | &amp;#039;&amp;#039;&amp;#039;ụ̈&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
| style=&amp;quot;background-color: lightpink;&amp;quot;  | &amp;#039;&amp;#039;&amp;#039;ụ̂&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
! ı&lt;br /&gt;
| style=&amp;quot;background-color: lightpink;&amp;quot;  | &amp;#039;&amp;#039;&amp;#039;ı̣&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
| style=&amp;quot;background-color: lightpink;&amp;quot;  | &amp;#039;&amp;#039;&amp;#039;ị́&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
| style=&amp;quot;background-color: lightpink;&amp;quot;  | &amp;#039;&amp;#039;&amp;#039;ị̈&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
| style=&amp;quot;background-color: lightpink;&amp;quot;  | &amp;#039;&amp;#039;&amp;#039;ị̂&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
! o&lt;br /&gt;
| style=&amp;quot;background-color: lightgreen;&amp;quot; |    ọ&lt;br /&gt;
| style=&amp;quot;background-color: lightpink;&amp;quot;  | &amp;#039;&amp;#039;&amp;#039;ọ́&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
| style=&amp;quot;background-color: lightpink;&amp;quot;  | &amp;#039;&amp;#039;&amp;#039;ọ̈&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
| style=&amp;quot;background-color: lightgreen;&amp;quot; |    ộ&lt;br /&gt;
|-&lt;br /&gt;
! e&lt;br /&gt;
| style=&amp;quot;background-color: lightgreen;&amp;quot; |    ẹ&lt;br /&gt;
| style=&amp;quot;background-color: lightpink;&amp;quot;  | &amp;#039;&amp;#039;&amp;#039;ẹ́&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
| style=&amp;quot;background-color: lightpink;&amp;quot;  | &amp;#039;&amp;#039;&amp;#039;ẹ̈&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
| style=&amp;quot;background-color: lightgreen;&amp;quot; |    ệ&lt;br /&gt;
|}&lt;br /&gt;
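&lt;br /&gt;
The green cells can be recomputed with a rough Python sketch that checks whether the NFC form of each combo is a single character. The mapping from tone columns to combining marks is an assumption read off the table above (no mark, acute, diaeresis, circumflex), the name is_precomposed is hypothetical, and the ı row is left out because its result depends on whether the dotted or dotless letter is used.&lt;br /&gt;

```python
import unicodedata

UNDERDOT = "\u0323"  # combining dot below

# Assumed mapping from the table's tone columns to combining marks.
TONE_MARKS = {1: "", 2: "\u0301", 3: "\u0308", 4: "\u0302"}

def is_precomposed(vowel, tone):
    """True if vowel + tone mark + underdot normalizes (NFC) to one character."""
    nfc = unicodedata.normalize("NFC", vowel + TONE_MARKS[tone] + UNDERDOT)
    return len(nfc) == 1

precomposed = sorted(
    (vowel, tone)
    for vowel in "aeou"
    for tone in TONE_MARKS
    if is_precomposed(vowel, tone)
)
print(precomposed)
# [('a', 1), ('a', 4), ('e', 1), ('e', 4), ('o', 1), ('o', 4), ('u', 1)]
```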
&lt;br /&gt;
Paradoxically, depending on the font and operating system, the &amp;quot;abnormal&amp;quot; forms (like &amp;lt;code&amp;gt;é + underdot&amp;lt;/code&amp;gt;) may show up more correctly. They are demonstrated in the table below:&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable toaq&amp;quot; style=&amp;quot;text-align: center;&amp;quot;&lt;br /&gt;
!&lt;br /&gt;
! {{done|1}}&lt;br /&gt;
! {{done|2}}&lt;br /&gt;
! {{done|3}}&lt;br /&gt;
! {{done|4}}&lt;br /&gt;
|-&lt;br /&gt;
! a&lt;br /&gt;
| a&amp;amp;#x323; || á&amp;amp;#x323; || ä&amp;amp;#x323; || â&amp;amp;#x323;&lt;br /&gt;
|-&lt;br /&gt;
! u&lt;br /&gt;
| u&amp;amp;#x323; || ú&amp;amp;#x323; || ü&amp;amp;#x323; || û&amp;amp;#x323;&lt;br /&gt;
|-&lt;br /&gt;
! ı&lt;br /&gt;
| ı&amp;amp;#x323; || í&amp;amp;#x323; || ï&amp;amp;#x323; || î&amp;amp;#x323;&lt;br /&gt;
|-&lt;br /&gt;
! o&lt;br /&gt;
| o&amp;amp;#x323; || ó&amp;amp;#x323; || ö&amp;amp;#x323; || ô&amp;amp;#x323;&lt;br /&gt;
|-&lt;br /&gt;
! e&lt;br /&gt;
| e&amp;amp;#x323; || é&amp;amp;#x323; || ë&amp;amp;#x323; || ê&amp;amp;#x323;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
(MediaWiki [https://www.mediawiki.org/wiki/Unicode_normalization_considerations normalizes page contents], so the table above uses HTML entities to get the desired effect, e.g. &amp;lt;code&amp;gt;í&amp;amp;amp;#x323;&amp;lt;/code&amp;gt; for {{t|ị́}}. [[Template:T]] will do this for you.)&lt;br /&gt;
&lt;br /&gt;
== See also ==&lt;br /&gt;
* [[Latin writing system]]&lt;br /&gt;
* [[Input methods]]&lt;/div&gt;</summary>
		<author><name>Laqme</name></author>
	</entry>
</feed>