Latin writing system: Difference between revisions

3,953 bytes added, 23:07, 20 December 2022
fix small issues, resequence content, add a section on prefix marking together with its very own unicode screwery tables
(remove schwa)
(fix small issues, resequence content, add a section on prefix marking together with its very own unicode screwery tables)
Line 9: Line 9:
| /m/ || /b/ || /pʰ/ || /f/ || /n/ || /d/ || /tʰ/ || /d͡z/ || /t͡sʰ/ || /s/ || /ɾ/ || /l/ || /ɲ/ || /d͡ʑ/ || /t͡ɕʰ/ || /ɕ/ || /w~j/ || /ŋ/ || /ɡ/ || /kʰ/ || /ʔ/ || /h/ || /a/ || /u/ || /i/ || /o/ || /ɛ/
| /m/ || /b/ || /pʰ/ || /f/ || /n/ || /d/ || /tʰ/ || /d͡z/ || /t͡sʰ/ || /s/ || /ɾ/ || /l/ || /ɲ/ || /d͡ʑ/ || /t͡ɕʰ/ || /ɕ/ || /w~j/ || /ŋ/ || /ɡ/ || /kʰ/ || /ʔ/ || /h/ || /a/ || /u/ || /i/ || /o/ || /ɛ/
|}
|}
Because {{VY}} may be less widely available in fonts and on keyboards, the [[refgram]] designates {{t|v}} as an emergency replacement.


In '''semi-native order''', the consonants are ordered in the Latin/Unicode way ({{t|b, c, ch, d…}}) while the vowels are still at the end, in {{t|a, u, ı, o, e}} order.
In '''semi-native order''', the consonants are ordered in the Latin/Unicode way ({{t|b, c, ch, d…}}) while the vowels are still at the end, in {{t|a, u, ı, o, e}} order.
Line 14: Line 16:
In '''non-native''' or '''Latin order''', the whole alphabet is ordered like the Latin alphabet: {{t|a, b, c, ch, d…}}
In '''non-native''' or '''Latin order''', the whole alphabet is ordered like the Latin alphabet: {{t|a, b, c, ch, d…}}


The vowel {{t|ı}} is written without its dot, to avoid confusion with the tone diacritics listed below.
The vowel {{t|ı}} is written without its dot so that it does not clash with the tone diacritics listed below, both stylistically and in terms of readability.


== Diacritics ==
== Diacritics ==
=== Tone marking ===
The following diacritics are placed on the first vowel ({{t|a, u, ı, o, e}}) of a word to mark non-default [[tone]] on the whole word:
The following diacritics are placed on the first vowel ({{t|a, u, ı, o, e}}) of a word to mark non-default [[tone]] on the whole word:


Line 31: Line 35:
|}
|}


=== Sparse tone marking style ===
==== Sparse tone marking style ====
Before [[Toaq Delta]], a Toaq text could have chosen ''not'' to mark the most common tone, {{tone|4}}. This was called '''sparse tone marking style'''.
Before [[Toaq Delta]], a Toaq text could choose ''not'' to mark the most common tone, {{tone|4}}. At the time, this practice was dubbed '''sparse tone marking style'''.


A verb could never carry {{tone|8}}, so there would’ve been no confusion as long as the reader knew enough Toaq to tell particles from verbs. Therefore, this practice was acceptable in informal writing but discouraged in educational materials. This practice was made in connection with the theory that stated that {{tone|4}} was actually an inherent, or “default”, tone for verbs just as much as {{tone|8}} was for particles.
A verb could never carry {{tone|8}}, so there would’ve been no confusion as long as the reader knew enough Toaq to tell particles from verbs. Therefore, the practice was acceptable in informal writing but discouraged in educational materials. Its supporters stated that it was actually tenable to analyze {{tone|4}} as an inherent, or “default”, tone for verbs just as much as {{tone|8}} was for particles.


[[Toaq Delta]] removed {{tone|8}} and the notion of a neutral tone altogether; {{done|1}}, although unmarked, is always understood as falling tone. Thus, one could say that with the introduction of the new four-[[tone]] system, sparse tone marking has become the standard, with both the phonology and the orthography backing it.
[[Toaq Delta]] removed {{tone|8}} and the notion of a neutral tone altogether; {{done|1}}, although unmarked, is always understood as falling tone. Thus, one could say that with the introduction of the new four-[[tone]] system, sparse tone marking has become the standard, with both the phonology and the orthography backing it.
=== Prefix marking ===
In addition, the underdot ({{t|ạ}}, U+0323) marks the presence of a [[prefix]]; in a run of several prefixes, only the last one is marked. It may be replaced by the ASCII hyphen (-) if the underdot isn’t available on your keyboard. The underdot falls on the first vowel of the prefix [[raku]] (that is, where a tone mark would’ve gone), whereas the hyphen is placed between the last prefix and the word’s stem. For example, {{t|kı-}} + {{t|ne-}} + {{t|shı}} may be written as {{t|kınẹshı}} or {{t|kıne-shı}}; {{t|hao-}} + {{t|chuq}} = {{t|hạochuq}} or {{t|hao-chuq}}.
==== Tone–underdot combos ====
The new [[Delta]] orthography poses a slight challenge for fonts trying to render it: there isn’t a uniform set of precomposed tone+underdot characters to choose from, so one has to rely on combining diacritics. Specifically, {{t|ı̣}} (ı underdot) comes out janky in some fonts because the <code>ı</code> glyph may be missing an [https://fontforge.org/docs/tutorial/editexample6.html#anchoring-marks anchor point] for the mark. In fact, out of the 20 possible vowel+diacritic combinations, only 7 have precompositions:
{| class="wikitable toaq" style="text-align: center;"
!
! {{done|1}}
! {{done|2}}
! {{done|3}}
! {{done|4}}
|-
! a
| style="background-color: lightgreen;" |    ạ
| style="background-color: lightpink;" | '''ạ́'''
| style="background-color: lightpink;" | '''ạ̈'''
| style="background-color: lightgreen;" |    ậ
|-
! u
| style="background-color: lightgreen;" |    ụ
| style="background-color: lightpink;" | '''ụ́'''
| style="background-color: lightpink;" | '''ụ̈'''
| style="background-color: lightpink;" | '''ụ̂'''
|-
! ı
| style="background-color: lightpink;" | '''ı̣'''
| style="background-color: lightpink;" | '''ị́'''
| style="background-color: lightpink;" | '''ị̈'''
| style="background-color: lightpink;" | '''ị̂'''
|-
! o
| style="background-color: lightgreen;" |    ọ
| style="background-color: lightpink;" | '''ọ́'''
| style="background-color: lightpink;" | '''ọ̈'''
| style="background-color: lightgreen;" |    ộ
|-
! e
| style="background-color: lightgreen;" |    ẹ
| style="background-color: lightpink;" | '''ẹ́'''
| style="background-color: lightpink;" | '''ẹ̈'''
| style="background-color: lightgreen;" |    ệ
|}
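The grapheme clusters in the bold, pink-shaded cells consist of a precomposed vowel+underdot glyph and a combining tone diacritic (the exception is {{t|ı̣}}, which is a plain {{t|ı}} followed by a combining underdot, as no precomposed form exists). Each cell was normalized with [[wiki:Unicode equivalence#Normalization|Unicode normalization form C]].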
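The count of 7 can be double-checked with a short script. The following is only a sketch using Python's standard <code>unicodedata</code> module; the names <code>nfc_cell</code>, <code>TONES</code> and <code>VOWELS</code> are made up for this illustration, and it assumes that a toned {{t|ı}} is built on the dotted base letter <code>i</code>. A cell counts as precomposed if its NFC form is a single code point.

```python
import unicodedata

UNDERDOT = "\u0323"                        # combining dot below
TONES = ["", "\u0301", "\u0308", "\u0302"]  # unmarked, acute, diaeresis, circumflex
VOWELS = ["a", "u", "\u0131", "o", "e"]     # \u0131 is dotless ı


def nfc_cell(vowel, tone):
    """Return the NFC form of vowel + tone mark + underdot."""
    # Assumption: a tone mark replaces the missing dot, so toned ı uses base "i".
    base = "i" if (vowel == "\u0131" and tone) else vowel
    return unicodedata.normalize("NFC", base + tone + UNDERDOT)


table = {(v, t): nfc_cell(v, t) for v in VOWELS for t in TONES}

# A combination is fully precomposed when NFC yields exactly one code point.
precomposed = [cell for cell in table.values() if len(cell) == 1]
print(len(table), "combinations,", len(precomposed), "precomposed")

for (vowel, tone), cell in table.items():
    codepoints = " ".join(f"U+{ord(ch):04X}" for ch in cell)
    print(repr(vowel + tone), "->", codepoints)
```

Note that NFC always moves the underdot next to the base letter, because U+0323 has a lower canonical combining class than the tone marks. That is why the non-precomposed cells end up as vowel+underdot followed by a combining tone mark.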
It appears that the most consistent as well as font- and input-friendly approach is to precompose the vowel with the tone mark and then add a combining underdot (U+0323):
{| class="wikitable toaq" style="text-align: center;"
!
! {{done|1}}
! {{done|2}}
! {{done|3}}
! {{done|4}}
|-
! a
| a&#x323; || á&#x323; || ä&#x323; || â&#x323;
|-
! u
| u&#x323; || ú&#x323; || ü&#x323; || û&#x323;
|-
! ı
| ı&#x323; || í&#x323; || ï&#x323; || î&#x323;
|-
! o
| o&#x323; || ó&#x323; || ö&#x323; || ô&#x323;
|-
! e
| e&#x323; || é&#x323; || ë&#x323; || ê&#x323;
|}
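For text processed outside the wiki, the underdot-last sequences shown above can be produced from any canonically equivalent input. Below is a minimal sketch using Python's standard <code>unicodedata</code> module. The function name <code>underdot_last</code> is hypothetical, and the code is only meant for Toaq text in the Latin alphabet, not for arbitrary scripts.

```python
import unicodedata

UNDERDOT = "\u0323"  # combining dot below


def underdot_last(text):
    """Re-encode text so each underdot follows a precomposed vowel+tone letter."""
    # Split the fully decomposed text into a base character plus its marks.
    clusters = []
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch) == 0 or not clusters:
            clusters.append(ch)
        else:
            clusters[-1] += ch

    out = []
    for cluster in clusters:
        # Compose everything except the underdot, then append the underdot last.
        without_dot = cluster.replace(UNDERDOT, "")
        composed = unicodedata.normalize("NFC", without_dot)
        out.append(composed + UNDERDOT if UNDERDOT in cluster else composed)
    return "".join(out)


# ạ + combining acute (the NFC form) becomes á + combining underdot.
print([f"U+{ord(ch):04X}" for ch in underdot_last("\u1ea1\u0301")])
```

The result is deliberately ''not'' in normalization form C, so any software that normalizes its input will turn it back into the underdot-first form.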
: '''MediaWiki note:''' The wiki software has been normalizing all page content [https://www.mediawiki.org/wiki/Unicode_normalization_considerations since time immemorial], meaning that the above table has had to use HTML entities to get the desired effect (e.g., <code>í&#x323;</code> for {{t|ị́}} – don’t do this elsewhere). As far as I’m aware, there’s no way to sidestep this, so expect janky-looking underdots until/unless we patch the font used on this wiki (Commissioner) to include the anchor points.


== See also ==
== See also ==
* [https://toaq.net/refgram/02/ "Symbols and sounds"] in the [[Reference grammar]].
* [https://toaq.net/refgram/orthography/ ''Orthography''] in the [[Reference grammar]].
* [[Input methods]] for writing Toaq's diacritics.
* [[Input methods]] for writing Toaq's diacritics.
* [[Hoelai]], the major non-Latin writing system.
* [[Deranı]], the other, non-Latin writing system.