What is Toaq? (for linguists)

From The Toaq Wiki
Revision as of 03:23, 11 August 2024 by Isı (Writing a parser: Fix typo)

This article explains Toaq's origin and goals to an audience of linguists.

What is Toaq?

Kaqgaınáqchoqjaokaqbıu.
see 1sg the\man use of\the telescope
I saw the man [who's] using the telescope.

Kaqgaınáqchôqjáokaqbıu.
see 1sg the\man adv\use the\telescope
I saw the man [by] using the telescope.

This wiki is about a constructed language called Toaq. Constructed languages, like Esperanto or Toki Pona, are those deliberately created by people for some purpose. Toaq is developed and spoken by a small community of hobbyists.

Toaq's primary goal is to be free of syntactic ambiguities like "Everybody likes somebody" or "I saw the man with the telescope". The syntax of Toaq is carefully designed so that every sentence has precisely one meaning. Thus, its syntax-to-semantics transform can be implemented as a deterministic computer program.
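To make the ambiguity concrete, here is a minimal Python sketch that counts the parse trees a toy English grammar assigns to the telescope sentence. The grammar, the lexicon, and the function name `count_parses` are illustrative assumptions made for this example; they are not taken from Toaq or from any Toaq tool.

```python
from collections import defaultdict

# Toy English grammar in Chomsky normal form (hypothetical, for illustration only).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # PP modifies the verb phrase: "saw ... with the telescope"
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # PP modifies the noun phrase: "the man with the telescope"
    ("PP", "P", "NP"),
]

LEXICON = {
    "I": ["NP"],
    "saw": ["V"],
    "the": ["Det"],
    "man": ["N"],
    "telescope": ["N"],
    "with": ["P"],
}


def count_parses(words, start="S"):
    """Count distinct parse trees for `words` with a CYK-style chart."""
    n = len(words)
    # chart[(i, j, symbol)] = number of trees rooted in `symbol` over words[i:j]
    chart = defaultdict(int)
    for i, word in enumerate(words):
        for category in LEXICON[word]:
            chart[(i, i + 1, category)] += 1
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the two daughters
                for parent, left, right in BINARY_RULES:
                    chart[(i, j, parent)] += chart[(i, k, left)] * chart[(k, j, right)]
    return chart[(0, n, start)]


ambiguous = count_parses("I saw the man with the telescope".split())
unambiguous = count_parses("I saw the man".split())
```

Under this toy grammar the telescope sentence receives two trees, one for each attachment of the prepositional phrase, while the shorter sentence receives one. Toaq's design goal amounts to a grammar for which such a count is never greater than one.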

Meanwhile, Toaq tries to preserve a high degree of humanism. It would be simple to achieve our goal by assigning a phonology to a set of mathematical symbols, but such a language wouldn't look anything like human language, and would be difficult for humans to speak and process. Toaq's syntax is modeled directly after that of natural languages; its lack of ambiguity should, ideally, seem to be a perfect coincidence.

A very brief history of logical languages

Interest in a "mathematically planned human language" runs centuries into the past. (Consider Leibniz's characteristica universalis, which inspired Frege's Begriffsschrift.) Toaq's lineage can be traced back to Loglan, developed in the 1950s to investigate the Sapir–Whorf hypothesis. The idea was roughly that, if language shapes thought, then speakers of a logical language would think more logically. Loglan was conceived of as a sort of speakable predicate logic. Its successor, Lojban, furthered the effort, and its designers hoped that it would see use as a machine interlingua: a syntactically unambiguous language that puts humans and computers on a level playing field for communication.

In the past half-century, the Sapir–Whorf hypothesis has become largely disfavored. Advances in artificial intelligence show us that computers have no trouble engaging meaningfully with natural language, no matter its syntactic ambiguity. Toaq's development thus proceeds more for its own sake than did that of its predecessors.

Within the conlang community, people disagree on what a "logical language" is. For some, merely being based in spirit on predicate logic is enough. By demanding an unambiguous syntax that still conforms to rigorous linguistic ideas of what makes language human, Toaq has set the bar high. Can it be cleared at all?

Writing a parser

Jıa de máq nha
= ⟦nha⟧(⟦jıa de máq⟧)
= PROMISE(⟦jıa⟧(⟦de máq⟧))
= …
= PROMISE(λ𝘸. ∃𝘦. τ(𝘦) ⊆ t ∧
    beautiful.𝘸(a)(𝘦)) | t > t₀ | inanimate(a)

The community is working on a parser, called Kuna, which translates Toaq sentences into their logical forms. To do so, we must thoroughly develop the syntax and semantics of the language. In the process, the developers have taken an amateur interest in natural language semantics. A human-oriented language whose syntax is small and unambiguous turns out to be an attractive testbed for implementing semantics research.

For Toaq to describe everyday situations, and for us to describe Toaq, we have to pick a theory of "speech acts", a theory of "tense", a theory of "plurality/distributivity", et cetera. How many ideas from linguistics research must we combine to make a "complete" language? Do the ideas play nicely together in practice? If learning grammar means absorbing strict rules about scope and quantification, can humans learn to reliably produce correct sentences? The parser's development influences our usage, and vice versa. Step by step, we arrive at a better picture of how a language can be both human and computer-parseable.

Earlier loglangs made little effort to connect with linguists' ideas of how language is structured. A Lojban parser produces ad-hoc syntax trees resembling those of a programming language. Toaq tries to bridge the gap. The basic structure of our parse trees follows Heim and Kratzer's Semantics in Generative Grammar.
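As a rough illustration of the compositional style found in Heim and Kratzer, the following Python sketch computes the denotation of a binary-branching tree by function application: at each branching node, whichever daughter denotes a function is applied to the other. The tiny English model, the lexicon, and the names `Leaf`, `Branch`, and `denote` are assumptions made for this example. This is not Kuna's implementation, and it uses a simple `callable` check where the textbook uses semantic types.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    word: str


@dataclass
class Branch:
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Branch]

# A hypothetical toy model: who sleeps, and who likes whom.
SLEEPERS = {"ann"}
LIKES = {("ann", "bob")}

# Lexical denotations: names denote individuals, verbs denote curried functions.
LEXICON = {
    "Ann": "ann",
    "Bob": "bob",
    "sleeps": lambda x: x in SLEEPERS,
    "likes": lambda y: lambda x: (x, y) in LIKES,  # object first, then subject
}


def denote(tree):
    """Compute a denotation bottom-up using function application only."""
    if isinstance(tree, Leaf):
        return LEXICON[tree.word]
    left, right = denote(tree.left), denote(tree.right)
    if callable(left) and not callable(right):
        return left(right)
    if callable(right) and not callable(left):
        return right(left)
    raise TypeError("no daughter can be applied to the other")


ann_likes_bob = Branch(Leaf("Ann"), Branch(Leaf("likes"), Leaf("Bob")))
bob_likes_ann = Branch(Leaf("Bob"), Branch(Leaf("likes"), Leaf("Ann")))
ann_sleeps = Branch(Leaf("Ann"), Leaf("sleeps"))
```

Because the tree fixes the order of application, an unambiguous syntax yields exactly one such computation per sentence. In this toy model, `denote(ann_likes_bob)` and `denote(ann_sleeps)` are true and `denote(bob_likes_ann)` is false.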

Why bother?

󱚳󱚹󱛍󱚲󱚵󱚲󱛍󱚹󱚱 (pıunuım)
skin-star
"freckle"
󱚷󱚹󱛂󱛀󱛃󱛍󱚺󱛎󱚹 (tıqshoaı)
tile-wing
"butterfly"

Toaq's secondary purpose is to be aesthetically pleasing. Its speakers are excited about language and language creation. Its phonology and lexicon are designed from scratch. Engaging with Toaq can mean anything from contributing software, to inventing interesting words, to making beautiful calligraphy. We are as indebted to Montague as we are to Tolkien. The language's interests straddle academia, art, and fantasy.

The point is not to introduce Toaq as a new lingua franca, nor do we hope to change how we think. Rather, it lets us explore a space where language meets logic and nature meets artifice. We let semantic theories roam freely in a syntactic utopia. Ultimately, Toaq asks a question all syntacticians ask: How do we say what we mean?

If any of this sounds meaningful, or even just fascinating, we'd be delighted to see you on Discord. Laojaı íme súqbo!