Semantics

Revision as of 00:07, 23 July 2023 by Loekıa

Toaq is a loglang, which means that given any sentence, we can unambiguously derive its meaning in logic notation. Semantics, the study of meaning, guides us in determining what those results should look like, and how we might use our knowledge of syntax to derive them.

The refgram tells you that 󱚼󱚲󱛍󱚹 󱚵󱚲󱛍󱛃 󱚺󱛊󱚺 󱛘󱚷󱚹󱛂󱚻󱚺󱛙 󱚵󱛌󱚹󱛍󱚴 󱛘󱚵󱛊󱚺󱛎󱛃󱛄󱚲󱛍󱚺󱛙 (Luı nuo sá tıqra nîe náokua) translates to a formula in logic notation. In reality, this isn't "just" logic notation: it's a very specific notation that has been purpose-built for describing natural language semantics, and this article will help you understand the core concepts behind it.

Models

To help us reason about meaning more directly, mathematicians have come up with the idea of a model: a mathematical object that tells us exactly how to interpret statements in a given formal language. In its most basic form, a model has three parts:

  • A signature, which is the set of all words and symbols found in the language, along with their syntactic properties.
  • A domain, which is the set of all objects, functions, relations, etc. that the language is capable of representing.
  • An interpretation, which is a function defining which symbols correspond to which elements of the domain.

For example, consider the language of basic arithmetic. A model for this language might look like this:
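
As a rough illustration, here is a minimal Python sketch of such a model. The choice of symbols (`0`, `1`, `+`, `*`), the representation of the domain as a membership test, and all names (`signature`, `in_domain`, `interpretation`, `evaluate`) are assumptions made for this example, not a fixed definition.

```python
# A toy model for a fragment of arithmetic (illustrative assumptions throughout).

# Signature: each symbol of the language with its arity (a syntactic property).
signature = {"0": 0, "1": 0, "+": 2, "*": 2}

# Domain: the objects the language can talk about. The natural numbers are
# infinite, so this sketch describes them with a membership test.
def in_domain(value):
    return isinstance(value, int) and value >= 0

# Interpretation: maps each symbol to an element of, or a function over, the domain.
interpretation = {
    "0": 0,
    "1": 1,
    "+": lambda a, b: a + b,
    "*": lambda a, b: a * b,
}

def evaluate(term):
    """Evaluate a term written as a nested tuple, e.g. ("+", "1", "1")."""
    if isinstance(term, str):          # a constant symbol
        return interpretation[term]
    symbol, *args = term               # a function symbol applied to arguments
    assert signature[symbol] == len(args), "arity mismatch"
    return interpretation[symbol](*(evaluate(arg) for arg in args))

two = evaluate(("+", "1", "1"))
```

The three parts stay separate on purpose: swapping in a different `interpretation` (say, arithmetic modulo 5) would give a different model of the same language.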

 

As it turns out, Toaq is a formal language too, which means we can reason about it using models. Now, being a human language, Toaq's semantics are quite a bit more complicated than those of arithmetic, but luckily for us, models are a pretty flexible concept, and we can extend them with extra features as we need them.

In its most basic form, a model for Toaq might look something like this:

 

As you can see, this model holds not just concepts like the meaning of "muao", but also context-sensitive information, such as what "káto" and "jí" refer to.

Say that you have an idea of what the world is like—maybe you have a mental model in your head, or maybe you have a database to look things up in. If your knowledge is complete enough, then that model lets you answer a question, or tell whether what someone said is true, by interpreting their words and then "looking up" the answer. But more often than not, people are working with incomplete knowledge. In this case, if someone tells you something, a model lets you interpret their words and then work backwards from the meaning to figure out what must be true about the world.
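
Both directions can be sketched in a few lines of Python. This is only a toy, assuming a world described by two made-up facts (`raining` and `wet`) and a statement represented as a function from a model to a truth value; none of these names come from Toaq itself.

```python
from itertools import product

# Hypothetical facts a model might settle: is it raining, is the street wet?
FACTS = ("raining", "wet")

def all_models():
    """Enumerate every way the world could be, as a dict from fact to bool."""
    for values in product([True, False], repeat=len(FACTS)):
        yield dict(zip(FACTS, values))

# The meaning of a statement: a function from a model to a truth value.
def raining_and_wet(model):
    return model["raining"] and model["wet"]

# Complete knowledge: interpret the statement, then "look up" the answer.
known_world = {"raining": False, "wet": True}
answer = raining_and_wet(known_world)

# Incomplete knowledge: keep only the models in which the statement is true.
compatible = [m for m in all_models() if raining_and_wet(m)]
```

Here `answer` is the result of the lookup, while `compatible` is what a listener who trusts the speaker can conclude about the world.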

A note for the adventurous: There are alternative approaches to semantics that don't involve models, such as proof-theoretic semantics, in which the meaning of a statement is determined purely by its relationships to other statements in a formal proof system. There have been some attempts to apply this approach to Lojban and Toaq semantics[1][2], but when it comes to natural language semantics, the model-based approach described here is far more common.

Basic notation

Now, we're ready to talk about notation. When you see something like  , what you're looking at are a bunch of things from the domain of the model. A lot of these words ( ,  ,  ,  ,  ) are functions; some others ( ) represent literal "things" from the domain, like physical objects, people, and ideas, which we'll call individuals. Together, these words form an expression that shows you how to calculate the truth value of a specific sentence (in this case, 󱚵󱚴󱛍󱛃 󱚺󱛊󱚺 󱛘󱛄󱚺󱚷󱛃󱛙 󱛘󱚰󱛊󱚲󱛍󱚺󱛎󱛃󱛙 (Neo sá kato múao)), given that you have a model.

There's an important subtlety here: In languages like English and mathematical logic, you can use words to form statements such as "The sky is blue" and " ", or you can use them to form smaller expressions, like "the author of this book" and " ". But in the semantic notation we're looking at, there are no statements, only expressions, because the point of semantics is to examine the values that things denote, including the values of statements themselves. As such, it doesn't make sense to call this a "logic notation", because on its own, it can't form statements. Instead, we'll call it a semantic calculus.

One interesting thing about this notation is that every expression has a type, like some programming languages do. These include:

  • e, the type of individuals, which encompasses everything you can refer to in Toaq. This is a rather broad category, so to help us get more specific when we need it, it includes a couple of subtypes:
    • v, the type of events (things that can happen). More on them later.
    • i, the type of time intervals
  • t, the type of truth values, such as 'true' and 'false'
  • s, the type of worlds (frames of reference to evaluate claims by). More on them later.

There are also functions, for which we use angle brackets: ⟨v, t⟩ is the type of functions that take an event as their input, and return a truth value as their output. Functions can take or return other functions: for example, ⟨⟨v, t⟩, ⟨i, t⟩⟩ is the type of functions that take a function from events to truth values, and return a function from time intervals to truth values.

To keep all these types straight, we give each a dedicated set of variables:

  •   for individuals
    •   for events
    •   for time intervals
  •   for worlds
  •   for functions (the exact type is left to context)

It turns out we don't need variables for truth values, so we don't assign them any.

Be careful when reading these letters, because italics are meaningful. There are some tricky pairs of symbols such as  , which is a world variable, versus  , which is a constant referring to the real world, and  , which is a time interval variable, versus  , which is a constant referring to the salient time interval.

Another important feature of this language is that it has a special syntax for writing functions, known as a lambda expression. Lambda expressions are easy to spot because they start with the Greek letter λ, and have two components: a variable name representing the function's input, and an expression representing the function's output. For example,   is a function that takes a value   as its input, and outputs the value  . Since   is a variable of type  , and   is an expression of type  , we can tell that this function has type  . Similarly,   is a function of type   which computes whether   is an event of the speaker whispering.

You can apply these lambda functions to an argument in the same way you would apply a named function: by placing them before their argument, surrounded with parentheses. For example, if we say that   is the function  , then   and   are two ways of saying the same thing—they both evaluate to  .
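
Python's own `lambda` behaves the same way, so it makes a handy playground. The sketch below is an illustration only: representing events as dicts, and the names `Event`, `EventPredicate`, `is_whisper_by_speaker`, and `negate`, are assumptions of this example.

```python
from typing import Callable

# Hypothetical stand-ins for the types discussed above.
Event = dict                                 # an event, described by a few fields
EventPredicate = Callable[[Event], bool]     # a function from events to truth values

# A lambda expression: the variable before the colon is the input,
# the expression after it is the output.
is_whisper_by_speaker: EventPredicate = (
    lambda e: e["kind"] == "whisper" and e["agent"] == "speaker"
)

event = {"kind": "whisper", "agent": "speaker"}

# Applying a named function and applying the lambda directly give the same value.
by_name = is_whisper_by_speaker(event)
inline = (lambda e: e["kind"] == "whisper" and e["agent"] == "speaker")(event)

# Functions can take and return other functions: this one maps an event
# predicate to a new predicate that holds exactly when the original does not.
negate: Callable[[EventPredicate], EventPredicate] = lambda p: (lambda e: not p(e))
```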

Finally, here are some common symbols that you'll see. It's no coincidence that these match the symbols used in mathematical logic, and even share the same syntax! But in the world of semantics, you should learn to think of them as functions rather than operators that get special syntactic treatment.

  • Conjunctions, which attach to two truth values and output a new truth value (type  )
    • ∧ for "and"
    • ∨ for "or"
  • Polarizers, which attach to one truth value and output a new truth value (type  )
    • ¬ for "not"
    •   for "indeed", "in fact"
  • Quantifiers, which take a predicate (and optionally, another predicate to restrict the domain) and output a truth value (type   or  )
    • ∃ for "some"
    • ∀ for "every"

For example, we might write the interpretation of "indeed, every person is living or dead" as  .
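
To make the "these are just functions" point concrete, here is a small Python sketch over a made-up three-element domain. Everything in it is an assumption for illustration: the individuals, the sets `PERSON`, `LIVING`, and `DEAD`, and in particular the choice to treat "indeed" as a function that returns its input unchanged.

```python
# Toy domain of individuals (hypothetical names).
DOMAIN = ["ann", "bo", "rock"]
PERSON = {"ann", "bo"}
LIVING = {"ann"}
DEAD = {"bo"}

# Connectives and polarizers as ordinary functions on truth values.
def and_(p, q): return p and q
def or_(p, q): return p or q
def not_(p): return not p
def indeed(p): return p          # assumed here to leave the truth value unchanged

# Quantifiers take a restricting predicate and a body predicate.
def some(restriction, body):
    return any(body(x) for x in DOMAIN if restriction(x))

def every(restriction, body):
    return all(body(x) for x in DOMAIN if restriction(x))

is_person = lambda x: x in PERSON
living_or_dead = lambda x: or_(x in LIVING, x in DEAD)

# "Indeed, every person is living or dead."
result = indeed(every(is_person, living_or_dead))
```

Dropping the restriction changes the answer, since the rock is neither living nor dead: `every(lambda x: True, living_or_dead)` is false in this toy domain.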

Events

One of the most basic jobs of any semantic theory is to define how verbs work. The traditional approach, used widely throughout mathematics, is to represent 󱚴󱚺 󱚾󱛊󱚹 󱛘󱚵󱛊󱚺󱛎󱛃󱚰󱚹󱛙 (Fa jí náomı) as  , where the verb is interpreted as a function (here,  ) receiving the subject and any objects as arguments. But sadly, this approach is unable to account for tense, aspect, or adverbs.

Modern semantics research has settled on a single concept to overcome all of these issues: events. An event is an extra argument passed to a verb representing the action itself; the instance of that verb "happening". For instance,   computes whether   is an event of the speaker going to the sea. Whereas the first two arguments represent the participants in the action (the goer and the destination), e stands for the thing that connects them: the going, or the journey. Then, a sentence like 󱚴󱚺 󱚾󱛊󱚹 󱛘󱚵󱛊󱚺󱛎󱛃󱚰󱚹󱛙 (Fa jí náomı) can be understood as claiming that there is such an event:  . This system is credited to philosopher Donald Davidson, giving it the name Davidsonian event semantics.

This gives us a systematic way to deal with adverbs: to modify the verb, modify the event variable introduced by the verb. This is intended to reflect the intuition that "I slept briefly" has the same meaning as "My sleep was brief". For example, 󱚵󱚲󱛍󱛃 󱚾󱛌󱚹󱚱 󱚾󱛊󱚹 (Nuo jîm jí) can be interpreted as  . And prepositions work similarly: for 󱚼󱚺󱛎󱛃 󱚾󱛊󱚹 󱚵󱛌󱚹󱛍󱚴 󱛘󱚾󱛊󱚹󱛍󱛃󱛙 (Lao jí nîe jío) we would use   — "whether there is some event of me waiting that is inside the building".
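
Here is a minimal Python sketch of the idea, assuming events are stored as dicts in a toy model. The field names (`kind`, `agent`, `goal`, `brief`) and the helper `exists_event` are invented for this example.

```python
# Events in a toy model (all fields and values are hypothetical).
EVENTS = [
    {"kind": "go", "agent": "speaker", "goal": "sea"},
    {"kind": "sleep", "agent": "speaker", "brief": True},
    {"kind": "wait", "agent": "speaker", "inside": "building"},
]

def exists_event(predicate):
    """Existential closure: is there some event satisfying the predicate?"""
    return any(predicate(e) for e in EVENTS)

# The verb takes the event as an extra argument next to its participants.
def go(e, goer, destination):
    return e["kind"] == "go" and e["agent"] == goer and e["goal"] == destination

def sleep(e, sleeper):
    return e["kind"] == "sleep" and e["agent"] == sleeper

# An adverb is just another predicate of the same event.
def brief(e):
    return e.get("brief", False)

i_go_to_sea = exists_event(lambda e: go(e, "speaker", "sea"))
i_sleep_briefly = exists_event(lambda e: sleep(e, "speaker") and brief(e))
```

Because the adverb only looks at the event, the verb's own definition never has to change when a modifier is added.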

With events in our toolbox, tense and aspect also fall into place. If we imagine that every event has a temporal footprint (the points in time at which it takes place), then it seems reasonable that there should be a function to access this information. We call this τ, the temporal trace function (type  ). Aspect is then understood as making a claim about an event's temporal structure, relative to a reference time determined by the tense. For instance, 󱚷󱚺󱚱 (tam) makes the claim that the event's temporal trace lies fully within the reference time:  . (This one comes up a lot, because 󱚷󱚺󱚱 (tam) is the default aspect.) And 󱚼󱚲󱛍󱚹 (luı) makes the claim that the event's temporal trace comes before the reference time:  .
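
The two aspects described above can be sketched in Python as relations between intervals. This assumes, purely for illustration, that a time interval is a `(start, end)` pair of numbers and that each event stores its own interval; the function names `tau`, `within`, `before`, `tam`, and `lui` (an ASCII spelling of luı) are choices made for this example.

```python
# Time intervals as (start, end) pairs; the numbers are arbitrary time units.
def tau(event):
    """Temporal trace: the interval over which an event takes place."""
    return event["interval"]

def within(inner, outer):
    """The inner interval lies fully inside the outer one."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

def before(first, second):
    """The first interval ends before the second begins."""
    return first[1] < second[0]

# Aspects relate an event's temporal trace to a reference time.
def tam(event, reference):
    return within(tau(event), reference)

def lui(event, reference):
    return before(tau(event), reference)

nap = {"kind": "sleep", "interval": (2, 3)}
```

With a reference time of `(1, 5)` the nap satisfies `tam`; with a reference time of `(4, 6)` it satisfies `lui` instead.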

So including aspect, the complete interpretation of 󱚴󱚺 󱚾󱛊󱚹 󱛘󱚵󱛊󱚺󱛎󱛃󱚰󱚹󱛙 (Fa jí náomı) should be  . This is a little cumbersome to read, so you will sometimes see it abbreviated to   when we're being lazy.

Presuppositions

Some statements carry a set of assumptions in addition to their main semantic content. When we say "The current king of France is bald", it is assumed that there is a current king of France. And likewise, the sentence 󱚼󱚲󱛍󱚹 󱚵󱚲󱛍󱛃 󱚺󱛊󱚺 󱛘󱚷󱚹󱛂󱚻󱚺󱛙 󱚵󱛌󱚹󱛍󱚴 󱛘󱚵󱛊󱚺󱛎󱛃󱛄󱚲󱛍󱚺󱛙 (Luı nuo sá tıqra nîe náokua) carries the assumption that 󱚵󱛊󱚺󱛎󱛃󱛄󱚲󱛍󱚺 (náokua) actually refers to a bathroom. (It would be nonsensical to say such a thing while pointing to, say, a car!) The technical term for an assumption of this kind is a presupposition.

There's a trick that we can use to write presuppositions alongside a semantic expression: by leveraging the mathematical notion of an expression being undefined. Just as   is undefined when  , "the current king of France" should be undefined when France has no king. In semantic notation, we write this as  . This restricts the possible models to only those that set   to be a king of France.
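
One way to get a feel for "undefined" is to model it as a function that refuses to return a value. The Python sketch below does this with an exception; the names `PresuppositionFailure`, `presupposing`, and `is_defined`, as well as the two toy models, are assumptions of this example rather than part of the notation.

```python
class PresuppositionFailure(Exception):
    """Raised when an expression is undefined in the given model."""

def presupposing(value, condition):
    """Return the value if the presupposed condition holds, else be undefined."""
    if not condition:
        raise PresuppositionFailure
    return value

def is_defined(thunk):
    """Check whether evaluating the delayed expression yields a value at all."""
    try:
        thunk()
        return True
    except PresuppositionFailure:
        return False

# Two hypothetical models that differ in whether France has a king.
model_a = {"kings_of_france": ["louis"], "bald": {"louis"}}
model_b = {"kings_of_france": [], "bald": set()}

def king_is_bald(model):
    kings = model["kings_of_france"]
    king = presupposing(kings[0] if kings else None, len(kings) == 1)
    return king in model["bald"]
```

In `model_a` the sentence gets a truth value; in `model_b` it gets none, which is different from being false.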

Note that this   clause can appear anywhere within an expression, not just at the top level. One example where it needs to be embedded in a sub-expression is in 󱛃󱚺󱛂 󱚷󱛊󱚲 󱛘󱚶󱚴󱛍󱛃󱛙 󱛄󱛊󱚴 󱛘󱚳󱚺󱛎󱛃󱛙 󱛌󱚺󱛂 (Gaq tú deo ké pao âq). This becomes:  . Moving the   clause to the top level wouldn't work, because it uses the variable  , which is only available inside the scope of the   function.

In lambda expressions, you might also come across the syntax  , where   is imagined to be a quantifier restricted by  . This is the same thing as writing  .

Worlds

Another important concept for any semantic theory to cover is modality: the treatment of words such as 󱛀󱚴 (she), 󱚶󱚺󱛎󱚹 (daı), 󱚺󱛎󱛃 (ao), and 󱚶󱚹 (dı). We use these words to make claims not about the actual state of the world, but about possibilities, obligations, or beliefs. The tried and true approach to modality, named after philosopher Saul Kripke, is known as Kripke semantics.

In Kripke semantics, we imagine that there are a multitude of worlds: one world,  , represents the real world, while others represent alternate timelines. Then, every verb is extended to take a world argument: for example,   computes whether there is an event of the speaker whispering in the real world, with the world variable being written in a subscript for readability.

In this framework, we can understand modals as making claims about alternate worlds. For instance, 󱛀󱛌󱚴 󱛔 󱛁󱚺󱛋 󱚷󱚺󱛎󱛃 󱚺󱚹 󱚺󱛊󱚲󱛂 󱛘󱚴󱛊󱚺󱚴󱚲󱛍󱚺󱛂󱛙 󱛔 󱚵󱛋󱚺 󱚿󱛃 󱚺󱛊󱚲󱛂 󱛆󱛊󱛃󱛂 (Shê, ꝡä tao sı súq fáfuaq, nä cho súq hóq) means "in all possible worlds, minimally different from the real world, in which you go to see the movie, you like it". In semantic notation, that looks like:  . The function   is the part that stands for "  is a possible world minimally different from the real world". The technical term for this function is the accessibility relation, because it defines which worlds we can "access" and talk about using the modal 󱛀󱚴 (she).

Some modals, such as 󱚶󱚺󱛎󱚹 (daı), use the quantifier   instead of  , because for something to be possible, it only needs to be true in one possible world. Other modals, such as 󱚶󱚹 (dı), use a completely different accessibility relation ( ) to talk about acceptable worlds rather than possible worlds. And other modals, such as 󱚺󱛎󱛃 (ao), use an accessibility relation that presupposes that the complement is not true in the reference world, to achieve a counterfactual effect. This world metaphor really is flexible enough to account for all modals!
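
A toy version of this machinery fits in a few lines of Python. The worlds, the facts in them, and the accessibility table below are all made up for illustration, and `necessarily` and `possibly` are hypothetical names for a universal and an existential modal.

```python
# Worlds in a toy model; "w0" plays the role of the real world (assumption).
WORLDS = {
    "w0": {"sees_movie": False, "likes_movie": False},
    "w1": {"sees_movie": True, "likes_movie": True},
    "w2": {"sees_movie": True, "likes_movie": True},
    "w3": {"sees_movie": True, "likes_movie": False},
}

# Accessibility relation: which worlds count as relevant from a given world.
# Here w3 is treated as too remote from w0 to be accessible.
ACCESSIBLE = {"w0": {"w1", "w2"}}

def necessarily(world, restriction, body):
    """Universal modal: body holds in every accessible world meeting the restriction."""
    return all(body(w) for w in ACCESSIBLE[world] if restriction(w))

def possibly(world, restriction, body):
    """Existential modal: body holds in at least one such world."""
    return any(body(w) for w in ACCESSIBLE[world] if restriction(w))

sees = lambda w: WORLDS[w]["sees_movie"]
likes = lambda w: WORLDS[w]["likes_movie"]
dislikes = lambda w: not likes(w)

if_you_see_it_you_like_it = necessarily("w0", sees, likes)
```

Changing only the `ACCESSIBLE` table (for example, admitting `w3`) changes the verdict, which is why the accessibility relation carries so much of a modal's meaning.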

Note that similarly to events, we sometimes get lazy and neglect to write the world arguments on verbs.

Propositions

A proposition is, in the broadest sense, anything that bears a truth value, such as a fact, a belief, or the meaning of a sentence. We can say that the sentence "Die Erde ist ein Planet" expresses the proposition that the Earth is a planet, and likewise, in the sentence "I believe that I saw a ghost", we can identify "that I saw a ghost" as referring to the proposition that the speaker saw a ghost.

In Toaq, we use the complementizer 󱛁󱚺󱛋 (ꝡä) to create a reference to a proposition, which can then become the complement of another verb. So, our semantic theory needs to account for this construct, and it turns out that it's best to use two different "interpretations" of propositions for this purpose.

The first interpretation is propositions as functions. The idea is to interpret a complementizer phrase as a function which takes a world as an input, and outputs the truth value of the proposition in that world (type  ). So for example, in 󱚿󱚹 󱚾󱛊󱚹 󱛔 󱛁󱚺󱛋 󱚸󱚺 󱚻󱚲󱛂󱛀󱚲󱛍󱚺 (Chı jí, ꝡä za ruqshua), we would interpret the complementizer phrase as  , and pass this as an argument to the main verb, giving  . Note that it would be wrong to interpret the complementizer phrase as  , because this evaluates to a simple truth value, which fails to capture the statement's semantic content. No one goes around saying "I believe [TRUE]" or "I believe [FALSE]". By using a function, we capture the statement's intension (its abstract connotation) rather than its extension (the concrete truth value held by the statement in the real world).
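
The difference between passing a truth value and passing a function can be seen in a short Python sketch. It assumes a simple account of belief (true in every world compatible with what the believer thinks), which is an assumption of this example and not something the text above commits to; the worlds and names are invented too.

```python
WORLDS = ["w0", "w1", "w2"]          # "w0" stands in for the real world
RAINS_IN = {"w1", "w2"}              # hypothetical: worlds where it will rain

# Intension: a function from worlds to truth values.
it_will_rain = lambda w: w in RAINS_IN

# Extension: the truth value in the real world only.
extension = it_will_rain("w0")

# Worlds compatible with what the speaker believes (an assumption of this sketch).
BELIEF_WORLDS = {"speaker": {"w1", "w2"}}

def believes(believer, proposition):
    """The believer holds the proposition true in all of their belief worlds."""
    return all(proposition(w) for w in BELIEF_WORLDS[believer])

speaker_believes_rain = believes("speaker", it_will_rain)
```

The speaker believes it will rain even though `extension` is false, so the bare truth value could never have been enough input for `believes`.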

This approach is nice and simple, but it does have limitations. In Toaq, we can not only reference propositions with 󱛁󱚺󱛋 (ꝡä), but we can also assign them to variables, or even quantify over them, as in the sentence 󱚶󱚲󱛍󱚺 󱚾󱛊󱚹 󱚺󱛊󱚹󱛍󱚺 󱛘󱚻󱚺󱛎󱚹󱛙 (Dua jí sía raı). A naive approach to interpreting this sentence would be  , where the variable 󱚻󱛊󱚺󱛎󱚹 (ráı) is taken to range over functions of type  . But if you let Toaq variables range over functions from the model, this now lets you construct the liar paradox, a sentence which contradicts itself: 󱚺󱚺󱛆󱚲 󱚵󱛊󱚹 󱛘󱚻󱚲󱛍󱚺󱛂󱚺󱚴󱛙 (Sahu ní ruaqse). Interpreting this sentence, we get  , which is problematic. Philosophers have studied this paradox extensively, and come up with a few different possible responses:

  • Restrict the language's syntax so that it can't even express the liar paradox (not an option for a human language like Toaq)
  • Allow models to contain contradictions, by departing from classical logic in some way (for example, by adding a third truth value, or otherwise weakening the logic to prevent explosion)
  • Use a more specific notion of truth for propositions, so that the language doesn't literally contain its own truth predicate

Both of the last two options will work, and we should ensure that our semantic notation can accommodate either of them as resolutions to the paradox. This is where the second interpretation comes in: propositions as individuals. The idea is to let some individuals stand for propositions, and use the functions   and   (both of type  ) to access their semantic content. There could also be a function   (type  ) which lets you convert propositions in the other direction, from functions to individuals. With this approach, quantifying over propositions, as in 󱚶󱚲󱛍󱚺 󱚾󱛊󱚹 󱚺󱛊󱚹󱛍󱚺 󱛘󱚻󱚺󱛎󱚹󱛙 (Dua jí sía raı), looks like this:  . Note the use of   to convert the variable   into an  , which enables us to reuse the same version of   that takes   propositions.

The consequence of this approach is that we now have a layer of abstraction to play with (  and  ), so that models are free to apply any reasonable resolution to the liar paradox. For example, we can allow the contradiction to exist by setting   directly equal to  , or we can let   and   refer to some more specific notion of truth that holds up to the liar paradox, such as Kripkean truth[3] or stable/categorical truth[4].
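
As a rough sketch of the "layer of abstraction" idea, the Python below keeps propositions as named individuals and looks up their function content only when needed. The table `CONTENT` and the helpers `content_of`, `as_individual`, and `true_somewhere` are hypothetical names for this illustration, and the sketch does not attempt to resolve the liar paradox itself.

```python
WORLDS = ["w0", "w1"]

# Some individuals in the domain stand for propositions (hypothetical names).
CONTENT = {
    "p_rain": lambda w: w == "w1",
    "p_snow": lambda w: False,
}

def content_of(individual):
    """From an individual to its function representation."""
    return CONTENT[individual]

def as_individual(function):
    """The other direction: find an individual whose content matches."""
    for name, stored in CONTENT.items():
        if all(stored(w) == function(w) for w in WORLDS):
            return name
    raise LookupError("no individual stands for this proposition")

# A verb that expects function-type propositions...
def true_somewhere(proposition):
    return any(proposition(w) for w in WORLDS)

# ...can be reused under a quantifier that ranges over individuals.
something_is_possible = any(true_somewhere(content_of(p)) for p in CONTENT)
```

Since every access to a proposition's content goes through `content_of`, a model is free to decide what that lookup returns for troublesome individuals.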

Properties

A property is an incomplete proposition; a claim with blanks to be filled. A simple example would be the property "◯ is red", also known as "to be red" or simply "redness". By filling in the blank, you get a proposition: "the apple is red". Properties can be arbitrarily complex, containing nested clauses or even multiple blanks: for example, "◯ can't believe that ◯ is not butter". In Toaq, properties are marked by the complementizer 󱚼󱛋󱚺 (lä).

The good news is that once you understand the semantics behind propositions, properties aren't far out of reach. We still have the same concerns: capturing the intension rather than the extension, and enabling variables to refer to properties without creating a paradox. Properties are represented in the same way as propositions, just with an extra parameter or two added for the blanks. This means we get both function-type and individual-type representations.

We use the function representation whenever a property in Toaq is spelled out explicitly with the complementizer 󱚼󱛋󱚺 (lä). For example, the property in 󱚼󱚴󱛍󱛃 󱚾󱛊󱚹 󱛔 󱚼󱛋󱚺 󱚵󱚲󱛍󱛃 󱚾󱛊󱚺 󱛚 (Leo jí, lä nuo já) would be interpreted as  , a function of type  . And for a property with two blanks, you would use a function of type  .

But whenever a Toaq variable is used as a property, we need to fall back to the properties as individuals approach, using   (type  ) or   (type  ) to access its semantic content. So, the correct interpretation of 󱚿󱚴 󱚽󱛊󱚺󱛎󱛃 󱚺󱛊󱚺 󱛘󱚾󱚲󱛍󱚺󱛙 (Che nháo sá jua) would be  .
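
Both representations of a property can be sketched side by side in Python. The argument order (blank first, then world), the facts in `SLEEPS`, and the names `to_sleep`, `PROPERTY_CONTENT`, and `satisfies` are all assumptions made for this example.

```python
WORLDS = ["w0", "w1"]
SLEEPS = {("w0", "ann"), ("w1", "ann"), ("w1", "bo")}   # hypothetical facts

# Function representation: a property with one blank takes the filler of the
# blank and a world, and returns a truth value.
to_sleep = lambda x, w: (w, x) in SLEEPS

# Individual representation: named individuals whose content is such a function.
PROPERTY_CONTENT = {"sleeping": to_sleep}

def satisfies(x, property_individual, w):
    """Does x have the property denoted by the individual in world w?"""
    return PROPERTY_CONTENT[property_individual](x, w)

bo_sleeps_here = satisfies("bo", "sleeping", "w0")
bo_sleeps_there = satisfies("bo", "sleeping", "w1")
```

A property with two blanks would simply take one more argument before the world.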

Notes

  1. brismu, a sketch of an inferential approach to Lojban semantics
  2. Hoemuı, the beginnings of a sketch of an inferential approach to Toaq semantics (super outdated)
  3. Kripke, S., 1975, “Outline of a theory of truth”, Journal of Philosophy, 72: 690–716.
  4. The Revision Theory of Truth (Stanford Encyclopedia of Philosophy)