oxaliq.net - learning and making

thinking about what language is for a moment

by way of an opening query: what does a language need? naturally our base case spoken languages have sounds and our base case signed languages have gestures. for each of these, we have an articulatory mechanism: the vocal tract, or the face and hands and arms; and a perceptual mechanism: the auditory system, or the visual system. the unique thing about language among other forms of communication is how languages use time. it might seem basic, but it's easy to forget that 'my cat scratched the post' and 'the post scratched my cat' mean very different things. and the same sounds in a different sequence can become a smattering of ideas as in 'the my scratched cat post' or even lose meaning altogether as in 'cra catsm sde thymst opsh'. languages differ on how they use time--some languages allow more freedom for word order than others, but relationships between articulations through time are essential to all known languages. for now, let's start with those articulatory bits! (more on time and on meaning later)

phonemes

what even are they?

no talk of letters here! no 'ghoti' is pronounced 'fish' jokes! instead, let's imagine how to describe the difference in articulation and in reception between 'mine' and 'fine' and 'wine' and 'dine' or between ''. these words rhyme! which (for short words) is a way of saying that most of their sounds are the same, and the similar bits come at the end. (rhyme is a little more complicated than that, encompassing stress patterns.) it is uncontroversial to suggest that there is a basic articulatory/perceptual unit in any given modality which, when strung together, produces the basic units of meaning. this basic unit, no matter the modality, is called the 'phoneme'. (historically, called a 'chereme' in sign languages.) in selecting the words i have, i've already hinted at how linguists support the existence of these phonemes. 'dine' and 'fine' together form a 'minimal pair' of words with a different meaning whose difference is found in only one perceptually distinct part of their articulation. because we know from english language usage that 'mine' and 'fine' and 'wine' and 'dine' are distinct words with distinct meaning, we have a clue that there is some phonemic difference between /m/ and /f/ and /w/ and /d/. by collecting more examples of these minimal pairs ('do' and 'moo' and 'wed' and 'dead' and on and on) we can begin to describe the physical sounds associated with each phoneme and how each is articulated.
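to make the minimal pair idea concrete, here's a small python sketch that hunts for them in a toy lexicon. everything in it is an assumption for illustration: the transcriptions are my own rough broad ones, and `minimal_pairs` is a made-up helper, not anything latl provides.

```python
from itertools import combinations

# hypothetical toy lexicon: word -> sequence of phoneme symbols (rough, broad transcription)
LEXICON = {
    "mine": ("m", "a", "j", "n"),
    "fine": ("f", "a", "j", "n"),
    "wine": ("w", "a", "j", "n"),
    "dine": ("d", "a", "j", "n"),
    "do": ("d", "u"),
    "moo": ("m", "u"),
}


def minimal_pairs(lexicon):
    """yield (word_a, word_b, segment_a, segment_b) for words that differ in exactly one position."""
    for (word_a, segs_a), (word_b, segs_b) in combinations(lexicon.items(), 2):
        if len(segs_a) != len(segs_b):
            continue  # this sketch only compares words of equal length
        diffs = [(x, y) for x, y in zip(segs_a, segs_b) if x != y]
        if len(diffs) == 1:
            yield word_a, word_b, diffs[0][0], diffs[0][1]


pairs = list(minimal_pairs(LEXICON))
for word_a, word_b, seg_a, seg_b in pairs:
    print(f"{word_a} ~ {word_b}: /{seg_a}/ vs /{seg_b}/")
```

each hit is one piece of evidence that the two differing segments are distinct phonemes. a real analysis would also need to handle pairs that differ by an added or missing segment, which this sketch skips.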

human bodies are inexact things--perception is important here! it does us no good to describe an extra-tightly clenched middle finger in a closed hand shape as indicative of a distinct phoneme as it would be unlikely to be perceptible to an interlocutor and so could never disambiguate between two signs. environments are noisy and so articulation is also important! in my dialect of english the word 'put' /pʊt/, in a noisy environment, might be pronounced roughly [pʰʊtʰ]. in casual speech, however, this same word is frequently realized as [pʰɵʔ] with the only audible consonant at the end being the glottal closure of 'uh-oh'. the only ghost of the exaggerated realization is typically an inaudible tongue placement behind the alveolar ridge. a speaker recognizes what the phoneme 'could be' with more effort, but typically such effort is unnecessary for understanding. this suggests that phonemes are not simply sounds or handshapes or mouth movements. something must be underlying the equality of meaning between [pʰʊtʰ] and [pʰɵʔ]

there's a fairly wide consensus amongst linguists that, despite being the minimal constituent needed to represent meaning in language, phonemes are not atomic. a phoneme can be decomposed into constituent features and minimal pairs of phonemes can be shown to be distinct only in their realization of one feature. by way of example, the [b] in the word 'shabby' and the [m] in the word 'shammy' differ only in that the [m] is pronounced with air passing through the nasal cavity. the feature [+/- nasal] is therefore taken to be a salient feature in english phonology
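as a rough sketch of that decomposition: if a phone is just a mapping from feature names to values, then asking how two phones differ is a dictionary comparison. the four features below are a deliberately tiny toy set i picked for illustration, not a claim about the right feature system for english.

```python
# hypothetical toy bundles: feature name -> value
B = {"labial": "+", "voice": "+", "continuant": "-", "nasal": "-"}  # [b] as in 'shabby'
M = {"labial": "+", "voice": "+", "continuant": "-", "nasal": "+"}  # [m] as in 'shammy'


def differing_features(phone_a, phone_b):
    """return the sorted names of features whose values differ (or that only one phone has)."""
    names = phone_a.keys() | phone_b.keys()
    return sorted(name for name in names if phone_a.get(name) != phone_b.get(name))


print(differing_features(B, M))
```

when the returned list has exactly one entry, the two phones form a minimal pair at the level of features, which is the situation described above for [b] and [m].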

all well and good, but things start to get tricky when we start defining features. firstly, there is no single agreed-upon set of features by which to analyze all languages of a given modality. as stated, there's broad agreement that phonetic features exist and many proposed features are uncontroversial, yet even linguists analyzing the same language can disagree upon featural details. vowels in particular are quite slippery to analyze, with [+/- back], [+/- close], [+/- front], [+/- low], [+/- high], [+/- tongue root retracted], [+/- rounded] among the features present in different systems. there are also some linguists who, relying on auditory analysis, analyze vowels primarily via formant analysis. (formants are measures of what is sometimes referred to as 'resonance' or 'vowel color' -- they are the pitches above the fundamental frequency with the greatest relative amplitude.) it is this amateur crank's opinion that because articulation and perception are subject to different constraints and pressures, what is deemed a feature can elide a relationship between speech actor and interlocutor. thankfully, should latl allow for user definition of features and their phonemes, it can remain agnostic to the hairy work of actual linguistics

users should therefore be able to define their own phonetic feature sets and use those to compose their phonemes. (i'm going to sneak in the undefended assertion here that users should be able to use other users' definitions as well. forgive me.) if you're reading this and are familiar with linguistics, you might now be wondering about the curious case of place of articulation. should place features be treated as hierarchical -- should [coronal] place of articulation be required for [+/- anterior] feature of the crown of the tongue? if so, how are coarticulations like [tʷ] or [k͡p] to be expressed in featural terms? here again, latl will allow for the definition of hierarchical features and make no assumptions about their use
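one way to picture hierarchical features, sketched in python: each dependent feature names the parent it needs, and a bundle is checked against that table. the `DEPENDS_ON` table and the bundles are hypothetical. in particular, treating [round] as dependent on [labial] is just one possible analysis, chosen so that a coarticulation like [tʷ] can carry two place nodes at once.

```python
# hypothetical hierarchy: dependent feature -> parent feature it requires
DEPENDS_ON = {
    "anterior": "coronal",
    "distributed": "coronal",
    "round": "labial",
}


def dependency_errors(bundle):
    """return the sorted features in `bundle` whose required parent feature is absent."""
    return sorted(
        name for name in bundle
        if name in DEPENDS_ON and DEPENDS_ON[name] not in bundle
    )


T = {"coronal": True, "anterior": "+"}                                   # plain [t]
T_LABIALIZED = {"coronal": True, "anterior": "+", "labial": True, "round": "+"}  # [tʷ]
ORPHANED = {"anterior": "+"}                                             # [anterior] with no [coronal]

for bundle in (T, T_LABIALIZED, ORPHANED):
    print(dependency_errors(bundle))
```

because the table is plain data, a user could swap in a flat system (an empty table) or a much deeper hierarchy without the checking code changing.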

yet another problem is hiding in the view i've thus provided of phonological features. there is a wide (but not universal) belief that distinctive features in phonology are inherently binary. this is convenient from a computational perspective, but may not be descriptive of real language. firstly, it is possible to analyze [coronal] in the previous paragraph as a unary feature relevant to place of articulation. more distressingly, a proposed feature set that includes [+/- high] and [+/- low] predicts the nonsense value set: {[+ high] [+ low]}. one approach to this conundrum is to propose a feature scale [-1/0/1 height]. this is far from a settled matter, but latl should prioritize a user's ability to define such feature scales over implementation considerations or linguistic debate
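the nonsense combination is easy to see by brute force. this python sketch enumerates every pairing of binary [high] and [low], then maps the sensible ones onto a three-point height scale. the particular mapping (1 for high, 0 for mid, -1 for low) is my assumption about how such a scale might be laid out.

```python
from itertools import product

# every combination the two binary features predict: 2 x 2 = 4
binary_combos = [{"high": h, "low": l} for h, l in product("+-", repeat=2)]

# assumed mapping from sensible binary combinations to a [-1/0/1 height] scale
SCALE_FOR = {
    ("+", "-"): 1,    # high vowel
    ("-", "-"): 0,    # mid vowel
    ("-", "+"): -1,   # low vowel
}


def height(combo):
    """return the scale value for a binary combination, or None if it is the nonsense one."""
    return SCALE_FOR.get((combo["high"], combo["low"]))


heights = [height(combo) for combo in binary_combos]
print(heights)
```

the scale has exactly three values and no slot for {[+ high] [+ low]}, so the impossible state cannot even be written down.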

it's been a few paragraphs without any mention of sign languages, so it is worth gesturing at how their phonological features relate to these considerations to ensure latl doesn't start its life with a modality bias. sign languages are widely understood to have phonological systems that are featural. as is the case with spoken languages, specifics of feature sets vary based on language and researcher. features can be salient to a language and form minimal pairs, ie [+/- palm prone] is one way of reading the difference between the ASL fingerspelling signs for /p/ and /k/. research suggests that there is a high degree of hierarchical complexity in the phonological features of sign languages, which maps very neatly to the place of articulation problem in spoken languages. features related to handshape, such as [+/- flex] or [+/- extension] only make sense with regard to selected fingers. i have not seen any research about featural scales in sign languages, but it would be unsurprising to analogize the same issues arising from nonsense combinations of binary features

let's zoom back out to phonemes for a moment to add another wrinkle to the featural representation. the notion (unconscious or not) a speaker of a language has for what constitutes a single sound is understood to be a 'bundle' of features, but not every feature holds the same importance in every environment. by way of example, the /t/ phoneme in my dialect of english can be realized in a number of different ways depending on its location. it can be aspirated [tʰɑk] with [+ spread glottis] (or [+ delayed onset] if you prefer an auditory approach) in 'tock', without aspiration [stɑk] [- spread glottis] in 'stock', or as a flap [ˈbʌ.ɾək] in 'buttock'. this flap differs from the others at least in having [+ sonorant] and [+ voice], but retaining [coronal] [+ anterior]. yet, if i heard *[ɾɑk] in isolation, i would assume the speaker was referring to a stone or a genre of music. this situation is called allophony and latl must maintain a way to treat phonemes like /t/ as salient bundles of features distinct from the more discrete phones [tʰ], [t], [ɾ] (or [ʔ] from the earlier example) whose features are more specified. once again, we see a similar situation with regard to the ASL phoneme /e handshape/, which has allophonic representations [+ open aperture] (the unmarked /e/ familiar in the fingerspelled alphabet) and [- open aperture] in certain environments
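a sketch of how that might look as data, in python: the phoneme is a set of fully specified phones, and its salient bundle falls out as whatever all of those phones share. the feature values follow the description above where it gives them. the [- spread glottis] on the flap is my own assumption, and `salient_features` is a hypothetical helper.

```python
# three allophones of /t/, each as a more fully specified bundle
PHONEME_T = {
    "tʰ": {"coronal": True, "anterior": "+", "spread glottis": "+", "voice": "-", "sonorant": "-"},
    "t": {"coronal": True, "anterior": "+", "spread glottis": "-", "voice": "-", "sonorant": "-"},
    "ɾ": {"coronal": True, "anterior": "+", "spread glottis": "-", "voice": "+", "sonorant": "+"},
}


def salient_features(phones):
    """return the feature values shared by every phone of a phoneme."""
    bundles = list(phones.values())
    first, rest = bundles[0], bundles[1:]
    return {
        name: value
        for name, value in first.items()
        if all(other.get(name) == value for other in rest)
    }


print(salient_features(PHONEME_T))
```

note the limit of this approach: adding [ʔ] to the set would strip out [coronal] and [+ anterior], since a glottal stop shares neither. that's a hint that saliency may need to be stated by the user rather than computed.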

warning! that [ɾ] in my dialect of english, is an allophone of two different phonemes! the realization of the words /bæt.ər/ and /bæd.ər/ ('batter' and 'badder') is the same: [bæɾ.ɚ]. this 'under-specification' is not unique to my dialect of english and some linguists propose an archiphoneme /D/ which is a kind of set of /t/ and /d/ to account for this. in this view 'batter' and 'badder' are orthographically distinct, but phonemically both /bæD.ər/. is this 'really' what is going on? i'm not qualified to say, but i am confident that latl can be made to handle this situation without straining our abstractions too much
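here's a toy python sketch of both views. the `flap` rule is a big simplification i'm assuming for illustration (it ignores stress, syllable boundaries, and the r-colored vowel), and `underspecify` swaps in the archiphoneme everywhere rather than only where the contrast is actually neutralized.

```python
VOWELS = {"æ", "ə"}                  # only the vowels this example needs
ARCHIPHONEMES = {"D": {"t", "d"}}    # /D/ stands in for either /t/ or /d/


def flap(segments):
    """toy rule: /t/ or /d/ between two vowels surfaces as [ɾ]."""
    out = list(segments)
    for i in range(1, len(segments) - 1):
        if segments[i] in {"t", "d"} and segments[i - 1] in VOWELS and segments[i + 1] in VOWELS:
            out[i] = "ɾ"
    return tuple(out)


def underspecify(segments):
    """replace each member of an archiphoneme's set with the archiphoneme symbol."""
    member_to_archi = {m: a for a, members in ARCHIPHONEMES.items() for m in members}
    return tuple(member_to_archi.get(seg, seg) for seg in segments)


batter = ("b", "æ", "t", "ə", "r")
badder = ("b", "æ", "d", "ə", "r")

print(flap(batter) == flap(badder))                  # same surface form
print(underspecify(batter) == underspecify(badder))  # same archiphonemic form
```

either way the two words collapse to one sequence. the difference is only in where the collapse is recorded: in a rule applied on the way to the surface, or in the phonemic representation itself.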

to recap thus far, we have phonemes, which for the purpose of latl are bundles of features of some value. features may be defined by the user of latl into feature systems, whereby they are usually but not always binary and may each have a dependency on another feature in the system. phonemes may have features of varying saliency allowing for allophony. these allophones are phones whose features are slightly different but retain the salient features of their phoneme, whether that phoneme is specified or an underspecified archiphoneme that could represent multiple phonemes. as an additional item, it is helpful to have a shorthand to refer to phonemes and their allophones, ie /t/, [tʰ], [t], and [ɾ] or /D/

an EBNF grammar (because grammars are fun!) of this relationship might be

phoneme = positive-integer * phone, { phoneme } ; (* a phoneme must be a set of phones and optional (archi-)phoneme *)
phone = positive-integer * feature ; (* a phone must be a set of features *)
feature = ( value, identifier ) | positive-integer * feature ; (* a feature must be a value with some identifier or a set of (dependent) features *)
value = non-negative-integer ;
identifier = letter, { letter | "-" } ; (* lispy identifiers assumed for now *)
non-negative-integer = digit , { digit } ; (* from here i'll take for granted the definition of digits and letters *)

this grammar is insufficient to the purpose, but i include it to point at the recursive nature of both phonemes and features revealed by the constraints defined so far. an additional constraint must be that features are bound in a global feature system and a featural definition of a phone requires values for every possible feature within that feature system. additionally, a feature value can be any within a bound set where each feature can be associated with a different set; so, [+/- nasal] and [-1/0/1 high] can exist within the same feature system, but any instance of [nasal] must have a value of [+] or [-] and any value of [high] must have [-1], [0], or [1]. the grammar handwaves with non-negative-integer by analogy with enums in many programming languages. this grammar also defines a language that would be repetitive and finicky to work with. instead of optimizing, i'd like to take a moment to consider the phoneme already solved in latl and think a little bit about how they're used
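the two extra constraints (every phone carries a value for every feature in the system, and each value comes from that feature's own bound set) are easy to state as a check. this is a minimal python sketch, assuming a flat feature system with no dependencies. `FeatureSystem` and `check_phone` are hypothetical names, not a committed design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureSystem:
    """a global feature system: feature name -> the bound set of values it may take."""
    values: dict

    def check_phone(self, phone):
        """return a list of problems with `phone`; an empty list means it is fully and validly specified."""
        problems = []
        for name, allowed in self.values.items():
            if name not in phone:
                problems.append(f"missing [{name}]")
            elif phone[name] not in allowed:
                problems.append(f"bad value {phone[name]!r} for [{name}]")
        for name in phone:
            if name not in self.values:
                problems.append(f"unknown feature [{name}]")
        return problems


# a binary feature and a scalar feature living in the same system
SYSTEM = FeatureSystem({"nasal": ("+", "-"), "high": (-1, 0, 1)})

print(SYSTEM.check_phone({"nasal": "+", "high": 0}))
print(SYSTEM.check_phone({"nasal": "+"}))
print(SYSTEM.check_phone({"nasal": "?", "high": 2}))
```

storing the allowed values as plain tuples plays the role the grammar gives to non-negative-integer: each feature is effectively its own little enum.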

lexemes

zooming out a little to the fundamental unit of meaning

note! for the purpose of this exploration, a lexeme is assumed to be synonymous with 'root morpheme'. if you don't know what this note means, please be aware that i'm being a little bit of a crank again. if you do know what this note means and are suspicious, run with me here for a sec; we'll get to it

for now i'll posit that a lexeme is an ordered sequence of phoneme(s) that corresponds to a productive, atomic meaning. a lexeme MAY be subject to derivation rules which transform its meaning or its role in an utterance, for now called 'derived forms'. this definition allows for any 'part of speech' so long as the lexeme is not derived. taking for granted, for a moment, the category 'word', here's a selection of english words that fit this definition of lexeme: 'a', 'she', 'her', 'for', 'four', 'write', 'right', 'quick', 'quit', 'dirigible', 'abstract'

included are 'function words' (the closed set of grammatically necessary words without independent meaning) like 'a', 'she', 'her', and 'for'. 'content words' are also included (the open set of words with semantic weight) beginning with 'four'. but of course, i've also chosen these words to illustrate some potential traps. we have some phonetic ambiguities: 'for' and 'four' are distinct in some english dialects, but i pronounce them both /fɔɹ/. 'write' and 'right' are indistinguishable from each other in every english and sound something like /ɹajt/. the situation is tricky in this case semantically as well! this is one sequence of sounds upon which multiple different etymologies (encoding mark-making, correctness, directionality, or politics) have converged. if the written forms are any hint, there should be at least two separate lexemes

i've also snuck in the pair 'she' and 'her'. traditionally, 'her' is held to be a derived form of 'she' violating our 'root morpheme' assumption. leaving aside the linguistic reasons to consider 'her' a derived form, there's still the question of what plausible derivation rule could turn the sound sequence /ʃi/ into /hɜɹ/? (the ancestral form of 'her' probably was transparently derived from the ancestral form of 'she', but in this project i'm concerned with how these derivations are obscured by language change through time)

the write/right example and the she/her example, in slightly different ways, both recall the bidirectional nature of language. an idealized speaker *knows* which specific meaning (specific lexeme?!) of /ɹajt/ they are referring to, but their interlocutor must derive the appropriate meaning from context. likewise, a proficient speaker produces /ʃi/ and /hɜɹ/ in the appropriate position within a sentence without difficulty, while a language learner may struggle to hear the connection between the two forms. (other interesting possibilities include using one or the other form in all locations or in random distribution; analogizing the regularity of /hi/->/hɪm/ ('he'/'him') to /ʃi/->/ʃɪm/ where a /hɜɹ/ is expected; or using /hi/, /ʃi/, /ðej/ 'they' or other third person pronouns interchangeably. all of these point at some other juicy stuff that will have to be shelved for now.) this bidirectionality means that latl will need to support the mapping of a sequence of phonemes to an arbitrary number of lexemes, although for now it's safe to assume that a lexeme has only one associated sequence of phonemes. (ignoring, for the moment, variant pronunciations as in 'the' /ði/~/ðə/)

a lexeme will probably need some additional stuff, tho. at the very least a 'dictionary definition' and, of course, a shorthand, ie /ʃi/, /hɜɹ/, or /ɹajt/. there's absolutely more to what latl will require from a lexeme (and users should be able to extend the lexeme primitive to their own ends) but that will have to wait for now
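pulling the last two paragraphs together, a lexeme could be sketched in python as a phoneme sequence plus a definition, with the shorthand derived from the phonemes and an index going from a phoneme sequence back to every lexeme that shares it. the class, the helper, and the one-line definitions are all placeholders i've made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Lexeme:
    phonemes: tuple    # exactly one phoneme sequence per lexeme, for now
    definition: str    # stand-in for a 'dictionary definition'

    @property
    def shorthand(self):
        """the slash-delimited shorthand, eg /ɹajt/."""
        return "/" + "".join(self.phonemes) + "/"


def index_by_form(lexemes):
    """map each phoneme sequence to the list of lexemes that share it."""
    index = defaultdict(list)
    for lexeme in lexemes:
        index[lexeme.phonemes].append(lexeme)
    return index


LEXEMES = [
    Lexeme(("ɹ", "a", "j", "t"), "to make marks"),   # 'write'
    Lexeme(("ɹ", "a", "j", "t"), "correct"),         # 'right'
    Lexeme(("ʃ", "i"), "third person pronoun"),      # 'she'
]

index = index_by_form(LEXEMES)
for lexeme in index[("ɹ", "a", "j", "t")]:
    print(lexeme.shorthand, "->", lexeme.definition)
```

the speaker's direction is a plain attribute lookup (lexeme to phonemes), while the interlocutor's direction goes through the index and may come back with several candidates to sort out from context.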

before moving on

morphosyntax

where meaning and phonology and time start getting funky