okarinaofsteiner · (This post was last modified: 01-26-2024, 07:37 PM by okarinaofsteiner.)

https://genoplot.com/discussions/topic/1...readings/1

Ryukendo Wrote:The Sino-Tibetan language family is one of the largest in the world and has one of its most extensive literatures. It is however shrouded in mystery, and is one of the least well-reconstructed of all the major language families. Among these languages, Chinese, Tibetan (which has many varieties), Newar in Nepal, and Burmese in Burma have extensive written records; some other written languages like Tangut or Zhangzhung are now extinct. Most other languages, which are spoken by ethnicities organized in small independent kingdoms or chiefdoms, down to horticulturalists and hunter-gatherer bands (in the Burma-India-China border area or along the South Asian side of the Himalayan foothills), or by ethnic groups under Chinese, Tibetan, Burmese, Bhutanese or Nepali control, were written down only in the 18th-20th centuries, and many of the languages are somewhat poorly documented or were even only recently discovered, including many of those in NE India, N Burma, and Nepal. There are many difficulties in working with this material, not least because many of the words in ST languages are very short (often monosyllabic), and some languages, such as Chinese, have become very poor in morphology (no inflections, derivations etc) and even consonant-poor.

Nevertheless, work continues. The major source of information about ST languages is the Sino-Tibetan Etymological Database and Thesaurus: https://stedt.berkeley.edu, run from UC Berkeley since the 1930s and a prelude to much future work. The first work meant to come from this database was never published, and the first attempt at doing anything with the (by then large accumulation of) data came out only in the 1970s, by Benedict. By this time, it was clear that these languages were related, but because the subgroups of ST were not known and so not even the proto-stage of the subgroups of ST were reconstructed, you can imagine how difficult it was to reconstruct proto-ST--think about reconstructing proto-Indo-European from modern English, Sri Lankan, Albanian, Armenian etc! There were few definite sound laws, though there were family resemblances all over the place (e.g. a c in one language becoming c-h, g or x or h in others, but in a highly irregular and unpredictable way). Nevertheless, we know at least the following facts: that proto-Sino-Tibetan was quite inflecting, with many prefixes, suffixes and infixes. It was mostly monosyllabic, and because of the inflection often had very complex consonant clusters.

Today, people are embarking on the slow and painstaking work of collecting ever more data from less-known languages, and reconstructing subgroups of ST in individual papers, books and dissertations, such as proto-Bodic (all the varieties and languages related to Tibetan), or proto-Karen. Furthermore, many subgroups are now recognized: Rgyalrong, Himalayish, Qiangic, Lolo-Burmese etc., though there are far too many subgroups (many more than the primary branchings of Indo-European) and their deeper interrelationships are unclear, resulting in a family tree that is still far too bushy and disorganized.

Recently, major breakthroughs have taken place that have clarified the pattern of spread for this language family.

First, two breakthrough papers have been published in ST linguistics, both using Bayesian Phylogenetic analysis to automatically classify languages using linguist-provided cognate sets. The two papers are this: https://www.pnas.org/doi/10.1073/pnas.1817972116 and this: https://www.nature.com/articles/s41586-019-1153-z.
An excellent picture of the distribution of ST languages can be found in the supplements for the first paper, page 3: https://www.pnas.org/doi/10.1073/pnas.18...-materials

I know many people are skeptical of such analytic methods, but the main point here is not that Bayesian Phylogenetic methods can confirm language groupings--this still requires the painstaking work of manually identifying regular sound changes--but that such methods can give us promising hypotheses that linguists can then investigate further. Furthermore, if some linguists propose a subgrouping based on some preliminary patterns they notice, the confirmation of such groupings using these methods further highlights their viability as targets of research. If multiple groups of researchers using different sets of cognates and different methodologies independently recover the same subgroupings, this may serve as strong (albeit not conclusive) evidence that these groupings are legitimate.

Another set of issues lies in the quality and size of the data; in this case, the STEDT data is the gold standard for ST research for the first paper, and the second paper additionally involves wordlists provided by professional linguists working on the languages themselves. Bayesian phylogenetic inference requires the researchers to make a number of technical choices about parameters, including a model for how words appear and disappear over time, how much loaning there is, how frequently regular or irregular sound changes occur, and how likely it is in general that trees of different shapes occur in language families (tree prior); ideally, a robust analysis would explore a wide range of parameters, inference methods and a wide range of priors and show that the results are robust to these (which theoretically, given enough data, they should be). These issues are discussed extensively here: http://www.sfs.uni-tuebingen.de/~yanovic...9-subm.pdf, where a paper using such methods to place Yeniseian as a sub-branch of Na-Dene is extensively criticized as the results were not robust to choices of parameter, implying that the dataset was not actually large enough and of high enough quality to perform robust inference. If you check the supplements of the papers linked for Sino-Tibetan, which all demonstrate high levels of convergence regardless of the parameters chosen, this is not an issue for the conclusions they make.

Something to mention here is that Bayesian inference allows us to assign a certain probability to multiple models, giving us a natural way to represent uncertainty, and thus in the Indo-European case often infers nuclear IE as being a mixture of trees of different shapes, represented by a densitree (plotting all possible trees with intensity proportional to the likelihood of the tree being correct). We get the same issue for high-level groupings of ST here: this is not telling us about “language mixture”, but rather about uncertainty. A set of densitrees is provided for the second paper in the supplementary materials. One can think about the densitree as representing how likely linguists are to find regular sound correspondences or cognate sets conclusively showing that one grouping is correct over another if they went through things with a fine-toothed comb; the more intense the densitree for a particular grouping, the more likely. Of course, these Bayesian analyses were done over lexicon (the cognacy of words). Much more could be done using shared paradigmatic morphology; Yeniseian and Dene were connected using reconstructed paradigmatic morphology, but share very little lexicon, so this kind of Bayesian analysis would not be able to detect that.
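
To make the idea of posterior support concrete, here is a minimal sketch in Python. It is my own illustration, not code or data from either paper. It assumes the posterior sample is available as rooted trees written as nested tuples, and the four-language sample below is invented. The support for a grouping is simply the share of sampled trees that contain it, which is roughly what the intensity of a grouping in a densitree conveys.

```python
from collections import Counter


def clades(tree):
    """Return (leaves of this subtree, set of clades found inside it).

    A tree is a nested tuple; a leaf is a language name (str).
    """
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)  # every internal node defines one clade
    return leaves, found


def clade_support(sample):
    """Fraction of sampled trees in which each clade appears."""
    counts = Counter()
    for tree in sample:
        _, found = clades(tree)
        counts.update(found)
    return {clade: n / len(sample) for clade, n in counts.items()}


# Hypothetical posterior sample of four trees (invented for illustration).
sample = [
    ("Sinitic", ("Sal", ("Tibetan", "Burmese"))),
    ("Sinitic", ("Sal", ("Tibetan", "Burmese"))),
    ("Sal", ("Sinitic", ("Tibetan", "Burmese"))),
    (("Sinitic", "Sal"), ("Tibetan", "Burmese")),
]

support = clade_support(sample)
print(support[frozenset({"Tibetan", "Burmese"})])         # 1.0
print(support[frozenset({"Sal", "Tibetan", "Burmese"})])  # 0.5
```

In this toy sample the Tibetan-Burmese grouping appears in every tree, while the question of which language is the outgroup is split across trees. That is the same kind of picture described above: a solid crown group with an uncertain base.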

Strikingly, these two papers independently recover at least 6 groups of ST languages:

A large group that includes Bodic (Tibetan and all its varieties, including the prestige language spoken in Bhutan--Dzongkha, and Sikkimese), Lolo-Burmese (Burmese needs no introduction; Lolo includes many populations in S Yunnan, N Burma and NW Thailand we find in genetic studies such as Jinuo, Hani, and Lahu which are distinct from other mainland Southeast Asians in tending to peak in Yellow River Neolithic and the "Austroasiatic" Vietnam_N component and be lacking in Dai-related, Hmong-related and Austronesian-related components), Rgyalrong, Qiangic and Naic (including pastoralists and agriculturalists of the Tibetan foothills on the Chinese side of Tibet from Qinghai down to Sichuan, featuring such populations as Naxi from genetic studies, ancient Tangut speakers who founded the Xixia state, and other warlike tribal polities the Chinese have known generically as "Qiang" in their history), plus a range of assorted groups including Nuosu, Ersu and so on. This large group has not really been emphasized in previous studies (though Blench and Post 2014 placed a bunch of these together on their tree), receives very high support in both studies (posterior probabilities around 1), and is the first long-range subgroup proposal for ST--and a very major one--that I think will stand the test of time.

A major "Sal" group (from the word for Sun), that has only recently been proposed, that unites all Bodo-Garo tribals distributed throughout the Brahmaputra valley in NE India and W Burma, with some groups in far N Burma and S China, is also recovered with high probability in both papers.

A major Kuki-Chin-Naga group, also only tentatively proposed so far, uniting some peoples of Nagaland in far Eastern India and populations all along Western Burma, is also supported in both studies. Karen languages (included only in the second paper's analyses), a major group of languages spoken by some militarily powerful ethnic groups on the border of Burma and Thailand, may also be in this group.

A group uniting two sets of languages in far southeastern Tibet, termed Tani-Yidu in one paper and Tani-Digarish in the other, has only recently been proposed by some linguists but also receives strong support in both papers.

Himalayish, including most of the ST languages spoken in Nepal and important local prestige languages such as Newar, is recovered in both papers, and is known as the “Kiranti” group in the former and Himalayish in the latter.

Chinese/Sinitic languages, though this was never in doubt.

There are also other small, independent groups recovered, especially for the second paper which samples more languages, such as Nungish, Kinnauri and so on, though these groups have been well-established among linguists for some time.

Excitingly, the two papers agree that the large Bodic-Lolo-Burmese-centered group is a crown group for ST. Both papers also agree that Chinese, Sal, and Kuki-Chin-Naga tend to be the stem groups/basal branches for ST (first groups to branch off), followed by Tani-Digarish, followed by various Himalayan languages, followed by the ST crown group.

Both groups’ papers agree that Chinese/Sinitic is the most likely to be the outgroup to all other ST languages, but this is as a rather weak outgroup, approaching a toss-up between Chinese and the other basal groups. The shape of the overall tree is therefore much less certain than for IE (where we know for sure the split order is Anatolian, then Tocharian, then nuclear IE); however, the presence of some long-range groups, and the positioning of various groups relative to the crown group is starting to get clearer. This is a surprising level of convergence between independent researchers!

In addition, both groups recover support for the traditional Yellow River homeland for ST languages, in the former case supported by cognates for foxtail millet, pigs and sheep in both Chinese and non-Chinese branches of the family, but less so for other domesticates (such as rice, wheat, barley, horse, or cow).

The second advance is in ancient and present-day DNA, which has shown that the ST peoples expanded in two waves; after reaching the Upper Yellow River Valley, near the Qinghai region, ST speakers expanded both directly into the Tibetan plateau and in another wave southwards along the mountains and valleys, reaching Burma and wrapping around the middle elevations of the Tibetan foothills into Nepal and Kashmir. This is supported strongly by Y-chromosomal studies: https://link.springer.com/article/10.100...018-1461-2 and https://onlinelibrary-wiley-com.ezp-prod...11.00690.x, and by studies using ancient DNA, which find that present-day Tibetoburman populations derive from two waves of migration taking two different routes from the Upper Yellow River region:
https://www.nature.com/articles/s41467-0...7-2#MOESM1
The Y-chromosomes have been recovered from ancient DNA in such a way as to confirm the role of the Upper Yellow River Yangshao-derived population in the genesis of Tibetoburmans in two routes, as seen in page 7 of the supps: https://static-content.springer.com/...MOESM1_ESM.pdf

More and more autosomal analyses have also been published for present-day ST speakers, placing more and more groups into the academic and public domains, including this on the Tibetan-Chinese border areas: https://www.cell.com/cell-reports/fullte...22)00481-8 and this on Thailand, which has many ST groups on the far north, near the Burma-China-border: https://academic.oup.com/mbe/article/38/8/3459/6255759.

How can the tree of ST languages, which proposes that some widely-separated languages (the “Sal” group, the Kuki-Chin-Nagas, and Sinitic) are first to split off and the Tibetan-Lolo-Burmese group the last, be reconciled with the aDNA and present-day DNA data? It seems like what could have happened is that initial levels of deep diversity, which spread far from the Upper Yellow River area, were overprinted by groups successively closer to the ST crown group expanding out of a region between Tibet, Burma, and China. Furthermore, the weird centre of gravity for diversity in this family (located around the N Burma, Tibet, India area) would be resolved under the authors’ scenarios because diversity around the YR Valley to the East has been completely purged by the historical expansion of Chinese and the diversity in the North purged by Tibetan, pushing the centre of gravity to the SW, closer to Burma and India.

There are a few more unresolved questions that I think can be fruitfully answered with future aDNA and linguistic work:

Why are the distributions of non-YR ancestry in present-day ST groups so different? Why is it that Lolo-Burmese groups of the Southern Lolo Branch alone have a YR_N + Vietnam_N combination out of all populations in E Asia, while all other Tibetoburman populations of the Tibet-China-Burma borders have YR_N + Vietnam_N + “Tai” components? Is it because ancient Sichuan had only Vietnam_N, and then addition of YR_N and then “Tai” and “Hmong”-type ancestries? This can be fruitfully answered using aDNA work.

Is the similarity between Yi, Naxi, and Middle-elevation groups along the Himalayas to YR_MN because they all have a little bit of the “Tai” component, that higher-elevation Tibetans do not have? Note that YR_MN is a little bit more SE Asian-shifted than Upper_YR_LN because Upper_YR_LN received a bit of gene flow from N Asian HGs (“ANA”). Groups like Yi and Naxi clearly have a little bit of the “Tai” and “Hmong”-type ancestries, whether the rest came from Upper_YR_LN or YR_MN.

Are there extremely ancient, pre-Neolithic (as in before the fully-fledged agropastoralist package) dispersals of ST languages into the Himalayas, Tibet and the Brahmaputra valley that were subsequently overprinted by later, more fully agriculturalist dispersals? Especially because the estimated dates for ST divergence in both papers are a little too old, before the fully Neolithic package had appeared and when hunting and gathering were still important for the Neolithic populations of the Yellow River Valley. This is something both sets of researchers talk about, and is also addressed in this short paper that includes archaeologists of the Neolithic such as Ruth Mace: https://www.nature.com/articles/s41598-020-77404-4

What about various “para-Sinitic” languages like Bai and Tujia, which were not included in the linguistic papers but whose affinity with Chinese was always mysterious? Linguists have always wondered whether the close similarity of these languages with Chinese was because of massive layers of ancient loans from Old Chinese, or because they were long-lost relatives of the otherwise very lonely Sinitic family of languages.

About this last point: in recent decades, work has picked up again on some extremely intriguing languages discovered and described in the 1920s-1980s in N and W Guizhou. These languages include the Caijia, Longjia and Luren languages, which are all quite poorly documented and either extinct or on their way to extinction. These languages preserve extremely interesting similarities with Old Chinese, and may be the last vestiges of para-Sinitic ST languages. They may give us a tantalizing glimpse into a universe of diverse Sino-Tibetan languages in the YR Valley and surrounds that was wiped out by millennia of Chinese domination. Andreas Hölzl from Potsdam University has been trying to chase down all the documentation he can about these languages, buying volumes of field notes from antique shops (!): http://www.elpublishing.org/docs/1/20/ldd20_02.pdf. Since Caijia is still alive, he urges linguists to document as much Caijia as they can.

Future work on Bai, Waxiang, Caijia, Luren and Longjia may help supply an Eastern wing to a family that has hitherto been lacking it, and supply some lost siblings to the Sinitic branch of the family. I consider this to be some of the most exciting work happening on the linguistic side. Watch this space.

okarinaofsteiner · (This post was last modified: 01-26-2024, 08:36 PM by okarinaofsteiner.)

Related: https://genoplot.com/discussions/topic/1...nd-munda/1

https://genoplot.com/discussions/post/187466

Ryukendo Wrote:The closer we get to Old Chinese, the closer Chinese looks like a regular Sino-Tibetan language with complex consonant clusters and productive morphology, and the less it looks like present-day Chinese languages.

I have mentioned here why it is extremely unlikely that Chinese became so unlike other ST languages by contact with Austroasiatic languages. Chinese, Vietnamese, Tai-Kradai and Hmong-Mien basically all gained tone around the same time, from the same processes and in the same way. Literally the same collapse in consonant contrasts caused the same three-way tone distinction to appear in all four language groups. This was combined with the loss of many consonant clusters. Except for Vietnamese, no Austroasiatic languages have anything indicating such an intimate level of language contact.

Here is what Martha Ratliff, who reconstructed proto-Hmong-Mien in the 2010s, has to say about the process of tonogenesis (slide 25):

Quote:The origin of HM tones
• All languages in the Sinosphere (Chinese, Vietnamese, Tai-Kadai languages, Hmong-Mien languages) seem to have developed tone in the same way: from loss of final laryngeal consonant contrasts in a first wave, doubled by loss of initial laryngeal consonant contrasts in a second wave. Tone is thus a language contact feature, although which language developed tone first is unknowable.
• The origin of tones in Vietnamese was discovered by André Haudricourt in 1954. This was possible due to the fact that close relatives of Vietnamese (Muong, Thavung, etc.) are atonal—we can see the “before” and “after” within one compact family, Vietic. His account of tonogenesis has served as a model for the study of all other tone languages in the area.

Chinese borrowings into Hmong-Mien have this property: if they evolved to have some set of tones in Chinese, they invariably evolve to have the same set of tones in Hmong-Mien. In fact this also happens for Chinese borrowings into Kradai and Vietic:

Quote:The tone system of Middle Chinese is strikingly similar to those of its neighbours in the Mainland Southeast Asia linguistic area—proto-Hmong–Mien, proto-Tai and early Vietnamese—none of which is genetically related to Chinese. Moreover, the earliest strata of loans display a regular correspondence between tonal categories in the different languages. In 1954, André-Georges Haudricourt showed that Vietnamese counterparts of the rising and departing tones corresponded to final /ʔ/ and /s/, respectively, in other (atonal) Austroasiatic languages. He thus argued that the Austroasiatic proto-language had been atonal, and that the development of tones in Vietnamese had been conditioned by these consonants, which had subsequently disappeared, a process now known as tonogenesis. Haudricourt further proposed that tone in the other languages, including Middle Chinese, had a similar origin. Other scholars have since uncovered transcriptional and other evidence for these consonants in early forms of Chinese, and many linguists now believe that Old Chinese was atonal.

In her work on proto-Hmong-Mien, Ratliff shows that borrowings from Old Chinese result in regular tone correspondences between Middle Chinese and proto-Hmong-Mien (p. 191):

Quote:One striking piece of evidence in support of the hypothesis that Chinese itself was atonal when it lent Hmong-Mien the words in the first set above is the fact that an even older stratum of Chinese loans shows regular tonal correspondences between the two families. Most scholars believe that Old Chinese was toneless (Mei 1970, Baxter 1992, Sagart 1999). How else, then, can the correspondences in this oldest stratum of loanwords be explained other than to say that tones developed in the two languages in a parallel fashion after the words were borrowed? And how else can the identical pattern in the later stratum of loanwords be explained other than to say that tones developed in the two languages families after these words were borrowed as well?
(9) Word | MC | OC | PHM | Tone category in both
廩 lǐn ‘barn/granary’ | limX | *(pə.)r[ə]mʔ | *rɛmX | B2
鐵 tiě ‘iron’ | thet | *lˤ̥ik | *hrɛkD (PM) | D1
力 lì ‘strength’ | lik | *kə.rək | *-rək | D2

Old and Early Middle Chinese were therefore likely to have been in intimate contact with proto-Kradai and proto-Hmong-Mien from very early times, and with proto-Vietnamese around the time of the Han Dynasty (but not with Vietic as a whole or Austroasiatic), to such a degree that they all developed tone together between the 1st millennium BC and the first half of the 1st millennium AD, and Chinese ended up looking more like a language of the mainland Southeast Asian linguistic area than like an ST language.

Of course, one could argue that it was contact with now-extinct branches of Austroasiatic that made Chinese look like the rest. However, proto-Austroasiatic actually has a typology and morphology that looks more like the rest of the ST languages, such as Old Tibetan, than like Chinese! Roger Blench points out the similarities here:
https://web.archive.org/web/201809211345...ibetan.pdf

The evolution of the MSEA linguistic area therefore is likely to involve some process among the languages listed above, instead of Austroasiatic.

okarinaofsteiner · 02-01-2024, 08:00 PM

https://genoplot.com/discussions/post/187512

Ryukendo Wrote:Sinitic is not a creole language, not by any stretch of imagination, but it does share in a bunch of changes that tend to take place when there are large numbers of 2nd-language learners. Cases, inflection and affixes tend to collapse and the language tends to lose morphology, word order becomes more fixed (more meaning is carried by it), and the language becomes more isolating and analytic. Among European languages English is the prototypical example, and it was pushed in this direction from quite early on. Some linguists call this a "creoloid" language, in the sense that language transmission was shaped by continuous interaction between different language speakers in second-language learning, and this pushed the typology of the language continuously in a simpler and more analytic direction.

What makes the Middle Chinese and MSEA case unique would be the genesis of tone and the radical simplification of phonology that this co-occurred with.

Here is a paper arguing that the same process took place to some extent already for Old Chinese. Here are the main arguments:
1. Old Chinese is much more like other Sino-Tibetan languages than Middle and modern Chinese are (Middle and modern Chinese are squarely part of the MSEA linguistic area). But Old Chinese already shows some movement in the direction of MSEA.
2. These changes include SVO syntax, more monosyllabicity than other ST languages (they tended to retain the sesquisyllabic pattern with complex suffixes, prefixes and infixes characteristic for example of Austroasiatic languages), and some loss of morphology already. There are traces of SOV syntax in the oldest layers of Old Chinese, suggesting that the changes were recent.
3. The oldest borrowings between Chinese, Tai-Kradai, and Hmong-Mien must have taken place already in the Old Chinese stage before the emergence of the defining features of the MSEA language area.

Some quotes:

Quote:The broad account which I suggest here is the familiar picture of a contact situation between western invaders speaking a TB tongue and locals speaking languages affiliated with one or more of the attested mainland Southeast Asian stocks. But it is not enough to simply say “contact” and pretend that we have explained anything. In this view of Sinitic we have a very specific outcome, with Sino-Tibetan lexical and grammatical core, heavy Bai Yue, especially Kadai, lexical influence, creoloid syntax based more on Bai Yue than on Sino-Tibetan patterns, and innovative phonological structure. This did not come about through people overhearing each other’s languages on market day, or learning a few phrases for doing business; we have to imagine a situation of widespread bi- or multilingualism. This would be the case in a scenario in which Chinese or pre-Chinese speakers conquered a Bai Yue population, as happened as the kingdoms of Chu and then Yue were incorporated into Qin China. But this does not automatically explain the extent of the influence which we find on the whole language. Ballard’s (1984) “Mother Soup” metaphor captures the problem but doesn’t solve it. More importantly, the most important contact evidence predates the assimilation of the southern kingdoms into imperial China...

Instead, I propose that the features which so dramatically distinguish Sinitic from other Tibeto-Burman branches reflect the use of Proto-Sinitic as a lingua franca, used widely by non-Chinese (by whatever definition) outside of the actual administrative control of the Chinese state. As we have noted, the term Bai Yue refers to the multiethnic and multilingual situation in the south. One can imagine the utility of a vehicular lingua franca even without reference to the Chinese state and its influence; by the time the Chinese state is present on the historical stage, some version of its language would be a likely candidate. Thus, with the increasing power and prestige of Zhou, perhaps even Shang, China, a pidginized version of its Tibeto-Burman language became a lingua franca throughout the region. Cheng (1983) speaks of “two sublanguages coexisting in early archaic Chinese”, an earlier SOV stratum and an innovative SVO syntax. This would, essentially, be “pure” Sino-Tibetan Chinese with SOV syntax, and innovative “foreigner” Chinese, spoken with the SVO pattern of the Bai Yue languages. Ultimately the widespread lingua franca version of Proto-Sinitic replaced the original everywhere.

Some other points he makes, which help to contextualise what has been mentioned in this thread previously:

Quote:There is no serious question that the Southeast Asian syntactic profile in Chinese is a secondary development:
"From the fact that we can clearly see changes in the word order of these three languages [Sinitic, Karen, and Bai] over time, and cannot see such changes in the Tibeto-Burman languages other than Bai and Karen, we assume that it was Bai, Karen and Chinese that changed rather than all the other Tibeto-Burman languages. (LaPolla 2003:28)"

A few scholars see this development as internal to Sinitic:

"The new linguistic standard of the Han dynasty [Ryukendo: Early Middle Chinese] … typologically characterized by its incipient isolating morphology, and its emergent tonal and monosyllabic phonology, gradually spread to all parts of the empire, north and south, and this same typology further spread to all non-Chinese languages spoken in territories under Chinese rule after the Han: all of Miao-Yao, Viet-Muong (but not the rest of Mon-Khmer), all of Kam-Tai, some south-eastern Tibeto-Burman languages including Lolo-Burmese (but not Tibetan, Qiang, Gyarong, etc.). (Sagart 1999:8)"

But most scholars, from Terrien de la Couperie on, see the shift in Sinitic as due to influence from neighboring languages to the south; Egerod (1976:59) points out that since SVO order is inherited in Thai, “Chinese was largely a recipient rather than a donor in the early times … it is Chinese which borrows a new word order” (see also Benedict 1976). Indeed, all of the Southeast Asian groups have SVO syntax as far back as we can trace. And there are ample traces of earlier SOV patterning in Old Chinese (Cheng 1983) ...

Some comments on the MSEA linguistic area and the genesis of tone:

Quote:The most impressive correspondence between Sinitic and the Southeast Asian Tai-Kadai, Hmong-Mien, and Vietic languages is in their phonological structure. All share the stereotypical monosyllabic morpheme structure and elaborate tone systems. The most striking, and puzzling, fact about this congruence is the perfect correspondence of the tone systems (Wulff 1934, Haudricourt 1954a, b, FK Li 1945, 1976, Matisoff 1973, Ostapirat 2000, Ratliff 2010). Sinitic, Tai-Kadai, Hmong-Mien, and Vietnamese all have a four-tone system, with a three-way distinction on “smooth”, i.e. open or sonorant-final syllables, and all “checked”, i.e. obstruent-final, syllables manifesting a distinct fourth tone. Each of the other three shares with Sinitic (and to some extent with each other) a substantial body of shared vocabulary which shows regular correspondence in tone class.
In all of the languages tones originated out of final laryngeal features, so that the original correspondence is in the type of rime: obstruent coda, coda *-h (sometimes < *-s), final *-ʔ, and “smooth” syllables with none of these (Haudricourt 1954a, b, 1961/1972, Mei 1970, 1980). The shared vocabulary which shows these correspondences must have been borrowed at a stage when both the donor and recipient languages still retained these final laryngeal distinctions, and had not yet developed phonemic tone; if we imagine that these items were borrowed with phonemic tone, it becomes impossible to explain the regularity of the correspondences. (For a very clear exposition of this argument see Ratliff 2010:187-93). The languages must have still been in close contact when they underwent a shared tonogenetic episode in which these laryngeal distinctions were reinterpreted as tonal, as they were still centuries later when they all shared in the “Great Tone Split” conditioned by mergers of initial consonant series.
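
The two-step process described in these quotes can be written down as a small rule table. The following is a rough sketch in Python, not a reconstruction from any of the cited works. It assumes the conventional labels A-D for the four original tone classes and 1/2 for the registers produced by the later split (the same labels as B2, D1 and D2 in the loanword examples quoted earlier). The function names and the simplified inputs (one final segment plus a voiced/voiceless flag for the initial) are made up for illustration.

```python
# Assumed, simplified inputs: the syllable-final segment as a string
# ("" for an open syllable) and whether the initial consonant was voiced.
OBSTRUENT_CODAS = {"p", "t", "k"}


def first_wave(final):
    """First wave: final laryngeal contrasts become tone classes A-D."""
    if final in OBSTRUENT_CODAS:
        return "D"  # "checked" syllables
    if final == "ʔ":
        return "B"  # final glottal stop
    if final in {"h", "s"}:
        return "C"  # coda *-h, sometimes from *-s
    return "A"      # "smooth": open or sonorant-final


def second_wave(initial_voiced):
    """Tone split: register 1 after voiceless initials, 2 after voiced."""
    return "2" if initial_voiced else "1"


def tone(final, initial_voiced):
    """Tone category after both waves, e.g. 'B2'."""
    return first_wave(final) + second_wave(initial_voiced)


print(tone("ʔ", initial_voiced=True))   # B2
print(tone("k", initial_voiced=False))  # D1
print(tone("k", initial_voiced=True))   # D2
```

The point of the sketch is that the output depends only on the shape of the syllable before tones existed. A word borrowed at the pre-tonal stage therefore ends up in the same category in donor and recipient, as long as both languages go through the same two steps. That is why the regular correspondences are taken as evidence that the loans predate tone.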

The monosyllabic pattern is not really characteristic of Austroasiatic, or even of Kradai, and the Sinitic developments do have parallels in the phonological development of other Sino-Tibetan groups. So Sagart is probably right in attributing the original locus of monosyllabic structure to Chinese:

"From a typological point of view, Old Chinese was more similar to modern East Asian languages like Gyarong, Khmer or Atayal than to its daughter language Middle Chinese: its morphemes were nontonal and not strictly monosyllabic; its morphology was essentially derivational, and largely prefixing; but it also made use of infixes and suffixes. At some point between Old Chinese and Middle Chinese, and for unknown reasons, a cascade of changes caused the language to move away from this model. Its affixing morphology began to freeze; its loosely attached prefixes were lost, while other affixes clustered with root segments and were reinterpreted as root material. A new morphemic canon tending toward strict monosyllabism, with a great variety of initial and final consonant clusters, emerged. Further shifts saw the reduction of initial clusters, this resulting in a more complex inventory of initial consonants, and in new vowel contrasts. Final clusters were also reduced and the inventory of final consonants restricted to resonants and stops, this leading to the emergence of tones. Thus the classical ‘Indochinese’ typology common in its major features to Middle Chinese, Vietnamese, Miao-Yao, Tai, Burmese etc., was born." (Sagart 1999:13)

okarinaofsteiner · 02-01-2024, 08:36 PM

https://genoplot.com/discussions/post/187512

Ryukendo Wrote:
Granary Wrote:
Ryukendo Wrote:Sinitic is not a creole language, not by any stretch of the imagination, but it does share in a bunch of changes that tend to take place when there are large numbers of 2nd-language learners. Cases, inflection and affixes tend to collapse and the language tends to lose morphology, word order becomes more fixed (more meaning is carried by it), and the language becomes more isolating and analytic. Among European languages English is the prototypical example, and it was pushed in this direction from quite early on. Some linguists call this a "creoloid" language, in the sense that language transmission was shaped by continuous interaction between different language speakers in second-language learning, and this pushed the typology of the language continuously in a simpler and more analytic direction.

What makes the Middle Chinese and MSEA case unique would be the genesis of tone and the radical simplification of phonology that this co-occurred with.

Here is a paper arguing that the same process took place to some extent already for Old Chinese. Here are the main arguments:
1. Old Chinese is much more like other Sino-Tibetan languages than Middle and modern Chinese are (Middle and modern Chinese are squarely part of the MSEA linguistic area). But Old Chinese already shows some movement in the direction of MSEA.
2. These changes include SVO syntax, more monosyllabicity than other ST languages (they tended to retain the sesquisyllabic pattern with complex suffixes, prefixes and infixes characteristic for example of Austroasiatic languages), and some loss of morphology already. There are traces of SOV syntax in the oldest layers of Old Chinese, suggesting that the changes were recent.
3. The oldest borrowings between Chinese, Tai-Kradai, and Hmong-Mien must have taken place already in the Old Chinese stage before the emergence of the defining features of the MSEA language area.
I personally don't agree with that explanation as it's given. Even if Old Chinese was adopted as a lingua franca, and even if there were many second-language speakers south of the Huai river, that still wouldn't explain why the tens of millions of North Chinese who didn't interact much with southern people ended up experiencing all the same changes.
It seems obvious to me that at any point in time the vast majority of Old Chinese speakers were likely native speakers living in the north, so the idea that a minority of second-language speakers, not even the culturally dominant one at that, managed to make the native-speaking majority shift seems unlikely to me.
To extend why I think this explanation doesn't make sense: why didn't such a process happen in Latin Europe or the Arabic Islamic world? Arguably the number of second-language speakers there was higher, and yet the loss of cases in Latin arguably started happening fairly late, in the late imperial period, when most of Iberia, France and Italy had already been Latinized.

IMO the presence of second-language speakers and contact with other languages acts as a potential catalyst for change, which may or may not manifest itself in regional or language-wide changes, but ultimately the primary factor is still the same kind of probabilistic factors that seem to govern linguistic change at large. Obviously some scholars prefer to think primarily in terms of non-spontaneous change, but to me the burden of proof for showing that a general linguistic change has been caused (as opposed to encouraged) by specific events should be high...

Obviously I have nothing against the idea of a sprachbund in general, although for some of the reasons delineated above I find it very hard to ignore the overwhelming demographic and political dominance of Old Chinese in this scenario. So I'd rather see this sprachbund arising in the context of mostly linguistic changes within Old Chinese that might have been encouraged, but not caused, by bilingual or second-language communities, because those people were hardly the dominant community within the Chinese world at the time.
Feel free to disagree, but from speaking to linguists I feel sure that if one were to create a database of language contact and typological change, one would see an extremely strong statistical signal of change being associated with languages that have large numbers of L2 learners. It is no coincidence that the most unchanged Germanic language is Icelandic, for example. There is already empirical work showing that the morphological complexity of languages declines and analyticity increases with the size of the population speaking the language. There are even linguists now who reject the Copernican assumption that languages always change and always change at the same rate everywhere.

I know this paper, one of those associated with one of the grammar and typology databases for the languages of the world, which finds that the grammatical, phonological and typological characteristics of "large languages" such as English, Chinese, Spanish etc. are in fact the most extraordinary in the world, and the "unusual", complex, difficult-to-learn features you find in PNG languages or Native American languages or Siberian and agglutinative Himalayan languages are in fact the default. Let me dig it up, it may take a few days.

For Chinese, we know that Middle Chinese had different dialects, and that ultimately the dialect of the East prevailed and spread in a homogenizing wave over the rest of China before splitting into all the non-Min dialects of Chinese today. The same thing could have happened in Old Chinese; in fact, the sociological situation there is ripe for this kind of thing.
