The GenArchivist Forum
Seeing language from the DNA - Printable Version




Seeing language from the DNA - Jaska - 02-04-2024

There is a widespread belief among laymen that language could be seen from the DNA. Strangely, they usually angrily deny that accusation, when at the same time they apply a method which cannot work unless they see language from the DNA. Here is a real-life example and my reply to it.

Random blogger:
“N-L1026 is very useful because it can be used to track the Uralic expansion from Siberia to the west, not to predict who currently speaks Uralic.”

The end of your sentence is important, because there are too many erroneous black-and-white beliefs about the interdependence of language and DNA. However, the fact that one cannot see the language from the DNA at present also prevents one from seeing the language from the DNA in the past. All you can see is the expansion of N-L1026 from Siberia to the west, but you cannot just arbitrarily put a linguistic label on it.

We have no certainty about the paternal lineages of the population speaking Late Proto-Uralic. The only scientific way of trying to find them out involves this procedure:

1. We take the linguistic results as the starting point: when and where was Late Proto-Uralic spoken?

2. We look at the genetic data in the relevant spatio-temporal coordinates: what is present there is a tentative genetic correlate for Late Proto-Uralic.

3. A spatial or temporal match alone is not enough; the two must coexist. Migrations have occurred at all times, and the spread of N-L1026 could be either (1) earlier than, (2) simultaneous with, or (3) later than the expansion of Late Proto-Uralic. The fact that N-L1026 partially overlaps the modern region of the Uralic language family cannot show that their expansions were simultaneous, let alone prove that their expansions were causally related to each other.
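
As a purely illustrative sketch of step 3 (not a published method), the requirement that spatial and temporal matches coexist can be written in Python. The names `Estimate`, `timing` and `tentative_correlate`, and all regions and dates below, are hypothetical placeholders and not real datings or locations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Estimate:
    """A located, dated estimate; years are calendar years (negative = BCE)."""
    region: str
    start: int
    end: int


def timing(lineage: Estimate, language: Estimate) -> str:
    """Classify the lineage's spread relative to the language's expansion."""
    if lineage.end < language.start:
        return "earlier"
    if lineage.start > language.end:
        return "later"
    return "simultaneous"


def tentative_correlate(lineage: Estimate, language: Estimate) -> bool:
    """A spatial match and a temporal match must both hold."""
    same_place = lineage.region == language.region
    same_time = timing(lineage, language) == "simultaneous"
    return same_place and same_time


# Placeholder values only: the linguistic result is the starting point (step 1).
language = Estimate(region="region_A", start=-2500, end=-2000)

# Hypothetical genetic findings to compare against it (step 2).
candidates = {
    "lineage_X": Estimate("region_A", -2400, -2100),
    "lineage_Y": Estimate("region_A", -1500, -1000),
    "lineage_Z": Estimate("region_B", -2400, -2100),
}

for name, estimate in candidates.items():
    print(name, timing(estimate, language), tentative_correlate(estimate, language))
```

Even a lineage that passes both checks is only a tentative correlate: the sketch tests coexistence in time and place, not a causal link between the two expansions.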

For some reason you do not follow the scientific procedure. Instead, you just randomly decide that the expansion of N-L1026 must be associated with the Uralic expansion. Then, based on this dice-roll, you claim that the linguistic evidence can now be ignored and the Uralic homeland can be located based on the DNA, because by some mystical skill you can see the language directly from the DNA. 

Hungarians are an important example of how the ancestry of the language carriers changes through time, even in the same region. Ancient Hungarians still had a portion of Siberian (Yakutia_LNBA) autosomal ancestry, but modern Hungarians do not have it. Similar ancestry changes have obviously occurred also during prehistoric times. Also, during every successive step of a linguistic expansion, it is likely that the genetic composition of the language carriers changes, because assimilation and language shift of the earlier population are a normal part of linguistic expansion.

This is just one of the many facts that demonstrate why we cannot see the language from the DNA:
- The Uralic language can remain even when the Yakutia ancestry disappears.
- The Yakutia ancestry can remain even when language shifts: this ancestry is present in populations representing several language families, and we cannot see from the DNA which language the carriers of this ancestry originally spoke.

P.S. In case of trolls, please inform the administrators, so that this thread would not be closed due to insult wars.


RE: Seeing language from the DNA - JonikW - 02-04-2024

(02-04-2024, 06:33 AM)Jaska Wrote: There is a widespread belief among laymen that language could be seen from the DNA. Strangely, they usually angrily deny that accusation, when at the same time they apply a method which cannot work unless they see language from the DNA. Here is a real-life example and my reply to it.

Random blogger:
“N-L1026 is very useful because it can be used to track the Uralic expansion from Siberia to the west, not to predict who currently speaks Uralic.”

The end of your sentence is important, because there are too many erroneous black-and-white beliefs about the interdependence of language and DNA. However, the fact that one cannot see the language from the DNA at present also prevents one from seeing the language from the DNA in the past. All you can see is the expansion of N-L1026 from Siberia to the west, but you cannot just arbitrarily put a linguistic label on it.

We have no certainty about the paternal lineages of the population speaking Late Proto-Uralic. The only scientific way of trying to find them out involves this procedure:

1. We take the linguistic results as the starting point: when and where was Late Proto-Uralic spoken?

2. We look at the genetic data in the relevant spatio-temporal coordinates: what is present there is a tentative genetic correlate for Late Proto-Uralic.

3. A spatial or temporal match alone is not enough; the two must coexist. Migrations have occurred at all times, and the spread of N-L1026 could be either (1) earlier than, (2) simultaneous with, or (3) later than the expansion of Late Proto-Uralic. The fact that N-L1026 partially overlaps the modern region of the Uralic language family cannot show that their expansions were simultaneous, let alone prove that their expansions were causally related to each other.

For some reason you do not follow the scientific procedure. Instead, you just randomly decide that the expansion of N-L1026 must be associated with the Uralic expansion. Then, based on this dice-roll, you claim that the linguistic evidence can now be ignored and the Uralic homeland can be located based on the DNA, because by some mystical skill you can see the language directly from the DNA. 

Hungarians are an important example of how the ancestry of the language carriers changes through time, even in the same region. Ancient Hungarians still had a portion of Siberian (Yakutia_LNBA) autosomal ancestry, but modern Hungarians do not have it. Similar ancestry changes have obviously occurred also during prehistoric times. Also, during every successive step of a linguistic expansion, it is likely that the genetic composition of the language carriers changes, because assimilation and language shift of the earlier population are a normal part of linguistic expansion.

This is just one of the many facts that demonstrate why we cannot see the language from the DNA:
- The Uralic language can remain even when the Yakutia ancestry disappears.
- The Yakutia ancestry can remain even when language shifts: this ancestry is present in populations representing several language families, and we cannot see from the DNA which language the carriers of this ancestry originally spoke.

P.S. In case of trolls, please inform the administrators, so that this thread would not be closed due to insult wars.

Very interesting idea for a thread, Jaska, and thanks for posting. I too hope it doesn't attract any problem comments. You're obviously broadly right, but I think the fact that a change in language can sometimes be observed clearly in the DNA is well demonstrated in lowland Britain in the Migration Period.

Here, the evidence of linguistics matches what we have recently established in the DNA record, including with the appearance and proliferation of I1 that was documented by Gretzinger.

In this case there was an obvious change from a Brittonic to a Germanic language (a few contrarian observers made bizarre arguments for continuity that I won't get into here). This shift has been examined in many ways including with the typology and distribution patterns of the earliest English place-name types. These almost entirely replaced the earlier names although sometimes the incomers retained and adapted them (including with some rivers and in the case of names that would have been known all over Europe such as Dover).

Among the more interesting debates, Schrijver argued that “Anglo-Saxon settlers met predominantly, if not exclusively, speakers of late-spoken Latin when they arrived in the British Lowland Zone”. This was one of many interesting ideas, but I don't think any serious linguist argued that there was continuity behind the fact that a Germanic language is spoken in England today.

The evidence of course was overwhelming, beginning with what we know of IA Britain, illuminated further by Roman period testimony such as this from Tacitus in his homage to his father-in-law Agricola, who had been governor of Britannia:

"On a general estimate, however, it is likely that Gauls took possession of the neighbouring island. In both lands you find the same rituals, the same superstitious beliefs; the language does not differ much ..."

One of the many smoking guns for subsequent significant Germanic migration alongside linguistic replacement was the fact that there is almost no lexical borrowing into Old English from Brittonic, as demonstrated in both place-names and the overall language. Here's Richard Coates in “Invisible Britons: The View from Linguistics”:
 
“The canonical list is brief; Förster recognized 15, of which only 4 are still generally accepted:
binn ‘manger’
brocc ‘badger’
cumb ‘valley’, but found only in place-names until its descendant Welsh cwm was re-borrowed late in the second millennium
luh ‘sea; pool’ ”

Then, as I touched on above, we finally had these two landmark genetic findings in Gretzinger that built on earlier DNA findings and clearly explain how the massive linguistic shift in lowland Britain came about:

“In contrast to these previous periods, the majority of the early medieval individuals from England in our sample derive either all or a large fraction of their ancestry from continental northern Europe, with CNE ancestry of 76 ± 2% on average.”

And:

“In particular, Y chromosomal haplogroups I1-M253 and R1a-M420 were absent from our Bronze, Iron and Roman Age British and Irish individuals, but were identified in more than one-third of our individuals from early medieval England.”

So in the case of Migration Period England we surely do have an example of being able to see a change of language in the DNA, including with I1 with its obvious base in Scandinavia in the aDNA record. I'm sure you wouldn't argue with the essentials of this so I'd be interested if you could comment on this example in relation to your hypothesis. 

I'm not out for a dispute but am genuinely interested in the nuances behind your view as a professional linguist from whom we amateur enthusiasts have much to learn.


RE: Seeing language from the DNA - Quint - 02-04-2024

(02-04-2024, 06:33 AM)Jaska Wrote: There is a widespread belief among laymen that language could be seen from the DNA. Strangely, they usually angrily deny that accusation, when at the same time they apply a method which cannot work unless they see language from the DNA. Here is a real-life example and my reply to it.

...

For some reason you do not follow the scientific procedure. Instead, you just randomly decide that the expansion of N-L1026 must be associated with the Uralic expansion. Then, based on this dice-roll, you claim that the linguistic evidence can now be ignored and the Uralic homeland can be located based on the DNA, because by some mystical skill you can see the language directly from the DNA. 

Could the same be said about R-M269 and "WSH" autosomal ancestry, whose expansion is widely believed to be associated with an Indo-European-speaking steppe migration, in spite of many exceptions, e.g., Basque and East Caucasian? I sometimes feel like this (exclusive) association is more driven by belief than actual linguistic evidence, which I think exists in the form of the lexical substratum in Finno-Ugric languages: In the north, it is undoubtedly of Indo-Iranian, Baltic or (Pre-)Germanic origin, while in the south (Ugric languages near the forest steppe) it is of unknown affiliation.

It definitely is a good idea to remain cautious when mapping DNA to linguistic dispersals.


RE: Seeing language from the DNA - Bjørn - 02-04-2024

I would like to interject, as I’ve found Jaska’s discussions about this to be very common sense and haven’t quite understood why some argue so strenuously against it (not you JonikW, mainly on other sites). I think the English and Hungarian examples show good parallels.

In the case of England, we know what the Germanic tribes spoke, there is literary and archaeological evidence of their migrations, and now we have aDNA correlations. If we go further back, though, as shown by the interesting proto-Germanic discussions here, we can’t say with certainty how, why, and when these Germanic tribes spoke a Germanic language and we can’t say which exact genetic components are related. We know I1 is related to the spread of these tribes (as shown in England) and most of us here think I1 populations probably played a role in the early stages of Germanic languages, but we don’t necessarily have proof that it was there in the founding stages of the language.

It is similar in Hungary. We know who brought the Uralic languages and we now have a genetic profile of those people. However, how and when the ancestors of the Hungarian conquerors first used such language and what specific genetic components were applicable at that exact stage cannot be stated with complete certainty at this time, as far as I can tell. Some of the purported methods of Uralic expansion with associated genetic components may be true, but they shouldn’t be treated as fact, which I feel is one of Jaska’s main points.


RE: Seeing language from the DNA - Jaska - 02-04-2024

JonikW:
"You're obviously broadly right, but I think the fact that a change in language can sometimes be observed clearly in the DNA is well demonstrated in lowland Britain in the Migration Period."

Naturally there are cases of this kind, where genetic and linguistic expansions coincide; after all, language spreads with people, not by shouting over the river. But it still does not mean that language could be seen from the DNA, because it is also possible that there is genetic influx without language expansion (that is: the newcomers shift to the earlier language, like the French Normans did in England).

“Seeing” here is apparently a semantically ambiguous verb, because of course we can “see” a new language matching a new ancestry, when we compare the linguistic and genetic results (which is the correct scientific procedure). My point is that language cannot be predicted (or “postdicted”, which is the erroneous unscientific procedure) from the DNA, but we must always accept the linguistic results as the basis when we want to study language.


RE: Seeing language from the DNA - Kale - 02-04-2024

Unfortunately temperance is a despised virtue, and consequently that manifests in the problems Jaska points out, namely in this case refusal to accept uncertainty (black-and-white beliefs as Jaska puts it).
It is important not to take the opposite extreme though, and dismiss evidence that falls short of certainty.
Linguistics and genetics are two lines of evidence, and there are plenty more.
The lines of evidence must be assessed together to determine the most probable scenario(s) based on the strength of each line's evidence.

A lot of disagreements also tend to be one or both parties oversimplifying / failing to clearly explain their positions fully.
The thoughts/intent may be less certain than the statement due to the burdensome nature of expressing all of the uncertainties in complex scenarios.
To take the example quote as, well, an example...
“N-L1026 is very useful because it can be used to track the Uralic expansion from Siberia to the west, not to predict who currently speaks Uralic.”
IF: N-L1026 is a consistent feature of attested/confirmed early Uralic speakers
IF: There is no strong evidence of N-L1026 appearing 'to the West' in any non-Uralic speaking group (or a group without Uralic influence) (and at least wasn't a dead-end)
IF: Later events displaced Uralic language in some places without displacing N-L1026
IF: Later events displaced N-L1026 in some places without displacing Uralic languages
IF: Some other consideration
IF: Two turtle doves
IF: A partridge in a pear tree
It's a lot of ifs, none of which alone (I think) is terribly controversial? But that would be quite a mouthful to say (and read) every time the topic came up.
That's not to say that's what the person meant, that would have to be gleaned from context. But given no other context, I would assert it's possible that is what they thought.

EDIT: Just to clarify, because misunderstandings have happened before, probably because I tend to talk weird, I'm not disagreeing with Jaska.


RE: Seeing language from the DNA - Ebizur - 02-04-2024

(02-04-2024, 06:00 PM)Bjørn Wrote: I would like to interject, as I’ve found Jaska’s discussions about this to be very common sense and haven’t quite understood why some argue so strenuously against it (not you JonikW, mainly on other sites). I think the English and Hungarian examples show good parallels.

In the case of England, we know what the Germanic tribes spoke, there is literary and archaeological evidence of their migrations, and now we have aDNA correlations. If we go further back, though, as shown by the interesting proto-Germanic discussions here, we can’t say with certainty how, why, and when these Germanic tribes spoke a Germanic language and we can’t say which exact genetic components are related. We know I1 is related to the spread of these tribes (as shown in England) and most of us here think I1 populations probably played a role in the early stages of Germanic languages, but we don’t necessarily have proof that it was there in the founding stages of the language.

It is similar in Hungary. We know who brought the Uralic languages and we now have a genetic profile of those people. However, how and when the ancestors of the Hungarian conquerors first used such language and what specific genetic components were applicable at that exact stage cannot be stated with complete certainty at this time, as far as I can tell. Some of the purported methods of Uralic expansion with associated genetic components may be true, but they shouldn’t be treated as fact, which I feel is one of Jaska’s main points.

I would like to add that, at least as much as one must exercise caution when attempting to interpret the role of bearers of Y-DNA haplogroup I1 in the genesis of the Proto-Germanic language, one must exercise caution when attempting to interpret the role played by e.g. bearers of Y-DNA haplogroup N-Tat in the genesis of the Hungarian (Magyar) language. The Hungarian language has a lir (Chuvash-like) Turkic layer as well as at least one layer of unknown linguistic affiliation (i.e. possible influence from some extinct language isolate), and these layers should have been added to the (proto-)Magyar language prior to the migration of (proto-)Magyars into Hungary. I would like one of the rabid proponents of the "N-Tat=ancient Samoyeds(or Late Neolithic/Bronze Age Yakutians or whatever)=Proto-Uralic speakers" hypothesis to explain why they do not need to account for any genetic influence on the Hungarian conquerors that may have originated from Turkic-speaking or non-Uralic language isolate-speaking ancestors of those Hungarian conquerors. They also never seem to feel a need to explain why N-Tat is not particularly common among actual present-day speakers of Samoyedic languages despite insisting that these people are essentially pure-blooded descendants of speakers of the Proto-Uralic language.


RE: Seeing language from the DNA - JonikW - 02-04-2024

(02-04-2024, 06:33 PM)Jaska Wrote: JonikW:
"You're obviously broadly right, but I think the fact that a change in language can sometimes be observed clearly in the DNA is well demonstrated in lowland Britain in the Migration Period."

Naturally there are cases of this kind, where genetic and linguistic expansions coincide; after all, language spreads with people, not by shouting over the river. But it still does not mean that language could be seen from the DNA, because it is also possible that there is genetic influx without language expansion (that is: the newcomers shift to the earlier language, like the French Normans did in England).

“Seeing” here is apparently a semantically ambiguous verb, because of course we can “see” a new language matching a new ancestry, when we compare the linguistic and genetic results (which is the correct scientific procedure). My point is that language cannot be predicted (or “postdicted”, which is the erroneous unscientific procedure) from the DNA, but we must always accept the linguistic results as the basis when we want to study language.

Thanks Jaska. I see exactly what you mean now and enjoyed Bjørn’s post on the subject too. I would add though in response to a point in your reply that while English remained the language of the masses after the Normans arrived in England, their influence was immense and eventually filtered right down to many of the words I've used in this very sentence.

I just tried conjuring up an extreme hypothetical scenario to see what it might suggest. This is a technique I sometimes find useful when testing ideas and should have tried it earlier because pushing competing potential scenarios to the extreme can be informative.

For this purpose, imagine that today we still have Tacitus, Caesar and all the other literary sources for IA and Roman Britain. But no artefacts at all have survived for any period in question, while conversely skeletal remains for every generation and every single mile of the country have been retrieved, both in Britain and wider Europe. Their DNA has all been tested and the bones have all been carbon dated, with isotope analysis to boot.

Gildas, Bede and all Continental primary sources for the migration likewise have vanished but we do have the limited corpus of texts from after the year 1000 in late Old English, none of which mention a migration, and then the full body of work that we ourselves know right through Middle English to the present day. We have no idea when Germanic place-names were established in England because William the Conqueror’s Domesday Book is our first source for those.

In this ultra extreme situation we would know a lot but also be ignorant of many things. We could be sure that somewhere along the line between the Roman Empire and the year 1000 there was a shift from Brittonic (and presumably some Latin) to Germanic. We would separately see a large migration from Germanic areas of the Continent and Scandinavia in the fifth and sixth centuries, as revealed to us through the DNA. 

We might guess through this DNA evidence that the period in the fifth and sixth centuries was when the language change revealed in our very sparse literary sources occurred. But we couldn't know this beyond all doubt. 

The DNA alone couldn't rule out other unknown causes and influences for this language shift occurring at any point before 1000, perhaps involving wars, alliances, trading, or elite imposition and emulation. So in short, I completely get your point and it's a valid one.


RE: Seeing language from the DNA - Anglesqueville - 02-04-2024

This thread is off to a flying start. Prepare to be roughly insulted on a certain well-known blog.


RE: Seeing language from the DNA - ph2ter - 02-04-2024

I think it is obvious that there exists a positive correlation between Y-DNA and a language.
If this were not the case, then nobody would be interested in DNA.
For example, R1a-CTS10221 is a marker which connects all people speaking Balto-Slavic languages, or I2a-Y3120 connects all people speaking Slavic languages.

Of course, sometimes some clades become extinct and language can spread with different clades which were in coexistence with the extinct clades.
These different clades become the new vectors of the language spread.


RE: Seeing language from the DNA - Jaska - 02-05-2024

Ph2ter:
Quote:I think it is obvious that there exists a positive correlation between Y-DNA and a language.
If this were not the case, then nobody would be interested in DNA.
For example, R1a-CTS10221 is a marker which connects all people speaking Balto-Slavic languages, or I2a-Y3120 connects all people speaking Slavic languages.

Well, certainly language or ethnicity is not the main focus for many interested in their Y-DNA, but genealogy is. Naturally the Y-DNA result can confirm that a person belongs within a certain family tree; descent from a carrier of a certain family name automatically brings us also language and ethnicity.

Correlations are indeed very common, but a correlation is just a starting point for further research. There can be competing or contradicting correlations, and a correlation alone cannot prove that a lineage or ancestry truly spread with a certain language. A correlation can appear even when a language and a lineage or ancestry did not spread together, because the contact networks within which innovations and languages spread are usually wide and long-lasting, thus leading to similar distributions of different phenomena spreading at different times and from different places.

The correct way to study these correlations is the scientific procedure: we look at the linguistic results and the genetic results and try to find between them a match concerning time, place, and direction of expansion. The erroneous way is to arbitrarily decide that this language must have spread with that haplogroup. It is just as absurd as claiming the opposite: that one can see (= predict) ancestry from language.


RE: Seeing language from the DNA - Vinitharya - 02-05-2024

Provocative thought, perhaps, but I believe lowland England (the southern, southeastern wedge, everything basically east of a Grimsby/Southampton line) was primarily British Romance speaking when the Anglo-Saxons arrived; the Restbritonen that the Anglo-Saxons assimilated may have been ethnically Celtic, but they were culturally and linguistically Roman.  They were not numerous, and their chief contribution was vocabulary related to Christianity.  This is why English has few Celtic words; the language had been marginalized, if not extirpated, from the southeastern areas where Standard English originated.


RE: Seeing language from the DNA - jdean - 02-05-2024

(02-05-2024, 02:06 AM)Vinitharya Wrote: Provocative thought, perhaps, but I believe lowland England (the southern, southeastern wedge, everything basically east of a Grimsby/Southampton line) was primarily British Romance speaking when the Anglo-Saxons arrived; the Restbritonen that the Anglo-Saxons assimilated may have been ethnically Celtic, but they were culturally and linguistically Roman.  They were not numerous, and their chief contribution was vocabulary related to Christianity.  This is why English has few Celtic words; the language had been marginalized, if not extirpated, from the southeastern areas where Standard English originated.

But if that were the case, are there Romance loanwords in Old English?


RE: Seeing language from the DNA - ph2ter - 02-05-2024

(02-05-2024, 01:15 AM)Jaska Wrote: Ph2ter:
Quote:I think it is obvious that there exists a positive correlation between Y-DNA and a language.
If this were not the case, then nobody would be interested in DNA.
For example, R1a-CTS10221 is a marker which connects all people speaking Balto-Slavic languages, or I2a-Y3120 connects all people speaking Slavic languages.

Well, certainly language or ethnicity is not the main focus for many interested in their Y-DNA, but genealogy is. Naturally the Y-DNA result can confirm that a person belongs within a certain family tree; descent from a carrier of a certain family name automatically brings us also language and ethnicity.

People are on this forum mainly because of the history of migrations, language and ethnicity, not genealogy. For genealogy there are other services and forums.

Quote:Correlations are indeed very common, but a correlation is just a starting point for further research. There can be competing or contradicting correlations, and a correlation alone cannot prove that a lineage or ancestry truly spread with a certain language. A correlation can appear even when a language and a lineage or ancestry did not spread together, because the contact networks within which innovations and languages spread are usually wide and long-lasting, thus leading to similar distributions of different phenomena spreading at different times and from different places.

Of course, but a correlation is a correlation, and it is usually well founded. More often than not.

Quote:The correct way to study these correlations is the scientific procedure: we look at the linguistic results and the genetic results and try to find between them a match concerning time, place, and direction of expansion. The erroneous way is to arbitrarily decide that this language must have spread with that haplogroup. It is just as absurd as claiming the opposite: that one can see (= predict) ancestry from language.

Of course. I don't claim otherwise.


RE: Seeing language from the DNA - jdean - 02-05-2024

(02-04-2024, 10:25 PM)ph2ter Wrote: I think it is obvious that there exists a positive correlation between Y-DNA and a language.
If this were not the case, then nobody would be interested in DNA.
For example, R1a-CTS10221 is a marker which connects all people speaking Balto-Slavic languages, or I2a-Y3120 connects all people speaking Slavic languages.

Of course, sometimes some clades become extinct and language can spread with different clades which were in coexistence with the extinct clades.
These different clades become the new vectors of the language spread.

John Koch and Johan Ling recently argued in ‘From the Ends of the Earth: A Cross-Disciplinary Approach to Long-Distance Contact in Bronze Age Atlantic Europe’ that Celtic spread along the Atlantic Façade as a trade language.
 
Iosif Lazaridis believes he has falsified the idea that Anatolian descended from a Steppe language because of a lack of Steppe DNA in ancient Anatolian samples, but quite a lot of people disagree with him on this, including Alwin Kloekhorst, who apparently knows a thing or two on the subject. The counterargument is that the Steppe DNA component could very easily have become so diluted by the time it reached Anatolia that it would be very hard to find.
 
Not intended as some sort of devastating argument but food for thought.