
Chinese GEDmatch averages thread [repost]
#1
Original Proboards version of this thread


https://anthrogenica.com/showthread.php?27285-Simulated-Chinese-MDLP-K23b-results-by-province-DNAConnect-org-adoptees-vs-23mofang

Reposting my own version of @kushkush's maps with MDLP K23b results here for comparison.

This was based on the DNAConnect.org dataset as well as other samples found on GEDmatch and the WeGene forum. I'd like to think this is more representative of the Han in each province, since for every province where I had more than 10 samples, I tried to weight my assigned samples according to the population distribution for that province.

The first 4 maps are of the 4 largest MDLP K23b East Eurasian ancestry components found among Chinese samples. The 5th one is of a North-South cline I created for East Asian MDLP K23b results. Basically "Tungus_Altaic" = northern, "Austronesian" = southern, while "South_East_Asian" and the 2 "Siberian" components are defined relative to T_A and AN. The original idea was that a Han Chinese person who scores the same amount of T_A and AN will score 0.5 on my North-South cline.

[Image: cw2QvTF.png]

[Image: hwr2amo.png]

[Image: rT227Kd.png]

[Image: IbsfvhK.png]

[Image: EQjAhu7.png]

Original methodology post from Anthroscape, link has been defunct since Nov 2020.

Newer methodology post (from Anthrogenica, will expire Aug 2023)

TL;DR- I tried to weight my available data according to the actual population distribution within each province (the DNAConnect.org data was geographically biased for most of the provinces- especially Jiangxi, Guangdong, Hunan, and Chongqing- which make up most of the DNAConnect.org dataset). Most of the samples in my original dataset and from DNAConnect.org were from the southern provinces, with very few from the northern provinces, so I had to use whatever I could get my hands on. The samples I used in my original dataset were just the ones of known regional/provincial ancestry.

I weighted the Fujian, Jiangxi, Hunan, and Anhui DNAConnect.org subsets according to the population distribution of each province to make them more representative. The Guangdong and Shaanxi subsets only represented more remote parts of the provinces, and the Guizhou subset was mostly from minority-heavy areas. The only northern province represented in the DNAConnect.org dataset was Henan.
anti-racist on here for kicks and giggles

“If you want to grant your own wish, then you should clear your own path to it”
― Okabe Rintarou

“Never doubt that a small group of thoughtful, committed, citizens can change the world. Indeed, it is the only thing that ever has.”.
― Margaret Mead
#2
Full methodology post as posted on Anthrogenica

Quote:Longer version of my Chinese provinces map methodology-
All of the data I collected was from before the GEDmatch handover to Verogen, so I'm willing to bet most of the kits I looked at have been deleted already. I have no GEDmatch kits for Xinjiang, Tibet, Qinghai, Ningxia, Gansu, Inner Mongolia, or Hainan. I treated the 3 Dongbei provinces as 1 unit, since there are very few known-location Northern Chinese GEDmatch samples, and those provinces don't really have distinct identities like the ones in China proper.
I excluded any GEDmatch samples that stated they were non-Han (e.g. Korean from Jilin, Zhuang from Guangxi), aside from 1 half Han half Hui individual who shared their data on a WeGene forum. Around half of the non-DNAConnect.org datapoints I used were taken from the WeGene forums.


Guangdong (88)
- 81 DNAConnect.org + 7 Other
used 81 DNAConnect.org adoptees (90% from West Guangdong), plus 5 Pearl River Delta samples, 1 other Yangxi (Yangjiang) sample, and 1 Chaoshan sample. Since Guangdong's population is heavily concentrated in the Pearl River Delta area, I just took the mean of the non-DNAConnect.org samples and treated the mean of the DNAConnect.org samples as if it was 1 sample from Zhanjiang/Maoming.

Fujian (10)
- 5 DNAConnect.org + 5 Other
used 5 DNAConnect.org adoptees (weighted according to Fujian's population distribution), 1 incomplete sample from a Fuzhou-born Chinese immigrant to the US, and 4 other Southern Fujian/Taiwanese/Singaporean samples that I used as a proxy for Coastal/Southern Fujian. The weighted DNAConnect.org average was much more southern-shifted than the raw average, since 45% of Fujian's population is from the Hokkien-speaking area, and another 25% is from the Fuzhounese speaking area, and the DNAConnect.org samples from Quanzhou and Fuzhou happened to score higher on AN.

Guangxi (9)
- 6 DNAConnect.org + 3 Other
just took the raw average of the 6 DNAConnect.org adoptees and the 3 other GEDmatch Guangxi samples I have in my original dataset (at least 2 of whom are also adoptees)

Zhejiang (9)
- 6 DNAConnect.org + 3 Other
used 6 DNAConnect.org adoptees (3 from Hangzhou the capital, 3 from Quzhou in the south), plus 1 Ningbo sample and 1 generic (very northern-shifted) Zhejiang sample from my original dataset. I was going to attempt to weight the DNAConnect.org samples, but there was no real difference between the Hangzhou and Quzhou averages, so I just took the raw mean of all 9 samples.

Jiangsu (15)
- 8 DNAConnect.org + 4 Other + average of North_Anhui DNAConnect.org samples
used 8 DNAConnect.org adoptees (all from South Jiangsu along or south of the Yangtze), 2 other Suzhou samples, and 2 generic (albeit northern-shifted) Jiangsu samples. Since I had no confirmed northern Jiangsu samples, I used the Northern Anhui DNAConnect.org samples average as a proxy for that. I used 1/2 DNAConnect.org + 1/4 Non-DNAConnect.org + 1/4 DNAConnect.org North Anhui to get the Jiangsu average.

Anhui (17)
- 13 DNAConnect.org (3 North, 9 Central, 1 South) + 4 Other (all Central)
the 13 DNAConnect.org samples broadly represent all parts of the province except for the southeast. I divided them into North, Central, and South regions based on their prefecture-level city; I used the 1 Anqing sample as a proxy for Gan and Huizhou-speaking areas in the south.
Fun fact: Around 40% of Anhui's population lives north of the Huai river, and another 40% lives in Jianghuai Mandarin-speaking areas (Hefei, Wuhu, etc.), so only 20% actually lives in the non-Mandarin speaking area. This actually explains why Anhui is so much more northern-shifted than Jiangsu...
My non-DNAConnect.org Anhui samples were all from the middle third of the province, so I treated them as equivalent to the Hefei/Wuhu DNAConnect.org adoptees. I used 0.4 North Anhui + 0.4 Central Anhui + 0.2 South Anhui to get the Anhui average
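
A minimal sketch of the region-weighted average described above, assuming each sub-region has already been reduced to one mean per MDLP K23b component. The component values are made-up placeholders rather than the thread's data, and the `weighted_average` helper is a hypothetical name; only the 0.4 / 0.4 / 0.2 weights come from the post.

```python
# Region-weighted average of MDLP K23b component scores.
# All component values below are made-up placeholders, NOT the thread's data.

def weighted_average(region_means, weights):
    """Combine per-region component means using the given region weights."""
    total_weight = sum(weights.values())
    components = next(iter(region_means.values())).keys()
    return {
        comp: sum(region_means[r][comp] * weights[r] for r in weights) / total_weight
        for comp in components
    }

# Hypothetical mean scores (percent) for each Anhui sub-region.
anhui_region_means = {
    "north":   {"T_A": 30.0, "AN": 20.0, "S_EA": 46.0},
    "central": {"T_A": 28.0, "AN": 23.0, "S_EA": 46.0},
    "south":   {"T_A": 24.0, "AN": 27.0, "S_EA": 46.0},
}

# Population shares quoted in the post: 40% north, 40% central, 20% south.
anhui_weights = {"north": 0.4, "central": 0.4, "south": 0.2}

anhui_average = weighted_average(anhui_region_means, anhui_weights)
print(anhui_average)
```

The same helper would cover the Jiangsu formula by passing weights of 0.5, 0.25, and 0.25 for the three Jiangsu inputs.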

Jiangxi (140)
- 135 DNAConnect.org + 5 Other
there were 135 Jiangxi DNAConnect.org adoptees, 53 of which were from the southernmost prefecture-level city of Ganzhou. This city only makes up ~18% of the province's population. I also used 5 Jiangxi samples from my original dataset. Formula was 5* weighted DNAConnect.org mean + 1*other Jiangxi mean.
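
A rough sketch of the two-stage Jiangxi calculation, under some loudly stated assumptions: the scores are invented placeholders, the non-Ganzhou adoptees are lumped into a single "rest" group (the post does not say how the other prefectures were weighted), and the 5:1 formula is normalized by dividing by 6, which the post implies but does not spell out.

```python
# Two-stage average: (1) reweight an over-sampled prefecture to its
# population share, (2) blend with the non-DNAConnect.org mean at 5:1.
# Scores are hypothetical AN percentages, NOT the thread's data.

ganzhou_mean = 33.0   # hypothetical mean of the 53 Ganzhou adoptees
rest_mean = 29.0      # hypothetical mean of the other 82 adoptees
other_mean = 30.0     # hypothetical mean of the 5 non-DNAConnect.org samples

# Raw mean lets Ganzhou count for 53/135 (about 39%) of the dataset.
raw_dnaconnect = (53 * ganzhou_mean + 82 * rest_mean) / 135

# Reweighted mean lets Ganzhou count for its ~18% population share instead.
# Treating everything else as one group is a simplifying assumption.
ganzhou_share = 0.18
weighted_dnaconnect = ganzhou_share * ganzhou_mean + (1 - ganzhou_share) * rest_mean

# "5 * weighted DNAConnect.org mean + 1 * other Jiangxi mean",
# divided by 6 here so the result stays on the same percentage scale.
jiangxi_average = (5 * weighted_dnaconnect + 1 * other_mean) / 6

print(round(raw_dnaconnect, 2), round(weighted_dnaconnect, 2), round(jiangxi_average, 2))
```

With these placeholder numbers the reweighted mean comes out less southern-shifted than the raw mean, which is the direction of correction the post is aiming for when one southern city is over-represented.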

Hunan (53)
- 51 DNAConnect.org + 2 Other
there were 51 Hunan DNAConnect.org adoptees, 33 of which were from the southern Hunan cities of Shaoyang and Hengyang, which only make up 22% of Hunan's population. I also used 1 "Hunan boy" from my original dataset, and another very southern-shifted "Hunan" sample from GEDmatch. Formula was 6* weighted DNAConnect.org mean + 1*other Hunan.

Hubei (8)
- 6 DNAConnect.org + 2 Other
used 6 DNAConnect.org samples (1 from Wuhan), plus 2 Hubei samples from my original dataset. The Wuhan adoptee from my original dataset was very northern-shifted, so I took the average of all 8 Hubei samples, but weighted the northern-shifted Wuhan adoptee by 0.5 and the DNAConnect.org Wuhan adoptee by 2 (since I figured that one was more representative of Wuhan/east Hubei).
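
The per-sample weighting for Hubei could look something like the sketch below. The T_A scores are invented for illustration; only the weights (0.5 for the northern-shifted Wuhan adoptee, 2 for the DNAConnect.org Wuhan adoptee, 1 for everyone else) follow the description above.

```python
# Per-sample weighted mean for 8 Hubei samples.
# Each entry is (hypothetical T_A score, weight); scores are NOT real data.
hubei_samples = [
    (24.0, 1.0),  # DNAConnect.org adoptee
    (25.0, 1.0),  # DNAConnect.org adoptee
    (23.0, 1.0),  # DNAConnect.org adoptee
    (26.0, 1.0),  # DNAConnect.org adoptee
    (24.0, 1.0),  # DNAConnect.org adoptee
    (25.0, 2.0),  # DNAConnect.org Wuhan adoptee, up-weighted
    (32.0, 0.5),  # northern-shifted Wuhan adoptee, down-weighted
    (25.0, 1.0),  # other Hubei sample from the original dataset
]

weighted_sum = sum(score * weight for score, weight in hubei_samples)
weight_total = sum(weight for _, weight in hubei_samples)
hubei_weighted_mean = weighted_sum / weight_total

# Plain mean for comparison, where every sample counts equally.
hubei_raw_mean = sum(score for score, _ in hubei_samples) / len(hubei_samples)

print(round(hubei_weighted_mean, 2), round(hubei_raw_mean, 2))
```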

Chongqing/Sichuan (27/31)
- I included the 24 Chongqing DNAConnect.org adoptees for Sichuan, since Chongqing and Sichuan are part of the same cultural/demographic unit and were only separated recently for administrative reasons. The Sichuan average also uses the Chongqing data points from my original dataset.
For Chongqing, I used the Chongqing DNAConnect.org data, plus 2 individuals from my original dataset of confirmed Chongqing origin, as well as 1 incomplete Chongqing sample from the WeGene forums.
For Sichuan, I used the DNAConnect.org data, plus the 2 Chongqing GEDmatch individuals from above, 2 Chengdu samples, 2 Mianyang (north Sichuan) samples from the WeGene forum, and 1 incomplete (southern-shifted) Sichuan sample from the WeGene forum.

Yunnan (2)
- just used the 2 Yunnan adoptees in my original dataset

Guizhou (15)
- 11 DNAConnect.org + 4 Other
7 of the 11 DNAConnect.org samples were from the mostly non-Han Zhenyuan area, while the other 4 were from Guiyang (the capital) and Zunyi. To try to make the Guizhou samples more representative of the Han population, I took the average of the Zhenyuan samples and treated them as equivalent to 1 Han individual.
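
As a sketch of the "collapse a cluster into 1 individual" step: the Zhenyuan samples are averaged first, and that average then enters the provincial mean as a single datapoint. The scores are invented placeholders, and treating the remaining 8 samples as individual datapoints is an assumption, since the post does not say how they were combined.

```python
# Collapse an over-represented cluster into one pseudo-sample before averaging.
# Scores are hypothetical AN percentages, NOT the thread's data.

def mean(values):
    return sum(values) / len(values)

zhenyuan = [38.0, 40.0, 39.0, 41.0, 37.0, 40.0, 38.0]  # 7 Zhenyuan adoptees
others = [30.0, 31.0, 29.0, 32.0,                      # 4 Guiyang/Zunyi adoptees
          30.0, 33.0, 31.0, 30.0]                      # 4 non-DNAConnect.org samples

# The 7 Zhenyuan samples become a single datapoint equal to their mean.
zhenyuan_pseudo_sample = mean(zhenyuan)

guizhou_collapsed = mean(others + [zhenyuan_pseudo_sample])  # 9 datapoints
guizhou_raw = mean(others + zhenyuan)                        # all 15 samples

print(round(guizhou_collapsed, 2), round(guizhou_raw, 2))
```

With these placeholder values the collapsed average sits much closer to the non-Zhenyuan samples than the raw 15-sample mean does, which is the intended effect.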

Shanghai (14)
- averaged the Zhejiang DNAConnect.org samples and the Jiangsu DNAConnect.org samples (which are all from South Jiangsu)


Since all of the northern provinces had smaller sample sizes, I added 2 Northern Han samples of unknown regional ancestry to the average of each northern province to make my results more robust. This might have made the averages for the northern provinces closer to each other than they actually are, but previous studies show that the autosomal differences among the northern Chinese provinces are very low, so it's not like the differences are that significant.
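
To illustrate the caveat about the averages being pulled closer together, here is a small sketch with two hypothetical northern provinces that both get the same 2 generic Northern Han samples added. All scores are invented T_A percentages, and the province names are placeholders.

```python
# Effect of padding small provincial samples with the same 2 generic samples.
# All scores are hypothetical T_A percentages, NOT the thread's data.

def mean(values):
    return sum(values) / len(values)

generic_northern_han = [31.0, 32.0]   # the 2 shared unknown-origin samples
province_a = [34.0, 33.0]             # hypothetical province with 2 samples
province_b = [29.0, 30.0, 28.0]       # hypothetical province with 3 samples

raw_gap = mean(province_a) - mean(province_b)
padded_gap = (mean(province_a + generic_northern_han)
              - mean(province_b + generic_northern_han))

print(raw_gap, padded_gap)
```

The gap between the two provinces shrinks once the shared samples are added, and it shrinks more for provinces with fewer samples of their own.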

Shaanxi (6)
- 2 Hanzhong DNAConnect.org + 2 Generic Northern Han + 1 artificial Xi'an proxy (calculated from the shared results of a Chongqing individual who is of 1/4 Xi'an ancestry) + 1 mixed Han-Hui individual (not from Shaanxi). I took the separate means of the Hanzhong and non-Hanzhong samples, and weighted according to South Shaanxi's proportion of the population to get the overall Shaanxi average.

Henan (11)
- 4 DNAConnect.org + 2 Generic Northern Han + 5 Other

Shandong (8)
- 0 DNAConnect.org + 2 Generic Northern Han + 5 Other (one person is 1/2 Shanxi) + Anthroscape user @uisashi

Dongbei (5)
- 0 DNAConnect.org + 2 Generic Northern Han + 2 Other + Anthroscape user @uisashi. The "Other" samples were 1 person from Harbin and one person who used the "Nurhaci" alias in GEDmatch.

Hebei (5)
- 0 DNAConnect.org + 2 Generic Northern Han + 2 Beijing samples from WeGene + Anthroscape user @uisashi's "North Hebei" individual (defunct Anthroscape link). As with Chongqing/Sichuan I included Beijing samples in my Hebei average.

Shanxi (4)
- 0 DNAConnect.org + 2 Generic Northern Han + 2 Other. 1 individual is from my private dataset, the other is 1/2 Shandong 1/2 Shanxi.

Gansu (4)
- 0 DNAConnect.org + 2 Generic Northern Han + 2 Other. Used an incomplete Gansu result posted on WeGene + the mixed Han-Hui person I used for Shaanxi. I used this person for both provinces to simulate the Central Asian shift NW Chinese seem to have.


Beijing + Tianjin- this involved a lot of guesswork/BS. I figured the two cities would be slightly different autosomally, with Beijing being slightly more southern-shifted and somewhat more cosmopolitan than Tianjin, but both cities being more southern-shifted and cosmopolitan than the neighboring provinces.

Tianjin (5)
= 2 Generic Northern Han + 2 Beijing samples from WeGene + Anthroscape user @mutabor's "Central Chinese" adoptee

Beijing (12)
= 5 samples used for Tianjin + the 1 "North Hebei" individual + 1 mixed Han-Hui individual [Shaanxi & Gansu] + 1 individual from the Dongbei average + 2 other Northern Han of unknown ancestry + 2 random Southern Han from my original dataset. This seems very close to the CHB reference population in HarappaWorld.
#3
For comparison, here is what the model for non-urban Chinese districts looks like:

Quote:A former Anthroscape member sent me a chart of 23mofang averages for various suburban and rural districts in China, which are believed to be more autosomally "representative" of specific regions and linguistic subgroups (as opposed to the city centers, which are more cosmopolitan and therefore more "mixed".) https://imgur.com/a/pxuhh3D
[Image: 1UYN0Tn.jpg]

Map (latitude + longitude) with ChinaMAP study clusters added
[Image: GX11i2q.png]

Quote:More graphs using the 23mofang model

MDLP K23b components (Austronesian vs Tungus_Altaic) graph with ChinaMAP clusters
[Image: Ejnw6QP.png]
MDLP K23b East Eurasian N-S cline and %tage graph with ChinaMAP clusters
[Image: APym4vq.png]
Map of the East Eurasian N-S cline
[Image: ZJ0Z69E.png]
#4
Comparison of 23mofang non-urban districts (which are allegedly more representative of regional population structure among the Han population) with 23mofang province averages. Results for the provinces are summarized on page 25 of my 'Chinese GEDmatch averages' Anthrogenica thread.

Found a list of 23mofang averages for entire provinces/regions instead of select rural/suburban districts.

[Image: 25d103e1cc81c579e9477b6232fac4cc5b4b4b33.jpg]
(screenshot from a Chinese website saved on archive.today)

truncated results for the provinces (excludes cities for ZJ, HUB, Chengdu, and Kunming + topolect regions for JS and GD). NMG = Nei MengGu (Inner Mongolia)
Region  AN     E_Sib  Paleo_Sib  S_EA   T_A    % East Asian  N-S cline    Global PC1   Global PC2
NMG     17.05  1.94   0.44       41.94  31.06  92.42583683   0.398543273  9.257930639  4.688090645
GS      19.19  1.01   0.34       45.21  30.28  96.02147238   0.428248898  9.672582868  5.37268204
BJ      19.79  0.89   0.30       45.32  31.30  97.59713824   0.42888667   9.852663194  5.642752307
SX      18.93  0.96   0.32       44.82  31.51  96.52912488   0.421582233  9.755029775  5.490130253
LN      19.34  1.03   0.31       45.00  32.19  97.86994738   0.420649044  9.896767385  5.697611215
JL      19.21  0.86   0.29       45.27  32.64  98.27289927   0.420023593  9.944914531  5.769027808
SW      18.51  1.01   0.32       45.25  32.21  97.28369925   0.415940455  9.838167627  5.604824955
HEB     19.26  0.83   0.29       45.34  32.09  97.96700653   0.42298456   9.910654615  5.724781548
HLJ     18.75  0.84   0.29       45.40  32.35  97.62430675   0.418798146  9.869953801  5.655325799
SD      19.49  0.74   0.27       45.74  31.94  98.17927014   0.426319901  9.908514397  5.718962822
HEN     19.65  0.74   0.28       45.55  31.27  97.52140098   0.42998435   9.848091234  5.639987407
JS      22.90  0.62   0.25       45.95  28.91  98.63094833   0.460686512  9.930649852  5.806497523
SH      24.35  0.59   0.24       45.96  27.65  98.78320191   0.474964645  9.93303725   5.830267085
AH      23.50  0.65   0.25       45.68  28.05  98.14053273   0.467586429  9.860246398  5.705533468
ZJ      26.22  0.55   0.23       46.10  25.76  98.86647515   0.49440468   9.920026349  5.838140868
HUB     27.13  0.59   0.24       46.27  24.51  98.75011954   0.504909289  9.897981074  5.825591874
YUN     26.38  0.65   0.27       47.56  23.25  98.11325987   0.506539908  9.826434183  5.72999527
SC      29.17  0.55   0.23       46.65  21.94  98.54857594   0.528718915  9.856229715  5.802867888
JX      30.75  0.51   0.21       45.98  21.37  98.83141042   0.540068741  9.869055947  5.827272364
CQ      30.79  0.57   0.23       46.49  20.52  98.5859883    0.544039169  9.845607726  5.809747726
HUN     31.68  0.58   0.21       46.03  20.37  98.8719243    0.549137419  9.868355978  5.84316753
FJ      32.40  0.47   0.20       45.85  20.02  98.9304318    0.555786803  9.870192909  5.854024549
GD      36.69  0.42   0.18       46.13  15.55  98.96630383   0.600665329  9.820688455  5.846125462
GX      38.14  0.42   0.18       46.38  13.96  99.03031161   0.615889635  9.805578573  5.851901433

Screenshot format

Map
[Image: 4FtBLSv.png]
Austronesian vs Tungus_Altaic graph including my original private samples to visualize the range of variation among actual GEDmatch samples. Predictably, the Guangxi, Guangdong, and Fujian averages are all more "northern-shifted" than their corresponding districts would suggest.

Quote:The Fujian and Guangdong averages are also slightly more northern than I would've expected given the rural/suburban 23mofang averages uisashi sent me in early 2021. The Fujian average is close to the Lianjiang Fuzhou (Mindong-speaking, north bank of the Min River Delta) average, while the Guangdong average is close to the Zijin Heyuan (Hakka-speaking, quite a bit east of the Pearl River Delta) average.
East Eurasian N-S cline vs % East Asian graph. District and province averages are a little off because the 23mofang provinces model includes "European" and "Other" components that the districts model does not, which drives the East Asian % down for the province averages. However, the same general patterns still hold- the northwestern provinces of Gansu, Shaanxi, and Shanxi are clearly less East Asian than the coastal Northern provinces, which are also all less East Asian than the Southern provinces.
[Image: 6W4BG46.png]
#5
Early 2019 estimates for MDLP K23b based on the samples I had encountered (whether in my private dataset or not). This was before I compiled a separate dataset of Chinese adoptees of known regional ancestry (based on orphanage location):


Quote:Dongbei + Coastal North China: 28-36% T_A, 16-23% AN, 40-48% S_EA. I haven't seen any Dongbei-specific samples (at least not back then).

Shanxi, Gansu, Northern Shaanxi: 26-36% T_A, 15-23% AN, 40-48% S_EA. Probably more E_Sib and "Ancestral Altaic" than Dongbei + Coastal North China on average.

Central Shaanxi and North/West Henan: 24-36% T_A, 15-25% AN, 40-52% S_EA. Should be more southern (more S_EA-like in particular) than Dongbei, but still scoring quite high on T_A.

Central/East/South Henan + S Shandong + N Anhui + N Jiangsu: 24-34% T_A, 18-25% AN, 42-52% S_EA. This is based on a spreadsheet of Dodecad K7/K12 results I found on ranhaer a while back, where I was closest to people from Northern Anhui and Northern Jiangsu.

Beijing/Tianjin: 24-38% T_A, 14-26% AN, 40-50% S_EA. I suspect the range isn't much different from the Northern Chinese provinces, aside from Beijing probably having more Southern Han-like outliers, more E_Sib/Paleo_Sib/Arctic, and more non-East Eurasian admixture in general.

My hunch is if you collected enough MDLP K23b results from each province and province-level city in Northern China, you'd find that the range in T_A, AN, and S_EA scores won't vary by much, but the means and medians might.

Jiangsu south of the Huai river + Central Anhui + N Zhejiang: 24-30% T_A, 20-28% AN, 42-48% S_EA. East China Han seem to score slightly lower on S_EA than expected given their T_A and AN scores. Slightly more "northern" than "southern" on average but very close to the north-south divide.

Shanghai: 19-35% T_A, 19-35% AN, 40-50% S_EA. No idea if the average Shanghai resident is more "northern" or "southern", I'm guessing there's more of a bimodal distribution?

Hubei + S Anhui: 20-26% T_A, 23-31% AN, 42-49% S_EA. Probably more "southern" than "northern" on average but very close to the north-south divide.

Jiangxi: 19-25% T_A, 25-32% AN, 42-50% S_EA. There seem to be a lot of Jiangxiese samples on GEDmatch Genesis.

Hunan: 17-26% T_A, 25-33% AN, 42-52% S_EA. Probably slightly more southern than Jiangxi and slightly more northern than Fujian/Taiwan on average

Sichuan/Chongqing: 17-28% T_A, 24-33% AN, 44-52% S_EA. I'm guessing they have a broader range than the other core Yangtze provinces.

Zhejiang south of Hangzhou Bay: 20-30% T_A, 24-30% AN, 42-48% S_EA.

Fujian + Taiwan + Min-speaking Han in general: 16-24% T_A, 28-36% AN, 43-48% S_EA. This is more or less the range I've seen in Taiwanese Han samples that aren't obviously part waishengren. Haven't seen any confirmed Fujianese Han MDLP K23b results on GEDmatch. I'm guessing Hakka Han are more T_A shifted than Minnan Han on average, and possibly slightly more S_EA than both Minnan Han and Guangfu (OG) Cantonese Han too.

Central/Eastern Guangdong + HKers who aren't obviously part noi dei yan: 13-21% T_A, 30-41% AN, 44-50% S_EA. Most "HK/Canto" people in my dataset are slightly more northern-shifted than the MDLP K23b "Cantonese" reference population although many are more southern.

Western Guangdong (incl. Greater Taishan) + Guangxi: 9-16% T_A, 35-43% AN, 45-48% S_EA. Generally slightly more southern-shifted than the MDLP K23b "Cantonese" reference population. Haven't seen any actual Hainanese samples. I suspect Northern Guangxi Han are shifted towards Central Chinese.



Quote:List of average WeGene results for Chinese provinces (source: http://mahui.me/gene/, https://imgur.com/a/YsWR5Y3)

Dongbei excluding Yanbian Jilin: 68-70% Northern Han, 6-9% Southern Han, 10-11% Mongol, 4-6% Japanese, 2-3% Naxi, 1% Tungusic/Korean.

Hebei + Shandong + Tianjin: 70-75% Northern Han, 6-10% Southern Han, 9-12% Mongol, 2-4% Naxi, 2-3% Japanese
--> Beijing: similar to Hebei and Tianjin. Dongcheng district is slightly more Southern Han (11-12%) and Euro-mixed (2%)

Shanxi: 68% Northern Han, 7% Southern Han, 16% Mongol, 3-4% Naxi, 2-3% Japanese

Gansu: 58% Northern Han, 7% Southern Han, 19% Mongol, 6% Naxi, 2% Japanese

Shaanxi: 63% Northern Han, 10% Southern Han, 15% Mongol, 4-5% Naxi, 2% Japanese on average
-->N Shaanxi: 60-65% Northern Han, 4-7% Southern Han, 16-20% Mongol, 3-4% Naxi, 2% Japanese
-->C Shaanxi: 55-65% Northern Han, 5-15% Southern Han, 12-16% Mongol, 5% Naxi, 2% Japanese
-->S Shaanxi: 50-55% Northern Han, 20-25% Southern Han, 8-12% Mongol, 6% Naxi, 2% Japanese

Most of Henan: 65-75% Northern Han, 5-15% Southern Han, 10-15% Mongol, 2-4% Naxi, 2% Japanese

S Henan: 60% Northern Han, 20% Southern Han, 6% Mongol, 3% Naxi, 3% Japanese

N Anhui + Xuzhou Jiangsu: 65-70% Northern Han, 10-15% Southern Han, 8-10% Mongol, 2-3% Naxi, 3% Japanese
C Anhui (down to Wuhu): 60-65% Northern Han, 20-25% Southern Han, 5-6% Mongol, 2-3% Naxi, 3% Japanese
S Anhui: 56-64% Northern Han, 28-33% Southern Han, 2-3% Mongol, 2% Naxi, 3% Japanese

N-C Jiangsu: 65-70% Northern Han, 15-20% Southern Han, 4-6% Mongol, 2% Naxi, 3% Japanese

Rest of Jiangsu: 63-68% Northern Han, 19-23% Southern Han, 4% Mongol, 2% Naxi, 3% Japanese
--> Shanghai: 62% Northern Han, 26% Southern Han, 3% Mongol, 2% Naxi, 3% Japanese. The urban center is slightly more northern-shifted (close to S Jiangsu)

Northern 2/3 of Zhejiang: 50-60% Northern Han, 30-35% Southern Han, 2-3% Mongol

S Zhejiang: 45-50% Northern Han, 40-45% Southern Han, 1-2% Mongol. Wenzhou is almost evenly split between N and S Han

Hubei: 50-51% Northern Han, 35% Southern Han, 2-4% Mongol, 2-4% Naxi, 1% Gaoshan. Relatively uniform aside from the NW parts near Shaanxi and the Tujia regions in the SW.

Sichuan/Chongqing: the Han parts average 35-45% Northern Han, 35-45% Southern Han, 1-2% Dai, 1-3% Mongol, 3-6% Naxi. N-most Sichuan and NE-most Chongqing are more like 45-50% Northern Han + 30-35% Southern Han.

Guizhou: 33% Northern Han, 45% Southern Han, 5-6% Naxi on average. Guiyang is 37% Northern Han, 43% Southern Han, 5% Naxi, 3% Mongol, and 2% Dai/Hmong-Mien/Gaoshan.

Hunan: 35% Northern Han, 50% Southern Han, 3% Naxi, 2% Dai/Gaoshan/Mongol on average. North is ~40% Northern Han + ~40% Southern Han, South is ~30% Northern Han + ~60% Southern Han, Changsha is slightly more northern-shifted than the provincial average.

Jiangxi: 40% Northern Han, 47% Southern Han, 2% Naxi/Gaoshan/Mongol, 1% Dai on average. North and Nanchang are ~45% Northern Han + ~40% Southern Han, South is ~32% Northern Han + ~55% Southern Han.

Fujian: 32% Northern Han, 58% Southern Han, 1-2% Japanese/Mongol/Gaoshan/Naxi on average.
-->Nanping: 38% Northern Han, 52% Southern Han, 2% Japanese, 1-2% Mongol/Gaoshan/Naxi, 1% She
-->C + E Fujian: 35% Northern Han, 55% Southern Han, 1-2% Japanese/Mongol/Gaoshan/Naxi, 1% She
-->S Fujian: 27-32% Northern Han, 59-64% Southern Han, 1-2% Japanese/Mongol/Gaoshan/Naxi, 1% She. Xiamen is slightly more northern-shifted but also slightly more Dai and SEA-shifted.

Guangdong: 15% Northern Han, 71% Southern Han, 3% Dai, 3% Gaoshan, 2% Viet, 1% Japanese on average.
-->'Teochew' region + Meizhou: 22-27% Northern Han, 63-68% Southern Han, 1-2% Gaoshan/Dai, 1% Naxi/Viet/Japanese
-->'Transitional' region + Shenzhen + Zhongshan: 13-17% Northern Han, 70-72% Southern Han, 2-3% Gaoshan/Dai, 2% Viet, 1% Japanese
-->Rest of Pearl River Delta: 8-12% Northern Han, 71-75% Southern Han, 3-5% Dai, 3-5% Gaoshan, 2% Viet, 1% Japanese
--> W Guangdong: 5-8% Northern Han, 75-79% Southern Han, 3-5% Dai, 3-5% Gaoshan, 2% Viet. Zhanjiang is less Dai and more Han (possibly because it's Min-speaking rather than Yue-speaking?)

Guilin Guangxi: 11% Northern Han, 70% Southern Han, 5% Dai, 4% Gaoshan, 3% Viet, 2% Naxi, 1% Khmer/Hmong-Mien/Lahu
Nanning Guangxi: 5% Northern Han, 64% Southern Han, 9% Dai, 5-6% Gaoshan/Viet, 3% Khmer, 1% Thai/Hmong-Mien/Lahu

Hainan: 10% Northern Han, 73% Southern Han, 4% Dai, 3% Gaoshan/Viet, 1% Khmer. Amazingly, Haikou is only ~8% Northern Han and 74-75% Southern Han.
#6
Going to start reposting select r/23andMe posts that I shared in my original "Chinese GEDmatch averages" thread (link is to the Genoplot archive)



https://genoplot.com/discussions/topic/10946/chinese-gedmatch-averages/80

Quote:First Chinese 23andMe result I’ve seen on r/23andMe post-algorithm update. Hong Konger whose grandparents are from Dongguan and Shantou: https://np.reddit.com/r/23andme/comments...ents_from/

85.3% Lingnan (Guangxi)
11.4% Yangtze + Fujian/Taiwan (order of provinces is interesting- Fujian, Jiangsu, Zhejiang, Shanghai, etc)
3.3% Vietnamese
[Image: o08m604rj8i91.jpg]
Pre-update results:
[Image: pcv8b04rj8i91.jpg]
Quote:Very nice to see Chinese results post-update!

Shantou explains the Taiwan specific matches (Teochew and Hokkien are relatively closely related topolects). I suspect you’d still get some “Southern Chinese + Taiwanese” if you were 100% Canto, and your “South Chinese” percentage makes me suspect most Fujianese and Taiwanese are going to get significant “South Chinese” ancestry



For contrast, here's a result from a 5th-generation Chinese Indonesian who (as far as they know) is fully ethnic Chinese.
https://np.reddit.com/r/23andme/comments...esian_and/
[Image: rgylbcyrwaxb1.png?]

mtDNA is B4d1, on-paper regional Chinese ancestry is 3/4 Hakka and 1/4 Hokkien
#7
A Chinese-descended 23andme user posted photos of the updated results of various relatives, all of whom had all four grandparents from the same location. (original r/23andMe post)

[Image: LnUXRo1.png]
[Image: TGnk6eT.png]
#8
Is the darker red color supposed to signify a "Northern Chinese" autosomal component? It appears that one of the individuals with four grandparents from Shanghai has been assigned nearly 100% "Northern Chinese," whereas the other individual with four grandparents from Shanghai has been assigned about 2/3 "Central Chinese" (or non-Lingnan but also non-Northern Chinese) and about 1/3 "Northern Chinese." I wonder which dialect(s) the family members of the nearly 100% "Northern Chinese" individual might speak. I suppose that the 2/3 "Central Chinese" + 1/3 "Northern Chinese" individual from Shanghai should most likely be a native speaker of Wu Chinese.
#9
https://genoplot.com/discussions/topic/1...erages/135 (https://genoplot.com/discussions/post/83790)

okarinaofsteiner Wrote:Someone made a map of regional Taiwanese 23andMe results based on their DNA relatives. Sample sizes are very, very small (n=41) but there are enough samples to include Taiwanese aborigines.

https://np.reddit.com/r/23andme/comments...of_41_dna/

[Image: w4vx44j7elm91.png?]

Apparently the OP excluded individuals who had 1 or more grandparents born outside of Taiwan.

I did list and calculate averages for the 21 people who are Waishengren or have at least one side from China (recent paternal/maternal grandparents).

Quote:Waishengren Average (Samples: 21)

• Northern Chinese & Tibetan - 3.05%
• Southern Chinese & Taiwanese - 73.89%
• Broadly Chinese - 0.39%
• Filipino & Austronesian - 0.95%
• Indonesian, Thai, Khmer & Myanmar - 0.07%
• Mongolian & Manchurian - 0.05%
• Iranian, Caucasian & Mesopotamian - 0.02%
• Southern Indian & Sri Lankan - 0.02%
• Unassigned - 0.02%

But they were excluded after a discussion with friends. The map only shows the people who have all 4 sides of their families from Taiwan. None of the 41 Benshengren have any Northern Chinese & Tibetan.



SG_Jun Wrote:Surprisingly it seems that people from Kaohsiung and Yilan (both predominantly Zhangzhou accent leaning areas) have an even lower % of South Chinese compared to people from Taichung, Chiayi, Tainan etc.; I wonder how that works out..

okarinaofsteiner Wrote:I would’ve expected a lot of Fujianese and Taiwanese Hoklo to get a 50-50 split. I wonder how a 100% Chaoshan person or a Xiamen local would score, surprised there would be a difference between Quanzhou and Zhangzhou tbh

Also wonder how Hunan and Jiangxi Chinese score in the new system. I suspect many Sichuanese are still going to get trace Viet and I/T/K/M SEA like they did before.
Actually I do believe there is a difference between Quanzhou and Zhangzhou, that these two are distinguishable. Based on my 23mofang results, I have far more autosomal DNA matches with people from Quanzhou (and even Fuzhou) than I do with people from Zhangzhou. This is accurate because my maternal grandparents were indeed from Quanzhou (specifically Anxi county for my maternal grandfather, not sure about my maternal grandmother) and we speak a very Quanzhou leaning dialect of Hokkien at home (that would also be closest to the Lukang 鹿港 dialect of Taiwanese Hokkien), for example for the phrase "豬尾短短" we would say "tir ber ter ter" instead of the Zhangzhou "ti bue te te"
#10
(11-20-2023, 06:30 AM)okarinaofsteiner Wrote: For contrast, here's a result from a 5th-generation Chinese Indonesian who (as far as they know) is fully ethnic Chinese.
https://np.reddit.com/r/23andme/comments...esian_and/
[Image: 64NjUqx.png]

mtDNA is B4d1, on-paper regional Chinese ancestry is 3/4 Hakka and 1/4 Hokkien

Discord attachment image URL expiration occurred faster than I anticipated, here's a workaround for that



Here's another (very recent) r/23andMe post of a Taiwanese-speaking Mainland Chinese person of Zhangpu and Xiamen ancestry

[Image: T4DILzCl.png]
75.1% non-Lingnan Southern Chinese, 23.6% Lingnan Chinese, 0.2% Broadly Chinese, 1.1% Filipino and Austronesian. Seems like roughly what you'd expect for someone whose ancestry is from that specific part of China, although judging from the general Taiwanese results I've seen, the Lingnan Chinese score might be on the lower end.
#11
https://genoplot.com/discussions/topic/10946/chinese-gedmatch-averages/210

Quote:https://np.reddit.com/r/23andme/comments...m_chinese/

Lingnan diaspora result- mom is from Zhuhai (OP has one Indonesian great-grandparent, not sure if ethnic Chinese), while their father is from Jiangmen (Toisan area). mtDNA haplogroup is B2.

81.6% South Chinese
15.9% Southern Chinese & Taiwanese

0.1% Japanese
0.1% Broadly East Asian

1.3% Indigenous American

1.0% Italian

[Image: mst8yc7w7dz91.jpg]

Quote:Really interesting. Obviously the Indigenous American ancestry came from their mom's side if their maternal haplogroup is B2. Wonder how it got to China.
#12
Former Anthroscape member Kheshigten/Tsakhur's 23andMe results

[Image: B0zjwaQ.png]

Region breakdown maps

Quote:My MDLP K23b results:

Admix Results (sorted):

# Population Percent
1 South_East_Asian 47.5
2 Austronesian 30.31
3 Tungus-Altaic 19.21
4 South_Indian 1.44
5 Paleo_Siberian 1.1
6 Archaic_Human 0.29
7 Near_East 0.13
8 Ancestral_Altaic 0.01

Single Population Sharing:

# Population (source) Distance
1 Hakka ( ) 3.59
2 Han_Singapore ( ) 3.95
3 Chinese_Taiwan ( ) 4.5
4 Han ( ) 4.66
5 Jinuo ( ) 6.41
6 Tujia ( ) 6.46
7 She ( ) 8.41
8 Hmong_Miao ( ) 8.47
9 Lawa ( ) 8.68
10 Han-Mandarin ( ) 9.13
11 Cantonese ( ) 9.2
12 Paluang ( ) 9.59
13 Yao ( ) 10.06
14 Miao ( ) 10.36
15 Hmong ( ) 10.72
16 Wa ( ) 11.06
17 Karen ( ) 11.47
18 Plang ( ) 13.12
19 Tai_Yuan ( ) 14.37
20 Tai_Khuen ( ) 15.69

Mixed Mode Population Sharing:

Quote:MDLP K23b 4-Ancestors Oracle

MDLP K23b Oracle Rev 2014 Sep 16

Admix Results (sorted):

# Population Percent
1 South_East_Asian 47.50
2 Austronesian 30.31
3 Tungus-Altaic 19.21
4 South_Indian 1.44
5 Paleo_Siberian 1.10


Finished reading population data. 620 populations found.
23 components mode.

--------------------------------

Least-squares method.

Using 1 population approximation:
1 Hakka_ @ 3.807661
2 Han_Singapore_ @ 4.281514
3 Chinese_Taiwan_ @ 4.819707
4 Han_ @ 5.001935
5 Tujia_ @ 6.762456
6 Jinuo_ @ 7.059114
7 She_ @ 9.139069
8 Hmong_Miao_ @ 9.492093
9 Lawa_ @ 9.597943
10 Han-Mandarin_ @ 10.124299
11 Cantonese_ @ 10.330509
12 Paluang_ @ 10.481288
13 Miao_ @ 11.079558
14 Yao_ @ 11.351115
15 Hmong_ @ 12.067934
16 Wa_ @ 12.167025
17 Karen_ @ 12.265606
18 Plang_ @ 14.483479
19 Tai_Yuan_ @ 16.126541
20 Tai_Khuen_ @ 17.618530

Using 2 populations approximation:
1 50% Han_North_ +50% Tai_Lue_ @ 1.968843


Using 3 populations approximation:
1 50% Miao_ +25% Ryukyuan_ +25% Yao_ @ 1.322152


Using 4 populations approximation:
++++++++++++++++++++++++++++++++++++++++++++++++++ +++++++++++++++++++++++++++

My commentary-
Quote:Tsakhur your MDLP K23b is more northern than I would've expected given your on-paper regional Chinese ancestry.

The 'Thai' reference population scores 41.23% S_EA, 38.01% AN, and 0.714 on my N-S East Asian cline. You score 0.545 on my N-S East Asian cline, which is lower than the 5 probable Thai Chinese individuals (judging from their names) in my private dataset. Your score on my cline is similar to the 2 Myanmar-origin individuals who are probably Myanmar Chinese though, and is close to my modeled Hunan and Fujian province averages.
#13
(12-06-2023, 07:03 AM)okarinaofsteiner Wrote: Former Anthroscape member Kheshigten/Tsakhur's 23andMe results

My commentary-
Quote:Tsakhur your MDLP K23b is more northern than I would've expected given your on-paper regional Chinese ancestry.

The 'Thai' reference population scores 41.23% S_EA, 38.01% AN, and 0.714 on my N-S East Asian cline. You score 0.545 on my N-S East Asian cline, which is lower than the 5 probable Thai Chinese individuals (judging from their names) in my private dataset. Your score on my cline is similar to the 2 Myanmar-origin individuals who are probably Myanmar Chinese though, and is close to my modeled Hunan and Fujian province averages.


https://genoplot.com/discussions/topic/1...erages/333

Quote:Question in r/23andMe: Does anyone know how Northern and Southern Chinese admixture proportions vary between provinces?

Quote:Northern Chinese&Tibetan decreases gradually and Southern Chinese&Taiwanese gradually increases as you go south through Jiangsu. Then as you cross into Zhejiang, Southern Chinese&Taiwanese increases rapidly until you reach Zhejiang Wenzhou where results are 100% Southern Chinese&Taiwanese. In Northern Fujian, people start scoring a percentage of South Chinese and it increases until it becomes around 100% South Chinese in Cantonese people in Guangdong.
  • In Northern Jiangsu around Xuzhou and further North, 23andMe results are 100% Northern Chinese & Tibetan.
  • In Southern Jiangsu Northern Chinese & Tibetan is in the 50-65% range, rest Southern Chinese & Taiwanese.
  • Ningbo in Northern Zhejiang will score about 15-20% Northern Chinese & Tibetan, the rest Southern Chinese & Taiwanese.
  • Wenzhou in Southern Zhejiang province is about where people score 100% Southern Chinese & Taiwanese.
  • Northern Fujianese will score 80-85% Southern Chinese & Taiwanese, rest South Chinese.
  • Southern Fujianese (this includes Taiwanese) score 60-70% Southern Chinese & Taiwanese, rest South Chinese.
  • Teochews score 50-60% Southern Chinese & Taiwanese, rest South Chinese.
  • Cantonese score 90-100% South Chinese, rest usually Chinese Dai (although I have seen small percentages of Vietnamese or Filipino & Austronesian even after the most recent update).

For the inland north-south cline, it seems that Southern Shaanxi is where Southern Chinese & Taiwanese increases rapidly, because Xi'an results are mostly 100% Northern Chinese & Tibetan while Northern Hubei and Wuhan score 10-20% Northern Chinese & Tibetan, 75-90% Southern Chinese & Taiwanese, 0-3% South Chinese. I haven't seen many Hunan results but they will probably score 60-70% Southern Chinese & Taiwanese and the rest South Chinese. I haven't seen many Sichuan results but people from the southern part of Sichuan in Chengdu or Chongqing will probably score 90-100% Southern Chinese & Taiwanese, the rest probably South Chinese.

So Tsakhur's 67.5% Lingnan (South Chinese) to 21.6% non-Lingnan rice-growing (Southern & Taiwanese) ratio seems consistent with him being 1/2 Hainanese-Teochew + 1/8 Hakka + 1/8 Hokkien(?)
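For anyone who wants to sanity-check results against the quoted coastal breakdown, here's a rough Python sketch. The South Chinese ranges are taken from the bullet list above; where the quote only gives a Southern Chinese & Taiwanese range, I've derived South Chinese as "the rest", which is my own simplification and ignores trace components. The names `SOUTH_CHINESE_RANGES`, `gap_to_range`, and `rank_groups` are made up for illustration, and the "distance to the nearest range edge" ranking is just one simple way to compare, not anything 23andMe does.

```python
# South Chinese % ranges from the quoted coastal breakdown.
# Where the quote gives only Southern Chinese & Taiwanese (S&T), the range here
# is derived as 100 minus S&T (assumption: the rest is all South Chinese).
SOUTH_CHINESE_RANGES = {
    "Northern Fujianese": (15.0, 20.0),  # quoted as 80-85% S&T, rest South Chinese
    "Southern Fujianese": (30.0, 40.0),  # quoted as 60-70% S&T, rest South Chinese
    "Teochew": (40.0, 50.0),             # quoted as 50-60% S&T, rest South Chinese
    "Cantonese": (90.0, 100.0),          # quoted directly as 90-100% South Chinese
}


def gap_to_range(value, bounds):
    """Return how far value lies outside (low, high); 0.0 if it is inside."""
    low, high = bounds
    if value < low:
        return low - value
    if value > high:
        return value - high
    return 0.0


def rank_groups(south_chinese_pct):
    """Rank the groups from closest to farthest for a South Chinese percentage."""
    return sorted(
        SOUTH_CHINESE_RANGES,
        key=lambda group: gap_to_range(south_chinese_pct, SOUTH_CHINESE_RANGES[group]),
    )


# South Chinese (Lingnan) percentages reported in this thread.
for label, pct in [("Tsakhur", 67.5), ("Zhangpu/Xiamen", 23.6), ("Quanzhou", 28.3)]:
    print(label, rank_groups(pct)[:2])
```

With these ranges, Tsakhur's 67.5% lands between the Teochew and Cantonese ranges, which is the same picture as the mixed on-paper ancestry above.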
#14
2023 Chinese British forum member ronin92's MDLP K23b results

Quote:Paternal grandfather from Siyi (W Guangdong), paternal grandmother from HK (who knows where before then, should have asked her while she lived), maternal grandfather from Panyu, maternal grandmother from Sanshui. Yup, I should be pretty much Yue Guangdongese...

MDLP K23b 4-Ancestors Oracle
This program is based on 4-Ancestors Oracle Version 0.96 by Alexandr Burnashev.
Questions about results should be sent to him at: [email protected]
Original concept proposed by Sergey Kozlov.
Many thanks to Alexandr for helping us get this web version developed.

MDLP K23b Oracle Rev 2014 Sep 16

Admix Results (sorted):

# Population Percent
1 South_East_Asian 46.40
2 Austronesian 42.48
3 Tungus-Altaic 10.79


Finished reading population data. 620 populations found.
23 components mode.

--------------------------------

Least-squares method.

Using 1 population approximation:
1 Vietnamese_ @ 4.397965
2 Yao_ @ 4.702821
3 Cantonese_ @ 5.073689
4 Hmong_ @ 5.681221
5 Tai_Lue_ @ 5.896916
6 Tai_Khuen_ @ 6.202476
7 Zhuang_ @ 6.365070
8 Jiamao_ @ 6.635678
9 Yong_ @ 6.639216
10 Tai_Yuan_ @ 7.469939
11 Hmong_Miao_ @ 7.553263
12 Vietnamese_north_ @ 8.985241
13 Kinh_Vietnam_KHV_ @ 10.135927
14 She_ @ 11.214406
15 Vietnamese_central_ @ 11.357127
16 Han_Singapore_ @ 11.727256
17 Han_ @ 11.939180
18 Vietnamese_south_ @ 12.296912
19 Chinese_Dai_ @ 12.680389
20 Plang_ @ 12.694946

Using 2 populations approximation:
1 50% Chinese_Taiwan_ +50% Dai_ @ 1.272562


Using 3 populations approximation:
1 50% Dai_ +25% Dai_ +25% Korean_ @ 1.094572


Using 4 populations approximation:

My commentary-
Quote:Thanks for sharing! This is definitely in the more SEA-shifted range for Yue-speaking Guangdong Han. None of the confirmed Cantonese samples in my private Chinese dataset score that high on Austronesian or that low on Tungus_Altaic, but many of the far western Guangdong samples in my DNAConnect.org adoptees dataset do.

What other components show up in your MDLP K23b Oracle (not Oracle-4) results? The 3 big components only add up to 99.67.

Quote:Admix Results (sorted):

# Population Percent
1 South_East_Asian 46.4
2 Austronesian 42.48
3 Tungus-Altaic 10.79
4 Melano_Polynesian 0.19
5 Archaic_Human 0.14
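
The totals are easy to double-check. Below is a minimal Python sketch using only the percentages posted above; the variable and function names are mine, and `Decimal` is used so the two-decimal values add up exactly instead of picking up floating-point noise.

```python
from decimal import Decimal

# Percentages exactly as posted in ronin92's MDLP K23b results.
main_components = {
    "South_East_Asian": "46.40",
    "Austronesian": "42.48",
    "Tungus-Altaic": "10.79",
}
trace_components = {
    "Melano_Polynesian": "0.19",
    "Archaic_Human": "0.14",
}


def total(components):
    """Sum percentage strings exactly using Decimal arithmetic."""
    return sum(Decimal(value) for value in components.values())


main_total = total(main_components)
full_total = main_total + total(trace_components)

print("3 big components:", main_total)
print("all 5 components:", full_total)
```

So the two trace components account for the gap left by the 3 big ones.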
#15
(11-29-2023, 01:24 AM)okarinaofsteiner Wrote: Here's another (very recent) r/23andMe post of a Taiwanese-speaking Mainland Chinese person of Zhangpu and Xiamen ancestry

[Image: T4DILzCl.png]
75.1% non-Lingnan Southern Chinese, 23.6% Lingnan Chinese, 0.2% Broadly Chinese, 1.1% Filipino and Austronesian. Seems like roughly what you'd expect for someone whose ancestry is from that specific part of China, although judging from the general Taiwanese results I've seen, the Lingnan Chinese score might be on the lower end.

https://np.reddit.com/r/23andme/comments...ry_report/

Quanzhou ancestry results- 70.7% non-Lingnan Southern Chinese, 28.3% Lingnan Chinese, 0.4% Broadly Chinese, 0.2% Filipino & Austronesian, 0.2% Iranian + Caucasian + Mesopotamian (Eastern West Asia)
[Image: RAG11TPl.png]

