Ancient genomes illuminate Eastern Arabian population history
#16
(03-01-2024, 11:56 PM)Tomenable Wrote:
(03-01-2024, 11:48 PM)TanTin Wrote:
(03-01-2024, 11:37 PM)Tomenable Wrote: I think that paper has data from Qatar, but not from Bahrain?

Yes, mostly Qatar, but there are many others as well. There are more than 800 in total.

Which of these samples are from Bahrain?

Not from Bahrain exactly, but from the region in general. The populations there are not very different from each other.
Reply
#17
Regarding these 4 samples: the FASTQ data is available (4 files):

https://www.ebi.ac.uk/ena/browser/view/PRJEB71330
Reply
#18
Converted one of the Bahrain individuals:

https://ufile.io/7yaq3rrq

(Four genomes from Tylos-period Bahrain:   SAMEA115108455  ERX11887579  ERR12512058  9606 )
Reply
#19
(03-01-2024, 11:14 PM)Tomenable Wrote: Teepean converted them. I sent you in a PM.

For some reason I couldn't align them with bwa aln but had to use bwa mem.
Reply
#20
(03-02-2024, 01:28 PM)teepean Wrote:
(03-01-2024, 11:14 PM)Tomenable Wrote: Teepean converted them. I sent you in a PM.

For some reason I couldn't align them with bwa aln but had to use bwa mem.

I also did the alignment. Here are the 4 files:

https://ufile.io/8s5jtzvn

Please note: there are multiple variants with 3+ alleles present

# 3 more multiple-position warnings: see log file.
# Error: 98093 variants with 3+ alleles present.

So for the one individual that I checked (ERR12512058), I had to exclude over a third of all the SNPs because of a third allele being present. Excluding these 98,000 SNPs is not fatal: there are 170k other SNPs left.
The other 3 samples have a lot more SNPs, but I didn't have time to check them further.
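
For anyone wondering what "excluding the variants with 3+ alleles" amounts to, here is a minimal Python sketch of the idea. It is not the procedure actually used above: the records are made-up toy data, and the function names and the set of missing-allele codes are assumptions for illustration only. It groups records by position, collects the distinct alleles seen there, and drops every position with more than two.

```python
from collections import defaultdict

# Assumption: "." and "0" both stand for a missing allele.
MISSING_CODES = {".", "0"}


def multiallelic_positions(records):
    """Return the (chrom, pos) keys at which more than two distinct alleles occur."""
    alleles_seen = defaultdict(set)
    for chrom, pos, allele1, allele2 in records:
        for allele in (allele1, allele2):
            if allele not in MISSING_CODES:
                alleles_seen[(chrom, pos)].add(allele)
    return {key for key, alleles in alleles_seen.items() if len(alleles) > 2}


def drop_multiallelic(records):
    """Keep only the records whose position carries at most two alleles."""
    bad = multiallelic_positions(records)
    return [rec for rec in records if (rec[0], rec[1]) not in bad]


# Hypothetical toy records (chrom, pos, allele1, allele2), not real Tylos data.
records = [
    ("1", 1000, "A", "C"),
    ("1", 1000, "A", "G"),  # a third allele at the same position
    ("1", 2000, "G", "T"),
]
kept = drop_multiallelic(records)
print(kept)
```

The sketch removes the whole position rather than trying to guess which allele is the wrong one, which is the cautious choice when the extra allele may be damage or a mapping error.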
Capsian20 and teepean like this post
Reply
#21
Another version of the Tylos data, with the variants with 3+ alleles removed.

> missing$F_MISS
[1] 0.8138 0.3887 0.4102 0.4722
> missing$IID
[1] "B_ERR12512058" "ERR12512055.bam" "ERR12512056.bam" "ERR12512057.bam"

https://ufile.io/cwgdu23l

The quality of this data is still very good: more than 700k variants.
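
For readers unfamiliar with F_MISS: as I understand it, it is simply the number of missing genotype calls for a sample divided by the total number of variants. A minimal Python sketch of that ratio, using made-up toy calls (the function name and the use of None for a missing call are assumptions for illustration):

```python
def missing_fraction(calls):
    """Fraction of genotype calls that are missing (represented here as None)."""
    if not calls:
        raise ValueError("no genotype calls given")
    return sum(call is None for call in calls) / len(calls)


# Hypothetical toy genotypes, not taken from the Tylos files.
toy_sample = ["AC", None, "GG", None, None]
rate = missing_fraction(toy_sample)
print(f"F_MISS = {rate:.4f}")
```

Read this way, the first value above (0.8138) means about 81% of the positions have no call for B_ERR12512058, against roughly 39 to 47% for the other three.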
Capsian20 likes this post
Reply
#22
Here's the plink dataset.

https://drive.google.com/file/d/1FLMV-Eg...sp=sharing
ionix and Capsian20 like this post
Reply
#23
When I project the new Bahrain data, I get different positions on the PCA.
[Image: Bahrain.png]

I am not sure which one is the correct position. When I did the alignment, I skipped the filtering step. I also noticed a difference in my .bim file. My file contains entries like this:


23 rs4829294 0 33611081 A C
23 rs12014055 0 33618765 . C   <--- dots for the missing alleles
23 rs5972902 0 33619246 G A
23 snp_23_33623357 0 33623357 . G
23 rs1878889 0 33625822 A C

In teepean's .bim file:
23 rs12013178 0.981526 53292827 C A  <--- all the alleles are listed properly.
23 rs2315863 0.981748 53302195 G C
23 rs1409117 0.981761 53302495 A G
23 rs6638377 0.981833 53304115 G T
23 rs2094145 0.982004 53307984 G C
23 rs6529669 0.982055 53308578 C T
23 rs188067797 0.982304 53310072 G C

This could be a result of the conversion from VCF to bed/bim/fam.
I did the alignment on the galaxy.org site, following the steps described in the other topic (except that I filtered only on ID==".").

As you can see, my data plots closer to Yemen / Eritrea / Morocco on the PCA. The data converted by teepean plots closer to Iran, the Caucasus and Europe. The official publication also seems to place these samples closer to Europe.
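
A quick way to see how widespread the "." alleles are is to scan the .bim file and count the affected variants. The following is only a sketch: it assumes the usual six whitespace-separated .bim columns (chromosome, variant ID, cM, position, allele 1, allele 2), treats both "." and "0" as missing codes, and runs on the five lines pasted above rather than on the full file. The function name is made up for illustration.

```python
# The five .bim lines from my file, with the annotation removed.
BIM_LINES = """\
23 rs4829294 0 33611081 A C
23 rs12014055 0 33618765 . C
23 rs5972902 0 33619246 G A
23 snp_23_33623357 0 33623357 . G
23 rs1878889 0 33625822 A C
"""

# Assumption: "." and "0" both stand for a missing allele.
MISSING_CODES = {".", "0"}


def variants_with_missing_allele(bim_text):
    """Return the IDs of variants whose first or second allele is a missing code."""
    flagged = []
    for line in bim_text.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        _chrom, variant_id, _cm, _pos, allele1, allele2 = fields[:6]
        if allele1 in MISSING_CODES or allele2 in MISSING_CODES:
            flagged.append(variant_id)
    return flagged


flagged = variants_with_missing_allele(BIM_LINES)
total = len(BIM_LINES.splitlines())
print(f"{len(flagged)} of {total} variants have a missing allele: {flagged}")
```

If a large share of the variants is flagged, the alleles were probably lost somewhere in the conversion, and that would be the first thing to rule out before trusting either PCA position.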
Capsian20 likes this post
Reply
#24
This is the official PCA from the publication:
https://ars.els-cdn.com/content/image/1-...4X-gr1.jpg
[Image: 1-s2.0-S2666979X2400034X-gr1.jpg]
As you may notice, the ancient Bahrain individuals fall almost midway between Europe and the Caucasus.
Capsian20 likes this post
Reply
#25
(03-02-2024, 10:14 PM)TanTin Wrote: When I do a projection for the new Bahrain data, I have different positions on PCA.

I used the following process: remove adapters -> align with bwa aln -> remove duplicates -> build the dataset with pileupCaller.
TanTin likes this post
Reply
#26
Here are the coords:

Quote:
AS_EMT2,0.084229,0.126941,-0.068636,-0.064277,-0.03139,-0.030678,-0.005405,-0.013846,-0.015339,-0.006378,0.002273,-0.002698,0.010406,-0.007982,0.001357,0.02254,0.005476,-0.000127,0.005405,-0.006503,-0.012353,-0.008656,-0.00456,0.000723,0.007664
MH2_LT2,0.084229,0.118817,-0.066373,-0.046512,-0.033237,-0.011992,0.00329,-0.001385,-0.01493,-0.007107,0.002436,-0.008842,0.020069,0.005092,0.002307,0.00411,-0.020861,0.002154,0.002891,-0.005878,-0.006613,0.000618,-0.005546,0.001325,0.007784
MH1_LT2,0.086506,0.128972,-0.067127,-0.064277,-0.03139,-0.012829,-0.001175,-0.006692,-0.000614,-0.005103,0.003897,-0.014987,0.023637,-0.003441,0.001493,0.001856,-0.019558,-0.002787,-0.000754,-0.001501,0.007112,-0.001978,0.001232,-0.00241,0.00467
MH3_LT2,0.087644,0.132019,-0.071276,-0.075905,-0.034468,-0.017849,-0.006345,-0.008077,0.004909,0.000364,0.008119,-0.016335,0.024083,0.003716,-0.000136,0.017634,-0.002086,0.011529,0.007668,0.001376,0.001996,0.003586,-0.002711,0.00012,0.003832
ionix likes this post
Reply
#27
[Image: LIhQdcN.png]

They're significantly Iranian and Arabian admixed, with some South-Central Asian in the mix. However, MH3 doesn't seem to be 'Levantine admixed' as they said; it's only 'less Iranian'.

The sources:
Code:
Mesopotamia_BA:Iran_DinkhaTepe_MLBA,0.0878714,0.141768,-0.0678816,-0.0858534,-0.0256046,-0.0293392,0.000329,-0.0077536,-0.0067084,0.00277,0.0073724,-0.0039562,0.0107928,0.0030552,-0.0021172,0.0091222,-0.0037552,0.0009376,0.003947,0.0007254,0.003968,0.0035118,-0.005472,-0.0021448,0.0058436
Mesopotamia_BA:Turkey_Mesopotamia_Sirnak_EBA,0.09418875,0.14318925,-0.06665625,-0.0799425,-0.019388,-0.03005025,-0.00188,-0.00738425,-0.014981,0.00154875,0.00499325,-0.00273525,0.00364225,0.00681225,-0.0075325,-0.00089475,-0.01258225,0.003357,0.00402225,-0.001657,0.00698775,0.00398775,-0.005053,-0.010905,0.000988
Syria_TellQarassa_Umayyad,0.0682935,0.155376,-0.0550595,-0.127909,-0.0004615,-0.056615,-0.0186835,-0.0125765,0.0651405,0.0072895,0.0185935,-0.028924,0.069053,0.004473,0.01079,0.0230045,-0.035269,-0.003864,0.00088,0.0345165,0.0207135,0.0072335,-0.0070865,0.0030125,-0.01443
Iranian_Zoroastrian,0.091627545,0.10778459,-0.063219091,-0.022521909,-0.045365,0.0040438636,0.0018586364,-0.005318,-0.028373045,-0.016840318,-0.00025090909,0.000361,0.0043652273,-0.0049418182,0.0071253182,0.013554227,-0.0035262273,0.002885,0.0017312727,-0.0087711818,-0.0032215909,-0.0037656818,0.00067772727,-0.003489,0.0053668636
Armenia_Lchashen_LBA,0.10974442,0.12727967,-0.038403583,-0.015315583,-0.034365333,0.0029515833,0.0066193333,-0.0068074167,-0.04678475,-0.02426775,0.002422,0.0033346667,-0.010492917,-0.002913,0.0070575833,-0.0026739167,-0.00140175,0.00029566667,-0.00116275,0.0016153333,0.0038474167,0.00021641667,-0.00126325,-0.00215875,-0.0011575
Kalash,0.083556455,0.024972818,-0.084166409,0.066391182,-0.071649591,0.039982773,0.0029695455,0.0020559545,-0.030743636,-0.025256318,-0.0056910909,-0.00052440909,-0.0027299545,-0.011003545,0.016724455,0.0086965909,-0.013951136,0.0020442727,0.00063981818,-0.0128755,-0.0038171818,-0.0053057273,0.0030475909,-0.0034286364,0.0034945
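
For a rough sanity check, coordinates in this format can be compared by plain Euclidean distance. The Python sketch below is not how the plot above was made, and a nearest-distance ranking is not an admixture model. To keep the example short, it uses only the first three coordinates of AS_EMT2 and of three of the sources; a real comparison would use all 25. The function names are made up for illustration.

```python
import math


def parse_coords(line):
    """Split a 'name,v1,v2,...' line into the name and a list of floats."""
    name, *values = line.strip().split(",")
    return name, [float(v) for v in values]


def euclidean(a, b):
    """Euclidean distance between two coordinate vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("coordinate vectors differ in length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_sources(target_vec, source_lines):
    """Return (distance, name) pairs for the sources, closest first."""
    ranked = []
    for line in source_lines:
        name, vec = parse_coords(line)
        ranked.append((euclidean(target_vec, vec), name))
    return sorted(ranked)


# Truncated to the first three coordinates, for illustration only.
_, target = parse_coords("AS_EMT2,0.084229,0.126941,-0.068636")
sources = [
    "Mesopotamia_BA:Iran_DinkhaTepe_MLBA,0.0878714,0.141768,-0.0678816",
    "Iranian_Zoroastrian,0.091627545,0.10778459,-0.063219091",
    "Kalash,0.083556455,0.024972818,-0.084166409",
]
ranking = rank_sources(target, sources)
for distance, name in ranking:
    print(f"{name}: {distance:.4f}")
```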
pegasus, ionix, Megalophias And 1 others like this post
Reply
#28
When are these dated to?
Qrts likes this post
Reply
#29
The paper says the raw data has been published at accession PRJEB31781. Does anyone know how to match the names from PRJEB71330 to this one?
Qrts likes this post
Reply
#30
(03-03-2024, 11:35 AM)pegasus Wrote: When are these dated to?

Quote:Due to poor collagen preservation, we could only obtain radiocarbon dates for two out of the four sequenced individuals, placing them in the Late Tylos/Sasanian period (LT, ∼300–622 CE), with MH1 being older (432–561 cal. CE) than MH3 (577–647 cal. CE) (Figure S1B). Sample MH2 was not directly dated, but its archaeological context places it in the Late Tylos period. The Abu Saiba sample was excavated from a cemetery with known occupation between 200 BCE and 300 CE [7,27], and therefore it dates confidently within the boundaries of the Early/Middle Tylos period (EMT), more precisely during the times of Seleucid and Characene influence in Bahrain, which preceded the emergence of the Sasanian Empire.
Qrts and pegasus like this post
Reply

