R1a STR UMAP plot
#1
I tried plotting a UMAP of STR values for haplogroup R1a, using FTDNA Y-DNA111 results:

[Image: So0xTGS.jpg]

Z93 and CTS1211 are visibly intermingled, while the other R1a clades are mostly separated.
Distance in the UMAP plot probably reflects some measure of genetic distance.
L664 and M458 are clearly separated, which may imply a long period of independent development.
rmstevens2, AimSmall, JonikW And 7 others like this post
#2
(11-03-2023, 02:46 PM)ph2ter Wrote: I tried plotting a UMAP of STR values for haplogroup R1a, using FTDNA Y-DNA111 results:

[Image: So0xTGS.jpg]

Z93 and CTS1211 are visibly intermingled, while the other R1a clades are mostly separated.
Distance in the UMAP plot probably reflects some measure of genetic distance.
L664 and M458 are clearly separated, which may imply a long period of independent development.

Does that separation likely date back to the CW period?
#3
May I ask how to generate such a map? Being able to distinguish Y-STR clusters by how distinct or overlapping they are is certainly useful.
JMcB likes this post
---
Main Projects
: Tyrol DNA, Alpine DNA, J2-M172, J2a-M67, J2a-PF5197, ISOGG Wiki, GenWiki;
Focus on Y-DNA: J2a-M67-L210, J2a-PF5197-PF5169, R1a-M17, R1b-U106-Z372
#4
(11-05-2023, 01:26 PM)ChrisR Wrote: May I ask how to generate such a map? Being able to distinguish Y-STR clusters by how distinct or overlapping they are is certainly useful.

I took the data from the FTDNA site and normalised the STR values.
After that I ran a PCA and a UMAP.
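
For readers who want to see what the PCA step involves, here is a rough sketch in plain Python. It is an illustration only, not the code behind the plot. It assumes the STR table has already been turned into equal-length numeric rows; the column scaling, the power-iteration method, and all function names are assumptions of this sketch. In practice a numerical library would be much faster, and the UMAP itself would be run on the same matrix with a dedicated package (for example umap-learn in Python); it is not reimplemented here.

```python
import math
import random


def standardise(rows):
    """Centre each column and scale it to unit variance (constant columns become 0)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) for j in range(d)]
    return [[(r[j] - means[j]) / sds[j] if sds[j] > 0 else 0.0 for j in range(d)]
            for r in rows]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def unit(v):
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def top_components(X, k=2, iters=500, seed=0):
    """Leading k principal axes of centred data X, by power iteration with deflation."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    cov = [[sum(x[a] * x[b] for x in X) / n for b in range(d)] for a in range(d)]
    axes = []
    for _ in range(k):
        v = unit([rng.gauss(0, 1) for _ in range(d)])
        for _ in range(iters):
            w = matvec(cov, v)
            if not any(w):
                break  # no variance left in any direction
            v = unit(w)
        lam = sum(a * b for a, b in zip(v, matvec(cov, v)))  # variance along v
        axes.append(v)
        # Deflate: subtract the component just found so the next pass finds the next axis.
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(d)] for a in range(d)]
    return axes


def project(X, axes):
    """Coordinates of each row of X along the given axes."""
    return [[sum(xj * aj for xj, aj in zip(x, ax)) for ax in axes] for x in X]


# Toy data (made up): two groups of three haplotypes, three marker columns.
toy_rows = [[10, 20, 5], [11, 21, 5], [10, 21, 5],
            [14, 25, 5], [15, 24, 5], [14, 24, 5]]
X = standardise(toy_rows)
axes = top_components(X, k=2)
scores = project(X, axes)  # one (PC1, PC2) pair per haplotype, ready to plot
```

On the toy data the two groups end up on opposite sides of the first axis, which is the kind of separation a PCA or UMAP plot makes visible.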
jamtastic and JMcB like this post
#5
I'm sorry, but I have no experience with normalizing STRs and then creating a PCA and a UMAP (from the 111 dimensions?). Is there a tutorial online?
JMcB likes this post
#6
(11-05-2023, 09:30 PM)ChrisR Wrote: I'm sorry, but I have no experience with normalizing STRs and then creating a PCA and a UMAP (from the 111 dimensions?). Is there a tutorial online?

Normalisation means that STRs with multiple values must be limited to 2 or 4 values (for example, DYS464 normally has 4 values, but some users have more than 4, so you must cut some of them and leave only 4). Then you exclude the users who have STR values equal to 0 or similar issues.
You can use the Vahaduo tool for PCA. A more professional tool is PAST, and for anything more sophisticated you must use R.
I cannot teach you R, because it is not trivial.
Some tutorials were on AG.
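
For anyone who would rather script the clean-up than do it by hand, the normalisation described above can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the procedure actually used for the plot: the input layout (one dict per haplotype, multi-copy markers written with "-" between values), the EXPECTED_COPIES table, and the choice to keep the first values and drop the rest are all hypothetical.

```python
# Assumed copy counts for multi-copy markers; extend for the markers in your data.
EXPECTED_COPIES = {"DYS385": 2, "DYS464": 4}


def parse_marker(name, raw):
    """Turn a raw marker string into a fixed-length list of ints, or None if unusable."""
    try:
        values = [int(v) for v in raw.split("-")]
    except ValueError:
        return None  # missing or non-numeric entry
    if any(v <= 0 for v in values):
        return None  # a value of 0: the whole haplotype gets excluded
    want = EXPECTED_COPIES.get(name, 1)
    if len(values) < want:
        return None  # too few copies to fill the columns
    return values[:want]  # cut extra copies (keeping the first ones is arbitrary)


def normalise(haplotypes, markers):
    """Return equal-length numeric rows, skipping haplotypes that fail parsing."""
    rows = []
    for hap in haplotypes:
        row = []
        for name in markers:
            parsed = parse_marker(name, hap.get(name, ""))
            if parsed is None:
                break  # drop this haplotype
            row.extend(parsed)
        else:
            rows.append(row)
    return rows


# Made-up example values, for illustration only.
markers = ["DYS393", "DYS385", "DYS464"]
haplotypes = [
    {"DYS393": "13", "DYS385": "11-14", "DYS464": "12-15-15-16"},
    {"DYS393": "13", "DYS385": "11-14", "DYS464": "12-15-15-16-17"},  # extra copy is cut
    {"DYS393": "0", "DYS385": "11-14", "DYS464": "12-15-15-16"},      # excluded
]
rows = normalise(haplotypes, markers)
```

Each surviving row has the same number of columns, so the result can go straight into a PCA or UMAP tool.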
jamtastic, Capsian20, ChrisR And 1 others like this post
#7
(11-05-2023, 10:18 PM)ph2ter Wrote: Normalisation means that STRs with multiple values must be limited to 2 or 4 values (for example, DYS464 normally has 4 values, but some users have more than 4, so you must cut some of them and leave only 4). Then you exclude the users who have STR values equal to 0 or similar issues.
You can use the Vahaduo tool for PCA. A more professional tool is PAST, and for anything more sophisticated you must use R.
I cannot teach you R, because it is not trivial.
Some tutorials were on AG.

Thanks for the hints. I once did a little in R but have forgotten almost all of it. It seems that coding a proper workflow is required unless a lot is done manually. So for now I must save this for a time when I have a lot of free time ;-)
ph2ter likes this post