The GenArchivist Forum (https://genarchivist.com)
R1a STR UMAP plot



R1a STR UMAP plot - ph2ter - 11-03-2023

I tried plotting a UMAP of STR values for haplogroup R1a, using FTDNA Y-DNA111 results:

[Image: So0xTGS.jpg]

Z93 and CTS1211 are visibly intermingled, while the other R1a clades are mostly separated.
The UMAP distance probably reflects some measure of genetic distance.
L664 and M458 are clearly separated, as if this implies a long period of independent development.


RE: R1a STR UMAP plot - leonardo - 11-05-2023

Likely dating back to the CW (Corded Ware) period?


RE: R1a STR UMAP plot - ChrisR - 11-05-2023

May I ask how you generated such a map? Being able to distinguish Y-STR clusters by how distinct or overlapping they are is certainly useful.


RE: R1a STR UMAP plot - ph2ter - 11-05-2023

I took the data from the FTDNA site and normalised the STR values.
After that I ran a PCA and a UMAP.
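
For anyone wanting a starting point, the PCA and UMAP step might be sketched in Python roughly as below. This is not the code behind the plot above (which was made with other tools); the haplotypes are synthetic stand-ins, the z-score scaling is an assumed preprocessing choice, and the function names are made up for illustration. The UMAP helper relies on the third-party umap-learn package and is only defined here, not called.

```python
import numpy as np

def standardise(X):
    """Centre each marker column and scale it to unit variance (an assumed choice)."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # constant markers carry no signal; avoid dividing by zero
    return (X - X.mean(axis=0)) / std

def pca(X, n_components=2):
    """PCA via singular value decomposition of the centred matrix."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]  # sample coordinates
    explained = S**2 / np.sum(S**2)                  # variance share per component
    return scores, explained[:n_components]

def umap_embed(X, n_neighbors=15, min_dist=0.1, seed=0):
    """2-D UMAP embedding; needs the third-party umap-learn package."""
    import umap
    reducer = umap.UMAP(n_neighbors=n_neighbors, min_dist=min_dist,
                        n_components=2, random_state=seed)
    return reducer.fit_transform(X)

# Synthetic stand-in for cleaned haplotypes: two made-up clusters, 8 markers each.
rng = np.random.default_rng(0)
cluster_a = rng.integers(10, 13, size=(20, 8))
cluster_b = rng.integers(15, 18, size=(20, 8))
haplotypes = np.vstack([cluster_a, cluster_b])

scores, explained = pca(standardise(haplotypes))
print(scores.shape, explained.round(3))
```

With real data, each row would be one kit and each column one STR value after normalisation; calling umap_embed on the same matrix would give the 2-D coordinates to plot and colour by clade.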


RE: R1a STR UMAP plot - ChrisR - 11-05-2023

I'm sorry, but I have no experience with normalising STR values and then creating a PCA and UMAP (from the 111 dimensions?). Is there a tutorial online?


RE: R1a STR UMAP plot - ph2ter - 11-05-2023

Normalisation means that you must limit the STRs that have multiple values to 2 or 4 values (for example, DYS464 normally has 4 values, but some users have more than 4, so you must cut some of them and keep only 4), and then exclude the users who have any STR value equal to 0 or similar issues.
You can use the Vahaduo tool for PCA. A more professional tool is PAST, and for anything more sophisticated you must use R.
I cannot teach you R, because it is not trivial.
There were some tutorials on AG (Anthrogenica).
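
The normalisation rules described above (cap multi-copy markers at their usual count, drop kits with zero values) could be sketched in Python roughly as follows. Everything here is an illustrative assumption rather than the workflow actually used: the three-marker panel, the kit ids, the hyphen-separated input format, the choice to keep the first values when a marker has extra copies, and the choice to drop kits with too few copies or non-numeric values.

```python
# Hypothetical mini-panel; a real Y-DNA111 export has many more markers.
MARKER_ORDER = ["DYS393", "DYS385", "DYS464"]
# Usual copy counts for multi-copy markers (illustrative subset).
EXPECTED_COPIES = {"DYS385": 2, "DYS464": 4}

def parse_marker(name, raw):
    """Turn a value such as '12-15-15-16' into a list of ints, or None if unusable."""
    expected = EXPECTED_COPIES.get(name, 1)
    try:
        values = [int(v) for v in raw.split("-")]
    except ValueError:
        return None  # missing or non-numeric value
    if len(values) < expected:
        return None  # too few copies to fill the columns
    values = values[:expected]  # assumption: keep the first copies, cut the rest
    if 0 in values:
        return None  # zero values are excluded
    return values

def normalise(samples):
    """Map kit id -> flat list of ints, skipping kits with any unusable marker."""
    cleaned = {}
    for kit, markers in samples.items():
        row = []
        for name in MARKER_ORDER:
            values = parse_marker(name, markers.get(name, ""))
            if values is None:
                break
            row.extend(values)
        else:  # no break: every marker was usable
            cleaned[kit] = row
    return cleaned

samples = {
    "kit1": {"DYS393": "13", "DYS385": "11-14", "DYS464": "12-15-15-16"},
    "kit2": {"DYS393": "13", "DYS385": "11-14", "DYS464": "12-14-15-15-16-16"},
    "kit3": {"DYS393": "0", "DYS385": "11-14", "DYS464": "12-15-15-16"},
}
cleaned = normalise(samples)
print(cleaned)
```

In this made-up example kit2 has six DYS464 values and is cut to four, and kit3 is excluded because of its zero value. Every surviving kit ends up with the same number of columns, which is what PCA or UMAP needs as input.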


RE: R1a STR UMAP plot - ChrisR - 11-18-2023

Thanks for the hints. I once did a little in R but have forgotten almost all of it. It seems that coding a proper workflow is required, unless a lot is done manually. So for now I will have to save this for a time when I have a lot of free time ;-)