Thread: Release: BAM to plink (/showthread.php?tid=645), The GenArchivist Forum (https://genarchivist.com), Tutorials > Other

Release: BAM to plink - teepean - 03-16-2024

So I have had questions over the years about creating datasets, and here I present a new program called aDNA to dataset (AKA make-myself-redundant). There are both Linux and Windows versions available. The instructions are very simple, so I would like to get feedback from anyone interested in creating their own datasets from BAMs.

Notice: the Windows version includes all the files necessary to run except the references. For Linux you need to have samtools and pileupCaller in your path. pileupCaller can be downloaded from https://github.com/stschiff/sequenceTools, and for samtools I assume you know how to use apt, pacman, yum etc.

Main page: https://github.com/teepean/adna_to_dataset
Download: https://github.com/teepean/adna_to_dataset/archive/refs/tags/v.0.2.zip

pileupCaller runs with its default settings; if you want to modify them you have to edit the .bat or .sh.

EDIT: the program supports only hs37d5 and hg19 as references, as those are the most commonly used in aDNA papers. hg38/T2T support can be added if AADR starts supporting those references.

RE: Release: BAM to plink - Qrts - 03-16-2024

Brilliant work. Thank you for your contributions, teepean.

RE: Release: BAM to plink - teepean - 03-16-2024

New release that includes a reference downloader.
https://github.com/teepean/adna_to_dataset/archive/refs/tags/v.0.3.zip

RE: Release: BAM to plink - ChrisR - 03-17-2024

(03-16-2024, 05:37 PM)teepean Wrote: So I have had questions over the years about creating datasets and here I present a new program called aDNA to dataset (AKA make-myself-redundant)

Sorry for the n00b question, but what is the main purpose of creating your own datasets from BAMs? Regarding the reference downloads etc., may I ask if this could be installed together with WGSExtract so that the big (reference) files/paths can be shared?

RE: Release: BAM to plink - miquirumba - 03-17-2024

(03-16-2024, 05:37 PM)teepean Wrote: So I have had questions over the years about creating datasets and here I present a new program called aDNA to dataset (AKA make-myself-redundant)

RE: Release: BAM to plink - teepean - 03-17-2024

(03-17-2024, 12:57 PM)ChrisR Wrote: Sorry for n00b question but what is the main purpose of creating own datasets from BAMs?

The idea is that more people could create datasets. As for the references, it is possible to edit the code to point to a different location:

..\winbin\samtools mpileup -B -q 30 -Q 30 -l ../positions/v42.4.1240K.pos -f ../reference/hs37d5.fa

RE: Release: BAM to plink - Kale - 03-17-2024

Excellent! This sounds very useful. Quick question: about how much RAM does this program use on Windows, and is it compatible with a 32-bit OS?

RE: Release: BAM to plink - Fabrice E - 03-17-2024

OK, technical question from a noob: I am running the program on Windows. The download of the references starts without any issues. Then I get the prompt "enter population name". Which population does this refer to?

RE: Release: BAM to plink - Anglesqueville - 03-17-2024

(03-17-2024, 05:05 PM)Fabrice E Wrote: ok

The output is a PLINK packed-ped (binary) fileset consisting of 3 files (.bed, .bim, .fam), which is created in the target subdirectory. For each individual, the .fam gives a Population Name and an Individual Name. These are what the program asks you to choose.

RE: Release: BAM to plink - TanTin - 03-18-2024

Can someone convert these 2 files to plink, please?
GRC13292545.chrY.bam
GRC13292546.chrY.bam
https://evolbio.ut.ee/chrY/
These are the Y chromosomes of A00 used by Karmin et al. (2015).

RE: Release: BAM to plink - TanTin - 03-18-2024

In addition, these BAM files are hg18 (not hg19).
##reference=file:///cvmfs/data.galaxyproject.org/byhand/hg18/sam_index/hg18.fa
The SNP positions (POS, ID) are not the same as in hg19.

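For anyone unsure what the population and individual names end up as, here is a minimal Python sketch that reads a .fam file's contents. It relies on the standard six-column PLINK .fam layout (family ID, individual ID, father, mother, sex, phenotype). That the population name you type lands in the first column is an assumption based on Anglesqueville's description above, not something checked against the program's code, and the names `FamRecord`, `parse_fam`, `MyPop` and `MySample` are made up for illustration.

```python
from typing import List, NamedTuple


class FamRecord(NamedTuple):
    population: str  # column 1 (PLINK family ID); assumed to hold the population name
    individual: str  # column 2 (PLINK within-family ID); the individual name
    sex: str         # column 5: "1" = male, "2" = female, "0" = unknown


def parse_fam(text: str) -> List[FamRecord]:
    """Parse the text of a PLINK .fam file (six whitespace-separated columns)."""
    records = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 6:
            raise ValueError(
                f"line {line_number}: expected 6 columns, got {len(fields)}"
            )
        records.append(FamRecord(fields[0], fields[1], fields[4]))
    return records


# Hypothetical one-sample .fam content, for illustration only.
example_fam = "MyPop MySample 0 0 0 1\n"
for record in parse_fam(example_fam):
    print(f"population={record.population} individual={record.individual}")
```

To inspect a real file, pass its contents instead of the example string, for instance `parse_fam(open("target/mydata.fam").read())`, where the path is again hypothetical.
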
RE: Release: BAM to plink - teepean - 03-19-2024

(03-18-2024, 01:57 PM)TanTin Wrote: Can someone convert these 2 files to plink please?

This program should not be used for Y-DNA.

RE: Release: BAM to plink - teepean - 03-19-2024

New version released with reference auto-detection:
https://github.com/teepean/adna_to_dataset/archive/refs/tags/v.0.5.zip
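
The hg18 issue raised earlier in the thread is something you can check yourself before running a BAM through the program. Below is a rough Python sketch of one way to guess the reference build from a BAM header, as printed by `samtools view -H file.bam`. It is not how the program's own auto-detection works (that was not described in the thread); it simply looks for the decoy contig named `hs37d5` and otherwise compares the chromosome 1 length in the `@SQ` lines against the published assembly lengths. The function names are made up for illustration, and a header without chromosome 1 comes back as "unknown".

```python
from typing import Dict

# Chromosome 1 lengths of the common human assemblies.
CHR1_LENGTH_TO_BUILD = {
    247249719: "hg18",
    249250621: "hg19",
    248956422: "hg38",
}


def parse_sq_lines(header_text: str) -> Dict[str, int]:
    """Collect contig name -> length from the @SQ lines of a SAM/BAM header."""
    contigs = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        tags = {}
        for field in line.split("\t")[1:]:
            key, separator, value = field.partition(":")
            if separator:
                tags[key] = value
        if "SN" in tags and "LN" in tags:
            contigs[tags["SN"]] = int(tags["LN"])
    return contigs


def guess_build(header_text: str) -> str:
    """Heuristic guess of the reference build; returns 'unknown' when unsure."""
    contigs = parse_sq_lines(header_text)
    if "hs37d5" in contigs:
        return "hs37d5"  # the decoy contig is only present in hs37d5
    for name in ("chr1", "1"):
        if name in contigs:
            build = CHR1_LENGTH_TO_BUILD.get(contigs[name], "unknown")
            if build == "hg19" and name == "1":
                return "GRCh37 (no chr prefix, no decoy)"
            return build
    return "unknown"


# Hypothetical header fragment, for illustration only.
example_header = "@HD\tVN:1.0\n@SQ\tSN:chr1\tLN:247249719\n"
print(guess_build(example_header))
```

A result other than hs37d5 or hg19 means the BAM is outside what the program supports, going by teepean's first post.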