GitHub - HudsonAlpha/Run-Tractor: A pipeline used to prepare files for Tractor-Mix. Performs phasing with shapeit5 and local ancestry with rfmix2.

A pipeline used to prepare files for Tractor-Mix. Performs phasing with shapeit5 and local ancestry with rfmix2.

Use /cluster/home/jtaylor/scripts/Run_Tractor/run_tractor.sh to generate full phasing and local ancestry outputs. Takes a gzipped vcf and an output directory as arguments.
Use /cluster/home/jtaylor/scripts/Run_Tractor/plink_processing.sh to filter down to the analysis subtype you are working on, prune related individuals, and generate a standardized kinship matrix. Takes a plink bfile, phenotype string, and a subset file (if filtering down to a specific phenotype). All work is performed in the working directory.
Create a sample list with the resulting unrelated fam file generated from the plink processing (will be used to filter phasing and local ancestry results).
Using /cluster/home/jtaylor/scripts/Run_Tractor/example_workflow_scripts/latam5k_ftd_unrelated_null_model_fit_v2_1-25-26.` as an example, make a new script with the correct paths to the new unrelated plink dataset, the kinship matrix, and covariate files, as well as the path for the output null model. *** Be sure to update the formula for the null model to include the covariates you want. *** Be sure to load this micromamba env before running or make sure you have the “GMMAT” and “data.table” R libraries installed: /cluster/home/jtaylor/micromamba/envs/r-env
Filter phasing results to the new unrelated/phenotype individuals. You can use this script as an example: /cluster/home/jtaylor/scripts/Run_Tractor/example_workflow_scripts/filter_vcf_for_ftd_and_chr_batch_job.sh I would suggest making a copy of the script in the phasing dir and updating the paths and naming convention.

The script is set up to run as a batch, you can submitted using a command like this: sbatch --array [1-22]%6 -o logs/filter_phased-%A_%a.out filter_vcf_for_ftd_and_chr_batch_job.sh /cluster/home/jtaylor/scripts/Run_Tractor/resources/chrs.txt
Filter local ancestry results the new unrelated/phenotype individuals. You can use this script as an example: /cluster/home/jtaylor/scripts/Run_Tractor/example_workflow_scripts/filter_map_files_ftd_unrelated.sh Again, I would recommend making a copy of the script and updating paths and naming convention.
Use Tractor-Mix extract tracts to generate final files for Tractor-Mix GWAS. There are two scripts to do this:

a. /cluster/home/jtaylor/scripts/Run_Tractor/example_workflow_scripts/extract_tracts/extract_tracts_ftd_unrelated.sh

b. /cluster/home/jtaylor/scripts/Run_Tractor/example_workflow_scripts/extract_tracts/run_extract_tracts_ftd_unrelated.sh

Make copies of the files in the directory you want the results in, the update the paths and naming conventions in both. The run_extract_tracts_ftd_unrelated.sh script with run the other, but batch the jobs for chromosome. You may also need to update num-ancs in extract_tracts_ftd_unrelated.sh if you are using anything besides 3 ancestries. This will generate chromosome/ancestry split dosage files for the main Tractor-Mix script.
Run the main Tractor-Mix pipeline using these two scripts:

a. /cluster/home/jtaylor/scripts/Run_Tractor/example_workflow_scripts/run_tractor_mix.R

b. /cluster/home/jtaylor/scripts/Run_Tractor/example_workflow_scripts/run_tractor_mix_by_chr.sh

Make copies of the scripts and update paths and naming conventions in both. You will need to update the path to the R script in the shell script too. *** There are some env variables in the shell script that need to be adjusted if you change the memory or cores of the script, there is a comment in the script for this ***

You can run this workflow with a command like this: sbatch -o /cluster/projects/ADFTD/redlat_paper_2/LATAM5k_12-3-25/tractor/tractor_mix_ftd_unrelated_AC5_site_PC1-10/logs/tractor_mix_run-%A_%a.out /cluster/projects/ADFTD/redlat_paper_2/LATAM5k_12-3-25/tractor/tractor_mix_ftd_unrelated_AC5_site_PC1-10/run_tractor_mix_by_chr.sh /cluster/home/jtaylor/scripts/Run_Tractor/resources/chrs.txt
Combine the chromosome results from Tractor-Mix and filter variants for blacklist and snps only. You can use /cluster/home/jtaylor/scripts/Run_Tractor/example_workflow_scripts/combine_and_filter_tractor_mix_results.sh as an example. Make a copy and update paths and naming conventions. This will be final GWAS results.
To generate the ADMIXTURE plot, use the two plots:

a. /cluster/home/jtaylor/scripts/Run_Tractor/example_workflow_scripts/combine_Q.R

b. /cluster/home/jtaylor/scripts/Run_Tractor/example_workflow_scripts/plot_ancestry_proportions.R

Just like the other scripts, make a copy and update paths/naming convention. Run combine Q first to create overall ancestry proportions (chromosome size aware). Then you can plot those results with the other script. You can use the r-env micromamba env for these two scripts as well.

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
example_workflow_scripts		example_workflow_scripts
.gitignore		.gitignore
README.md		README.md
check_chunk_jobs.sh		check_chunk_jobs.sh
chunk_vcf.sh		chunk_vcf.sh
ligate_phased_chunks.sh		ligate_phased_chunks.sh
local_ancestry.sh		local_ancestry.sh
plink_processing.sh		plink_processing.sh
run_tractor.sh		run_tractor.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

About

Uh oh!

Releases

Packages

Languages

HudsonAlpha/Run-Tractor

Folders and files

Latest commit

History

Repository files navigation

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages