Updated EukPhylo QuickStart (markdown)

GiuliaRibeiro 2025-04-10 09:15:47 -04:00
parent cfcf0bba3a
commit d8654eadec

@ -131,14 +131,7 @@ If a user chooses to use their own gene families database, they need to replace t
In a main project directory:
* Create a `Scripts` folder containing the 8 scripts from GitHub [here](https://github.com/Katzlab/EukPhylo/tree/main/PTL2/Scripts)
* In addition to the scripts, also add the `trimal-trimAl` and `guidance.v2.02` folders, as downloaded from [here](https://github.com/inab/trimal) and [here](https://github.com/anzaika/guidance), respectively. Smith College HPC (Grid) users: see [here](https://docs.google.com/document/d/1tDxaCrVEHckyvlaaY58lNQnJ4UHTRAMICdR7xzzjPTE/edit?tab=t.0).
* IMPORTANT NOTE: Make sure to replace the paths in the `guidance.py` script with the full paths to your `trimal-trimAl` and `guidance.v2.02` folders.
Lines to modify:
`os.system('python [full path for guidance]/guidance.pl --seqFile ' + guidance_input + '/' + file + ' --msaProgram MAFFT --seqType aa --outDir ' + tax_guidance_outdir + ' --seqCutoff ' + str(params.seq_cutoff) + ' --colCutoff ' + str(params.col_cutoff) + " --outOrder as_input --bootstraps 10 --MSA_Param '\\--" + mafft_alg + " --maxiterate 1000 --thread " + str(params.guidance_threads) + " --bl 62 --anysymbol' > " + params.output + '/Output/Intermediate/Guidance/Output/' + file[:10] + '/log.txt')`
and here, under the `#Gap trimming` comment:
`os.system('Scripts/trimal-trimAl/source/trimal -in ' + tax_guidance_outdir + '/' + file.split('.')[0].split('_preguidance')[0] + '.postGuidance_preTrimAl_aligned.fasta -out ' + tax_guidance_outdir + '/' + file.split('.')[0].split('_preguidance')[0] + '.95gapTrimmed.fasta -gapthreshold ' + str(params.trimal_cutoff) + ' -fasta')`
* IMPORTANT NOTE: Make sure to replace the paths in the `guidance.py` script with the full paths to your `trimal-trimAl` and `guidance.v2.02` folders, as exemplified in the script.
* Create an empty output folder (e.g. `Output`) for results (i.e. guidance and tree outputs)
* Create a list of ten-digit codes for your target and outgroup taxa (e.g. `taxa.txt`)
* Create a folder (e.g. `R2Gs`) that contains the AA ReadyToGo fasta files for all taxa (from `taxa.txt`)
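
The setup steps above can be sanity-checked before launching a run. The following is a minimal sketch and not part of EukPhylo: the function name `check_project_layout` is hypothetical, the default names (`Output`, `taxa.txt`, `R2Gs`) are simply the examples used above, and it assumes that "ten-digit code" means a ten-character code and that each ReadyToGo file name starts with its taxon code. Adjust these assumptions to match your own data.

```python
from pathlib import Path


def check_project_layout(project_dir, taxa_file="taxa.txt", r2g_dir="R2Gs",
                         output_dir="Output"):
    """Return a list of problems found in the project layout (hypothetical helper)."""
    root = Path(project_dir).resolve()
    problems = []

    # The Scripts folder should hold the pipeline scripts plus both tool folders.
    scripts = root / "Scripts"
    for required in (scripts, scripts / "trimal-trimAl", scripts / "guidance.v2.02"):
        if not required.is_dir():
            problems.append(f"missing folder: {required}")

    # The output folder should exist and start out empty.
    out = root / output_dir
    if not out.is_dir():
        problems.append(f"missing folder: {out}")
    elif any(out.iterdir()):
        problems.append(f"output folder is not empty: {out}")

    # Read the taxon codes, skipping blank lines.
    taxa_path = root / taxa_file
    if not taxa_path.is_file():
        problems.append(f"missing taxon list: {taxa_path}")
        return problems
    codes = [line.strip() for line in taxa_path.read_text().splitlines()
             if line.strip()]

    # ASSUMPTION: a "ten-digit code" is a code of exactly ten characters.
    for code in codes:
        if len(code) != 10:
            problems.append(f"taxon code is not ten characters long: {code!r}")

    # ASSUMPTION: each ReadyToGo file name starts with its taxon code.
    r2g = root / r2g_dir
    if not r2g.is_dir():
        problems.append(f"missing folder: {r2g}")
    else:
        names = [p.name for p in r2g.iterdir() if p.is_file()]
        for code in codes:
            if not any(name.startswith(code) for name in names):
                problems.append(f"no ReadyToGo file found for taxon: {code}")

    return problems
```

Calling `check_project_layout(".")` from the main project directory returns an empty list when every folder and file described above is in place, and otherwise one message per missing or unexpected item.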