Updated QuickStart EukPhylo (markdown)

Adri K. Grow 2025-02-03 11:52:16 -05:00
parent 91db5cb7b0
commit a5d9cfe85a

@ -182,7 +182,7 @@ Below are several optional ways to parameterize EukPhylo Part 2
## Contamination Removal ## Contamination Removal
Contamination removal within EukPhylo (also called Contamination Loop) allows for sequence removal based on Sisters/Subsisters identification or based on Clades diversity. An exemplar run is available on [Figshare](https://figshare.com/articles/dataset/Examplar_runs_PhyloToL_and_CLoop/26662018). Contamination removal within EukPhylo (also called Contamination Loop) allows for sequence removal based on Sisters/Subsisters identification or based on Clades diversity. An exemplar run is available on [Figshare](https://figshare.com/articles/dataset/Examplar_runs_PhyloToL_and_CLoop/26662018).
### Set up ### Set up:
* An input folder (called for example Input), with both * An input folder (called for example Input), with both
* the treefiles * the treefiles
* the fasta files matching the trees * the fasta files matching the trees
@ -191,7 +191,7 @@ Contamination removal within EukPhylo (also called Contamination Loop) allows fo
* a txt file containing the rules for contamination removal * a txt file containing the rules for contamination removal
* the Scripts Folder * the Scripts Folder
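
Before launching the loop, it can help to confirm that every tree in the input folder has a FASTA file to go with it. The sketch below is not part of EukPhylo: the function name `unmatched_trees`, the file extensions, and the rule that a tree and its FASTA share the same filename stem are all assumptions made for illustration, so adjust them to your own naming scheme.

```python
from pathlib import Path

# Hypothetical extensions (assumption): change these to match your files.
TREE_EXTS = {".tre", ".tree", ".treefile"}
FASTA_EXTS = {".fa", ".fas", ".fasta"}


def unmatched_trees(input_dir):
    """Return the names of tree files with no same-stem FASTA file in input_dir.

    Assumes a tree and its FASTA share a filename stem, e.g. gene1.tre and
    gene1.fasta. This pairing rule is an illustration, not EukPhylo's own.
    """
    files = [p for p in Path(input_dir).iterdir() if p.is_file()]
    fasta_stems = {p.stem for p in files if p.suffix.lower() in FASTA_EXTS}
    return sorted(
        p.name
        for p in files
        if p.suffix.lower() in TREE_EXTS and p.stem not in fasta_stems
    )
```

Calling `unmatched_trees("Input")` returns an empty list when every tree has a partner; any names it does return are trees to fix or remove before running the loop.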
### Running ### Running:
Basic running of the Contamination loop, with the sister mode: Basic running of the Contamination loop, with the sister mode:
`python3 Scripts/eukphylo.py --start trees --end trees --data Input --output Output --contamination_loop seq --sister_rules sister_rules_file.txt > log.out` `python3 Scripts/eukphylo.py --start trees --end trees --data Input --output Output --contamination_loop seq --sister_rules sister_rules_file.txt > log.out`
@ -215,11 +215,11 @@ Options:
| --cl_exclude_taxa | no | Any valid path | Path to a file containing taxon names present in input MSA/tree files but which should be removed in the first iteration of the contamination loop. | none | | --cl_exclude_taxa | no | Any valid path | Path to a file containing taxon names present in input MSA/tree files but which should be removed in the first iteration of the contamination loop. | none |
## Concatenation: ## Concatenation
EukPhylo includes an option to choose orthologs and produce a concatenated alignment. EukPhylo includes an option to choose orthologs and produce a concatenated alignment.
### Set up ### Set up:
* A folder called `Output` containing all outputs from the main pipeline (with Guidance, Trees, Pre-Guidance, NotGapTrimmed folders) * A folder called `Output` containing all outputs from the main pipeline (with Guidance, Trees, Pre-Guidance, NotGapTrimmed folders)
* the Scripts folder * the Scripts folder
* a list of taxa to concatenate * a list of taxa to concatenate