Updated QuickStart EukPhylo (markdown)

Adri K. Grow 2025-02-03 11:52:16 -05:00
parent 91db5cb7b0
commit a5d9cfe85a

@@ -182,7 +182,7 @@ Below are several optional ways to parameterize EukPhylo Part 2
## Contamination Removal
Contamination removal within EukPhylo (also called the Contamination Loop) removes sequences based either on Sisters/Subsisters identification or on Clades diversity. An exemplar run is available on [Figshare](https://figshare.com/articles/dataset/Examplar_runs_PhyloToL_and_CLoop/26662018).
-### Set up
+### Set up:
* An input folder (called for example Input), with both
* the treefiles
* the fasta files matching the trees
@@ -191,7 +191,7 @@ Contamination removal within EukPhylo (also called Contamination Loop) allows fo
* a txt file containing the rules for contamination removal
* the Scripts Folder
-### Running
+### Running:
Basic run of the Contamination Loop in sister mode:
`python3 Scripts/eukphylo.py --start trees --end trees --data Input --output Output --contamination_loop seq --sister_rules sister_rules_file.txt > log.out`
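For scripted or repeated runs, the same command can be assembled programmatically. The following is only a minimal sketch, not part of EukPhylo: the helper names `build_contamination_loop_command` and `missing_inputs` are hypothetical, and the flags and default paths are copied from the command above. It checks that the input folder, rules file and script exist, then prints the command line to launch.

```python
import shlex
from pathlib import Path


def build_contamination_loop_command(data_dir, output_dir, rules_file,
                                     script="Scripts/eukphylo.py"):
    """Return the argument list for the sister-mode run shown above.

    Hypothetical helper: only the flags from the documented command are used.
    """
    return [
        "python3", str(script),
        "--start", "trees",
        "--end", "trees",
        "--data", str(data_dir),
        "--output", str(output_dir),
        "--contamination_loop", "seq",
        "--sister_rules", str(rules_file),
    ]


def missing_inputs(data_dir, rules_file, script="Scripts/eukphylo.py"):
    """List the required paths (as strings) that do not exist yet."""
    required = (data_dir, rules_file, script)
    return [str(p) for p in required if not Path(p).exists()]


if __name__ == "__main__":
    problems = missing_inputs("Input", "sister_rules_file.txt")
    if problems:
        print("Missing before launch:", ", ".join(problems))
    else:
        cmd = build_contamination_loop_command(
            "Input", "Output", "sister_rules_file.txt")
        # Print the command; redirect its output to log.out when running it.
        print(shlex.join(cmd))
```

Returning the command as a list rather than a single string keeps each path as one argument, so folder names containing spaces would still be passed correctly if the list is handed to `subprocess.run`.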
@@ -215,11 +215,11 @@ Options:
| --cl_exclude_taxa | no | Any valid path | Path to a file containing taxon names present in input MSA/tree files but which should be removed in the first iteration of the contamination loop. | none |
-## Concatenation:
+## Concatenation
EukPhylo includes an option to choose orthologs and produce a concatenated alignment.
-### Set up
+### Set up:
* A folder called `Output` containing all outputs from the main pipeline (with Guidance, Trees, Pre-Guidance, NotGapTrimmed folders)
* the Scripts folder
* a list of taxa to concatenate