mirror of
http://43.156.76.180:8026/YuuMJ/EukPhylo.git
synced 2025-12-28 05:30:30 +08:00
Updated EukPhylo Part 2: MSAs, trees, and contamination loop (markdown)
parent
25b3665f36
commit
e01b7d47f5
@ -121,10 +121,11 @@ Argument | Default | Choices | Help
## Guidance
Within EukPhylo part 2, we use Guidance to assess homology within gene families. EukPhylo part 2 runs Guidance iteratively to remove non-homologous sequences, defined as those that fall below the sequence score cutoff (we note that there is some stochasticity here). Users should consult the [Guidance v2.0.2 documentation](https://taux.evolseq.net/guidance/) for details on the significance of these parameters. After inspecting a diversity of gene families, we have lowered the default sequence score cutoff from 0.6 to 0.3, though this may not be appropriate for all genes. All sequences removed by Guidance are listed in output files with their scores, and MSAs are rebuilt after each iteration. Some options are available to change the default setup for Guidance:
Within EukPhylo part 2, we use Guidance to assess homology within gene families. EukPhylo part 2 runs Guidance iteratively to remove non-homologous sequences, defined as those that fall below the sequence score cutoff (there is some stochasticity in this step). We initially wrote the tool to use Guidance v2.0.2, but have since updated it to use Guidance v2.1, which runs faster but otherwise performs very similarly. Users who wish to use the older version of Guidance will have to make a small change in guidance.py (look for a comment in the script with the phrase "UNCOMMENT THE FOLLOWING LINE IF USING v2.0.2"). Users should consult the [Guidance v2.0.2 documentation](https://taux.evolseq.net/guidance/) or the [Guidance v2.1 GitHub page](https://github.com/XseniaP/Guidance_mid/tree/main) for details on these parameters. After inspecting a diversity of gene families, we have lowered the default sequence score cutoff from 0.6 to 0.3, though this may not be appropriate for all genes. All sequences removed by Guidance are listed in output files with their scores, and MSAs are rebuilt after each iteration. Some options are available to change the default setup for Guidance:
Argument | Default | Choices | Description
-- | -- | -- | --
--guidance_path | none | A valid path | Required only when running Guidance v2.1 (not needed for Guidance v2.0.2). Path to the downloaded Guidance folder (probably called guidance_Linux or guidance_MacOS-arm64); this folder should contain a folder called "script", which contains the guidance_main.py script. You can download this folder from https://github.com/XseniaP/Guidance_mid/tree/main.
--guidance_iters | 5 | Any positive integer | Number of Guidance iterations for sequence removal.
--seq_cutoff | 0.3 | Any number between 0 and 1 | During Guidance, sequences are removed if their score is below this cutoff.
--col_cutoff | 0.0 | Any number between 0 and 1 | During Guidance, columns are removed if their score is below this cutoff.
@ -132,11 +133,6 @@ Argument | Default | Choices | Description
--keep_temp | False | include or exclude the argument | Use this to keep ALL Guidance intermediate files.
--keep_iter / -z | False | include or exclude the argument | Keep all Guidance iterations (beware: this will be very large).
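The iterative removal controlled by `--guidance_iters` and `--seq_cutoff` can be pictured with the minimal sketch below. It is not EukPhylo's code: `iterative_filter` and `toy_score` are hypothetical names, the scoring function stands in for a real Guidance run, and stopping early once no sequence falls below the cutoff is an assumption.

```python
from typing import Callable, Dict, List, Tuple


def iterative_filter(
    seqs: Dict[str, str],
    score_fn: Callable[[Dict[str, str]], Dict[str, float]],
    seq_cutoff: float = 0.3,
    max_iters: int = 5,
) -> Tuple[Dict[str, str], List[Tuple[int, str, float]]]:
    """Repeatedly score sequences and drop those scoring below seq_cutoff."""
    kept = dict(seqs)
    removed: List[Tuple[int, str, float]] = []
    for iteration in range(1, max_iters + 1):
        # In the real pipeline this step would be a Guidance run on the
        # current sequence set (the MSA is rebuilt each iteration).
        scores = score_fn(kept)
        failing = {name: s for name, s in scores.items() if s < seq_cutoff}
        if not failing:
            break  # assumption: nothing left to remove, so stop early
        for name, score in sorted(failing.items()):
            removed.append((iteration, name, score))
            del kept[name]
    return kept, removed


def toy_score(current: Dict[str, str]) -> Dict[str, float]:
    """Hypothetical stand-in for Guidance sequence scores."""
    scores = {"seqA": 0.9, "seqB": 0.8, "seqD": 0.1}
    # Pretend seqC only looks acceptable while seqD is still in the alignment.
    scores["seqC"] = 0.35 if "seqD" in current else 0.25
    return {name: scores[name] for name in current}


sequences = {"seqA": "MKV", "seqB": "MKL", "seqC": "MRV", "seqD": "QQQ"}
kept, removed = iterative_filter(sequences, toy_score)
print(sorted(kept))
print(removed)
```

In this toy run one sequence is dropped in the first iteration and another only in the second, which illustrates why scores are recomputed after each removal.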
We initially developed EukPhylo using Guidance v2.0.2, and then updated the scripts to use the newest version, Guidance v1. Some changes might be necessary depending on the version of the tool the user downloaded:
* When using Guidance v2.0.2, the tool should be placed in the Script folder of EukPhylo, and the corresponding run line should be uncommented in the contamination.py script.
* When using Guidance v1, the user should update the path to the tool in the corresponding run line and uncomment that line (in the contamination.py script).
## Gene trees
After homology assessment and building MSAs (the Guidance step), EukPhylo trims alignments and builds trees. By default, alignments are trimmed at 0.95% with trimAl, and trees are built with IQ-TREE under an LG+G model; users may choose to use a different third-party tool for phylogenetic reconstruction.
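As a rough illustration of the trimming and tree-building step, the sketch below assembles one trimAl and one IQ-TREE command line without running them. Several things here are assumptions rather than EukPhylo's actual behaviour: that the trimming cutoff corresponds to trimAl's gap-threshold option (`-gt 0.95`), that the executables are named `trimal` and `iqtree2`, and the file names, which are hypothetical.

```python
import shlex
from typing import List


def trimal_command(msa_in: str, msa_out: str, gap_threshold: float = 0.95) -> List[str]:
    """Build a trimAl call; mapping the cutoff to -gt is an assumption."""
    return ["trimal", "-in", msa_in, "-out", msa_out, "-gt", str(gap_threshold)]


def iqtree_command(msa: str, prefix: str, model: str = "LG+G") -> List[str]:
    """Build an IQ-TREE 2 call with a fixed substitution model."""
    return ["iqtree2", "-s", msa, "-m", model, "--prefix", prefix]


# Hypothetical file names for a single gene family.
trim_cmd = trimal_command("OG000001.aln.fasta", "OG000001.trimmed.fasta")
tree_cmd = iqtree_command("OG000001.trimmed.fasta", "OG000001")

# Print the commands; pass the lists to subprocess.run(..., check=True) to execute.
print(shlex.join(trim_cmd))
print(shlex.join(tree_cmd))
```

Keeping each command as a list of arguments avoids shell quoting problems if file paths contain spaces.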