Mirror of http://43.156.76.180:8026/YuuMJ/EukPhylo.git (synced 2025-12-27 07:30:24 +08:00)
updating header in concatenate.py
parent 8d48e65b7f
commit 3dc60dcd2e
@@ -1,12 +1,12 @@
 # Last updated Jan 2024
 # Authors: Auden Cote-L'Heureux and Mario Ceron-Romero
 
-# This script chooses orthologs to concatenate OGs. This can be done as part of an end-to-end PhyloToL run,
+# This script chooses orthologs to concatenate OGs. This can be done as part of an end-to-end EukPhylo run,
 # or by inputting already complete alignments and gene trees and running only the concatenation step.
 # Use the --concatenate flag to run this step, and optionally use the argument --concat_target_taxa to input
 # a file containing a list of taxon codes to be included in the concatenated alignment. If a GF has more
 # than one sequence from a taxon, a representative ortholog must be chosen to include in the concatenated alignment.
-# To do this, for each taxon PhyloToL keeps only the sequences falling in the monophyletic clade in the tree
+# To do this, for each taxon EukPhylo keeps only the sequences falling in the monophyletic clade in the tree
 # that contains the greatest number of species of the taxon’s minor clade (or major clade, if the ‘target taxon list’
 # uses major-clade codes). If multiple sequences from the taxon fall into this largest clade, then the sequence
 # with the highest ‘score’ (defined as length times k-mer coverage for transcriptomic data with k-mer coverage
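The header comment describes a tie-break in which, when several sequences from one taxon fall in the selected clade, the sequence with the highest score is kept. The sketch below illustrates only that step, for the case the comment names (score as length times k-mer coverage). The `Candidate` class, its field names, and `pick_representative` are hypothetical names chosen for illustration and are not taken from concatenate.py. How sequences without k-mer coverage are scored is not visible in this excerpt, so the sketch assumes every candidate has a coverage value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One sequence from a taxon that fell in the selected clade (hypothetical type)."""
    name: str
    length: int            # sequence length
    kmer_coverage: float   # k-mer coverage; assumed to be available for every candidate


def score(candidate: Candidate) -> float:
    # Score as described in the header comment: length times k-mer coverage.
    return candidate.length * candidate.kmer_coverage


def pick_representative(candidates: list[Candidate]) -> Candidate:
    # Keep the single highest-scoring sequence as the taxon's representative.
    if not candidates:
        raise ValueError("no candidate sequences for this taxon")
    return max(candidates, key=score)


# Illustrative data only; these sequence names and values are made up.
candidates = [
    Candidate("seq_a", length=300, kmer_coverage=12.0),
    Candidate("seq_b", length=450, kmer_coverage=5.0),
    Candidate("seq_c", length=200, kmer_coverage=20.0),
]
best = pick_representative(candidates)
print(best.name)
```

In this made-up example the shortest sequence is chosen because its coverage is high enough to give it the largest product, which shows that the score does not simply favor the longest sequence.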