From 634ac075730280c685db591946a3a2fb6f548347 Mon Sep 17 00:00:00 2001
From: Katzlab
Date: Tue, 13 Aug 2024 18:52:24 -0400
Subject: [PATCH] Updated PhyloToL Part 2: MSAs, trees, and contamination loop (markdown)

---
 PhyloToL-Part-2:-MSAs,-trees,-and-contamination-loop.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/PhyloToL-Part-2:-MSAs,-trees,-and-contamination-loop.md b/PhyloToL-Part-2:-MSAs,-trees,-and-contamination-loop.md
index 34dc9bd..2edb5c4 100644
--- a/PhyloToL-Part-2:-MSAs,-trees,-and-contamination-loop.md
+++ b/PhyloToL-Part-2:-MSAs,-trees,-and-contamination-loop.md
@@ -154,7 +154,7 @@ indicates that the a sequence from the choanoflagellate Op_ch_Dgra should be rem
 In clade-grabbing mode, each row again represents a rule. This time, there are five columns. The first column gives the target taxonomic group for which you are clade grabbing. Here you can give a ten-digit code, a subset of a code, or even the path to a text file containing a list of multiple codes if they don't all share a precise enough prefix. The third column gives the minimum number of target taxa that must be in a clade for it to be kept, and the second column gives the minimum proportion (or absolute number of >1) of taxa in that clade that are not in the target group. The fourth column allows you to give a list of 'special' taxa (or just a ten-digit code or a subset of a code), X of which must be present in a clade for it to be selected, where X is the number in the fifth column.
 For example, the line
 |Sr_ci | 0.1 | 13 | ciliate_genomes.txt | 1|
-|-|-|-|-|
+|-|-|-|-|-|
 indicates that all ciliate sequences should be removed if they don't fall in a clade with at least 13 ciliate species (unique ten digit codes beginning with Sr_ci), where no more than 1/10 of the species in the clade are non-ciliates, and containing at least 1 sequence that begins with a prefix listed in the ciliate_genomes.txt file (i.e., if you're more confident in genomic data, you may want to make sure that there's a genome in your clade).
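
To make the five-column rule in this hunk concrete, here is a minimal Python sketch of how one row might be checked against a single candidate clade. It is not PhyloToL's implementation: `CladeGrabRule`, `parse_rule`, and `clade_passes` are hypothetical names, the row is parsed in the pipe-delimited form shown on the wiki page (the real rules file may use a different delimiter), and tip names are assumed to begin with the ten-character taxon code. The second column is read as an upper bound on non-target taxa, as the worked example ("no more than 1/10") implies, and a value above 1 is treated as an absolute count. A text-file path in the first column is not handled here.

```python
from dataclasses import dataclass


@dataclass
class CladeGrabRule:
    target: str              # column 1: ten-character code or a prefix of one
    max_nontarget: float     # column 2: proportion (<= 1) or absolute count (> 1)
    min_target_taxa: int     # column 3: minimum number of distinct target taxa
    special_prefixes: tuple  # column 4: prefixes of the 'special' taxa
    min_special: int         # column 5: how many special sequences are required


def parse_rule(row, special_lists=None):
    """Parse one rule row written as in the wiki example.

    special_lists is a hypothetical stand-in for reading files such as
    ciliate_genomes.txt: it maps a file name to the prefixes it lists.
    """
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    if len(cells) != 5:
        raise ValueError(f"expected 5 columns, got {len(cells)}")
    target, nontarget, min_taxa, special, min_special = cells
    prefixes = tuple((special_lists or {}).get(special, [special]))
    return CladeGrabRule(target, float(nontarget), int(min_taxa),
                         prefixes, int(min_special))


def clade_passes(rule, tips):
    """Return True if the clade, given as a list of tip names, satisfies the rule."""
    taxa = {tip[:10] for tip in tips}  # assumption: first 10 characters are the taxon code
    if not taxa:
        return False
    target_taxa = {taxon for taxon in taxa if taxon.startswith(rule.target)}
    if len(target_taxa) < rule.min_target_taxa:
        return False
    nontarget = len(taxa) - len(target_taxa)
    if rule.max_nontarget > 1:
        within_limit = nontarget <= rule.max_nontarget              # absolute count
    else:
        within_limit = nontarget / len(taxa) <= rule.max_nontarget  # proportion
    if not within_limit:
        return False
    special = sum(1 for tip in tips if tip.startswith(rule.special_prefixes))
    return special >= rule.min_special


rule = parse_rule("|Sr_ci | 0.1 | 13 | ciliate_genomes.txt | 1|",
                  special_lists={"ciliate_genomes.txt": ["Sr_ci_T000"]})
# Made-up tip names: 13 ciliate taxa plus one choanoflagellate sequence.
clade = [f"Sr_ci_T{i:03d}_seq1" for i in range(13)] + ["Op_ch_Dgra_seq1"]
print(clade_passes(rule, clade))
```

Under this reading, ciliate sequences that never land in a clade passing the check are the ones the rule would remove.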