From 4cc45c68db4c84997a64baf65f6861c395d83471 Mon Sep 17 00:00:00 2001
From: "Adri K. Grow" <42044618+adriannagrow@users.noreply.github.com>
Date: Tue, 19 Aug 2025 16:47:29 -0400
Subject: [PATCH] Updated EukPhylo Part 2: MSAs, trees, and contamination loop (markdown)

---
 EukPhylo-Part-2:-MSAs,-trees,-and-contamination-loop.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/EukPhylo-Part-2:-MSAs,-trees,-and-contamination-loop.md b/EukPhylo-Part-2:-MSAs,-trees,-and-contamination-loop.md
index ef7ba86..87e601c 100644
--- a/EukPhylo-Part-2:-MSAs,-trees,-and-contamination-loop.md
+++ b/EukPhylo-Part-2:-MSAs,-trees,-and-contamination-loop.md
@@ -145,7 +145,7 @@ NOTE: These processes are resource-intensive. Each system has its own syntax and
 
 ## Contamination loop
 
-The contamination coop (CL) is implemented within EukPhylo to allow the removal of contaminants based on the topology of each tree (phylogeny-informed contamination removal). Three modes are available: sister-, subsister-, and clade-based contamination removal. All modes take a user defined file of 'rules,' used to identify the sequences to remove. We first provide an overview of the three modes and then give details on running below.
+The contamination loop (CL) is implemented within EukPhylo to allow the removal of contaminants based on the topology of each tree (phylogeny-informed contamination removal). Three modes are available: sister-, subsister-, and clade-based contamination removal. All modes take a user defined file of 'rules,' used to identify the sequences to remove. We first provide an overview of the three modes and then give details on running below.
 
 **Sisters-based contamination removal** identifies sequences as putative contaminants based on their sister relationships. If a sequence from sample A appears on a tree sister to a sequence from sample B, and sample B is known to have contaminated sample A, then the sequence from sample A will be removed.
 **Subsisters-based removal** operates similarly, but looks at the taxa that are sister to sample A's _parent_ node, useful for when multiple samples are contaminated by the same other sample.