diff --git a/PhyloToL-Part-2.md b/PhyloToL-Part-2.md
index 6a31b4e..dbb3e28 100644
--- a/PhyloToL-Part-2.md
+++ b/PhyloToL-Part-2.md
@@ -1,5 +1,9 @@
 # Overview and Modularity
 
+# Databases
+
+We provide a diverse database of 1,000 genomes and transcriptomes from across the eukaryotic, bacterial, and archaeal tree of life, with a focus on microeukaryotic diversity. This database is in the form of "ReadyToGo" files, the output of PhyloToL part 1. This means that using this dataset, you can jump right in to running analyses of any subset of these taxa using any of the OGs in the Hook Database (if you want to add your own samples or use a different set of OGs, you should look at [PhyloToL part 1](https://github.com/Katzlab/PhyloToL-6/wiki/PhyloToL-Part-1))
+
 # Overlap and similarity filters
 
 # Guidance
@@ -7,3 +11,5 @@
 # Gene trees
 
 # Contamination loop
+
+The Contamination Loop is implemented within PhyloToL to allow the removal of contaminants based on the topology of each tree (= Phylogenetic based contamination removal). 3 modes are available and described in this section: ‘sister’, ‘subsister’ and ‘clade’. All modes take a user defined rules file to identify the sequences to remove.