Updated Home (markdown)

Katzlab 2024-08-09 14:50:17 -04:00
parent 90e8610b1c
commit 261f40c144

@@ -1,6 +1,8 @@
# Overview:
The core PhyloToL pipeline comprises two main components, which we refer to as PhyloToL part one and part two. PhyloToL part one takes input sequences from a whole genome or transcriptome assembly, applies several curation steps, and performs an initial homology assessment against a customizable database of reference sequences to assign gene families (GFs). Part one outputs a FASTA file of curated nucleotide and amino acid sequences with GFs assigned, as well as a dataset of descriptive statistics (e.g. length, coverage, and composition) for each input sample. PhyloToL part two is highly modular: for a given selection of taxa and GFs, it stringently assesses homology by iterating the tool Guidance (Penn et al., 2010; Sela et al., 2015), which outputs a multiple sequence alignment (MSA) for each GF. It then builds gene trees from these MSAs and applies an innovative workflow for tree topology-based contamination removal.
We also provide
# Table of Contents
## PhyloToL Part I = determining gene families