From e141f9e601a5f33f16246e9dfcd336b19b482b2e Mon Sep 17 00:00:00 2001
From: Katzlab <katzlab@smith.edu>
Date: Fri, 9 Aug 2024 17:53:17 -0400
Subject: [PATCH] Updated Utilities (markdown)

---
 Utilities.md | 62 ++++++++++++++++++++++++++--------------------------
 1 file changed, 31 insertions(+), 31 deletions(-)

diff --git a/Utilities.md b/Utilities.md
index 11fe868..66479f3 100644
--- a/Utilities.md
+++ b/Utilities.md
@@ -1,33 +1,33 @@
-PhyloToL 6 includes a set of stand-alone utility scripts that aim to increase the power of the analysis done with or without the core PhyloToL pipeline. We divide these scripts into five main categories: basic statistics, composition tools, MSA tools, gene tree description, and contamination removal.
+PhyloToL 6 includes a set of stand-alone utility scripts that aim to increase the power of the analysis done with or without the core PhyloToL pipeline. We divide these scripts into five main categories: basic statistics, composition tools, MSA tools, gene tree description, and contamination removal.
+
 A summary of some of the scripts is divided by category here
 
-| [Script name](https://github.com/Katzlab/PhyloToL-6/tree/main/Utilities) | Intent                                                                             | Output                                                                     |
-| ------------------------------------------------------------------------ | ---------------------------------------------------------------------------------- | -------------------------------------------------------------------------- |
-| Assess_transcriptomes_v2.0.py                                            | Calculates the length, GC content, and coverage of assembled files                 | Spreadsheet containing the length, coverage, and GC of each transcript.    |
-| Cluster_v2.0.py                                                          | Clusters sequences in a fasta file                                                 | Clustered fasta files                                                      |
-| GetTaxonomy_v1.0.py                                                      | Collects taxonomic classification of organisms from NCBI                           | Spreadsheet with NCBI taxonomy                                             |
-| GetUniqueTaxa_v1.0.py                                                    | Gets the unique taxa from a taxonomic classification                               | Spreadsheet with unique taxa                                               |
-| Plot_transcriptomes_v2.0.py                                              | Plots the length, coverage, and GC distribution of transcriptomes.                 | Plots of transcripts distribution.                                         |
-| QuerySRA_v1.0.py                                                         | Downloads assemblies from NCBI                                                     | Assemblies, IDs, and GCA or SRR codes.                                     |
-| ReadMapping_v2.0.py                                                      | Maps a group of trimmed reads to a reference                                       | Sam/Bam files.                                                             |
-| SeqLenToCsv_v1.0.py                                                      | Calculates the length of DNA sequences in fasta files                              | Spreadsheet containing the length of all sequences.                        |
-| SharedOGs_v1.0.py                                                        | Summarizes the gene family presence in fasta files                                 | Spreadsheet with the gene families                                         |
-|                                                                          |                                                                                    |                                                                            |
-|                                                                          |                                                                                    |                                                                            |
-| CUB_v2.1.py                                                              | Summarizes the nucleotide composition of fasta files                               | Fasta file and several spreadsheets summarizing the nucelotide composition |
-| GC_identifier_v1.0.py                                                    | Renames sequence ID by GC composition                                              | Fasta files with relabeled sequence ID                                     |
-| PlotComps_v2.0.r                                                         | Produces GC3 width plots                                                           | GC3 width plots                                                            |
-| Plotcomps_SppName_v1.0.R                                                 | Produces GC3 width plots with the species name and # seqs added to each plot       | GC3 width plots                                                            |
-|                                                                          |                                                                                    |                                                                            |
-|                                                                          |                                                                                    |                                                                            |
-| BacktranslateAlignment.py                                                | Produces new nucleotide alignment from an amino acid alignment                     | Aligned nucelotide file                                                    |
-| CountTaxonOccurence_v2.0.py                                              | Counts the occurences of each taxa in each gene family of a post guidance file     | Spreadsheet with counts of taxa                                            |
-| friendlessness_v2.0.py                                                   | Describes the internal regions of insertion unique or nearly unique to a sequence  | Spreadsheet with each sequence statistics                                  |
-| Gappiness_v2.0.py                                                        | Produces statistics on the terminal and internal gaps of an alignment              | Spreadsheet with the paralogs statistics                                   |
-| GuidanceWrapper_v2.1.py                                                  | Guidance wrapper that can be used in place of PhyloToL pipeline                    | Guidanced alignment files                                                  |
-|                                                                          |                                                                                    |                                                                            |
-|                                                                          |                                                                                    |                                                                            |
-| CladeSizes_v2.0.py                                                       | Describes clade sizes for different taxonomic groups                               | Spreadsheet describing clade sizes                                         |
-| ColorByClade_v2.1.py                                                     | Visualizes placement of taxa by taxonomic group in trees                           | Colored trees                                                              |
-| ContaminationBySisters_v2.2.py                                           | Summarizes the taxonomic distribution of sister sequences for each taxon in a tree | Two spreadsheets summarizing tree tips relationship                        |
-| RenameTips_v1.0.py                                                       | Renames the tip labels of trees to include metadata such as location and date      | Renamed trees                                                              |
\ No newline at end of file
+| Category                      | [Script name](https://github.com/Katzlab/PhyloToL-6/tree/main/Utilities) | Intent                                                                             | Output                                                                     |
+| ----------------------------- | ------------------------------------------------------------------------ | ---------------------------------------------------------------------------------- | -------------------------------------------------------------------------- |
+| Assembly and fasta tools      | Assess_transcriptomes_v2.0.py                                            | Calculates the length, GC content, and coverage of assembled files                 | Spreadsheet containing the length, coverage, and GC of each transcript.    |
+|                               | Cluster_v2.0.py                                                          | Clusters sequences in a fasta file                                                 | Clustered fasta files                                                      |
+|                               | GetTaxonomy_v1.0.py                                                      | Collects taxonomic classification of organisms from NCBI                           | Spreadsheet with NCBI taxonomy                                             |
+|                               | GetUniqueTaxa_v1.0.py                                                    | Gets the unique taxa from a taxonomic classification                               | Spreadsheet with unique taxa                                               |
+|                               | Plot_transcriptomes_v2.0.py                                              | Plots the length, coverage, and GC distribution of transcriptomes.                 | Plots of transcript distributions.                                         |
+|                               | QuerySRA_v1.0.py                                                         | Downloads assemblies from NCBI                                                     | Assemblies, IDs, and GCA or SRR codes.                                     |
+|                               | ReadMapping_v2.0.py                                                      | Maps a group of trimmed reads to a reference                                       | Sam/Bam files.                                                             |
+|                               | SeqLenToCsv_v1.0.py                                                      | Calculates the length of DNA sequences in fasta files                              | Spreadsheet containing the length of all sequences.                        |
+|                               | SharedOGs_v1.0.py                                                        | Summarizes the gene family presence in fasta files                                 | Spreadsheet with the gene families                                         |
+|                               |                                                                          |                                                                                    |                                                                            |
+| Sequence composition analysis | CUB_v2.1.py                                                              | Summarizes the nucleotide composition of fasta files                               | Fasta file and several spreadsheets summarizing the nucleotide composition |
+|                               | GC_identifier_v1.0.py                                                    | Renames sequence ID by GC composition                                              | Fasta files with relabeled sequence ID                                     |
+|                               | PlotComps_v2.0.r                                                         | Produces GC3 width plots                                                           | GC3 width plots                                                            |
+|                               | Plotcomps_SppName_v1.0.R                                                 | Produces GC3 width plots with the species name and # seqs added to each plot       | GC3 width plots                                                            |
+|                               |                                                                          |                                                                                    |                                                                            |
+|                               | BacktranslateAlignment.py                                                | Produces new nucleotide alignment from an amino acid alignment                     | Aligned nucleotide file                                                    |
+|                               | CountTaxonOccurence_v2.0.py                                              | Counts the occurrences of each taxon in each gene family of a post guidance file   | Spreadsheet with counts of taxa                                            |
+|                               | friendlessness_v2.0.py                                                   | Describes the internal regions of insertion unique or nearly unique to a sequence  | Spreadsheet with statistics for each sequence                              |
+|                               | Gappiness_v2.0.py                                                        | Produces statistics on the terminal and internal gaps of an alignment              | Spreadsheet with the paralogs statistics                                   |
+|                               | GuidanceWrapper_v2.1.py                                                  | Guidance wrapper that can be used in place of PhyloToL pipeline                    | Guidanced alignment files                                                  |
+|                               |                                                                          |                                                                                    |                                                                            |
+| Gene tree description         | CladeSizes_v2.0.py                                                       | Describes clade sizes for different taxonomic groups                               | Spreadsheet describing clade sizes                                         |
+|                               | ColorByClade_v2.1.py                                                     | Visualizes placement of taxa by taxonomic group in trees                           | Colored trees                                                              |
+|                               | ContaminationBySisters_v2.2.py                                           | Summarizes the taxonomic distribution of sister sequences for each taxon in a tree | Two spreadsheets summarizing tree tip relationships                        |
+|                               | RenameTips_v1.0.py                                                       | Renames the tip labels of trees to include metadata such as location and date      | Renamed trees                                                              |
+|                               |                                                                          |                                                                                    |                                                                            |
+| Stand-alone clade grabbing    | CladeGrabbing_v2.1.py                                                    | Selects clades of interest from trees using taxonomic specifications               | Phylogenetic trees                                                         |
\ No newline at end of file