From 2e507ecf46a6ae84527c2c0620a52b013492e4b0 Mon Sep 17 00:00:00 2001
From: Godwin Ani <aniigodwinn@gmail.com>
Date: Mon, 12 Aug 2024 15:00:14 -0400
Subject: [PATCH] Updated PhyloToL Part 1: GF assignment (markdown)

---
 PhyloToL-Part-1:-GF-assignment.md | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/PhyloToL-Part-1:-GF-assignment.md b/PhyloToL-Part-1:-GF-assignment.md
index 9e9b15f..528f550 100644
--- a/PhyloToL-Part-1:-GF-assignment.md
+++ b/PhyloToL-Part-1:-GF-assignment.md
@@ -96,12 +96,23 @@ Role of each script
 To process transcriptomes, run:
 
 `python Scripts/wrapper.py -1 1 -2 7 --assembled_transcripts AssembledTranscripts --output . --genetic_code Universal -d Databases > log.txt`
-* -1 = start script
-* -2 = end script
-* --assembled_transcripts = Folder with Assembled transcripts in fasta format 
-* --output = path to output folder
-* --genetic_code = specified genetic code, name of .txt file with Genetic codes
-* -d = path to Databases folder 
+
+| Parameter  | Description| 
+| ----------- | ----------------- |
+| -1, --first_script  | First script to run | 
+| -2, --last_script  | Last script to run | 
+| -a, --assembled_transcripts  | Path to a folder of transcripts assembled by rnaSPAdes. Each file name should start with a unique 10-digit code and end in "_assembledTranscripts.fasta", e.g. Op_me_hsap_assembledTranscripts.fasta | 
+| -d, --databases  | Path to databases folder | 
+| -o, --output  | An "Output" folder will be created in this directory to contain all output files. By default, this folder is created in the parent directory of the Scripts folder | 
+| -x, --xplate_contam  | Run cross-plate contamination removal (includes all files) | 
+| -g, --genetic_code  | If all of your taxa use the same genetic code, you may enter it here (to be used in script 5). Alternatively, if your taxa use a variety of genetic codes and you know which code applies to each, you may give the path to a .txt or .tsv file with two tab-separated columns: the first with the 10-digit codes and the second with the corresponding genetic codes | 
+| -n, --conspecific_names  | A .txt or .tsv file with two tab-separated columns: the first with the 10-digit codes, the second with species or other identifying names. This is used to determine which sequences to remove (only between "species") in cross-plate contamination assessment. | 
+| -min, --minlen  | Minimum transcript length | 
+| -max, --maxlen  | Maximum transcript length | 
+| -c, --seq_count  | Minimum number of sequences after assigning OGs | 
+
+ 
+
 * \>log.txt = if added to the end of the command, it will output a log file with progress, warning, or error messages
 * *For running with cross plate contamination removal, add `-x -n Conspecific.txt` to the line of code.