Updated PhyloToL Part 1: GF assignment (markdown)

2025-12-29 16:20:24 +08:00 · 2024-08-12 15:00:14 -04:00 · 2024-08-12 15:00:14 -04:00 · 2e507ecf46
commit 2e507ecf46
parent af123c3212
1 changed files with 17 additions and 6 deletions
--- a/PhyloToL-Part-1:-GF-assignment.md
+++ b/PhyloToL-Part-1:-GF-assignment.md
@@ -96,12 +96,23 @@ Role of each script
 To process transcriptomes, run:
 `python Scripts/wrapper.py -1 1 -2 7 --assembled_transcripts AssembledTranscripts --output . --genetic_code Universal -d Databases > log.txt`
-* -1 = start script
-* -2 = end script
-* --assembled_transcripts = Folder with Assembled transcripts in fasta format 
-* --output = path to output folder
-* --genetic_code = specified genetic code, name of .txt file with Genetic codes
-* -d = path to Databases folder 
+
+| Parameter  | Description |
+| ----------- | ----------------- |
+| -1, --first_script  | First script to run | 
+| -2, --last_script  | Last script to run | 
+| -a, --assembled_transcripts  | Path to a folder of transcripts assembled by rnaSPAdes. Each file name should start with a unique 10-digit code and end in "_assembledTranscripts.fasta", e.g. Op_me_hsap_assembledTranscripts.fasta | 
 | -d, --databases  | Path to databases folder | 
 | -o, --output  | An "Output" folder will be created at this directory to contain all output files. By default this folder will be created at the parent directory of the Scripts folder | 
 | -x, --xplate_contam  | Run cross-plate contamination removal (includes all files) | 
 | -g, --genetic_code  | If all of your taxa use the same genetic code, you may enter it here (to be used in script 5). Alternatively, if you need a variety of genetic codes and know which ones to use, give the path to a .txt or .tsv file with two tab-separated columns: the first with the ten-digit codes and the second with the corresponding genetic codes | 
 | -n, --conspecific_names  | A .txt or .tsv file with two tab-separated columns; the first should have 10 digit codes, the second species or other identifying names. This is used to determine which sequences to remove (only between "species") in cross-plate contamination assessment. | 
 | -min, --minlen  | Minimum transcript length | 
 | -max, --maxlen  | Maximum transcript length | 
 | -c, --seq_count  | Minimum number of sequences after assigning OGs | 
 * \>log.txt = if added to the end of the command, it will output a log file with progress, warning, or error messages
 * To run with cross-plate contamination removal, add `-x -n Conspecific.txt` to the command.
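
The file-naming rule for `--assembled_transcripts` and the two-column file accepted by `--genetic_code` and `--conspecific_names` can be checked before launching the wrapper. The following is a minimal sketch, not part of PhyloToL: the helper names (`taxon_code`, `collect_codes`, `two_column_table`) are hypothetical, and it assumes the "10 digit code" is simply the first ten characters of the file name.

```python
SUFFIX = "_assembledTranscripts.fasta"
CODE_LENGTH = 10  # assumption: the ten-digit code is the first ten characters


def taxon_code(file_name):
    """Return the leading ten-character code of a transcript file name."""
    if not file_name.endswith(SUFFIX):
        raise ValueError(f"{file_name!r} does not end in {SUFFIX!r}")
    if len(file_name) < CODE_LENGTH + len(SUFFIX):
        raise ValueError(f"{file_name!r} is too short to hold a ten-character code")
    return file_name[:CODE_LENGTH]


def collect_codes(file_names):
    """Map each code to its file name, rejecting codes that are not unique."""
    seen = {}
    for name in file_names:
        code = taxon_code(name)
        if code in seen:
            raise ValueError(f"code {code!r} is shared by {seen[code]!r} and {name!r}")
        seen[code] = name
    return seen


def two_column_table(mapping):
    """Render code/value pairs as tab-separated lines, one pair per line."""
    return "".join(f"{code}\t{value}\n" for code, value in mapping.items())


if __name__ == "__main__":
    names = ["Op_me_hsap_assembledTranscripts.fasta"]
    codes = collect_codes(names)
    # "Universal" is the genetic code used in the example command above.
    print(two_column_table({code: "Universal" for code in codes}), end="")
```

The text returned by `two_column_table` could be saved as a .txt or .tsv file and passed to `-g`; the same layout, with species names in the second column, matches what the table describes for `-n`.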