Updated PhyloToL Part 1: GF assignment (markdown)

2025-12-29 10:10:25 +08:00 · 2024-08-12 15:25:28 -04:00 · 2024-08-12 15:25:28 -04:00 · 0873758f6e
commit 0873758f6e
parent d74deb013a
1 changed files with 14 additions and 14 deletions
--- a/PhyloToL-Part-1:-GF-assignment.md
+++ b/PhyloToL-Part-1:-GF-assignment.md
@@ -97,19 +97,19 @@ To process transcriptomes, run:

 `python Scripts/wrapper.py -1 1 -2 7 --assembled_transcripts AssembledTranscripts --output . --genetic_code Universal -d Databases > log.txt`

-| Parameter  | Description| 
-| ----------- | ----------------- |
-| -1, --first_script  | First script to run | 
-| -2, --last_script  | Last script to run | 
-| -a, --assembled_transcripts  | Path to a folder of assembled transcripts, assembled by rnaSPAdes. Each assembled transcript file name should start with a unique 10 digit code, and end in "_assembledTranscripts.fasta", E.g. Op_me_hsap_assembledTranscripts.fasta | 
-| -d, --databases  | Path to databases folder | 
-| -o, --output  | An "Output" folder will be created at this directory to contain all output files. By default this folder will be created at the parent directory of the Scripts folder | 
-| -x, --xplate_contam  | Run cross-plate contamination removal (includes all files) | 
-| -g, --genetic_code  | If all of your taxa use the same genetic code, you may enter it here (to be used in script 5). Alternatively, if you need to use a variety of genetic codes but know which codes to use, you may fill give here the path to a .txt or .tsv with two tab-separated columns, the first with the ten-digit codes and the second column with the corresponding genetics codes | 
-| -n, --conspecific_names  | A .txt or .tsv file with two tab-separated columns; the first should have 10 digit codes, the second species or other identifying names. This is used to determine which sequences to remove (only between "species") in cross-plate contamination assessment. | 
-| -min, --minlen  | Minimum transcript length | 
-| -max, --maxlen  | Maximum transcript length | 
-| -c, --seq_count  | minimum number of sequences after assigning OGs | 
+| Parameter | Type| Options| Description|
+| ----------- | ----------------- |----------- | ----------------- |
+| --first_script |int |1, 2, 3, 4, 5, 6 | First script to run | 
+| --last_script  |int|1, 2, 3, 4, 5, 6, 7 | Last script to run | 
+| --assembled_transcripts  |str|Path to a folder of assembled transcripts, assembled by rnaSPAdes. | Each assembled transcript file name should start with a unique 10-digit code and end in "_assembledTranscripts.fasta", e.g. Op_me_hsap_assembledTranscripts.fasta | 
+| --databases| str| Path to databases folder | The folder should contain all 3 databases|
+| --output|str|Path for the output files | An "Output" folder will be created at this directory to contain all output files. By default this folder will be created at the parent directory of the Scripts folder |
+|--xplate_contam |-|- | Run cross-plate contamination removal (includes all files) | 
+| --genetic_code  |str|A .txt or .tsv with two tab-separated columns, the first with the ten-digit codes and the second with the corresponding genetic codes| If all of your taxa use the same genetic code, you may enter it here. Alternatively, if your taxa use a variety of genetic codes and you know which code applies to each, give the path to such a file here. | 
+|--conspecific_names  |str| A .txt or .tsv file with two tab-separated columns; the first should have 10 digit codes, the second species or other identifying names|This is used to determine which sequences to remove (only between "species") in cross-plate contamination assessment. | 
+| --minlen |int| -| Minimum transcript length | 
+| --maxlen  |int|-| Maximum transcript length | 
+| --seq_count  |int|-| Minimum number of sequences after assigning OGs | 

 

@@ -128,7 +128,7 @@ Role of each script
 | --first_script| int |  1, 2, 3, 4 | First script to run |
 | --last_script | int | 2, 3, 4, 5 | Last script to run|
 | --cds| str|Path to a folder of nucleotide CDS| Each file name should start with a unique 10 digit code, and end in "_GenBankCDS.fasta", E.g. Op_me_hsap_GenBankCDS.fasta|
-| --output| = str|Path for the output files | An "Output" folder will be created at this directory to contain all output files. By default this folder will be created at the parent directory of the Scripts folder |
+| --output|str|Path for the output files | An "Output" folder will be created at this directory to contain all output files. By default this folder will be created at the parent directory of the Scripts folder |
 | --genetic_code| str| Path to a file, Universal | If all of your taxa use the same genetic code, you may enter it here|
 | --databases| str| Path to databases folder | The folder should contain all 3 databases|