Updated PhyloToL Part 1: GF assignment (markdown)

2025-12-28 20:50:25 +08:00 · 2024-08-12 16:14:23 -04:00 · 2024-08-12 16:14:23 -04:00 · d0859d044c
commit d0859d044c
parent a766fee515
1 changed files with 8 additions and 22 deletions
--- a/PhyloToL-Part-1:-GF-assignment.md
+++ b/PhyloToL-Part-1:-GF-assignment.md
@@ -92,7 +92,7 @@ Available parameters are:
 | --maxlen  |int|-| Maximum transcript length | 
 | --seq_count  |int|-| minimum number of sequences after assigning OGs | 

-To run the PhyloToL part 1 for both processing transcriptomes and removing sequences that resulted from index switching (cross plate contamination), run: 
+#### To run PhyloToL part 1 for both processing transcriptomes and removing sequences that resulted from index switching (cross-plate contamination), run:
 `python Scripts/wrapper.py --first_script 1 --last_script 7 --assembled_transcripts AssembledTranscripts --output . --genetic_code Gcode.txt --databases Databases --xplate_contam --conspecific_names Conspecific.txt > log.txt`


@@ -100,8 +100,13 @@ To run the PhyloToL part 1 for both processing transcriptomes and removing seque
 ### Processing genomes
 Role of each script
 <img src="https://github.com/Katzlab/PhyloToL-6/blob/main/Other/PTL1_Processing_Genomes_scripts.png" width="100%">
-* **Main inputs** : A folder containing the [CDS](https://github.com/Katzlab/PhyloToL-6/tree/main/PTL1/Genomes/TestData), a folder containing the Databases, and a folder containing the [Scripts](https://github.com/Katzlab/PhyloToL-6/tree/main/PTL1/Genomes/Scripts).
-* **Outputs** : ReadyToGo files (AA and NTD) and taxon summary.
+
+* **Main inputs** : a folder containing the assembled [CDS](https://github.com/Katzlab/PhyloToL-6/tree/main/PTL1/Genomes/TestData), a folder containing the Databases with three subfolders (db_BvsE, used to identify likely-bacterial sequences; db_StopFreq, used for stop codon assignment; and db_OG, which must be the current version of the hook), and a folder containing the [Scripts](https://github.com/Katzlab/PhyloToL-6/tree/main/PTL1/Genomes/Scripts).
+* **Outputs** : ReadyToGo files, which contain the nucleotide and amino acid sequences of each taxon, and a summary of the sequences processed for each taxon.
+
+#### To run PhyloToL part 1 for processing genomes, run:
+ `python Scripts/wrapper.py --first_script 1 --last_script 5 --cds CDS --output . --genetic_code Gcode.txt --databases Databases > log.txt`
+


 | Parameter | Type| Options| Description|
@@ -115,22 +120,3 @@ Role of each script



-* *Optional inputs : Gcodes.txt and Conspecific.txt
-* *The Gcodes.txt is a tab separated txt file containing the genetic code of the taxa. This will most likely not be needed except for some organisms like ciliates.
-* _Example:_
-
-| Taxa  | Genetic code | 
-| ----------- | --------- |
-| EE_uc_Me03  | Universal | 
-| Sr_ci_Arsp  | Ciliate | 
-| Sr_ci_Cpol  | Peritrich | 
-| Sr_ci_Bjap  | Blepharisma | 
-
-* *The Conspecific.txt is similar and is only needed for cross plate contamination removal. 
-* _Example:_
-
-| Taxa  | plate| 
-| ----------- | ----------------- |
-| EE_uc_Me03  | Metatranscriptome | 
-| EE_uc_Me04  | Metatranscriptome | 
-| EE_uc_Me05  | Metatranscriptome |