diff --git a/PhyloToL-Part-1:-GF-assignment.md b/PhyloToL-Part-1:-GF-assignment.md
index 81f47fd..7d564c6 100644
--- a/PhyloToL-Part-1:-GF-assignment.md
+++ b/PhyloToL-Part-1:-GF-assignment.md
@@ -92,7 +92,7 @@ Available parameters are:
 | --maxlen |int|-| Maximum transcript length |
 | --seq_count |int|-| minimum number of sequences after assigning OGs |
 
-To run the PhyloToL part 1 for both processing transcriptomes and removing sequences that resulted from index switching (cross plate contamination), run:
+#### To run the PhyloToL part 1 for both processing transcriptomes and removing sequences that resulted from index switching (cross plate contamination), run:
 
 `python Scripts/wrapper.py --first_script 1 --last_script 7 --assembled_transcripts AssembledTranscripts --output . --genetic_code Gcode.txt --databases Databases --xplate_contam --conspecific_names Conspecific.txt > log.txt`
 
@@ -100,8 +100,13 @@ To run the PhyloToL part 1 for both processing transcriptomes and removing seque
 
 ### Processing genomes
 Role of each script
-* **Main inputs** : A folder containing the [CDS](https://github.com/Katzlab/PhyloToL-6/tree/main/PTL1/Genomes/TestData), a folder containing the Databases, and a folder containing the [Scripts](https://github.com/Katzlab/PhyloToL-6/tree/main/PTL1/Genomes/Scripts).
-* **Outputs** : ReadyToGo files (AA and NTD) and taxon summary.
+
+* **Main inputs** : The main inputs for processing genomes are: a folder containing the assembled [CDS](https://github.com/Katzlab/PhyloToL-6/tree/main/PTL1/Genomes/TestData), a folder containing the Databases with three subfolders(db_BvsE (how we ID likely-bacterial sequences), db_StopFreq (for stop codon assignment), and db_OG (this must be current version of the hook)), and a folder containing the [Scripts](https://github.com/Katzlab/PhyloToL-6/tree/main/PTL1/Genomes/Scripts).
+* **Outputs** : The main outputs after processing genomes are: ReadyToGo files which contain the nucleotide and amino acid sequences of each taxa, and a summary information of the sequences processed for each taxa.
+
+#### To run the PhyloToL part 1 for processing genomes, run:
+
 `python Scripts/wrapper.py --first_script 1 --last_script 5 --cds CDS --output . --genetic_code Gcode.txt --databases Databases > log.txt`
+
 
 | Parameter | Type| Options| Description|
@@ -115,22 +120,3 @@ Role of each script
 
 
 
-* *Optional inputs : Gcodes.txt and Conspecific.txt
-* *The Gcodes.txt is a tab separated txt file containing the genetic code of the taxa. This will most likely not be needed except for some organisms like ciliates.
-* _Example:_
-
-| Taxa | Genetic code |
-| ----------- | --------- |
-| EE_uc_Me03 | Universal |
-| Sr_ci_Arsp | Ciliate |
-| Sr_ci_Cpol | Peritrich |
-| Sr_ci_Bjap | Blepharisma |
-
-* *The Conspecific.txt is similar and is only needed for cross plate contamination removal.
-* _Example:_
-
-| Taxa | plate|
-| ----------- | ----------------- |
-| EE_uc_Me03 | Metatranscriptome |
-| EE_uc_Me04 | Metatranscriptome |
-| EE_uc_Me05 | Metatranscriptome |
\ No newline at end of file