Updated PhyloToL Part 1: GF assignment (markdown)

Katzlab 2024-08-13 14:54:32 -04:00
parent bc2cedb913
commit 8651d736f2

@ -115,13 +115,11 @@ To run this step, you will need to add the '--xplate_contam' flag to the command
## Processing genomes
<img src="https://github.com/Katzlab/PhyloToL-6/blob/main/Other/PTL1_Processing_Genomes_scripts.png" width="100%">
Running PhyloToL Part 1 on genomes requires at least three items in your main directory: 1) a folder named Scripts containing all **[Scripts](https://github.com/Katzlab/PhyloToL-6/tree/main/PTL1/Genomes/Scripts)** from the PhyloToL Part 1 GitHub repository, 2) a folder containing your **[CDS](https://github.com/Katzlab/PhyloToL-6/tree/main/PTL1/Genomes/TestData)** (as described above), and 3) a folder containing the **Databases**, with three subfolders: db_BvsE (how we ID likely-bacterial sequences), db_StopFreq (for stop codon assignment), and db_OG (the hook database, as described above). By default, the scripts start with your **CDS** and produce **ReadyToGo files** (nucleotide and amino acid sequences) for each taxon, along with **summary information** on the sequences processed for those taxa. PhyloToL Part 1 uses a different but similar set of scripts to process input genomic CDSs (as opposed to assembled transcripts). The setup is the same as described above, and the run is similar to the one described above for transcriptomes, with two differences that make the process simpler: there is no option to remove contamination resulting from index hopping, and there is no need to identify the genetic codes to use in translation (start/stop codon positions are already known). We recommend always running scripts 1 through 5 in a single run, as follows:
* To run PhyloToL Part 1 on genomes, use:
`python Scripts/wrapper.py --first_script 1 --last_script 5 --cds CDS --output . --genetic_code Gcode.txt --databases Databases > log.txt`
`python Scripts/wrapper.py --first_script 1 --last_script 5 --cds CDS --genetic_code Gcode.txt --databases Databases > log.txt`
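Before launching the wrapper, it can help to confirm that the main directory has the layout described above. The following is a minimal sketch, not part of PhyloToL: the function name `missing_items` is hypothetical, and the folder names `CDS` and `Databases` are assumed to match the values passed to `--cds` and `--databases` in the command. It only checks that the folders exist, not that their contents are valid.

```python
from pathlib import Path

# Database subfolders named in the setup description above.
REQUIRED_DB_SUBFOLDERS = ("db_BvsE", "db_StopFreq", "db_OG")


def missing_items(main_dir, cds="CDS", databases="Databases"):
    """Return the expected folders (relative to main_dir) that are absent.

    Hypothetical helper: `cds` and `databases` are assumed to be the same
    folder names passed to --cds and --databases.
    """
    root = Path(main_dir)
    expected = [root / "Scripts", root / cds, root / databases]
    expected += [root / databases / sub for sub in REQUIRED_DB_SUBFOLDERS]
    # Keep only the paths that are not existing directories.
    return [str(p.relative_to(root)) for p in expected if not p.is_dir()]


if __name__ == "__main__":
    missing = missing_items(".")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("Layout looks complete.")
```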
The parameter options are:
| Parameter | Type | Options | Description |
| ----------- | ----------------- | ----------- | ----------------- |