Updated PhyloToL Part 1 (markdown)

2026-02-12 01:00:24 +08:00 · 2024-08-09 16:39:12 -04:00 · 2024-08-09 16:39:12 -04:00 · 84d3e4cb6a
commit 84d3e4cb6a
parent d6ab88cf13
1 changed files with 12 additions and 1 deletions
--- a/PhyloToL-Part-1.md
+++ b/PhyloToL-Part-1.md
@@ -39,7 +39,18 @@ At this point, you are ready to run the code! See the [Processing transcriptomes
 #### Genomes
-PhyloToL part 1 for genomes takes as input genomic CDS, such as are available to download for many genome assemblies on GenBank
+PhyloToL part 1 for genomes takes as input genomic CDS, such as are available to download for many genome assemblies on GenBank. Similarly to the transcriptome setup above, each input file must be named in the format 
 >Op_me_Hsap_GenBankCDS.fasta
 with the first ten characters representing a unique sample identifier. Each sequence in the CDS fasta file should be formatted as downloaded from GenBank:
 >\>lcl|NC_000001.11_cds_NP_001005484.2_1 [gene=OR4F5] [db_xref=CCDS:CCDS30547.1,Ensembl:ENSP00000493376.2,GeneID:79501] [protein=olfactory receptor 4F5] [protein_id=NP_001005484.2] [location=join(65565..65573,69037..70008)] [gbkey=CDS]
 ATGAAGAAGGTAACTGCAGAGGCTATTTCCTGGAATGAATCAACGAGTGAAACGAATAACTCTATGGTGACTGAATTCAT
 TTTTCTGGGTCTCTCTGATTCTCAGGAACTCCAGACCTTCCTATTTATGTTGTTTTTTGTATTCTATGGAGGAATCGTGT
 TTGGAAACCTTCTTATTGTCATAACAGTGGTATCTGACTCCCACCTTCACTCTCCCATGTACTTCCTGCTAGCCAACCTC...
 And all of the CDS fasta files should be in a folder alongside the [Scripts](https://github.com/Katzlab/PhyloToL-6/blob/main/PTL1/Genomes/Scripts) and [Databases](https://github.com/Katzlab/PhyloToL-6/blob/main/PTL1/Genomes/Databases) folders, as above.
 ## The Hook Database
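
The genome input conventions described in the hunk above (a ten-character sample identifier at the start of each file name, and GenBank-style CDS headers) could be checked before running PhyloToL with something like the sketch below. It is not part of PhyloToL. The helper names `sample_id` and `parse_cds_header` are hypothetical, and the exact shape of the identifier (2 + 2 + 4 characters joined by underscores) and the fixed `_GenBankCDS.fasta` suffix are assumptions inferred from the single example name `Op_me_Hsap_GenBankCDS.fasta`.

```python
import re

# Assumed pattern: a ten-character sample identifier shaped like "Op_me_Hsap"
# (2 + 2 + 4 alphanumeric characters joined by underscores), followed by
# "_GenBankCDS.fasta". Only the example file name comes from the documentation.
NAME_RE = re.compile(
    r"^([A-Za-z]{2}_[A-Za-z]{2}_[A-Za-z0-9]{4})_GenBankCDS\.fasta$"
)

# GenBank CDS headers start with ">lcl|" and carry bracketed key=value fields.
FIELD_RE = re.compile(r"\[(\w+)=([^\]]*)\]")


def sample_id(filename):
    """Return the ten-character identifier, or None if the name does not match."""
    match = NAME_RE.match(filename)
    return match.group(1) if match else None


def parse_cds_header(header):
    """Split a GenBank CDS header into its sequence ID and bracketed fields."""
    if not header.startswith(">lcl|"):
        raise ValueError("not a GenBank CDS header: " + header[:30])
    seq_id = header[1:].split()[0]  # text between ">" and the first whitespace
    return seq_id, dict(FIELD_RE.findall(header))
```

Calling `sample_id` on every file in the input folder and checking for `None` or repeated identifiers would catch misnamed or colliding inputs early. `parse_cds_header` only inspects the header line and does not validate the sequence itself.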