diff --git a/QuickStart.md b/QuickStart.md
new file mode 100644
index 0000000..d914732
--- /dev/null
+++ b/QuickStart.md
@@ -0,0 +1,49 @@
+# Quickstart EukPhylo v1.0
+
+## Installing EukPhylo
+Scripts can be used as downloaded from [GitHub](https://github.com/Katzlab/EukPhylo) and should work on any platform.
+Dependencies and third-party tools, along with the versions that we use at the Katz lab:
+* TrimAl (1.2)
+* Guidance (2.2)
+* Diamond (0.9.30, compiled with GCC 8.3.0)
+* MAFFT (7.475)
+* IQ-Tree (2.1.12)
+* RAxML (8.2.12)
+* BLAST+ (2.9.0)
+* Vsearch (2.21.1, compiled with GCC 10.3.0)
+* Python libraries (can be installed with pip)
+  * ETE3 (pip install ete3)
+  * BioPython
+  * tqdm
+
+
+## EukPhylo part 1 = Assigning Gene families
+
+EukPhylo part 1 runs CDS or assembled transcripts through several scripts in order (7 for transcriptomes, 5 for genomes). These scripts are run through a ‘wrapper’ script.
+
+### Transcriptomes:
+Set Up:
+* A folder called “AssembledTranscripts” with your assembled transcript fasta files
+* A folder called “Databases” with three subfolders:
+  * db_BvsE (how we ID likely-bacterial sequences)
+  * db_StopFreq (for stop codon assignment)
+  * db_OG
+    * Hook *.dmnd file ([Current version Hook-6.6.dmnd](https://drive.google.com/open?id=1ywYLZXzcTERDFCysz5vPbI9u6WRxz5r0&usp=drive_copy))
+    * Hook *.fasta file ([Current version Hook-6.6.fasta](https://drive.google.com/open?id=1AN4_SmZUYFH6_xh2qOhyNUlFZ_NT9_-D&usp=drive_copy))
+* A folder called “Scripts” filled with scripts from [here](https://github.com/Katzlab/PhyloToL-6/tree/main/PTL1/Transcriptomes/Scripts) on GitHub
+
+Running:
+python wrapper.py -1 1 -2 7 --assembled_transcripts AssembledTranscripts -o . --genetic_code Universal -d Databases > log.txt
+
+Options:
+* -1 = start script
+* -2 = end script
+* --assembled_transcripts = folder with assembled transcripts in fasta format
+* -o = path to output folder
+* --genetic_code = specified genetic code, name of .txt file with genetic codes
+* -d = path to Databases folder
+* > log.txt = if added to the end of the command, it will output a log file with progress, warning, or error messages
+
+Output:
+* ReadyToGo = AA, NTD
+* Sequences summary
\ No newline at end of file
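
The Set Up list in this QuickStart describes a fixed folder layout that must exist before `wrapper.py` is run. As a minimal sketch of how that layout could be checked up front, the Python below looks for the folders named in the file. The helper names `missing_dirs` and `hook_files` are hypothetical and are not part of EukPhylo; the sketch only assumes the folder names listed under Set Up and that the Hook files sit directly inside `Databases/db_OG`.

```python
from pathlib import Path

# Folder names taken from the Set Up list in QuickStart.md.
REQUIRED_DIRS = [
    "AssembledTranscripts",
    "Databases/db_BvsE",
    "Databases/db_StopFreq",
    "Databases/db_OG",
    "Scripts",
]


def missing_dirs(root):
    """Return the required folders that do not exist under root.

    Hypothetical helper for illustration; not part of EukPhylo.
    """
    root = Path(root)
    return [d for d in REQUIRED_DIRS if not (root / d).is_dir()]


def hook_files(root):
    """Return sorted names of *.dmnd and *.fasta files in Databases/db_OG.

    Assumes the Hook files are placed directly in db_OG.
    """
    db_og = Path(root) / "Databases" / "db_OG"
    if not db_og.is_dir():
        return []
    return sorted(p.name for p in db_og.iterdir() if p.suffix in {".dmnd", ".fasta"})


if __name__ == "__main__":
    # Check the current directory, which the QuickStart command uses as "-o .".
    missing = missing_dirs(".")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("Layout looks complete; Hook files found:", hook_files("."))
```

Running the script from the working directory before launching the wrapper reports any folder from the Set Up list that is absent. It does not validate the contents of the databases or the scripts themselves.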