Updated EukPhylo QuickStart (markdown)

2026-02-10 17:00:24 +08:00 · 2025-03-03 11:37:15 -05:00
commit 22ea61fed0
parent f0356015a8
1 changed file with 23 additions and 0 deletions
--- a/EukPhylo-QuickStart.md
+++ b/EukPhylo-QuickStart.md
@@ -1,5 +1,28 @@
 > Note: The EukPhylo pipeline is currently being dockerised for easier installation and use. More information about the Dockerfile can be found on the [Docker branch](https://github.com/Katzlab/EukPhylo/tree/Docker).

+## Dockerfile
+
+The Docker image can be built from the Dockerfile and run with:
+
+```bash
+cd EukPhylo
+
+# Build the image (-f names the Dockerfile; image names must be lowercase)
+docker build -f Dockerfile --tag myeuk:1 .
+
+# Look up the IMAGE_ID of the image that was just built
+docker image list
+
+# Run interactively; the three source directories must already exist on the host
+docker run -it \
+    --mount type=bind,src="$(pwd)/databases",dst=/Databases \
+    --mount type=bind,src="$(pwd)/input_data",dst=/Input_data \
+    --mount type=bind,src="$(pwd)/output_data",dst=/Output_data \
+    {IMAGE_ID}
+```
+
+Once development is complete, GitHub CI/CD workflows can be added to build and release the Docker image automatically for end users.
+
 # General Steps

 The EukPhylo pipeline is composed of two parts that can be run individually: Part 1 needs to be run only once, to assign gene families; Part 2 builds MSAs and trees, and implements contamination removal and concatenation. It is preferable to run Part 2 using the outputs of Part 1 as input, but this is not required as long as the input files are in the same format (one FASTA file per species, with sequence IDs starting with a 10 digit taxon identifier and ending in a gene family identifier with the format OGx_xxxxxx). See the extended version of the wiki for details.
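
The input format described above can be checked before running Part 2. The following is a minimal sketch, not part of EukPhylo itself: the function name `check_ids` is hypothetical, and the pattern rests on two assumptions, namely that the taxon identifier is ten characters drawn from letters, digits, and underscores, and that `OGx_xxxxxx` stands for `OG`, one digit, an underscore, and six digits. Adjust the pattern if your identifiers differ.

```bash
# check_ids FILE
# Print every FASTA header in FILE that does NOT match the assumed ID layout.
# Assumed layout (an illustration, not confirmed by the EukPhylo docs):
#   >  + 10-character taxon code + anything + OG<digit>_<6 digits> at end of line
check_ids() {
    grep '^>' "$1" | grep -Ev '^>[A-Za-z0-9_]{10}.*OG[0-9]_[0-9]{6}$'
}
```

Calling `check_ids species.fasta` lists the headers that would need fixing; no output means every header matched the assumed pattern. Note that `grep -v` exits with a non-zero status when nothing is printed, so guard the call (for example with `|| true`) in scripts that use `set -e`.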