Updated EukPhylo QuickStart (markdown)
parent
f0356015a8
commit
22ea61fed0
@ -1,5 +1,28 @@
> Note: The EukPhylo pipeline is currently being dockerised for easier installation and use. More information about the Dockerfile can be found on the [Docker branch](https://github.com/Katzlab/EukPhylo/tree/Docker).

## Dockerfile

The Docker image can be built and run with:

```bash
cd EukPhylo

# Build the image from the Dockerfile in the current directory
# (Docker image names must be lowercase)
docker build -f Dockerfile --tag myeuk:1 .

# Get the IMAGE_ID of the image that was just built
docker image list

# Current run command (replace {IMAGE_ID} with the ID listed above):
docker run -it \
    --mount type=bind,src="$(pwd)/databases",dst=/Databases \
    --mount type=bind,src="$(pwd)/input_data",dst=/Input_data \
    --mount type=bind,src="$(pwd)/output_data",dst=/Output_data \
    {IMAGE_ID}
```
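
One thing the run command relies on: with `--mount type=bind`, Docker reports an error if the source path does not exist on the host, rather than creating it. A minimal sketch of preparing the three host directories first, assuming the command is run from the EukPhylo checkout as in the block above:

```bash
# Create the three host directories that the run command bind-mounts.
# Run this from the EukPhylo checkout, i.e. the directory used as $(pwd).
mkdir -p databases input_data output_data
```

This only creates empty directories; what needs to be placed in them is not covered here.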

After development, GitHub CI/CD workflows can be added to automatically build and release the Docker image for end users.

# General Steps

The EukPhylo pipeline is composed of two parts that can be run individually. Part 1, which assigns gene families, only needs to be run once. Part 2 builds multiple sequence alignments (MSAs) and trees, and implements contamination removal and concatenation. It is preferable to run Part 2 using the outputs of Part 1 as input, but this is not required as long as the input files are in the same format: one FASTA file per species, with sequence IDs starting with a 10-digit taxon identifier and ending in a gene family identifier of the format OGx_xxxxxx. See the extended version of the wiki for details.
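
When Part 2 is given files that did not come from Part 1, the sequence IDs can be checked against this format before starting. The following is only a rough sketch: the function name `check_ids` is hypothetical, and the pattern rests on two assumptions that the extended wiki should be used to confirm, namely that the taxon identifier is any 10 non-space characters and that each `x` in `OGx_xxxxxx` stands for one digit.

```bash
#!/usr/bin/env bash
# check_ids (hypothetical helper): report FASTA headers that do not match
# the assumed ID pattern.
# Assumed pattern (not confirmed by the wiki): '>' followed by 10 non-space
# characters for the taxon identifier, any text in between, and a final
# gene family identifier of the form OG<digit>_<6 digits>.
check_ids() {
    local fasta="$1" bad
    # Select header lines, then keep only those that fail the pattern.
    # '|| true' stops a "no lines selected" exit status from being treated as an error.
    bad=$(grep '^>' "$fasta" | grep -Ev '^>[^[:space:]]{10}.*OG[0-9]_[0-9]{6}$' || true)
    if [ -n "$bad" ]; then
        printf 'Unexpected sequence IDs in %s:\n%s\n' "$fasta" "$bad" >&2
        return 1
    fi
}
```

It could be called once per species file, for example `check_ids input_data/my_species.fasta`, where the file name is a placeholder. The function returns a non-zero status and lists the offending headers on standard error if any ID does not fit the assumed pattern.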