A Modular Pipeline for Novel MAG (Metagenome-Assembled Genome) Discovery and Metagenomic Profiling
Computational requirements vary based on:
Conda Distribution:
Python:
Version Control:
The example configuration (`config_example.py`) shows the expected structure and typical installation paths for all tools. Use it as a reference when customizing your setup.

    # Download Miniconda installer
    wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

    # Make installer executable
    chmod +x Miniconda3-latest-Linux-x86_64.sh

    # Run installer
    ./Miniconda3-latest-Linux-x86_64.sh

    # Follow the prompts and restart your terminal
    source ~/.bashrc


    # Clone the repository
    git clone https://github.com/msatti123/MetaMAG_Explorer.git

    # Navigate to the directory
    cd MetaMAG_Explorer


    # Download the repository
    wget https://github.com/msatti123/MetaMAG_Explorer/archive/main.zip

    # Extract the archive
    unzip main.zip

    # Navigate to the directory
    cd MetaMAG_Explorer-main

Before generating your configuration, review the example to understand the structure:

    # View the example configuration
    cat config_example.py

    # This shows all required tools and their typical paths
    # You'll use this as a reference if tools aren't auto-detected


    # Scan your system for existing tools and generate config
    python setup_tools.py --use-existing

    # This creates MetaMAG/config.py based on tools found in your PATH

Next, activate the `metamag` Conda environment and start installing tools:

    conda activate metamag

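To picture what a PATH scan of this kind involves, here is a minimal sketch. It is not the code in `setup_tools.py`; the function name `detect_tools` and the short tool list are assumptions made for illustration only.

```python
import shutil

# Hypothetical subset of executables, chosen for illustration;
# the pipeline's own setup script decides which tools it looks for.
TOOLS = ["fastqc", "fastp", "bwa", "samtools", "megahit"]


def detect_tools(names):
    """Map each tool name to its absolute path on PATH, or None if absent."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in detect_tools(TOOLS).items():
        print(f"{name:<10} {path or 'NOT FOUND'}")
```

Any tool reported as not found here would need its path entered by hand, using `config_example.py` as the format reference.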
Add tools as needed:

    # Install all pipeline tools and create the metamag environment
    python setup_tools.py --update --all

    # This installs all required tools


    # Install specific tools
    python setup_tools.py --update --tools fastqc
    python setup_tools.py --update --tools fastp bwa samtools

    # Install tools for specific pipeline steps
    python setup_tools.py --update --steps trimming

    # Install tools for multiple steps
    python setup_tools.py --update --steps qc preprocessing assembly binning

Step Name | Included Tools |
---|---|
qc | fastqc, multiqc |
preprocessing | fastp, bwa, samtools |
assembly | idba, megahit, metaquast |
binning | das_tool, metawrap, metabat2 |
evaluation | checkm2, drep |
taxonomy | gtdbtk, kraken2, bracken |
annotation | eggnog-mapper, dbcan, prodigal |
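
The step-to-tool table can also be read as a simple lookup. The sketch below copies the table into a dictionary and resolves a list of steps to the tools they cover. The names `STEP_TOOLS` and `tools_for_steps` are hypothetical and are not taken from the pipeline's source.

```python
# Step-to-tool mapping copied from the table above.
STEP_TOOLS = {
    "qc": ["fastqc", "multiqc"],
    "preprocessing": ["fastp", "bwa", "samtools"],
    "assembly": ["idba", "megahit", "metaquast"],
    "binning": ["das_tool", "metawrap", "metabat2"],
    "evaluation": ["checkm2", "drep"],
    "taxonomy": ["gtdbtk", "kraken2", "bracken"],
    "annotation": ["eggnog-mapper", "dbcan", "prodigal"],
}


def tools_for_steps(steps):
    """Return the tools for the given steps, in order and without duplicates."""
    tools = []
    for step in steps:
        if step not in STEP_TOOLS:
            raise ValueError(f"Unknown step: {step!r}")
        for tool in STEP_TOOLS[step]:
            if tool not in tools:
                tools.append(tool)
    return tools


print(tools_for_steps(["qc", "preprocessing"]))
# ['fastqc', 'multiqc', 'fastp', 'bwa', 'samtools']
```
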
- `--all`: Install all pipeline tools
- `--tools [list]`: Install specific tools
- `--steps [list]`: Install tools for pipeline steps
- `--update`: Add tools to existing environment
- `--force-recreate`: Recreate environment from scratch
- `--use-existing`: Use system tools without installing

Check which tools are properly configured:

    # Verify all tools
    python setup_tools.py --verify

    # Expected output format:
    # Tool       Status   Version/Notes
    # ============================================================
    # fastqc     OK       FastQC v0.11.9
    # fastp      OK       fastp 0.23.2
    # megahit    MISSING  Not found in environment
    # ...

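If the verification report is captured as text, the missing tools can be pulled out with a few lines of Python. This is a sketch that assumes the report uses whitespace-separated columns with the status in the second column, as in the sample output above. The function name `missing_tools` is hypothetical, and the real report layout may differ.

```python
def missing_tools(report):
    """Return tool names whose status column reads MISSING."""
    missing = []
    for line in report.splitlines():
        # Split into at most three fields: tool, status, free-text notes.
        parts = line.split(maxsplit=2)
        if len(parts) >= 2 and parts[1] == "MISSING":
            missing.append(parts[0])
    return missing


# Sample report modeled on the expected output format shown above.
SAMPLE = """\
Tool       Status   Version/Notes
============================================================
fastqc     OK       FastQC v0.11.9
fastp      OK       fastp 0.23.2
megahit    MISSING  Not found in environment
"""

print(missing_tools(SAMPLE))  # ['megahit']
```
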
If tools weren't detected automatically, edit your configuration:

    # Compare your config with the example
    diff config_example.py MetaMAG/config.py

    # Edit your configuration
    nano MetaMAG/config.py

    # For each missing tool, find its location:
    which [toolname]

    # Then update the path in config.py
    # Use config_example.py as a reference for the format

/miniconda3/envs/
See `config_example.py` for the expected format.

Install the MetaMAG pipeline package:

    # From the repository root directory
    pip install -e .

    # This installs MetaMAG in development mode
    # You can now import MetaMAG modules from anywhere

Run a quick test to ensure everything works:

    # Test import
    python -c "from MetaMAG import qc; print('Import successful!')"

    # Run help to see available options
    python main.py --help

Step | Estimated Time |
---|---|
Install Conda (if needed) | 5-10 minutes |
Clone repository | 1 minute |
Generate configuration | 2-3 minutes |
Install core tools via Conda | 20-30 minutes |
Install all tools via Conda | 45-60 minutes |
Verify installation | 2-3 minutes |
Download reference databases (optional) | 2-4 hours |
Installation times depend on your internet connection speed and whether Conda needs to download packages or can use cached versions.

    # 1. Create a sample list
    echo "SRR12345678" > samples.txt
    echo "SRR12345679" >> samples.txt

    # 2. Create a project configuration
    cat > project_config.yaml << EOF
    input_dir: "/path/to/raw/reads"
    output_dir: "/path/to/output"
    reference: "/path/to/host/genome.fa"  # Optional for host removal
    EOF

    # 3. Run quality control step
    python -m MetaMAG.main --project_config project_config.yaml \
        --samples-file samples.txt \
        --steps qc \
        --batch_size 2 \
        --cpus 8 \
        --memory "32G" \
        --time "2:00:00" \
        --log_dir ./logs

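When the run is launched from a script rather than typed by hand, the same command line can be assembled as an argument list. The sketch below uses only the flags that appear in the example above. The helper name `build_command` and its defaults are assumptions, not part of MetaMAG.

```python
import shlex


def build_command(project_config, samples_file, step="qc", batch_size=2,
                  cpus=8, memory="32G", time="2:00:00", log_dir="./logs"):
    """Build the argument list for one pipeline step (hypothetical helper)."""
    return [
        "python", "-m", "MetaMAG.main",
        "--project_config", project_config,
        "--samples-file", samples_file,
        "--steps", step,
        "--batch_size", str(batch_size),
        "--cpus", str(cpus),
        "--memory", memory,
        "--time", time,
        "--log_dir", log_dir,
    ]


cmd = build_command("project_config.yaml", "samples.txt")
# Print a shell-safe rendering; pass `cmd` to subprocess.run to execute it.
print(shlex.join(cmd))
```

Keeping the arguments as a list avoids shell quoting problems when paths contain spaces.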
FastQC, MultiQC
Fastp, BWA, Samtools
IDBA, MEGAHIT, MetaQUAST
MetaWrap, DAS Tool, MetaBAT2, MaxBin2
CheckM2, dRep
GTDB-Tk, Kraken2, Bracken
EggNOG-mapper, dbCAN, Prodigal
Issue | Solution |
---|---|
conda: command not found | Run source ~/.bashrc or reinstall Miniconda |
Environment 'metamag' not found | Run python setup_tools.py --tools fastqc to create the environment |
Tool not detected | Check config_example.py for reference, then edit MetaMAG/config.py |
Permission denied errors | Avoid using sudo; install in user space with Conda |
Config file not found | Ensure you're in the repository root directory when running setup |
Tools missing after installation | Activate the environment: conda activate metamag |

    # If tools aren't found automatically:

    # 1. Find tool location manually
    which megahit
    # or for conda environments:
    find ~/miniconda3 -name "megahit" -type f 2>/dev/null

    # 2. Check config_example.py for the format
    grep "megahit" config_example.py

    # 3. Update your config
    nano MetaMAG/config.py
    # Add or update the tool path

    # 4. Verify the change
    python setup_tools.py --verify

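The manual search in step 1 can be sketched in Python as well: check PATH first, then walk a Conda installation for a file with the tool's name. The function name `find_tool` and the default root `~/miniconda3` are assumptions, so adjust the root to wherever Conda is installed on your system.

```python
import shutil
from pathlib import Path


def find_tool(name, search_root=Path.home() / "miniconda3"):
    """Return candidate paths for an executable: PATH first, then search_root."""
    candidates = []
    on_path = shutil.which(name)
    if on_path:
        candidates.append(Path(on_path))
    if search_root.is_dir():
        # Recursive search for files named exactly `name`. Unlike
        # `find -type f`, is_file() also accepts symlinks to files.
        for path in search_root.rglob(name):
            if path.is_file() and path not in candidates:
                candidates.append(path)
    return candidates


if __name__ == "__main__":
    for hit in find_tool("megahit"):
        print(hit)
```

Each path printed is a candidate to copy into `MetaMAG/config.py` before re-running the verification step.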
See `config_example.py` for tool path examples.