MetaMAG Explorer Installation Guide

A Modular Pipeline for Novel MAG (Metagenome-Assembled Genome) Discovery and Metagenomic Profiling

🖥️ System Requirements

Hardware Requirements

Notes

Computational requirements vary based on:

  • Dataset size
  • Analysis complexity
  • Number of samples

Software Prerequisites

Essential Software

Conda Distribution:

Python:

Version Control:

Additional Recommended Tools

📁 Configuration Files Overview

Important: Two Configuration Files

  • config_example.py: Reference configuration with example tool paths. Do not edit it directly; use it as a template.
  • MetaMAG/config.py: Your actual configuration file (generated during setup). Edit this file to customize tool paths for your system.

The example configuration (config_example.py) shows the expected structure and typical installation paths for all tools. Use it as a reference when customizing your setup.

📦 Step-by-Step Installation

1️⃣ Install Conda (if not already installed)

# Download Miniconda installer
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

# Make installer executable
chmod +x Miniconda3-latest-Linux-x86_64.sh

# Run installer
./Miniconda3-latest-Linux-x86_64.sh

# Follow the prompts, then restart your terminal or reload your shell configuration
source ~/.bashrc

2️⃣ Clone the MetaMAG Repository

Clone Using Git (Recommended)

# Clone the repository
git clone https://github.com/msatti123/MetaMAG_Explorer.git

# Navigate to the directory
cd MetaMAG_Explorer

Download Using wget (Alternative)

# Download the repository
wget https://github.com/msatti123/MetaMAG_Explorer/archive/main.zip

# Extract the archive
unzip main.zip

# Navigate to the directory
cd MetaMAG_Explorer-main

3️⃣ Review Example Configuration

Before generating your configuration, review the example to understand the structure:

# View the example configuration
cat config_example.py

# This shows all required tools and their typical paths
# You'll use this as a reference if tools aren't auto-detected

4️⃣ Generate Your Configuration File

First, create the configuration file by scanning for tools that are already installed:

# Scan your system for existing tools and generate config
python setup_tools.py --use-existing
# This creates MetaMAG/config.py based on tools found in your PATH

Next, activate the metamag Conda environment before installing tools:

conda activate metamag

5️⃣ Install Required Tools

Add tools as needed:

Option A: Install All Tools via Conda

# Install all pipeline tools and create the metamag environment
python setup_tools.py --update --all

# This will install all required tools

Option B: Install Minimal Tools First

# Install specific tools
python setup_tools.py --update --tools fastqc
python setup_tools.py --update --tools fastp bwa samtools

# Install tools for specific pipeline steps
python setup_tools.py --update --steps trimming

# Install tools for multiple steps
python setup_tools.py --update --steps qc preprocessing assembly binning

Step Name       Included Tools
============================================================
qc              fastqc, multiqc
preprocessing   fastp, bwa, samtools
assembly        idba, megahit, metaquast
binning         das_tool, metawrap, metabat2
evaluation      checkm2, drep
taxonomy        gtdbtk, kraken2, bracken
annotation      eggnog-mapper, dbcan, prodigal

Available Installation Options:
  • --all: Install all pipeline tools
  • --tools [list]: Install specific tools
  • --steps [list]: Install tools for pipeline steps
  • --update: Add tools to existing environment
  • --force-recreate: Recreate environment from scratch
  • --use-existing: Use system tools without installing
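
The step table above can also be read as a simple lookup. The following is a minimal sketch of that mapping, not the actual logic inside setup_tools.py; the names STEP_TOOLS, tools_for_steps, and install_command are hypothetical and exist only for illustration.

```python
# Step-to-tool mapping copied from the table above.
# Hypothetical helper names; this is not how setup_tools.py is implemented.
STEP_TOOLS = {
    "qc": ["fastqc", "multiqc"],
    "preprocessing": ["fastp", "bwa", "samtools"],
    "assembly": ["idba", "megahit", "metaquast"],
    "binning": ["das_tool", "metawrap", "metabat2"],
    "evaluation": ["checkm2", "drep"],
    "taxonomy": ["gtdbtk", "kraken2", "bracken"],
    "annotation": ["eggnog-mapper", "dbcan", "prodigal"],
}


def tools_for_steps(steps):
    """Return the de-duplicated tools for the given steps, in table order."""
    unknown = [step for step in steps if step not in STEP_TOOLS]
    if unknown:
        raise KeyError("Unknown step(s): " + ", ".join(unknown))
    tools = []
    for step in steps:
        for tool in STEP_TOOLS[step]:
            if tool not in tools:
                tools.append(tool)
    return tools


def install_command(steps):
    """Build the equivalent --tools command as a string (it is not executed)."""
    return "python setup_tools.py --update --tools " + " ".join(tools_for_steps(steps))


print(install_command(["qc", "preprocessing"]))
# python setup_tools.py --update --tools fastqc multiqc fastp bwa samtools
```

This can help when you want to see exactly which tools a --steps selection is expected to pull in before running the installer.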

6️⃣ Verify Tool Installation

Check which tools are properly configured:

# Verify all tools
python setup_tools.py --verify

# Expected output format:
# Tool                Status     Version/Notes
# ============================================================
# fastqc              OK         FastQC v0.11.9
# fastp               OK         fastp 0.23.2
# megahit             MISSING    Not found in environment
# ...
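
If you want to act on the verification report programmatically, the MISSING rows can be collected with a few lines of Python. This is a rough sketch that assumes the report keeps the three-column layout illustrated above; the real output of setup_tools.py may differ, and the function name missing_tools is hypothetical.

```python
# Sample text copied from the expected output format shown above.
SAMPLE_REPORT = """\
Tool                Status     Version/Notes
============================================================
fastqc              OK         FastQC v0.11.9
fastp               OK         fastp 0.23.2
megahit             MISSING    Not found in environment
"""


def missing_tools(report):
    """Return the tool names whose Status column reads MISSING."""
    missing = []
    for line in report.splitlines():
        # Split into at most three fields: tool, status, free-text notes.
        parts = line.split(None, 2)
        if len(parts) >= 2 and parts[1] == "MISSING":
            missing.append(parts[0])
    return missing


print(missing_tools(SAMPLE_REPORT))  # ['megahit']
```

The resulting list could then be passed to python setup_tools.py --update --tools to install only what is missing.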

7️⃣ Customize Tool Paths (if needed)

If tools weren't detected automatically, edit your configuration:

# Compare your config with the example
diff config_example.py MetaMAG/config.py

# Edit your configuration
nano MetaMAG/config.py

# For each missing tool, find its location:
which [toolname]

# Then update the path in config.py
# Use config_example.py as a reference for the format

Tips for Setting Tool Paths

  • Paths should point to executable files, not directories
  • Use absolute paths (starting with /)
  • For conda-installed tools, paths typically include /miniconda3/envs/
  • Reference config_example.py for the expected format
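
The tips above can be checked mechanically before you save a path to your configuration. The sketch below uses only the Python standard library; check_tool_path is a hypothetical helper, not part of MetaMAG, and the example path is a placeholder rather than a real location.

```python
import os


def check_tool_path(path):
    """Return a list of problems with a tool path (an empty list means it looks fine)."""
    problems = []
    if not os.path.isabs(path):
        problems.append("not an absolute path")
    if os.path.isdir(path):
        problems.append("points to a directory, not an executable file")
    elif not os.path.isfile(path):
        problems.append("file does not exist")
    elif not os.access(path, os.X_OK):
        problems.append("file is not executable")
    return problems


# Placeholder entry for illustration only; substitute the paths from your own config.
candidates = {"megahit": "/path/to/envs/metamag/bin/megahit"}
for tool, path in candidates.items():
    issues = check_tool_path(path)
    print(tool, "OK" if not issues else "; ".join(issues))
```

Running a check like this before python setup_tools.py --verify can catch the common mistake of pointing at a bin directory instead of the executable itself.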

8️⃣ Install Python Package

Install the MetaMAG pipeline package:

# From the repository root directory
pip install -e .

# This installs MetaMAG in editable (development) mode
# You can now import MetaMAG modules from anywhere

9️⃣ Test Your Installation

Run a quick test to ensure everything works:

# Test import
python -c "from MetaMAG import qc; print('Import successful!')"

# Run help to see available options
python -m MetaMAG.main --help

⏱️ Installation Time Estimates

Total Installation Time

Time Breakdown by Step

Step                                       Estimated Time
============================================================
Install Conda (if needed)                  5-10 minutes
Clone repository                           1 minute
Generate configuration                     2-3 minutes
Install core tools via Conda               20-30 minutes
Install all tools via Conda                45-60 minutes
Verify installation                        2-3 minutes
Download reference databases (optional)    2-4 hours

Note

Installation times depend on your internet connection speed and whether Conda needs to download packages or can use cached versions.
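
As a worked example, the per-step ranges in the breakdown can be summed to estimate a total. This sketch assumes that the "core tools" and "all tools" rows are alternatives rather than additive, and it leaves out the optional database download; both are interpretations of the table, not figures stated by the project.

```python
# (min, max) minutes per step, copied from the breakdown above.
STEP_MINUTES = {
    "Install Conda (if needed)": (5, 10),
    "Clone repository": (1, 1),
    "Generate configuration": (2, 3),
    "Install core tools via Conda": (20, 30),
    "Install all tools via Conda": (45, 60),
    "Verify installation": (2, 3),
}

# Steps that apply regardless of how many tools are installed.
COMMON = [
    "Install Conda (if needed)",
    "Clone repository",
    "Generate configuration",
    "Verify installation",
]


def total_minutes(steps):
    """Sum the lower and upper bounds of the listed steps."""
    low = sum(STEP_MINUTES[step][0] for step in steps)
    high = sum(STEP_MINUTES[step][1] for step in steps)
    return low, high


print(total_minutes(COMMON + ["Install core tools via Conda"]))  # (30, 47)
print(total_minutes(COMMON + ["Install all tools via Conda"]))   # (55, 77)
```

Under these assumptions, a core installation takes roughly 30-47 minutes and a full installation roughly 55-77 minutes, with the optional 2-4 hours for reference databases on top.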

📋 Quick Start Example

# 1. Create a sample list
echo "SRR12345678" > samples.txt
echo "SRR12345679" >> samples.txt

# 2. Create a project configuration
cat > project_config.yaml << EOF
input_dir: "/path/to/raw/reads"
output_dir: "/path/to/output"
reference: "/path/to/host/genome.fa"  # Optional for host removal
EOF

# 3. Run quality control step
python -m MetaMAG.main --project_config project_config.yaml \
  --samples-file samples.txt \
  --steps qc \
  --batch_size 2 \
  --cpus 8 \
  --memory "32G" \
  --time "2:00:00" \
  --log_dir ./logs
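
If you prefer to drive the quick start from a script, the same invocation can be assembled in Python. This is a sketch only: read_samples and build_command are hypothetical helpers, the flags are exactly those shown in the example above, and nothing here has been checked against the MetaMAG argument parser.

```python
import shlex


def read_samples(path):
    """Read sample IDs, one per line, skipping blank lines and duplicates."""
    seen = set()
    samples = []
    with open(path) as handle:
        for line in handle:
            sample = line.strip()
            if sample and sample not in seen:
                seen.add(sample)
                samples.append(sample)
    return samples


def build_command(project_config, samples_file, step, batch_size=2,
                  cpus=8, memory="32G", time_limit="2:00:00", log_dir="./logs"):
    """Assemble the quick-start command as an argument list (it is not executed)."""
    return [
        "python", "-m", "MetaMAG.main",
        "--project_config", project_config,
        "--samples-file", samples_file,
        "--steps", step,
        "--batch_size", str(batch_size),
        "--cpus", str(cpus),
        "--memory", memory,
        "--time", time_limit,
        "--log_dir", log_dir,
    ]


# Print the command for review; pass the list to subprocess.run to execute it.
print(shlex.join(build_command("project_config.yaml", "samples.txt", "qc")))
```

Building the command as a list avoids shell quoting problems when paths contain spaces, and read_samples gives a quick way to confirm the sample list before submitting a run.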

Tool Categories

Quality Control

FastQC, MultiQC

Preprocessing

Fastp, BWA, Samtools

Assembly

IDBA, MEGAHIT, MetaQUAST

Binning

MetaWrap, DAS Tool, MetaBAT2, MaxBin2

Evaluation

CheckM2, dRep

Taxonomy

GTDB-Tk, Kraken2, Bracken

Annotation

EggNOG-mapper, dbCAN, Prodigal

🛠 Troubleshooting

Issue                               Solution
==========================================================================================================
conda: command not found            Run source ~/.bashrc or reinstall Miniconda
Environment 'metamag' not found     Run python setup_tools.py --tools fastqc to create the environment
Tool not detected                   Check config_example.py for reference, then edit MetaMAG/config.py
Permission denied errors            Avoid using sudo; install in user space with Conda
Config file not found               Ensure you're in the repository root directory when running setup
Tools missing after installation    Activate the environment: conda activate metamag

Configuration Troubleshooting

# If tools aren't found automatically:

# 1. Find tool location manually
which megahit
# or for conda environments:
find ~/miniconda3 -name "megahit" -type f 2>/dev/null

# 2. Check config_example.py for the format
grep "megahit" config_example.py

# 3. Update your config
nano MetaMAG/config.py
# Add or update the tool path

# 4. Verify the change
python setup_tools.py --verify
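
The manual search in step 1 above can also be done from Python with shutil.which, which checks PATH and, optionally, extra directories. This is a minimal sketch; find_tool is a hypothetical helper, and the Conda environment directory in the usage line is an assumed location that may not match your installation.

```python
import os
import shutil


def find_tool(name, extra_dirs=()):
    """Return the absolute path of an executable, searching PATH and then extra_dirs."""
    found = shutil.which(name)
    if found:
        return os.path.abspath(found)
    for directory in extra_dirs:
        # shutil.which accepts a custom search path; here it is a single directory.
        candidate = shutil.which(name, path=os.path.expanduser(directory))
        if candidate:
            return os.path.abspath(candidate)
    return None


# Assumed environment location; adjust it to wherever your Conda environments live.
print(find_tool("megahit", ["~/miniconda3/envs/metamag/bin"]))
```

The function returns None when the tool is not found; otherwise the printed path is what you would copy into MetaMAG/config.py.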

🙋 Getting Help