nf-core/configs: DKFZ configuration
To use, run the pipeline with -profile dkfz. This will download and launch the dkfz.config, pre-configured for the Deutsches Krebsforschungszentrum (DKFZ) / ODCF LSF cluster in Heidelberg, Germany.
This configuration is tested with Nextflow 25.10.0 (available on the cluster as a module).
The profile only configures the cluster itself (LSF executor, dynamic queue selection, scratch, resource limits and the /omics bind-mount). Pick a container engine on the command line, e.g. -profile dkfz,apptainer or -profile dkfz,conda.
⚠️ Use Apptainer/Singularity (or Conda), not Docker. On the ODCF cluster Docker is only available through LSF’s docker-generic application profile. Nextflow’s Docker support runs docker run directly on the node, which this setup does not allow, so -profile dkfz,docker will not work. Use -profile dkfz,apptainer instead.
Before you use this profile
- Load Nextflow via the environment module system on a submission host. Check the pipeline’s README for the required Nextflow version:
  module load Nextflow/25.10.0
- Submit from a submission host (bsub01.lsf.dkfz.de / bsub02.lsf.dkfz.de). Do not run heavy work on the login/worker nodes. Wrap the Nextflow driver itself in a bsub job (see below).
- The shared /omics filesystem is bind-mounted into every container automatically. If your inputs or references live elsewhere, point NXF_APPTAINER_CACHEDIR / NXF_SINGULARITY_CACHEDIR at a path under /omics so images are cached on shared storage:
  export NXF_APPTAINER_CACHEDIR=/omics/groups/<your-group>/.../apptainer_cache
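The cache setup from the last step can be sketched as follows. CACHE_ROOT is a hypothetical variable introduced only for this illustration: set it to your group's directory under /omics (the exact path is group-specific and is not filled in here). The temporary-directory fallback exists only so the snippet can be tried off-cluster.

```bash
#!/bin/bash
# CACHE_ROOT is a hypothetical variable: set it to your group's directory
# under /omics. The fallback below is only for trying the snippet off-cluster.
CACHE_ROOT="${CACHE_ROOT:-${TMPDIR:-/tmp}/nxf_cache_demo}"

# Create the cache directory if it does not exist yet.
mkdir -p "${CACHE_ROOT}/apptainer_cache"

# Point both engine variables at the same location.
export NXF_APPTAINER_CACHEDIR="${CACHE_ROOT}/apptainer_cache"
export NXF_SINGULARITY_CACHEDIR="${NXF_APPTAINER_CACHEDIR}"

# Warn when the cache is not on the shared /omics filesystem.
case "${NXF_APPTAINER_CACHEDIR}" in
    /omics/*) echo "cache on shared storage: ${NXF_APPTAINER_CACHEDIR}" ;;
    *)        echo "warning: cache is not under /omics: ${NXF_APPTAINER_CACHEDIR}" >&2 ;;
esac
```

Exporting both variables keeps one image cache regardless of whether a run uses the apptainer or the singularity profile.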
Queues
Queue selection is automatic, based on each task’s requested time and memory:
| Queue | Selected when | Limit |
|---|---|---|
| short | no time given, or time <= 10.min | 10 min |
| medium | time <= 1.h | 1 hour |
| long | time <= 10.h | 10 hours |
| verylong | time > 10.h | no hard limit |
| highmem | memory > 200.GB | up to ~4 TB |
Note: highmem is the only queue that accepts requests above 200 GB (and it rejects requests below 200 GB).
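As a worked example, the rule for CPU tasks can be written as a small shell function. pick_queue is a hypothetical helper, not part of the profile; it takes the requested time in whole minutes and memory in whole GB, and it checks memory first, in the same order as the queue closure in the config file. GPU tasks are not covered here because they go to the GPU queue instead.

```bash
#!/bin/bash
# pick_queue MINUTES MEMORY_GB
# Hypothetical helper mirroring the CPU branch of the profile's queue rule.
# Pass an empty string for MINUTES when the task requests no time.
pick_queue() {
    local minutes="$1" mem_gb="${2:-0}"
    if [ "$mem_gb" -gt 200 ]; then
        echo "highmem"                 # the memory check comes before any time check
    elif [ -z "$minutes" ] || [ "$minutes" -le 10 ]; then
        echo "short"
    elif [ "$minutes" -le 60 ]; then
        echo "medium"
    elif [ "$minutes" -le 600 ]; then
        echo "long"
    else
        echo "verylong"
    fi
}

pick_queue 10 6      # prints: short
pick_queue 240 32    # prints: long
pick_queue 30 512    # prints: highmem
```

Note that a 30-minute task asking for 512 GB lands on highmem, not medium, because the memory threshold takes precedence.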
Resource limits, retries and containers
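- Every task is limited via process.resourceLimits to what the cluster can provide (64 CPUs, 1000 GB memory, 720 h). Larger requests are reduced to these values automatically.
- Unlabelled processes default to a safe 1 CPU / 6 GB / 10 min.
- Tasks that end with no exit status, or with exit code 104, 255 or anything in the range 130 to 145 (137 usually means an out-of-memory kill or preemption), are retried up to 3 times. Any other failure lets already-submitted tasks finish and then stops the run.
- The shared /omics filesystem is bound into every container via containerOptions, with --nv added for accelerator tasks. If one of your modules sets its own containerOptions, re-add --bind /omics there.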
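The retry behaviour is defined by the errorStrategy closure in the config file at the end of this page, together with maxRetries = 3. The sketch below restates that decision as a shell function for illustration. error_strategy is a hypothetical helper, and the value 2147483647 stands in for Integer.MAX_VALUE, which the closure treats as a missing exit status.

```bash
#!/bin/bash
# error_strategy EXIT_STATUS
# Hypothetical helper mirroring the profile's errorStrategy closure.
# An empty argument stands for "no exit status reported".
error_strategy() {
    local status="$1"
    if [ -z "$status" ] || [ "$status" -eq 2147483647 ]; then
        echo "retry"                   # no usable exit status
    elif [ "$status" -ge 130 ] && [ "$status" -le 145 ]; then
        echo "retry"                   # signal range, e.g. 137
    elif [ "$status" -eq 104 ] || [ "$status" -eq 255 ]; then
        echo "retry"                   # I/O drops
    else
        echo "finish"                  # let submitted tasks end, then stop
    fi
}

error_strategy 137   # prints: retry
error_strategy 1     # prints: finish
```

An ordinary tool failure such as exit code 1 is therefore not retried, which avoids resubmitting jobs that would fail the same way again.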
Enable GPU support
This profile turns any task that requests a GPU through Nextflow’s standard accelerator directive into a correct DKFZ GPU submission. It selects the GPU queue, builds the LSF -gpu num=<n>:j_exclusive=yes[:gmem=<n>G] request, and adds --nv so the GPU is visible inside the container.
How a task acquires an accelerator request depends on the pipeline:
- nf-core pipelines mark GPU-capable processes with the process_gpu label and only switch the accelerator on when the run includes the gpu profile. So add gpu to your profile list:
  nextflow run <pipeline> -profile dkfz,gpu,apptainer --input ... --outdir ...
- Custom / non-nf-core pipelines just declare accelerator on the GPU process:
  process MY_GPU_TASK {
      accelerator 1
      container 'docker://nvcr.io/...'
      script:
      "my_gpu_tool ..."
  }
  and are then run without the gpu profile:
  nextflow run main.nf -profile dkfz,apptainer --outdir ...
Tasks without an accelerator request are unaffected and run on the normal CPU queues.
Choosing the GPU queue
The --dkfz_gpu_queue parameter selects which GPU queue all GPU jobs are submitted to (default gpu):
- gpu: default (RTX 2080 Ti … V100/A100-DGX), 72 h wall time
- gpu-lowprio: same nodes as gpu but low priority; use for large job batches
- gpu-pro: high-end A100/H200/L40S/GH200, 142 h wall time; requires a separate access application to the DKFZ Data Science Board
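A minimal sketch of passing the queue on the command line. gpu_run_command is a hypothetical helper that only validates the queue name against the three options above and prints the resulting command; it does not run Nextflow, and the pipeline name is left as a placeholder.

```bash
#!/bin/bash
# gpu_run_command [QUEUE]
# Hypothetical helper: validates the queue name and prints (does not run)
# the matching nextflow command line. Defaults to the gpu queue.
gpu_run_command() {
    local queue="${1:-gpu}"
    case "$queue" in
        gpu|gpu-lowprio|gpu-pro) ;;
        *) echo "unknown GPU queue: ${queue}" >&2; return 1 ;;
    esac
    printf 'nextflow run <pipeline> -profile dkfz,gpu,apptainer --dkfz_gpu_queue %s\n' "$queue"
}

gpu_run_command gpu-lowprio
# prints: nextflow run <pipeline> -profile dkfz,gpu,apptainer --dkfz_gpu_queue gpu-lowprio
```

Checking the name before submission catches typos early, since a misspelled queue would otherwise only surface when LSF rejects the job.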
Number of GPUs and GPU memory per process
The profile builds the LSF request as -gpu num=<n>:j_exclusive=yes[:gmem=<n>G] (DKFZ requires j_exclusive=yes and rejects mode=exclusive_process). Two things are tunable per process:
- Number of GPUs: the accelerator directive (default 1).
- GPU memory (optional): set ext.gpu_memory to a Nextflow memory value to pin the job to GPUs with at least that much VRAM. When ext.gpu_memory is unset, gmem is omitted and LSF assigns any free GPU.
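To make the resulting request concrete, the sketch below assembles the same -gpu string in shell. gpu_option is a hypothetical helper written for illustration; in the profile itself the string is built by the clusterOptions closure from the accelerator directive and ext.gpu_memory.

```bash
#!/bin/bash
# gpu_option [NUM_GPUS] [GMEM_GB]
# Hypothetical helper that assembles the LSF -gpu string in the same shape
# as the profile: gmem is appended only when a GPU memory value is given.
gpu_option() {
    local num="${1:-1}" gmem_gb="${2:-}"
    local opt="-gpu num=${num}:j_exclusive=yes"
    if [ -n "$gmem_gb" ]; then
        opt="${opt}:gmem=${gmem_gb}G"
    fi
    printf '%s\n' "$opt"
}

gpu_option 2        # prints: -gpu num=2:j_exclusive=yes
gpu_option 1 40     # prints: -gpu num=1:j_exclusive=yes:gmem=40G
```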
Approximate values to target each GPU tier (request at or just below the card’s usable VRAM):
| ext.gpu_memory | Targets | Queue |
|---|---|---|
| 10.GB | RTX 2080 Ti (11 GB) | gpu |
| 15.GB | V100 16 GB | gpu |
| 23.GB | TITAN RTX / Quadro RTX (24 GB) | gpu |
| 31.GB | V100 32 GB | gpu |
| 40.GB | A100 40 GB | gpu-pro only |
| 46.GB | L40S | gpu-pro only |
| 98.GB | GH200 | gpu-pro only |
| 141.GB | H200 | gpu-pro only |
Set these directly on the process, or per process name from config (e.g. nf-core’s conf/modules.config):
process {
    // 2 GPUs, any free GPU (no gmem constraint)
    withName: 'FOO:BAR:ALIGN_GPU' {
        accelerator = 2
    }
    // 1 big-memory GPU
    withName: 'FOO:BAR:FOLD' {
        accelerator = 1
        ext.gpu_memory = 40.GB // -> A100/L40S/H200; also set --dkfz_gpu_queue gpu-pro
    }
}

⚠️ Requesting 40.GB or more only works on gpu-pro. On the plain gpu queue such a request hangs in PEND forever. Use at most 12 CPUs and ~45 GB host RAM per GPU (DKFZ GPU usage policy).
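The limits in this warning can be checked before submission with a few lines of shell arithmetic. check_gpu_request is a hypothetical helper, and treating the approximate 45 GB figure as a hard threshold is an assumption made for this sketch.

```bash
#!/bin/bash
# check_gpu_request QUEUE GPUS CPUS HOST_MEM_GB [GMEM_GB]
# Hypothetical pre-flight check: prints one line per problem and returns
# non-zero when any is found. All values are whole numbers.
check_gpu_request() {
    local queue="$1" gpus="$2" cpus="$3" host_mem_gb="$4" gmem_gb="${5:-0}"
    local problems=0
    if [ "$cpus" -gt $((12 * gpus)) ]; then
        echo "too many CPUs: ${cpus} > $((12 * gpus))"
        problems=$((problems + 1))
    fi
    if [ "$host_mem_gb" -gt $((45 * gpus)) ]; then
        echo "too much host RAM: ${host_mem_gb} GB > $((45 * gpus)) GB"
        problems=$((problems + 1))
    fi
    if [ "$gmem_gb" -ge 40 ] && [ "$queue" != "gpu-pro" ]; then
        echo "gmem of ${gmem_gb} GB needs the gpu-pro queue"
        problems=$((problems + 1))
    fi
    [ "$problems" -eq 0 ]
}

# A 40 GB GPU memory request on the plain gpu queue is flagged:
check_gpu_request gpu 1 8 32 40 || echo "fix the request before submitting"
```

The per-GPU limits scale with the number of GPUs, so a 2-GPU task may use up to 24 CPUs and 90 GB of host RAM under this reading of the policy.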
Running Nextflow on the cluster
Run the Nextflow driver inside an LSF job rather than on a submission host directly. Make a script and submit it with bsub < my_script.sh:
#!/bin/bash
#BSUB -J nf_pipeline
#BSUB -o nf_pipeline.%J.log
#BSUB -q long
#BSUB -n 2
#BSUB -R "rusage[mem=8G]"
#BSUB -W 10:00
module load Nextflow/25.10.0
# Cache images on shared storage so worker nodes can reach them:
export NXF_APPTAINER_CACHEDIR=/omics/groups/<your-group>/.../apptainer_cache
nextflow run <pipeline> \
    -profile dkfz,apptainer \
    --input samplesheet.csv \
    --outdir results

Add gpu to -profile (e.g. -profile dkfz,gpu,apptainer) to send process_gpu tasks to a GPU queue.
Config file
// Institutional profile for the DKFZ / ODCF LSF cluster.
params {
    config_profile_description = 'Deutsches Krebsforschungszentrum (DKFZ) ODCF HPC cluster profile'
    config_profile_contact = 'Abid Abrar (abid.abrar@dkfz-heidelberg.de), Kübra Narcı (kuebra.narci@dkfz-heidelberg.de)'
    config_profile_name = 'DKFZ Cluster'
    config_profile_url = 'https://www.dkfz.de'
    max_cpus = 64
    max_memory = '1000.GB'
    max_time = '720.h'
    // GPU queue for GPU jobs (options: gpu (default), gpu-lowprio, gpu-pro)
    dkfz_gpu_queue = 'gpu'
}

apptainer {
    enabled = true
    autoMounts = true
}

// Ignore the custom dkfz_gpu_queue param in nf-schema validation
validation.ignoreParams = ['dkfz_gpu_queue']

process {
    executor = 'lsf'
    scratch = '$CLUSTER_SCRATCHDIR'
    // Retry transient failures: no exit status, signals 130–145 (137 = OOM/preempt), 104/255 (I/O drops)
    errorStrategy = { (task.exitStatus == null || task.exitStatus == Integer.MAX_VALUE || task.exitStatus in ((130..145) + [104, 255])) ? 'retry' : 'finish' }
    maxRetries = 3
    cache = 'lenient'
    // Cap every task to the cluster ceiling: 64 cores, 1000 GB RAM, 720 h (30 day) wall time
    resourceLimits = [
        cpus : 64,
        memory: 1000.GB,
        time : 720.h,
    ]
    // Low defaults for unlabelled processes
    cpus = 1
    memory = 6.GB
    time = 10.min
    // GPU tasks go to a GPU queue; everything else to a CPU queue by time/memory.
    queue = {
        if (task.accelerator) {
            return params.dkfz_gpu_queue
        } else if (task.memory && task.memory > 200.GB) {
            return 'highmem'
        } else if (!task.time || task.time <= 10.min) {
            return 'short'
        } else if (task.time <= 1.h) {
            return 'medium'
        } else if (task.time <= 10.h) {
            return 'long'
        } else {
            return 'verylong'
        }
    }
    // GPU request, depends on `accelerator`: a nf-core `process_gpu` task without
    // `-profile gpu` has no accelerator, so it stays on CPU.
    // j_exclusive=yes is mandatory
    // optional `ext.gpu_memory` pins to GPUs with at least that much VRAM.
    clusterOptions = {
        if (!task.accelerator) {
            return null
        }
        def gpu = "-gpu num=${task.accelerator.request}:j_exclusive=yes"
        if (task.ext.gpu_memory) {
            gpu += ":gmem=${task.ext.gpu_memory.toGiga()}G"
        }
        return gpu
    }
    // Bind /omics into every container; add --nv for GPU tasks.
    containerOptions = { task.accelerator ? '--bind /omics --nv' : '--bind /omics' }
}

executor {
    name = 'lsf'
    perJobMemLimit = true
    perTaskReserve = false
    queueSize = 10
    submitRateLimit = '1 sec'
    exitReadTimeout = '30 min'
}