Research area

Bioinformatics

Building reproducible pipelines, algorithms and analysis workflows that turn raw data into trustworthy results — at any scale.

See the science →

Publications

Overview

What this area is.

Bioinformatics is the engineering backbone of the ecosystem. We design portable, reproducible workflows (Nextflow / Snakemake) that run identically on a laptop, an HPC cluster or the cloud, with QC baked in at every step.

Beyond running tools, we benchmark and build methods — choosing the right aligner, quantifier or statistical model for the question, and documenting every parameter so results can be reproduced and trusted.

Capabilities

What we do.

Core methods we apply in bioinformatics.

Reproducible pipelines

Nextflow / Snakemake workflows, containerised and version-pinned.

Quality control

FastQC and MultiQC across cohorts, with automated gating.

Alignment & quantification

Short- and long-read alignment, expression quantification, count matrices.

Differential analysis

Robust statistics for differential expression and abundance.

Scalability

Seamless scaling from laptop to HPC/cloud with the same code.

Method benchmarking

Objective comparison of tools against truth sets and metrics.
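As a rough illustration of comparing tools against a truth set, the sketch below scores each tool's calls by precision, recall and F1. The tool names, the variant identifiers and the `score_calls` helper are all hypothetical, invented for this example; a real benchmark would also have to handle variant normalisation and confidence regions.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkResult:
    tool: str
    precision: float
    recall: float
    f1: float


def score_calls(tool: str, calls: set[str], truth: set[str]) -> BenchmarkResult:
    """Score one tool's calls against a truth set using exact identifier matches."""
    true_pos = len(calls & truth)
    precision = true_pos / len(calls) if calls else 0.0
    recall = true_pos / len(truth) if truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return BenchmarkResult(tool, precision, recall, f1)


# Hypothetical truth set and tool outputs, keyed as "chrom:pos:ref>alt".
truth = {"chr1:100:A>G", "chr1:250:C>T", "chr2:75:G>A", "chr3:10:T>C"}
calls_by_tool = {
    "tool_a": {"chr1:100:A>G", "chr1:250:C>T", "chr2:75:G>A", "chr2:90:A>T"},
    "tool_b": {"chr1:100:A>G", "chr3:10:T>C"},
}

# Rank tools by F1, best first.
results = sorted(
    (score_calls(name, calls, truth) for name, calls in calls_by_tool.items()),
    key=lambda r: r.f1,
    reverse=True,
)
for r in results:
    print(f"{r.tool}: precision={r.precision:.2f} recall={r.recall:.2f} f1={r.f1:.2f}")
```

Reporting precision and recall alongside F1 keeps the trade-off visible: in this toy data one tool makes no false calls but misses half of the truth set.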

Workflow

From data to insight.

How a bioinformatics project flows end to end.

Raw data

FASTQ / BAM in

QC

FastQC · MultiQC

Process

align · quantify

Model

stats / ML

Visualise

figures & reports

Reproduce

containers · provenance

Visual analytics

Publication-grade figures.

Interactive, live-rendered visualisations used in bioinformatics.

Volcano plot

Differential expression at a glance.

UMAP embedding

Structure and batch effects in high-dimensional data.

Pathway network

Enriched pathways from a gene list.

Coverage track

QC of alignment depth across a region.

Focus

Where we go deep.

Workflow engineering

Portable, auditable pipelines that others can rerun exactly.

AMR sequencing

Whole-genome bacterial pipelines (Illumina & Nanopore) for resistance profiling.

Method development

New algorithms where existing tools fall short.

Insights

Questions we answer.

A few of the things people ask about bioinformatics — and our short answers. Ask CGB-AI for more.

What makes analysis reproducible?

Pinned tool versions, containers, captured parameters and a workflow engine — so the same inputs always yield the same outputs.

Can you scale to large cohorts?

Yes — the same workflow runs across HPC and cloud with parallelisation, so cohort size is a scheduling problem, not a rewrite.
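To make the reproducibility answer above more concrete, here is a minimal sketch of a run manifest that records input checksums, pinned tool versions, parameters and a container reference, then derives a run identifier from them. The `build_manifest` helper, the tool names, the versions and the container string are assumptions made for illustration, not our actual pipeline code; workflow engines such as Nextflow and Snakemake have their own provenance mechanisms.

```python
from __future__ import annotations

import hashlib
import json


def sha256_bytes(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(
    inputs: dict[str, bytes],
    tools: dict[str, str],
    params: dict[str, object],
    container: str,
) -> dict:
    """Capture what defines a run and derive a deterministic run_id from it."""
    body = {
        "inputs": {name: sha256_bytes(data) for name, data in sorted(inputs.items())},
        "tools": dict(sorted(tools.items())),
        "params": params,
        "container": container,
    }
    # Canonical JSON (sorted keys, fixed separators) so equal content hashes equally.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return {**body, "run_id": sha256_bytes(canonical.encode("utf-8"))}


# Hypothetical inputs, tool versions and container reference.
inputs = {"sample1.fastq": b"@read1\nACGT\n+\nIIII\n"}
tools = {"aligner": "1.2.3", "quantifier": "0.9.0"}
container = "registry.example.org/pipeline@sha256:placeholder"

manifest_a = build_manifest(inputs, tools, {"min_mapq": 20}, container)
manifest_b = build_manifest(inputs, tools, {"min_mapq": 20}, container)
manifest_c = build_manifest(inputs, tools, {"min_mapq": 30}, container)

print(manifest_a["run_id"] == manifest_b["run_id"])  # identical definition
print(manifest_a["run_id"] == manifest_c["run_id"])  # one parameter changed
```

Storing a manifest like this next to the results lets a later rerun be checked against the original: a matching identifier means the inputs, versions and parameters were the same.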

Selected research

Publications in Bioinformatics.

Drawn from our full record of 173 papers, filtered to this area.

Browse all publications →

Keep exploring

Start a bioinformatics project.

Tell us the biological question and the data you have — we will map out an approach.

Collaborate with us →