Emmanuelle Bourigault

Research Engineer · Multimodal Systems · VLM Evaluation

About

I build multimodal ML systems end-to-end, from large-scale data pipelines and model training to evaluation infrastructure and production deployment. My work spans VLM post-training (SFT, DPO, RLHF-style preference optimisation), hallucination and faithfulness measurement, and evaluation tooling for vision-language models at scale.

I hold a PhD from the University of Oxford (Visual Geometry Group), where I published at top venues and built production-quality codebases for multimodal learning, 3D reconstruction, and generative models.

Experience

May 2025 – Sep 2025

AI Engineer — Multimodal ML

QuantCo · London, UK
  • Built multimodal ML pipelines for production decision-support systems
  • Designed evaluation and regression-testing infrastructure for model releases
Oct 2021 – Apr 2025

PhD Researcher

University of Oxford · Oxford, UK
  • Built distributed training workflows for diffusion and vision-transformer models on multi-GPU clusters
  • Processed and curated datasets at scale (1B+ labelled masks, 48K+ source datasets)
  • Published 6 papers at ICCV, MICCAI, and BMVC, and at ICCV and CVPR workshops
Oct 2021 – Jan 2022

ML Engineer Intern

Novartis · Oxford, UK
  • Built multimodal learning pipelines for medical imaging data
  • Worked across data engineering, model training, and evaluation
Sep 2020 – Sep 2021

AI Software Engineer

GE HealthCare · Oxford, UK
  • Developed ML-powered features for clinical imaging products
  • Integrated models into production software with CI/CD and testing
Jun 2020 – Sep 2020

AI Engineer Intern

Netdevices · Paris, France
  • Prototyped deep learning pipelines for healthcare applications

Open-Source & Tooling

mmeval-vrag

Pip-installable Python package for evaluating multimodal RAG systems. Measures retrieval quality, hallucination rate, answer faithfulness, and cross-modal alignment. Supports checkpoint comparison and automated regression testing for CI integration.

pip install mmeval-vrag

SciVLA-Verify

Agentic evaluation toolkit for multimodal scientific reasoning. Generates verified reasoning traces and SFT/DPO preference data for post-training pipelines. Designed for integration into automated model-improvement loops.

Technical Skills

Languages & Core: Python (proficient), C++ (intermediate), Bash, Git
ML Frameworks: PyTorch, JAX, Hugging Face Transformers, vLLM, ONNX
Infrastructure: Docker, Kubernetes, AWS (EC2/S3/SageMaker), FastAPI, CI/CD, distributed training (multi-node GPU)
Data & Retrieval: FAISS, Pinecone, Chroma, PostgreSQL, large-scale data filtering, synthetic data generation, DICOM
Post-Training: SFT, DPO, RLHF-style preference optimisation, PEFT/LoRA/QLoRA, reward-model evaluation
Evaluation: Hallucination detection, retrieval faithfulness, visual grounding, cross-modal alignment, regression testing, benchmark design
Models: CLIP, BLIP-2, LLaVA-family, Vision Transformers, diffusion models, 3D reconstruction, multimodal RAG

Projects

UKBOB: Billion-Scale 3D Segmentation Pipeline

End-to-end pipeline generating 1B+ labelled masks across 48K+ datasets for generalizable 3D segmentation. Semi-supervised learning with automated quality control.

PyTorch · Distributed Training · Large-Scale Data

FrEVL: Parameter-Efficient VLM Adaptation

Vision-language adaptation using frozen CLIP/BLIP embeddings. Strong multimodal performance with significantly fewer trainable parameters. Designed for fast iteration and low compute cost.

CLIP · BLIP · Efficient Fine-Tuning

Agentic RAG System

Retrieval-augmented generation system with grounding diagnostics, retrieval quality metrics, and evaluation of multimodal outputs. Built for decision-support use cases.

RAG · FastAPI · Vector DB

X-Diffusion: 3D Volume Generation

Cross-sectional diffusion model generating complete 3D volumes from sparse inputs. Production-oriented codebase with reproducible training and inference pipelines.

Diffusion Models · 3D · PyTorch

MVDiff: Multi-View Diffusion for 3D Reconstruction

Scalable multi-view generation pipeline for 3D object reconstruction from single images with flexible viewpoint conditioning.

Diffusion · Multi-View · 3D Recon

2D→3D Shape Estimation (ViT)

Vision Transformer pipeline estimating 3D shape from single 2D images with clinically relevant evaluation and geometric reasoning.

ViT · 3D Recon · Medical

Publications

2025
UKBOB: One Billion Labeled Masks for Generalizable 3D Segmentation
Emmanuelle Bourigault, Amir Jamaludin, Abdullah Hamdi
ICCV 2025
2025
FrEVL: Leveraging Frozen Pretrained Embeddings for Efficient Vision-Language Understanding (Spotlight)
Emmanuelle Bourigault, Pauline Bourigault
ICCV Workshop 2025
2025
X-Diffusion: Generating 3D Volumes From a Single Image (Oral)
Emmanuelle Bourigault, Abdullah Hamdi, Amir Jamaludin
ICCV Workshop 2025
2024
Estimating 3D Shape from 2D Images Using Vision Transformers (Oral)
Emmanuelle Bourigault, Amir Jamaludin, Andrew Zisserman
MICCAI 2024
2024
Multi-Modal Information Bottleneck Attribution with Cross-Attention Guidance
Pauline Bourigault, Emmanuelle Bourigault, Danilo Mandic
BMVC 2024
2024
MVDiff: Scalable Multi-View Diffusion for 3D Reconstruction
Emmanuelle Bourigault, Pauline Bourigault
CVPR Workshop 2024

Education

2021 – 2025

PhD in Multimodal AI & Computer Vision

University of Oxford — Visual Geometry Group
Advisor: Prof. Andrew Zisserman
2019 – 2020

MSc in Computational Neuroscience

Imperial College London
2016 – 2019

BSc in Mathematics & Statistics

University College London (UCL)