Emmanuelle Bourigault

Multimodal AI Research Scientist

About

I specialize in building multimodal AI systems that bridge vision, language, and 3D understanding. My research focuses on vision-language models, LLM fine-tuning, and generative AI, turning cutting-edge research into production-ready solutions. Previously, I completed my Ph.D. at the University of Oxford's Visual Geometry Group (VGG), working on multimodal learning, generative models, and 3D reconstruction.

Education

2021 - 2025

Ph.D. in Engineering - Multimodal AI & Computer Vision

University of Oxford
Advisor: Prof. Andrew Zisserman
2019 - 2020

MSc in Computational Neuroscience

Imperial College London
2016 - 2019

BSc in Mathematics/Statistics

University College London (UCL)

Technical Skills

LLM & NLP: LangChain, LlamaIndex, Transformers, PEFT, LoRA/QLoRA, GPT-4, Claude, Llama-3, Mistral, BERT/RoBERTa, T5, RLHF, DPO, Constitutional AI, Chain-of-Thought, RAG, Prompt Engineering
Multimodal AI: CLIP, BLIP, BLIP-2, Flamingo, LLaVA, CogVLM, Cross-attention, Co-attention, Multimodal Transformers, VQA, Image Captioning, Visual Grounding
Computer Vision: Vision Transformers (ViT), SAM, DINO, Object Detection, Segmentation, 3D Reconstruction, Diffusion Models (Stable Diffusion, DDPM), Generative Models (3D/4D)
Languages & Tools: Python, C++, MATLAB, R, PyTorch, TensorFlow, JAX, Keras, scikit-learn, Weights & Biases, TensorBoard, Hugging Face, Git, Docker, CUDA, SQL, OpenCV
Infrastructure: AWS, Google Cloud, Azure, Vector Databases (Pinecone, FAISS, Chroma), Distributed Training, MLOps, FastAPI
Soft Skills: Cross-functional collaboration, stakeholder management, research leadership, scientific communication, mentorship, project management, problem-solving, critical thinking, adaptability

Work Experience

May 2025 – Sep 2025

Deep Learning Research Engineer

QuantCo, London, UK
  • Led implementation of production-ready ML pipelines
  • Engineered self-supervised models for image detection/segmentation and 3D reconstruction with scarce labels
Sep 2020 – Sep 2021

AI Software Engineer (part-time)

GE HealthCare, Oxford, UK
  • Implemented automated quality checks for multi-modal data integration in C++
  • Developed text-extraction pipelines from documents for downstream NLP tasks
  • Collaborated with a senior software engineering team of 8+ members
Jun 2020 – Sep 2020

AI Research Intern

Netdevices, Paris, France
  • Built ML pipelines integrating structured and unstructured data for outcome prediction
  • Automated data management pipelines using ML for enterprise applications

Publications

2025
UKBOB: One Billion Labeled Masks for Generalizable 3D Segmentation
Emmanuelle Bourigault, Amir Jamaludin, Abdullah Hamdi
International Conference on Computer Vision (ICCV)
Large-scale semi-supervised framework for 3D segmentation with 1B+ labeled masks from 48K+ datasets, enabling generalizable models across diverse imaging modalities.
2025
FrEVL: Leveraging Frozen Pretrained Embeddings for Efficient Vision-Language Understanding (Spotlight)
Emmanuelle Bourigault, Pauline Bourigault
International Conference on Computer Vision Safe and Trustworthy Multimodal AI Systems Workshop (ICCVW)
Safe and trustworthy vision-language approach leveraging frozen CLIP and BLIP embeddings for efficient multimodal understanding. Achieved 87% of full fine-tuning performance with 10× fewer parameters.
2025
X-Diffusion: Generating 3D Volumes From a Single Image Using Cross-Sectional Diffusion Models (Oral)
Emmanuelle Bourigault, Abdullah Hamdi, Amir Jamaludin
International Conference on Computer Vision GAIA Workshop (ICCVW)
Novel diffusion model approach for generating detailed 3D volumes from single 2D images using cross-sectional conditioning.
2024
Estimating 3D Shape from 2D Images Using Vision Transformers (Oral)
Emmanuelle Bourigault, Amir Jamaludin, Andrew Zisserman
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI)
2D-to-3D reconstruction using Vision Transformers for shape estimation from single images, achieving state-of-the-art accuracy.
2024
Multi-Modal Information Bottleneck Attribution with Cross-Attention Guidance
Pauline Bourigault, Emmanuelle Bourigault, Danilo Mandic
British Machine Vision Conference (BMVC)
Multi-modal attribution method using information bottleneck theory with cross-attention mechanisms for interpretable AI in vision-language models.
2024
MVDiff: Scalable and Flexible Multi-View Diffusion for 3D Object Reconstruction from Single-View
Emmanuelle Bourigault, Pauline Bourigault
Computer Vision and Pattern Recognition Workshop on Generative AI (CVPRW)
Scalable multi-view diffusion approach for 3D object reconstruction from single images with flexible viewpoint generation.

Research Projects

Vision-Language CLIP Model

Production-ready CLIP-based multimodal model that trains only a lightweight fusion network, achieving 85-95% of SOTA performance with 10× fewer parameters.

View Project →
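As a rough illustration of the fusion-head idea, here is a minimal PyTorch sketch that trains only a small MLP on top of frozen image and text embeddings; the embedding dimensions, layer sizes, and class count are illustrative assumptions, not the project's actual configuration.

```python
import torch
import torch.nn as nn

class FusionHead(nn.Module):
    """Lightweight fusion network trained on top of frozen CLIP-style
    embeddings. Hypothetical sketch: all dimensions are assumptions."""
    def __init__(self, img_dim=512, txt_dim=512, hidden=256, n_classes=10):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(img_dim + txt_dim, hidden),
            nn.GELU(),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, img_emb, txt_emb):
        # Concatenate the two frozen embeddings and classify.
        return self.mlp(torch.cat([img_emb, txt_emb], dim=-1))

head = FusionHead()
img = torch.randn(4, 512)   # stand-in for frozen CLIP image embeddings
txt = torch.randn(4, 512)   # stand-in for frozen CLIP text embeddings
logits = head(img, txt)
print(logits.shape)  # torch.Size([4, 10])
```

Because the CLIP encoders stay frozen, only the fusion head's few hundred thousand parameters receive gradients, which is where the parameter savings come from.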

LLM RAG System with LangChain

Production RAG system for decision support integrating multimodal inputs.

View Project →
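A minimal sketch of the retrieve-then-prompt pattern such a RAG system follows, with a toy bag-of-words retriever standing in for LangChain, a vector database, and learned embeddings; the documents and query below are invented for illustration.

```python
from collections import Counter
import math

# Toy corpus; a production system would index real documents in a
# vector database (e.g. FAISS or Pinecone) instead.
DOCS = [
    "Quarterly revenue grew 12 percent year over year.",
    "The model flags anomalous transactions for manual review.",
    "Deployment runs on FastAPI behind a load balancer.",
]

def embed(text):
    # Bag-of-words "embedding": lowercase, strip punctuation, count tokens.
    cleaned = "".join(c if c.isalnum() or c.isspace() else " " for c in text.lower())
    return Counter(cleaned.split())

def cosine(a, b):
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, k=1):
    # Rank documents by similarity to the query and keep the top k.
    q = embed(query)
    return sorted(DOCS, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query):
    # Stuff the retrieved context into the prompt sent to the LLM.
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

prompt = build_prompt("What does deployment run on")
```

The assembled `prompt` would then go to an LLM; swapping in dense embeddings and a real vector store changes only `embed` and `retrieve`, not the overall shape of the pipeline.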

FinDiffusion

Conditional diffusion models for generating realistic synthetic financial time series.

View Project →
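The forward (noising) half of such a diffusion model can be sketched in a few lines of NumPy; the schedule length, beta range, and toy sine-wave "price" series below are illustrative assumptions, not the project's configuration.

```python
import numpy as np

# DDPM forward process on a 1-D series:
#   x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps
rng = np.random.default_rng(0)
T = 1000
betas = np.linspace(1e-4, 0.02, T)       # linear noise schedule (assumed)
alpha_bar = np.cumprod(1.0 - betas)      # cumulative signal retention

def q_sample(x0, t, eps):
    """Sample x_t given the clean series x0 at diffusion step t."""
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

x0 = np.sin(np.linspace(0, 4 * np.pi, 128))  # stand-in time series
eps = rng.standard_normal(x0.shape)
x_mid = q_sample(x0, 500, eps)      # partially noised
x_end = q_sample(x0, T - 1, eps)    # nearly pure noise
```

Training then amounts to teaching a network to predict `eps` from `x_t` and `t` (plus any conditioning signal), and sampling runs the process in reverse to generate synthetic series.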

Safe RL Normalizing Flows

Safe reinforcement learning with normalizing flows for uncertainty quantification in time series.

View Project →
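The defining property of a normalizing flow, an invertible transform with a tractable log-determinant so that exact densities are available for uncertainty estimates, can be illustrated with a single element-wise affine layer; the project's actual flow architecture is not specified here.

```python
import numpy as np

class AffineFlow:
    """One element-wise affine flow layer: z = x * exp(s) + b.
    Illustrative only; real flows stack many such invertible layers."""
    def __init__(self, log_scale, shift):
        self.log_scale = np.asarray(log_scale, dtype=float)
        self.shift = np.asarray(shift, dtype=float)

    def forward(self, x):
        # log|det J| of an element-wise affine map is just sum(log_scale).
        z = x * np.exp(self.log_scale) + self.shift
        return z, np.sum(self.log_scale)

    def inverse(self, z):
        # Exact inverse, so samples map back to the base distribution.
        return (z - self.shift) * np.exp(-self.log_scale)

flow = AffineFlow(log_scale=[0.5, -0.3], shift=[1.0, 2.0])
x = np.array([0.2, 0.4])
z, logdet = flow.forward(x)
x_back = flow.inverse(z)
```

The exact inverse and log-determinant are what let a flow assign calibrated likelihoods to trajectories, which is the hook for uncertainty quantification in a safe-RL setting.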

UKBOB: Large-Scale Segmentation

Developing one billion labeled masks for generalizable 3D segmentation across diverse domains, enabling large-scale training of AI models.

View Project →

3D Reconstruction from 2D Images

Automated framework using Vision Transformers to estimate 3D shape from single 2D images, achieving state-of-the-art reconstruction accuracy.

View Project →

X-Diffusion: 3D Generation

Cross-sectional diffusion model for generating complete 3D volumes from single slices, setting new benchmarks in volumetric synthesis.

View Project →