I’m a research scientist working on speech, language, and multimodal machine learning. My work has focused on making spoken-language systems robust under real-world conditions: limited training data, cross-lingual transfer, noisy recordings, and information spread across audio, video, and text.
The thread runs from articulatory phonetics during my PhD, through low-resource speech recognition and language-model data augmentation in the Babel programme at LIMSI/CNRS and environmental sound-event detection at Tampere University, to self-supervised multimodal modelling at Aalto University.
PhD from NTU Singapore (2012). See research for selected papers, or Google Scholar for the full list.
📚 Google Scholar · 💼 LinkedIn · 💻 GitHub