Metabolomic and Proteomics Biomarker Discovery

Population Scale Metabolomics Analysis and Capacity Building

Collaborated with cross-functional teams at Sapient Bioanalytics, including mass spectrometry, computational chemistry, and principal scientists from the MOMI sites, to manage and clean data from 50,000 samples, each with over 40,000 metabolomic features, making it the largest metabolomics study to date.
Supported capacity-building initiatives in developing countries by organizing workshops on metabolomics data analysis and mentoring global researchers and healthcare professionals.

Biomarker identification

Led population-scale biomarker identification and machine-learning risk score development for preeclampsia, stillbirth, small for gestational age, and preterm birth across five global locations as part of the Gates Foundation's MOMI project.

Pipeline Development

Developed and optimized a three-stage analytical pipeline:

Data Cleaning and Stratification: Streamlined the process for handling large-scale datasets, ensuring the integrity and quality of the data for subsequent analysis.

Feature Engineering and Regression Analysis with PySpark: Leveraged advanced statistical learning techniques and conducted meta-analysis and pooled analysis to extract meaningful features, synthesize results, and develop predictive models, enabling scalable and efficient processing of vast datasets.
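To illustrate the meta-analysis step, here is a minimal sketch of fixed-effect inverse-variance pooling of per-site regression estimates. It is not the project's code: the choice of a fixed-effect model, the function name `fixed_effect_meta`, and the example numbers are all assumptions made for illustration, and it uses plain Python rather than PySpark.

```python
import math


def fixed_effect_meta(estimates, std_errors):
    """Pool per-site effect estimates using inverse-variance weights.

    Returns (pooled_estimate, pooled_standard_error, two_sided_p_value).
    """
    if not estimates or len(estimates) != len(std_errors):
        raise ValueError("need one standard error per estimate")

    # Each site is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)

    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)

    # Two-sided p-value from the standard normal distribution.
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled, pooled_se, p_value


# Hypothetical per-site coefficients for one metabolite feature.
site_betas = [0.2, 0.4]
site_ses = [0.1, 0.1]
pooled, pooled_se, p_value = fixed_effect_meta(site_betas, site_ses)
```

With equal standard errors, as in this made-up example, the pooled estimate is simply the mean of the site estimates; sites with smaller standard errors would otherwise pull the pooled value toward their own estimate.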

Data Visualization and Pathway Enrichment Analysis

Used data visualization tools and pathway enrichment analysis to identify significantly enriched biological pathways and to assess the biological relevance of the findings.
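One common way to test pathway enrichment is a hypergeometric over-representation test. The sketch below shows that calculation under the assumption that this is the kind of test meant here; the function name `enrichment_p_value` and the example counts are hypothetical and do not come from the project.

```python
from math import comb


def enrichment_p_value(n_background, n_in_pathway, n_selected, n_overlap):
    """Probability of seeing at least n_overlap pathway members among
    n_selected features drawn from n_background features, of which
    n_in_pathway belong to the pathway (hypergeometric upper tail)."""
    upper = min(n_in_pathway, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_in_pathway, k) * comb(n_background - n_in_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 10 measured metabolites, 5 in the pathway,
# 5 flagged as significant, all 5 of which fall in the pathway.
p = enrichment_p_value(10, 5, 5, 5)
```

In practice the p-values from many pathways would also need a multiple-testing correction, which this sketch leaves out.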

Machine Learning Risk Score Development

Metabolic Risk Score Development with XGBoost: Trained XGBoost models on the derived biomarkers, with hyperparameter optimization via Optuna, to develop a metabolic risk scoring system.

Conducted mass-spectrometry-based metabolomic and proteomic analyses

Conducted mass-spectrometry-based metabolomic and proteomic analyses on samples treated with a semaglutide lead compound, using bioinformatics workflows and machine learning techniques to identify molecular signatures, potential biomarkers, and mechanistic insights supporting therapeutic development.

Take a Chance!

“In the end… we only regret the chances we didn’t take, the relationships we were afraid to have, and the decisions we waited too long to make.” ― Lewis Carroll
