
General Information

Name: Manan Saxena
Phone number: +1 (814) 769-0852
Location: State College, PA, USA (Willing to relocate)

Skills and Tools

Programming Languages: C++, Python, R, SQL, Stan, JavaScript, HTML, CSS
Web Development: Flexbox, Bootstrap, webapp2, Google App Engine, Google Datastore, jQuery, Docker, GitHub, Linux, Microsoft Office
Data Science: PyTorch, Keras, LangChain, Streamlit, Pandas, Scikit-learn, SciPy, Matplotlib, statsmodels, Prophet, Rcpp, Tidyverse, OpenCV, Eigen, Boost, Tableau
Statistical Methods: Generalized Linear Models, Dynamic Linear Models, Classical Time Series Models (ARIMA, AR, MA, Exponential Smoothing), PCA, Cluster Analysis, ANOVA, Missing Value and Imputation Methods, Markov Chain Monte Carlo
Machine Learning Methods: Linear Regression, Logistic Regression, Decision Trees, Random Forest, Support Vector Machine, Gaussian Mixture Models, Expectation Maximization, XGBoost, Neural Networks

Education

  • 2022-2024
    Master of Science, Informatics (Data Science)
    Penn State University
    • Advisor: Prof. Justin Silverman
    • Cumulative GPA: 3.9/4.0
    • Courses: Applied Statistics, Theory of Statistics, Data Mining, Deep Learning, Advanced Computer Vision
  • 2016-2020
    Bachelor of Technology, Software Engineering
    Delhi College of Engineering
    • Advisors: Prof. Aruna Bhat, Prof. Dinesh Kumar Vishwakarma, Prof. Chhavi Dhiman
    • Cumulative GPA: 3.9/4.0
    • Courses: Artificial Intelligence, Machine Learning, Natural Language Processing, Data Structures and Algorithms

Experience

  • May' 23 - Present
    Statistical Researcher
    Penn State University
    • Advisor: Prof. Justin Silverman
    • Developed scalable Bayesian inference for a class of time-series models used to represent longitudinal multinomial data.
    • Validated the approach by applying it to the factors and patterns driving microbiota variation in the human gut.
    • Used matrix calculus to derive closed-form gradients of the model for parameter estimation, so the optimizer converged 1.5x faster than with automatic differentiation in Stan.
    • Generated 95% credible intervals for the estimated parameters using the Bayesian bootstrap.
    • Created a high-performance C++ header library for a parallelized Metropolis sampler for probability distributions, with an R interface, that is 32x faster than base R.
  • Sept' 21 - June' 22
    Software Development Engineer
    • Managed end-to-end software development of a feature addition to the platform's core sequence builder functionality.
    • Ensured zero-fault live deployments and optimal cross-platform performance.
    • Reached more than 800 users within a month of release.
    • Directed a UI/UX refinement project that integrated feedback collected during customer interactions along with general feedback.
    • Developed REST API endpoints in the webapp2 framework integrated with Google App Engine and Google Datastore. Built UIs using Bootstrap and handled dynamic behavior with JavaScript and jQuery.
    • Wrote scripts to automate the restructuring of legacy data using the Google Sheets API.
  • June' 19 - July' 21
    Machine Learning Engineer
    Trinity College Dublin
    • Developed a proof of concept that predicted injuries and subsequent rehabilitation needs in contact sports from video footage, using 3D pose estimation, object tracking, and instance segmentation.
    • Tested the proof-of-concept model on a novel rugby tackle dataset, comparing it against an industry-benchmark motion capture system (VICON).
    • Created a model to forecast player movements in rugby tackles using a Kalman filter.
    • Built an automated pipeline for camera calibration and face blurring to acquire human motion video datasets.
    • Collaborated with coaches and physiotherapists during prototype development to integrate domain knowledge.

Projects

  • Jan' 24
    Music Lyrics Analysis and Q&A System
    Developed an interactive system for analyzing and answering queries about music lyrics. Used the LangChain framework with the ChatGPT (LLM) API for natural language processing (NLP) and the YouTube API for lyric extraction from music videos.
  • Aug' 19 - May' 20
    Classification of Breast Cancer Histology Images through Distilling Knowledge
    Implemented a lightweight CNN model for high-resolution breast cancer histology image classification, using knowledge distillation and attention maps with KL divergence as the loss. Leveraged ResNet-50 as a teacher model to improve the performance of a lighter ResNet-8 model, raising its accuracy from 75% to 80%.
  • Dec' 18 - May' 19
    Skeleton-Based View Invariant Deep Features for Human Activity Recognition
    Introduced novel view-invariant skeletal features to describe spatial-temporal characteristics of human motion. Achieved a 2% accuracy improvement over existing state-of-the-art models on the NUCLA dataset through the application of transfer learning and dynamic image techniques.

Honors and Awards

  • 2016
    • All India Rank 2,661 out of 1.4 million candidates in Joint Entrance Examination Mains (JEE Mains)
  • 2012
    • National Talent Search Examination scholar, ranked among the top 1,000 out of 1.2 million candidates, awarded a scholarship until the completion of undergraduate studies.

Professional Affiliations

  • 2024
    • Member, American Statistical Association (ASA).

Other Interests

  • Music
  • Sports
  • Movies
  • Anime