Conducted a large-scale statistical analysis of voltage behavior across 200+ corporations to define normal operating ranges. Applied feature engineering and unsupervised learning (GMM, DBSCAN, K-Means, PCA), and built a fully automated, scalable ML pipeline in Python (OOP) for anomaly detection and voltage profiling.
The solution improved operational efficiency, reduced costs, and enabled automated technician dispatch by automatically flagging voltage anomalies for review.
Note: Project details are protected under NDA and not open-sourced.
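Because the pipeline itself is NDA-protected, the sketch below is purely illustrative: it shows the general density-scoring idea behind GMM-based anomaly detection on invented voltage features, with the dataset, component count, and threshold all chosen for the example rather than taken from the project.

```python
# Illustrative sketch only -- the actual NDA-protected pipeline is not shown.
# Idea: fit a Gaussian Mixture Model to engineered voltage features, then
# treat samples in low-density regions of the mixture as anomalies.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Synthetic stand-in for engineered voltage features (e.g. mean level, spread)
normal = rng.normal(loc=[230.0, 1.5], scale=[2.0, 0.3], size=(1000, 2))
faulty = rng.normal(loc=[210.0, 6.0], scale=[5.0, 1.0], size=(20, 2))
X = StandardScaler().fit_transform(np.vstack([normal, faulty]))

# Model the "normal operating range" as a 3-component mixture (hypothetical)
gmm = GaussianMixture(n_components=3, random_state=0).fit(X)

# Score every sample; the lowest-likelihood tail is flagged as anomalous
log_density = gmm.score_samples(X)
threshold = np.percentile(log_density, 2)  # illustrative 2% contamination
anomalies = log_density < threshold
print(f"Flagged {anomalies.sum()} of {len(X)} readings for technician review")
```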
A hands-on case study exploring simple and multiple linear regression to predict product sales from advertising budgets across TV, Radio, and Newspaper. Key contributions include manual implementations of OLS and gradient descent, hypothesis testing, confidence interval analysis, and interaction-effect modeling, emphasizing both theoretical foundations and statistical interpretability in Python.

View on GitHub
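A minimal sketch of the two estimation routes implemented by hand in the case study, closed-form OLS and batch gradient descent, run here on synthetic stand-in data rather than the actual Advertising dataset:

```python
import numpy as np

rng = np.random.default_rng(0)
tv = rng.uniform(0.0, 300.0, size=200)                # synthetic TV ad spend
sales = 7.0 + 0.05 * tv + rng.normal(0.0, 2.0, 200)   # true line plus noise

# Closed-form OLS for simple linear regression:
#   beta1 = sum((x - x_bar)(y - y_bar)) / sum((x - x_bar)^2)
x_bar, y_bar = tv.mean(), sales.mean()
beta1 = np.sum((tv - x_bar) * (sales - y_bar)) / np.sum((tv - x_bar) ** 2)
beta0 = y_bar - beta1 * x_bar

# Batch gradient descent on the MSE; standardizing x first keeps the loss
# surface well-conditioned, so a modest learning rate converges quickly
x_std = (tv - x_bar) / tv.std()
b0 = b1 = 0.0
lr = 0.1
for _ in range(5000):
    resid = sales - (b0 + b1 * x_std)
    b0 += lr * resid.mean()            # step along -gradient for intercept
    b1 += lr * (resid * x_std).mean()  # step along -gradient for slope

slope = b1 / tv.std()                  # map coefficients back to raw scale
intercept = b0 - slope * x_bar
print(f"OLS:              intercept={beta0:.3f}, slope={beta1:.4f}")
print(f"Gradient descent: intercept={intercept:.3f}, slope={slope:.4f}")
```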
Manually implemented and visualized gradient descent optimization paths across various cost functions to investigate convergence behavior under different initialization strategies and learning rates. Explored phenomena such as slow convergence, divergence, oscillation, and entrapment in local minima through four structured case studies.
This project highlights practical limitations of gradient-based optimization and offers concrete guidance for debugging, hyperparameter tuning, and model optimization in machine learning workflows.
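A condensed sketch of one such experiment, assuming an illustrative one-dimensional cost (the actual case studies use their own cost functions, initializations, and learning rates): a quartic with two minima, traced under a small, a moderate, and a deliberately oversized step size.

```python
import numpy as np
import matplotlib.pyplot as plt

def f(x):
    return x**4 - 3 * x**2 + x        # non-convex cost with two minima

def grad(x):
    return 4 * x**3 - 6 * x + 1

def descend(x0, lr, steps=60):
    """Record the gradient descent iterates starting from x0."""
    path = [x0]
    for _ in range(steps):
        x0 = x0 - lr * grad(x0)
        if abs(x0) > 1e3:             # abort once the run has clearly diverged
            break
        path.append(x0)
    return np.array(path)

xs = np.linspace(-2.2, 2.2, 400)
plt.plot(xs, f(xs), color="gray", label="cost f(x)")
for lr in (0.01, 0.08, 0.3):          # slow / healthy / divergent step sizes
    path = descend(x0=2.0, lr=lr)
    plt.plot(path, f(path), "o-", markersize=3, label=f"lr={lr}")
plt.xlim(-2.3, 2.3)
plt.ylim(-5, 12)                      # keep diverged iterates out of view
plt.xlabel("x"); plt.ylabel("f(x)"); plt.legend()
plt.title("Gradient descent paths under different learning rates")
plt.show()
```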
A comprehensive, step-by-step exploration of core statistical inference concepts (null vs. alternative hypotheses, p-values, significance levels, and critical values), paired with a deep dive into one-tailed vs. two-tailed tests. Through detailed markdown notes, custom visual illustrations, and runnable Jupyter notebooks with a simulated dataset, this project helps students, data practitioners, and interview-prep learners build strong intuition and hands-on skills in hypothesis testing.

View on GitHub
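A minimal sketch of the core computation the notebooks walk through, assuming a one-sample t-test on simulated data (the repo's exact dataset and test settings may differ): both p-values come from the same t statistic, differing only in which tail(s) count as evidence against the null.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
mu0 = 100.0                                      # H0: population mean is 100
sample = rng.normal(loc=102.0, scale=8.0, size=40)

# t statistic: standardized distance of the sample mean from mu0
t_stat = (sample.mean() - mu0) / (sample.std(ddof=1) / np.sqrt(len(sample)))
df = len(sample) - 1

p_one_tailed = stats.t.sf(t_stat, df)            # H1: mean > 100 (right tail)
p_two_tailed = 2 * stats.t.sf(abs(t_stat), df)   # H1: mean != 100 (both tails)
t_crit = stats.t.ppf(0.95, df)                   # critical value, alpha = 0.05

print(f"t = {t_stat:.3f} (one-tailed critical value {t_crit:.3f})")
print(f"one-tailed p = {p_one_tailed:.4f}, two-tailed p = {p_two_tailed:.4f}")
```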
Designed and visualized both parametric (t-distribution) and non-parametric (bootstrap) confidence intervals to demonstrate how uncertainty is estimated in statistical analysis. The project clarifies what confidence intervals really mean, shows how they are constructed, and debunks common misinterpretations, all through hands-on simulations and intuitive visual guides.
View on GitHub
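A minimal sketch of the contrast the project visualizes, assuming a single simulated skewed sample (sample size, resample count, and data-generating choices here are illustrative): a parametric t-interval next to a percentile-bootstrap interval for the same mean.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
sample = rng.exponential(scale=10.0, size=60)    # skewed data, n = 60

# Parametric 95% CI: mean +/- t_crit * standard error
se = sample.std(ddof=1) / np.sqrt(len(sample))
t_crit = stats.t.ppf(0.975, df=len(sample) - 1)
param_ci = (sample.mean() - t_crit * se, sample.mean() + t_crit * se)

# Percentile bootstrap 95% CI: resample with replacement, then take the
# 2.5% and 97.5% quantiles of the resampled means
boot_means = [
    rng.choice(sample, size=len(sample), replace=True).mean()
    for _ in range(10_000)
]
boot_ci = tuple(np.percentile(boot_means, [2.5, 97.5]))

print(f"t-interval:         ({param_ci[0]:.2f}, {param_ci[1]:.2f})")
print(f"bootstrap interval: ({boot_ci[0]:.2f}, {boot_ci[1]:.2f})")
```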