Projects

Electricity Price Forecasting: Autoregressive and LSTM Model

Developed an electricity price forecasting model to predict day-ahead Locational Marginal Prices (LMP) using time-series analysis and machine learning. After extensive data exploration, which revealed strong daily and weekly seasonality, implemented and compared multiple models, including Seasonal ARIMAX and LSTM neural networks. The optimized LSTM model, enhanced with lagged features, seasonal differencing, and time-based variables, achieved an MAE of 3.92 and an RMSE of 9.21, outperforming the benchmark SARIMAX model (MAE: 4.05, RMSE: 9.87).
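
As a rough illustration of the feature engineering and error metrics mentioned above, here is a minimal sketch in plain Python. The 24-hour season length, the chosen lags, and the toy price series are assumptions made for the example; they are not taken from the project's data or code.

```python
import math

def seasonal_difference(series, period=24):
    """Subtract the value one season earlier; the first `period` points are dropped."""
    return [series[t] - series[t - period] for t in range(period, len(series))]

def lagged_features(series, lags=(1, 24)):
    """Return (rows, targets), where each row holds the values at the given lags."""
    start = max(lags)
    rows = [[series[t - lag] for lag in lags] for t in range(start, len(series))]
    targets = series[start:]
    return rows, targets

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Hypothetical hourly prices: a repeating 24-hour pattern plus a slow upward drift
prices = [30 + 10 * math.sin(2 * math.pi * (h % 24) / 24) + 0.01 * h for h in range(96)]

diffed = seasonal_difference(prices, period=24)
rows, targets = lagged_features(prices, lags=(1, 24))

# Seasonal-naive baseline: predict each hour with the price 24 hours earlier
naive = [row[1] for row in rows]
print(round(mae(targets, naive), 4), round(rmse(targets, naive), 4))
```

A seasonal-naive baseline like this is a common sanity check: any trained model should beat it on both MAE and RMSE.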

  •   Code and Report

AI in Decision Making and Control

Vehicle Trajectory Forecasting Using LSTM Networks

This project develops an LSTM-based deep learning model to predict a vehicle's future trajectory by analyzing its past movement patterns. The model takes 62 timesteps of sequential motion data (including position, velocity, and acceleration) as input and forecasts the vehicle's coordinates (Local_X, Local_Y) for the next 5 timesteps. The architecture consists of a 50-unit LSTM layer followed by two dense layers (30 and 10 units, ReLU activation), trained on 9,400 time-series files over 60 epochs using the Adam optimizer and MSE loss. Designed for applications in autonomous driving and traffic analysis, this model effectively captures temporal dependencies to enable accurate short-term trajectory predictions.
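
The input and output shapes described above can be sketched as a windowing step. This is only an illustration in plain Python: the sliding-window approach, the feature order, and the toy track are assumptions, and the project may instead treat each file as a single fixed-length sample. The network itself is not reproduced here.

```python
def make_windows(track, n_in=62, n_out=5, target_idx=(0, 1)):
    """Slice one trajectory into (input window, future coordinates) pairs.

    track: list of per-timestep feature lists. The columns at target_idx are
    assumed to be Local_X and Local_Y.
    """
    inputs, targets = [], []
    for start in range(len(track) - n_in - n_out + 1):
        past = track[start:start + n_in]
        future = track[start + n_in:start + n_in + n_out]
        inputs.append(past)
        targets.append([[step[i] for i in target_idx] for step in future])
    return inputs, targets

# Hypothetical track: 70 timesteps of [Local_X, Local_Y, velocity, acceleration]
track = [[0.1 * t, 2.0 * t, 20.0, 0.0] for t in range(70)]

X, y = make_windows(track)
print(len(X), len(X[0]), len(y[0]), len(y[0][0]))
```

Each element of `X` holds 62 timesteps of features and each element of `y` holds the next 5 (Local_X, Local_Y) pairs, which matches the sequence-in, coordinates-out shape an LSTM regressor would be trained on.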

  •   Code

Data Mining and Analytics

Predictive Maintenance of Aircraft Engine

Predictive maintenance techniques are employed to assess the condition of equipment, enabling proactive maintenance or failure prevention before issues occur. This approach is highly beneficial as it significantly reduces equipment downtime costs. To solve the regression task of predicting the remaining useful life (RUL) of a machine, I built an LSTM model using the Keras library. [In Progress]
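
For context, the regression target in an RUL task is often derived from run-to-failure records. The sketch below assumes that setup: each engine is observed until failure, and RUL is the number of cycles left. The optional cap and the 200-cycle engine are hypothetical and may differ from what the project uses.

```python
def rul_labels(cycles, cap=None):
    """Remaining useful life for each recorded cycle of one engine.

    Assumes the last recorded cycle is the failure point. If `cap` is given,
    early-life labels are clipped to that value (a common but optional choice).
    """
    last = max(cycles)
    labels = [last - c for c in cycles]
    if cap is not None:
        labels = [min(r, cap) for r in labels]
    return labels

# Hypothetical engine that fails at cycle 200
engine_cycles = list(range(1, 201))

uncapped = rul_labels(engine_cycles)
capped = rul_labels(engine_cycles, cap=130)
print(uncapped[:3], uncapped[-1], capped[:3])
```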

  •   Code

Home Credit Default Risk

Developed a machine learning model for a Kaggle competition, as part of a course project, to predict the likelihood of loan default for Home Credit Group, a financial institution serving underbanked populations. Performed extensive exploratory data analysis on a dataset of 300,000+ loan applications with 122 features. Preprocessed the data by handling missing values, encoding features, and normalizing. Compared Logistic Regression, Random Forest, and boosting methods, optimizing performance with AutoML (PyCaret/H2O.ai) for hyperparameter tuning. XGBoost achieved the best performance with an ROC-AUC score of 0.74.
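
To make the ROC-AUC metric concrete, here is a small from-scratch sketch. It uses the pairwise interpretation (the chance that a randomly chosen defaulter is scored above a randomly chosen non-defaulter) and is meant only as an illustration; the labels and scores are made up, and a project would normally rely on a library implementation.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    This pairwise form is quadratic in the sample size, so it is for
    illustration only.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = default) and predicted default probabilities
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

auc = roc_auc(labels, scores)
print(auc)
```

A score of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which gives a scale for reading the 0.74 reported above.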

  •   Code

US Vehicle Accidents Analysis

This project analyzed US road accident data spanning 2016-2022. I focused on California, the top state for road accidents in the United States. The analysis aimed to uncover the primary factors behind California's high accident rates compared to other states, identifying patterns and trends to inform targeted strategies for accident prevention and overall road safety.

  •   Code

Building Energy Analysis

Performed time series analysis on 500+ meter time series from the Building Data Genome Project. Used k-means clustering on electrical meter data to identify daily load profiles and implemented a k-nearest neighbor regression model that predicts energy consumption with a MAPE of 6.59%.
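
A minimal sketch of k-nearest neighbor regression and the MAPE metric, in plain Python. The single feature (outdoor temperature), the readings, and the choice of k are hypothetical and are not drawn from the project; they only show how the prediction and the error figure are computed.

```python
def knn_predict(train_X, train_y, query, k=3):
    """Average the targets of the k training points closest to `query`."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), y)
        for x, y in zip(train_X, train_y)
    )
    nearest = dists[:k]
    return sum(y for _, y in nearest) / len(nearest)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be nonzero."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical training data: [outdoor temperature] -> energy use in kWh
train_X = [[0], [1], [2], [10]]
train_y = [10, 12, 14, 50]

pred = knn_predict(train_X, train_y, [1], k=3)
error = mape([100, 200, 400], [110, 190, 400])
print(pred, round(error, 2))
```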

  •   Code

Tableau Dashboards

I developed two interactive Tableau dashboards as part of my projects. One dashboard focused on Netflix movie analytics, offering insights into viewership trends, ratings, and popular genres. The other dashboard tracked retail sales data for a bicycle company operating in Australia, providing detailed analysis of sales performance, geographical distribution, and product trends.

  •   Netflix
  •   Bicycle Sales

Risk Analysis

Risk Analysis for Failure of EV Batteries

Conducted a quantitative risk assessment of electric vehicle batteries using fault tree and event tree analysis to identify potential failure modes and their probabilities, highlighting risks of component failure due to overheating. The performance risk assessment uses cyclic life testing data to assess battery longevity and failure rates. Key findings include a top-event risk value of only 8.19%. Reliability modeling using the Weibull distribution, together with uncertainty analysis, estimates a mean time to failure (MTTF) of 13,080 hours.
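
The two calculations named above can be sketched as follows. The gate formulas assume independent basic events, and the Weibull MTTF uses the standard two-parameter form, scale × Γ(1 + 1/shape). All probabilities and parameters below are hypothetical; the sketch does not attempt to reproduce the report's 8.19% or 13,080-hour figures, since the underlying inputs are not given here.

```python
import math

def or_gate(probs):
    """Probability that at least one independent input event occurs."""
    none_occur = 1.0
    for p in probs:
        none_occur *= (1 - p)
    return 1 - none_occur

def and_gate(probs):
    """Probability that all independent input events occur."""
    return math.prod(probs)

def weibull_mttf(scale, shape):
    """Mean time to failure of a two-parameter Weibull distribution."""
    return scale * math.gamma(1 + 1 / shape)

# Hypothetical fault tree: overheating requires both a cooling fault and a
# sensor fault; the top event occurs on overheating OR a separate cell defect.
overheating = and_gate([0.5, 0.2])
top_event = or_gate([overheating, 0.2])

# Hypothetical Weibull parameters (scale in hours)
mttf = weibull_mttf(scale=1000, shape=2)
print(round(top_event, 4), round(mttf, 1))
```

With shape equal to 1 the Weibull reduces to the exponential distribution and the MTTF equals the scale parameter, which is a quick check on the formula.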

  •   Report

Simulation of Production Systems

Optimizing CNC Stations in Turbine Manufacturing

This project optimizes Mareana Turbine's production lines, identifying bottlenecks at QA and CNC stations through FlexSim simulation. Proposed enhancements include additional workstations for key impeller lines and adopting Industry 4.0 technologies like predictive maintenance and AI inspection to boost efficiency. The strategy aims to meet demand and increase revenue by $700K while incorporating digital advancements at a cost of $150K.

  •   Presentation