
This Databricks-based machine learning project predicts U.S. domestic flight delays by analyzing millions of flight and weather records. Using PySpark and MLlib, we applied distributed data processing and trained classification models at scale. Feature engineering and model training ran as MapReduce-style operations, surfacing delay patterns driven by weather, airline, and airport congestion. The project demonstrates how cloud-native platforms and big data tools can enable scalable, real-time predictive analytics.
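
To make the MapReduce-style feature engineering concrete, here is a minimal sketch in plain Python that computes a per-airline delay rate. It is an illustration only, not the project's pipeline: the sample records, the 15-minute delay threshold, and the names `map_flight` and `reduce_counts` are all assumptions. In PySpark the same pattern would typically be expressed with `rdd.map(...)` followed by `reduceByKey(...)`.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical sample records: (airline code, departure delay in minutes).
flights = [
    ("AA", 5), ("AA", 30),
    ("DL", 0), ("DL", 20), ("DL", 45),
    ("UA", 10),
]

# Assumption: a flight counts as delayed at 15 minutes or more.
DELAY_THRESHOLD_MIN = 15


def map_flight(record):
    """Map step: emit (airline, (delayed_flag, 1)) for one flight."""
    airline, delay = record
    return airline, (1 if delay >= DELAY_THRESHOLD_MIN else 0, 1)


def reduce_counts(acc, pair):
    """Reduce step: accumulate delayed and total counts per airline."""
    airline, (delayed, total) = pair
    prev_delayed, prev_total = acc[airline]
    acc[airline] = (prev_delayed + delayed, prev_total + total)
    return acc


counts = reduce(
    reduce_counts,
    map(map_flight, flights),
    defaultdict(lambda: (0, 0)),
)

# Per-airline share of delayed flights, usable as a model feature.
delay_rate = {
    airline: delayed / total for airline, (delayed, total) in counts.items()
}
print(delay_rate)
```

Because the reduce step only sums counts, it is associative and commutative, which is what lets a distributed engine combine partial results from different partitions in any order.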
Citation
For attribution, please cite this work as:
Bakr, Mohamed, Erica Landreth, Danielle Yoseloff, and
Shruti Gupta. 2025. “Flight Delay Predictions.” April 1. https://mohdbakr.com/projects/flight-delays/.