alvaroame/FlightDelaySpark

FlightDelaySpark

Spark application that creates a machine learning model for predicting the arrival delay of commercial flights

How to execute the script

First, install the requirements

python -m pip install -r requirements.txt

To run locally, execute spark-submit --master local[*] FlightDelay.py PATH SAMPLE LOG

  • PATH is the location of the CSV files; default: data/*.csv
  • SAMPLE is the fraction, between 0 and 1, of the original data set to sample (0.1 is 10%); default: 1.0 (100%)
  • LOG is the log level (INFO, WARN, or ERROR); default: WARN
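
The handling of these three positional arguments can be sketched as follows. This is a minimal illustration, not the code in FlightDelay.py: the function name parse_args and its validation rules are assumptions, and only the defaults are taken from the list above.

```python
# Hypothetical sketch: how PATH, SAMPLE and LOG could be read from the
# command line. The real FlightDelay.py may parse them differently.

VALID_LOG_LEVELS = {"INFO", "WARN", "ERROR"}


def parse_args(argv):
    """Return (path, sample, log_level), applying the documented defaults."""
    path = argv[0] if len(argv) > 0 else "data/*.csv"
    sample = float(argv[1]) if len(argv) > 1 else 1.0
    log_level = argv[2].upper() if len(argv) > 2 else "WARN"

    # SAMPLE is documented as a fraction between 0 and 1.
    if not 0.0 <= sample <= 1.0:
        raise ValueError(f"SAMPLE must be between 0 and 1, got {sample}")
    # LOG is documented as one of INFO, WARN, ERROR.
    if log_level not in VALID_LOG_LEVELS:
        raise ValueError(f"LOG must be one of {sorted(VALID_LOG_LEVELS)}")
    return path, sample, log_level


# Example: only PATH and SAMPLE given, so LOG falls back to WARN.
print(parse_args(["file:///C:/data", "0.1"]))
```

In a script, the list passed to parse_args would typically be sys.argv[1:], so that omitted trailing arguments fall back to their defaults.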

You can find data here: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/HG7NV7

Execution example

  • Execution example with PATH
%SPARK_HOME%\bin\spark-submit --master local[*] FlightDelay.py file:///C:\UPM\big_data_assignments\data\2000
  • Execution example with PATH and SAMPLE
%SPARK_HOME%\bin\spark-submit --master local[*] FlightDelay.py file:///C:\UPM\FlightDelaySpark\data 0.1
  • Execution example with PATH, SAMPLE and LOG
%SPARK_HOME%\bin\spark-submit --master local[*] FlightDelay.py file:///C:\UPM\FlightDelaySpark\data 0.05 ERROR
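
For orientation, here is a rough sketch of how the three arguments might be applied once a Spark session exists. The function name load_flights, the header and inferSchema options, and the fixed seed are assumptions made for illustration; the source does not show how FlightDelay.py does this.

```python
# Hypothetical sketch: applying PATH, SAMPLE and LOG with the PySpark API.
# `spark` is expected to be a pyspark.sql.SparkSession.

def load_flights(spark, path="data/*.csv", sample=1.0, log_level="WARN"):
    """Set the log level, read the CSV files, and optionally downsample."""
    # LOG: controls how verbose Spark's own logging is.
    spark.sparkContext.setLogLevel(log_level)

    # PATH: a glob such as data/*.csv reads every matching file.
    df = spark.read.csv(path, header=True, inferSchema=True)

    # SAMPLE: keep roughly this fraction of rows; Spark does not
    # guarantee an exact row count for the sample.
    if sample < 1.0:
        df = df.sample(withReplacement=False, fraction=sample, seed=42)
    return df


# Usage (requires pyspark and local CSV files):
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.appName("FlightDelay").getOrCreate()
#   flights = load_flights(spark, "data/*.csv", sample=0.1, log_level="ERROR")
```

Sampling before training is mainly useful for quick local runs, since the full data set can be slow to process on a single machine.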
