2018-Airplane-Flights - MaRDI portal

2018-Airplane-Flights

From MaRDI portal
Dataset:6036941



OpenML ID: 43849; MaRDI QID: Q6036941

OpenML dataset with id 43849

No author found.

Full work available at URL: https://api.openml.org/data/v1/download/22102674/2018-Airplane-Flights.arff

Upload date: 24 March 2022



Dataset Characteristics

Number of features: 14 (numeric: 11, symbolic: 0, binary in total: 0)
Number of instances: 9,534,417
Number of instances with missing values: 0
Number of missing values: 0

Dataset Description

Story

View the ReadMe file in my GitHub repo for this project. Check out all the info on my portfolio's webpage for this project.

As I write this, I'm a Data Science student. To add to my portfolio, I wanted to build a web app to predict airline flight prices: the user would be able to select an origin and a destination. I found a database from the Bureau of Transportation Statistics. I downloaded their data for Q1, Q2, Q3, and Q4 of 2018, a total of 27M+ rows and 42 columns. For my price-prediction purposes, I eliminated unnecessary columns, renamed some columns, and refined the data for consistency, bringing it to a new total of 9M+ rows and 13 columns.

Have fun and share your kernels, please!

Column Descriptions

1. Unnamed: drop this column (it's a duplicate index column)
2-3. ItinID, MktID: vaguely demonstrate the order in which tickets were ordered (lower IDs being ordered first)
4. MktCoupons: the number of coupons in the market for that flight
5. Quarter: 1, 2, 3, or 4, all of which are in 2018
6. Origin: the city from which the flight departs
7. OriginWac: USA State/Territory World Area Code
8. Dest: the city in which the flight ends
9. DestWac: USA State/Territory World Area Code
10. Miles: the number of miles traveled
11. ContiguousUSA: binary column -- (2) meaning the flight is in the contiguous (48) USA states, and (1) meaning it is not (i.e., Hawaii, Alaska, off-shore territories)
12. NumTicketsOrdered: number of tickets that were purchased by the user
13. Airline Company: the two-letter airline company code that the user used from start to finish (key codes below)
14. PricePerTicket: target prediction column

Airline Company Codes (in order of frequency for this dataset)

WN -- Southwest Airlines Co.
DL -- Delta Air Lines Inc.
AA -- American Airlines Inc.
UA -- United Air Lines Inc.
B6 -- JetBlue Airways
AS -- Alaska Airlines Inc.
NK -- Spirit Air Lines
G4 -- Allegiant Air
F9 -- Frontier Airlines Inc.
HA -- Hawaiian Airlines Inc.
SY -- Sun Country Airlines d/b/a MN Airlines
VX -- Virgin America

USA State/Territory World Area Codes

1 Alaska
2 Hawaii
3 Puerto Rico
4 U.S. Virgin Islands
5 U.S. Pacific Trust Territories and Possessions
11 Connecticut
12 Maine
13 Massachusetts
14 New Hampshire
15 Rhode Island
16 Vermont
21 New Jersey
22 New York
23 Pennsylvania
31 Delaware
32 District of Columbia
33 Florida
34 Georgia
35 Maryland
36 North Carolina
37 South Carolina
38 Virginia
39 West Virginia
41 Illinois
42 Indiana
43 Michigan
44 Ohio
45 Wisconsin
51 Alabama
52 Kentucky
53 Mississippi
54 Tennessee
61 Iowa
62 Kansas
63 Minnesota
64 Missouri
65 Nebraska
66 North Dakota
67 South Dakota
71 Arkansas
72 Louisiana
73 Oklahoma
74 Texas
81 Arizona
82 Colorado
83 Idaho
84 Montana
85 Nevada
86 New Mexico
87 Utah
88 Wyoming
91 California
92 Oregon
93 Washington
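
The column conventions described above (drop the duplicate index column, read ContiguousUSA as 2 = contiguous and 1 = not contiguous, look up the two-letter airline code) could be applied to a single row roughly as follows. This is only a sketch: `clean_record` and `AIRLINE_NAMES` are hypothetical names, the column names `Unnamed: 0` and `AirlineCompany` are assumptions that may not match the attribute names in the ARFF file exactly, and the sample row is made up for illustration rather than taken from the dataset.

```python
# Airline codes as listed in the dataset description.
AIRLINE_NAMES = {
    "WN": "Southwest Airlines Co.",
    "DL": "Delta Air Lines Inc.",
    "AA": "American Airlines Inc.",
    "UA": "United Air Lines Inc.",
    "B6": "JetBlue Airways",
    "AS": "Alaska Airlines Inc.",
    "NK": "Spirit Air Lines",
    "G4": "Allegiant Air",
    "F9": "Frontier Airlines Inc.",
    "HA": "Hawaiian Airlines Inc.",
    "SY": "Sun Country Airlines d/b/a MN Airlines",
    "VX": "Virgin America",
}


def clean_record(record):
    """Return a cleaned copy of one row given as a {column name: value} dict.

    Hypothetical helper; column names are assumed from the description.
    """
    # Drop the duplicate index column (any column whose name starts with "Unnamed").
    cleaned = {k: v for k, v in record.items() if not k.startswith("Unnamed")}
    # ContiguousUSA is coded 2 (within the contiguous 48 states) or 1 (not).
    cleaned["ContiguousUSA"] = int(cleaned["ContiguousUSA"]) == 2
    # Add the full airline name; fall back to the raw code if it is not listed.
    code = cleaned.get("AirlineCompany")
    cleaned["AirlineName"] = AIRLINE_NAMES.get(code, code)
    return cleaned


if __name__ == "__main__":
    # Made-up sample row, for illustration only (not real dataset values).
    sample = {
        "Unnamed: 0": 0,
        "Quarter": 1,
        "ContiguousUSA": 2,
        "AirlineCompany": "WN",
        "PricePerTicket": 100.0,
    }
    print(clean_record(sample))
```

Working on plain dicts keeps the sketch independent of any particular ARFF or data-frame library; the same three steps carry over to whichever loader is used.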




This page was built for dataset: 2018-Airplane-Flights