A dataset containing daily total delays of major U.S. airlines. The raw data was obtained from the U.S. Bureau of Transportation Statistics, and pre-processed as described in Hentschel et al. (2022). Note: The CRAN version of this package contains only data from 2010-2013. The full dataset is available in the GitHub version of this package.
Format
A named list with three entries:
airports: A data.frame, containing information about US airports
delays: A numeric array, containing daily aggregated delays at the airports in the dataset
flightCounts: A numeric array, containing yearly flight numbers between airports in the dataset
Details
flightCounts is a three-dimensional array, containing the number of flights in the dataset
between each pair of airports, aggregated on a yearly basis.
Each entry is the total number of flights between the departure airport (row)
and destination airport (column) in a given year (dimension 3).
This array does not contain any NAs. If an airport did not operate at all
in a given year, the corresponding entries are simply zeros.
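The layout described above can be illustrated with a small stand-in array. This is only a sketch: the airport codes, years, and counts below are made up, and toyCounts is a hypothetical object that mimics the structure of flights$flightCounts rather than the real data.

```r
# Toy stand-in for flights$flightCounts (hypothetical codes and values)
airportCodes <- c("ATL", "JFK", "ORD")
years <- c("2010", "2011")
toyCounts <- array(
  0,
  dim = c(3, 3, 2),
  dimnames = list(airportCodes, airportCodes, years)
)
toyCounts["ATL", "JFK", "2010"] <- 120
toyCounts["ATL", "JFK", "2011"] <- 150
toyCounts["JFK", "ORD", "2011"] <- 80

# Flights from ATL (row) to JFK (column), one value per year (dimension 3)
atlToJfk <- toyCounts["ATL", "JFK", ]

# Total departures per airport and year: sum over the destination columns
departuresPerYear <- apply(toyCounts, c(1, 3), sum)

# Airports without any flights in a given year show up as all zeros, not NAs
inactive <- apply(toyCounts, 3, function(m) rowSums(m) + colSums(m) == 0)
```

With the real dataset, the same indexing should apply to flights$flightCounts, provided its dimnames hold the airport codes and years as in the Examples section.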
delays is a three-dimensional array containing daily total positive delays,
in minutes, of incoming and outgoing flights respectively.
Each column corresponds to an airport in the dataset and each row corresponds
to a day. The third dimension has length two, 'arrivals' containing delays of
incoming flights and 'departures' containing delays of outgoing flights.
Zeros indicate that there were flights arriving/departing at that airport
on a given day, but none of them had delays. NAs indicate that there were
no flights arriving/departing at that airport on that day at all.
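The distinction between zeros and NAs matters when summarising delays. The following sketch uses a made-up array, toyDelays, with hypothetical dates, airports, and values; it only assumes the day-by-airport-by-direction layout described above.

```r
# Toy stand-in for flights$delays (hypothetical dates, codes and values)
days <- c("2010-01-01", "2010-01-02", "2010-01-03")
airportCodes <- c("ATL", "JFK")
toyDelays <- array(
  c(35, 0, NA,  10, NA, 5,   # arrivals: ATL column, then JFK column
    20, 0, NA,   0, NA, 15), # departures: ATL column, then JFK column
  dim = c(3, 2, 2),
  dimnames = list(days, airportCodes, c("arrivals", "departures"))
)

arrivals <- toyDelays[, , "arrivals"]

# Zero: flights arrived, none delayed. NA: no arriving flights that day.
noDelay <- !is.na(arrivals) & arrivals == 0
noFlights <- is.na(arrivals)

# Mean daily arrival delay per airport, skipping days without flights
meanArrivalDelay <- colMeans(arrivals, na.rm = TRUE)

# Days on which every airport has both arrivals and departures recorded
completeDays <- apply(toyDelays, 1, function(x) !anyNA(x))
```

Dropping NAs with na.rm = TRUE treats days without flights as missing rather than as days with zero delay, which matches the meaning of the two values given above.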
airports is a data frame containing the following information about a number of US airports.
Some entries are missing, which is indicated by NAs.
IATA: 3-letter IATA code
Name: name of the airport
City: main city served by the airport
Country: country or territory where the airport is located (mostly "United States")
ICAO: 4-letter ICAO code
Latitude: latitude of the airport, in decimal degrees
Longitude: longitude of the airport, in decimal degrees
Altitude: altitude of the airport, in feet
Timezone: timezone of the airport, in hours offset from UTC
DST: Daylight savings time used at the airport. 'A'=US/Canada, 'N'=None.
Timezone2: name of the timezone of the airport
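As a brief illustration of working with these columns, the sketch below builds a small hypothetical data frame, toyAirports, with a subset of the columns listed above. The rows are invented for illustration ("XXX" is not a real airport) and are not taken from flights$airports.

```r
# Toy stand-in for flights$airports with a few of its columns (made-up rows)
toyAirports <- data.frame(
  IATA = c("ATL", "JFK", "XXX"),
  Name = c("Airport A", "Airport B", NA),
  Altitude = c(1026, 13, NA),  # feet
  Timezone = c(-5, -5, NA),    # hours offset from UTC
  stringsAsFactors = FALSE
)

# Look up a single airport by its IATA code
atl <- toyAirports[toyAirports$IATA == "ATL", ]

# Convert altitude from feet to metres (1 ft = 0.3048 m)
toyAirports$AltitudeM <- toyAirports$Altitude * 0.3048

# IATA codes of airports with at least one missing entry
incomplete <- toyAirports$IATA[!complete.cases(toyAirports)]
```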
References
Hentschel M, Engelke S, Segers J (2022). “Statistical Inference for Hüsler-Reiss Graphical Models Through Matrix Completions.” doi:10.48550/ARXIV.2210.14292, https://arxiv.org/abs/2210.14292.
See also
Other flight data related topics:
flightCountMatrixToConnectionList(),
getFlightDelayData(),
getFlightGraph(),
plotFlights()
Other datasets:
danube
Examples
# Get total number of flights in the dataset:
totalFlightCounts <- apply(flights$flightCounts, c(1,2), sum)
# Get number of flights for specific years in the dataset:
flightCounts_10_11 <- apply(flights$flightCounts[,,c('2010', '2011')], c(1,2), sum)
# Get list of connections from 2010:
connections_10 <- flightCountMatrixToConnectionList(flights$flightCounts[,,'2010'])