Housed at the Dartmouth Flood Observatory (DFO) and maintained by the University of Colorado, the Global Active Archive of Large Flood Events is a comprehensive repository that records significant flood events from 1985 to the present day. This dynamic and “active” archive continually incorporates data from contemporary flood events, ensuring the information remains current and relevant.
Key Features:
Diverse Data Sources: The information within the archive is meticulously collated from a wide array of sources, including news reports, governmental agencies, instrumental measurements, and satellite remote sensing.
Granular Event Details: Each recorded flood event within the archive is distinct, and its associated details are presented in an “area affected” map. Despite the discrete nature of these records, it’s acknowledged that certain regions experience recurrent flooding, which may necessitate a nuanced approach in classifying such events.
Comprehensive Coverage: The archive boasts a global purview, capturing the magnitude and impact of significant flood events worldwide. It’s noteworthy that while death and displacement estimates from tropical storms encompass all causative factors, storms devoid of notable river flooding aren’t incorporated.
Accessible Formats: The archive is versatile in its presentation:
An online table enumerates recent flood events.
Detailed records from 1985 onwards are available for download in Excel (.xlsx) and XML (.xml) formats. These are periodically updated.
GIS users can download zip-compressed MapInfo and shapefile (.shp) format files that include flood catalog numbers, centroids, and other salient attribute details.
Satellite Imaging: A considerable number of flood events have been vividly captured through satellite imaging. These have been further processed by the DFO to generate detailed maps delineating inundation extents. An index of these maps is accessible via the DFO Flood Maps Index.
For those interested in understanding the methodology behind the creation of these invaluable maps and tables, additional notes and insights are available within the archive.
The Global Active Archive, given its academic and research-oriented nature, is generously made available at no charge for scholarly pursuits and educational endeavors.
Sample Citation: G.R. Brakenridge. Global Active Archive of Large Flood Events. Dartmouth Flood Observatory, University of Colorado, USA.
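If flood_point_data has not already been created elsewhere in the workflow, the archive's GIS files can be read directly into R with the sf package. The sketch below is illustrative only: the file name "FloodArchive.shp" is a placeholder rather than the confirmed name inside the DFO download, and it assumes the zip archive has already been downloaded and extracted.
library(sf)
# Read the archive's shapefile into an sf object.
# "FloodArchive.shp" is a placeholder; substitute the actual file name
# from the extracted zip archive downloaded from the DFO site.
flood_point_data <- st_read("FloodArchive.shp")
# Quick look at the attribute table (catalog numbers, centroids, etc.)
head(flood_point_data)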
The code chunk below is written in R and loads several packages whose functions extend R's capabilities for the tasks that follow. Specifically:
glue: Provides tools to format R strings with embedded code.
sf: Used for handling and analyzing vector spatial data.
terra: Offers tools for spatial (raster and vector) data handling and analysis.
ggplot2: A popular data visualization package in R.
dplyr: Part of the tidyverse, used for data manipulation.
ggthemes: Adds extra themes, geoms, and scales to ggplot2.
forecast: Used for time series forecasting.
lubridate: Simplifies parsing and manipulating dates and times.
When executed, this code will make the functions and tools from these libraries available for use in the R session.
library(glue)
library(sf)
Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE
library(terra)
terra 1.7.39
Attaching package: 'terra'
The following object is masked from 'package:glue':
trim
library(ggplot2)
library(dplyr)
Attaching package: 'dplyr'
The following objects are masked from 'package:terra':
intersect, union
The following objects are masked from 'package:stats':
filter, lag
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
library(ggthemes)
library(forecast)
Registered S3 method overwritten by 'quantmod':
method from
as.zoo.data.frame zoo
library(lubridate)
Attaching package: 'lubridate'
The following objects are masked from 'package:terra':
intersect, union
The following objects are masked from 'package:base':
date, intersect, setdiff, union
ggplot(data=flood_point_data, aes(color=Severity)): This initializes a ggplot object using the flood_point_data dataframe. It also sets the aesthetic color based on the ‘Severity’ column. This means that different severities will be represented by different colors.
geom_sf(size=0.01): Adds spatial data points to the plot. The size=0.01 argument reduces the size of these points, making them small dots on the plot. The original code used cex=0.01, which isn’t a valid argument for geom_sf in ggplot2. Instead, the correct argument to control the size of the points is size.
The resulting plot would visualize the spatial distribution of flood points, with colors indicating the severity of each flood. Adjusting the size parameter can make the dots larger or smaller based on preference.
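Putting the pieces above together, a minimal sketch of the corrected map code might look like this (assuming flood_point_data is an sf object with a Severity column, as described above):
library(ggplot2)
library(sf)
# Map of flood event locations, colored by reported severity;
# size = 0.01 keeps the points small so dense regions remain legible
ggplot(data = flood_point_data, aes(color = Severity)) +
  geom_sf(size = 0.01)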
ggplot(data=flood_point_data): Initializes a ggplot object using the flood_point_data dataframe.
geom_point(aes(x=Began, y=log_dead), color="grey90", alpha=0.8): Adds scatter plot points to the ggplot object, with the x-axis representing the date the flood began (Began) and the y-axis representing the logarithm of the death toll (log_dead). The points are colored light gray (grey90) with a transparency of 0.8, making them slightly see-through.
geom_density_2d(aes(x=Began, y=log_dead), color="cornflowerblue"): This adds a 2D density estimation to the plot. The density estimation essentially provides a visual representation of where most of the data points cluster, helping identify patterns in the scatter plot. The density lines are colored cornflower blue.
theme_tufte(): Applies the Tufte theme to the plot, which is a minimalist theme based on the design principles of Edward Tufte, a pioneer in the field of data visualization. This theme helps reduce non-data ink, thereby enhancing the clarity of the visualization.
The resulting plot would visually represent the relationship between the date floods started and the logarithm of the associated death toll, with additional visual cues indicating areas of high density (many events with similar characteristics).
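Assembled as a single chunk, the plot described above might be written as follows (assuming Began and log_dead are columns of flood_point_data):
library(ggplot2)
library(ggthemes)
# Scatter of flood start dates against log death toll, with 2D density
# contours highlighting where events cluster
ggplot(data = flood_point_data) +
  geom_point(aes(x = Began, y = log_dead), color = "grey90", alpha = 0.8) +
  geom_density_2d(aes(x = Began, y = log_dead), color = "cornflowerblue") +
  theme_tufte()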
non_spatial_version_data <- as.data.frame(flood_point_data): The spatial data flood_point_data is converted to a regular non-spatial dataframe, and stored in the variable non_spatial_version_data.
first_date <- min(non_spatial_version_data[,1]): It’s assumed that the first column of non_spatial_version_data contains the date data. The minimum date (earliest date) in this column is identified and stored in the variable first_date.
last_date <- max(non_spatial_version_data[,1]): Similarly, the maximum date (most recent date) in the first column is identified and stored in the variable last_date.
This code effectively extracts the range of dates present in the flood_point_data. However, a more robust approach might be to use column names rather than column indices for clarity and to avoid potential errors if the structure of the dataframe changes in the future.
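A short sketch of this step, including the more robust name-based alternative (the column name Began is assumed here, matching the flood start dates used earlier):
# Convert the sf object to a plain data frame
non_spatial_version_data <- as.data.frame(flood_point_data)
# Date range using the column index, as in the original code
first_date <- min(non_spatial_version_data[, 1])
last_date  <- max(non_spatial_version_data[, 1])
# More robust alternative: reference the column by name
# first_date <- min(non_spatial_version_data$Began)
# last_date  <- max(non_spatial_version_data$Began)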
death_series <- ts(st_drop_geometry(flood_point_data[, 9]), frequency = 365, start = c(1985, 1), end = c(2021, 300))
head(death_series)
Time Series:
Start = c(1985, 1)
End = c(1985, 6)
Frequency = 365
[1] 26 229 43 21 19 2
st_drop_geometry(flood_point_data[,9]): Before creating the time series, the st_drop_geometry function from the sf package is used to drop the spatial information from the data. The 9th column, which presumably contains the death counts, is then selected. However, it would be more robust to reference the column by name instead of its index to ensure the correct data is selected.
ts(…, frequency = 365, start = c(1985, 1), end = c(2021, 300)): This initializes a time series (ts) object. The frequency argument is set to 365, indicating daily observations. The start argument specifies that the series begins on the first day of 1985, and end = c(2021, 300) denotes the 300th day of 2021; this is a valid specification, since the within-year index (300) does not exceed the stated frequency (365). The larger caveat is that the flood records are irregularly spaced events rather than true daily measurements, so forcing them into a fixed daily calendar indexes the observations sequentially rather than aligning each value with its actual date.
death_series: Displays the created time series object.
death_series %>% decompose(): The decompose() function from the base R stats package breaks a time series into its component parts: trend, seasonal, and random (also called “remainder” or “error”). This allows for a clearer understanding of the underlying patterns in the data.
autoplot(): This function from the forecast package provides automatic ggplot2-based visualization of time series objects. When used on the output of decompose(), it generates a multipanel plot showing the observed, trend, seasonal, and remainder components of the time series.
theme_tufte(): This adds the Tufte theme (from the ggthemes package) to the plot. As previously mentioned, Edward Tufte’s design principles prioritize clarity and minimalism, and this theme helps enhance the readability of the plot by reducing non-data ink.
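The full pipeline described above, as a short sketch (death_series is the time series created earlier; the pipe operator is available because dplyr is loaded):
library(forecast)   # autoplot() methods for time series objects
library(ggthemes)   # theme_tufte()
# Classical decomposition into trend, seasonal, and remainder components,
# displayed as a multipanel ggplot with a minimalist theme
death_series %>%
  decompose() %>%
  autoplot() +
  theme_tufte()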
death_series %>% stl(s.window = "periodic"): The stl() function decomposes the time series using LOESS. It's a more flexible method compared to the traditional decompose() function, especially when dealing with non-linear trends or changing seasonal patterns. The s.window = "periodic" argument indicates that the seasonal component should be treated as periodic, implying a stable seasonal pattern throughout the series.
autoplot(): Again, this function from the forecast package provides an automatic ggplot2-based visualization of time series objects. When applied to the output of stl(), it creates a multipanel plot displaying the observed series, trend component, seasonal component, and remainder (sometimes termed “noise” or “error”).
theme_tufte(): This overlays the Tufte theme on the plot, providing a minimalist design that emphasizes the data.
In essence, this code decomposes the death_series time series using the STL method and visualizes the observed series, its trend, its seasonal patterns, and the remainder, all presented in a clear and minimalist style courtesy of theme_tufte(). This visualization can be immensely valuable in understanding underlying patterns and variations within time series data.
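The corresponding STL pipeline, again as a sketch built on the death_series object:
library(forecast)
library(ggthemes)
# LOESS-based (STL) decomposition with a periodic seasonal window,
# visualized with autoplot() and the Tufte theme
death_series %>%
  stl(s.window = "periodic") %>%
  autoplot() +
  theme_tufte()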