Downloading DWD 1 Minute Data
The DWD (Deutscher Wetterdienst, the German Weather Service) provides precipitation data with 1-minute resolution at the following URL.
To get information about the parameters in the dataset, download the metadata. Inside this zip archive you will find a file called Metadaten_Parameter_... .html with the relevant information.
In this tutorial we will use the following packages:
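Judging from the functions used in the rest of this tutorial, a plausible set of packages is rvest (scraping), stringr (str_subset) and readr (read_delim). This selection is an assumption inferred from the text, not a definitive list:

```r
# Assumed package set, inferred from the functions used later on
library(rvest)    # read_html(), html_elements(), html_attr()
library(stringr)  # str_subset()
library(readr)    # read_delim()
```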
To download the data we first set the base URL:
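A minimal sketch of this step. The address below is an assumption (the historical 1-minute precipitation directory on the DWD open data server) and should be checked against the link given above. The trailing slash matters, because later steps append folder names with paste0:

```r
# Assumed address of the historical 1-minute precipitation directory;
# verify it against the link given at the top of this tutorial.
baseurl <- "https://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/1_minute/precipitation/historical/"
```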
Get folder structure
Files are stored in folders named after the year. To scrape the available years we use the rvest package:
This queries all the a elements (links) of the page and extracts the href attribute.
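The scraping step can be sketched offline as follows. The small HTML snippet is a hypothetical stand-in for the directory listing. For the real page, one would presumably replace minimal_html(...) with read_html() called on the base URL:

```r
library(rvest)

# Tiny stand-in for the directory listing (hypothetical content)
page <- minimal_html('
  <a href="../">../</a>
  <a href="1993/">1993/</a>
  <a href="1994/">1994/</a>
')

# Select every <a> element and pull out its href attribute
links <- html_attr(html_elements(page, "a"), "href")
print(links)
```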
We get a vector with the following content:
Since the first element is a link to the parent folder, we exclude it:
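In R, a negative index drops the element at that position. A short sketch, using a made-up vector in place of the scraped links:

```r
# Hypothetical result of the scraping step
links <- c("../", "1993/", "1994/", "1995/")

# Negative indexing drops the first element (the parent-folder link)
years <- links[-1]
print(years)
```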
To extract the month folders we use a similar approach. Let’s explain it first without a loop:
With paste0(baseurl, year) we add the year to the URL. With str_subset we extract only those filenames that correspond to the station that we want to download.
The station id can be taken from this file.
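The two calls can be illustrated without network access. Everything below is a sketch under assumptions: the base URL is a placeholder, the station id is hypothetical, and the filenames only imitate the naming pattern assumed for the server:

```r
library(stringr)

baseurl <- "https://example.org/historical/"  # placeholder, not the real URL
year <- "1993/"
station_id <- "00003"                         # hypothetical station id

# URL of the year folder
year_url <- paste0(baseurl, year)

# Hypothetical links found in that folder
files <- c(
  "../",
  "1minutenwerte_nieder_00003_19930428_19930430_hist.zip",
  "1minutenwerte_nieder_00003_19930501_19930531_hist.zip",
  "1minutenwerte_nieder_00044_19930501_19930531_hist.zip"
)

# Keep only the files whose name contains the station id
station_files <- str_subset(files, paste0("_", station_id, "_"))

# Full download URLs for the selected files
file_urls <- paste0(year_url, station_files)
print(file_urls)
```

Wrapping the id in underscores reduces the risk of matching the same digits inside a date.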
Downloading all data
To start downloading data we first have to create a download folder:
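A minimal sketch with base R. The folder name is hypothetical, and a temporary directory is used here so the example leaves nothing behind:

```r
# Hypothetical folder name inside the session's temporary directory
download_dir <- file.path(tempdir(), "dwd_download")

# Create the folder only if it does not exist yet
if (!dir.exists(download_dir)) {
  dir.create(download_dir, recursive = TRUE)
}
```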
With a nested loop we download the data for every month of every year.
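One possible shape of such a nested loop is sketched below. The function names list_links and download_station are hypothetical helpers introduced for illustration. The sketch assumes that each year folder contains one zip file per station and month, and that the first link of the top-level page points to the parent folder:

```r
library(rvest)
library(stringr)

# Hypothetical helper: href targets of all links on a directory page
list_links <- function(url) {
  html_attr(html_elements(read_html(url), "a"), "href")
}

# Hypothetical helper: download every zip file of one station, year by year
download_station <- function(baseurl, station_id, download_dir = "download") {
  dir.create(download_dir, showWarnings = FALSE, recursive = TRUE)
  years <- list_links(baseurl)[-1]          # drop the parent-folder link
  for (year in years) {                     # outer loop: year folders
    files <- str_subset(list_links(paste0(baseurl, year)), station_id)
    for (file in files) {                   # inner loop: monthly zip files
      dest <- file.path(download_dir, basename(file))
      if (!file.exists(dest)) {             # skip files already downloaded
        download.file(paste0(baseurl, year, file), dest, mode = "wb")
      }
    }
  }
  invisible(list.files(download_dir, full.names = TRUE))
}
```

A call would look like download_station(baseurl, "00003"), with the station id replaced by the one you need. Using mode = "wb" keeps the zip files intact on Windows.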
Read data
To read the downloaded data we use the readr package. The function read_delim offers two features that come in handy here:
- read several files at once and combine the data into a data frame.
- read directly from a zip file without the need to extract the files first.
The options we use here are:
- delim: set the delimiter to ;
- na: set the value -999 to be NA
- trim_ws: trim whitespace around the data values
- col_types: define column types as double (d) or character (c)
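The options above can be tried on two small stand-in files. The column names and values below are made up to imitate the assumed DWD layout, and plain text files are used instead of zip files to keep the sketch self-contained. Passing a vector of files requires readr 2.0 or later:

```r
library(readr)

# Two small stand-in files in the assumed DWD layout (made-up values)
header <- "STATIONS_ID;MESS_DATUM_BEGINN;MESS_DATUM_ENDE;QN;RS_01"
f1 <- tempfile(fileext = ".txt")
f2 <- tempfile(fileext = ".txt")
writeLines(c(header, "    3;199304280000;199304280000;    1;   0.10"), f1)
writeLines(c(header, "    3;199305010000;199305010000;    1;-999"), f2)

# Read both files in one call; rows are combined into one data frame
data <- read_delim(
  c(f1, f2),
  delim = ";",          # fields are separated by semicolons
  na = "-999",          # -999 marks a missing value
  trim_ws = TRUE,       # strip the padding around values
  col_types = "dccdd"   # d = double, c = character (datetime columns)
)
print(data)
```

For the downloaded data, the file vector would presumably come from list.files() on the download folder, with full.names = TRUE.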
We use the character column type to read the datetime columns, since the datetime column type (T) fails in this case. We then use as.POSIXct() with a custom format string to convert the datetime columns without errors.
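A sketch of the conversion, assuming the timestamps are stored as YYYYMMDDHHMM strings. The time zone is also an assumption and should be checked against the metadata:

```r
# Hypothetical timestamps in the assumed YYYYMMDDHHMM layout
stamps <- c("199304280000", "199304280001")

# Custom format string: year, month, day, hour, minute without separators
times <- as.POSIXct(stamps, format = "%Y%m%d%H%M", tz = "UTC")
print(times)
```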
Save data
We can save the data with the following command:
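One option, assuming an RDS file is acceptable, is saveRDS(), which stores a single R object including its column types. The data frame and the file name below are stand-ins for illustration:

```r
# Stand-in for the data frame built in the previous steps
data <- data.frame(station = 3, rain = c(0.1, NA))

# Hypothetical file name inside the session's temporary directory
out_file <- file.path(tempdir(), "dwd_precipitation.rds")

saveRDS(data, out_file)         # write the object to disk
restored <- readRDS(out_file)   # read it back, e.g. in another project
```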
This way we can load it directly into our next data analysis project.