Thursday, 21 February 2019

Shiny App to access NOAA data

Now that the US Government shutdown is over, it is time to download NOAA weather daily summaries in bulk and store them somewhere safe so that at the next shutdown we do not need to worry.

Below is the code to download data for a series of years:

NOAA_BulkDownload <- function(Year, Dir){
  URL <- paste0("",Year,"/gsod_",Year,".tar")
  download.file(URL, destfile=paste0(Dir,"/gsod_",Year,".tar"),
  if(dir.exists(paste0(Dir,"/NOAA Data"))==FALSE){dir.create(paste0(Dir,"/NOAA Data"))}
        exdir=paste0(Dir,"/NOAA Data"))

An example on how to use this function is below:

Date <- 1980:2019
lapply(Date, NOAA_BulkDownload, Dir="C:/Users/fabio.veronesi/Desktop/New folder")

Theoretically, the process can be parallelized using parLappy, but I have not tested it.

Once we have all the file in one folder we can create the Shiny app to query these data.
The app will have a dashboard look with two tabs: one with a Leaflet map showing the location of the weather stations (markers are shown only at a certain zoom level to decrease loading time and RAM usage), below:

The other tab will allow the creation of the time-series (each file represents only 1 year, so we need to bind several files together to get the full period we are interested in) and it will also do some basic data cleaning, e.g. turn T from F to C, or snow depth from inches to mm. Finally, from this tab users can view the final product and download a cleaned csv.

The code for ui and server scripts is on my GitHub:

Saturday, 16 February 2019

Weather Forecast from MET Office

This is another function I wrote to access the MET office API and obtain a 5-day ahead weather forecast:

METDataDownload <- function(stationID, product, key){
  library("RJSONIO") #Load Library
  connectStr <- paste0("",stationID,"?res=",product,"&key=",key)
  con <- url(connectStr)
  data.json <- fromJSON(paste(readLines(con), collapse=""))
  LocID <- data.json$SiteRep$DV$Location$`i`
  LocName <- data.json$SiteRep$DV$Location$name
  Country <- data.json$SiteRep$DV$Location$country
  Lat <- data.json$SiteRep$DV$Location$lat
  Lon <- data.json$SiteRep$DV$Location$lon
  Elev <- data.json$SiteRep$DV$Location$elevation
  Details <- data.frame(LocationID = LocID,
                        LocationName = LocName,
                        Country = Country,
                        Lon = Lon,
                        Lat = Lat,
                        Elevation = Elev)
  param <-"rbind",data.json$SiteRep$Wx$Param)
  if(product == "daily"){
    dates <- unlist(lapply(data.json$SiteRep$DV$Location$Period, function(x){x$value}))
    DayForecast <-"rbind", lapply(data.json$SiteRep$DV$Location$Period, function(x){x$Rep[[1]]}))
    NightForecast <-"rbind", lapply(data.json$SiteRep$DV$Location$Period, function(x){x$Rep[[2]]}))
    colnames(DayForecast)[ncol(DayForecast)] <- "Type"
    colnames(NightForecast)[ncol(NightForecast)] <- "Type"
    ForecastDF <- plyr::rbind.fill.matrix(DayForecast, NightForecast) %>%
      as_tibble() %>%
      mutate(Date = as.Date(rep(dates, 2))) %>%
      mutate(Gn = as.numeric(Gn),
             Hn = as.numeric(Hn),
             PPd = as.numeric(PPd),
             S = as.numeric(S),
             Dm = as.numeric(Dm),
             FDm = as.numeric(FDm),
             W = as.numeric(W),
             U = as.numeric(U),
             Gm = as.numeric(Gm),
             Hm = as.numeric(Hm),
             PPn = as.numeric(PPn),
             Nm = as.numeric(Nm),
             FNm = as.numeric(FNm))
  } else {
    dates <- unlist(lapply(data.json$SiteRep$DV$Location$Period, function(x){x$value}))
    Forecast <-"rbind", lapply(lapply(data.json$SiteRep$DV$Location$Period, function(x){x$Rep}), function(x){"rbind",x)}))
    colnames(Forecast)[ncol(Forecast)] <- "Hour"
    DateTimes <- seq(ymd_hms(paste0(as.Date(dates[1])," 00:00:00")),ymd_hms(paste0(as.Date(dates[length(dates)])," 21:00:00")), "3 hours")
      extra_lines <- length(DateTimes)-nrow(Forecast)
      for(i in 1:extra_lines){
        Forecast <- rbind(rep("0", ncol(Forecast)), Forecast)
    ForecastDF <- Forecast %>%
      as_tibble() %>%
      mutate(Hour = DateTimes) %>%
      filter(D != "0") %>%
      mutate(F = as.numeric(F),
             G = as.numeric(G),
             H = as.numeric(H),
             Pp = as.numeric(Pp),
             S = as.numeric(S),
             T = as.numeric(T),
             U = as.numeric(U),
             W = as.numeric(W))
  list(Details, param, ForecastDF)

The API key can be obtained for free at this link:

Once we have an API key we can simply insert the station ID and the type of product we want to obtain the forecast. We can select between two products: daily and 3hourly

To obtain the station ID we need to use another query and download an XML with all stations names and ID:


url = paste0("",key)
XML_StationList <- read_xml(url)

write_xml(XML_StationList, "StationList.xml")

This will save an XML, which we can then open with a txt editor (e.g. Notepad++).

The function can be used as follows:

METDataDownload(stationID=3081, product="daily", key)

It will return a list with 3 elements:

  1. Station info: Name, ID, Lon, Lat, Elevation
  2. Parameter explanation
  3. Weather forecast: tibble format
I have not tested it much, so if you find any bug you are welcome to tweak it on GitHub:

Geocoding function

This is a very simple function to perform geocoding using the Google Maps API:

getGeoCode <- function(gcStr, key)  {
  library("RJSONIO") #Load Library
  gcStr <- gsub(' ','%20',gcStr) #Encode URL Parameters
  #Open Connection
  connectStr <- paste0('',gcStr, "&key=",key) 
  con <- url(connectStr)
  data.json <- fromJSON(paste(readLines(con), collapse=""))
  #Flatten the received JSON
  data.json <- unlist(data.json)
  if(data.json["status"]=="OK")   {
    lat <- data.json[""]
    lng <- data.json["results.geometry.location.lng"]
    gcodes <- c(lat, lng)
    names(gcodes) <- c("Lat", "Lng")
    return (gcodes)

Essentially, users need to get an API key from google and then use as an input (string) for the function. The function itself is very simple, and it is an adaptation of some code I found on-line (unfortunately I did not write down where I found the original version so I do not have a way to reference the source, sorry!!).

geoCodes <- getGeoCode(gcStr="11 via del piano, empoli", key)

To use the function we simply need to include an address, and it will return its coordinates in WGS84.
It can be used in a mutate call within dplyr and it is reasonably fast.

The repository is here: