Sunday, 11 March 2018

Street Crime UK - Shiny App

Introduction

This is a shiny app to visualize heat maps of Street Crimes across Britain from 2010-12 to 2018-01 and test their spatial pattern.
The code for both ui.R and server.R is available from my GitHub at: https://github.com/fveronesi/StreetCrimeUK_Shiny

Usage

Please be aware that this app downloads data from my personal Dropbox once it starts and every time the user changes some of the settings. This was the only work-around I could think of to use external data in shinyapps.io for free. However, this also makes the app a bit slow, so please be patient.
Users can select a date with two sliders (I personally do not like the dateInput tool), then a crime type, and click Draw Map to update the map with new data. I also included an option to plot the Ripley K-function (function Kest in package spatstat) and the p-value of the quadrat.test (again from spatstat). Both tools work using the data shown within the screen area, so their results change as users interact with the map. The Ripley K function plot shows a red dashed line with the theoretical K function expected for points that are randomly distributed in space (i.e. follow a Poisson process). The black line is the one computed from the points shown on screen. If the black line is above the red line, the observations shown on the map are clustered, while if it is below the red line the crimes are scattered regularly in space. A more complete overview of the Ripley K function is available at this link from ESRI.
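To make the comparison between the two lines more concrete, here is a minimal base R sketch of a K-function estimate. This is not what the app runs: the app uses Kest from spatstat, which applies edge corrections, while the hypothetical helper naive_k below applies none and is therefore biased low near the borders of the window. The simulated points, the unit-square window and all variable names are assumptions made only for illustration.

```r
# Naive estimate of Ripley's K for points in a window of known area.
# No edge correction is applied (spatstat's Kest does apply one).
naive_k <- function(x, y, r, area) {
  n <- length(x)
  d <- as.matrix(dist(cbind(x, y)))  # pairwise distances between points
  diag(d) <- Inf                     # ignore each point's distance to itself
  lambda <- n / area                 # intensity: points per unit area
  # For each radius, count ordered pairs closer than r and rescale
  sapply(r, function(ri) sum(d <= ri) / (n * lambda))
}

set.seed(1)

# Hypothetical data in the unit square: 200 random points...
random_x <- runif(200)
random_y <- runif(200)

# ...and 200 points clustered tightly around 5 centres
centre  <- sample(5, 200, replace = TRUE)
cx <- c(0.2, 0.5, 0.8, 0.3, 0.7)
cy <- c(0.2, 0.8, 0.3, 0.6, 0.5)
clust_x <- cx[centre] + rnorm(200, sd = 0.02)
clust_y <- cy[centre] + rnorm(200, sd = 0.02)

r        <- seq(0.01, 0.1, by = 0.01)
k_theo   <- pi * r^2                              # K under complete randomness
k_random <- naive_k(random_x, random_y, r, area = 1)
k_clust  <- naive_k(clust_x, clust_y, r, area = 1)
```

Plotting k_clust against r together with k_theo should show the estimated curve well above the theoretical one, which is the pattern the app reports as clustering.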
The p-value from the quadrat test refers to a null hypothesis that the crimes are scattered randomly in space, against an alternative that they are clustered. If the p-value is below 0.05 (significance level of 5%) we can reject the null hypothesis in favour of the alternative that our data are clustered. Please be aware that this test does not account for regularly spaced crimes.
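The idea behind the test can be sketched in base R as follows. Again, this is only an illustration under stated assumptions and not the code of the app, which calls quadrat.test from spatstat. The helpers filter_to_bounds and quadrat_p_clustered are hypothetical names, the bounds list imitates the north/south/east/west list that leaflet exposes as input$MAPID_bounds, the points are simulated, and the grid cells are treated as equal in area, which is only reasonable for small windows in longitude/latitude.

```r
# Keep only the points inside the visible map window.
# `bounds` is a list with elements north, south, east, west.
filter_to_bounds <- function(df, bounds) {
  keep <- which(df$LAT >= bounds$south & df$LAT <= bounds$north &
                df$LON >= bounds$west  & df$LON <= bounds$east)
  df[keep, , drop = FALSE]   # which() also drops rows with missing coordinates
}

# Chi-squared quadrat test on an nx-by-ny grid over the window,
# with a one-sided "clustered" alternative (upper tail of the statistic).
quadrat_p_clustered <- function(df, bounds, nx = 5, ny = 5) {
  pts <- filter_to_bounds(df, bounds)
  cell_col <- cut(pts$LON, seq(bounds$west, bounds$east, length.out = nx + 1),
                  include.lowest = TRUE)
  cell_row <- cut(pts$LAT, seq(bounds$south, bounds$north, length.out = ny + 1),
                  include.lowest = TRUE)
  counts   <- as.vector(table(cell_row, cell_col))  # observed count per cell
  expected <- sum(counts) / (nx * ny)               # equal-area cells assumed
  x2 <- sum((counts - expected)^2 / expected)
  pchisq(x2, df = nx * ny - 1, lower.tail = FALSE)
}

set.seed(42)
bounds <- list(north = 51.6, south = 51.4, east = 0.1, west = -0.3)  # hypothetical

# Simulated points: uniformly random vs. concentrated around 3 hot spots
random_pts <- data.frame(LAT = runif(500, 51.4, 51.6),
                         LON = runif(500, -0.3, 0.1))
hot <- sample(3, 500, replace = TRUE)
clustered_pts <- data.frame(
  LAT = c(51.45, 51.50, 51.55)[hot] + rnorm(500, sd = 0.005),
  LON = c(-0.20, -0.10, 0.00)[hot] + rnorm(500, sd = 0.005))

p_random    <- quadrat_p_clustered(random_pts, bounds)
p_clustered <- quadrat_p_clustered(clustered_pts, bounds)
```

Because only the upper tail of the statistic is used, a very regular pattern (counts almost identical in every cell) gives a p-value close to 1 rather than a small one, which is why the test says nothing about regularly spaced crimes.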

NOTE

Please note that the code here is not reproducible straight away. The app communicates with my Dropbox through the package rdrop2, which requires a token to download data from Dropbox. More info at github.com/karthik/rdrop2.
I am sharing the code so that it can potentially be used with a token downloaded from elsewhere, but the url that points to my Dropbox will clearly not be shared.
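For readers who want to plug in their own Dropbox, the token workflow could look roughly like the sketch below. It assumes the rdrop2 interface as described in the package README (drop_auth to create a token, drop_download with a dtoken argument), so please check it against the version you install. The file name droptoken.rds, the remote path and the helper download_from_dropbox are hypothetical placeholders, not names taken from my app.

```r
# One-off, interactive step on your own machine (needs a browser):
#   token <- rdrop2::drop_auth()
#   saveRDS(token, "droptoken.rds")

# Download one file from Dropbox using a previously saved token.
# Returns the local path of the downloaded file.
download_from_dropbox <- function(remote_path,
                                  token_file = "droptoken.rds",
                                  local_path = file.path(tempdir(),
                                                         basename(remote_path))) {
  if (!file.exists(token_file)) {
    stop("Token file not found: ", token_file)
  }
  token <- readRDS(token_file)
  # Assumed rdrop2 call; only evaluated when the function is actually run
  rdrop2::drop_download(remote_path, local_path = local_path,
                        overwrite = TRUE, dtoken = token)
  local_path
}
```

Checking for the token file first gives a clear error message on a fresh machine, instead of a failure somewhere inside the Dropbox request.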

Preparing the dataset

CSV files with crime data can be downloaded directly from the data.police.uk website. Please check the dates carefully, since each of these files contains more than one year of monthly data. The main issue with these data is that they are divided by local police forces, so for example we will have a csv for each month from the Bedfordshire Police, which only covers that part of the country. Moreover, these csv files contain a lot of data, not only coordinates; they also contain the type of crime, plus other details, which we do not need and which make the full collection a couple of GB in size.
For these reasons I did some pre-processing. First of all I extracted all csv files into a folder named "CrimesUK" and then I ran the code below:
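# List every "street" csv under the extraction folder, including sub-directories
lista = list.files("E:/CrimesUK", pattern="street", recursive=TRUE,
                   include.dirs=TRUE, full.names=TRUE, ignore.case=TRUE)

for(i in lista){
  DF = read.csv(i)

  # Append coordinates and crime type to the csv for this file's year and month
  # (DF$Month looks like "2017-03", which gives CrimesUK2017_03.csv).
  # Re-running the loop appends again, so delete old output files first.
  write.table(data.frame(LAT=DF$Latitude, LON=DF$Longitude, TYPE=DF$Crime.type),
              file=paste0("E:/CrimesUK/CrimesUK",substr(paste(DF$Month[1]),1,4),"_",substr(paste(DF$Month[1]),6,7),".csv"),
              sep=",", row.names=FALSE, col.names=FALSE, append=TRUE)
  print(i)  # show progress
}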
Here I first create a list of all csv files, with their full paths, searching inside all sub-directories. Then I start a for loop to iterate through the files. The loop simply loads each file and then saves part of its contents (namely coordinates and crime type) into a new csv named after its year and month. This will help me identify which files to download from Dropbox, based on user inputs.
Once I had these files I simply uploaded them to my Dropbox.
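As an illustration of how this naming scheme ties in with the user inputs, the sketch below builds the file name for a selected year and month and reads such a file back. The helpers monthly_file and read_monthly are hypothetical and are not taken from the app; the only things they rely on are the CrimesUKYYYY_MM.csv pattern and the fact that the files were written without a header row.

```r
# Build the monthly file name from a year and a month,
# e.g. as selected with the two sliders.
monthly_file <- function(year, month, prefix = "CrimesUK") {
  sprintf("%s%04d_%02d.csv", prefix, as.integer(year), as.integer(month))
}

# Read one monthly file. The pre-processing wrote no column names,
# so they are supplied here.
read_monthly <- function(path) {
  read.csv(path, header = FALSE, col.names = c("LAT", "LON", "TYPE"))
}

# Small round trip with a temporary file standing in for a downloaded one
tmp <- file.path(tempdir(), monthly_file(2017, 3))
write.table(data.frame(LAT = 51.5, LON = -0.1, TYPE = "Burglary"),
            file = tmp, sep = ",", row.names = FALSE, col.names = FALSE)
crimes <- read_monthly(tmp)
```

Keeping one small file per month means that only the selected month has to be transferred and parsed each time the user changes the date.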

The link to test the app is:

fveronesi.shinyapps.io/CrimeUK/


A snapshot of the screen is below:

7 comments:

  1. Apps on shinyapps.io can read data from local files. For example, if you `read.csv` with a relative path to a file in the same directory or a subdirectory of the shiny app, the data file will be uploaded to shinyapps.io with the rest of the app. See for example http://docs.rstudio.com/shinyapps.io/Storage.html

  2. As above, you can upload the data to shiny apps without having it read externally. The Police data also has an API which might be of interest for reproducibility.

  3. Thank you both for the comments.
    My main issue is that the past 15 years of data for England, Wales and Northern Ireland are about 1.68 Gb in size. We are not talking about a small table.

    I'm not sure whether shinyapps.io allows uploading data of that size. Moreover, loading the whole dataset upfront in R would not be practical. Even assuming we have enough RAM on the server, the loading process would take way too long for a web-based application.

    I decided to pre-process the data into smaller files, one per month. When the user interacts with the map the data are downloaded depending on what the user selects. This way R will only need to download from Dropbox and load chunks of about 20 Mb.

    Fabio

  4. Great job!, really, and very inspiring. Thanks.

  5. where does input$mymap_bounds come from ?

    Replies
    1. Please look at this page: https://rstudio.github.io/leaflet/shiny.html

      In particular the section Inputs/Events

      There are a series of functions that we can use to observe how the user interacts with the map. One of these functions is: input$MAPID_bounds
      This takes the coordinates of the window the user is looking at (after zooming in). I just replaced MAPID with the ID I selected for my map, which is mymap.
