tc_upset bug #25

Closed
joeytalbot opened this issue Jul 17, 2020 · 5 comments

@joeytalbot
Contributor

joeytalbot commented Jul 17, 2020

Casualties or vehicles are not being properly assigned to the 'Other' category.

The figure in the function help example has lots of accidents with lone pedestrians or with lone cyclists. This isn't correct. Instead these crashes should probably be 'pedestrian + other' and 'cyclist + other.'

remotes::install_github("saferactive/trafficalmr")
library(trafficalmr)
remotes::install_github("krassowski/complex-upset")

crash_summary = tc_join_stats19(crashes_wf, casualties_wf, vehicles_wf)


# create plot with 'Other' category
casualties_wf2 = dplyr::mutate(
  casualties_wf,
  casualty_type_simple = dplyr::case_when(
    casualty_type == "Car occupant" ~ "Car",
    casualty_type == "Pedestrian" ~ "Pedestrian",
    casualty_type == "Cyclist" ~ "Cyclist",
    TRUE ~ "Other"
  )
)
crash_summary = tc_join_stats19(crashes_wf, casualties_wf2, vehicles_wf)
tc_upset(crash_summary, casualty_type = c("Car", "Pedestrian", "Bicycle", "Other"))
@Robinlovelace
Contributor

Great, can you try creating a 'reprex' (a reproducible example, with its output, generated using the reprex package) with the following command?

reprex::reprex({
  remotes::install_github("saferactive/trafficalmr")
  library(trafficalmr)
  remotes::install_github("krassowski/complex-upset")
  
  crash_summary = tc_join_stats19(crashes_wf, casualties_wf, vehicles_wf)
  
  
  # create plot with 'Other' category
  casualties_wf2 = dplyr::mutate(
    casualties_wf,
    casualty_type_simple = dplyr::case_when(
      casualty_type == "Car occupant" ~ "Car",
      casualty_type == "Pedestrian" ~ "Pedestrian",
      casualty_type == "Cyclist" ~ "Cyclist",
      TRUE ~ "Other"
    )
  )
  crash_summary = tc_join_stats19(crashes_wf, casualties_wf2, vehicles_wf)
  tc_upset(crash_summary, casualty_type = c("Car", "Pedestrian", "Bicycle", "Other"))
  })

@Robinlovelace
Contributor

Update: I'm just trying this code and think I'm close to a solution.

#' table(vehicles_wf$vehicle_type)
#' vehicles_wf2 = dplyr::mutate(
#'   vehicles_wf,
#'   vehicle_type_simple = dplyr::case_when(
#'     vehicle_type == "Car" ~ "Car",
#'     vehicle_type == "Cyclist" ~ "Pedal cycle",
#'     TRUE ~ "Other"
#'   )
#' )
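
A quick sanity check that may help with a recode like this: `case_when()` sends every value that matches no condition to the `TRUE ~ "Other"` branch without any warning, so a condition that tests for a label absent from the column never fires. The sketch below uses base R only and a small made-up data frame (`vehicles_toy` is hypothetical and not part of trafficalmr; its `vehicle_type` values are copied from the summary table further down this thread) to list which tested labels do not occur in the data and which values would fall through to "Other".

```r
# Toy stand-in for vehicles_wf (hypothetical data; the vehicle_type values
# are taken from the summary table in this thread)
vehicles_toy = data.frame(
  accident_index = c("a1", "a1", "a2", "a3"),
  vehicle_type = c("Car", "Pedal cycle", "Car", "Motorcycle over 500cc"),
  stringsAsFactors = FALSE
)

# Labels that the recode compares vehicle_type against
labels_tested = c("Car", "Cyclist")

# Tested labels that never occur in the column: their branch cannot match
labels_missing = setdiff(labels_tested, unique(vehicles_toy$vehicle_type))
labels_missing

# Values in the column that no condition tests for: these become "Other"
values_other = setdiff(unique(vehicles_toy$vehicle_type), labels_tested)
values_other
```

If `labels_missing` is non-empty when the same check is run on the real `vehicles_wf`, the corresponding condition may need to test for the value that actually appears in the data.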

@Robinlovelace
Contributor

After the fix the figure looks like this on my computer:

[Image: tc_upset figure after the fix]

@joeytalbot
Contributor Author

What about car-bicycle crashes? They are missing now. I find it hard to believe there are fewer of them than car-bicycle-other crashes.

@Robinlovelace
Contributor

Robinlovelace commented Jul 17, 2020

Good question. I'm not sure. It may be worth doing a bit of exploration, e.g. starting with this (updated) code:

library(trafficalmr)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
nrow(crashes_wf)
#> [1] 3449
nrow(casualties_wf)
#> [1] 4245
nrow(vehicles_wf)
#> [1] 6277
crashes_joined1 = inner_join(crashes_wf, casualties_wf)
#> Joining, by = "accident_index"
nrow(crashes_joined1)
#> [1] 4245
crashes_joined2 = inner_join(crashes_joined1, vehicles_wf, by = "accident_index")
nrow(crashes_joined2)
#> [1] 8033
crashes_joined2 %>% 
  group_by(vehicle_type, casualty_type) %>% 
  summarise(n = n()) %>% 
  ungroup() %>% 
  top_n(n = 10, wt = n) 
#> `summarise()` regrouping output by 'vehicle_type' (override with `.groups` argument)
#> # A tibble: 10 x 3
#>    vehicle_type                  casualty_type                                 n
#>    <chr>                         <chr>                                     <int>
#>  1 Bus or coach (17 or more pas… Bus or coach occupant (17 or more pass s…   160
#>  2 Car                           Car occupant                               4072
#>  3 Car                           Cyclist                                     459
#>  4 Car                           Motorcycle 125cc and under rider or pass…   240
#>  5 Car                           Pedestrian                                  628
#>  6 Motorcycle 125cc and under    Motorcycle 125cc and under rider or pass…   305
#>  7 Motorcycle over 500cc         Motorcycle over 500cc rider or passenger     91
#>  8 Pedal cycle                   Cyclist                                     582
#>  9 Van / Goods 3.5 tonnes mgw o… Car occupant                                194
#> 10 Van / Goods 3.5 tonnes mgw o… Van / Goods vehicle (3.5 tonnes mgw or u…   103

Created on 2020-07-17 by the reprex package (v0.3.0)

Further exploration of this may provide solutions for #24.
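
One caveat when reading the table above: joining casualties and vehicles on `accident_index` returns one row per casualty-vehicle pair, so `n` counts pairs rather than crashes (hence 8033 rows from 3449 crashes). A minimal sketch of counting crashes by their combination of vehicle types instead, using base R only and made-up data (`vehicles_toy` and its values are hypothetical, not taken from trafficalmr):

```r
# Toy stand-in for vehicles_wf: one row per vehicle (hypothetical data)
vehicles_toy = data.frame(
  accident_index = c("a1", "a1", "a2", "a2", "a2", "a3", "a4", "a4"),
  vehicle_type = c("Car", "Pedal cycle",
                   "Car", "Pedal cycle", "Van",
                   "Car",
                   "Pedal cycle", "Car"),
  stringsAsFactors = FALSE
)

# One label per crash: the sorted, de-duplicated vehicle types pasted together
combo_per_crash = tapply(
  vehicles_toy$vehicle_type,
  vehicles_toy$accident_index,
  function(x) paste(sort(unique(x)), collapse = " + ")
)

# Number of crashes (not casualty-vehicle pairs) for each combination
combo_counts = table(combo_per_crash)
combo_counts
```

Running the same grouping on the real `vehicles_wf`, before and after any recode to simplified types, might show whether car + pedal cycle crashes are present in the data but being relabelled.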
