Commit

address #344 and drive by tm_player_bio fix
JaseZiv committed Dec 2, 2023
1 parent 8053e5f commit ef359b7
Showing 4 changed files with 15 additions and 7 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -1,7 +1,7 @@
Type: Package
Package: worldfootballR
Title: Extract and Clean World Football (Soccer) Data
-Version: 0.6.4.0013
+Version: 0.6.4.0014
Authors@R: c(
person("Jason", "Zivkovic", , "[email protected]", role = c("aut", "cre", "cph")),
person("Tony", "ElHabr", , "[email protected]", role = "ctb"),
4 changes: 3 additions & 1 deletion NEWS.md
@@ -8,8 +8,10 @@
* `fb_league_stats(team_or_player = "player")` returning duplicate player hrefs (0.6.4.0008) [#331](https://github.com/JaseZiv/worldfootballR/issues/331)
* `fb_league_stats(team_or_player = "player")` returning wrong season's data for Australian league (0.6.4.0009) [#333](https://github.com/JaseZiv/worldfootballR/issues/333)
* `tm_player_market_values()` returing the player name and valuation on separate rows in the `player_name` column [#338] (https://github.com/JaseZiv/worldfootballR/issues/338) and also returning `NA`s for the `player_age` column (0.6.4.0010) [#336](https://github.com/JaseZiv/worldfootballR/issues/336)
-* `fb_match_results()` returns `NA` goals due to inconsistent `iconv()` behvaior on different systems (0.6.4.0011) [#326](https://github.com/JaseZiv/worldfootballR/issues/326)
+* `fb_match_results()` returns `NA` goals due to inconsistent `iconv()` behavior on different systems (0.6.4.0011) [#326](https://github.com/JaseZiv/worldfootballR/issues/326)
* `tm_team_player_urls()` fixed after change to server-side loading (0.6.4.0013) [#342](https://github.com/JaseZiv/worldfootballR/issues/342)
+* `fb_teams_urls()` fixed to remove lower division teams being returned as a result of playoff promotion games (0.6.4.0014) [#344](https://github.com/JaseZiv/worldfootballR/issues/344)
+* `tm_player_bio()` column name and data structure change for player date of birth (0.6.4.0014)


### Improvements
7 changes: 6 additions & 1 deletion R/tm_player_bio.R
@@ -85,8 +85,13 @@ tm_player_bio <- function(player_urls) {

# some of the following columns may not exist, so this series of if statements will handle for this:
if(any(grepl("date_of_birth", colnames(full_bios)))) {
+# tm changed the column name from date_of_birth to date_of_birth_age - need to clean up the age in parentheses
+c_name <- names(full_bios[grepl("date_of_birth", colnames(full_bios))])
 full_bios <- full_bios %>%
-dplyr::mutate(date_of_birth = .tm_fix_dates(.data[["date_of_birth"]]))
+dplyr::mutate(date_of_birth = .tm_fix_dates(gsub(" \\(.*", "", .data[[c_name]])))
+
+# to not introduce breaking changes, we will keep the column name consistent with existing expected output and remove the new date_of_birth_age column
+full_bios[, c_name] <- NULL
}

if(any(grepl("joined", colnames(full_bios)))) {
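
The date-of-birth change in the hunk above can be tried in isolation with base R. The following is a minimal sketch, not the package's code: the helper name `clean_dob_column()` and the sample values are hypothetical, the input format `"Jun 24, 1987 (36)"` is an assumption about what the renamed column looks like, and the internal `.tm_fix_dates()` step is deliberately left out. The guard before removing the source column is also an addition for illustration and is not part of the commit.

```r
# Hypothetical helper mirroring the hunk above: find the date-of-birth column
# under either name, strip the trailing " (age)" part, and keep the original
# output column name `date_of_birth`.
clean_dob_column <- function(full_bios) {
  c_name <- names(full_bios)[grepl("date_of_birth", names(full_bios))]
  if (length(c_name) == 0) {
    return(full_bios)
  }
  c_name <- c_name[1]

  # Drop everything from the first " (" onward; values without an age are unchanged
  full_bios$date_of_birth <- gsub(" \\(.*", "", full_bios[[c_name]])

  # Assumption for this sketch: only remove the source column when it is not
  # already called `date_of_birth`, so the cleaned column is never dropped
  if (c_name != "date_of_birth") {
    full_bios[[c_name]] <- NULL
  }
  full_bios
}

# Made-up rows in the assumed new and old layouts
new_layout <- data.frame(player_name = "Player A",
                         date_of_birth_age = "Jun 24, 1987 (36)",
                         stringsAsFactors = FALSE)
old_layout <- data.frame(player_name = "Player B",
                         date_of_birth = "Feb 5, 1992",
                         stringsAsFactors = FALSE)

cleaned_new <- clean_dob_column(new_layout)
cleaned_old <- clean_dob_column(old_layout)
print(cleaned_new)
print(cleaned_old)
```

One thing the sketch makes visible: in the committed version, `c_name` would equal `"date_of_birth"` if the site ever served the old column name again, and `full_bios[, c_name] <- NULL` would then remove the freshly cleaned column. Whether that case can still occur is not clear from the diff alone.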
9 changes: 5 additions & 4 deletions R/worldfootballr_helpers.R
@@ -77,6 +77,7 @@ fb_teams_urls <- function(league_url, time_pause=3) {
main_url <- "https://fbref.com"

urls <- league_season_page %>%
+rvest::html_elements(".stats_table") %>%
rvest::html_nodes("img+ a") %>%
rvest::html_attr("href") %>%
unique() %>% paste0(main_url, .)
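
The effect of scoping the selector to `.stats_table` before matching `img+ a` can be illustrated on a toy page. This is a rough sketch under assumptions: the HTML below is invented and much simpler than a real FBref league page, the `playoffs` class and the squad paths are hypothetical, and it assumes rvest 1.0 or later for `minimal_html()` and `html_elements()`.

```r
library(rvest)

# Invented markup: one team link inside a stats table, one in a separate
# (hypothetical) playoff section outside any table
page <- minimal_html('
  <table class="stats_table">
    <tr><td><img src="a.png"><a href="/en/squads/aaa/Team-A">Team A</a></td></tr>
  </table>
  <div class="playoffs">
    <img src="b.png"><a href="/en/squads/bbb/Team-B">Team B</a>
  </div>
')

# Without scoping, every link that directly follows an image is matched
unscoped <- page %>%
  html_elements("img+ a") %>%
  html_attr("href")

# With scoping, only links inside elements with class "stats_table" are matched
scoped <- page %>%
  html_elements(".stats_table") %>%
  html_elements("img+ a") %>%
  html_attr("href")

print(unscoped)
print(scoped)
```

In this toy page the unscoped query returns both links, while the scoped query returns only the link inside the table, which is the behaviour the NEWS entry for #344 describes for playoff teams.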
@@ -201,16 +202,16 @@ tm_league_team_urls <- function(country_name, start_year, league_url = NA) {
#' @export
#'
tm_team_player_urls <- function(team_url) {

main_url <- "https://www.transfermarkt.com"

tryCatch({team_page <- xml2::read_html(team_url)}, error = function(e) {team_page <- c()})

player_urls <- team_page %>%
rvest::html_nodes("#yw1 td.posrela .hauptlink a") %>% rvest::html_attr("href") %>%
unique() %>%
paste0(main_url, .)

return(player_urls)
}
