From 8ef557f5ca3aba322a2bb57e41a06a86854c321e Mon Sep 17 00:00:00 2001
From: "Steven Paul Sanderson II, MPH"
Date: Mon, 18 Jul 2022 13:45:04 -0400
Subject: [PATCH 1/6] Update cran-comments.md

---
 cran-comments.md | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/cran-comments.md b/cran-comments.md
index 6d843b4..5d3f057 100644
--- a/cran-comments.md
+++ b/cran-comments.md
@@ -1,12 +1,11 @@
 ## Test environments
-0.1.9 prep for cran
+0.2.0 prep for cran
 ## Fix Issue to maintain CRAN
 ## R CMD check results
-0 errors | 1 warnings | 2 note
+0 errors | 0 warnings | 2 note
-Fixed CRAN pretest result and changed using if statement to compare class
-to use inherits in kmeans-funcs.R per pretest.
+Minor updates.

From ecafdb01074e1a402f8b91e7e35b86234ecb5506 Mon Sep 17 00:00:00 2001
From: "Steven Paul Sanderson II, MPH"
Date: Mon, 18 Jul 2022 13:51:47 -0400
Subject: [PATCH 2/6] Increment version number to 0.2.0

---
 DESCRIPTION | 2 +-
 NEWS.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/DESCRIPTION b/DESCRIPTION
index 8abc062..f46f18f 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,6 +1,6 @@
 Package: healthyR
 Title: Hospital Data Analysis Workflow Tools
-Version: 0.1.9.9000
+Version: 0.2.0
 Authors@R: c(
     person("Steven","Sanderson", email = "spsanderson@gmail.com", role = c("aut","cre")),
     person("Steven Sanderson", role = "cph"))
diff --git a/NEWS.md b/NEWS.md
index 8aaf92c..3a7da34 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -1,4 +1,4 @@
-# healthyR (development version)
+# healthyR 0.2.0
 ## Breaking Changes
 None

From e3322dadc2b256bb3f85981b103616857318e863 Mon Sep 17 00:00:00 2001
From: "Steven Paul Sanderson II, MPH"
Date: Mon, 18 Jul 2022 15:18:31 -0400
Subject: [PATCH 3/6] Create CRAN-SUBMISSION

---
 CRAN-SUBMISSION | 3 +++
 1 file changed, 3 insertions(+)
 create mode 100644 CRAN-SUBMISSION

diff --git a/CRAN-SUBMISSION b/CRAN-SUBMISSION
new file mode 100644
index 0000000..420ea5e
--- /dev/null
+++ b/CRAN-SUBMISSION
@@ -0,0 +1,3 @@
+Version: 0.2.0
+Date: 2022-07-18 19:17:32 UTC
+SHA: ecafdb01074e1a402f8b91e7e35b86234ecb5506

From a4bad40b89b47cceb43128ef80f251ee6957d443 Mon Sep 17 00:00:00 2001
From: "Steven Paul Sanderson II, MPH"
Date: Mon, 18 Jul 2022 15:51:22 -0400
Subject: [PATCH 4/6] Delete CRAN-SUBMISSION

---
 CRAN-SUBMISSION | 3 ---
 1 file changed, 3 deletions(-)
 delete mode 100644 CRAN-SUBMISSION

diff --git a/CRAN-SUBMISSION b/CRAN-SUBMISSION
deleted file mode 100644
index 420ea5e..0000000
--- a/CRAN-SUBMISSION
+++ /dev/null
@@ -1,3 +0,0 @@
-Version: 0.2.0
-Date: 2022-07-18 19:17:32 UTC
-SHA: ecafdb01074e1a402f8b91e7e35b86234ecb5506

From 62bbf2ddd6bfb5c007420c9a2c99889fd53e6e5e Mon Sep 17 00:00:00 2001
From: "Steven Paul Sanderson II, MPH"
Date: Mon, 18 Jul 2022 15:51:33 -0400
Subject: [PATCH 5/6] Increment version number to 0.2.0.9000

---
 DESCRIPTION | 2 +-
 NEWS.md | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/DESCRIPTION b/DESCRIPTION
index f46f18f..680e25a 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,6 +1,6 @@
 Package: healthyR
 Title: Hospital Data Analysis Workflow Tools
-Version: 0.2.0
+Version: 0.2.0.9000
 Authors@R: c(
     person("Steven","Sanderson", email = "spsanderson@gmail.com", role = c("aut","cre")),
     person("Steven Sanderson", role = "cph"))
diff --git a/NEWS.md b/NEWS.md
index 3a7da34..9b58541 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -1,3 +1,5 @@
+# healthyR (development version)
+
 # healthyR 0.2.0
 ## Breaking Changes

From 3d23ec736d27da008713457f5b4f2a2a120b2190 Mon Sep 17 00:00:00 2001
From: "Steven Paul Sanderson II, MPH"
Date: Mon, 18 Jul 2022 20:40:26 -0400
Subject: [PATCH 6/6] update site

Fixes #146
---
 _pkgdown.yml | 6 +
 docs/404.html | 4 +-
 docs/CODE_OF_CONDUCT.html | 4 +-
 docs/LICENSE-text.html | 4 +-
 docs/LICENSE.html | 4 +-
 docs/articles/getting-started.html | 82 ++---
 docs/articles/index.html | 4 +-
 docs/articles/kmeans-umap.html | 278 +++++++-------
 .../figure-html/scree_plt-1.png | Bin 63840 -> 64277 bytes
 .../figure-html/umap_plt-1.png |
Bin 131491 -> 125386 bytes
 docs/authors.html | 8 +-
 docs/index.html | 110 +++---
 docs/news/index.html | 23 +-
 docs/pkgdown.yml | 4 +-
 docs/reference/Rplot002.png | Bin 36936 -> 38795 bytes
 docs/reference/category_counts_tbl.html | 64 ++--
 docs/reference/color_blind.html | 132 +++++++
 docs/reference/diverging_bar_plt.html | 77 ++--
 docs/reference/diverging_lollipop_plt.html | 63 ++--
 docs/reference/dx_cc_mapping.html | 6 +-
 .../figures/README-gartner_chart-1.png | Bin 7247 -> 6880 bytes
 docs/reference/gartner_magic_chart_plt-1.png | Bin 66748 -> 88127 bytes
 docs/reference/gartner_magic_chart_plt-2.png | Bin 0 -> 61380 bytes
 docs/reference/gartner_magic_chart_plt.html | 107 ++++--
 docs/reference/hr_scale_color_colorblind.html | 134 +++++++
 docs/reference/hr_scale_fill_colorblind.html | 134 +++++++
 docs/reference/index.html | 20 +-
 docs/reference/kmeans_mapped_tbl.html | 85 +++--
 docs/reference/kmeans_obj.html | 65 ++--
 docs/reference/kmeans_scree_data_tbl.html | 65 ++--
 docs/reference/kmeans_scree_plt-1.png | Bin 59425 -> 59742 bytes
 docs/reference/kmeans_scree_plt.html | 55 +--
 docs/reference/kmeans_tidy_tbl.html | 148 ++++----
 docs/reference/kmeans_user_item_tbl.html | 63 ++--
 docs/reference/los_ra_index_plt-1.png | Bin 108880 -> 113545 bytes
 docs/reference/los_ra_index_plt-2.png | Bin 102454 -> 108124 bytes
 docs/reference/los_ra_index_plt.html | 73 ++--
 docs/reference/los_ra_index_summary_tbl.html | 145 ++++----
 docs/reference/named_item_list.html | 195 +++++-----
 docs/reference/opt_bin.html | 66 ++--
 docs/reference/pipe.html | 6 +-
 docs/reference/px_cc_mapping.html | 6 +-
 docs/reference/save_to_excel.html | 13 +-
 docs/reference/service_line_augment.html | 347 +++++++++---------
 docs/reference/service_line_vec.html | 341 ++++++++---------
 docs/reference/sql_left.html | 15 +-
 docs/reference/sql_mid.html | 17 +-
 docs/reference/sql_right.html | 15 +-
 docs/reference/tidyeval.html | 18 +-
 docs/reference/top_n_tbl.html | 43 ++-
 docs/reference/ts_alos_plt.html | 83 +++--
 docs/reference/ts_census_los_daily_tbl.html | 71 ++--
 docs/reference/ts_median_excess_plt.html | 65 ++--
 docs/reference/ts_plt.html | 87 +++--
 docs/reference/ts_readmit_rate_plt.html | 85 +++--
 docs/reference/ts_signature_tbl.html | 39 +-
 docs/reference/umap_list.html | 75 ++--
 docs/reference/umap_plt.html | 69 ++--
 docs/sitemap.xml | 9 +
 59 files changed, 2169 insertions(+), 1463 deletions(-)
 create mode 100644 docs/reference/color_blind.html
 create mode 100644 docs/reference/gartner_magic_chart_plt-2.png
 create mode 100644 docs/reference/hr_scale_color_colorblind.html
 create mode 100644 docs/reference/hr_scale_fill_colorblind.html

diff --git a/_pkgdown.yml b/_pkgdown.yml
index 1dd1a3d..c7468fa 100644
--- a/_pkgdown.yml
+++ b/_pkgdown.yml
@@ -39,6 +39,12 @@ reference:
 - "umap_plt"
 - "diverging_lollipop_plt"
 - "diverging_bar_plt"
+ - subtitle: Color Blindness
+ desc: Functions to help with color blindness
+ contents:
+ - "color_blind"
+ - "hr_scale_fill_colorblind"
+ - "hr_scale_color_colorblind"
 - title: Augment Functions
 desc: Functions that Augment a tibble
 contents:
diff --git a/docs/404.html b/docs/404.html
index 2a7ed70..3a3d554 100644
--- a/docs/404.html
+++ b/docs/404.html
@@ -38,7 +38,7 @@
 healthyR
- 0.1.9.9000
+ 0.2.0.9000
@@ -116,7 +116,7 @@

Page not found (404)

-

Site built with pkgdown 2.0.3.

+

Site built with pkgdown 2.0.5.

diff --git a/docs/CODE_OF_CONDUCT.html b/docs/CODE_OF_CONDUCT.html
index f488f3a..cd79bef 100644
--- a/docs/CODE_OF_CONDUCT.html
+++ b/docs/CODE_OF_CONDUCT.html
@@ -23,7 +23,7 @@
 healthyR
- 0.1.9.9000
+ 0.2.0.9000
@@ -129,7 +129,7 @@

Attribution -

Site built with pkgdown 2.0.3.

+

Site built with pkgdown 2.0.5.

diff --git a/docs/LICENSE-text.html b/docs/LICENSE-text.html
index a8568b0..827d7f3 100644
--- a/docs/LICENSE-text.html
+++ b/docs/LICENSE-text.html
@@ -23,7 +23,7 @@
 healthyR
- 0.1.9.9000
+ 0.2.0.9000
@@ -91,7 +91,7 @@

License

-

Site built with pkgdown 2.0.3.

+

Site built with pkgdown 2.0.5.

diff --git a/docs/LICENSE.html b/docs/LICENSE.html
index dbbb2b3..573eeba 100644
--- a/docs/LICENSE.html
+++ b/docs/LICENSE.html
@@ -23,7 +23,7 @@
 healthyR
- 0.1.9.9000
+ 0.2.0.9000
@@ -95,7 +95,7 @@

MIT License

-

Site built with pkgdown 2.0.3.

+

Site built with pkgdown 2.0.5.

diff --git a/docs/articles/getting-started.html b/docs/articles/getting-started.html
index 089a993..5feb024 100644
--- a/docs/articles/getting-started.html
+++ b/docs/articles/getting-started.html
@@ -39,7 +39,7 @@
 healthyR
- 0.1.9.9000
+ 0.2.0.9000
@@ -101,7 +101,7 @@

A Quick Introduction

Steven P. Sanderson II, MPH

-

2022-04-25

+

2022-07-18

Source: vignettes/getting-started.Rmd @@ -118,11 +118,11 @@

Library Load
-library(healthyR)
-library(healthyR.data)
-library(timetk)
-library(dplyr)
-library(purrr)
+library(healthyR)
+library(healthyR.data)
+library(timetk)
+library(dplyr)
+library(purrr)

Generate Sample Data @@ -133,23 +133,23 @@

Generate Sample Data
-# Get Length of Stay Data
-data_tbl <- healthyR_data
-
-df_tbl <- data_tbl %>%
-  filter(ip_op_flag == "I") %>%
-  select(visit_end_date_time, length_of_stay) %>%
-  summarise_by_time(
-    .date_var = visit_end_date_time
-    , .by     = "day"
-    , visits  = mean(length_of_stay, na.rm = TRUE)
-  ) %>%
-  filter_by_time(
-    .date_var     = visit_end_date_time
-    , .start_date = "2012"
-    , .end_date   = "2019"
-  ) %>%
-  set_names("Date","Values")

+# Get Length of Stay Data
+data_tbl <- healthyR_data
+
+df_tbl <- data_tbl %>%
+  filter(ip_op_flag == "I") %>%
+  select(visit_end_date_time, length_of_stay) %>%
+  summarise_by_time(
+    .date_var = visit_end_date_time
+    , .by     = "day"
+    , visits  = mean(length_of_stay, na.rm = TRUE)
+  ) %>%
+  filter_by_time(
+    .date_var     = visit_end_date_time
+    , .start_date = "2012"
+    , .end_date   = "2019"
+  ) %>%
+  set_names("Date","Values")

Plot the Time Series @@ -157,26 +157,26 @@

Plot the Time Series
-ts_alos_plt(
-  .data = df_tbl
-  , .date_col = Date
-  , .value_col = Values
-  , .by = "month"
-  , .interactive = FALSE
-)

+ts_alos_plt(
+  .data = df_tbl
+  , .date_col = Date
+  , .value_col = Values
+  , .by = "month"
+  , .interactive = FALSE
+)

And with the .interactive option set to TRUE:

-ts_alos_plt(
-  .data = df_tbl
-  , .date_col = Date
-  , .value_col = Values
-  , .by = "month"
-  , .interactive = TRUE
-)
-
-

As we can see, this function has the ability to return either a
+ts_alos_plt(
+  .data = df_tbl
+  , .date_col = Date
+  , .value_col = Values
+  , .by = "month"
+  , .interactive = TRUE
+)
+

+

As we can see, this function has the ability to return either a static plot or an interactive plot. Under the hood it is using the timetk::plot_time_series function. You can find out more on the timetk function here.

@@ -202,7 +202,7 @@

Plot the Time Series

-

Site built with pkgdown 2.0.3.

+

Site built with pkgdown 2.0.5.

diff --git a/docs/articles/index.html b/docs/articles/index.html
index ca7fdd4..cfbf147 100644
--- a/docs/articles/index.html
+++ b/docs/articles/index.html
@@ -23,7 +23,7 @@
 healthyR
- 0.1.9.9000
+ 0.2.0.9000
@@ -90,7 +90,7 @@

All vignettes

-

Site built with pkgdown 2.0.3.

+

Site built with pkgdown 2.0.5.

diff --git a/docs/articles/kmeans-umap.html b/docs/articles/kmeans-umap.html
index e380693..3e4a4a7 100644
--- a/docs/articles/kmeans-umap.html
+++ b/docs/articles/kmeans-umap.html
@@ -39,7 +39,7 @@
 healthyR
- 0.1.9.9000
+ 0.2.0.9000
@@ -98,7 +98,7 @@

Clustering with K-Means and UMAP

Steven P. Sanderson II, MPH

-

2022-04-25

+

2022-07-18

Source: vignettes/kmeans-umap.Rmd @@ -115,7 +115,7 @@

Library Load
-library(healthyR)
+library(healthyR)

Information @@ -139,26 +139,26 @@

Information

Generate some data

-library(healthyR.data)
-library(dplyr)
-library(broom)
-library(ggplot2)
-
-data_tbl <- healthyR_data %>%
-    filter(ip_op_flag == "I") %>%
-    filter(payer_grouping != "Medicare B") %>%
-    filter(payer_grouping != "?") %>%
-    select(service_line, payer_grouping) %>%
-    mutate(record = 1) %>%
-    as_tibble()
-
-data_tbl %>%
-  glimpse()
-#> Rows: 116,823
-#> Columns: 3
-#> $ service_line   <chr> "Medical", "Schizophrenia", "Syncope", "Pneumonia", "Ch~
-#> $ payer_grouping <chr> "Blue Cross", "Medicare A", "Medicare A", "Medicare A",~
-#> $ record         <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1~
+library(healthyR.data)
+library(dplyr)
+library(broom)
+library(ggplot2)
+
+data_tbl <- healthyR_data %>%
+    filter(ip_op_flag == "I") %>%
+    filter(payer_grouping != "Medicare B") %>%
+    filter(payer_grouping != "?") %>%
+    select(service_line, payer_grouping) %>%
+    mutate(record = 1) %>%
+    as_tibble()
+
+data_tbl %>%
+  glimpse()
+#> Rows: 116,823
+#> Columns: 3
+#> $ service_line   <chr> "Medical", "Schizophrenia", "Syncope", "Pneumonia", "Ch…
+#> $ payer_grouping <chr> "Blue Cross", "Medicare A", "Medicare A", "Medicare A",…
+#> $ record         <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…

Now that we have our data we need to generate what is called a user item table. To do this we use the function kmeans_user_item_tbl which takes in just a few @@ -173,25 +173,25 @@

Generate some data

User Item Tibble

-uit_tbl <- kmeans_user_item_tbl(data_tbl, service_line, payer_grouping, record)
-
-uit_tbl
-#> # A tibble: 23 x 12
-#>    service_line     `Blue Cross` Commercial Compensation `Exchange Plans`    HMO
-#>    <chr>                   <dbl>      <dbl>        <dbl>            <dbl>  <dbl>
-#>  1 Alcohol Abuse          0.0941    0.0321      0.000525          0.0116  0.0788
-#>  2 Bariatric Surge~       0.317     0.0583      0                 0.0518  0.168 
-#>  3 Carotid Endarte~       0.0845    0.0282      0                 0       0.0141
-#>  4 Cellulitis             0.110     0.0339      0.0118            0.00847 0.0805
-#>  5 Chest Pain             0.144     0.0391      0.00290           0.00543 0.112 
-#>  6 CHF                    0.0295    0.00958     0.000518          0.00414 0.0205
-#>  7 COPD                   0.0493    0.0228      0.000228          0.00548 0.0342
-#>  8 CVA                    0.0647    0.0246      0.00107           0.0107  0.0524
-#>  9 GI Hemorrhage          0.0542    0.0175      0.00125           0.00834 0.0480
-#> 10 Joint Replaceme~       0.139     0.0179      0.0336            0.00673 0.0516
-#> # ... with 13 more rows, and 6 more variables: Medicaid <dbl>,
-#> #   `Medicaid HMO` <dbl>, `Medicare A` <dbl>, `Medicare HMO` <dbl>,
-#> #   `No Fault` <dbl>, `Self Pay` <dbl>
+uit_tbl <- kmeans_user_item_tbl(data_tbl, service_line, payer_grouping, record)
+
+uit_tbl
+#> # A tibble: 23 × 12
+#>    service_line     `Blue Cross` Commercial Compensation `Exchange Plans`    HMO
+#>    <chr>                   <dbl>      <dbl>        <dbl>            <dbl>  <dbl>
+#>  1 Alcohol Abuse          0.0941    0.0321      0.000525          0.0116  0.0788
+#>  2 Bariatric Surge…       0.317     0.0583      0                 0.0518  0.168
+#>  3 Carotid Endarte…       0.0845    0.0282      0                 0       0.0141
+#>  4 Cellulitis             0.110     0.0339      0.0118            0.00847 0.0805
+#>  5 Chest Pain             0.144     0.0391      0.00290           0.00543 0.112
+#>  6 CHF                    0.0295    0.00958     0.000518          0.00414 0.0205
+#>  7 COPD                   0.0493    0.0228      0.000228          0.00548 0.0342
+#>  8 CVA                    0.0647    0.0246      0.00107           0.0107  0.0524
+#>  9 GI Hemorrhage          0.0542    0.0175      0.00125           0.00834 0.0480
+#> 10 Joint Replaceme…       0.139     0.0179      0.0336            0.00673 0.0516
+#> # … with 13 more rows, and 6 more variables: Medicaid <dbl>,
+#> #   `Medicaid HMO` <dbl>, `Medicare A` <dbl>, `Medicare HMO` <dbl>,
+#> #   `No Fault` <dbl>, `Self Pay` <dbl>

The table is aggregated by item for the various users to which the algorithm will be applied.

Now that we have this data we need to find what will be our optimal k @@ -209,52 +209,52 @@

User Item Tibble

K-Means Mapped Tibble

-kmm_tbl <- kmeans_mapped_tbl(uit_tbl)
-
-kmm_tbl
-#> # A tibble: 15 x 3
-#>    centers k_means  glance          
-#>      <int> <list>   <list>          
-#>  1       1 <kmeans> <tibble [1 x 4]>
-#>  2       2 <kmeans> <tibble [1 x 4]>
-#>  3       3 <kmeans> <tibble [1 x 4]>
-#>  4       4 <kmeans> <tibble [1 x 4]>
-#>  5       5 <kmeans> <tibble [1 x 4]>
-#>  6       6 <kmeans> <tibble [1 x 4]>
-#>  7       7 <kmeans> <tibble [1 x 4]>
-#>  8       8 <kmeans> <tibble [1 x 4]>
-#>  9       9 <kmeans> <tibble [1 x 4]>
-#> 10      10 <kmeans> <tibble [1 x 4]>
-#> 11      11 <kmeans> <tibble [1 x 4]>
-#> 12      12 <kmeans> <tibble [1 x 4]>
-#> 13      13 <kmeans> <tibble [1 x 4]>
-#> 14      14 <kmeans> <tibble [1 x 4]>
-#> 15      15 <kmeans> <tibble [1 x 4]>
+kmm_tbl <- kmeans_mapped_tbl(uit_tbl)
+
+kmm_tbl
+#> # A tibble: 15 × 3
+#>    centers k_means  glance
+#>      <int> <list>   <list>
+#>  1       1 <kmeans> <tibble [1 × 4]>
+#>  2       2 <kmeans> <tibble [1 × 4]>
+#>  3       3 <kmeans> <tibble [1 × 4]>
+#>  4       4 <kmeans> <tibble [1 × 4]>
+#>  5       5 <kmeans> <tibble [1 × 4]>
+#>  6       6 <kmeans> <tibble [1 × 4]>
+#>  7       7 <kmeans> <tibble [1 × 4]>
+#>  8       8 <kmeans> <tibble [1 × 4]>
+#>  9       9 <kmeans> <tibble [1 × 4]>
+#> 10      10 <kmeans> <tibble [1 × 4]>
+#> 11      11 <kmeans> <tibble [1 × 4]>
+#> 12      12 <kmeans> <tibble [1 × 4]>
+#> 13      13 <kmeans> <tibble [1 × 4]>
+#> 14      14 <kmeans> <tibble [1 × 4]>
+#> 15      15 <kmeans> <tibble [1 × 4]>

As we see there are three columns, centers, k_means and glance. The k_means column is the k_means list object and glance is the tibble returned by the broom::glance function.

-kmm_tbl %>%
-  tidyr::unnest(glance)
-#> # A tibble: 15 x 6
-#>    centers k_means  totss tot.withinss betweenss  iter
-#>      <int> <list>   <dbl>        <dbl>     <dbl> <int>
-#>  1       1 <kmeans>  1.41       1.41    1.33e-15     1
-#>  2       2 <kmeans>  1.41       0.592   8.17e- 1     1
-#>  3       3 <kmeans>  1.41       0.372   1.04e+ 0     2
-#>  4       4 <kmeans>  1.41       0.276   1.13e+ 0     2
-#>  5       5 <kmeans>  1.41       0.202   1.21e+ 0     2
-#>  6       6 <kmeans>  1.41       0.159   1.25e+ 0     3
-#>  7       7 <kmeans>  1.41       0.124   1.28e+ 0     2
-#>  8       8 <kmeans>  1.41       0.0922  1.32e+ 0     2
-#>  9       9 <kmeans>  1.41       0.0745  1.33e+ 0     4
-#> 10      10 <kmeans>  1.41       0.0576  1.35e+ 0     2
-#> 11      11 <kmeans>  1.41       0.0460  1.36e+ 0     3
-#> 12      12 <kmeans>  1.41       0.0363  1.37e+ 0     3
-#> 13      13 <kmeans>  1.41       0.0272  1.38e+ 0     2
-#> 14      14 <kmeans>  1.41       0.0231  1.39e+ 0     2
-#> 15      15 <kmeans>  1.41       0.0161  1.39e+ 0     3
+kmm_tbl %>%
+  tidyr::unnest(glance)
+#> # A tibble: 15 × 6
+#>    centers k_means  totss tot.withinss betweenss  iter
+#>      <int> <list>   <dbl>        <dbl>     <dbl> <int>
+#>  1       1 <kmeans>  1.41       1.41    1.33e-15     1
+#>  2       2 <kmeans>  1.41       0.592   8.17e- 1     1
+#>  3       3 <kmeans>  1.41       0.372   1.04e+ 0     2
+#>  4       4 <kmeans>  1.41       0.276   1.13e+ 0     2
+#>  5       5 <kmeans>  1.41       0.202   1.21e+ 0     4
+#>  6       6 <kmeans>  1.41       0.159   1.25e+ 0     3
+#>  7       7 <kmeans>  1.41       0.124   1.28e+ 0     3
+#>  8       8 <kmeans>  1.41       0.0922  1.32e+ 0     3
+#>  9       9 <kmeans>  1.41       0.0716  1.34e+ 0     2
+#> 10      10 <kmeans>  1.41       0.0576  1.35e+ 0     3
+#> 11      11 <kmeans>  1.41       0.0460  1.36e+ 0     2
+#> 12      12 <kmeans>  1.41       0.0363  1.37e+ 0     3
+#> 13      13 <kmeans>  1.41       0.0282  1.38e+ 0     2
+#> 14      14 <kmeans>  1.41       0.0231  1.39e+ 0     2
+#> 15      15 <kmeans>  1.41       0.0161  1.39e+ 0     3

As stated we use the tot.withinss to decide what will become our k, an easy way to do this is to visualize the Scree Plot, also known as the elbow plot. This is done by @@ -265,30 +265,30 @@

K-Means Mapped Tibble

Scree Plot and Data

-kmeans_scree_plt(.data = kmm_tbl)
+kmeans_scree_plt(.data = kmm_tbl)

If we want to see the scree plot data that creates the plot then we can use another function kmeans_scree_data_tbl.

-kmeans_scree_data_tbl(kmm_tbl)
-#> # A tibble: 15 x 2
-#>    centers tot.withinss
-#>      <int>        <dbl>
-#>  1       1       1.41  
-#>  2       2       0.592 
-#>  3       3       0.372 
-#>  4       4       0.276 
-#>  5       5       0.202 
-#>  6       6       0.159 
-#>  7       7       0.124 
-#>  8       8       0.0922
-#>  9       9       0.0745
-#> 10      10       0.0576
-#> 11      11       0.0460
-#> 12      12       0.0363
-#> 13      13       0.0272
-#> 14      14       0.0231
-#> 15      15       0.0161
+kmeans_scree_data_tbl(kmm_tbl)
+#> # A tibble: 15 × 2
+#>    centers tot.withinss
+#>      <int>        <dbl>
+#>  1       1       1.41
+#>  2       2       0.592
+#>  3       3       0.372
+#>  4       4       0.276
+#>  5       5       0.202
+#>  6       6       0.159
+#>  7       7       0.124
+#>  8       8       0.0922
+#>  9       9       0.0716
+#> 10      10       0.0576
+#> 11      11       0.0460
+#> 12      12       0.0363
+#> 13      13       0.0282
+#> 14      14       0.0231
+#> 15      15       0.0161

With the above pieces of information we can decide upon a value for k, in this instance we are going to use 3. Now that we have that we can go ahead with creating the umap list object @@ -300,7 +300,7 @@

UMAP List Object

Now let's go ahead and create our UMAP list object.

-ump_lst <- umap_list(.data = uit_tbl, kmm_tbl, 3)
+ump_lst <- umap_list(.data = uit_tbl, kmm_tbl, 3)

Now that it is created, let's take a look at each item in the list. The umap_list function returns a list of 5 items.