Error: Can not compute Euclidean distance with nominal attributes #10

Open
BobMuenchen opened this issue Feb 15, 2020 · 0 comments

Thanks for all your work on this useful package! I was surprised to see that Euclidean distance could not be used on a formula that contained only numeric variables. The function seems to care if the dataset contains factors, even when they're not used in the formula. That may be as designed, so I'm just reporting this in case you view it as an error.

library("UBL")
Loading required package: MBA
Loading required package: gstat
Registered S3 method overwritten by 'xts':
method from
as.zoo.xts zoo
Loading required package: automap
Loading required package: sp
Loading required package: randomForest
randomForest 4.6-14
Type rfNews() to see new features/changes/bug fixes.
library(MASS)
data(cats)
head(cats)
  Sex Bwt Hwt
1   F 2.0 7.0
2   F 2.0 7.4
3   F 2.0 9.5
4   F 2.1 7.2
5   F 2.1 7.3
6   F 2.1 7.6
length(cats$Sex)
[1] 144

I'm adding a factor for color:

cats$color <- gl(n = 2, k = 1, length = 144, labels = c("black", "white"))
head(cats)
  Sex Bwt Hwt color
1   F 2.0 7.0 black
2   F 2.0 7.4 white
3   F 2.0 9.5 black
4   F 2.1 7.2 white
5   F 2.1 7.3 black
6   F 2.1 7.6 white

The formula doesn't use color, but the call fails with an error anyway:

mysmote.cats <- SmoteClassif(Sex ~ Bwt + Hwt, cats, list(M = 0.8, F = 1.8))
Error in neighbours(tgt, dat, dist, p, k) :
Can not compute Euclidean distance with nominal attributes!

Switching to the HEOM distance (dist = "HEOM"), which handles nominal attributes, avoids the error:

mysmote.cats <- SmoteClassif(Sex ~ Bwt + Hwt, cats, list(M = 0.8, F = 1.8), dist = "HEOM")
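
Another possible workaround, which I have not verified against UBL 0.0.6, is to drop the unused columns before the call so that no factor other than the target reaches the distance computation. The sketch below uses base R's all.vars() to keep only the formula's variables. The helper nominal_predictors() is a hypothetical name introduced here for illustration, not part of UBL, and the final SmoteClassif() call assumes the Euclidean check only looks at the columns of the data frame it is given.

```r
library(MASS)   # provides the cats data set
data(cats)

# Recreate the unused factor column from the report
cats$color <- gl(n = 2, k = 1, length = nrow(cats), labels = c("black", "white"))

f <- Sex ~ Bwt + Hwt

# Keep only the columns named in the formula: Sex, Bwt, Hwt
cats_used <- cats[, all.vars(f)]

# Hypothetical helper (not part of UBL): names of the non-target columns
# that are factors or character vectors
nominal_predictors <- function(d, target) {
  preds <- d[, setdiff(names(d), target), drop = FALSE]
  is_nom <- vapply(preds, function(x) is.factor(x) || is.character(x), logical(1))
  names(preds)[is_nom]
}

nominal_predictors(cats, "Sex")       # the full data still carries color
nominal_predictors(cats_used, "Sex")  # nothing nominal left besides the target

# Assumption: with no nominal predictors in the data frame, the default
# Euclidean distance is accepted. Runs only if UBL is installed.
if (requireNamespace("UBL", quietly = TRUE)) {
  mysmote.cats <- UBL::SmoteClassif(f, cats_used, list(M = 0.8, F = 1.8))
}
```

If this works as assumed, it would support the reading that the check looks at every column of the data frame rather than only the variables in the formula.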

sessionInfo()
R version 3.6.2 (2019-12-12)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] UBL_0.0.6 randomForest_4.6-14 automap_1.0-14 sp_1.3-2 gstat_2.0-4
[6] MBA_0.0-9 MASS_7.3-51.4 devtools_2.2.1 usethis_1.5.1

loaded via a namespace (and not attached):
[1] Rcpp_1.0.3 plyr_1.8.5 compiler_3.6.2 prettyunits_1.1.1 remotes_2.1.0 tools_3.6.2
[7] xts_0.11-2 testthat_2.3.1 digest_0.6.23 pkgbuild_1.0.6 pkgload_1.0.2 memoise_1.1.0
[13] lattice_0.20-38 rlang_0.4.4 cli_2.0.1.9000 rstudioapi_0.10 curl_4.3 withr_2.1.2
[19] desc_1.2.0 fs_1.3.1 rprojroot_1.3-2 grid_3.6.2 reshape_0.8.8 spacetime_1.2-2
[25] glue_1.3.1 R6_2.4.1 processx_3.4.1 fansi_0.4.1 sessioninfo_1.1.1 callr_3.4.0
[31] magrittr_1.5 intervals_0.15.1 backports_1.1.5 ps_1.3.0 ellipsis_0.3.0 assertthat_0.2.1
[37] FNN_1.1.3 crayon_1.3.4 zoo_1.8-7
