Mdo #12
The branch won't build. Please fix :)
multi_imbalance/resampling/MDO.py
Outdated
labels = list(set(y))
for class_label in labels:
    SC_minor, weights = self._choose_samples(X, y, class_label)
    if (len(SC_minor)) == 0:
I don't think that these extra brackets are necessary, do you?
done
multi_imbalance/resampling/MDO.py
Outdated
for i in range(len(S_minor)):
    sample_neighbours_indices = minority_class_neighbours_indices[i][1:]
    quantity_sample_neighbours_indices_with_same_label = sum(y[sample_neighbours_indices] == class_label)
    quantity_with_same_label_in_neighbourhood.append(quantity_sample_neighbours_indices_with_same_label)
num = np.array(quantity_with_same_label_in_neighbourhood)
You may think about extracting that logic to a separate function if it improves readability.
done
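For illustration, the neighbour-counting loop could be pulled into a helper along these lines. This is only a sketch: the function name `count_same_label_neighbours` and its parameters are hypothetical and not taken from the PR, and it assumes each row of the neighbour index array starts with the sample itself (hence the `[:, 1:]` slice, mirroring the `[1:]` in the reviewed code).

```python
import numpy as np


def count_same_label_neighbours(neighbour_indices, y, class_label):
    """Count, for every sample, how many of its neighbours carry class_label.

    neighbour_indices: 2-D array-like; row i holds the indices of the nearest
        neighbours of sample i, with column 0 assumed to be the sample itself.
    y: 1-D array-like of labels for the whole data set.
    """
    neighbour_indices = np.asarray(neighbour_indices)
    y = np.asarray(y)
    # Drop column 0 (the sample itself) and look up the neighbours' labels.
    neighbour_labels = y[neighbour_indices[:, 1:]]
    # One count per sample, replacing the explicit loop and list append.
    return np.sum(neighbour_labels == class_label, axis=1)
```

The vectorised lookup returns the same array the loop builds in `num`, so the caller keeps a single descriptive call instead of five lines of bookkeeping.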
multi_imbalance/resampling/MDO.py
Outdated
u = np.mean(SC_minor, axis=0)
Z = SC_minor - u
n_components = min(Z.shape)
pca = PCA(n_components=n_components).fit(Z)
T = pca.transform(Z)
V = np.var(T, axis=0)
oversampling_rate = goal_quantity - quantities[class_label]
Are these single letter variable names meaningful enough?
It is sometimes really hard to choose a meaningful variable name, because the paper mostly uses single letters and the purpose of a variable is often hard to explain. So I changed these names where I could, but where they are temporary variables used to compute expressions, as in the MDO_oversampling function, I decided to leave them. Otherwise they would be explained worse than before xD I hope you are OK with that ;)
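One possible middle ground, sketched here under assumptions, is to use descriptive names and keep the paper's letters in trailing comments. The function name and variable names below are hypothetical, and NumPy's SVD stands in for the `PCA` class used in the reviewed code so that the sketch needs no extra dependency; component signs may therefore differ from what `PCA` returns.

```python
import numpy as np


def principal_component_scores(minority_samples):
    """Project centred minority samples onto their principal axes.

    Returns (scores, score_variances, class_mean). The paper's single-letter
    symbols are kept as trailing comments for cross-reference.
    """
    minority_samples = np.asarray(minority_samples, dtype=float)
    class_mean = minority_samples.mean(axis=0)          # u
    centered_samples = minority_samples - class_mean    # Z
    # Rows of `components` are the principal axes; with full_matrices=False
    # there are min(Z.shape) of them, matching n_components in the diff.
    _, _, components = np.linalg.svd(centered_samples, full_matrices=False)
    scores = centered_samples @ components.T            # T
    score_variances = scores.var(axis=0)                # V
    return scores, score_variances, class_mean
```

Readers who know the paper can still map `scores` to T and `score_variances` to V, while readers who do not can follow the code without it.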
multi_imbalance/resampling/MDO.py
Outdated
def _MDO_oversampling(T, V, oversampling_rate, weights):
    S_temp = list()
    for _ in range(oversampling_rate):
        idx = np.random.choice(np.arange(len(T)), p=weights)
Will the results be repeatable, especially when testing and error checking? See url
Good point! Thanks, done
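The thread does not show how repeatability was achieved in the end. One common approach, sketched here as an assumption rather than as the PR's actual fix, is to draw from a seeded generator passed in by the caller instead of the global `np.random` state. The function name `choose_seed_indices` and the `random_state` parameter are hypothetical.

```python
import numpy as np


def choose_seed_indices(n_samples, weights, n_draws, random_state=None):
    """Draw n_draws sample indices with the given selection probabilities.

    random_state: an int seed (or None for fresh entropy). The same seed
        always yields the same sequence of indices.
    """
    # A local generator keeps the draw independent of global NumPy state.
    rng = np.random.default_rng(random_state)
    return rng.choice(n_samples, size=n_draws, p=weights)


# Two calls with the same seed return identical indices.
first = choose_seed_indices(3, [0.5, 0.25, 0.25], n_draws=5, random_state=0)
second = choose_seed_indices(3, [0.5, 0.25, 0.25], n_draws=5, random_state=0)
```

With a fixed seed, tests can compare the oversampled output against stored expectations, and a failing run can be reproduced exactly.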
…ance to be consistent with paper (but it was equal), split logic, renamed variables
Good job @plutasnyy 🥇
No description provided.