Scan: Add a robustness detector to the scan that perturbs categorical values #1847

Open
kevinmessiaen opened this issue Mar 14, 2024 · 7 comments
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@kevinmessiaen
Member

kevinmessiaen commented Mar 14, 2024

🚀 Feature Request

Add a robustness detector to the scan that perturbs categorical values.

The scan should be able to produce a set of issues that capture the perturbations needed on a single categorical feature to:

(a) change the predicted label (classification)
(b) change the prediction by an amount that exceeds a certain threshold (regression)
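
The two criteria above could be checked along the following lines. This is only a minimal sketch, not Giskard's API: the function names are hypothetical, and it assumes the regression threshold is an absolute difference, which the request does not specify.

```python
from typing import Sequence


def classification_flip_rate(original: Sequence, perturbed: Sequence) -> float:
    """Fraction of rows whose predicted label changed after the perturbation."""
    if len(original) != len(perturbed):
        raise ValueError("prediction sequences must have the same length")
    if not original:
        return 0.0
    changed = sum(o != p for o, p in zip(original, perturbed))
    return changed / len(original)


def regression_fail_rate(
    original: Sequence[float], perturbed: Sequence[float], threshold: float
) -> float:
    """Fraction of rows whose prediction moved by more than `threshold`.

    Assumption: `threshold` is an absolute difference, not a relative one.
    """
    if len(original) != len(perturbed):
        raise ValueError("prediction sequences must have the same length")
    if not original:
        return 0.0
    changed = sum(abs(o - p) > threshold for o, p in zip(original, perturbed))
    return changed / len(original)
```

A detector could then raise an issue whenever one of these rates exceeds a configurable tolerance.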

🔈 Motivation

Currently the scan does not have any categorical perturbation.

@kevinmessiaen kevinmessiaen added enhancement New feature or request good first issue Good for newcomers labels Mar 14, 2024
@ChatBear

Is this issue still active? I would like to contribute to it.

@alexcombessie
Member

@kevinmessiaen I'll let you guide here. This seems easy to add, and a great idea for a contribution!

@kevinmessiaen
Member Author

Hello @ChatBear

Yes, this is still an active issue, and I can assign you to it. We would be grateful to have your contribution. Let me know if you have any questions about this.

@ChatBear

Thanks, I'll try to contribute. I'll need a bit of time to understand the repo; after that I'll try to post a PR.

@ChatBear

ChatBear commented Apr 14, 2024

Hello, I have a few questions about the issue.

What kind of perturbations do you expect? I was thinking of changing the values in the feature column with a probability of 0.1 (chosen arbitrarily).

Do I need to create another detector from scratch, or can I build on BaseTextPerturbationDetector?

Also, I tried to create a branch, but I can't push to it. I forked the repo, but I am having trouble creating the pull request. I am fairly new to open source, so I apologize in advance if this question is inappropriate.

@kevinmessiaen
Member Author

Hello,

The perturbation should be applied to a categorical feature. It should only perturb one column of the dataset; the goal is to ensure that the model isn't too sensitive to noise. In this case the probability is not necessary, since we want to test that the result isn't impacted when the value changes. (A probability makes sense for text, where we have a typo rate, for example.)

For example, take a breed category with potential values ['Labrador', 'Husky', 'Beagle', ...]. The idea is to switch all `Labrador` values to any other breed, and so on for each of the other breeds.
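
The breed example could be sketched as follows. This is a rough illustration in plain Python rather than an actual detector: `category_switch_report`, the row format (a list of dicts) and the `predict` callable are all assumptions made for the example, not part of the scan.

```python
from typing import Any, Callable, Dict, List, Sequence, Tuple

Row = Dict[str, Any]


def category_switch_report(
    rows: List[Row],
    column: str,
    predict: Callable[[List[Row]], Sequence[Any]],
) -> Dict[Tuple[Any, Any], float]:
    """For each (source, target) pair of categories, switch every row holding
    `source` to `target` and return the fraction of those rows whose
    prediction changed. Only `column` is modified; values are never cast."""
    # Distinct categories in order of first appearance (works for any hashable type).
    categories = list(dict.fromkeys(row[column] for row in rows))
    report: Dict[Tuple[Any, Any], float] = {}
    for source in categories:
        affected = [row for row in rows if row[column] == source]
        before = predict(affected)
        for target in categories:
            if target == source:
                continue
            switched = [{**row, column: target} for row in affected]
            after = predict(switched)
            changed = sum(b != a for b, a in zip(before, after))
            report[(source, target)] = changed / len(affected)
    return report


# Hypothetical toy model: predicts a size class from the breed alone.
def toy_predict(rows: List[Row]) -> List[str]:
    return ["large" if r["breed"] in {"Labrador", "Husky"} else "small" for r in rows]


dogs = [
    {"breed": "Labrador", "age": 3},
    {"breed": "Husky", "age": 5},
    {"breed": "Beagle", "age": 2},
    {"breed": "Labrador", "age": 7},
]
report = category_switch_report(dogs, "breed", toy_predict)
```

Here `report[("Labrador", "Beagle")]` is 1.0 because every Labrador row flips from "large" to "small", while `report[("Labrador", "Husky")]` is 0.0. Every row of a category is switched deterministically, so no perturbation probability is involved.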

Reusing BaseTextPerturbationDetector won't work, since it casts the column to str, and we can have numerical categories, for example. But you can take inspiration from it.
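
To illustrate why a str cast is a problem for numerical categories, here is a small self-contained sketch. The toy model and helper names are hypothetical, and `perturb_as_text` only mimics the cast described above; it is not the actual BaseTextPerturbationDetector code.

```python
# Hypothetical toy model whose categorical feature is encoded as integers.
PRICE_BY_TIER = {1: 10.0, 2: 25.0, 3: 60.0}


def predict_price(tier):
    """Look up the prediction for an integer-encoded category."""
    return PRICE_BY_TIER[tier]


def perturb_as_text(value):
    """Mimics a text-style perturbation that casts the value to str first."""
    return str(value)


def perturb_as_category(value, target):
    """Type-preserving perturbation: replace the value with another category."""
    return target


# The type-preserving switch yields a value the model still understands.
switched = perturb_as_category(1, 3)
assert switched in PRICE_BY_TIER

# After the str cast, "1" is no longer a known category for this model.
casted = perturb_as_text(1)
assert casted not in PRICE_BY_TIER
```

Calling `predict_price(casted)` would raise a `KeyError`, whereas `predict_price(switched)` returns the prediction for category 3.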

@ChatBear

OK, thanks. I can continue now.
