Potential Issue with Member/Non-member Data Handling in Inference Code #3

ganyuhhutao · 2024-05-05T09:54:24Z

Hello, I've been reviewing the code and noticed a potential issue regarding the handling of member and non-member data in the inference.py script. It seems that the indices from train_indices.csv are being used as non-member data for inference, which may not align with the typical definitions used in MIA.

I hope this information is helpful, and I look forward to any clarification or updates you can provide on this matter.

Thank you for your attention to this issue.


snoop2head · 2024-05-05T10:38:55Z

@ganyuhhutao
Thank you for your attention to the repository!

I think the confusion is my fault for writing the code in a confusing manner, but I believe the implementation itself is correct.

The target model is trained on the test set of the CIFAR dataset, and the indices of the data points used for training are saved in train_indices.csv. The CIFAR test set is split into the train and validation splits on which the model is optimized.

MIA/train_target.py

Lines 46 to 47 in 5dc858b

testset = DSET_CLASS(root="./data", train=False, download=True, transform=transform)
testloader = DataLoader(testset, batch_size=CFG.val_batch_size, shuffle=False, num_workers=2)

MIA/train_target.py

Lines 64 to 69 in 5dc858b

target_train_indices = np.random.choice(len(testset), CFG.target_train_size, replace=False)
target_eval_indices = np.setdiff1d(np.arange(len(testset)), target_train_indices)
# save target_train_indices as dataframe
pd.DataFrame(target_train_indices, columns=["index"]).to_csv(
    "./attack/train_indices.csv", index=False
)
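As a quick sanity check of the split above, here is a minimal NumPy-only sketch. The helper name `split_indices`, the seed, and the training size of 6,000 are assumptions made for illustration (the actual value of `CFG.target_train_size` is not shown in this thread). 10,000 is the size of the CIFAR test set.

```python
import numpy as np


def split_indices(n_total, n_train, seed=0):
    """Split range(n_total) into disjoint train and eval index arrays.

    Hypothetical helper mirroring the np.random.choice / np.setdiff1d
    pattern quoted from train_target.py.
    """
    rng = np.random.default_rng(seed)
    # Sample training indices without replacement.
    train_idx = rng.choice(n_total, size=n_train, replace=False)
    # Everything not chosen for training becomes the eval split.
    eval_idx = np.setdiff1d(np.arange(n_total), train_idx)
    return train_idx, eval_idx


# 10_000 = CIFAR test set size; 6_000 is an assumed training size.
train_idx, eval_idx = split_indices(n_total=10_000, n_train=6_000)

# The two splits share no index and together cover the whole test set.
assert np.intersect1d(train_idx, eval_idx).size == 0
assert train_idx.size + eval_idx.size == 10_000
```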

Subsequently, the target model runs inference on member and non-member data. The model has never seen the CIFAR train set, so that data is non-member data. For the non-member data, the previously saved train_indices.csv is used to index into the CIFAR train set. The only reason for indexing with train_indices.csv is to match the number of data points between the member and non-member sets.

MIA/inference_attack.py

Lines 53 to 64 in 5dc858b

testset = DSET_CLASS(root="./data", train=False, download=True, transform=transform)
trainset = DSET_CLASS(root="./data", train=True, download=True, transform=transform)
print("mapped classes to ids:", testset.class_to_idx)
columns_attack_sdet = [f"top_{index}_prob" for index in range(CFG.topk_num_accessible_probs)]
# load member data
list_nonmember_indices = pd.read_csv("./attack/train_indices.csv")["index"].to_list()
list_member_indices = np.random.choice(len(testset), len(list_nonmember_indices), replace=False)
subset_nonmember = Subset(trainset, list_nonmember_indices)
subset_member = Subset(testset, list_member_indices)

Sorry if this part was confusing. If I were writing the code now, I would write it as follows:

list_nonmember_indices = np.random.choice(len(trainset), len(pd.read_csv("./attack/train_indices.csv")["index"].to_list()), replace=False)

But in the end, I don't think there is a mismatch between the code and the paper.
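The sampling described above can be sketched without loading CIFAR at all, since only the index arrays matter. This is a minimal sketch, not the repository's code: the function name `sample_member_nonmember`, the seed, and the sample count of 6,000 are assumptions, while 10,000 and 50,000 are the sizes of the CIFAR test and train sets.

```python
import numpy as np


def sample_member_nonmember(n_member_pool, n_nonmember_pool, n_samples, seed=0):
    """Draw equally sized index sets from two separate datasets.

    Hypothetical helper: member indices point into the dataset the target
    model was trained on (the CIFAR test set in this thread), non-member
    indices point into a dataset it never saw (the CIFAR train set).
    """
    rng = np.random.default_rng(seed)
    member_idx = rng.choice(n_member_pool, size=n_samples, replace=False)
    # Sampled independently of any saved training indices, so the
    # variable name and its contents cannot be confused.
    nonmember_idx = rng.choice(n_nonmember_pool, size=n_samples, replace=False)
    return member_idx, nonmember_idx


# 6_000 stands in for the number of rows in train_indices.csv (assumed).
member_idx, nonmember_idx = sample_member_nonmember(10_000, 50_000, 6_000)

# Both sets have the same size, which is what the attack evaluation needs.
assert member_idx.size == nonmember_idx.size
```

The two arrays index different datasets, so equal integers in both are harmless. In the real script they would be passed to `Subset(testset, ...)` and `Subset(trainset, ...)` respectively, as in the quoted snippet.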

snoop2head · 2024-05-05T10:39:21Z

If possible, @dokyungs, could you please give me access to the repo again so that I can clean up the issues and fix the code?

ganyuhhutao · 2024-05-05T12:22:54Z

Thank you for your explanation.
