To enhance privacy, reduce IP retention from 1 year to 6 months #22393

Closed
wants to merge 5 commits into from

@eobrain eobrain commented Dec 17, 2022

The current default retention period for IP addresses is one year. That is not good for privacy.

Defaults matter -- probably most admins do not change them.

Changing the default to two days will improve privacy while still allowing rate-limiting to guard against spam or abuse, and still allowing production debugging of recent events.

See discussion in #6474 and in Reddit thread.

See also the matching documentation PR mastodon/documentation#1133 which probably should be merged if this PR is merged.

@rinsuki
Contributor

rinsuki commented Dec 19, 2022

I think you don't need to modify SESSION_RETENTION_PERIOD, since even if SESSION_RETENTION_PERIOD is longer than IP_RETENTION_PERIOD, the session's IP address will still be removed.

SessionActivation.where('updated_at < ?', SESSION_RETENTION_PERIOD.ago).in_batches.destroy_all
SessionActivation.where('updated_at < ?', IP_RETENTION_PERIOD.ago).in_batches.update_all(ip: nil)
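
To illustrate the point, here is a minimal plain-Ruby sketch of the two cleanup statements quoted above, with an in-memory array standing in for the SessionActivation table. The `Session` struct, the `cleanup` helper, and the period values are assumptions made for illustration; this is not Mastodon's actual code.

```ruby
# Hypothetical in-memory stand-in for SessionActivation records.
Session = Struct.new(:id, :ip, :updated_at)

DAY = 24 * 60 * 60
SESSION_RETENTION_PERIOD = 365 * DAY # assumed value: roughly one year, in seconds
IP_RETENTION_PERIOD      = 2 * DAY   # assumed value: two days, in seconds

# Mirrors the two statements above: drop sessions older than the session
# retention period, then blank the IP on sessions older than the IP period.
def cleanup(sessions, now)
  kept = sessions.reject { |s| s.updated_at < now - SESSION_RETENTION_PERIOD }
  kept.each { |s| s.ip = nil if s.updated_at < now - IP_RETENTION_PERIOD }
  kept
end

now = Time.at(1_700_000_000)
sessions = [
  Session.new(1, '203.0.113.5', now - 400 * DAY), # older than both periods
  Session.new(2, '203.0.113.6', now - 30 * DAY),  # older than the IP period only
  Session.new(3, '203.0.113.7', now - 1 * DAY)    # newer than both periods
]

remaining = cleanup(sessions, now)
remaining.each { |s| puts "session #{s.id}: ip=#{s.ip.inspect}" }
```

In this sketch session 2 survives, so the user stays logged in, but its IP is already gone. That is why shortening only IP_RETENTION_PERIOD is enough for IP address privacy.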

By the way, I think it would be better to ask the server administrator for this option in the mastodon:setup task instead of changing .env.production.sample.

@ineffyble
Member

Lowering SESSION_RETENTION_PERIOD to 2 days would have a user impact, since it'd mean users have to log back in every 2 days, right? In which case there's a tradeoff between convenience and privacy.

Previously both SESSION_RETENTION_PERIOD and IP_RETENTION_PERIOD
were reduced from one year to two days. This commit retains only the
change to IP_RETENTION_PERIOD.

The reasons are as follows:

As pointed out by @rinsuki in a [comment][1] on the pull request,
changing the IP_RETENTION_PERIOD will also cause the IP addresses
to be removed from older session records, so just reducing the
IP_RETENTION_PERIOD should be sufficient for IP address privacy.

And as @ineffyble pointed out in another [comment][2], changing the
session retention period would cause a degradation of the user
experience.

[1]: mastodon#22393 (comment)
[2]: mastodon#22393 (comment)
@eobrain
Author

eobrain commented Dec 26, 2022

Thanks for the review. Please take another look.

I changed the pull request to change only IP_RETENTION_PERIOD.

As @rinsuki pointed out, changing just the IP retention period is enough to also remove the IP addresses from the session records.

And as @ineffyble pointed out, changing the session retention period would cause a degradation of the user experience.

eobrain added a commit to eobrain/documentation that referenced this pull request Dec 27, 2022
Also add a brief description of this setting.

The change in retention is in the code repo in
mastodon/mastodon#22393
so this commit should only be merged into main if that pull
request is also merged.
@eobrain
Author

eobrain commented Dec 27, 2022

Also created a corresponding documentation PR: mastodon/documentation#1133

@eobrain eobrain changed the title To enhance privacy, reduce IP and Session retention from 1 year to 2 days To enhance privacy, reduce IP retention from 1 year to 2 days Dec 30, 2022
@smiba
Contributor

smiba commented Jan 3, 2023

I don't really get this change. Yes, it would improve privacy, but it would also harm how one could moderate their instance.

IP banning, although easy to circumvent, is still an important part of moderation. Having longer retention would improve moderators' and administrators' ability to recognize hostile subnets or misbehaving / multi-account carrying IPs.

If an instance is already responsible for securing and hosting all of a user's social connections, including mention-only and followers-only posts, I don't see how an IP address is suddenly a noteworthy privacy risk.

I'm glad the session expiry is not affected (any more) in the commits because that would honestly absolutely damage user experience, but getting rid of IP addresses so fast would still affect moderators.

@eobrain
Author

eobrain commented Jan 4, 2023

TL;DR -- It seems moderators could still use IP Banning even with a short IP retention

Hi @smiba, thanks for bringing up the issue of use of IP addresses for moderation.

According to Blocking by IP in the Mastodon docs, the way that IP banning is implemented is not in the Mastodon software itself but in the firewall.

If that is true then once the IP address has been added to the firewall config it no longer needs to be stored by the Mastodon server.

So to allow an IP banning moderation process the Mastodon server only needs to store the IP long enough for the moderators to realize a user needs to be banned so they can use the IP address in the firewall config. After that it is OK if the IP address is deleted from Mastodon, as it will continue to be stored in the firewall config. Maybe two days is not enough time to realize a user needs to be banned, but the current default of one year seems too long.

Even if two days is too short, I still think this is a good default. The admins of an instance where moderators are doing IP banning can still change the default retention period to something more suitable to the speed of their moderation process.

(Maybe I misunderstand how IP banning is used in moderation. Please let me know of anything I got wrong above.)

@seano-vs
Sponsor Contributor

seano-vs commented Mar 9, 2024

I'm a privacy advocate, but this signal is just too valuable for moderation and this would cripple it. People create spam accounts from the same IPs all the time, and it's often spread out over months. At the very most, IPs should be one-way-hashed as UUIDs to mask the actual value, not expired after two days.

This is an odd change, since it would only affect the visibility of sensitive attributes that are available only to mods/admins. Like, they already have access to your everything; if they actually want your IP (and they will, for spam purposes), there are a hundred other ways to get it. I think not trusting the instance you host your account on is an issue outside the scope of this.

@eobrain
Author

eobrain commented Mar 9, 2024

I defer to those with experience in moderation and accept that two days is too short a default retention time.

However, is the current value of one year correct? Can we change it to something like two months?

The GDPR requires that personal data be retained “no longer than is necessary”, and we should try to minimize legal risk to administrators by choosing a default retention time that is legally defensible under this standard.

Is there some retention period that would be the minimum to counter, say, 99% of spam, based on the experience of real-world administration of Mastodon?

This was in response to discussions where Mastodon administrators
said that for spam protection processes to work they needed longer
retention periods.
@eobrain eobrain changed the title To enhance privacy, reduce IP retention from 1 year to 2 days To enhance privacy, reduce IP retention from 1 year to 6 months Mar 11, 2024
@nemobis
Contributor

nemobis commented Mar 11, 2024

The MediaWiki CheckUser extension records the IP address of registered users for 3 months by default, so that would be one reasonable choice.
https://www.mediawiki.org/wiki/Extension:CheckUser

@smiba
Contributor

smiba commented Mar 11, 2024

Please note that the expiry logic is not in .env.production.sample. Changing this file changes nothing other than the sample file, which is supposed to be filled with default values.

I've created the following PR a while ago and I recommend continuing in that one, as it properly changes the files required #23071

Your PR is missing the actual code change in the sidekiq worker.
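
As a rough illustration of why editing only the sample file has no effect, here is a plain-Ruby sketch of a setting read from the environment with a fallback in code. The method name `ip_retention_period` and the constant `ONE_YEAR_IN_SECONDS` are hypothetical, and the real worker code (see #23071) may be structured differently.

```ruby
# Hypothetical sketch: the effective default lives in the code's fallback
# value, not in a sample env file that nothing reads at runtime.
ONE_YEAR_IN_SECONDS = 31_556_952 # 365.2425 days * 86,400 seconds

def ip_retention_period(env = ENV)
  # Integer() raises on malformed input instead of silently returning 0.
  Integer(env.fetch('IP_RETENTION_PERIOD', ONE_YEAR_IN_SECONDS))
end

unset_default  = ip_retention_period({})                                   # variable not set
admin_override = ip_retention_period({ 'IP_RETENTION_PERIOD' => '2678400' }) # 31 days

puts "default:  #{unset_default} seconds"
puts "override: #{admin_override} seconds"
```

Under this assumption, an instance that never sets the variable keeps the one-year fallback no matter what the sample file says, so changing the default means changing the fallback in the code.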

@eobrain
Author

eobrain commented Mar 12, 2024

Thank you @smiba, this PR has been open for so long that I forgot about your PR.

I agree your PR is more complete and that is the one that should be merged.

If there is pushback on the 31 days as being too short for spam protection purposes, you might consider increasing it to six months as I did in this PR. It's not ideal for privacy, but still better than the current one year.

@boehs

boehs commented Apr 27, 2024

> At the very most, IPs should be one-way-hashed as UUIDs to mask the actual value

Incredibly trivial to rainbow table that
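
A small sketch of why this holds: the IPv4 address space has only 2**32 (about 4.3 billion) values, so an unsalted hash of an address can be reversed by precomputing every candidate. The example below assumes the masked value is an unsalted SHA-256 digest of the dotted-quad string, and it uses a plain lookup table over a single /16 rather than a true rainbow table; the `mask` and `build_table` helpers are hypothetical.

```ruby
require 'digest'

# Assumption: the "masked" value is an unsalted SHA-256 digest of the IP string.
def mask(ip)
  Digest::SHA256.hexdigest(ip)
end

# Precompute digest -> IP for every address in one /16 (65,536 entries).
# Covering all of IPv4 is the same loop, only a larger precomputation.
def build_table(prefix)
  table = {}
  256.times do |c|
    256.times do |d|
      ip = "#{prefix}.#{c}.#{d}"
      table[mask(ip)] = ip
    end
  end
  table
end

table  = build_table('198.51')
stored = mask('198.51.100.23') # what the database would hold instead of the IP

puts "recovered: #{table[stored]}"
```

A secret key or per-record salt would change the cost of this attack, but it would also change how well the hashed value works for spotting related accounts, so that trade-off would need its own discussion.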

@mjankowski
Contributor

> I agree your PR is more complete and that is the one that should be merged.

Closing this based on comment from PR author that the other PR is more appropriate. Will comment more over there.

@mjankowski mjankowski closed this May 24, 2024
Labels
status/wontfix This will not be worked on

9 participants