Ask HN: What's up with the spam waves lately?
11 points by zahlman 1 day ago | 4 comments
I just had a look through the "new" queue with showdead on, and out of the most recent 200 submissions, I counted 22 that were not dead. I wasn't seeing false positives, either. A very large fraction of those dead submissions seem to come from one very specific blogspot domain (this time they all seem to be an identical URL, even, except for a TLD swap in some cases).

Does HN not just implement blacklists for URL submissions from certain domains / matching a regex pattern? I get that the showdead option is there so people can vouch for stuff but that would/should realistically never happen in this case. Can't more obvious spam just be deleted directly?
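To make the question concrete, here is a minimal sketch of the kind of domain/regex blocklist I have in mind. It is not how HN actually works (I don't know what HN runs internally); the `example-spam` name, the pattern list, and the `is_blocked` helper are all placeholders I made up for illustration.

```python
import re
from urllib.parse import urlsplit

# Hypothetical patterns: "example-spam" is a placeholder, not the real offender.
BLOCKED_HOST_PATTERNS = [
    # Same blogspot subdomain under any TLD, which covers the "TLD swap" trick.
    re.compile(r"(^|\.)example-spam\.blogspot\.[a-z.]{2,}$"),
]


def is_blocked(url: str) -> bool:
    """Return True if the URL's host matches any blocked pattern."""
    # urlsplit only finds a hostname when the URL has a scheme ("https://...");
    # anything it cannot parse is treated as not blocked in this sketch.
    host = (urlsplit(url).hostname or "").lower()
    return any(pattern.search(host) for pattern in BLOCKED_HOST_PATTERNS)
```

Matching on the parsed hostname rather than the raw URL string avoids false hits when the blocked name merely appears in a path or query string on some unrelated site.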

Also, how did HN become such a target for this? I would think that the audience here is generally savvy enough to avoid scams, and that having things linked here is not as beneficial for SEO as many other sites with UGC.






I have a script that downloads all the posts from HN and I trained models that can predict based on the title: (1) will an article get > 10 votes, (2) will an article get a ratio of comments/votes > 0.5, and (3) will an article be [dead].

Model (1) sucks (AUC-ROC maybe 0.6) and model (2) is better (AUC maybe 0.7), but model (3) got an AUC pushing 0.98, which seemed unreasonably high.
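For anyone unfamiliar with the metric: AUC-ROC is the probability that a randomly chosen positive example gets a higher score than a randomly chosen negative one, so 0.5 is coin-flipping and 1.0 is a perfect ranking. A rough sketch of that definition follows; the `auc_roc` function is illustrative only (not the code from my script), and it is quadratic in the number of examples, so it is only suitable for small samples.

```python
def auc_roc(labels, scores):
    """AUC-ROC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 ground-truth values (1 = positive, e.g. [dead]).
    scores: iterable of model scores, higher meaning "more likely positive".
    Ties between a positive and a negative count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```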

My mental model of "[dead]" was that it happens to articles that get popular but are about politics or some other bad subject. What I found, though, is that HN gets bursts of spam like the one you're experiencing. With the system I had, (i) the same headline would show up [dead] a large number of times and (ii) the same headline would show up in the train, eval and test data sets, so of course the system got an unreasonably high score for [dead]. That's how I learned that HN gets these spam waves.
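One common way to avoid that kind of leakage is to assign the split by headline instead of by row, so every copy of a repeated title lands in the same set. A minimal sketch, assuming titles are plain strings; `normalize_title`, `assign_split`, and the 10%/10% split sizes are hypothetical choices for illustration, not what my script did.

```python
import hashlib
import re


def normalize_title(title: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies match."""
    return re.sub(r"\s+", " ", title.strip().lower())


def assign_split(title: str, eval_pct: int = 10, test_pct: int = 10) -> str:
    """Deterministically map a title to 'train', 'eval', or 'test'.

    Every copy of the same normalized headline hashes to the same bucket,
    so a repeated spam title can never sit on both sides of a split.
    """
    digest = hashlib.sha256(normalize_title(title).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    if bucket < test_pct:
        return "test"
    if bucket < test_pct + eval_pct:
        return "eval"
    return "train"
```

This only catches exact duplicates after normalization. Spam that varies a few words per post would still need fuzzier grouping before splitting.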


Thank you for the rundown. Very interesting to get some ideas of spammer tactics.

Well, to be fair, the spam blocking is working great, you just chose to turn it off. I saw a prior run a few days ago, where they kept changing tactics, and HN kept reacting and blocking - no idea if that was automatic or if dang was reacting to it, but either way it was a fun day of watching the battle go on.

But as to why HN is a target, you don't need a high percentage of hits to make it worth spamming. Scams are lucrative. If one in a million viewers actually follows the links and falls for the scam, that will more than cover costs of spamming links. So they will attack any site where it looks like there is any chance of getting through.
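The arithmetic behind that can be sketched as a break-even calculation. All three inputs are hypothetical parameters, since I have no real figures for what spamming costs or what a scam pays out.

```python
def break_even_hit_rate(cost_per_post: float,
                        views_per_post: float,
                        payoff_per_victim: float) -> float:
    """Fraction of viewers who must fall for the scam for a post to pay for itself.

    Expected revenue per post is views_per_post * hit_rate * payoff_per_victim;
    setting that equal to cost_per_post and solving for hit_rate gives this value.
    """
    if views_per_post <= 0 or payoff_per_victim <= 0:
        raise ValueError("views_per_post and payoff_per_victim must be positive")
    return cost_per_post / (views_per_post * payoff_per_victim)
```

Because automated posting makes the cost per post tiny, the break-even rate shrinks with it, which is why even a very low hit rate can be worth a spammer's while.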


1) SPAMmers can be quite dim - if they were smarter many of them would be doing more useful and more profitable things.

2) I think the dynamic auto-kill seems to be working the way that it is intended to.

3) The current rather prolific idiot may be trying to probe for weaknesses, but is burning rather a lot of sockpuppet accounts and not being smart about the probes... See (1).



