Commit

Update NRD feeds
Update feeds used
Remove NRD count from README
Temporarily disable notification for failed NRD feed download
jarelllama committed Jun 7, 2024
1 parent 964f0b0 commit 80d9aa9
Showing 3 changed files with 12 additions and 28 deletions.
3 changes: 1 addition & 2 deletions SOURCES.md
@@ -15,20 +15,19 @@ Sources marked as inactive are not being automatically employed to retrieve doma
 | [Fake Website Buster](https://fakewebsitebuster.com/) | Fake | | |
 | [Google's Custom Search JSON API](https://developers.google.com/custom-search/v1/introduction) | Fake | | |
 | [GunTab](https://www.guntab.com/scam-websites) | Firearm | | Yes |
-| [Hagezi's NRD List](https://github.com/hagezi/dns-blocklists?tab=readme-ov-file#nrd) | NRD | - | - |
 | [Jeroen Gui's phishing & scam feeds](https://jeroengui.be/anti-phishing-project/)[^1] | Phishing | | |
 | [PetScams.com](https://petscams.com/) | Pet | | |
 | [PhishStats](https://phishstats.info/)[^2] | Phishing | | |
 | [Regex Matching](https://github.com/jarelllama/Scam-Blocklist/blob/main/config/phishing_targets.csv) | Phishing | | Yes |
 | [Scam Directory](https://scam.directory/) | Any | | |
 | [Scam.Delivery](https://scam.delivery/) | Non-delivery | Yes | - |
 | [ScamAdvisor](https://www.scamadviser.com/) | Any | | |
-| [Shreshta's NRD List](https://github.com/shreshta-labs/newly-registered-domains) | NRD | - | - |
 | [Stop 419 Scams and Scammers](https://www.stop419scams.com/) | Any | Yes | - |
 | [StopGunScams.com](https://stopgunscams.com/) | Firearm | | |
 | [dnstwist](https://github.com/elceef/dnstwist) | Phishing | | |
 | [openSquat](https://github.com/atenreiro/opensquat) | Phishing | Yes | - |
 | [r/Scams](https://www.reddit.com/r/Scams/) | Any | Yes | - |
+| [xRuffKez's NRD List](https://github.com/xRuffKez/NRD) | NRD | - | - |

[^1]: Only the scam feed is used for the light version.
[^2]: Only domains found in the NRD feed are used for the light version.
24 changes: 11 additions & 13 deletions scripts/tools.sh
@@ -112,24 +112,22 @@ download_toplist() {
 download_nrd_feed() {
     [[ -f nrd.tmp ]] && return
 
-    url1='https://raw.githubusercontent.com/shreshta-labs/newly-registered-domains/main/nrd-1m.csv'
-    url2='https://feeds.opensquat.com/domain-names-month.txt'
-    url3='https://cdn.jsdelivr.net/gh/hagezi/dns-blocklists@latest/wildcard/nrds.30-onlydomains.txt'
+    url1='https://raw.githubusercontent.com/xRuffKez/NRD/main/nrd-30day_part1.txt'
+    url2='https://raw.githubusercontent.com/xRuffKez/NRD/main/nrd-30day_part2.txt'
+    # Disabled due to size of the combined feeds
+    #url3='https://feeds.opensquat.com/domain-names-month.txt'
 
-    {
-        curl -sSL "$url1" || send_telegram \
-            "Error occurred while downloading NRD feeds."
-
-        # Download the bigger feeds in parallel
-        curl -sSLZH 'User-Agent: openSquat-2.1.0' "$url2" "$url3"
-    } | mawk '!/#/' > nrd.tmp
+    # Download the feeds in parallel
+    curl -sSLZ "$url1" "$url2" | mawk '!/#/' > nrd.tmp
 
+    # TODO: update method of checking if the feeds downloaded correctly
+    #
     # Appears to be the best way of checking if the bigger feeds downloaded
     # properly without checking each feed individually and losing
     # parallelization.
-    if (( $(wc -l < nrd.tmp) < 9000000 )); then
-        send_telegram "Error occurred while downloading NRD feeds."
-    fi
+    #if (( $(wc -l < nrd.tmp) < 9000000 )); then
+    # send_telegram "Error occurred while downloading NRD feeds."
+    #fi
 
     format_file nrd.tmp
 }
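
The TODO in `download_nrd_feed` asks for a way to verify each feed without giving up parallel downloads. One possible approach, sketched here under assumptions, is to start each transfer as a background job and `wait` on every PID individually. The names `fetch` and `download_feeds_checked` are hypothetical and not part of the repository, and the sketch swaps `curl -Z` for plain shell background jobs.

```bash
#!/bin/sh
# Hypothetical wrapper around the transfer command. The -f flag makes curl
# return a non-zero status on HTTP errors instead of saving the error page.
fetch() {
    curl -sSLf "$1"
}

# Hypothetical helper: download every URL in parallel, check each transfer
# individually, and concatenate the successful ones into the output file.
# Usage: download_feeds_checked <output file> <url>...
download_feeds_checked() {
    local out="$1" dir pids='' pid url n=0 failed=0
    shift
    [ "$#" -gt 0 ] || return 1
    dir="$(mktemp -d)" || return 1

    # Start one background transfer per URL and remember its PID
    for url in "$@"; do
        n=$((n + 1))
        fetch "$url" > "$dir/$n" &
        pids="$pids $!"
    done

    # Wait on each PID in order; a failed or empty download is reported
    : > "$out"
    n=0
    for pid in $pids; do
        n=$((n + 1))
        if wait "$pid" && [ -s "$dir/$n" ]; then
            cat "$dir/$n" >> "$out"
        else
            echo "Feed number $n failed to download." >&2
            failed=1
        fi
    done

    rm -rf "$dir"
    return "$failed"
}
```

Under these assumptions, the function could be called as `download_feeds_checked nrd.tmp "$url1" "$url2" || send_telegram "Error occurred while downloading NRD feeds."`, with the `mawk '!/#/'` filter applied to `nrd.tmp` afterwards. The trade-off is extra temporary disk usage, since each feed is written to its own file before being concatenated.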
13 changes: 0 additions & 13 deletions scripts/update_readme.sh
@@ -12,10 +12,6 @@ The [automated retrieval](https://github.com/jarelllama/Scam-Blocklist/actions/w
 This blocklist aims to be an alternative to blocking all newly registered domains (NRDs) seeing how many, but not all, NRDs are malicious. A variety of sources are integrated to detect new malicious domains within a short time span of their registration date.
-In the last 30 days, more than $(sum_nrds)[^1] malicious NRDs were found.
-[^1]: Number calculated using NRDs from [Hagezi's NRD 30 feed](https://cdn.jsdelivr.net/gh/hagezi/dns-blocklists@latest/wildcard/nrds.30-onlydomains.txt). The number of malicious NRDs found in reality is higher due to additional feeds being used. See the list of feeds used here: [SOURCES.md](https://github.com/jarelllama/Scam-Blocklist/blob/main/SOURCES.md)
 ## Download
 | Format | Syntax |
@@ -297,15 +293,6 @@ sum_excluded() {
     printf "%s" "$(( excluded_count * 100 / raw_count ))"
 }
 
-# Function 'sum_nrds' is an echo wrapper that returns the number of domains in
-# the blocklist found in the NRD feed.
-sum_nrds() {
-    # Only Hagezi's NRD feed is downloaded to save processing time
-    curl -sSL 'https://cdn.jsdelivr.net/gh/hagezi/dns-blocklists@latest/wildcard/nrds.30-onlydomains.txt' \
-        -o nrd.tmp
-    grep -cxFf "$RAW" nrd.tmp | sed 's/\([0-9]\{3\}\)$/,\1/'
-}
-
 # Entry point
 
 trap 'find . -maxdepth 1 -type f -name "*.tmp" -delete' EXIT