🚸 Issue 664: bulk-import of many (750+) porn-domains from @TruthfullEdward #666

Merged
merged 2 commits into from
Feb 24, 2022

Conversation

thomasmerz
Contributor

Summary

This PR fixes #664 by bulk-importing many new (750+) porn-domains from @TruthfullEdward
I checked if there are any duplicates by:

🦎🖥  ✔ ~/temp/PRs/Lists [issue_664|✔]
10:14 $ wc -l porn.txt
1908849 porn.txt
🦎🖥  ✔ ~/temp/PRs/Lists [issue_664|✔]
10:14 $ sort -u porn.txt | wc -l
1908849
🦎🖥  ✔ ~/temp/PRs/Lists [issue_664|✔]
10:14 $
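
Comparing the two line counts only shows whether duplicates exist. As a minimal sketch, assuming a small stand-in file (sample.txt and its entries are hypothetical, not part of this PR), sort | uniq -d would also list which entries are repeated:

```shell
# Hypothetical sample file standing in for porn.txt
workdir=$(mktemp -d)
printf '%s\n' 'b.example' 'a.example' 'b.example' 'c.example' > "$workdir/sample.txt"

# Total lines vs. unique lines: equal counts mean there are no duplicates
total=$(wc -l < "$workdir/sample.txt")
unique=$(sort -u "$workdir/sample.txt" | wc -l)

# uniq -d prints one copy of each line that occurs more than once
# (uniq only compares adjacent lines, hence the sort first)
dupes=$(sort "$workdir/sample.txt" | uniq -d)

echo "total=$total unique=$unique"
echo "duplicates: $dupes"
```

If the two counts differ, the uniq -d output names the entries to look at.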

Checklist

  • I have verified that I have not modified any files inside the alt-version folder (automated code will automatically update those files)

  • I have verified that I have not modified any files inside the dnsmasq-version folder (automated code will automatically update those files)

@ghost

ghost commented Feb 23, 2022

@thomasmerz Thanks a lot! I hope there were not many duplicates left. I sorted the list alphabetically and removed duplicates. If you have a handy one-liner for my Linux terminal so that I can do this step for you in advance (the way you like it), please let me know!

@thomasmerz
Contributor Author

You can remove duplicates beforehand with sort -u 👍🏻

There were no duplicates:

I checked if there are any duplicates by:

🦎🖥  ✔ ~/temp/PRs/Lists [issue_664|✔]
10:14 $ wc -l porn.txt
1908849 porn.txt
🦎🖥  ✔ ~/temp/PRs/Lists [issue_664|✔]
10:14 $ sort -u porn.txt | wc -l
1908849
🦎🖥  ✔ ~/temp/PRs/Lists [issue_664|✔]
10:14 $

@blocklistproject blocklistproject merged commit d1d2a9d into blocklistproject:master Feb 24, 2022
@thomasmerz thomasmerz deleted the issue_664 branch February 24, 2022 07:28
@spirillen
Contributor

@thomasmerz Thanks a lot! I hope there were not many duplicates left. I sorted the list alphabetically and removed duplicates. If you have a handy one-liner for my Linux terminal so that I can do this step for you in advance (the way you like it), please let me know!

sort -u -f file.txt -o file.txt

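Explanation:

sort is a program that sorts the lines of a file.

-u = unique, removes duplicate lines
-f = fold case, so upper- and lowercase letters compare as equal (the input file is simply given as an argument, file.txt here)
-o = write the output to the given file instead of stdout

The input file and the -o file can be the same: sort reads all of its input before it opens the output file for writing, so nothing is lost. Note that with -u and -f together, lines that differ only in case count as duplicates and only one of them is kept.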
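
A minimal sketch of the in-place use, assuming a throwaway file (list.txt and its entries are hypothetical). LC_ALL=C is set only to make the sort order predictable in this example; it is not part of the command above:

```shell
workdir=$(mktemp -d)
list="$workdir/list.txt"   # hypothetical file name
printf '%s\n' 'b.example' 'a.example' 'b.example' 'B.example' > "$list"

# Sort and drop exact duplicates, writing the result back to the same file
LC_ALL=C sort -u -o "$list" "$list"
exact=$(cat "$list")
echo "$exact"

# Adding -f folds case, so B.example and b.example now count as duplicates
LC_ALL=C sort -u -f -o "$list" "$list"
folded_count=$(( $(wc -l < "$list") ))
echo "lines left with -f: $folded_count"
```

Without -f the three distinct spellings survive; with -f only two lines remain, which may or may not be what you want for a domain list.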

@ghost

ghost commented Feb 25, 2022

@spirillen Thanks! I always wrote sort filename.txt | uniq -u, but that deleted all my file content. I'll certainly give your command a try.
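
For comparison, a minimal sketch of how uniq -u differs from sort -u, using a hypothetical sample file (names.txt). uniq -u prints only the lines that occur exactly once, so every duplicated entry disappears completely instead of being kept once. Whether the lost content described above came from this, or from redirecting the output back onto the input file, is an assumption; the comment does not say.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'b.example' 'a.example' 'b.example' > "$workdir/names.txt"

# uniq -u keeps only lines that are never repeated: b.example vanishes
only_singles=$(sort "$workdir/names.txt" | uniq -u)

# sort -u keeps one copy of every distinct line
deduped=$(sort -u "$workdir/names.txt")

echo "uniq -u: $only_singles"
echo "sort -u: $deduped"

# Caution: a redirect such as "sort f > f" lets the shell truncate f
# before sort reads it; "sort -o f f" avoids that.
```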

Development

Successfully merging this pull request may close these issues.

518 + 260 additional URL's for p*rn-blocklist
3 participants