Whitelist common false-positives #20

Closed
veloute opened this issue Jul 15, 2018 · 28 comments

@veloute

veloute commented Jul 15, 2018

It might be a good idea to add a list of commonly whitelisted domains such as this to lower the probability of people having problems.

Found the list at https://firebog.net/ under "Whitelisting Suggestions".

@veloute veloute changed the title from "Whitelist" to "Whitelist common false-positives" Jul 15, 2018
@hectorm
Owner

hectorm commented Jul 15, 2018

I prefer to solve false positives at the root by contacting the maintainer of the original list. I have done this on some occasions and it has worked. But I understand that it is a tedious process and that the maintainer will not always remove the entry from the list.

Currently the --whitelist option of my script requires a list of regular expressions, so it does not directly support the whitelist you suggest. I will think about how to implement this feature without breaking compatibility.

Meanwhile, I checked which domains are currently present in both hBlock and the anudeepND whitelist, and these are the results:

comm -12 <(curl -fsS https://hblock.molinero.xyz/hosts_domains.txt | sort) <(curl -fsS https://raw.githubusercontent.com/anudeepND/whitelist/master/domains/whitelist.txt | sort)
| Domain | Blocked by |
| --- | --- |
| cdn3.optimizely.com | blocklist.kowabit.de, winhelp2002.mvps.org |
| cdn.optimizely.com | winhelp2002.mvps.org |
| d2c8v52ll5s99u.cloudfront.net | winhelp2002.mvps.org |
| nexusrules.officeapps.live.com | stevenblack |
| om.cbsi.com | adguard-simplified, blocklist.kowabit.de, easyprivacy, winhelp2002.mvps.org |
| s.shopify.com | blocklist.kowabit.de, winhelp2002.mvps.org |
| s.youtube.com | blocklist.kowabit.de, someonewhocares.org |
| v.shopify.com | adguard-simplified |

@veloute
Author

veloute commented Jul 15, 2018

I edited the script to include the list like this:

whitelist="$(curl https://raw.githubusercontent.com/anudeepND/whitelist/master/domains/whitelist.txt)"

it seems to work fine without any problems.

@hectorm
Owner

hectorm commented Jul 16, 2018

Even though it seems to work, you are actually whitelisting more domains than you think, because the --whitelist option uses regular expressions instead of literal strings.

If you want to modify the script, I suggest you do this instead:

whitelist=$(curl -fsSL 'https://raw.githubusercontent.com/anudeepND/whitelist/master/domains/whitelist.txt' \
	| while read -r d; do printf '^%s$\n' "$(quoteRe "$d")"; done \
)
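
To illustrate why unescaped domains whitelist more than intended when they are treated as regular expressions, here is a minimal sketch. The `escape_re` helper is a hypothetical stand-in written for this example, not hBlock's own `quoteRe`, and the sample hostnames are made up.

```shell
#!/bin/sh
# Hypothetical helper (NOT hBlock's quoteRe): escape ERE metacharacters
# so that a domain is matched literally.
escape_re() {
	printf '%s\n' "$1" | sed 's/[][\.|$(){}?+*^]/\\&/g'
}

# Made-up sample of blocked hostnames.
hosts='s.youtube.com
sxyoutube.com
ads.youtube.com.example'

# Unescaped and unanchored: "." matches any character and the
# pattern may match anywhere in the line.
loose_count=$(printf '%s\n' "$hosts" | grep -cE 's.youtube.com')

# Escaped and anchored: only the exact domain matches.
strict_re="^$(escape_re 's.youtube.com')\$"
strict_count=$(printf '%s\n' "$hosts" | grep -cE "$strict_re")

printf 'loose: %s, strict: %s\n' "$loose_count" "$strict_count"
```

With this sample, the loose pattern matches all three hostnames, while the escaped and anchored pattern matches only `s.youtube.com`.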

@veloute
Author

veloute commented Jul 17, 2018

Huh, you're right. I don't have a tonne of experience with bash scripting, so thanks for that.

Would there be any downside to adding this by default? You could even mirror the whitelist, or host your own, so that it only includes domains which shouldn't be blocked but are in the blocklists.

@hectorm
Owner

hectorm commented Jul 17, 2018

The solution I have given in #20 (comment) works for this case, but it is not generic enough, since it assumes that each line is a domain name, and is also too slow. In order to implement this feature I should first sanitize the input and improve the performance.
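
One possible way to address both concerns, sketched here under assumptions and not taken from hBlock itself: sanitize the list first, then filter with fixed strings in a single `grep` pass instead of building one regex per line in a shell loop. The function names `sanitize_domains` and `filter_whitelisted` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: strip comments, trim whitespace, drop blank lines.
sanitize_domains() {
	sed -e 's/#.*//' \
		-e 's/^[[:space:]]*//' \
		-e 's/[[:space:]]*$//' \
		-e '/^$/d'
}

# Hypothetical helper: remove every line that exactly equals an entry
# in the whitelist file ($1). -F = fixed strings, -x = whole line, -v = invert.
filter_whitelisted() {
	grep -Fvx -f "$1"
}

# Usage with made-up data.
wl_file=$(mktemp)
printf '# comment\n  s.youtube.com  \n\ncdn.optimizely.com # inline\n' \
	| sanitize_domains > "$wl_file"
remaining=$(printf 's.youtube.com\nsxyoutube.com\ncdn.optimizely.com\n' \
	| filter_whitelisted "$wl_file")
printf '%s\n' "$remaining"
rm -f "$wl_file"
```

Because the entries are matched as literal whole lines, no regex escaping is needed in this variant.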

I still have improvements to make to the --blacklist and --whitelist options, so I will leave this issue pending until then.

On the other hand, are any of the domains that I mentioned in #20 (comment) causing you problems?

@erikdubois

First, thank you for making this script. I promoted it here:
https://arcolinux.com/use-hblock-to-improve-your-security-and-privacy-by-blocking-ads-tracking-and-malware-domains/

Can I add something to the wish list?

What if you provided a .config/hblock/whitelist.txt file so that users can add a URL rather than a regex (which is not everybody's cup of tea)?

Currently I want to whitelist analytics.google.com to see my traffic. The regex works fine, but we run your application from a timer and a service.

When those run, the whitelisting is not applied, or is lost again, and I need to intervene manually to whitelist the domain.
If we could have a file with either regexes or URLs, it would be a great improvement for solving false positives.

@hectorm
Owner

hectorm commented Nov 19, 2018

I'm thinking of making the --whitelist and --blacklist options support only files, disabling regex support in the whitelist by default, and adding a -r, --whitelist-regex option to enable it.

Clearly this breaks compatibility, so I must increase the version to 2.0.0.

@erikdubois

So --whitelist-regex will only support regexes, not URLs, and will read a file that contains one rule per line?

and

--whitelist will only support a file of URLs, with one URL per line?

Or is it something else?

Since hBlock runs by default on our system, it would be best for us to ship the URLs we already whitelist for users, such as Google Analytics.

So it would be great if, when you just type hblock, the application looked for the two files and included or excluded whatever is in them. If the files are missing or empty, nothing happens.

That way it is up to the users to manage their own whitelist and blacklist.

@hectorm
Owner

hectorm commented Nov 20, 2018

I have created an experimental branch with these changes and other improvements. I haven't updated the readme, but you can use the --help option to understand how to use it.

@erikdubois

erikdubois commented Nov 23, 2018

Thanks for your effort.
I tried it out and the code looks promising, but it does not work yet.
I am sure it will work.
Here are two screenshots of what I tried.
arcolinux-2018-11-23-1542985487_screenshot_3840x1080
arcolinux-2018-11-23-1542985478_screenshot_3840x1080

./hblock --whitelist HBLOCK_WHITELIST
and
./hblock --whitelist /etc/hblock.d/whitelist.list

I just ran hblock from the download folder. I did not build anything, and hblock was already installed.

@hectorm
Owner

hectorm commented Nov 23, 2018

The whitelist.list file must contain domain names separated by new lines. So instead of https://analytics.google.com it should be analytics.google.com.
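
If a list was collected as URLs, it could be converted before use. The following is only a sketch: `url_to_domain` is a hypothetical helper, not part of hBlock, and it handles just the scheme, path, query, fragment and port. It does not handle userinfo such as `user@host`.

```shell
#!/bin/sh
# Hypothetical helper: reduce a URL to its host name.
url_to_domain() {
	printf '%s\n' "$1" | sed \
		-e 's|^[A-Za-z][A-Za-z0-9+.-]*://||' \
		-e 's|[/?#].*$||' \
		-e 's|:[0-9]*$||'
}

# A URL is reduced to a bare domain; a bare domain passes through unchanged.
url_to_domain 'https://analytics.google.com'
url_to_domain 'analytics.google.com'
```

Both calls print `analytics.google.com`, which is the form the whitelist.list file expects.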

@erikdubois

It worked with this line of code:
./hblock --whitelist /etc/hblock.d/whitelist.list
using analytics.google.com in the whitelist.list file.
Could we make it so that if nothing follows --whitelist, it falls back on the /etc/hblock.d/whitelist.list file, if that file exists?

so

./hblock --whitelist

would become the same thing. Or is that what this variable is for?

@hectorm
Owner

hectorm commented Nov 23, 2018

In the new branch the whitelist works as follows.

If the environment variable HBLOCK_WHITELIST is defined, its value is used as the whitelist. Otherwise, if the file /etc/hblock.d/whitelist.list exists, its content is used as the whitelist. Otherwise, the value built into the script itself is used, which is currently empty.

With the --whitelist option you can specify a file other than /etc/hblock.d/whitelist.list. Your comments have made me realize, though, that if the user explicitly passes the --whitelist option, it should take priority over the HBLOCK_WHITELIST environment variable.
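
The lookup order described above, with an explicit --whitelist file taking priority, can be sketched as follows. This is an illustration under assumptions, not the code from the branch: `resolve_whitelist` and `BUILTIN_WHITELIST` are hypothetical names, and the default path is a parameter only so that the function can be tried without touching /etc.

```shell
#!/bin/sh
# Hypothetical stand-in for the value built into the script (currently empty).
BUILTIN_WHITELIST=''

# Hypothetical function: print the effective whitelist.
# $1: file passed via --whitelist (empty if the option was not used)
# $2: default file, assumed to be /etc/hblock.d/whitelist.list
resolve_whitelist() {
	cli_file=$1
	default_file=${2:-/etc/hblock.d/whitelist.list}
	if [ -n "$cli_file" ]; then
		# 1. An explicit --whitelist file wins.
		cat -- "$cli_file"
	elif [ -n "${HBLOCK_WHITELIST+x}" ]; then
		# 2. Otherwise the environment variable, if it is defined.
		printf '%s\n' "$HBLOCK_WHITELIST"
	elif [ -f "$default_file" ]; then
		# 3. Otherwise the default file, if it exists.
		cat -- "$default_file"
	else
		# 4. Otherwise the built-in value.
		printf '%s' "$BUILTIN_WHITELIST"
	fi
}
```

For example, `HBLOCK_WHITELIST='analytics.google.com' resolve_whitelist ''` would print the value of the variable, while passing a file path as the first argument would print that file's content instead.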

@erikdubois

Looking good, let's move forward.
The installation of hblock should create /etc/hblock.d/whitelist.list and blacklist.list automatically ;-)

@hectorm
Owner

hectorm commented Nov 24, 2018

I think these files should be created explicitly by the user and not by the script.

Once I make the priority change mentioned in the previous comment, I will merge the branch into master and publish version 2.0.0.

@erikdubois

erikdubois commented Nov 24, 2018 via email

@hectorm
Owner

hectorm commented Nov 25, 2018

I'd really appreciate it if you'd take the time to check it out. Thank you!

@hectorm
Owner

hectorm commented Nov 25, 2018

Done (d2f935a).

@erikdubois

erikdubois commented Nov 25, 2018 via email

@erikdubois

erikdubois commented Nov 25, 2018 via email

@erikdubois

Crazy idea... this is what I did without reading the posts again.
created a folder in ~/.config/hblock and put whitelist file in it...
then read it again
This way it is in the users directory with full read and write capabilities...
arcolinux-2018-11-29-1543484439_screenshot_3840x1080

@erikdubois

erikdubois commented Nov 29, 2018

tried the experimental branch and typing hblock works
/etc/hblock.d/whitelist.list containing google.analytics.com

Just for your information.
Here is our PKGBUILD
We will need to adapt it to add a folder with a whitelist.list file containing the false positives.
The question is where hblock should read its list from:
/etc/hblock.d/whitelist.list
or
~/.config/hblock/whitelist.list

You decide the best option.

Should we not also add a blacklist.list file to the PKGBUILD?

@hectorm
Owner

hectorm commented Nov 29, 2018

I think the correct path should be /etc/hblock.d/whitelist.list as the user needs permission to modify the /etc/hosts file anyway and it is a change that affects all users.

On the other hand, since you are including the /etc/hblock.d/whitelist.list file, I think it's a good idea to also include /etc/hblock.d/blacklist.list for consistency. I have no plans to use any blacklist by default.

@erikdubois

erikdubois commented Nov 30, 2018 via email

@erikdubois

erikdubois commented Nov 30, 2018 via email

@hectorm
Owner

hectorm commented Nov 30, 2018

All right, I just released version 2.0.0.

@hectorm hectorm closed this as completed Nov 30, 2018
@erikdubois

erikdubois commented Nov 30, 2018 via email

@erikdubois

works fine
