A List of Dirty, Naughty, Obscene, and Otherwise Bad Words to use in PHP via Composer.
Obvious warning: These lists contain material that many will find offensive. (But that's the point!)
If you think that implementing an automatic filter of bad words is a good idea, you should really check these articles first:
To sum up, this library should by no means be treated as a complete solution for removing obscene language from your application. People will always find ways to bypass your filters. If you need to handle bad words, you need manual control. Tools like this one are meant to assist human moderation, not to replace it.
Use cases where this library might help:
- Requesting approval from a moderator when user input validation finds potentially bad words.
- Refusing user input only when an exact match of a bad word is found.
(This is easily bypassed by adding one arbitrary letter to a bad word, so that the exact match no longer succeeds.)
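The exact-match use case and its weakness can be sketched as follows. This is only an illustration: `nwLower()` and `isExactNaughtyWord()` are hypothetical helpers that are not part of the library, and the word list is a harmless placeholder.

```php
<?php
// Hypothetical helpers for illustration; neither is part of the library.

/** Lowercase a string, using mbstring when it is available. */
function nwLower(string $s): string
{
    return function_exists('mb_strtolower') ? mb_strtolower($s, 'UTF-8') : strtolower($s);
}

/** True only when the whole (trimmed, lowercased) input equals a listed word. */
function isExactNaughtyWord(string $input, array $words): bool
{
    return in_array(nwLower(trim($input)), array_map('nwLower', $words), true);
}

// Placeholder list; in a real project this could come from the library's word list.
$words = ['darn', 'heck'];

var_dump(isExactNaughtyWord('Darn', $words));  // bool(true)
var_dump(isExactNaughtyWord('darnx', $words)); // bool(false): one extra letter bypasses the check
```

The second call shows why an exact-match check alone is a weak filter and is better used as a signal for human review.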
composer require vasart/naughty-words
Receiving a plain list of naughty words:
use VasArt\NaughtyWords\NaughtyWords;
$naughtyWordsEn = NaughtyWords::getForLanguage('en');
The string 'en' passed to getForLanguage() is the name of the file with bad words for the chosen language. See the list of available languages in the Languages section.
Using the built-in validator:
use VasArt\NaughtyWords\Validator;
$text = 'some user input with potentially bad words';
$naughtyWordsValidator = new Validator( [ 'en', 'ru' ] );
$naughtyWords = $naughtyWordsValidator->findNaughtyWords( $text );
var_export($naughtyWords); // [ 'en' => 'word', 'ru' => false ]
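A result like this could feed a moderation step. The sketch below assumes the result shape shown in the comment above (language code mapped to a matched word, or to false when nothing was found); `languagesWithMatches()` is a hypothetical helper, not part of the library.

```php
<?php
/**
 * Hypothetical helper: list the language codes for which a word was found.
 * Assumes the result shape: language code => matched word, or false.
 */
function languagesWithMatches(array $result): array
{
    return array_keys(array_filter($result, static function ($match) {
        return $match !== false;
    }));
}

// Shape copied from the validator example; the values are placeholders.
$result = ['en' => 'word', 'ru' => false];

$flagged = languagesWithMatches($result);
if ($flagged !== []) {
    // Hand the input to a human moderator instead of rejecting it outright.
    echo 'Needs moderator approval (' . implode(', ', $flagged) . ")\n";
} else {
    echo "Accepted\n";
}
```

Comparing with `!== false` rather than relying on truthiness keeps a matched word such as '0' from being mistaken for "no match".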
To see exactly how the built-in validator works, check:
- the test cases in the ValidatorTest class;
- the regular expression that is built inside the WordsList class.
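The actual expression lives in the WordsList class and may well differ from what follows. As a rough sketch of the general technique only, a word list can be turned into a single whole-word pattern like this; `buildWholeWordPattern()` and `findFirstListedWord()` are hypothetical names, and the word list is a placeholder.

```php
<?php
// Hypothetical sketch; the real pattern is built inside the WordsList class and may differ.

/** Build one case-insensitive, whole-word pattern from a list of words. */
function buildWholeWordPattern(array $words): string
{
    $escaped = array_map(static function (string $word): string {
        return preg_quote($word, '/'); // escape regex metacharacters and the delimiter
    }, $words);

    return '/\b(?:' . implode('|', $escaped) . ')\b/iu';
}

/** Return the first listed word found in $text, or false when none is found. */
function findFirstListedWord(string $text, array $words)
{
    if ($words === []) {
        return false; // an empty alternation would match at any word boundary
    }

    return preg_match(buildWholeWordPattern($words), $text, $m) === 1 ? $m[0] : false;
}

var_dump(findFirstListedWord('what the heck', ['darn', 'heck'])); // string(4) "heck"
var_dump(findFirstListedWord('checkmate', ['heck']));             // bool(false)
```

The word boundaries keep "checkmate" from matching "heck", which reduces false positives on innocent words that merely contain a listed one.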
Name | Code |
---|---|
Arabic | ar |
Chinese | zh |
Czech | cs |
Danish | da |
Dutch | nl |
English | en |
Esperanto | eo |
Filipino | fil |
Finnish | fi |
French | fr |
French (CA) | fr-CA-u-sd-caqc |
German | de |
Hindi | hi |
Hungarian | hu |
Italian | it |
Japanese | ja |
Kabyle | kab |
Klingon | tlh |
Korean | ko |
Norwegian | no |
Persian | fa |
Polish | pl |
Portuguese | pt |
Russian | ru |
Spanish | es |
Swedish | sv |
Thai | th |
Turkish | tr |
If you need to use bad words inside an npm project, you can install the word list using the naughty-words package.
The code, configuration and project description files are licensed under GNU GPL 3.0, see LICENSE.
The list of words is licensed under a Creative Commons Attribution 4.0 International License, see LICENSE.words. © 2012–2020 Shutterstock, Inc.