Simple Machines Community Forum

Customizing SMF => Modifications and Packages => Topic started by: dougiefresh on November 05, 2013, 03:00:43 AM

Title: Word Censor List
Post by: dougiefresh on November 05, 2013, 03:00:43 AM
Link to Mod (https://custom.simplemachines.org/mods/index.php?mod=3797)



WORD CENSOR LIST v1.5
By Dougiefresh (http://www.simplemachines.org/community/index.php?action=profile;u=253913) -> Link to Mod (http://custom.simplemachines.org/mods/index.php?mod=3797)



Introduction
So, you want to run a family-friendly community, without any vulgar words appearing on your site. The easiest way to prevent that is to use SMF's word censor feature, but you have an empty list of words and don't want to spend an hour filling in every naughty word you know and some you don't.

Word Censor List will help you by adding a list of some commonly censored words and some uncommon ones to your forum.

Compatibility Notes
This mod was tested on SMF 2.0.5, but should work on earlier versions of SMF 2.0.x.  SMF 1.x is not and will not be supported.

Changelog
The changelog can be viewed at XPtsp.com (http://www.xptsp.com/board/free-modifications/word-censor-list/?tab=1).

License
Copyright (c) 2015 - 2018, Douglas Orend
All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Title: Re: Word Censor List v1.1
Post by: dougiefresh on March 25, 2014, 09:43:21 AM
Updated to v1.1.  Upgrading from v1.0 to v1.1 is not necessary: v1.1 does not change the functionality provided, it only fixes the settings installer.
Title: Re: Word Censor List v1.1
Post by: TonyG on December 22, 2014, 04:12:45 PM
I have a list of censor words that I carry around from one family-oriented site to another. Interested in an update to the list you have in edit_db.php? Do you already have a place for this or some preferred mechanism for doing this?
Thanks!
Title: Re: Word Censor List v1.1
Post by: Kindred on December 22, 2014, 04:26:12 PM
Quote from: dougiefresh on November 05, 2013, 03:00:43 AM
. The easiest way to prevent that is to use phpBB's word censor feature,

really? :)
Title: Re: Word Censor List v1.1
Post by: margarett on December 22, 2014, 04:28:20 PM
LOL ;D
Title: Re: Word Censor List v1.1
Post by: dougiefresh on December 22, 2014, 08:30:31 PM
Always interested in submissions....  Please share!
Title: Re: Word Censor List v1.1
Post by: Biology Forums on December 22, 2014, 08:46:11 PM
Always wanted a mod like this, thanks.
Title: Re: Word Censor List v1.1
Post by: dougiefresh on December 26, 2014, 01:10:55 PM
Quote from: Kindred on December 22, 2014, 04:26:12 PM
Quote from: dougiefresh on November 05, 2013, 03:00:43 AM
. The easiest way to prevent that is to use phpBB's word censor feature,

really? :)
:o Whoops!!!  I meant that you should use SMF's word censor feature.....  Fixed that in the first post!  ::)  I guess I should admit that I copied the description from a phpBB mod and didn't pay that much attention....
Title: Re: Word Censor List v1.1
Post by: TonyG on January 02, 2015, 08:00:39 PM
I just updated the list. Based on other entries, I added and modified a lot of words to include RegExp tests, but it doesn't look like any of those are working. I'm using the Advanced Censor mod which does a PHP function strstr, and Block Censor Words.

Has anyone here modified their filter to do regex tests with the censor list?

Thanks!
Title: Re: Word Censor List v1.1
Post by: Arantor on January 02, 2015, 10:41:54 PM
Considering that the internals of the censor function already use regex, I wish you the *very* best of luck performing the rewrite required to make that work as intended.
Title: Re: Word Censor List v1.1
Post by: TonyG on January 03, 2015, 07:40:40 PM
So am I to understand that this Word Censor List was invalid from the start?

I'll have to look at the regexp code because it doesn't look like it's working with the masks being used.

So which is it? Are we using the wrong kind of regex? Is the regex not working? Is there any documentation for the syntax supported by the current regex mechanism? If that's a preg_match, can we assume that if the word list in the database has a string that can be interpreted by preg_match, we'll get good censor matching?

And now that I'm thinking about this I'm thinking that the mods might be using strstr() while SMF might be using preg_match, which leaves text to get filtered in different ways along the chain of execution - that can just lead to confusion and embarrassment.

Let's not leave this unresolved - what SHOULD work there?

Thanks.
Title: Re: Word Censor List v1.1
Post by: Arantor on January 03, 2015, 08:07:31 PM
I don't know what you understood from what I said, to be honest, but clearly there is some misunderstanding somewhere.

This adds them to the database in the way SMF's own interface does. This is then internally converted into a regex for processing purposes. It doesn't support full regex syntax for this reason. Hence my comment.

But multiple times I have seen comments... you clearly know best, of course. Best of luck to you.
Title: Re: Word Censor List v1.1
Post by: TonyG on January 03, 2015, 08:39:14 PM
We do have a misunderstanding. I'm trying to understand how this stuff is working so that we can do better filtering.
I completely understand that this Word Censor List mod just inserts text strings into the database.
From there, what happens to each string?

The list already includes some strings with regexp. I just asked if that was valid or not.
You said "This is then internally converted into a regex for processing purposes. It doesn't support full regex syntax for this reason. "
OK, so what sort of conversion is done there? Knowing that will allow us to make better improvements to this list.

From the examples already in the list, it seemed to me that "b[4a@][!l][!l][0o][0o]n" should match balloon, b@l!00n, and b4l!0on. Is that not correct? If not then all I was saying is that a number of entries already in the list are bad and we need to change how this is approached.
Title: Re: Word Censor List v1.1
Post by: TonyG on January 10, 2015, 07:11:42 PM
Coming back to this topic. Can anyone tell us exactly what Regex syntax is supported for words found in the censor list?

I see the Load.php code referred to by @Arantor:

if ($censor_vulgar == null)
{
    $censor_vulgar = explode("\n", $modSettings['censor_vulgar']);
    $censor_proper = explode("\n", $modSettings['censor_proper']);

    // Quote them for use in regular expressions.
    for ($i = 0, $n = count($censor_vulgar); $i < $n; $i++)
    {
        $censor_vulgar[$i] = strtr(preg_quote($censor_vulgar[$i], '/'), array('\\\\\\*' => '[*]', '\\*' => '[^\s]*?', '&' => '&amp;'));
        $censor_vulgar[$i] =
            (empty($modSettings['censorWholeWord']) ?
                '/' . $censor_vulgar[$i] . '/' :
                '/(?<=^|\W)' .
                $censor_vulgar[$i] .
                '(?=$|\W)/') .
            (empty($modSettings['censorIgnoreCase']) ?
                '' :
                'i') .
            ((empty($modSettings['global_character_set']) ?
                $txt['lang_character_set'] :
                $modSettings['global_character_set']) === 'UTF-8' ? 'u' : '');

        if (strpos($censor_vulgar[$i], '\'') !== false)
        {
            $censor_proper[count($censor_vulgar)] = $censor_proper[$i];
            $censor_vulgar[count($censor_vulgar)] = strtr($censor_vulgar[$i], array('\'' => '&#039;'));
        }
    }
}

// Censoring isn't so very complicated :P.
$text = preg_replace($censor_vulgar, $censor_proper, $text);


I broke up that meaty assignment statement just for readability. I understand that's adjusting each word element to account for server-specific settings. But can anyone explain exactly what the reformatting code is doing which might preclude using Regex syntax in elements of $modSettings['censor_vulgar'] ?
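For anyone following along, here is a minimal sketch of what the quoting step in the excerpt above appears to do to a single entry. The function name censorEntryToPattern() is made up for illustration and is not part of SMF; the UTF-8 'u' flag is left out for brevity. The key point is that preg_quote() escapes every regex metacharacter, and strtr() then gives meaning back to the asterisk only.

```php
<?php
// Hypothetical helper (not an SMF function): builds the pattern for ONE censor
// entry the same way the Load.php excerpt does, minus the UTF-8 'u' flag.
function censorEntryToPattern(string $entry, bool $wholeWord = false, bool $ignoreCase = true): string
{
    // preg_quote() escapes regex metacharacters, so [ ] ( ) . ? + and friends
    // are matched as literal characters from here on.
    $quoted = preg_quote($entry, '/');

    // Only the asterisk is revived: an escaped "*" becomes a lazy run of
    // non-whitespace, and an entry typed as "\*" becomes a literal star.
    $quoted = strtr($quoted, array('\\\\\\*' => '[*]', '\\*' => '[^\s]*?', '&' => '&amp;'));

    $pattern = $wholeWord
        ? '/(?<=^|\W)' . $quoted . '(?=$|\W)/'
        : '/' . $quoted . '/';

    return $pattern . ($ignoreCase ? 'i' : '');
}

// A character-class style entry only matches itself, brackets included.
$classPattern = censorEntryToPattern('b[4a@][!l][!l][0o][0o]n');
$matchesBalloon = preg_match($classPattern, 'balloon');
$matchesLiteral = preg_match($classPattern, 'b[4a@][!l][!l][0o][0o]n');

// The asterisk is the one wildcard that survives the quoting.
$starPattern = censorEntryToPattern('ba*n');
$starMatchesBalloon = preg_match($starPattern, 'balloon');
$starMatchesAcrossSpace = preg_match($starPattern, 'ba n');
```

If this reading is right, entries written with character classes can never match the obfuscated spellings they were meant for, while entries that use * as a wildcard work as long as the wildcard does not have to span whitespace.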

Note: I just looked at the Advanced Censor mod. This will not process the $modSettings['censor_vulgar'] list using Regex as seen above. It looks for specific text.:
if (strstr($pMessageBody, $vCensorVulgar[$i])) return true;

However, I believe that code could easily be retrofit with the code from Load.php.

Thanks.
Title: Re: Word Censor List v1.1
Post by: dougiefresh on January 11, 2015, 04:21:20 PM
Hmmmm....  I don't have a copy of version 1.0 of this mod, so I'm gonna have to figure something out regarding the broken censor list....
Title: Re: Word Censor List v1.1
Post by: TonyG on January 12, 2015, 03:13:48 PM
I don't understand @dougiefresh.

To get the Regexp in your word list to work, I think one just needs to understand what's being done in that core code to each element before it does the final preg_replace. It might be helpful to write that data to a file to see what's been done to it. Then we can revise each element to confirm.

As to the Advanced Censor mod, it returns a true before posting if the text contains a censored word. So all that's needed there is the same code from Load.php, and final check:
if (preg_replace($censor_vulgar, $censor_proper, $text) !== $text) return true;
Someone should advise him that his mod is invalid if the wordlist contains Regexp. I guess I'll do this after we get through this discussion.
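A rough sketch of what such a pre-post check might look like, assuming the same quoting rules as the Load.php excerpt. containsCensoredWord() is a hypothetical name, not code from the Advanced Censor mod, and it ignores the whole-word and UTF-8 settings for brevity. It uses preg_match rather than comparing preg_replace output, so the result does not depend on what the replacement text happens to be.

```php
<?php
// Hypothetical helper (not taken from any mod): returns true when at least one
// censor entry would match $text under the Load.php quoting rules.
function containsCensoredWord(array $entries, string $text, bool $ignoreCase = true): bool
{
    foreach ($entries as $entry) {
        if ($entry === '') {
            continue; // an empty entry would yield the pattern "//", which matches anything
        }

        // Same quoting as the core code: everything literal except "*".
        $quoted = strtr(preg_quote($entry, '/'), array('\\\\\\*' => '[*]', '\\*' => '[^\s]*?', '&' => '&amp;'));

        if (preg_match('/' . $quoted . '/' . ($ignoreCase ? 'i' : ''), $text) === 1) {
            return true;
        }
    }

    return false;
}

$entries = array('darn', 'h*ck');
$wildcardHit = containsCensoredWord($entries, 'What the HECK');
$cleanText = containsCensoredWord($entries, 'hello world');

// For comparison: a plain strstr() lookup never sees the wildcard entry.
$strstrHit = strstr('What the HECK', 'h*ck') !== false;
```

With this approach the wildcard entry h*ck flags "HECK", whereas a strstr() lookup for the same entry finds nothing.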

HTH
Title: Re: Word Censor List v1.1
Post by: dougiefresh on January 12, 2015, 06:57:02 PM
Uploaded v1.2 - January 12th, 2015
o Removed most wildcards from the word censor list.
o Corrected link to the mod in the descriptions.
Title: Re: Word Censor List
Post by: TonyG on January 12, 2015, 09:57:23 PM
So is the answer to the ongoing question that regex is simply not supported at all for censored words?
If so, then removing the wildcards from the list in this mod is the right solution for this mod.

I think the better long-term solution however is to find out what regexp is possible in the code from Load.php, and then get words in the list to conform within the constraints.
Title: HELP PLEASE!!!!!!
Post by: metallicgloss on January 29, 2015, 03:49:16 PM
I installed this package and it is now turning every 'hello' into *o and it is REALLY ANNOYING.
I edited the file in the pack but nothing has changed. I re-installed my forum and re-added a couple of packages, with the exception of this one. It is still doing it, so it must be doing something with the database. Where can I remove it so it no longer counts "hell" as a swear word?
Title: Re: Word Censor List
Post by: dougiefresh on January 29, 2015, 03:58:38 PM
@metallicgloss: Go into the Admin panel, under Forum => Posts and Topics => Censored Words.  Put a check in the option saying Check only whole words:.  That should solve the problem....
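To illustrate why that option helps, here is a small sketch using the two pattern shapes from the Load.php code quoted earlier in this topic. The pair "hell" => "*" is only inferred from the report above, so treat it as an assumption rather than a confirmed entry from the mod's list.

```php
<?php
// Assumed entry for illustration: "hell" censored to "*".
$word = preg_quote('hell', '/');

// Pattern shape when "Check only whole words" is off: plain substring match.
$substringPattern = '/' . $word . '/i';

// Pattern shape when the option is on: the word must be bounded by the
// start/end of the text or by a non-word character on each side.
$wholeWordPattern = '/(?<=^|\W)' . $word . '(?=$|\W)/i';

$withoutOption = preg_replace($substringPattern, '*', 'hello there');
$withOption = preg_replace($wholeWordPattern, '*', 'hello there');
$stillCensored = preg_replace($wholeWordPattern, '*', 'what the hell?');

echo $withoutOption, "\n"; // *o there
echo $withOption, "\n";    // hello there
echo $stillCensored, "\n"; // what the *?
```

The trade-off is that whole-word matching also stops catching the word when it is glued to other letters, so it is a choice between false positives and misses.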
Title: Re: Word Censor List
Post by: metallicgloss on January 29, 2015, 04:35:33 PM
@dougiefresh THANK YOU!!!!! This will really help the community. Thank you for your help.
Title: Re: Word Censor List
Post by: KensonPlays on March 01, 2015, 10:34:18 PM
Thanks for the update! I had one by @Labradoodle-360, but I could not find it for the life of me. You're a life-saver. :P
Title: Re: Word Censor List
Post by: dougiefresh on April 06, 2015, 05:12:13 PM
Uploaded v1.4 - April 5th, 2015
o Updated for SMF 2.1 Beta 1
Title: Re: Word Censor List
Post by: skb on March 13, 2017, 02:24:18 AM
I uninstalled the mod, yet the words remain in the Censored Word List ?
Title: Re: Word Censor List
Post by: Arantor on March 13, 2017, 03:34:25 AM
That would make sense, the mod just adds to the existing censor list.
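As a sketch of what "adds to the existing censor list" could mean in practice: the Load.php code quoted earlier reads two newline-separated strings, censor_vulgar and censor_proper, so adding words amounts to appending to both in step. The helper below is hypothetical and is not the mod's actual installer code. It also shows why the added words are hard to tell apart from your own entries afterwards.

```php
<?php
// Hypothetical helper (not the mod's installer): appends word => replacement
// pairs to the two newline-separated lists, skipping words already present.
function mergeCensorLists(string $vulgar, string $proper, array $additions): array
{
    $vulgarList = $vulgar === '' ? array() : explode("\n", $vulgar);
    $properList = $proper === '' ? array() : explode("\n", $proper);

    foreach ($additions as $word => $replacement) {
        $word = (string) $word; // numeric-looking array keys arrive as integers

        if (!in_array($word, $vulgarList, true)) {
            // Both lists must grow together so the indexes stay paired.
            $vulgarList[] = $word;
            $properList[] = (string) $replacement;
        }
    }

    return array(implode("\n", $vulgarList), implode("\n", $properList));
}

list($newVulgar, $newProper) = mergeCensorLists(
    "darn\nheck",
    "*\n*",
    array('heck' => '#', 'drat' => '*')
);
```

Here the existing "heck" entry keeps its original replacement, and only "drat" is appended to the end of both lists.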
Title: Re: Word Censor List
Post by: dougiefresh on November 09, 2018, 03:09:38 PM
Uploaded v1.5 - November 9th, 2018
o No functionality change.
o Updated documentation to point to new website.
Title: Re: Word Censor List
Post by: skb on September 28, 2019, 04:45:22 AM
I have the "Allow users to turn off word censoring:" option enabled, but I don't see the setting in My Profile / Account Settings / Forum Profile or Look and Layout where a user can exercise this option.
Title: Re: Word Censor List
Post by: landyvlad on October 01, 2021, 12:40:43 AM
Can I use this to add all the words etc to my censored word list and then delete the mod to clean up but leave all those  words in the censor list?
Title: Re: Word Censor List
Post by: efk on October 05, 2021, 07:29:54 AM
Quote from: landyvlad on October 01, 2021, 12:40:43 AM
Can I use this to add all the words etc to my censored word list and then delete the mod to clean up but leave all those  words in the censor list?

I believe that you don't need a mod for that. Simply go to Admin/Forum/Posts and Topics.../Censored Words and do what you need to do; it's simple to use.
Title: Re: Word Censor List
Post by: Kindred on October 05, 2021, 10:50:01 AM
efk --   the thing is, this mod simplifies the adding of a WHOLE BUNCH of words rather than entering them by hand.

landyvlad -- based on the comments above, the answer is yes.
Title: Re: Word Censor List
Post by: shadav on October 05, 2021, 12:24:16 PM
yes you don't need the mod, in fact I only installed the mod on one of my forums then set up the censored words how I wanted then carefully extracted them from the settings table and imported them into the rest of my forums  ;D
don't uninstall the mod though
but you can delete the mod as all this really does is run an sql to your db

it is very helpful as it has a lot of words in its list and a lot of variations of said words and in other languages as well.
Title: Re: Word Censor List
Post by: landyvlad on October 05, 2021, 10:47:17 PM
Quote from: shadav on October 05, 2021, 12:24:16 PM
don't uninstall the mod though
but you can delete the mod as all this really does is run an sql to your db

Why not uninstall?  Would that remove the words from the censored list?
Title: Re: Word Censor List
Post by: shadav on October 05, 2021, 10:59:31 PM
I'm not entirely sure but I think it removes them from the db if you uninstall...looking at the files it appears to remove the words but again I'm not entirely sure