News:

Bored?  Looking to kill some time?  Want to chat with other SMF users?  Join us in IRC chat or Discord

Main Menu

IP checking breaks reCAPTCHA plugin on IPv6 + general bad code practices

Started by joepie91, October 27, 2011, 03:50:57 PM

Previous topic - Next topic

joepie91

Hello,

First off I would like to make clear that I have used SMF myself for a long time, and my main reason for using it over other forum platforms is how light and simple it is, and, until now, reliable. I am not just here to bash, I am here to point out a very serious problem.

Recently a friend of mine tried to install a reCAPTCHA plugin on his SMF 2.0.1 board. Every time you would actually try to register or make a guest post, an error would be encountered saying that reCAPTCHA required the user IP to be passed on. After testing in the Subs-Editor.php file I found out that the $_SERVER['REMOTE_ADDR'] variable simply did not hold a value. After creating an independent file that printed the same variable, and finding that it worked just fine, I started suspecting something was wrong with the core SMF code. After a grep through the source I found the following few lines:

./Sources/QueryString.php: if (!isset($_SERVER['REMOTE_ADDR']))
./Sources/QueryString.php: $_SERVER['REMOTE_ADDR'] = '';
./Sources/QueryString.php: elseif (preg_match('~^((([1]?\d)?\d|2[0-4]\d|25[0-5])\.){3}(([1]?\d)?\d|2[0-4]\d|25[0-5])$~', $_SERVER['REMOTE_ADDR']) === 0)
./Sources/QueryString.php: $_SERVER['REMOTE_ADDR'] = 'unknown';
[...]
./Sources/QueryString.php: if ($_SERVER['REMOTE_ADDR'] == 'unknown')
./Sources/QueryString.php: $_SERVER['REMOTE_ADDR'] = '';


... which brings me to several points regarding the code structure of SMF.

There are several very big and important mistakes that are being made here:

  • Magic variables ($_SERVER) are being set. This should never ever ever be done for any reason whatsoever because it changes the value of a magic variable from predictable to unpredictable. Use your own variables to store values derived from a magic variable.
  • A variable is being tested that cannot be modified by the client, and is fully provided by the HTTPd. There is no possible way that this variable holds a malformed value if a proper HTTPd is being used (Apache, lighttpd, nginx, to name a few). There is, simply put, no reason to check this variable as the only possible thing it can cause is problems.
  • An IPv4-specific regex is used that cannot deal with IPv6 addresses, nor with IPv4 addresses in IPv6 format. An address in the format of ::ffff:1.2.3.4 simply is treated as an invalid value and causes the $_SERVER['REMOTE_ADDR'] variable to be set to 'unknown', and later on set to ''. This is what broke the reCAPTCHA plugin since he was trying to run the forum on an IPv6-ready server (Debian + lighttpd).
  • The meaning of a variable ($_SERVER['REMOTE_ADDR']) is changed halfway through the code. If a distinction has to be made between unknown and unset, then that distinction should be made in another variable. It is not acceptable to first have a distinction, and later generalize it into '' for both. This makes it very hard to understand what a variable does at what point in the code and causes unpredictable behaviour. It was the reason that at first I could not find out what was causing the variable to be empty - because, of course, an invalid format would set it to 'unknown' rather than empty, and it was not immediately obvious that unset and unknown would later both be set to be empty.
  • A new magic variable is set ($_SERVER['is_cli'] = true;) in the $_SERVER array which is - considering the purpose of the $_SERVER array being the passing on of server/environment variables - not acceptable either. These kind of developer-defined things should have their own (global) variables or arrays.

Now if this were simply a misformatted regular expression, I would have submitted a corrected version that also did IPv6 and be done with it. However, the problems here - with the programming style in general, not just with the regular expression - are so severe, that it requires more than that. This kind of programming has so much potential to cause future problems that it's very likely that, continuing this style, similar problems will be found in the future.

The solution to the issue we were having? Comment out the two lines for the format checking entirely.

Antechinus

Just noting that we are aware of this problem:

"An IPv4-specific regex is used that cannot deal with IPv6 addresses, nor with IPv4 addresses in IPv6 format."

Sleepy has already written a mod to add IPv6 support to 2.0.x (http://custom.simplemachines.org/mods/index.php?mod=3051) and we are going to have IPv6 support in the next version, of course.

I wont comment on the other points as I'm mainly front end, so am not really up to speed on those points. I expect Norv will take a look at this as soon as he logs in.

Oh and just an opinion here: but I never use any sort of CAPTCHA at all these days, and have not noticed any disadvantage in running an active site like that. I'd suggest not bothering with RECAPTCHA if it's giving you trouble.

Norv

Thank you for taking the time for this, joepie.

Quote from: joepie91 on October 27, 2011, 03:50:57 PM
However, the problems here - with the programming style in general, not just with the regular expression - are so severe, that it requires more than that. This kind of programming has so much potential to cause future problems that it's very likely

I agree. There are bad practices embedded throughout the code. I know that very well, and none of the writing of superglobals and such practices will be in the next version. The SMF codebase is getting a serious refactoring and cleaning for the next versions. I have already started the work with this purpose, and I'm not satisfied until any of such (and similar) will not be there anymore.
We don't need the earth and sky in SMF, but THIS, we do need to do.
To-do lists are for deferral. The more things you write down the later they're done... until you have 100s of lists of things you don't do.

File a security report | Developers' Blog | Bug Tracker


Also known as Norv on D* | Norv N. on G+ | Norv on Github

Illori


emanuele

The problem with IPv6 should be fixed with SleePy changes to support IPv6, the general bad practice...well it's still there. Don't know if it will be changed for 2.1.


Take a peek at what I'm doing! ;D




Hai bisogno di supporto in Italiano?

Aiutateci ad aiutarvi: spiegate bene il vostro problema: no, "non funziona" non è una spiegazione!!
1) Cosa fai,
2) cosa ti aspetti,
3) cosa ottieni.

Arantor

The bad practice would require a complete overhaul of everything end to end to achieve and certainly won't happen any time soon though it does need doing.
Holder of controversial views, all of which my own.


Advertisement: