PHP-FPM processes stuck at 100% CPU

Started by clanssd, September 01, 2011, 03:23:24 PM

clanssd

Hi there,

I'm running SMF 2.0 Gold on a VPS with nginx 1.1.1 and PHP 5.3.8 with FPM, using APC 3.1.9 as an opcode cache. I've been watching top for the last day and I keep seeing php-fpm processes get stuck at 100% CPU usage until the process despawns (I had to set it to do this in php-fpm.conf). Needless to say, this is driving up the CPU load quite a bit and I can't figure out why. People have suggested using the FPM slowlog to see which requests take longer than a set amount of time to complete, and this is the kind of output I see in the slowlog:


[01-Sep-2011 12:11:25]  [pool www] pid 19900
script_filename = /home/user/domain.com/user/index.php
[0x00000000023ff848] error_handler() /home/user/domain.com/user/Sources/Errors.php:208
[0x00007fff58801c60] error_handler() unknown:0
[0x00000000023fbed8] ob_end_clean() /home/user/domain.com/user/Sources/Display.php:1470
[0x00007fff588022e0] Download() unknown:0
[0x00000000023fa000] call_user_func() /home/user/domain.com/user/index.php:153

[01-Sep-2011 12:11:27]  [pool www] pid 19871
script_filename = /home/user/domain.com/user/index.php
[0x000000000267b748] ob_end_clean() /home/user/domain.com/user/Sources/Display.php:1470
[0x00007fff588022e0] Download() unknown:0
[0x0000000002679870] call_user_func() /home/user/domain.com/user/index.php:153

[01-Sep-2011 12:11:45]  [pool www] pid 20266
script_filename = /home/user/domain.com/user/index.php
[0x00000000024ceaa8] ob_end_clean() /home/user/domain.com/user/Sources/Display.php:1470
[0x00007fff588022e0] Download() unknown:0
[0x00000000024ccbd0] call_user_func() /home/user/domain.com/user/index.php:153

[01-Sep-2011 12:13:04]  [pool www] pid 19460
script_filename = /home/user/domain.com/user/index.php
[0x00000000025898d8] Download() /home/user/domain.com/user/Sources/Display.php:1470
[0x00007fff588022e0] Download() unknown:0
[0x0000000002587a00] call_user_func() /home/user/domain.com/user/index.php:153


The PIDs in the log correspond to the php-fpm processes that get stuck at 100%. I also have vB and IP.B running on the same VPS and while they do generate entries in the slowlog, they don't lock up the process. Any insight would be greatly appreciated. If you need me to post additional info, please let me know.
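
For anyone reading a longer slowlog than the excerpt above, here is a minimal sketch that tallies the innermost stack frame of each entry, so the most common hot spot stands out. It assumes the entry layout shown above (a script_filename line followed by the stack frames); the function name top_frames and the log path in the usage comment are made up for illustration.

```shell
# Tally the innermost (first) stack frame of each PHP-FPM slowlog entry.
# Reads the slowlog on stdin; prints "count function file:line",
# most frequent first.
top_frames() {
    awk '
        /^script_filename = / { want = 1; next }        # frame list starts on the next line
        /^$/                  { want = 0 }              # blank line ends the entry
        want && /^\[0x/       { print $2, $3; want = 0 } # keep only the first frame
    ' | sort | uniq -c | sort -rn
}

# Usage (the path is an assumption; use the slowlog path from your pool config):
# top_frames < /var/log/php-fpm/www-slow.log
```

On the excerpt above this would put ob_end_clean() at Display.php:1470 at the top, which matches the Download() call path visible in every entry.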

Something like that

Have you tried enabling the PHP error log? It may reveal more clues as to what's going on.

Also, check your HTTP logs for requests that return a 5xx error. That will give more insight as to what's causing the issue.
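
A minimal sketch of that check, assuming nginx is writing its default "combined" access log format (status code in field 9, request path in field 7). The function name count_5xx and the log path in the usage comment are assumptions, so adjust both to your setup.

```shell
# Count 5xx responses per status code and request path.
# Reads an access log in "combined" format on stdin;
# prints "count status path", most frequent first.
count_5xx() {
    awk '$9 ~ /^5[0-9][0-9]$/ { print $9, $7 }' | sort | uniq -c | sort -rn
}

# Usage (the path is an assumption):
# count_5xx < /var/log/nginx/access.log
```

If the log uses a custom log_format, the field numbers will differ and the awk expression needs adjusting.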

Island Wave

You might want to try running...

netstat -ane | grep FIN_WAIT

...and see if you have a lot of FIN_WAIT1 or FIN_WAIT2 connections running.
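
If the raw list is hard to read, a small sketch like the one below tallies connections per TCP state instead. It assumes the usual Linux netstat column layout, where the state is the sixth field on tcp lines; the function name tcp_state_counts is made up for illustration. For context, FIN_WAIT1 means the server has sent its FIN and is waiting for the client to acknowledge it, and FIN_WAIT2 means that FIN was acknowledged and the server is waiting for the client to close its side.

```shell
# Count TCP connections per state.
# Reads netstat output on stdin; prints "count STATE", largest count first.
tcp_state_counts() {
    awk '$1 ~ /^tcp/ { n[$6]++ } END { for (s in n) print n[s], s }' | sort -rn
}

# Usage:
# netstat -ant | tcp_state_counts
```

A handful of these per busy site is normal; what would stand out is a count that keeps growing over time.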

clanssd

Thanks for the quick reply... not sure what to call you since your name is blank. I went through the PHP error log and didn't find much of anything related, just a few images that couldn't be found. I don't see any 5xx errors in the access log. I emptied out the SMF error log and then waited until a few processes got stuck but I don't see anything showing up there either.

For now, I just set up PHP-FPM to close child processes if they've run for more than 45 CPU seconds as the processes that don't get stuck usually respawn after about 2-15 seconds. This seems to be a quick and dirty fix for the time being although I would still like to find out what's causing it.

Island Wave: Thanks for your reply as well. Unfortunately I'm not sure how to interpret the output of that command. :| I started managing sites as a hobby and now find myself knee-deep in Linux web server mumbo jumbo. This is the output from the command you suggested.


tcp        0      0 x.x.x.x:80         96.235.146.131:55126    FIN_WAIT2   0          0
tcp        0   3107 x.x.x.x:80         136.145.230.200:53377   FIN_WAIT1   0          0
tcp        0      0 x.x.x.x:80         76.87.11.98:65386       FIN_WAIT2   0          0
tcp        0      0 x.x.x.x:80         76.87.11.98:65387       FIN_WAIT2   0          0
tcp        0      0 x.x.x.x:80         74.192.55.134:64548     FIN_WAIT2   0          0
tcp        0      0 x.x.x.x:80         76.183.102.201:50998    FIN_WAIT2   0          0
tcp        0      1 x.x.x.x:80         115.135.181.200:49855   FIN_WAIT1   0          0
tcp        0      0 x.x.x.x:80         190.58.225.148:63012    FIN_WAIT2   0          0
tcp        0      0 x.x.x.x:80         148.87.19.214:45082     FIN_WAIT2   0          0
tcp        0      0 x.x.x.x:80         84.155.161.68:31831     FIN_WAIT2   0          0
tcp        0      0 x.x.x.x:80         115.135.181.200:49856   FIN_WAIT2   0          0
tcp        0      0 x.x.x.x:80         188.175.163.114:50308   FIN_WAIT2   0          0
tcp        0      0 x.x.x.x:80         84.155.161.68:27031     FIN_WAIT2   0          0
tcp        0      1 x.x.x.x:80         115.135.181.200:49849   FIN_WAIT1   0          0
tcp        0      0 x.x.x.x:80         50.8.222.60:65145       FIN_WAIT2   0          0
tcp        0      0 x.x.x.x:80         96.235.146.131:55108    FIN_WAIT2   0          0
tcp        0      0 x.x.x.x:80         71.31.215.220:61899     FIN_WAIT2   0          0
tcp        0      0 x.x.x.x:80         76.87.11.98:65384       FIN_WAIT2   0          0
tcp        0      0 x.x.x.x:80         84.155.161.68:45201     FIN_WAIT2   0          0
tcp        0      0 x.x.x.x:80         76.87.11.98:65389       FIN_WAIT2   0          0
tcp        0      0 x.x.x.x:80         96.235.146.131:55125    FIN_WAIT2   0          0
tcp        0      0 x.x.x.x:80         84.155.161.68:63610     FIN_WAIT2   0          0
tcp        0      0 x.x.x.x:80         129.111.181.228:51811   FIN_WAIT2   0          0
tcp        0      0 x.x.x.x:80         50.8.222.60:65146       FIN_WAIT2   0          0
tcp        0      1 x.x.x.x:80         190.58.225.148:63015    FIN_WAIT1   0          0
tcp        0      0 x.x.x.x:80         96.235.146.131:55107    FIN_WAIT2   0          0
tcp        0      0 x.x.x.x:80         84.155.161.68:41222     FIN_WAIT2   0          0
tcp        0      0 x.x.x.x:80         76.87.11.98:65385       FIN_WAIT2   0          0
tcp        0      0 x.x.x.x:80         148.87.19.214:45433     FIN_WAIT2   0          0
tcp        0      0 x.x.x.x:80         77.248.106.65:50299     FIN_WAIT2   0          0
tcp        0      0 x.x.x.x:80         115.135.181.200:49858   FIN_WAIT2   0          0
tcp        0      0 x.x.x.x:80         84.155.161.68:44790     FIN_WAIT2   0          0
tcp        0      0 x.x.x.x:80         190.58.225.148:63013    FIN_WAIT2   0          0
tcp        0      0 x.x.x.x:80         50.8.222.60:65144       FIN_WAIT2   0          0
tcp        0    345 x.x.x.x:80         180.76.5.27:7110        FIN_WAIT1   0          0
tcp        0   6252 x.x.x.x:80         82.120.14.250:61949     FIN_WAIT1   0          0
tcp        0      0 x.x.x.x:80         85.198.204.147:55111    FIN_WAIT2   0          0
tcp        0      1 x.x.x.x:80         189.203.69.92:33640     FIN_WAIT1   0          0
tcp        0      1 x.x.x.x:80         85.198.204.147:55114    FIN_WAIT1   0          0
tcp        0      0 x.x.x.x:80         50.8.222.60:65142       FIN_WAIT2   0          0


This is across 3 forums that usually have at least 5-10 members and 10-30 guests/spiders on each throughout the day, if that gives the output a little more context.

clanssd

After some troubleshooting, I've found that this happens anytime someone tries to download an attachment larger than 4 MB or when SMF tries to generate a thumbnail for a file attachment larger than 4 MB. I don't have a vast knowledge of the way SMF works, but I assume this has to do with the fact that it sends all attachments through PHP?

I know the simple answer is to simply lower the max attachment size to 4MB but this is a problem with an existing forum. Anyone trying to download existing attachments over 4 MB ends up crashing a PHP process which I'm pretty sure is causing 502 errors in nginx. I've looked through php-fpm.conf, nginx.conf, php.ini, and fastcgi_params but I don't see where this 4 MB number is coming from. This wasn't an issue when we were using Apache with mod_php. Again, I greatly appreciate any insight anyone reading might be able to provide. Thanks!

Edit: Wanted to note that there is no problem uploading files larger than 4 MB.

butch2k

You can edit the SMF code so that X-Sendfile is used instead of the standard download procedure.
I run the same combo here and have no issues at all.

You might want to check your nginx config as well, especially the buffers. Here is my nginx conf for PHP files:


        location ~ \.php$ {
            sendfile on;
            fastcgi_pass   127.0.0.1:9000;
            root   /var/www/vhosts/xxxxxxxxxxxxxx.com/httpdocs;
            fastcgi_param  SCRIPT_FILENAME  $document_root$fastcgi_script_name;
            fastcgi_index  index.php;
            include        fastcgi_params;
            fastcgi_connect_timeout 60;
            fastcgi_send_timeout 180;
            fastcgi_read_timeout 180;
            fastcgi_buffer_size 128k;
            fastcgi_buffers 8 256k;
            fastcgi_busy_buffers_size 256k;
            fastcgi_temp_file_write_size 256k;
            fastcgi_intercept_errors on;
        }

clanssd

Hey butch, thanks for the suggestion! That seems to have fixed a lot of the issues with attachments we were having, although the modification wasn't straightforward. It kept doubling the attachments path in $filename, so I had to set the attachment path to be relative rather than absolute. Then it wasn't adding a / before the attachment folder name, so I had to do a str_replace on $filename to fix that. I'm not entirely sure why the $filename variable got weird like that, but it works in the end so I guess that's good.

If anyone else should run into this problem, I used the information found at this page to modify Display.php:
hxxp:thomasfischer.biz/?p=364 [nonactive]

The only difference is to use X-Accel-Redirect instead of X-Sendfile and then read the correction posted at the bottom.

butch2k

I used the same str_replace trick. I need to investigate the doubled path further.

Something like that

Quote from: clanssd on September 03, 2011, 11:07:36 PM
After some troubleshooting, I've found that this happens anytime someone tries to download an attachment larger than 4 MB or when SMF tries to generate a thumbnail for a file attachment larger than 4 MB. I don't have a vast knowledge of the way SMF works, but I assume this has to do with the fact that it sends all attachments through PHP?

I know the simple answer is to simply lower the max attachment size to 4MB but this is a problem with an existing forum. Anyone trying to download existing attachments over 4 MB ends up crashing a PHP process which I'm pretty sure is causing 502 errors in nginx. I've looked through php-fpm.conf, nginx.conf, php.ini, and fastcgi_params but I don't see where this 4 MB number is coming from. This wasn't an issue when we were using Apache with mod_php. Again, I greatly appreciate any insight anyone reading might be able to provide. Thanks!

Edit: Wanted to note that there is no problem uploading files larger than 4 MB.

In your php.ini, check the memory_limit setting. I recommend 48M.

In php-fpm.conf, try adding php_admin_value[memory_limit] = 48M

Last I checked, SMF still chokes if you include a high pixel count image with an img tag, because it tries to download the image and load it into memory in PHP, and the process dies when it runs out of memory.
