SMF Version: SMF 1.1 RC3
We have a problem where our server's load shoots sky-high. We managed to track the problem down and narrowed it to SMF. When we turn off SMF, everything runs fine. Once we start SMF, and even when only 1 person tries to log in, the load starts to build, from 1.0 to 10 to 30 to 50 to 100, and eventually the server crawls to a halt.
The process causing the load seems to be httpd (Apache 1.3.x). In top, we see httpd processes piling up memory to over 90 MB per process. I saw 3 processes over 90 MB while monitoring this problem.
We already did some Apache optimization, such as turning off host lookups and so on. We also did MySQL optimization.
My question is, why would only 1 user logging on cause such a big problem? We are also running SMF on another website on the SAME server and it runs fine, though that forum has very few users and posts.
The forum with the problem was ported from phpBB. We installed Joomla with a bridge to SMF and converted the phpBB board, containing around 100,000 users, to SMF. Could that be the reason it is slow?
We haven't tried things like eAccelerator or InnoDB. But would that really help in our situation? The thing is, before the lockup, I do see MySQL processes being locked. Does that mean SMF is running those queries and, since it's using MyISAM, those processes are locked until the queries are done? But I only see httpd processes taking up all the memory. MySQL still seems to be fine.
Anyway, feel free to suggest anything. Thanks in advance.
More information about the server:
Dedicated Server P4 2.8c
1 GB RAM
HDD space (enough)
WHM 10.8.0 cPanel 10.9.0-R34
Apache 1.3.x
PHP 4.4.4
mysql 4.1.12
mailscanner
Installed eAccelerator and still have the problem.
Once we enable the forum and log on, the load starts to go up, up, up and away....
Edit:
Also converted some of the tables to InnoDB as suggested in the forum, but no go. Still the same high load...
17:16:59 up 1 day, 19:39, 2 users, load average: 19.62, 8.62, 3.94
325 processes: 315 sleeping, 6 running, 3 zombie, 1 stopped
CPU states: cpu user nice system irq softirq iowait idle
total 89.4% 0.0% 9.0% 0.3% 1.1% 0.0% 0.0%
cpu00 85.7% 0.0% 12.7% 0.7% 0.7% 0.0% 0.0%
cpu01 93.1% 0.0% 5.3% 0.0% 1.5% 0.0% 0.0%
Mem: 1015908k av, 995980k used, 19928k free, 0k shrd, 6812k buff
758620k actv, 142600k in_d, 12616k in_c
Swap: 2048276k av, 687008k used, 1361268k free 78396k cached
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
8649 nobody 21 0 93260 84M 5616 D 0.0 8.5 0:30 1 httpd
1685 nobody 25 0 96544 81M 8844 D 0.7 8.2 0:34 0 httpd
1683 nobody 15 0 83460 78M 8844 D 0.3 7.8 1:08 0 httpd
8652 nobody 15 0 79900 76M 5840 S 0.3 7.7 0:05 1 httpd
7185 nobody 25 0 93648 71M 5980 R 22.2 7.1 1:00 0 httpd
1677 nobody 25 0 92068 68M 8848 R 21.8 6.9 0:30 1 httpd
1674 nobody 25 0 92072 66M 9172 D 14.3 6.7 0:36 0 httpd
1668 nobody 15 0 77744 61M 7964 S 0.0 6.1 0:35 0 httpd
1680 nobody 15 0 91332 38M 8496 D 0.0 3.8 1:36 0 httpd
1230 mysql 15 0 76584 32M 2740 S 0.0 3.3 0:00 0 mysqld
1233 mysql 25 0 76584 32M 2740 S 0.0 3.3 0:00 0 mysqld
1234 mysql 15 0 76584 32M 2740 S 0.0 3.3 0:00 0 mysqld
1235 mysql 15 0 76584 32M 2740 S 0.7 3.3 0:15 0 mysqld
1236 mysql 15 0 76584 32M 2740 S 0.0 3.3 0:00 0 mysqld
1237 mysql 15 0 76584 32M 2740 S 0.0 3.3 0:00 0 mysqld
1238 mysql 15 0 76584 32M 2740 S 0.0 3.3 0:00 1 mysqld
1239 mysql 15 0 76584 32M 2740 S 0.0 3.3 0:00 0 mysqld
1240 mysql 15 0 76584 32M 2740 S 0.0 3.3 0:00 1 mysqld
1241 mysql 15 0 76584 32M 2740 S 0.0 3.3 0:01 1 mysqld
1243 mysql 15 0 76584 32M 2740 S 0.0 3.3 0:00 1 mysqld
1247 mysql 15 0 76584 32M 2740 S 0.0 3.3 0:02 1 mysqld
1253 mysql 15 0 76584 32M 2740 S 0.0 3.3 0:12 0 mysqld
1254 mysql 15 0 76584 32M 2740 S 0.0 3.3 0:08 0 mysqld
1255 mysql 15 0 76584 32M 2740 S 0.0 3.3 0:15 1 mysqld
1259 mysql 15 0 76584 32M 2740 S 0.0 3.3 0:02 1 mysqld
Edit:
One thing I noticed is that when the load starts getting very high, as mentioned, some httpd processes start using lots of memory as well. In one case, I saw three httpd processes at around 80 MB each. Then, to avoid crashing the system, I put the SMF forum into maintenance mode. Once it is in maintenance mode, the load starts to go down, back to less than 1.0 as usual.
But I still see those few httpd processes of around 80 MB hovering in top. Only when I restart Apache do these httpd processes go away.
Any ideas?
3 zombies; is there anything in the error log? I'd suspect one of your log files is over 2GB.
Edit - never saw the bit about locked processes. Which ones are they?
Hello,
With that large a forum, I'm pretty sure your server is thrashing... and that's the reason for the high load. I've had servers with loads going up to 30 - 80, just because of memory exhaustion.
From your top output, it appears you're using about 670 MB of swap space, which is quite a lot. Not to mention, 1GB of memory is very, very little for 325 processes.
Best Regards,
Christian A. Herrnboeck
MonteCarloHosting.Net
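The RAM and swap figures can be read straight off the two memory lines of the top output posted above. What follows is only a small sketch of that arithmetic: the helper name `used_kib` is made up for illustration, and it assumes the `k` figures in this old top format are KiB.

```python
import re

# The two memory lines, copied from the top output posted earlier in the thread.
TOP_SNIPPET = """\
Mem: 1015908k av, 995980k used, 19928k free, 0k shrd, 6812k buff
Swap: 2048276k av, 687008k used, 1361268k free 78396k cached
"""


def used_kib(label, text):
    """Return the 'used' figure (assumed KiB) from a top 'Mem:' or 'Swap:' line."""
    match = re.search(rf"^{label}:.*?(\d+)k used", text, flags=re.MULTILINE)
    if match is None:
        raise ValueError(f"no {label} line found")
    return int(match.group(1))


mem_used_mib = used_kib("Mem", TOP_SNIPPET) / 1024
swap_used_mib = used_kib("Swap", TOP_SNIPPET) / 1024

print(f"RAM used:  {mem_used_mib:.0f} MiB")
print(f"Swap used: {swap_used_mib:.0f} MiB")
```

With nearly all RAM in use and several hundred MiB pushed out to swap, heavy swapping is at least a plausible contributor to the load, although the numbers alone do not say which processes are responsible.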
He does say he only had one user logged in, which indicates something else is wrong. I'd assume, as it's a conversion from phpBB, that this problem has only started happening since the conversion.
Ben,
The original poster is (I think) referring to the one user logged into the system. When you open TOP, you'll see X users logged in - these are not forum users, these are users (people, usually) logged into the server. With over 300 processes running, you MUST have a heck of a lot more users on your forum than just one ;)
To help diagnose this, post the output of:
ps aux
Oh, and make sure to use code tags... or things could get ugly ;)
Best Regards,
Christian A. Herrnboeck
PS: Yes, it could very well be after the conversion that he started noticing it. (Not wanting to start a flame war) but a standard phpBB 2.0.x series board is about half as heavy as SMF is on system resources.
Quote from: ChristianH. on October 12, 2006, 12:21:46 PM
The original poster is (I think) referring to the one user logged into the system. When you open TOP, you'll see X users logged in - these are not forum users, these are users (people, usually) logged into the server. With over 300 processes running, you MUST have a heck of a lot more users on your forum than just one ;)
Thanks for the explanation, although I am quite capable of reading the output of top.
Quote from: original poster
My question is, why would only 1 user logging on cause such a big problem? We are also running SMF on another website on the SAME server and it runs fine, though that forum has very few users and posts.
Quite clearly he is referring to one user active on this particular message board.
Quote
PS: Yes, it could very well be after the conversion that he started noticing it. (Not wanting to start a flame war) but a standard phpBB 2.0.x series board is about half as heavy as SMF is on system resources.
I'd be interested in viewing your benchmark results that show that, given there have been numerous people who have converted from phpBB and reported significantly improved performance and lower server load.
Ben,
Go get a beer :D
Quote
Thanks for the explanation, although I am quite capable of reading the output of top.
You may want to keep in mind that this is a public message board. I was trying to explain TOP's output to others who may read this topic.
Quote
Quite clearly he is refering to one user active on this particular message board.
One user on a forum is not able to build that much load, unless Apache is having serious core dumps. I'd suggest running a trace on the Apache process, or even installing phpSuExec, to see what's going on. Normally, a forum which has only one active member will not cause a server to have over 300 processes running - that's just impossible.
Quote
I'd be interested in viewing your benchmark results that show that, given there have been numerous people who have converted from phpBB and reported significantly improved performance and lower server load.
I'm a host - I have multiple servers, and clients who use everything from phpBB to SMF, to vB, and who use all *nuke variants, Joomla, you name it, they run it. From experience, phpBB 2.0.x (unmodded) is lighter on memory. (Try it yourself, on your server, install phpSuExec, and install one phpBB and one SMF board. Request index.php from both, and monitor memory usage for each through top. You'll see the difference).
Now, can we *please* keep this thread on topic? :P
-Christian
No logs over 2GB. Checked.
When I say 1 user, I mean only 1 user using the forum. We just turned it on and logged in to the forum, and the problem started. Unless other users are hammering the forum all the time, which would add more users, AFAIK only 1 user was logged on.
There are other websites running on the server as well. Those are low-load sites serving static pages.
Here is the output of ps aux:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 1528 240 ? S Oct10 0:05 init [3]
root 2 0.0 0.0 0 0 ? SW Oct10 0:00 [migration/0]
root 3 0.0 0.0 0 0 ? SW Oct10 0:00 [migration/1]
root 4 0.0 0.0 0 0 ? SW Oct10 0:00 [keventd]
root 5 0.0 0.0 0 0 ? SWN Oct10 0:00 [ksoftirqd/0]
root 6 0.0 0.0 0 0 ? SWN Oct10 0:00 [ksoftirqd/1]
root 9 0.0 0.0 0 0 ? SW Oct10 0:01 [bdflush]
root 7 0.0 0.0 0 0 ? SW Oct10 2:06 [kswapd]
root 8 0.1 0.0 0 0 ? SW Oct10 3:38 [kscand]
root 10 0.0 0.0 0 0 ? SW Oct10 0:11 [kupdated]
root 11 0.0 0.0 0 0 ? SW Oct10 0:00 [mdrecoveryd]
root 19 0.0 0.0 0 0 ? SW Oct10 0:00 [scsi_eh_0]
root 20 0.0 0.0 0 0 ? SW Oct10 0:00 [scsi_eh_1]
root 23 0.0 0.0 0 0 ? SW Oct10 1:42 [kjournald]
root 101 0.0 0.0 0 0 ? SW Oct10 0:00 [khubd]
root 2365 0.0 0.0 0 0 ? SW Oct10 0:00 [kjournald]
root 2366 0.0 0.0 0 0 ? SW Oct10 0:05 [kjournald]
root 2367 0.0 0.0 0 0 ? SW Oct10 0:11 [kjournald]
root 2846 0.0 0.0 1584 292 ? S Oct10 0:08 syslogd -m 0
root 2850 0.0 0.0 1540 236 ? S Oct10 0:00 klogd -x
root 2860 0.0 0.0 1520 248 ? S Oct10 0:08 irqbalance
root 2868 0.0 0.0 1516 176 ? S Oct10 0:00 /usr/sbin/courierlogger -pid=/var/spool/authdaemon/pid -facility=mail -start
root 2869 0.0 0.0 1840 208 ? S Oct10 0:00 /usr/libexec/courier-authlib/authdaemond
root 2879 0.0 0.0 1840 280 ? S Oct10 0:00 /usr/libexec/courier-authlib/authdaemond
root 2880 0.0 0.0 1840 288 ? S Oct10 0:00 /usr/libexec/courier-authlib/authdaemond
root 2881 0.0 0.0 1840 284 ? S Oct10 0:00 /usr/libexec/courier-authlib/authdaemond
root 2882 0.0 0.0 1840 292 ? S Oct10 0:00 /usr/libexec/courier-authlib/authdaemond
root 2883 0.0 0.0 1840 288 ? S Oct10 0:00 /usr/libexec/courier-authlib/authdaemond
root 2884 0.0 0.0 1592 184 ? S Oct10 0:00 mdadm --monitor --scan -f
root 2937 0.0 0.0 2024 212 ttyS2 S Oct10 0:00 pppd call racser
racvnc 3016 0.0 0.0 5092 292 ? S Oct10 0:00 racXvnc :1 -desktop Dell_Remote_Service -auth /var/racvnc/.Xauthority -geome
racvnc 3029 0.0 0.0 9344 204 ? S Oct10 0:00 xterm -display :1 -geometry 80x24+10+10 -ls -bg white -fg black -title Dell_
racvnc 3030 0.0 0.0 7904 192 ? S Oct10 0:00 twm -display :1 -f /var/racvnc/rac.twmrc
racvnc 3118 0.0 0.0 4252 212 pts/0 S Oct10 0:00 -bash
root 3136 0.0 0.0 9756 224 ? S Oct10 0:00 /usr/sbin/snmpd -s -l /dev/null -P /var/run/snmpd -a
named 13828 0.0 0.1 47624 1148 ? S Oct10 0:00 /usr/sbin/named -u named
root 13844 0.0 0.0 3668 376 ? S Oct10 0:01 /usr/sbin/sshd
root 13858 0.0 0.0 2144 216 ? S Oct10 0:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
root 13883 0.0 0.1 11816 1412 ? S Oct10 0:04 chkservd
root 13894 0.0 0.0 1516 260 ? S Oct10 0:00 /usr/sbin/courierlogger -pid=/var/run/imapd.pid -start -name=imapd /usr/lib/
root 13895 0.0 0.0 1648 220 ? S Oct10 0:00 /usr/lib/courier-imap/libexec/couriertcpd -address=0 -maxprocs=40 -maxperip=
root 13901 0.0 0.0 1512 128 ? S Oct10 0:00 /usr/sbin/courierlogger -pid=/var/run/imapd-ssl.pid -start -name=imapd-ssl /
root 13902 0.0 0.0 1628 144 ? S Oct10 0:00 /usr/lib/courier-imap/libexec/couriertcpd -address=0 -maxprocs=40 -maxperip=
root 13910 0.0 0.0 1516 260 ? S Oct10 0:00 /usr/sbin/courierlogger -pid=/var/run/pop3d.pid -start -name=pop3d /usr/lib/
root 13911 0.0 0.0 1640 220 ? S Oct10 0:01 /usr/lib/courier-imap/libexec/couriertcpd -address=0 -maxprocs=40 -maxperip=
root 13917 0.0 0.0 1512 128 ? S Oct10 0:00 /usr/sbin/courierlogger -pid=/var/run/pop3d-ssl.pid -start -name=pop3d-ssl /
root 13918 0.0 0.0 1632 144 ? S Oct10 0:00 /usr/lib/courier-imap/libexec/couriertcpd -address=0 -maxprocs=40 -maxperip=
mailnull 13977 0.0 0.0 6636 364 ? S Oct10 0:23 /usr/sbin/exim -bd
mailnull 13984 0.0 0.0 6604 300 ? S Oct10 0:00 /usr/sbin/exim -C /etc/exim_outgoing.conf -q60m
mailnull 13989 0.0 0.0 6608 204 ? S Oct10 0:00 /usr/sbin/exim -tls-on-connect -bd -oX 465
root 13996 0.0 0.0 3068 780 ? S Oct10 0:18 antirelayd
mailnull 14227 0.0 0.0 21628 720 ? S Oct10 0:00 MailScanner: starting child
root 14363 0.0 0.0 1600 372 ? S Oct10 0:00 crond
root 14375 0.0 0.0 6056 308 ? S Oct10 0:00 pure-ftpd (SERVER)
root 14379 0.0 0.0 5596 184 ? S Oct10 0:00 /usr/sbin/pure-authd -s /var/run/ftpd.sock -r /usr/sbin/pureauth
xfs 14414 0.0 0.0 5364 240 ? S Oct10 0:00 xfs -droppriv -daemon
root 14648 0.0 0.1 6920 1552 ? S Oct10 0:12 cpbandwd
root 14649 0.0 1.2 17984 12348 ? SN Oct10 0:53 cpanellogd - sleeping for logs
nobody 14673 0.0 0.0 3808 156 ? S Oct10 0:00 entropychat
nobody 14677 0.0 0.0 1740 128 ? S Oct10 0:00 /usr/local/cpanel/bin/startmelange
mailman 15026 0.0 0.0 9920 248 ? S Oct10 0:00 /usr/local/bin/python2.4 /usr/local/cpanel/3rdparty/mailman/bin/mailmanctl -
root 15034 0.0 0.0 3560 248 ? S Oct10 0:00 rhnsd --interval 240
mailman 15042 0.0 0.0 9904 776 ? S Oct10 0:26 /usr/local/bin/python2.4 /usr/local/cpanel/3rdparty/mailman/bin/qrunner --ru
mailman 15043 0.0 0.0 9916 812 ? S Oct10 0:27 /usr/local/bin/python2.4 /usr/local/cpanel/3rdparty/mailman/bin/qrunner --ru
mailman 15044 0.0 0.0 9900 776 ? S Oct10 0:26 /usr/local/bin/python2.4 /usr/local/cpanel/3rdparty/mailman/bin/qrunner --ru
mailman 15045 0.0 0.2 9940 2484 ? S Oct10 0:27 /usr/local/bin/python2.4 /usr/local/cpanel/3rdparty/mailman/bin/qrunner --ru
mailman 15046 0.0 0.0 9896 812 ? S Oct10 0:28 /usr/local/bin/python2.4 /usr/local/cpanel/3rdparty/mailman/bin/qrunner --ru
mailman 15047 0.0 0.3 10084 3068 ? S Oct10 0:29 /usr/local/bin/python2.4 /usr/local/cpanel/3rdparty/mailman/bin/qrunner --ru
mailman 15048 0.0 0.1 9912 1960 ? S Oct10 0:28 /usr/local/bin/python2.4 /usr/local/cpanel/3rdparty/mailman/bin/qrunner --ru
mailman 15049 0.0 0.0 9924 812 ? S Oct10 0:00 /usr/local/bin/python2.4 /usr/local/cpanel/3rdparty/mailman/bin/qrunner --ru
root 15064 0.0 0.0 1532 180 ? S Oct10 0:00 /usr/sbin/portsentry -tcp
root 15109 0.0 0.0 3396 268 ? S Oct10 0:03 /usr/local/wofs-urchin/bin/urchinwebd -f /usr/local/wofs-urchin/var/urchinwe
nobody 15110 0.0 0.0 3420 216 ? S Oct10 0:00 /usr/local/wofs-urchin/bin/urchinwebd -f /usr/local/wofs-urchin/var/urchinwe
nobody 15111 0.0 0.0 3416 216 ? S Oct10 0:00 /usr/local/wofs-urchin/bin/urchinwebd -f /usr/local/wofs-urchin/var/urchinwe
nobody 15112 0.0 0.0 3416 216 ? S Oct10 0:00 /usr/local/wofs-urchin/bin/urchinwebd -f /usr/local/wofs-urchin/var/urchinwe
nobody 15113 0.0 0.0 3416 216 ? S Oct10 0:00 /usr/local/wofs-urchin/bin/urchinwebd -f /usr/local/wofs-urchin/var/urchinwe
nobody 15114 0.0 0.0 3416 216 ? S Oct10 0:00 /usr/local/wofs-urchin/bin/urchinwebd -f /usr/local/wofs-urchin/var/urchinwe
nobody 15127 0.0 0.0 776 360 ? S Oct10 0:14 /usr/local/wofs-urchin/bin/urchind
root 15129 0.0 0.0 1500 152 tty1 S Oct10 0:00 /sbin/mingetty tty1
root 15130 0.0 0.0 1500 152 tty2 S Oct10 0:00 /sbin/mingetty tty2
root 15131 0.0 0.0 1500 152 tty3 S Oct10 0:00 /sbin/mingetty tty3
root 15132 0.0 0.0 1500 152 tty4 S Oct10 0:00 /sbin/mingetty tty4
root 15133 0.0 0.0 1500 152 tty5 S Oct10 0:00 /sbin/mingetty tty5
root 15134 0.0 0.0 1500 152 tty6 S Oct10 0:00 /sbin/mingetty tty6
root 15135 0.0 0.0 1532 140 ttyS0 S Oct10 0:00 /sbin/agetty -L 9600 ttyS0 vt100
mailnull 4631 0.0 0.2 9976 2060 ? S Oct11 0:09 eximstats
nobody 22794 0.0 0.0 3416 244 ? S Oct12 0:00 /usr/local/wofs-urchin/bin/urchinwebd -f /usr/local/wofs-urchin/var/urchinwe
nobody 22795 0.0 0.0 3416 244 ? S Oct12 0:00 /usr/local/wofs-urchin/bin/urchinwebd -f /usr/local/wofs-urchin/var/urchinwe
root 28981 0.0 0.3 30300 3968 ? S Oct12 0:19 /usr/local/apache/bin/httpd
root 24544 0.0 0.1 11564 1620 ? S Oct12 0:03 /etc/authlib/authProg
root 1209 0.0 0.0 4256 300 ? S Oct12 0:00 /bin/sh /usr/bin/mysqld_safe --datadir=/var/lib/mysql --pid-file=/var/lib/my
mysql 1230 2.8 5.5 428016 56344 ? S Oct12 30:31 /usr/sbin/mysqld --basedir=/ --datadir=/var/lib/mysql --user=mysql --pid-fil
root 9117 0.0 0.3 11568 4000 ? S Oct12 0:02 /etc/authlib/authProg
root 9120 0.0 0.3 11568 3340 ? S Oct12 0:02 /etc/authlib/authProg
root 9123 0.0 0.3 11564 3716 ? S Oct12 0:02 /etc/authlib/authProg
root 18773 0.0 0.4 11560 4820 ? S Oct12 0:01 /etc/authlib/authProg
root 26889 0.0 0.4 17312 4148 ? S 02:40 0:04 cpsrvd - waiting for connections
root 12033 0.0 0.1 7852 1664 ? S 04:03 0:00 /usr/bin/perl /usr/local/cpanel/bin/leechprotect
nobody 12036 0.0 1.5 32732 15736 ? S 04:03 0:21 /usr/local/apache/bin/httpd
nobody 12037 0.1 1.7 33472 17308 ? S 04:03 0:25 /usr/local/apache/bin/httpd
nobody 12038 0.1 1.5 32692 15552 ? S 04:03 0:22 /usr/local/apache/bin/httpd
nobody 12039 0.0 1.8 37636 18824 ? S 04:03 0:20 /usr/local/apache/bin/httpd
nobody 12040 0.1 1.6 32728 16704 ? S 04:03 0:25 /usr/local/apache/bin/httpd
nobody 12052 0.1 1.5 32796 15552 ? S 04:03 0:23 /usr/local/apache/bin/httpd
nobody 12064 0.0 1.5 32668 15680 ? S 04:03 0:20 /usr/local/apache/bin/httpd
nobody 12065 0.1 1.5 33128 15652 ? S 04:03 0:22 /usr/local/apache/bin/httpd
nobody 12139 0.1 1.6 32636 16684 ? S 04:03 0:23 /usr/local/apache/bin/httpd
nobody 12161 0.0 1.4 32612 15136 ? S 04:03 0:21 /usr/local/apache/bin/httpd
nobody 23381 0.0 1.4 32656 15168 ? S 07:10 0:10 /usr/local/apache/bin/httpd
nobody 26195 0.1 1.5 32388 15416 ? S 07:52 0:08 /usr/local/apache/bin/httpd
nobody 27011 0.1 1.6 33608 16440 ? S 08:02 0:09 /usr/local/apache/bin/httpd
nobody 32753 0.1 1.3 32196 13792 ? S 09:26 0:03 /usr/local/apache/bin/httpd
nobody 32755 0.1 1.4 33268 15216 ? S 09:26 0:02 /usr/local/apache/bin/httpd
nobody 450 0.1 1.2 32148 13040 ? S 09:28 0:03 /usr/local/apache/bin/httpd
nobody 1850 0.0 1.2 32092 12480 ? S 09:40 0:00 /usr/local/apache/bin/httpd
nobody 2139 0.1 1.2 32932 12696 ? S 09:44 0:01 /usr/local/apache/bin/httpd
nobody 2141 0.1 0.0 0 0 ? Z 09:44 0:01 [httpd <defunct>]
nobody 2250 0.1 1.1 32244 11520 ? S 09:46 0:01 /usr/local/apache/bin/httpd
nobody 3556 0.1 1.1 32716 11480 ? S 09:59 0:00 /usr/local/apache/bin/httpd
nobody 3557 0.1 1.1 31864 11836 ? S 09:59 0:01 /usr/local/apache/bin/httpd
nobody 3558 0.1 1.0 32184 10588 ? S 09:59 0:00 /usr/local/apache/bin/httpd
root 3572 0.0 0.1 6880 1748 ? S 09:59 0:00 sshd: admin [priv]
admin 3577 0.0 0.1 7056 1992 ? S 09:59 0:00 sshd: admin@pts/1
admin 3579 0.0 0.1 4248 1352 pts/1 S 09:59 0:00 -bash
root 3616 0.0 0.0 4224 968 pts/1 S 09:59 0:00 su -
root 3623 0.0 0.1 4252 1364 pts/1 S 09:59 0:00 -bash
nobody 4365 0.0 0.9 31900 9692 ? S 10:02 0:00 /usr/local/apache/bin/httpd
mailnull 4399 0.8 5.0 61660 51560 ? S 10:02 0:03 MailScanner: waiting for messages
nobody 4571 0.1 0.9 31844 9524 ? S 10:04 0:00 /usr/local/apache/bin/httpd
nobody 4572 0.1 0.9 32252 9844 ? S 10:04 0:00 /usr/local/apache/bin/httpd
nobody 4653 0.0 0.8 31836 8908 ? S 10:05 0:00 /usr/local/apache/bin/httpd
nobody 4656 0.1 1.0 31904 10264 ? S 10:05 0:00 /usr/local/apache/bin/httpd
mailnull 4835 2.6 4.9 61172 50752 ? S 10:07 0:03 MailScanner: waiting for messages
mailnull 4845 2.8 5.0 61580 51372 ? S 10:07 0:03 MailScanner: waiting for messages
mailnull 4847 0.0 0.7 23400 7580 ? S 10:07 0:00 MailWatch SQL
nobody 4912 0.0 0.4 30696 5064 ? S 10:08 0:00 /usr/local/apache/bin/httpd
mailnull 4925 0.5 0.0 0 0 ? Z 10:08 0:00 [MailScanner <defunct>]
mailnull 4961 2.6 0.0 0 0 ? Z 10:09 0:00 [MailScanner <defunct>]
mailnull 4969 0.0 0.0 6648 1012 ? S 10:09 0:00 /usr/sbin/exim -bd
mailnull 4970 1.5 0.2 7460 2960 ? S 10:09 0:00 /usr/sbin/exim -bd
mailnull 4971 1.0 0.2 7048 2232 ? R 10:09 0:00 /usr/sbin/exim -bd
root 4972 0.0 0.0 2864 876 pts/1 R 10:09 0:00 ps aux
Thanks for all the help and suggestions.
Is this server using cPanel/WHM? If so, go to "Apache Update", then "Load Previous Config", and make sure the latest stable version of your PHP branch is selected (4.4.4 for the 4.x line, and 5.1.6 for the 5.x line). Also, select "Enable phpSuExec support".
Once that's done, activate the forum, go to the index, reload the page a few times, and then exit. While doing that, take both a top and a ps aux, and post the output of both.
Regards,
Christian
Panic panic panic....
The rebuild of Apache failed. I can't start the httpd service now. Rebuilding again........
Rebuild in VERBOSE mode, and post the output. Something is screwed on the server.
Regards,
Christian
If I compile without phpSuExec, it works. Once I enable phpSuExec in the build, httpd fails to start after compilation. I don't see any errors in the compilation either.
Here is the output as an attachment.
www.wofs.net/easyapache.zip (http://www.wofs.net/easyapache.zip)
Another strange thing I noticed is that while Apache is down, I can still access WHM?? How is that possible? Anyway, I compiled back to a build without phpSuExec.
So... is something wrong with my Apache?
Quote from: ChristianH. on October 12, 2006, 08:24:10 PM
From experience, phpBB 2.0.x (unmodded) is lighter on memory. (Try it yourself, on your server, install phpSuExec, and install one phpBB and one SMF board. Request index.php from both, and monitor memory usage for each through top. You'll see the difference).
And from real-world experiences people have had, the opposite is true. Perhaps an empty phpBB is more efficient, but it certainly doesn't scale well. The same is true of vB too; a search on the vB site will highlight the number of problems ex-SMF users have over there with server load after converting. But yes, that is off topic.
@abubin, was there anything in the Apache error log that may give any clues? What were the locked MySQL queries you were seeing?
Access the board and run mysqladmin processlist, and if there are any locked queries, paste them here.
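For anyone who wants to pull only the locked queries out of a long process list, here is a rough sketch. It assumes the usual pipe-delimited table that `mysqladmin processlist` prints and that a blocked query shows `Locked` in the State column (what MySQL versions of this era report for table locks; newer versions word it differently). The sample rows, database name and function name are made up for illustration and are not from the poster's server.

```python
# Hypothetical sample in the shape of `mysqladmin processlist` output.
SAMPLE = """\
+----+-------+-----------+---------+---------+------+--------+----------------------------+
| Id | User  | Host      | db      | Command | Time | State  | Info                       |
+----+-------+-----------+---------+---------+------+--------+----------------------------+
| 11 | forum | localhost | forumdb | Query   | 42   | Locked | UPDATE smf_members SET ... |
| 12 | forum | localhost | forumdb | Sleep   | 3    |        |                            |
| 13 | forum | localhost | forumdb | Query   | 0    |        | show processlist           |
+----+-------+-----------+---------+---------+------+--------+----------------------------+
"""


def locked_queries(processlist_text):
    """Return the rows (as dicts keyed by column name) whose State is 'Locked'."""
    header = None
    locked = []
    for line in processlist_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the +----+ border lines and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first pipe-delimited line is the column header
            continue
        row = dict(zip(header, cells))
        if row.get("State") == "Locked":
            locked.append(row)
    return locked


for row in locked_queries(SAMPLE):
    print(row["Id"], row["Time"], row["Info"])
```

Splitting on `|` is a simplification: a query whose text itself contains a pipe character would be split into too many cells, so treat this as a quick filter rather than a robust parser.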
The Apache error logs don't show anything. All I see there are errors accessing 404.shtml and favicon.ico. I already remedied that and still have the same problem.
As for the locked MySQL processes, they went away after converting the tables to InnoDB. As stated in the SMF optimization thread, InnoDB helps reduce the locked status of tables.
When I look at the MySQL processlist, most of the time I see sleeping processes. As you can see, the load is caused by httpd. MySQL is running stably. At first I also suspected MySQL, but after much tweaking and monitoring, MySQL seems to be fine, because I don't see any MySQL process hogging resources. When the httpd processes are taking 80 MB each and the CPU is at 100%, I don't see MySQL taking any load.
Now I am thinking maybe the Joomla-SMF bridge is causing the problem, because when I go directly to http://domain.com/forum/ everything runs fine. But when I log in to Joomla and then click on the forum, the load starts to go up.
Could it be some problem with the Joomla-SMF bridge with CB?
I'm not aware of any issues with the bridge, but then I don't use it and haven't really followed its progress.
Best to post in the dedicated board for it here: Mambo/Joomla Bridge Support (http://www.simplemachines.org/community/index.php?board=77.0)
Hello,
Quote
Another strange thing I notice is that while apache is down, I can still access whm?? How's that possible? Anyway, I compiled back to without phpsuexec.
cPanel and WHM run under their own server; you'll see one or more cpsrvd processes in your top and ps aux output - that's cPanel/WHM.
Regarding your problem, just poking around with guesses won't really help.
If you can't install phpSuExec, then I'd suggest loading up the forum, and in WHM, click on "Process List", then, click on the PID of the process.
It'll do a trace, and we'll be able to actually see what's going on - without that, it'll be impossible to take care of this.
Also, an Apache/PHP process can not normally take that much memory, as by default php.ini limits it to 8MB - unless you changed this.
Ben:
I'm not going to discuss things like this... I know what I deal with on a day-to-day basis, I'm not criticizing SMF, merely pointing out performance stats I've seen. (You should know I'm a fan of SMF :P )
Best Regards,
Christian A. Herrnboeck
I found that post_max was set to 55M. I changed it back to 8M, but that darn big httpd process is still around when I start using SMF through Joomla again.
With help, I managed to find the problem with enabling phpSuExec. There is a line "php_admin_flag safe_mode off" in httpd.conf that was causing Apache to refuse to start. After removing that line, Apache started fine.
Now I have phpSuExec running. But how do I use phpSuExec for my problem?
Okay, with phpSuExec, I can see PHP processes in top showing which domain is running PHP scripts.
Then I proceeded with testing again, and I still see an httpd process running at 90 MB.
Hello,
You set the wrong php.ini value... it's this line:
memory_limit = 8M ; Maximum amount of memory a script may consume (8MB)
Also, can you now please post top output? I need to know if it's the httpd -DSSL command that's using 90MB, or the php command.
If it's PHP, it's related to your forum, if it's Apache - then there's something else wrong on the system.
Best Regards,
Christian A. Herrnboeck
MonteCarloHosting.Net
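The two php.ini directives being mixed up here are easy to check side by side. The sketch below is only an illustration: it assumes the "post_max" mentioned earlier is PHP's `post_max_size` directive, the sample text is hypothetical apart from the `memory_limit` line quoted above, and the helper names are made up.

```python
# Hypothetical php.ini excerpt; only the memory_limit line is quoted from the thread.
SAMPLE_INI = """\
[PHP]
memory_limit = 8M      ; Maximum amount of memory a script may consume (8MB)
post_max_size = 55M    ; assumed to be the "post_max" value that was changed
"""


def parse_ini_values(text):
    """Collect key = value pairs, ignoring [sections] and ';' comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop everything after a ';'
        if line.startswith("[") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        values[key.strip()] = value.strip()
    return values


def to_bytes(shorthand):
    """Convert PHP shorthand such as '8M' or '512K' to a byte count."""
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    suffix = shorthand[-1].upper()
    if suffix in units:
        return int(shorthand[:-1]) * units[suffix]
    return int(shorthand)


settings = parse_ini_values(SAMPLE_INI)
for name in ("memory_limit", "post_max_size"):
    print(f"{name} = {settings[name]} ({to_bytes(settings[name])} bytes)")
```

`post_max_size` caps the size of an uploaded POST body, while `memory_limit` caps what a single script may allocate, so only the second one is relevant to a runaway process. Cutting the line at the first `;` is a shortcut that would mangle quoted values containing a semicolon.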
That php.ini value for memory is already 8 MB.
Top:
10:54:08 up 6 days, 13:16, 1 user, load average: 1.69, 0.66, 0.38
190 processes: 185 sleeping, 3 running, 2 zombie, 0 stopped
CPU states: cpu user nice system irq softirq iowait idle
total 79.9% 0.0% 6.9% 0.4% 2.4% 10.1% 0.0%
cpu00 82.6% 0.0% 9.4% 0.0% 0.9% 6.9% 0.0%
cpu01 77.1% 0.0% 4.4% 0.9% 3.9% 13.4% 0.0%
Mem: 1015908k av, 928668k used, 87240k free, 0k shrd, 58596k buff
708052k actv, 120320k in_d, 11064k in_c
Swap: 2048276k av, 191256k used, 1857020k free 256312k cached
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
9354 nobody 25 0 92144 89M 4076 R 38.3 8.9 0:24 1 httpd
8124 nobody 21 0 80480 76M 5960 R 29.4 7.7 0:10 0 httpd
9371 nobody 16 0 10356 9492 6088 S 2.7 0.9 0:00 0 httpd
2392 mysql 16 0 94568 86M 3028 S 2.2 8.7 0:10 1 mysqld
10781 nobody 15 0 14312 12M 8488 S 1.9 1.2 0:26 1 httpd
7244 nobody 15 0 16224 14M 7312 S 1.7 1.4 0:32 0 httpd
8 root 15 0 0 0 0 SW 0.9 0.0 4:20 0 kscand
13002 mysql 15 0 94568 86M 3028 S 0.7 8.7 2:20 0 mysqld
8604 root 15 0 1256 1256 900 R 0.7 0.1 0:05 1 top
9245 nobody 15 0 8304 7500 4340 S 0.7 0.7 0:00 0 httpd
7 root 15 0 0 0 0 SW 0.4 0.0 2:34 0 kswapd
31563 mysql 15 0 94568 86M 3028 D 0.4 8.7 1:39 0 mysqld
9374 nobody 15 0 4412 2436 1808 S 0.4 0.2 0:00 1 httpd
32129 nobody 15 0 12312 10M 7392 S 0.2 1.0 0:09 0 httpd
2406 mysql 15 0 94568 86M 3028 S 0.2 8.7 0:01 1 mysqld
7209 nobody 15 0 8728 7348 4396 S 0.2 0.7 0:01 0 httpd
9380 nobody 15 0 9636 8848 5688 S 0.2 0.8 0:00 1 httpd
As you can see, it's httpd, which leaves me dumbfounded. Is there any program that can probe this httpd process and see what's going on?
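To pick the oversized processes out of a top listing like the one above without eyeballing it, something along these lines could work. It is a sketch under assumptions: the rows use the 13-column layout shown in this thread, an RSS value without a suffix is in KiB, and the function names and the 50 MiB threshold are arbitrary choices for illustration.

```python
# A few process rows copied from the top output above.
TOP_ROWS = """\
9354 nobody 25 0 92144 89M 4076 R 38.3 8.9 0:24 1 httpd
8124 nobody 21 0 80480 76M 5960 R 29.4 7.7 0:10 0 httpd
9371 nobody 16 0 10356 9492 6088 S 2.7 0.9 0:00 0 httpd
2392 mysql 16 0 94568 86M 3028 S 2.2 8.7 0:10 1 mysqld
10781 nobody 15 0 14312 12M 8488 S 1.9 1.2 0:26 1 httpd
"""


def rss_kib(field):
    """Convert top's RSS column ('89M' or plain KiB like '9492') to KiB."""
    if field.endswith("M"):
        return int(field[:-1]) * 1024
    return int(field)


def big_processes(rows_text, command, threshold_kib):
    """Return (pid, rss_kib, cpu_percent) for matching rows at or above the threshold."""
    hits = []
    for line in rows_text.splitlines():
        fields = line.split()
        if len(fields) != 13:
            continue  # not a process row in the layout assumed here
        pid, rss, cpu, cmd = fields[0], fields[5], fields[8], fields[12]
        if cmd == command and rss_kib(rss) >= threshold_kib:
            hits.append((int(pid), rss_kib(rss), float(cpu)))
    return hits


for pid, rss, cpu in big_processes(TOP_ROWS, "httpd", 50 * 1024):
    print(f"PID {pid}: {rss // 1024} MiB resident, {cpu}% CPU")
```

The PIDs it reports are the ones worth tracing while they are still alive.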
Your Apache processes aren't really using that much memory; it's quite normal for them to use up to 100M. You can see what httpd is doing by following my instructions above. However, you'll need to be fast!
Quote
If you can't install phpSuExec, then I'd suggest loading up the forum, and in WHM, click on "Process List", then, click on the PID of the process.
It'll do a trace, and we'll be able to actually see what's going on - without that, it'll be impossible to take care of this.
Also, an Apache/PHP process can not normally take that much memory, as by default php.ini limits it to 8MB - unless you changed this
I can't find any "Process List" in WHM. The closest I can find is "Show Current Running Processes".
It lists all the running processes, but without showing the memory usage.
Anyway, I did try accessing the forum and then looking at this process list, and I see this for that big httpd process:
28696 (httpd) /usr/local/apache/bin/httpd /
/usr/local/apache/bin/httpd -DSSL
So, does that mean -DSSL is the culprit?
Hrm, different version, different link ;) Click on the PID (the first number). That will do a stack trace, and let us know what's going on.
the -DSSL flags are normal ;)
-Christian
I can't click on the PID to view the process in detail. Hmm... something must be wrong.
What's the name of the script when you hover your mouse over the link "process list"? Mine is "http://www.domain.com:2086/scripts/simpleps".
Ironically,
http://*****************************:2086/scripts2/top
Try that ;)
-Christian
Yeah... I managed to capture the strace of this httpd process. Tricky process it is; you need to be really fast before the process dies.
It's very long and I don't know how to read it. But here are some of the highlights which I think might be important:
unlink("/var/cache/eaccelerator/7/b//eaccelerator-user-7bc72407a99655725799aa4ce46dbf5a") = -1 EACCES (Permission denied)
open("/var/cache/eaccelerator/7/b//eaccelerator-user-7bc72407a99655725799aa4ce46dbf5a", O_WRONLY|O_CREAT|O_EXCL, 0600) = -1 EACCES (Permission denied)
time(NULL) = 1161238652
fcntl64(6, F_SETFL, O_RDWR|O_NONBLOCK) = 0
read(6, 0x9871e18, 8192) = -1 EAGAIN (Resource temporarily unavailable)
fcntl64(6, F_SETFL, O_RDWR) = 0
write(6, "w\0\0\0\3\n\t\tSELECT data\n\t\tFROM `wofs"..., 123) = 123
read(6, "\1\0\0\1", 4) = 4
read(6, "\1", 1) = 1
read(6, "E\0\0\2", 4) = 4
read(6, "\3def\17wofscom_wofscom\fsmf_session"..., 69) = 69
read(6, "\1\0\0\3", 4) = 4
read(6, "\376", 1) = 1
read(6, "\261\3\0\4", 4) = 4
read(6, "\374\256\3rand_code|s:32:\"7060fc85b7ac0"..., 945) = 945
read(6, "\5\0\0\5", 4) = 4
read(6, "\376\0\0\2\0", 5) = 5
getpid() = 24373
time(NULL) = 1161238652
getpid() = 24373
fcntl64(6, F_SETFL, O_RDWR|O_NONBLOCK) = 0
read(6, 0x9871e18, 8192) = -1 EAGAIN (Resource temporarily unavailable)
fcntl64(6, F_SETFL, O_RDWR) = 0
write(6, "\210\0\0\0\3\n\t\t\tSELECT variable, value,"..., 140) = 140
There are EAGAIN (Resource temporarily unavailable) errors and also the EACCES error. I don't think it's eAccelerator, though, because the problem already existed before I installed eAccelerator. Googling this EAGAIN thing doesn't really tell me much. Most people who have this problem posted very technical stuff which I don't know how to interpret.
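A long strace is easier to skim if the failed calls are tallied first. The sketch below counts lines that end in `= -1 ERRNO (...)`, grouped by system call and error name. The sample is a subset of the lines quoted above (the rows with escaped binary data are left out), and the regular expression and function name are illustrative assumptions rather than a full strace parser.

```python
import re
from collections import Counter

# A subset of the strace lines quoted above.
STRACE_SNIPPET = '''\
unlink("/var/cache/eaccelerator/7/b//eaccelerator-user-7bc72407a99655725799aa4ce46dbf5a") = -1 EACCES (Permission denied)
open("/var/cache/eaccelerator/7/b//eaccelerator-user-7bc72407a99655725799aa4ce46dbf5a", O_WRONLY|O_CREAT|O_EXCL, 0600) = -1 EACCES (Permission denied)
time(NULL) = 1161238652
fcntl64(6, F_SETFL, O_RDWR|O_NONBLOCK) = 0
read(6, 0x9871e18, 8192) = -1 EAGAIN (Resource temporarily unavailable)
fcntl64(6, F_SETFL, O_RDWR) = 0
getpid() = 24373
read(6, 0x9871e18, 8192) = -1 EAGAIN (Resource temporarily unavailable)
'''

# syscall name, its arguments, then "= -1 ERRNO"
FAILED_CALL = re.compile(r"^(\w+)\(.*\)\s*=\s*-1\s+([A-Z0-9]+)\b")


def failed_calls(strace_text):
    """Count failed system calls, keyed by (syscall, errno name)."""
    counts = Counter()
    for line in strace_text.splitlines():
        match = FAILED_CALL.match(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


for (syscall, errno_name), count in failed_calls(STRACE_SNIPPET).most_common():
    print(f"{count:3d}  {syscall:8s} {errno_name}")
```

In this excerpt the EAGAIN results come from `read` on a descriptor that was switched to O_NONBLOCK just before, which is the ordinary "nothing to read yet" answer, while the EACCES results show the process being denied access to files under the eAccelerator cache directory.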
Hello,
Except for the eaccelerator errors, I don't see anything wrong. (You'll get EAGAIN errors in all apache straces).
Can you set up a new, completely clean SMF forum, and then point its Settings.php file to the database you are currently using? Try that, and see if it still causes high CPU load.
Regards,
Christian
I already tried that. I did 2 things.
1) Set up a completely new standalone SMF. Copied the DB from the existing SMF DB and it ran fine.
2) Reopened the SMF with the problem. Accessing it directly at "www.domain.com/forum" works FINE. The problem only occurs when I log on to Joomla and then click on Joomla's forum link, which is "www.domain.com/index.php?option=com_smf&Itemid=103&". Then the problem starts.
Hello,
I'm just wondering why we were beating around the bush, so to speak :P
Contact the bridge-author (Orstio, I believe). This is a problem with that Joomla component ;)
Regards,
Christian A. Herrnboeck
Which bridge are you using? Are you using the joomlahacks one?
I am using the bridge from joomlahacks; the author is wolverine. I already posted there. He is also unable to help.
Another thing I did was cut the number of users in half, to 50k, and the httpd process now shows less memory usage. This strongly suggests the memory is being used by the bridge loading the whole users table, or its index, or something like that, into memory (only speculating).
I already upgraded my server to 2 GB of RAM and the problem still persists.
I'd suggest using the official one available @ http://www.simplemachines.org/download/?bridges