Excessive CPU usage

Started by shnazzle, June 15, 2019, 12:08:39 PM

shnazzle

Has anybody experienced any extreme cpu usage since installing RC2?

HostGator shut us down last night for having 3 stints of 90+ seconds in the "Extreme CPU Usage" bounds.

I've shut down shoutbox, which is the only plugin I've got installed. But it never did this on 2.0.15.

What activities could cause a lot of cpu processing in 2.1 RC2?

shawnb61

The short answer is no, nothing specific about RC2 at this time. 

It would help to know a bit more about your environment - # of users, posts, # of active users you tend to see, PHP version.

Anything in the error logs?

Is there anything in your web access log that provides a clue?  (My first suspicion when I see this is always significant crawling activity...) 

Are you using https?  Are you using the image proxy?  How many files in the cache?

Do you have a lot of records in log_topics?  (This can get out of hand even on mid-sized forums: https://www.simplemachines.org/community/index.php?topic=212330.msg1667071#msg1667071)

A question worth asking is born in experience & driven by necessity. - Fripp

shnazzle

- https
- no image proxy
- 522k posts
- 39k topics
- 60ish concurrent users max, usually more around 30
- PHP 5.5, although I just changed it to 5.6
- MySQL server version: 5.6.41-84.1 - Percona Server (GPL), Release 84.1, Revision b308619
- nothing in cache dir
- 181k rows in log_topics
- error logs have a few errors, nothing massive.
- there is a bot/spider/crawler called "link" with 85MB activity and 9,800+10 requests

Any help?

shawnb61

Wait...  How many requests from "link"?   Is that in scientific notation???

albertlast

the stats page can cause heavy cpu usage on the database side when "likes" are enabled,
maybe this could be the reason,
the fix for this went into rc3.

shnazzle

Quote from: shawnb61 on June 15, 2019, 12:52:57 PM
Wait...  How many requests from "link"?   Is that in scientific notation???

Yes indeed it is, so, it's a lot.

Quote from: albertlast on June 15, 2019, 12:54:59 PM
the stats page can cause heavy cpu usage on the database side when "likes" are enabled,
maybe this could be the reason,
the fix for this went into rc3.
What's that now? Likes are causing issues at the minute?


albertlast

the graphic didn't help,
since it doesn't say what it shows: "is that php?" "is that mysql?" "something different?"

shnazzle

Sorry it wasn't really meant as a diagnosis thing.
The trough is where we were shut down. Then it went back up in the wee hours and climbed progressively into Very High with only very little forum activity.
So it's just a general picture: it's using a lot of CPU for little practical reason. At the time near the end of the graph, I think 10 people were active.

I've asked HostGator for better logs of what exactly was taking up so much CPU in the times when we peaked

Illori

if they will not provide details which i doubt they will, i would start looking for a new host as they are oversold and most likely just trying to push you away anyway.

shnazzle

Quote from: Illori on June 15, 2019, 03:31:31 PM
if they will not provide details which i doubt they will, i would start looking for a new host as they are oversold and most likely just trying to push you away anyway.

Way ahead of you.
I've found a few options. Unfortunately the best one is Windows-based, which I really don't like working on  :(

shawnb61

Quote from: shnazzle on June 15, 2019, 02:42:32 PM
Quote from: shawnb61 on June 15, 2019, 12:52:57 PM
Wait...  How many requests from "link"?   Is that in scientific notation???
Yes indeed it is, so, it's a lot.

If you are getting so many hits from a bot that the # of hits must be listed in scientific notation, that's likely your problem... 

I use Hostgator.   All your logs are available on cPanel.  Even up-to-the-minute. 

I would download the current log & see if I could identify:
- the IP range of the culprit
- the activities the culprit is doing

When doing this, I sometimes load the logs directly into Excel for analysis.

And block the culprit, either by IP range or by user-agent if you can find a good identifier in there. 

Over the years, I have had occasional issues with bots & crawls.  I have a few agents blocked in my .htaccess file. 

EDIT:  Your logs are found in cPanel, under Metrics, Raw Access.  Awstats helps if the culprit has been around long enough to be included in those stats. 
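
For anyone who would rather script this than use Excel, here is a minimal sketch of the same idea: count requests per IP and per user-agent in a downloaded raw access log. It assumes the log is in the Apache "combined" format; the function name `top_clients` and the regex are illustrative, not anything cPanel or SMF provides.

```python
import re
from collections import Counter

# Assumption: Apache "combined" log format. Adjust the pattern if your
# host writes its access logs differently.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def top_clients(lines, n=10):
    """Return the n busiest client IPs and the n busiest user-agents."""
    by_ip = Counter()
    by_agent = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        by_ip[match.group("ip")] += 1
        by_agent[match.group("agent")] += 1
    return by_ip.most_common(n), by_agent.most_common(n)
```

To use it, open the downloaded log file and pass the file object straight in, e.g. `top_clients(open("access.log", encoding="utf-8", errors="replace"))`. Whatever sits at the top of either list is the first candidate for an IP-range or user-agent block.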

Sir Osis of Liver

Quote from: Illori on June 15, 2019, 03:31:31 PM
i would start looking for a new host

Hostgator has been having problems lately.  Moved a large forum off there about a month ago, it was trashed and offline for a while, support was no help.  Take Illori's advice.


When in Emor, do as the Snamors.
                              - D. Lister

shnazzle

I've blocked a few of the heavy hitters.
After reading closely, the +10 is the number of robots.txt hits.
Not sure how that works. 9800 made it through but 10 got caught by robots.txt?

Definitely moving hosting asap
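
A small sketch of that reading of the stats cell, for anyone comparing numbers: it treats "9,800+10" as 9,800 regular hits plus 10 hits on robots.txt. That interpretation is an assumption taken from the post above, so check the footnote under the robots table in your own stats page. The function name `parse_hits_cell` is made up for illustration.

```python
def parse_hits_cell(cell):
    """Split a stats cell like '9,800+10' into (regular_hits, robots_txt_hits).

    Assumption: the number after '+' counts hits on robots.txt, and a cell
    without '+' has no robots.txt hits recorded.
    """
    main, _, robots = cell.partition("+")
    regular_hits = int(main.replace(",", "").strip())
    robots_txt_hits = int(robots.replace(",", "").strip()) if robots.strip() else 0
    return regular_hits, robots_txt_hits
```

Under that reading the "link" crawler made roughly 9,800 requests in the reporting period, which is a lot for one bot but nowhere near an astronomically large number.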

GigaWatt

Quote from: shnazzle on June 15, 2019, 06:35:17 PM
Not sure how that works. 9800 made it through but 10 got caught by robots.txt?

That's scientific notation. It means 9800 * 10^10 (multiplied by 10 to the 10th power). It means you've got this many hits on robots.txt: 98.000.000.000.000.
"This is really a generic concept about human thinking - when faced with large tasks we're naturally inclined to try to break them down into a bunch of smaller tasks that together make up the whole."

"A 500 error loosely translates to the webserver saying, "WTF?"..."

drewactual

make sure you have a good cache set up, both for scripts and for static. 

a static cache will eliminate the processor punches on a txt file.  a good script cache like OPCache will eliminate the processor power needed to run a script every time it's called.  you said you upgraded to 5.6 from 5.5, and i wager you left OPCache behind. 

as an example, i dropped from redlining CPU regularly to less than 10% average CPU usage after implementing OPCache and with the same traffic.  it makes THAT much difference.  HostGator has it available if you ask for it- or- you may be able to turn it on and enter its parameters/settings in your htaccess or local ini file.

i had something that went truly 'viral' once and it racked up over 30k users concurrently.  i sat watching it real time and recall thinking to myself "thank God I set the cache up right, else that would have started a fire and burned the data center down, the CPU would have been so hot!!!"...

shnazzle

Quote from: drewactual on June 17, 2019, 12:28:02 PM
make sure you have a good cache set up, both for scripts and for static. 

a static cache will eliminate the processor punches on a txt file.  a good script cache like OPCache will eliminate the processor power needed to run a script every time it's called.  you said you upgraded to 5.6 from 5.5, and i wager you left OPCache behind. 

as an example, i dropped from redlining CPU regularly to less than 10% average CPU usage after implementing OPCache and with the same traffic.  it makes THAT much difference.  HostGator has it available if you ask for it- or- you may be able to turn it on and enter its parameters/settings in your htaccess or local ini file.

i had something that went truly 'viral' once and it racked up over 30k users concurrently.  i sat watching it real time and recall thinking to myself "thank God I set the cache up right, else that would have started a fire and burned the data center down, the CPU would have been so hot!!!"...

Unfortunately OPCache is not available to us :( 
Interested now though!

Aleksi "Lex" Kilpinen

Hostgator is owned by Endurance International Group (EIG) - Google it, learn of it, jump ship.

Slava
Ukraini!
"Before you allow people access to your forum, especially in an administrative position, you must be aware that that person can seriously damage your forum. Therefore, you should only allow people that you trust, implicitly, to have such access." -Douglas

How you can help SMF

drewactual

Quote from: shnazzle on June 25, 2019, 01:28:30 PM
Unfortunately OPCache is not available to us :( 
Interested now though!

that's unfortunate.  and also something i would put my foot down about- as it is used wide, far, and deep across the interwebz.  the only reasonable excuse for not having it present already is if they're using one of the install flavors (fpm; mpm-worker; fastcgi etc.) that conflict with zendOPCache..... and even then, there are alternatives.

shawnb61

Do you have SMF caching turned on?  Even if you don't have some of the more advanced caching like OpCache available, SMF's file-based caching produces very good results. 

Look under Admin | Configuration | Server Settings | Caching and set it to "Level 1 Caching" if it's not already. 

Another thing to try - if you are on https, and have the Image Proxy enabled, I would disable it until this is sorted out.  Crawls & the Image Proxy in 2.0.15 are bad news; they don't play nice. 

I use HostGator.  In general, I've been happy with them.  Never had a problem with their support.  I have had similar CPU issues during aggressive crawls.  BUT...  Only since 2.0.14...  Now, did 2.0.14 do something?  Or did Hostgator do something coincidentally at the time I upgraded to 2.0.14? 

I've taken a few steps to minimize the CPU issues (upgrading PHP helped; disabling the image proxy; enabling persistent DB connections).   

Aleksi "Lex" Kilpinen

The thing about EIG is that it shops around for known shared hosts, keeps the branding, then "optimizes" the cost structure of the hosts it buys out, fires great (expensive) support staff and migrates clients to worse hardware infrastructure, most often causing a slow degradation of service quality. EIG owns something like 60 different hosting brands.

shnazzle

I have level 1 file based caching enabled.
One thing I tried, which has yielded good results..
I've disabled our SMF Packs Shoutbox plugin.
We've had two little spikes into "very high" very briefly, but that's it.
We used to be hovering in the very high range with peaks into "extreme".

It might be a coincidence, and I guess I could verify by turning it back on, but I don't want to be shut down again :)

Image proxy is off. I had read about the https issues so as we're running https...off :)



Aleksi "Lex" Kilpinen

Anything that works "realtime" - such as shoutboxes, notifications and so on - can cause a considerable increase in CPU load.  If turning off features like that solves the issue, then you probably found the culprit. But often, the EIG-style limits on any resource tend to be pretty arbitrary, and basically only exist to limit resources that were originally sold as "unlimited".

Virus-X

Quote from: shnazzle on June 15, 2019, 12:08:39 PM
Has anybody experienced any extreme cpu usage since installing RC2?

HostGator shut us down last night for having 3 stints of 90+ seconds in the "Extreme CPU Usage" bounds.

I've shut down shoutbox, which is the only plugin I've got installed. But it never did this on 2.0.15.

What activities could cause a lot of cpu processing in 2.1 RC2?

How about your other plugins? If you believe this plugin has an error, please try removing it.

shawnb61

Quote from: shnazzle on June 29, 2019, 04:07:46 AM
I have level 1 file based caching enabled.
One thing I tried, which has yielded good results..
I've disabled our SMF Packs Shoutbox plugin.
We've had two little spikes into "very high" very briefly, but that's it.
We used to be hovering in the very high range with peaks into "extreme".

It might be a coincidence, and I guess I could verify by turning it back on, but I don't want to be shut down again :)

Image proxy is off. I had read about the https issues so as we're running https...off :)

In my experience, CPU spikes are typically caused by crawls.   Look at your web logs and determine who the culprit is.  Somebody hitting your site several thousand times an hour isn't hard to spot in the web logs! 

If it's Google, throttle them in Google webmaster tools. 

If it's someone else, block them. 
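
As a rough sketch of what "several thousand times an hour" looks like in practice, the snippet below buckets requests by client IP and hour and reports the buckets above a threshold. It assumes Apache-style access log lines that start with the IP and carry a bracketed timestamp like `[15/Jun/2019:12:00:01 -0500]`; the name `busy_hours` and the default threshold of 1000 are illustrative choices, not anything from SMF or HostGator.

```python
import re
from collections import Counter
from datetime import datetime

# Assumption: each line starts with the client IP, two identity fields,
# then a bracketed timestamp in the usual Apache access log style.
LOG_PREFIX = re.compile(r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\]')


def busy_hours(lines, threshold=1000):
    """Return (ip, hour, hits) for every IP/hour bucket with at least threshold hits."""
    counts = Counter()
    for line in lines:
        match = LOG_PREFIX.match(line)
        if match is None:
            continue  # not in the assumed format
        try:
            stamp = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        except ValueError:
            continue  # timestamp did not parse; skip the line
        hour = stamp.strftime("%Y-%m-%d %H:00")
        counts[(match.group("ip"), hour)] += 1
    # most_common() sorts by hit count, busiest bucket first
    return [(ip, hour, hits) for (ip, hour), hits in counts.most_common() if hits >= threshold]
```

Any IP that shows up here hour after hour is worth looking up before deciding whether to throttle it (if it is a legitimate search engine) or block it.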

As noted above, I use Hostgator.   It may also be that Hostgator is squeezing their users.  That is almost certainly part of it.  For whatever reason, crawls didn't bother me in the past, but they do now. 

I'm still on the fence whether to leave HG in the future.  On the plus side, I have never had issues with uptime or support - they've always been rock solid.  But on the downside I really don't like their backup policies & pricing.  And I suspect they are squeezing resources - as shown by sensitivity to crawls. 

drewactual

fwiw HG has a great dedicated server program.  it's not cheap @ $3500 a year, but down time is exceedingly rare. 

back to the OP:  you may consider using an htaccess entry to throttle your php.  if there are some rogue scripts running they can consume a LOT of processor power, especially when they edge up to the cutoff limit regularly... I've set mine, as a for instance, to 30 seconds.  if even a robust search is called, it will die in 30 seconds... and if a script regularly takes that long, it's likely i'd be needing to visit the coding and come up with a cleaner way to do it. 

i throttle the timeouts to 30 seconds and the memory to 16mb max per process. 

Aleksi "Lex" Kilpinen

IMO that's not really a very good idea. If you run into situations where you need to kill processes to conserve resources, you should get more resources.

drewactual

oh it certainly wasn't SMF snatching up those resources- it was a poorly written code even by php4 standards, and heavily searching a cumbersome and massive database and while running on php5.6 at the time and data from a >4.xSQL on a 5.4SQL... SMF is clean- nothing from it has ever taxed the system like that thing did. 

^it's for that reason i wonder what is gulping down the OP's resources.  Now that my (SMF+WP) primary site is so clean, i wager i could support its traffic and weight on a shared environment with little issue.  NOT before, though, with that other function running on it.   

shnazzle

Our "guy" that we've just introduced made a good point. Or at least, seems like a good point.

We're on hostgator shared hosting. And while we generally tend to run within the high to very high band, sometimes we are just under Extreme, and then when we go over we're hit with enforced shut-downs.
So the logic seems to be that we may be sharing our hosting with some other very heavy users. What is usually "very high" for us on a normal day becomes "extreme", as a percentage of the total available on the shared host, when someone else absolutely rams their service.

I'm not sure it's scaled that way, but the other day our site was damn near unusable, and all of a sudden it was flying again.

shawnb61

I've been shared on HG for years.  I've never seen someone else's traffic affect my CPU.   

The CPU charts always track perfectly to traffic on the web access logs.  I download the most recent day or two of logs and analyze them in Excel to identify the culprit.

When I see wide swings in CPU, the cause is always crawls.
