Recent Downtime

Started by dschwab9, January 19, 2008, 02:02:07 AM

Smith6612

Quote from: rsw686 on January 19, 2008, 06:26:29 PM
dschwab9 I hope you don't mind if I ask a couple of questions about the failure.

What was the motherboard brand / model that had the thermal management failure? If this was an Intel chip, I thought they had throttle control built into the core to prevent overheating.

Was there a reason to use the fan speed thermal management from the beginning? If you are in a datacenter noise shouldn't be much of an issue. I personally would trade cooler operating temps and longer component life for some noise.

Same goes for my home computer (my gaming one). I'd rather have my computer get noisy when I fire up games than have the fans sit at low speed and let my graphics card and CPU overheat.

Chriss Cohn

Hi, good it's up again... but I get the following error:
When I try to use the search, "Unable to access the search daemon" appears...

Regards, Christian

Gary

Gary M. Gadsdon
Do NOT PM me unless I say so
War of the Simpsons
Bongo Comics Fan Forum
Youtube Let's Plays

^ YT is changing monetisation policy, help reach 1000 sub threshold.

dschwab9

Quote from: rsw686 on January 19, 2008, 06:26:29 PM
dschwab9 I hope you don't mind if I ask a couple of questions about the failure.

What was the motherboard brand / model that had the thermal management failure? If this was an Intel chip, I thought they had throttle control built into the core to prevent overheating.

It's an Asus board with an Athlon 64 X2 processor

Quote from: rsw686 on January 19, 2008, 06:26:29 PM
Was there a reason to use the fan speed thermal management from the beginning? If you are in a datacenter noise shouldn't be much of an issue. I personally would trade cooler operating temps and longer component life for some noise.

Apparently it was on by default. It's now off on all the servers.

dschwab9

Also, FYI, I'm making some progress on recovering recent attachments from the corrupt hard drive.

Gary

dschwab9

all attachments (or at least 99% of them) have been recovered and are being uploaded now.   Should all be there in an hour or two.

dschwab9

attachments and avatars have been restored

4LP3RUZ1

you rock!

Asus mobo and X2 CPU... that's the same as my comp! :)

Q-fan was the problem?
Frozen frogs are back :(

dschwab9

Quote from: alperuzi on January 20, 2008, 05:15:27 AM
you rock!

Asus mobo and X2 CPU... that's the same as my comp! :)

Q-fan was the problem?

Whatever they call it. That thing that adjusts the speed of the fan. I suspect the temperature sensor is in a place where the reading isn't accurate. There was another box that was also running really hot with low fan speed.

Aaron

Quote from: dschwab9 on January 20, 2008, 03:29:04 AM
all attachments (or at least 99% of them) have been recovered and are being uploaded now.   Should all be there in an hour or two.

You're the man, Derek! Great work! :D

Alan S

I am not sure if this is related to the downtime, but I have found a number of corrupt mod packages on the mod site, Message Preview On Hover and Sarcasmics Smiley sets to name a few. I tried to open them in WinRAR and got an unexpected end of archive error.
Quote from: Eliana Tamerin on August 23, 2008, 04:10:10 PM
SMF 7 is where it gets good. That has time travel. You can go back and post before the guy who flamed you. :P

Peter De Decker

glad to see the site back up!

TheWrath!


rsw686

Quote from: dschwab9 on January 20, 2008, 01:34:07 AM
It's an Asus board with an Athlon 64 X2 processor

Apparently it was on by default. It's now off on all the servers.

Quote from: dschwab9 on January 20, 2008, 05:23:17 AM
Whatever they call it. That thing that adjusts the speed of the fan. I suspect the temperature sensor is in a place where the reading isn't accurate. There was another box that was also running really hot with low fan speed.

First off don't take this the wrong way. I just want to point a few things out so this doesn't happen to others.

What a bad combination of a lot of things. Asus boards are horribly unreliable. I am 3 for 3 on boards where something has failed. An AMD chip with no built-in thermal protection. And user error of not checking all the BIOS settings and stress testing with CPU Burn In or the like.

Well, on the bright side, at least you were able to recover most of the data. This is one of those experiences that you never want to have again.

Also, if you're not using ECC memory it would be wise to do so. Memory prices are so cheap that you can pick up Kingston 2x2GB of DDR2 ECC memory for $110, same as the non-ECC memory.
The Reptile File
Everything reptile for anyone reptile friendly

Aquaria Talk
Community for freshwater and saltwater aquariums enthusiasts

Dannii

Thanks for your dedication Derek. :)

Btw, now is the time to finally implement that AJAX CPU temperature feature into the admin CP yes?
"Never imagine yourself not to be otherwise than what it might appear to others that what you were or might have been was not otherwise than what you had been would have appeared to them to be otherwise."

i2Paq

Quote from: dschwab9 on January 20, 2008, 03:29:04 AM
all attachments (or at least 99% of them) have been recovered and are being uploaded now.   Should all be there in an hour or two.

I've downloaded a lot of mods for the SMF 1.1.4 release over the last couple of days, so if any of them are missing do let me know and I will put them up for download/upload.
I've also downloaded a few themes for 1.1.4; same case, let me know.

Great job anyway!

Regards, Norman.

4LP3RUZ1

@rsw686

Asus boards are good quality, I've had 2 and never had a problem with them. I use the Q-fan feature and find it to be a great addition to my quiet PC. I have no problems with thermal monitoring and find the Athlon X2 to be a cool and reliable CPU. The thermal diode is actually inside the CPU, so the CPU does its own monitoring, but again, it's all set up from the BIOS.
Frozen frogs are back :(

dschwab9

Quote from: rsw686 on January 20, 2008, 09:36:18 AM
What a bad combination of a lot of things. Asus boards are horribly unreliable. I am 3 for 3 on boards where something has failed. An AMD chip with no built-in thermal protection. And user error of not checking all the BIOS settings and stress testing with CPU Burn In or the like.

I could post a list of 20 motherboards here and I would get a response from someone about each and every one of them being horrible. FYI: not sure if it was mentioned in my original post, but the power supply fan had also failed, which has nothing to do with ASUS or the quality of their boards. I will say I have been using their stuff for years and have never had a problem.

Also, about using AMD being a bad idea: I can't go for that. AMD has some of the top performing CPUs out there and is probably in use in 50% of the web servers out there.

Quote from: rsw686 on January 20, 2008, 09:36:18 AM
Also, if you're not using ECC memory it would be wise to do so. Memory prices are so cheap that you can pick up Kingston 2x2GB of DDR2 ECC memory for $110, same as the non-ECC memory.
You obviously don't know a whole lot about servers. You can't just buy any stick of RAM and stick it in any motherboard. ECC RAM won't work in a board that doesn't support it. Also, whether or not the RAM is ECC is totally unrelated to CPU temperature.

dschwab9

Quote from: alperuzi on January 20, 2008, 02:34:13 PM
@rsw686

Asus boards are good quality, I've had 2 and never had a problem with them. I use the Q-fan feature and find it to be a great addition to my quiet PC. I have no problems with thermal monitoring and find the Athlon X2 to be a cool and reliable CPU. The thermal diode is actually inside the CPU, so the CPU does its own monitoring, but again, it's all set up from the BIOS.

Isn't there another diode other than the one in the CPU, though? The BIOS reports a CPU temp and a "System temp". On this machine, the system temp was relatively low, so I suspect that diode was near the front of the case where the air intake is.
