Simple Machines Community Forum

General Community => Site Comments, Issues and Concerns => Topic started by: 青山 素子 on February 24, 2009, 10:55:09 PM

Title: Site Downtime on 24 February
Post by: 青山 素子 on February 24, 2009, 10:55:09 PM
Today our site experienced an outage that lasted several hours. Once our staff became aware of the outage, our emergency backup site was activated for community support.

First, I'd like to let you all know that the outage wasn't the result of a hacking attempt or anything so glamorous. Rather, it was the very boring condition of failed hardware.

Specifically, the main power supply in our master database server decided to die. While some attempts were made to resurrect the machine, the power supply refused to perform its primary duty of supplying power. As a result, our database server was deprived of electrons and would not boot.

We moved our database drives to our replica database server and booted it back up. All the databases crucial for running our services have checked out okay. If you notice any problems that didn't exist previously, especially database errors, let us know.
Title: Re: Site Downtime on 24 February
Post by: lovearat on February 24, 2009, 10:59:41 PM
I know y'all often go unappreciated for all the great work and time put into running this site. So I want to give y'all my heartfelt thank you for all that y'all do, and for working so hard to get the site back online.
Title: Re: Site Downtime on 24 February
Post by: Apllicmz on February 24, 2009, 11:52:34 PM
Thank you.
I tried the search, but it does not work.
Can you check, please?

Nice work
Title: Re: Site Downtime on 24 February
Post by: fords8 on February 25, 2009, 12:41:16 AM
Glad everything is back up and running!
Title: Re: Site Downtime on 24 February
Post by: _Anthony_ on February 25, 2009, 01:06:30 AM
Glad to have it back though :)
Title: Re: Site Downtime on 24 February
Post by: 青山 素子 on February 25, 2009, 01:27:53 AM
I believe our search server was running on the slave SQL server, which is now disabled. I'm not sure we'll be able to bring search back until we get the new power supply and boot that server back up.
Title: Re: Site Downtime on 24 February
Post by: fords8 on February 25, 2009, 01:35:04 AM
No worries Motoko-chan! Do what ya got to do to get back to 100% . Thanks for the updates also!  8) :D :)

EDIT: I hope this didn't hurt CodeFest at all too. They didn't have to take time away from that to deal with this?
Title: Re: Site Downtime on 24 February
Post by: 青山 素子 on February 25, 2009, 01:55:19 AM
No, those that could attend (I was unable to attend due to work) were still able to meet. I did keep them informed of developments by phone.
Title: Re: Site Downtime on 24 February
Post by: fords8 on February 25, 2009, 02:05:30 AM
Quote from: Motoko-chan on February 25, 2009, 01:55:19 AM
No, those that could attend (I was unable to attend due to work) were still able to meet. I did keep them informed of developments by phone.

Great! Looking forward to hearing what they get done.
Title: Re: Site Downtime on 24 February
Post by: metallica48423 on February 25, 2009, 03:34:32 AM
We all got here tonight -- the last will be here in the early afternoon tomorrow :)  Thanks for asking.

We'll try to get the search issue resolved as fast as we can :)
Title: Re: Site Downtime on 24 February
Post by: Aleksi "Lex" Kilpinen on February 25, 2009, 03:39:54 AM
Considering the fact that you actually lost hardware, you guys sure got back on line fast I think... :)
Title: Re: Site Downtime on 24 February
Post by: metallica48423 on February 25, 2009, 04:03:47 AM
Well, we have two database servers - a master and a slave used for replication.  We pretty much just swapped the hard drives from one to the other.  Problem is, the Sphinx search index was on the second server, which is now the one without a power supply.
Title: Re: Site Downtime on 24 February
Post by: Amacythe on February 25, 2009, 04:16:26 AM
Quote from: fords8 on February 25, 2009, 02:05:30 AM
Quote from: Motoko-chan on February 25, 2009, 01:55:19 AM
No, those that could attend (I was unable to attend due to work) were still able to meet. I did keep them informed of developments by phone.

Great! Looking forward to hearing what they get done.

Well, thus far we managed to get everyone here who was on an airline flight, and those who got here early have managed not to kill each other.  We've had a few discussions about how much we love this project, and how we all want to double our pay (It's a joke since we are all unpaid volunteers!) but alas, we haven't gotten drunk, nor have we had any of the famous orgies that most 'business' meetings tend to manage.

Ok, seriously... metallica will be posting some of the details of our conferences to the Blog as time permits.
Title: Re: Site Downtime on 24 February
Post by: PacificWx on February 25, 2009, 04:20:34 AM
Great work getting the site back - as has been mentioned in this thread, you guys should be congratulated on how quickly you got the site back up considering the power supply failure.

Great work!
Title: Re: Site Downtime on 24 February
Post by: Dzonny on February 25, 2009, 06:02:49 AM
I'm glad SMF is online again... :)
Title: Re: Site Downtime on 24 February
Post by: TW1ST3D on February 25, 2009, 07:30:36 AM
Wow !!!   I was actually having symptoms of SMF Withdrawal Disorder...............
Title: Re: Site Downtime on 24 February
Post by: SleePy on February 25, 2009, 09:46:14 AM
Quote from: TW1ST3D on February 25, 2009, 07:30:36 AM
Wow !!!   I was actually having symptoms of SMF Withdrawal Disorder...............

You're not the one who spent over 12 hours in travel time to get somewhere :P
/me injects SMF into himself.
Title: Re: Site Downtime on 24 February
Post by: Brettflan on February 25, 2009, 09:52:30 AM
Glad to see things back up and running. :)
Title: Re: Site Downtime on 24 February
Post by: fords8 on February 25, 2009, 09:56:05 AM
Traveling does stink. But at least you are with people that like the same thing you do. Now that is some coding power in one room! I wish I was there just to learn some things!
Title: Re: Site Downtime on 24 February
Post by: LiroyvH on February 25, 2009, 11:54:46 PM
Quote
As a result, our database server was deprived of electrons and would not boot

Lmfao, that's the most nicely phrased explanation I've ever seen for a failing power supply :P

Good job getting it back up :)
Title: Re: Site Downtime on 24 February
Post by: uberjon on February 26, 2009, 12:35:57 AM
Quote from: CoreISP on February 25, 2009, 11:54:46 PM
Quote
As a result, our database server was deprived of electrons and would not boot

Lmfao, that's the most nicely phrased explanation I've ever seen for a failing power supply :P

Good job getting it back up :)

I actually quite enjoyed that line too  :D

Fortunately, the power supply failure didn't cause other hardware damage (always my greatest fear: the power supply dies and takes the PC with it) lol
Title: Re: Site Downtime on 24 February
Post by: CaptainKirk on February 26, 2009, 09:29:06 AM
Any idea when we'll get search back? 
Title: Re: Site Downtime on 24 February
Post by: SleePy on February 26, 2009, 09:46:12 AM
It is down at the moment as we moved our backup SQL server to be our primary.
Title: Re: Site Downtime on 24 February
Post by: CaptainKirk on February 26, 2009, 10:49:39 AM
Quote from: SleePy on February 26, 2009, 09:46:12 AM
It is down at the moment as we moved our backup SQL server to be our primary.

I realize that search is down.  My question was about an estimate of when it will be back up.
Title: Re: Site Downtime on 24 February
Post by: 青山 素子 on February 26, 2009, 11:08:31 AM
It all depends on how quickly we can get replacement parts.
Title: Re: Site Downtime on 24 February
Post by: CaptainKirk on February 26, 2009, 11:13:22 AM
Thank you.
Title: Re: Site Downtime on 24 February
Post by: TheDisturbedOne on February 27, 2009, 09:20:46 PM
Quote from: Motoko-chan on February 24, 2009, 10:55:09 PM
Specifically, the main power supply in our master database server decided to die. While some attempts were made to resurrect the machine, the power supply refused to perform its primary duty of supplying power. As a result, our database server was deprived of electrons and would not boot.
Looks like you needed:
(https://www.simplemachines.org/community/proxy.php?request=http%3A%2F%2Fwww.zath.co.uk%2Fwp-content%2Fuploads%2F2009%2F01%2Fenergizer-advanced-lithium-batteries.jpg&hash=0be4fa115279925b8e76703139cc2353064fb4e2)
:D
Title: Re: Site Downtime on 24 February
Post by: Christian A. Herrnboeck on February 27, 2009, 09:46:19 PM
This is when dual power supplies w/ a battery backup work wonders. Wouldn't be without them!

Great job guys, keep it up!
Title: Re: Site Downtime on 24 February
Post by: metallica48423 on February 28, 2009, 04:27:10 AM
Search cannot currently work because the search daemon, Sphinx, and its associated index were located on the DB02 server, which now holds the disks from DB01. Rather than showing the "unable to access search daemon" error, we opted to temporarily provide a Google search field that will search site:simplemachines.org for the search query entered.

As this uses the input field to plug a search into Google, we do realize that many familiar options are not there and will not work. However, we feel that this is a better solution than having no search ability at all.  This is especially a problem for our team and charter members, who have access to areas that Googlebot cannot spider.  We apologize for that inconvenience, but again, for support and customization services, we feel that it is best to provide a method of search in the interim that will work on a basic level.

Until we can either get the search daemon and index on the current db server or get a replacement power supply for the remaining DB server, the normal SMF search will be unavailable.  There's not currently a lot we can do about it otherwise as the volume of searches done here will likely kill the remaining DB server and the site would be made consistently unavailable if we were to use a normal index.
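
For anyone curious how such a stopgap can be wired up, here is a minimal sketch in Python. It is not the code running on this site: the function names and the "native-search" placeholder are hypothetical, and it only assumes the site: operator and the simplemachines.org domain mentioned above.

```python
from urllib.parse import urlencode

SITE = "simplemachines.org"  # domain named in the post above


def google_fallback_url(query: str, site: str = SITE) -> str:
    """Return a Google search URL restricted to `site` via the site: operator."""
    terms = f"site:{site} {query.strip()}"
    # urlencode percent-encodes the colon and turns spaces into '+'
    return "https://www.google.com/search?" + urlencode({"q": terms})


def search_target(query: str, daemon_available: bool) -> str:
    """Pick where a search should go (hypothetical helper, not SMF code)."""
    if daemon_available:
        # Placeholder standing in for the normal forum search path
        return "native-search"
    return google_fallback_url(query)


if __name__ == "__main__":
    print(search_target("power supply", daemon_available=False))
```

The point of the design is simply that the visitor gets a working, if limited, search instead of an error page while the daemon is unreachable.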
Title: Re: Site Downtime on 24 February
Post by: whatnow on February 28, 2009, 06:46:39 AM
Thanks to everyone for all the work you guys do and are doing to try and rectify this problem and keeping this great site and software going!  :D

If this is a matter of money to get this fixed, I am sure many here would be more than happy to give a few dollars to help out.

Title: Re: Site Downtime on 24 February
Post by: adamnchris on February 28, 2009, 11:43:50 AM
Thanks for the updates and all the hard work.  VERY appreciated from a small dog website in Toronto!  My members all thank you too.  They freak when they can't get in...the world is collapsing, Armageddon is here!  But all seems good now.  World is not collapsing.  Armageddon is not nearing, Pug People all over Toronto are happy now.  THANKS!!!!!!!!
Title: Re: Site Downtime on 24 February
Post by: LiroyvH on February 28, 2009, 11:56:20 AM
Quote from: adamnchris on February 28, 2009, 11:43:50 AM
Thanks for the updates and all the hard work.  VERY appreciated from a small dog website in Toronto!  My members all thank you too.  They freak when they can't get in...the world is collapsing, Armageddon is here!  But all seems good now.  World is not collapsing.  Armageddon is not nearing, Pug People all over Toronto are happy now.  THANKS!!!!!!!!

Hm,
I don't think that the downtime of this website will have affected your site.
If your site was down at the same time, it's pure coincidence :P
Title: Re: Site Downtime on 24 February
Post by: Relyana on February 28, 2009, 03:22:55 PM
Earlier today this forum crashed my browser.  :(

Now the forum is getting slower and slower for me again. Hope everything will be sorted out soon. 



Title: Re: Site Downtime on 24 February
Post by: H on February 28, 2009, 04:04:18 PM
We're aware of the problems. Until our other server is fixed, the site will be slower than usual. We should hopefully have everything fixed sometime next week.
Title: Re: Site Downtime on 24 February
Post by: Rumbaar on February 28, 2009, 06:25:10 PM
Just a note, as the forum is (for the most part) fully indexed by Google while the search is down here, you can use their services to possibly find a result.

http://www.google.com/search?q=search&domains=simplemachines.org&sitesearch=simplemachines.org
Title: Re: Site Downtime on 24 February
Post by: Kenny01 on March 01, 2009, 04:15:02 PM
From all the posts above, it seems to me that the SMF forum is self-hosted, because with a hosting company such a problem would have been fixed within an hour or two, since they have more resources.
Title: Re: Site Downtime on 24 February
Post by: Kenny01 on March 01, 2009, 04:23:08 PM
Quote from: GrannyD on February 28, 2009, 06:46:39 AM
Thanks to everyone for all the work you guys do and are doing to try and rectify this problem and keeping this great site and software going!  :D

If this is a matter of money to get this fixed, I am sure many here would be more than happy to give a few dollars to help out.

Granny
If a call is made for that, I'm also willing.
Title: Re: Site Downtime on 24 February
Post by: 青山 素子 on March 01, 2009, 06:01:56 PM
Quote from: Kenny01 on March 01, 2009, 04:15:02 PM
From all the posts above, it seems to me that the SMF forum is self-hosted, because with a hosting company such a problem would have been fixed within an hour or two, since they have more resources.

Most of the diagnostics were made with the host we are co-locating with. However, because of time differences and some issues with their support area (they recently switched to a new support system), it took longer than it otherwise might.

Also, almost all the people authorized as contact points were either not able to handle the back and forth (because of the codefest) or were otherwise busy (both the server admin and I have busy day jobs).


Quote from: Kenny01 on March 01, 2009, 04:23:08 PM
Quote from: GrannyD on February 28, 2009, 06:46:39 AM
If this is a matter of money to get this fixed, I am sure many here would be more than happy to give a few dollars to help out.
If a call is made for that, I'm also willing.

It isn't a money issue currently; it's more a matter of having our server admin get a chance to check on the warranty status of the part and to order a new one. We're all very busy individuals in the team (remember, we don't get paid and almost all of us either have full time jobs, classes, or both), but we're trying to go as fast as we can.
Title: Re: Site Downtime on 24 February
Post by: Amacythe on March 02, 2009, 01:39:46 AM
Search is now working again.

I'd like to thank everyone for the kind words, understanding and generosity.

As Motoko-chan posted, the issue is that we are all volunteers and sometimes things happen in real life that take precedence over things happening on the internet.  It shouldn't be too long until we get back to full strength.  In the meantime, we may be a bit slow, but we are up and running again.
Title: Re: Site Downtime on 24 February
Post by: Kenny01 on March 02, 2009, 02:28:23 AM
Wonderful guys, you're all great.
Title: Re: Site Downtime on 24 February
Post by: Brettflan on March 02, 2009, 04:58:56 AM
While you're at it, you might also want to go ahead and fix the remaining bugs in SearchAPI-Sphinx.php running on this site.
http://www.simplemachines.org/community/index.php?topic=127672.msg1939169#post_full_sphinx_fix_list

O:)
Title: Re: Site Downtime on 24 February
Post by: Kenny01 on March 02, 2009, 10:43:17 AM
It's a step by step procedure.
Title: Re: Site Downtime on 24 February
Post by: _Anthony_ on March 05, 2009, 02:19:34 AM
Well, I'm glad to have the search feature back :)