Last Weekend’s Outage

Posted August 1, 2011 by George | 54 Comments

As many of you are aware, we experienced some unexpected downtime this past weekend. First, we want to apologize to our amazing community of genealogists for inconveniences that the outage may have caused. Our goal is to provide the best place for all of you to work together to build a single family tree of the world, and we failed to provide that service for more than 48 hours over the weekend.

Our engineers worked around the clock to resolve the issues that caused the outage, and they were finally able to restore service at approximately 8am PDT (3pm GMT) today.

What Went Wrong?

A couple of important points worth noting before we dig into the details:

  • We have measures in place to prevent the loss of data, and they worked. (Phew!)
  • You may notice that some data is missing right now.  Please do not re-enter that data, as we will be reloading it over the next couple of days.

The issues that caused the outage were with Geni’s PostgreSQL database.  We know that hardware issues or data corruption were not the root of the problem, and we suspect that it was an issue with the database’s index. The eventual solution to the problem was a full restore of the database.

As you can imagine, fully restoring more than 100 million profiles and all associated data (source documents, images, videos, etc) is a large task, and unfortunately we couldn’t simply “pedal faster”.  We did try several alternatives before resorting to a full restore, including (a) rolling back our codebase, (b) attempting to move the data to a different database server, and (c) investigating all of our system logs to try to find an easier way to repair the site.

We will continue to investigate, and as we learn more about the cause(s) of this problem, we will put additional measures in place to minimize or completely prevent this sort of outage in the future.

Better Communication

During the outage, we were not as effective as we should have been with our communication to the Geni community.  While we didn’t know exactly when we would be able to restore service, we should have provided updates much more frequently than we did, and we will in the future.

We will provide updates to users on our Facebook page, our Twitter account (and our Twitter Uptime account), and if possible, on our blog and our support site.

Pro Users

For those of you who have supported Geni by purchasing a Pro account, we would like to offer an additional week of Pro service to you for this inconvenience.  Some of you have already inquired about this; as soon as all of Geni’s data is restored, we will begin working on a way to credit a week to each of your accounts.  We will notify you once we have figured out the best way to apply this credit.

We value your support so much, and we will do our best to ensure that you don’t have to experience an outage this long again.  Thank you for helping make Geni the amazing community that it is.  We apologize again for the inconvenience.

If You’re Still Having Issues

If you are still having issues and you think that they may be because of this outage, please feel free to leave a comment on this post or seek help at our Helpdesk. Thank you for your patience, and thanks for being so passionate about Geni.

Post written by George

George joined the Geni team in September, 2010 as Geni's marketing director. You can find him on Twitter where he never posts but is happy to respond: @georgegeni


  • Karen

    Most of us knew you were pedalling as fast as you could–and while more communication would have been nice, I know that when you’re up to your ass in alligators, it can be hard to think about posting the details every hour on the hour. And hey, as one of our project managers used to say, “At least no one died.”

  • Sharon

    Thanks for this communication. It’s honest and helpful. Keep on truckin’….

  • Jacinta Palerm

    Identification of people in photos “lost” as well as all messages. Will this be recovered?

    • Anonymous

      You should see any missing data in the next few days.

  • Elana Kahn

    I’m having trouble with tagging photos and resizing them to fit in the little box on the tree.

    • Anonymous

      Try clearing your cache and restarting your browser. If you’re still having issues, report the problem along with which web browser you’re using at our help desk. http://help.geni.com

  • Hannahkarpes

    I thought I had inadvertently asked to be unsubscribed – so am happy to hear that this time it wasn’t my fault! Hope all will be back to normal soon!

    Hannah Karpes

    • Anonymous

      We’re working on getting everything back exactly the way it was. No worries.

  • DC

    Hi! Thanks for the updates and for working hard to restore Geni. Listen, you probably already know that your webpages are populating very slowly since the site’s come back up; and that all the photos are missing names & tags attributed to them; but some of my profile info is also missing and Geni is “welcoming” me to the site like I’m a new member and asking me to finish filling out my profile! Good luck with the rest of the restore and we look forward to having Geni working in its full glory in no time soon!

    • Anonymous

      Everything will be restored. Anything entered before we went down will be restored. We’re in the process of restoring everything now. Hence the occasional slowness.

  • http://twitter.com/wouterdeboeck Wouter De Boeck

    I have problems with resizing the pictures in the tree (you know: to get the whole face on it and not just the eyes or mouth). After resizing the picture ‘jumps’ to the old position. I tried Safari and Firefox, cleaned the cache… But I believe that this was already happening just after the latest release (the bug fixes) and before the outrage.

    Good luck and don’t forget to sleep ;-)

    • Anonymous

      We’re in the process of getting all of the data back in order over the next couple days, so things might be acting slow. Try clearing your cache to make sure, but if you’re still having issues, please report it to the help desk. http://help.geni.com

    • Bill Laverty

      crop your photo to begin just above the top of the head and go down to mid chest and it should fit quite nicely.

      Bill

  • Murf112

    Now I’m worried that my tree will be lost someday. How long is my tree going to last, 5yrs, 20yrs, 100yrs ? How should geni users protect all our data for future generations?

    • http://twitter.com/georgegeni George Geni

      Murf,

      Our redundancy will prevent any loss of data; sometimes it just takes a little while to recover it when we have an outage like this.

      Even though I’d be willing to bet that no one will ever have to worry about permanent data loss with Geni, it’s never a bad idea to export your data periodically. You can do that with our GEDCOM export at http://www.geni.com/gedcom (I’d give it another day or two, it will probably be slow while we are finishing the recovery from the outage). You can also try a product like AncestorSync, which will allow you to push your Geni data to several different genealogy applications/databases.

      -George

      • Ldcraigo

        It is now August 13th and I think that permanent data loss has already hit my account! I paid for the Geni Pro in July for a month the problem on 8/1 hit and I was locked out of my account, suddenly I was expired by 28 days (don’t know how that could be when I just signed up in July) and back to basic. in the short time I was a Geni Pro I had merged and collaborated growing the tree to over 500! I paid another month on 8/1 thinking it would be resolved..oh no, contacted the help desk explaining that the payment went through my account on 8/1 and now on 8/7, I am expired again by 28 days again! Got an email saying they were looking into it, never heard a word! Came back in tonight, 8/13..hmm, I am now expired 22 days!! left another message and found that the first message has not even been given to a help desk operator, still in pending! Now you tell me..what’s the point of going Geni Pro if Geni won’t allow you to have Geni Pro services!! So now, you are talking the good game about the virtues of being Pro..I paid for Pro and I haven’t seen nothing of what my money has paid for. Actually, I was better off not paying for it!! Now i am out two months worth and it still wants me to sign up for Geni Pro..I DID!! So far it’s been how to throw money down a rat hole! I don’t feel so good about Geni Pro right now and think I would be better off taking my tree and moving it out of here!!

        • Geni George

          We’re not exactly sure what happened with your account, but our CS team is looking into it now.  You can expect an email from them as soon as they figure it out.  Sorry about the troubles.

          -George

  • Jim staunton

    I guess this should go to the help desk… but I’m finding when I add a new sibling, they often appears way off to the side, with no apparent reason. This is annoying, and I have a lot of sibs I want to add soon. How can this be rectified or avoided?

    Thanks, Jim

    • Anita B

      Jim…you should be seeing a small number next to each child…this is their birth order. 

      The tree will often “appear” differently on your screen based on where you opened it from.  Geni will also, I have noticed, put a child with no descendants in a hole between two children with many so as to compact the width of the tree.  It is all just cosmetic!

  • Alissa

    Yesterday when I added a female to tree and than added her husband to the tree it had a surname and last name for the male?

    • Alissa

      Also the system pulled the wives surname over as a surname for husband.

  • Lee Barber

    I am having problems opening profiles in tree view , once I have updated the profile …. continually gives me the pop up box saying “experiencing a problem contact help desk”

    • http://twitter.com/georgegeni George Geni

      It’s probably related to our ongoing data recovery, but it might be a good idea to submit a ticket at http://help.geni.com just in case there is something that needs manual intervention.

      • Lee Barber

        Thank-you , I have taken your advice and submitted a ticket.

  • Washwo1802

    JUST ANOTHER MONTHLY TORNADO. EXCEPT THIS WAS A FORCE 5.
    MY DISTANT RELATIVES CHANGE EVER COUPLE MONTHS ANYWAY. GREAT GREAT GRAND FATHERS BECOME COUSINS, ETC.

  • Dan Cornett

    It would be helpful if you could:

    a) Give more indication of the kind of information which is still be reload (e.g.: is it relationships/profiles or is it messages, or media, or ….)

    b) a bit more definition of “a few days” … two +/- 1 or three or four, or … ?

    • http://twitter.com/georgegeni George Geni

      Hi Dan,

      I’ve asked the engineers for this information and I’ll share as soon as I can get more details.

      From what I currently know, all profiles and relationships should be available, but media, messages, and other meta data surrounding the profiles and relationships (photo tags, etc) are potentially still being restored.

      -George

    • http://twitter.com/georgegeni George Geni

      The last estimate I heard is that it’s on pace to finish recovering sometime tomorrow.

  • Anonymous

    FYI – if you don’t already know – many of the tags for the pictures do not show up.

    • http://twitter.com/georgegeni George Geni

      Thanks, and we’re aware. That’s part of the data that is still being restored.

  • John Sparkman

    Thanks George and the Geni Team.

    Reading through comments below and relating them to your advice of the ongoing recovery, as a PRO user I would like to offer all subscribers the following thoughts of mine.

    1. We know that the bulk of the restore has been successfully done as promised (Thanks Geni)
    2. We know that there is ongoing “fine tuning” to ensure a 100% restore of all data.
    3. Allow it to be fully restored BEFORE REPORTING DEFICIENCIES as this may just send Geni staff off on a tangent to investigate something that will be rectified with the FULL RECOVERY.
    4. As we were patient over the past weekend – just extend that patience for a few days longer.
    5. LIMIT SYSTEM ENHANCEMENT REQUESTS untill everything is 100%.

    Lastly a very big THANK YOU again may the “wind be at your back”

  • Cherienutley

    this is disgusting, someone else who I do not know has set a family tree up for me, sop that anyone can read my info, much of it is wrong and offensive.

    how can you let someone else set up family trees for other people, this is disgusting. My info is now out there and I have not agreed to any of it, and I do not even know the person. Just because we share the same surname does not give him the right.

    My name is cherie dawn nutley, and I want my profile deleated but I cannot do it as it is managed by stephen nutley. I do not know him, have never hd any contact with him. He has no right to do this, and this site is disgusting to even allow this to happen.

    Get my page off please.

    • http://twitter.com/georgegeni George Geni

      Cherie, please submit a ticket at http://help.geni.com and our CS team will assist.

    • Stephenntly

      Dear Cherie

      I am so sorry that you feel the way you do about our family tree. I believe the rules around private and public profiles are set by Geni depending on how you are related, but if I can I will change your profile to private. At the moment due to the problems that Geni are having, I am unable to see how we are related.

      The information I have used to build the tree is in the public domain. English Births, Deaths and Marriages right up to 2005 are available online at various websites.

      If there are errors on the tree that have caused you offence, I apologise and would like to invite you to join the tree and help me correct it. If you are interested in doing this, please contact me through my profile privately with your email address.

      Regards

      Stephen

  • Lisa Cooper

    Hi,

    It’s the 4th of August and my grandchild’s data (that I have been collecting like diary entries for the past 3 years) is still missing. Are you sure I can get it all back eventually? Feeling pretty anxious right now!

    Thanks,
    Lisa

    • http://twitter.com/georgegeni George Geni

      Lisa, how do you store them on Geni? As a text document?

  • Josh Berkus

    Geni,

    I’m curious as to what went wrong. It’s not common for PostgreSQL to develop unrecoverable data issues short of hardware failure; I can’t remember the last time I’ve seen one and I support some 60 companies using PostgreSQL. If you think you’ve discovered an issue which can cause data corruption in PostgreSQL without hardware failure, the project would like a bug report.

    Thanks!

    –Josh Berkus
       PostgreSQL Project

    • Anonymous

      Thanks for taking the time to comment, Josh. I’ll forward your question to the engineers.

  • Leif B. Kristensen

    I think that you’ve got a lot of nerve to just go and blame this situation on PostgreSQL and “an issue with the database’s index.” Can you please tell us exactly what went wrong? An administrator error? Or not running a fully patched version? Those are two of the suggestions from the PostgreSQL community.

    On the other hand, you could have discovered a new bug, which I know that the PostgreSQL community would be happy to know about. Normally, bugs in PostgreSQL are fixed within hours.

    • http://twitter.com/georgegeni George Geni

      Leif,

      Our VP of Engineering is in contact with Josh, and has explained our problem and troubleshooting to him.

      -George

      • Leif B. Kristensen

        George,

        that’s well and good. Still I think that the original post is putting unneccessary blame on a database engine with which I have had nothing but impressive experience. From my own personal knowledge about the PostgreSQL community since 2005, I know that they would be very happy with a formal bug report regarding your outage. Normally, a PostgreSQL bug is fixed within hours.

        • http://twitter.com/georgegeni George Geni

          We’ve been impressed by our experience with PostgreSQL, too (with the obvious exception of this weekend). We’d report a bug, but there is some issue where the queries that caused the exception weren’t logged, so we weren’t able to identify what query resulted in the crash(es). We have sent the logs to Josh in hopes that he can help. If we had enough information to submit a bug request, we’d have already done it.

          • Leif B. Kristensen

            «The issues that caused the outage were with Geni’s PostgreSQL database. We know that hardware issues or data corruption were not the root of the problem, and we suspect that it was an issue with the database’s index.»

            I have a feeling that if you had chosen $(your favourite commercial database) and issued such a statement, a host of lawyers from $(your favourite commercial database) now would have been all over your site, demanding an answer why you’d think $(your favourite commercial database) was to blame for the situation.

          • Anonymous

            I have a feeling that I’d rather be eating ice cream right now. We are communicating what we know to our users. If you wanna help, cool beans. Otherwise let’s take this offline.

          • Josh Berkus

            Leif,

            Attacking people for reporting when they encounter a problem is not a good way to build the PostgreSQL community.

          • Leif B. Kristensen

            Neither is bad-mouthing PostgreSQL for what may as well have been an administrator error, for what we know. Blaming the tools is generally known as the mark of a poor craftsman.

            I won’t discuss this further.

        • Josh Berkus

          Leif,

          Stuff happens, especially in large high-demand database systems. PostgreSQL does get bugs, or we wouldn’t issue patch releases every two months. I think the blog post above was fairly direct and honest.

          • http://twitter.com/stangel Mike Stangel

            We certainly don’t mean to lay blame on Postgres or the outstanding work that so many people have put into it — it’s been very stable for us and we’re proud to say it’s our database of choice. The problem was certainly *in* the database inasmuch as it wasn’t outside of it, and while we said that we’ve ruled out a hardware issue I suppose it’s not impossible that some transient condition caused the corruption in the internal state of the database. Unfortunately we don’t have enough specifics to submit a formal bug report, which is why I’m grateful to be in contact with Josh — he’s given me some guidance to make sure that we’re in a position to gather more information should this situation crop up again.

  • Anonymous

    FYI the link in the email goes to a 404 page: http://www.geni.com//blog/last-weekends-outage-368211.html (note that it has two slashes)

    • http://twitter.com/georgegeni George Geni

      Thanks intgr.

  • Donna & Bill Laverty

    thank you for your directness and your honesty – this speaks of your high level of integrity, and that is to refreshing in these days of corruption, dishonesty and the widespread total lack of integrity. thank you, thank you, thank you!

    Bill Laverty (account under my wife’s name: Donna Laverty)

  • Chris

    Any idea when the missing text in “share some things about xxx” will come back? All my notes have gone missing for very near all my 4000+ entries and the media attachments. Now getting worried as expected deadlines seem to come and go. Clearly I’ll have to make other back up arrangements if they come back such as gedcom: if it doesn’t come back I reckon it will take me 3 years to get through all the research again. Chris.

    • http://twitter.com/geni Geni

      Chris – they should have been back yesterday. If they weren’t, please let us know.

  • Karl H. Wollan

    “we would like to offer an additional week of Pro service to you for this inconvenience”

    A week???

    You might want to reconsider. A week is hardly noticable. A week will barely bring a smile on anyones faces. A week doesn’t have a feelgood factor.
    You should at least throw in a month. Not for the money’s worth, but for the feelgood factor.

    Me? Yes I’m a “PRO” but I opted for the lifetime membership so I’m not gettin’ a “free week”. Or a “free month”. Therefore I can say it out loud. Give ‘em something that puts the smile back on their faces. In the long run, you too will benefit from that.

    Sincerely, Karl H. Wollan