This past Sunday wasn’t very fun. Second Life had issues for a bunch of hours. I want to explain what happened.
The trouble started back on Thursday. We had some pretty bad problems (packet loss) talking to key services over one of our Internet links. It didn’t impact everything, but the stuff it hit was pretty important. It lasted a few hours on Thursday, then just magically went away on its own. It’s never good when a problem just magically fixes itself, because it’s pretty likely it’s gonna happen again.
The same thing happened again on Friday, but once again it went away on its own before we were able to debug what actually happened.
We were nervous going into the weekend, and sure enough, it bit us again on Sunday. We started having more Internet communication problems, but this time, it didn’t just go away.
Now that we had it in a bad state, we started to troubleshoot and figured out really fast that it wasn’t our equipment. Our stuff was (and still is) working just fine, but we were getting intermittent errors and delays on traffic that was routed through one of our providers. We quickly opened a ticket with the network provider and started engaging with them. That’s never a fun thing to do, because it means waiting on hold on the phone with a vendor while Second Life isn’t running as well as it usually does.
After several hours trying to troubleshoot with the vendor, we decided to swing a bigger hammer and adjust our Internet routing. It took a few attempts, but we finally got it, and we were able to route around the problematic network. We’re still trying to troubleshoot with the vendor, but Second Life is back to normal again.
While we were troubleshooting, I happened to look at the forums a few times and noticed people asking if it was related to the power outages we’re having here in California. As much as I’d like to say that it was, we have no reason to believe that’s the case. The people on my team did get hit by the outages, however! One of our engineers was working in the dark in a house without power, with his laptop being powered by a long string of extension cords that ran to a generator outside.
We are actively working on moving some services around to make us more resilient to incidents like what happened this weekend. It’s our top priority right now.
We’re really sorry that this past Sunday wasn’t very fun. The weekend before Halloween is a really fun time to be Inworld, and it was a frustrating day all the way around. (I personally love the way our Residents really get into Halloween in a way that’s only possible in Second Life!) Knowing that it wasn’t as awesome as it could have been makes me sad, and we’re working to make it better in the future.
If you are having problems which you believe began during this outage, Support is ready to help.
Second Life Operations Manager