Deploy Plan for the Week of 2020-12-07 (Updated 2020-12-09 16:30 PST)


Mazidox Linden

  • Lindens

Update @ 2020-12-11 11:45 PST : added plans for Friday rolls.

 

We're continuing our daily rolls this week as we get closer to the finish line of having uplifted simulators ready for an all-uplifted future.

Monday:

Morning:

Second Life Server (Main Channel) will restart between 6:00 AM and 12:00 PM Pacific Standard Time.

https://releasenotes.secondlife.com/simulator/2020-12-01.553168.html

Tuesday:

Morning:

Second Life RC channels will restart in the morning, 6 AM - 9:30 AM Pacific Standard Time.

https://releasenotes.secondlife.com/simulator/2020-12-02.553176.html

Wednesday:

Morning:

Second Life Server (Main Channel) will restart between 6:00 AM and 12:00 PM Pacific Standard Time.

https://releasenotes.secondlife.com/simulator/2020-12-02.553176.html

 

Thursday:

Morning:

Two of the Second Life RC channels (RC BlueSteel and RC LeTigre) will restart between 8:00 AM and 10:00 AM Pacific Standard Time.

https://releasenotes.secondlife.com/simulator/2020-12-09.553685.html

Friday:

Afternoon:

The un-updated Second Life RC channel (RC Magnum) will restart between 12:30 PM and 2:30 PM Pacific Standard Time.

https://releasenotes.secondlife.com/simulator/2020-12-09.553685.html

After the RC channel is done updating, we will begin updating the Second Life Server (i.e. Main Channel) regions with the same code after a brief pause.

On Region Restarts:

There will be a lot of them this week too! We did get some more testing done on uplifted Grid Poking Bot, but found a couple of issues that need more attention.

Edited by Mazidox Linden
Updating with Friday rolls
  • Like 4
  • Thanks 6
Link to comment
Share on other sites

And when are you going to fix the issue with super-slow or failing connections (login, seed replies, capabilities fetches, simulator features, AIS3 inventory, bakes)? Since November 26th, everything has been ruined for me in SL because of them: when the AWS servers take 30 seconds or more (!) to complete those connections, everything fails in SL!

And not even a word about this MAJOR issue on the grid status page!

Edited by Henri Beauchamp
  • Thanks 1
Link to comment
Share on other sites

4 hours ago, Henri Beauchamp said:

And when are you going to fix the issue with super-slow or failing connections (login, seed replies, capabilities fetches, simulator features, AIS3 inventory, bakes)? Since November 26th, everything has been ruined for me in SL because of them: when the AWS servers take 30 seconds or more (!) to complete those connections, everything fails in SL!

And not even a word about this MAJOR issue on the grid status page!

You should file a JIRA issue Henri.

Link to comment
Share on other sites

Better late than never.  I suppose I shouldn't grumble about the lateness of the OP, but today's restarts were appalling.  My region was restarted at about 8:30am PST and while I was checking over our scripted animals (not breedables!) the region either crashed or was restarted without the usual warning. I must assume it crashed since my avatar was still online in limbo some 15 minutes later, and only an IM from my alt dislodged it.

The region was not restarted until over 2 hours later, and I was told prior to that by Support that they could not "nudge" it until the roll had completed.  I must hope that Wednesday's roll is a little less incident prone.

Edited by Aishagain
additional text
Link to comment
Share on other sites

I just want to thank all the Lindens for their amazing work on the uplift to the cloud.

I get better performance everywhere: everything is faster and flawless, be it in busy clubs or at big shopping events. I can again drive across the mainland with smooth, fast region crossings and no trouble. The other day, at the Bay City Christmas tree lighting event, there were many people and it was still smooth and fast.

Knock on wood!

I am in Scandinavia on fiber and have an excellent AWS connection; I guess that is the alpha and omega.

Link to comment
Share on other sites

9 hours ago, Whirly Fizzle said:

You should file a JIRA issue Henri.

He shouldn't need to. LL should be on top of this general sort of thing without user prompting.

I guess it's 'mind over matter', we don't mind, and you don't matter.

  • Thanks 1
  • Sad 1
Link to comment
Share on other sites

42 minutes ago, Anna Nova said:

LL should be on top of this general sort of thing without user prompting.

How? It's clearly not a grid-wide issue; if it were, you'd see a lot more posts on the forums, as happens every single time something breaks, and lower online numbers in-world. I, personally, have had no issues with SL recently. So it sounds like either a regional or an ISP issue to me, maybe certain routing from his location to the AWS servers LL are using now. To find out, the issue must be filed somewhere, as a JIRA or at least a ticket.

I worked at an ISP for a long time, so here is an example from there. It's relatively easy to see a big issue that affects a lot of people, like a cut 100 Gbit fiber line or dead/dying hardware: heavy traffic drops on graphs, a bunch of FCS and other errors, etc. But if it's a problem on the "last mile" that affects 10 to 50 people at most, then until one of those affected people pokes tech support and they check manually, there's no problem anywhere as far as the ISP is concerned.

Edited by steeljane42
  • Like 3
Link to comment
Share on other sites

44 minutes ago, steeljane42 said:

How? It's clearly not a grid-wide issue; if it were, you'd see a lot more posts on the forums, as happens every single time something breaks, and lower online numbers in-world. I, personally, have had no issues with SL recently. So it sounds like either a regional or an ISP issue to me, maybe certain routing from his location to the AWS servers LL are using now. To find out, the issue must be filed somewhere, as a JIRA or at least a ticket.

I worked at an ISP for a long time, so here is an example from there. It's relatively easy to see a big issue that affects a lot of people, like a cut 100 Gbit fiber line or dead/dying hardware: heavy traffic drops on graphs, a bunch of FCS and other errors, etc. But if it's a problem on the "last mile" that affects 10 to 50 people at most, then until one of those affected people pokes tech support and they check manually, there's no problem anywhere as far as the ISP is concerned.

I have a 1Gbps (downlink)/700Mbps (uplink) FTTH connection to the Internet, without any issue on it, provided by a major ISP in France (Free/Iliad).

I have no issue whatsoever connecting to Amazon web services outside of SL (all Amazon websites load super fast and without a glitch).

I also had no issue in SL before US Thanksgiving.

There are other affected people (see this thread, for example), who I assume do not use the same ISP and most likely do not live in the same region or even the same country.

Obviously, LL broke something just before the holidays and no one has cared ever since (they even went as far as shutting down their support service entirely that very weekend!).

As for the JIRA, I gave up on it a long time ago: posting an issue on it is like "pisser dans un violon" (literally "pissing in a violin", i.e. a wasted effort), as we say in France. I have detailed logs (Wireshark pcap and viewer) available to any Linden who cares to contact me.

Edited by Henri Beauchamp
  • Thanks 2
  • Haha 1
Link to comment
Share on other sites

11 hours ago, Aishagain said:

I must assume it crashed since my avatar was still online in limbo some 15 minutes later, and only an IM from my alt dislodged it.

That sounds a lot like a bug that's been going on for quite some time - weeks, perhaps months. Maybe that particular thing was, by coincidence, due to this rather than to a crash or roll-out.

For me, it is very common that, when a friend logs out, I am told that "<friend's name> has logged out", and instantly I get another message saying that "<friend's name> has logged in". My friends list shows them as being logged in. As far as I can tell, they are logged in. They haven't logged back in, of course, and an IM to them gets the message that they are not online and the IM will be saved, etc. My friends list gets up to date too.

With one friend, it was every time she logged out. With others, it's only sometimes, and with yet others, it's probably never.

Most of the time it doesn't happen, but it's so common - more than daily - that there must be a jira about it. It sounds a lot like you described, but without a crash or roll-out.

ETA: As I've been writing this, I've been wondering why it happens more with some people than with others, and I'm thinking that maybe it's due to having had IM conversations with them prior to them logging out. It's nothing to do with the roll-outs though, so I'll stop rambling on :)

Edited by Phil Deakins
  • Like 1
Link to comment
Share on other sites

@Phil Deakins The bug you refer to is indeed old in SL terms. It occurs when an account logs out and receives a communication of any sort, but principally a personal IM, before the avatar has left the simulator where it was. This seems to fool the system into thinking that the avatar is still online. In some extreme cases the avatar then NEVER logs out until a further direct IM is received.

This ghosting was officially eradicated back in the SL middle ages but in certain circumstances (region crashes being one) the bug rears its head, it would seem.

Link to comment
Share on other sites

@Anna Nova The thing is, Anna, that LL have a very good and reliable bug tracking system in the JIRA, and they rely on reports from folk who know what they're talking about, like Henri, to give them firm evidence of problems which possibly do not affect everyone. Henri's experiences plus his insight into what "might" be the cause or cure are vital to their clearing up problems in SL.

It would be good if Lindens had enough time to spend significant spells in SL themselves but they simply do not have the time to be both the operators of SL and users.

Edited by Aishagain
additional text
Link to comment
Share on other sites

Good news !

The slow/failing SL-AWS server replies seem to have been solved during the night (i.e. yesterday afternoon, SLT)!

Everything is back to normal for me, with fast logins, snappy rezzing and (non-failing) TPs, reliable inventory operations, baking, etc...

I'm still curious about the origin of that issue... So, if a Linden could elaborate about it...

 

12 hours ago, Aishagain said:

@Anna Nova The thing is, Anna, that LL have a very good and reliable bug tracking system in the JIRA, and they rely on reports from folk who know what they're talking about, like Henri, to give them firm evidence of problems which possibly do not affect everyone. Henri's experiences plus his insight into what "might" be the cause or cure are vital to their clearing up problems in SL.

Sadly, the JIRA is (by far) not the best place for TPV developers to be heard (and even less listened to) by LL... Your issue gets lost in an ocean of other issues, most of which contain much less relevant/pertinent info, and the JIRA triagers won't even know who the said TPV devels are or how valuable their reports are.

I have much better luck with comments on the commits to LL's viewer code (as a means of preventing future SL bugs, by pinpointing them in pre-release viewer code), with messages posted to the opensource-dev list, or on this forum. My posts in the latter two media are pretty rare occurrences, and usually denote a catastrophic degradation that requires urgent attention from LL.

As for viewer bugs, the JIRA is of little to no use for me either, as a TPV developer: I do not have even read-only access to most JIRA issues (not even those referenced in LL's viewer commits!), and anyway, when I find a bug or get one reported to me, I fix it myself for my viewer (since it's Open Source, anyone can benefit from my fixes as well, should they care to have a look at the weekly diff and change log I publish for each new release).

<rant>There are the TPV devel group meetings, but sadly they are held on voice, meaning non-English speakers like me cannot attend them (well, they can, but won't understand half of what is said)...</rant>

<nostalgia>In the distant past (before Oz' era), these meetings were held exclusively in text chat, giving everyone a chance to understand and express themselves (via translators if needed), whatever their nationality</nostalgia>

Edited by Henri Beauchamp
  • Thanks 1
  • Sad 1
Link to comment
Share on other sites

On 12/7/2020 at 6:14 PM, Mazidox Linden said:

We're continuing our daily rolls this week as we get closer to the finish line of having uplifted simulators ready for an all-uplifted future.

 

After having them ready for the all-uplifted future, are you guys going to return to the 2 (1 Main and 1 RC) restart days per week? I might have missed some info about this a few weeks earlier, but constantly running into restarting regions for 3 or more days a week makes any scheduling impossible.

 

On the other hand, region crossing behavior is now much better; some vehicles are definitely getting far fewer errors from being out of sync with the driver on crossings.

 

Interestingly, though, it also seems like group chat service doesn't work too well, or at all, during these restart hours, while otherwise it looks like it returned to its usual not too perfect, but working state.

Link to comment
Share on other sites

1 hour ago, Henri Beauchamp said:

<rant>There are the TPV devel group meetings, but sadly they are held on voice, meaning non-English speakers like me cannot attend them (well, they can, but won't understand half of what is said)...</rant>

and there is a weekly Open Development User Group in text (it's going on as I post this). Try stopping by some time.

  • Like 1
  • Thanks 1
Link to comment
Share on other sites

25 minutes ago, AlettaMondragon said:

After having them ready for the all-uplifted future, are you guys going to return to the 2 (1 Main and 1 RC) restart days per week? I might have missed some info about this a few weeks earlier, but constantly running into restarting regions for 3 or more days a week makes any scheduling impossible.

That is the plan, but we may yet have a few more weeks of irregular restarts. We regret the inconvenience.

  • Thanks 4
Link to comment
Share on other sites

1 hour ago, Henri Beauchamp said:

The slow/failing SL-AWS server replies seem to have been solved during the night (i.e. yesterday afternoon, SLT)!

Everything is back to normal for me, with fast logins, snappy rezzing and (non-failing) TPs, reliable inventory operations, baking, etc...

I'm still curious about the origin of that issue... So, if a Linden could elaborate about it...

As much as I'd like to take credit for it, we didn't do anything.

  • Thanks 1
Link to comment
Share on other sites

53 minutes ago, Oz Linden said:

As much as I'd like to take credit for it, we didn't do anything.

This does not bode well for the future, then... If LL is unaware of the cause (meaning it would be on AWS' side and they did not communicate about it with LL), it will likely recur. 🙄

One of the drawbacks of the whole "uplift" thing...

  • Thanks 1
Link to comment
Share on other sites

7 hours ago, AlettaMondragon said:

 

On the other hand, region crossing behavior is now much better; some vehicles are definitely getting far fewer errors from being out of sync with the driver on crossings.

 

I agree things do seem better at places like Blake and Eden/North Sea (as of yesterday and this morning), but sadly at Bellisseria (eastern waterway) I ran into several issues, including being thrown from the boat after a slow crossing (the older crossing issue), which was followed a few crossings later by the dreaded frozen animation after coming to a near stop upon crossing. It happened once more, and that was followed by being disconnected after a bad crossing.  😬

Even with seeing those issues today at Bellisseria, again I do agree, things do seem to be improving! 🙂

 

Edit: I did run into one issue of frozen animations at Blake as of 2:30 SLT, but considering the distance covered, I'd still agree it's an improvement on how things have been.

 

Edited by Vicious Hollow
  • Like 1
Link to comment
Share on other sites

  • Maestro Linden changed the title to Deploy Plan for the Week of 2020-12-07 (Updated 2020-12-09 16:30 PST)

11 hours ago, Henri Beauchamp said:

You mean 100% text chat (I won't understand anything on voice and have no time to lose with voice meetings in English)?...

The following user groups are primarily text communication based:

http://wiki.secondlife.com/wiki/Server/Sim/Scripting_User_Group

http://wiki.secondlife.com/wiki/Open_Development_User_Group

http://wiki.secondlife.com/wiki/Server_Beta_User_Group

Link to comment
Share on other sites

1 hour ago, Miller Thor said:

I hope we will return to the usual times as soon as possible. Main channel Tuesday 3 am PST, RC channel Wednesday 6 am PST. These ongoing intermediate actions are not exactly optimal.

Those of us outside the Americas, in very different time zones, are used to those times, but they may not feel so optimal. Some of the recent roll-outs have been very prolonged. Second Life provides services to the whole world, and Poughkeepsie.

  • Thanks 1
Link to comment
Share on other sites

A short test shows that the EEP issue really has been fixed, and for the moment it works as it should.

If it still doesn't work, the solution is as follows. Open the setting, change one of the values and reset it back to its original value, then just overwrite with the Save button. This will resave the preset to the asset system and the problem will be resolved. This should be necessary, if at all, for at most one preset; after that everything will work as before.

Edited by Miller Thor
Link to comment
Share on other sites
