Oskar Linden

Deploys for the week of 2012-01-30


The code from Magnum is getting promoted this week. llSetRegionPos() is being enabled grid-wide! The region crossing code on BlueSteel and LeTigre last week had the presence issue, as you are well aware. We fixed that issue, but unearthed a serious crashing bug during our QA phase. We will not be releasing it tomorrow and will instead be putting a maint-server on all three RCs.
 
Second Life Server (main channel)
  • Features
    • New LSL function integer llSetRegionPos(vector position)
      The object with the script will move the root prim position to the given location. The position is any position within the region. If the position is below ground, it will be set to the ground level at that X,Y spot. The function has no delay or throttle.
      • Returns 1 if the object is successfully placed within 0.1 m of position.
      • Returns 0 and does not move the object if position is more than 10m off region or above 4096m.
      • Returns 0 and does not move the object if the object is dynamic (has physics enabled).
      • Returns 0 and does not move the object if the object can not move to position due to object entry rules, prim limits, bans, etc.
    • "frame_number" option added to llGetEnv()
      Returns an integer that represents the current 'frame' of the simulator. Generally only useful for specific debugging cases.
  • Bug Fixes
    • SVC-7466 A notecard holds more data than a script can read
    • SVC-7520 Keyframe motion doesn't move towards the correct position when specified time is not an exact multiple of 1/45 seconds
    • SVC-7485 llSetKeyframedMotion cannot stop animation if none is running ... sounds less important than it is ...
    • SVC-7493 Weird mesh land impact issue
    • Fixed several simulator crash bugs and potential memory leaks.
    • Fixed a notecard crashing bug.
    • SVC-7613 Greatly increased network activity since the 1/18/12 RC channel rolling restart
    • SVC-7608 Sims are not visible when diagonally opposed. (this was not a simulator-side fix, but the bug was originally visible only in the previous iteration of this project)
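
For anyone planning scripts around llSetRegionPos(), the return rules listed under Features can be summed up in a small model. This is only a Python sketch of the rules as written in these notes, not simulator code: the `MoveRequest` fields `entry_allowed` and `ground_height` are hypothetical stand-ins for checks the simulator performs itself, and the notes do not say what the real function returns after clamping a below-ground position, so returning 1 in that case is an assumption.

```python
from dataclasses import dataclass

REGION_SIZE_M = 256.0        # a region is 256 m on each side
OFF_REGION_MARGIN_M = 10.0   # "more than 10m off region" is rejected
MAX_HEIGHT_M = 4096.0        # "above 4096m" is rejected


@dataclass
class MoveRequest:
    x: float
    y: float
    z: float
    is_physical: bool = False    # object is dynamic (physics enabled)
    entry_allowed: bool = True   # hypothetical: entry rules, prim limits, bans
    ground_height: float = 0.0   # hypothetical: ground level at (x, y)


def set_region_pos(req):
    """Return (status, final_position) following the rules in the notes."""
    low = -OFF_REGION_MARGIN_M
    high = REGION_SIZE_M + OFF_REGION_MARGIN_M
    if not (low <= req.x <= high and low <= req.y <= high):
        return 0, None           # more than 10 m off region
    if req.z > MAX_HEIGHT_M:
        return 0, None           # above 4096 m
    if req.is_physical:
        return 0, None           # dynamic objects are not moved
    if not req.entry_allowed:
        return 0, None           # blocked by entry rules, prim limits, bans
    # Below-ground targets are raised to ground level (assumed to succeed).
    z = max(req.z, req.ground_height)
    return 1, (req.x, req.y, z)
```

The practical point for scripters is that a 0 result means the object has not moved at all, so a script should check the return value before assuming it is at the new position.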

Second Life RC BlueSteel
This maint-server has huge improvements to certain LSL functions. We attempted to roll this code out a few weeks ago but noticed a script crashing bug that necessitated a rollback. This version is fixed. This code is on all three RC channels.

 

Second Life RC LeTigre
This is the same maint-server that is on BlueSteel.

 

 

Second Life RC Magnum
This is the same maint-server that is on BlueSteel.


 

We will be monitoring this thread during the next week, so please feel free to post issues that you feel have been introduced by the new code. Please file a JIRA for issues you find and post the JIRA link into this thread. It really helps us out. When determining whether an issue is relevant, research is key. Tracking down exactly the situation in which an issue occurs greatly speeds up the development process to get fixes in place.
I appreciate your help. Have a good week!
 
__Oskar
 
p.s. If you are interested in helping test Second Life in beta please join the group "Second Life Beta" in-world. We also have an email list where we communicate upcoming projects and how you can help. ( https://lists.secondlife.com/cgi-bin/mailman/listinfo/server-beta ) Once a week we meet on ADITI to discuss new features, new bugs, new fixes, and other fun stuff. You are more than welcome. Information is here: https://wiki.secondlife.com/wiki/Server_Beta_User_Group


I hope you enjoyed your weekend as much as I enjoyed mine. :matte-motes-big-grin:

Good luck with this week's Deploys. It made for interesting bed time reading.

/me waves cheery goodnight to all the usual suspects.




Hmmm well... yeah nice for adding new LSL functions, but could you or anybody else please update the LSL function IDs so we can properly include them for the LSL syntax highlighting at least in TPVs?

Current missing function IDs as of January 31, 2012:

  • llSetMemoryLimit
  • llGetMemoryLimit
  • llSetLinkMedia
  • llGetLinkMedia
  • llClearLinkMedia
  • llSetLinkCamera
  • llSetContentType
  • llLinkSitTarget
  • llAvatarOnLinkSitTarget
  • llSetVelocity
  • llCastRay
  • llGetMassMKS
  • llSetPhysicsMaterial
  • llGetPhysicsMaterial
  • llManageEstateAccess
  • llSetKeyframedMotion
  • llTransferLindenDollars
  • llGetParcelMusicURL
  • llScriptProfiler
  • llGetSPMaxMemory
  • llGetUsedMemory
  • llSetAngularVelocity
  • llSetRegionPos


So you have fixed the ghosting issue. Great.

But you're not deploying the fix for another 30+ hours. Not Great.

If you do have an explanation for continuing to run server code which you know is broken, I would like to hear it.

The best answer would be that you are running the fix through QA processes. That still doesn't explain why, when some regions are inaccessible because of the number of ghosted AVs, you have not reverted them to an earlier, known-safe, version of the code.

It's Tuesday now, there has been time for a decision to be made at a senior level. There has been some good work done since the problem became obvious on Saturday. But the current situation leaves a distasteful impression that senior management at Linden Labs don't give a **bleep**.


@Oskar

Wow one thing is fixed...I logged in and got to post first attempt!:smileyvery-happy:

SVC-7632...oh dear.  Now you say (and I know it's not YOU) that QA have fixed the "ghosting" bug in the threaded region crossing code...this is the same QA that gave us the bug in the first place, so pardon me if I don't rejoice until it is clear that they did their job properly this time.

And having seen Philip's comment in the NYT (see Tateru Nino's blog), I wonder how long it will be until I try to log in one day and there is nothing there?  Does he have NO common sense?

*sighs*  "Onward":smileywink:



Ansariel Hiller wrote:

Hmmm well... yeah nice for adding new LSL functions, but could you or anybody else please update the LSL function IDs so we can properly include them for the LSL syntax highlighting at least in TPVs?

I agree and the viewer team is aware of the issue. Since they operate on a different release schedule from the server team there can sometimes be a disconnect. We've been investigating other options for keeping the function list up to date. As for TPV's we have no control over syntax highlighting in their code.

__Oskar



WolfBaginski Bearsfoot wrote:

So you have fixed the ghosting issue. Great.

But you're not deploying the fix for another 30+ hours. Not Great.

If you do have an explanation for continuing to run server code which you know is broken, I would like to hear it.

The best answer would be that you are running the fix through QA processes. That still doesn't explain why, when some regions are inaccessible because of the number of ghosted AVs, you have not reverted them to an earlier, known-safe, version of the code.

It's Tuesday now, there has been time for a decision to be made at a senior level. There has been some good work done since the problem became obvious on Saturday. But the current situation leaves a distasteful impression that senior management at Linden Labs don't give a **bleep**.

I do have an explanation. Time. There was a time when fixes like this would take 3-6 months or longer. Ask around about how it used to be. Getting a fix out in a week or less is a massive improvement. I wish everyone could realize that instead of jumping to judgement. Every step in the QA/Dev process takes time. This bug was only introduced on Wednesday. We knew immediately it was an issue. Step 1 was to isolate the issue and determine the cause. This takes time. It takes time away from QA/dev for new projects. Step 2 was isolating the faulty code. Step 3 is finding a fix. Step 4 is implementing the fix. Step 5 is building the code. Step 6 is deploying the code. Step 7 is testing the fix. Each step takes multiple hours. When you overlay that onto our standard west coast work schedule you realize how strapped for time we are to get fixes out so fast. In the best case situation a developer has about a day and a half to find the bug and make a fix. If they don't have a fix by Friday we have to pull their code for the next release cycle. QA doesn't get their hands on the code until Monday at the earliest. This gives us a very small window to attempt to verify the bug and the fix. If the fix doesn't work we have even less time. Realistically, if the code is not complete and bug free by Monday night it won't make the Wednesday morning release window.

I hope you can understand how much gets done in such a short period of time. I also would appreciate your understanding of the stress most here are under and the passion and diligence they put into creating a better Second Life. I am sorry that your experience hasn't been good.

__Oskar

 



Oskar Linden wrote:


WolfBaginski Bearsfoot wrote:

So you have fixed the ghosting issue. Great.

But you're not deploying the fix for another 30+ hours. Not Great.

If you do have an explanation for continuing to run server code which you know is broken, I would like to hear it.

The best answer would be that you are running the fix through QA processes. That still doesn't explain why, when some regions are inaccessible because of the number of ghosted AVs, you have not reverted them to an earlier, known-safe, version of the code.

It's Tuesday now, there has been time for a decision to be made at a senior level. There has been some good work done since the problem became obvious on Saturday. But the current situation leaves a distasteful impression that senior management at Linden Labs don't give a **bleep**.

I do have an explanation. Time. There was a time when fixes like this would take 3-6 months or longer. Ask around about how it used to be. Getting a fix out in a week or less is a massive improvement. I wish everyone could realize that instead of jumping to judgement. Every step in the QA/Dev process takes time. This bug was only introduced on Wednesday. We knew immediately it was an issue. Step 1 was to isolate the issue and determine the cause. This takes time. It takes time away from QA/dev for new projects. Step 2 was isolating the faulty code. Step 3 is finding a fix. Step 4 is implementing the fix. Step 5 is building the code. Step 6 is deploying the code. Step 7 is testing the fix. Each step takes multiple hours. When you overlay that onto our standard west coast work schedule you realize how strapped for time we are to get fixes out so fast. In the best case situation a developer has about a day and a half to find the bug and make a fix. If they don't have a fix by Friday we have to pull their code for the next release cycle. QA doesn't get their hands on the code until Monday at the earliest. This gives us a very small window to attempt to verify the bug and the fix. If the fix doesn't work we have even less time. Realistically, if the code is not complete and bug free by Monday night it won't make the Wednesday morning release window.

I hope you can understand how much gets done in such a short period of time. I also would appreciate your understanding of the stress most here are under and the passion and diligence they put into creating a better Second Life. I am sorry that your experience hasn't been good.

__Oskar

 

Thanks for the explanation...didn't realize the process and timing involved  :( 


TBH I think a week is very fast, and I'm not even a programmer.
...though I did fix a glitch in the zhao 2 script, and I can't wait to see if I can't show my fix to the creator soon. So I do dabble...

I can't wait until the ghosting issue is fixed, cause I've been locked out for a while now...


@Oskar

You know you didn't really address one of the main tenets of Wolf's post; namely, why wasn't the faulty code (which you soon KNEW was faulty), rolled back?

And you wouldn't want TPVs to keep pace with the LL viewer so why give them the function IDs? *coughs*

Yes, things are better than they were two or three years ago in some ways, but definitely not in others.  I doubt your words would comfort someone ghosted on Thursday.

I suppose it is unrealistic to expect LL to have staff working at weekends, after all when would they get their recreation?

"It's just a game" *coughs again* :smileywink:



Ayesha Askham wrote:

@Oskar

You know you didn't really address one of the main tenets of Wolf's post; namely, why wasn't the faulty code (which you soon KNEW was faulty), rolled back?

And you wouldn't want TPVs to keep pace with the LL viewer so why give them the function IDs? *coughs*

Yes, things are better than they were two or three years ago in some ways, but definitely not in others.  I doubt your words would comfort someone ghosted on Thursday.

I suppose it is unrealistic to expect LL to have staff working at weekends, after all when would they get their recreation?

"It's just a game" *coughs again* :smileywink:

Rollbacks are very costly for a number of reasons. We don't roll back lightly. Every release to the grid is a very complicated process. Rollbacks are reserved for content loss issues or server crashes above a certain threshold. This issue, while frustrating, was easily handled by support. Content was not lost, there were no griefing exploits, and regions weren't crashing. It wasn't an emergency.

Yes, it is unrealistic to expect staff to work on the weekends. However, most still do. They don't get paid extra, they are just passionate about making Second Life better.

I misread the LSL function ID request when I worded my answer. Even so, there isn't anything I can do other than tell the viewer team and they already know.

__Oskar



Oskar Linden wrote:

I agree and the viewer team is aware of the issue. Since they operate on a different release schedule from the server team there can sometimes be a disconnect. We've been investigating other options for keeping the function list up to date. As for TPV's we have no control over syntax highlighting in their code.

__Oskar

But I do have control over the syntax highlighting code in Firestorm. Without knowing the function IDs, though, I cannot add the new functions in the correct order, and so I risk messing up the produced script bytecode.

Question: Who actually defines those function IDs? It would really be helpful if the LSL Wiki were updated more promptly.


I applaud you for your patience and not losing it while trying to answer people's concerns. Good job sir and keep up the good work! 

 

inb4 Oskar actually does lose it and closes down this thread :P



Ansariel Hiller wrote:

Oskar Linden wrote:

I agree and the viewer team is aware of the issue. Since they operate on a different release schedule from the server team there can sometimes be a disconnect. We've been investigating other options for keeping the function list up to date. As for TPV's we have no control over syntax highlighting in their code.

__Oskar

But I do have control over the syntax highlighting code in Firestorm. Without knowing the function IDs, though, I cannot add the new functions in the correct order, and so I risk messing up the produced script bytecode.

Question: Who actually defines those function IDs? It would really be helpful if the LSL Wiki were updated more promptly.

Viewers do not produce bytecode. More specifically the Second Life servers will not accept bytecode from viewers: all compilation is done on the server from the source provided by the viewer. In the case of code compiled to mono the IDs aren't even relevant or used. The IDs are not needed for syntax highlighting.  There is no reason for any viewer to need accurate function IDs. If the source code requires an ID for each keyword in its syntax table just make them up. They are generally monotonically increasing, but it really doesn't matter what a viewer uses internally.

 - Kelly



Cincia Singh wrote:

This rolling restart also broke thousands of mailing list products in SL; SVC-7631. Nice.

Unfortunately some mailing list and product updaters may break or need to be updated. To stop a griefing mode that has effects on the entire grid's back end infrastructure a throttle was added to llGiveInventory. This throttle matches (but is separate from) the existing throttle on llInstantMessage and exists for nearly identical reasons. That throttle is 5k per hour per owner per region; the maximum burst is 2.5k. It is impossible to hit this limit with a single script, but systems designed to spam very large amounts very rapidly may hit it and need to be adjusted. We will be monitoring the effect of this throttle and will adjust it if needed.

Security issues like this, especially of this grid wide severity, require that we act swiftly and without significant prior notice, for which we do apologize.
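
For scripters trying to judge whether a delivery system will run into the limit described above, here is a rough way to picture it. This is a Python sketch that models the throttle as a token bucket, which is an assumption: the post only gives the numbers (5k per hour per owner per region, 2.5k maximum burst), not the mechanism. The class name `GiveInventoryThrottle` and its `allow` method are made up for illustration.

```python
class GiveInventoryThrottle:
    """Token-bucket model: 5000 per hour sustained, bursts of up to 2500."""

    RATE_PER_SECOND = 5000 / 3600.0   # sustained rate from the post
    BURST = 2500.0                    # maximum burst from the post

    def __init__(self):
        # (owner, region) -> (tokens remaining, time of last update)
        self._buckets = {}

    def allow(self, owner, region, now):
        """Return True if one more give is permitted at time `now` (seconds)."""
        key = (owner, region)
        tokens, last = self._buckets.get(key, (self.BURST, now))
        # Refill for the time elapsed, never exceeding the burst size.
        tokens = min(self.BURST, tokens + (now - last) * self.RATE_PER_SECOND)
        if tokens >= 1.0:
            self._buckets[key] = (tokens - 1.0, now)
            return True
        self._buckets[key] = (tokens, now)
        return False
```

Under this model a sender that stays below roughly 1.4 gives per second (5000 divided by 3600) per owner per region never runs dry, while one that fires several thousand at once is cut off after the first 2500.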



Kelly Linden wrote:

Viewers do not produce bytecode. More specifically the Second Life servers will not accept bytecode from viewers: all compilation is done on the server from the source provided by the viewer. In the case of code compiled to mono the IDs aren't even relevant or used. The IDs are not needed for syntax highlighting.  There is no reason for any viewer to need accurate function IDs. If the source code requires an ID for each keyword in its syntax table just make them up. They are generally monotonically increasing, but it really doesn't matter what a viewer uses internally.

 - Kelly

Okay, then I assume those big warnings in lscript_library.cpp are for server-side code only?

// When adding functions, they <b>MUST</b> be appended to the end of
// the init() method. The init() associates the name with a number,
// which is then serialized into the bytecode. Inserting a new
// function in the middle will lead to many sim crashes. Phoenix 2006-04-10.
// IF YOU ADD NEW SCRIPT CALLS, YOU MUST PUT THEM AT THE END OF THIS LIST.
// Otherwise the bytecode numbers for each call will be wrong, and all
// existing scripts will crash.



Ansariel Hiller wrote:

Kelly Linden wrote:

Viewers do not produce bytecode. More specifically the Second Life servers will not accept bytecode from viewers: all compilation is done on the server from the source provided by the viewer. In the case of code compiled to mono the IDs aren't even relevant or used. The IDs are not needed for syntax highlighting.  There is no reason for any viewer to need accurate function IDs. If the source code requires an ID for each keyword in its syntax table just make them up. They are generally monotonically increasing, but it really doesn't matter what a viewer uses internally.

 - Kelly

Okay, then I assume those big warnings in lscript_library.cpp are for server-side code only?

// When adding functions, they <b>MUST</b> be appended to the end of
// the init() method. The init() associates the name with a number,
// which is then serialized into the bytecode. Inserting a new
// function in the middle will lead to many sim crashes. Phoenix 2006-04-10.
// IF YOU ADD NEW SCRIPT CALLS, YOU MUST PUT THEM AT THE END OF THIS LIST.
// Otherwise the bytecode numbers for each call will be wrong, and all
// existing scripts will crash.

Correct.
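
The append-only rule in that comment is easy to see with a toy table. The following is a Python sketch, not the actual lscript_library.cpp logic: it assumes an ID is simply a function's position in the registration list, and the three existing names are only an illustrative starting list, not a claim about the real IDs.

```python
def assign_ids(function_names):
    """Map each name to its position in the list (the assumed ID scheme)."""
    return {name: index for index, name in enumerate(function_names)}


def changed_ids(old_names, new_names):
    """Return the existing functions whose ID differs between two tables."""
    old_ids = assign_ids(old_names)
    new_ids = assign_ids(new_names)
    return sorted(name for name in old_names if old_ids[name] != new_ids.get(name))


# Illustrative starting table; positions here are not the real IDs.
existing = ["llSin", "llCos", "llTan"]

# Appending keeps every existing ID stable.
appended = existing + ["llSetRegionPos"]

# Inserting in the middle shifts everything after the insertion point.
inserted = existing[:1] + ["llSetRegionPos"] + existing[1:]

print(changed_ids(existing, appended))  # []
print(changed_ids(existing, inserted))  # ['llCos', 'llTan']
```

Already-compiled bytecode that refers to the old numbers would then call the wrong functions, which is the crash the comment warns about. As Kelly says above, this only matters where bytecode is produced, which is on the server.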


OK Oskar, I think that puts things in perspective thanks.:smileysurprised:

 

I just don't think that the remarks will cut much ice with the great unwashed (or unlogged in).

 

As to the rest of this breakage, well Kelly explains it very succinctly, but again, it ain't gonna go down well.

 

As to the viewer code issues, I'll just drown quietly...I'm in way over my head!:matte-motes-bashful:


But you have, in the past, been able to quickly revert to an earlier, safe, version of the code, which bypasses the whole QA element. Why was that choice ruled out?

Somebody chose to keep the system running with broken code. 



Darren Scorpio wrote:

I applaud you for your patience and not losing it while trying to answer people's concerns. Good job sir and keep up the good work! 

 

inb4 Oskar actually does lose it and closes down this thread :P

I don't think Oskar will actually lose it, but so far he's holding up well with dignity. 

Despite what some people think, I saw evidence of there being staff working over the weekend; sims were restarted, griefers removed, emergencies dealt with. 

While it's really sad that some people - a lot of people - have been affected by this new ghosting bug, everybody has been given the opportunity to get their money's worth out of Support, and encouraged to do so.  Others have chosen to waste their time posting the same old same old questions without bothering to even try and look for solutions first, so it's no wonder they've remained locked out of Second Life for almost a week in some cases, when they could have even got themselves back in.

Frustrations and inability to get our own way can cause a bit of sniping and back-biting. At the end of the day, Oskar and co are on the payroll, giving extra value for money by daring to raise their heads above the parapet.

Anyone who is arrogant enough to believe they could modify a running program like Second Life without running into issues on a regular basis really astounds me.

I'm with you, Darren, applauding the patience, and appreciating what the creators of Second Life have done for me since November 2007. 


Hang in there, Oskar.  The IT operations pros among us appreciate your professionalism.

And Kelly, thanks for killing inventory spam. To paraphrase the meme, spammers gonna hate. I look forward to not having to relog and then dump a few score items from my inventory.



WolfBaginski Bearsfoot wrote:

But you have, in the past, been able to quickly revert to an earlier, safe, version of the code, which bypasses the whole QA element. Why was that choice ruled out?

Somebody chose to keep the system running with broken code. 

It only seems quick from your perspective but it takes a minimum of two people working nonstop for multiple hours. It is a labor intensive process. It is also highly disruptive to the grid, commerce, stability, and user perception. We try very hard to stick to the release schedules we publish so people can plan downtime around them. Rollbacks are always highly disruptive. More people are upset at rollbacks that are unnecessary than those who were affected by this issue. We had a support level fix in place for this issue. It didn't escalate to the level of an emergency. 

__Oskar


I have updated the notes for tomorrow's deploy. We caught a crashing bug in the code on BlueSteel and LeTigre during the merge phase. We had a fix in place for the stuck presence issue and verified that it worked. However, during our QA phase we recognized a new crashing bug and did not have the time to implement a fix and QA it before release tomorrow morning. We decided to pull this project until we can work out these new issues. The maint-server that was going to be only on Magnum is now going to be deployed to each of the three RCs.

If you are curious about our Dev/QA process it would be of interest to you to know that there is a lot that goes on between Friday and Wednesday. Friday morning is when we decide which RC channel we should promote to the main channel ("trunk"). After this any existing RC channels and any new code needs to merge with the code that will be the main channel the next Tuesday. The merge process can take many hours. It is common for there to be new issues that need to be worked out on the fly since you are basically combining two entirely separate code branches. Then you have to hope that it builds properly. After that it requires a deploy to a development grid. Each of these builds then needs a QA verification pass. Time is very short for us in this process. Issues need to be found quickly or there isn't time to fix the found bugs. This is a case where our process worked as expected. We found an issue before release. Sadly we didn't have enough time to get a fix out before release. Sometimes we do.

I would encourage you to keep the scope of this entire process in mind when critiquing QA. In the same span of time this process is often done in triplicate if there is a busy backlog of RC candidates ready to go. 

__Oskar


Oskar

Forgive my obtuseness.  This issue with a new bug...does it mean that the ghosting issue will not now be fixed this week or did I misread?

Also, thank you once again for your painstaking explanation of this process such that numbskulls like me can understand what has to be done each week.

I do hope this ghosting issue can be got rid of because it is not simply affecting region-crossers.  If an av that has previously been ghosted on a sim TPs back into that sim, apparently it can never get out again, even if it cleaned up its doppelganger!:smileysurprised:

