Maestro Linden

Deploys for the week of 2013-02-04

The server release train continues to run at full steam this week.  Choo-Choo!

 

Second Life Server (main channel)

The main channel is getting the server maintenance project which was on LeTigre last week.  This project has miscellaneous minor bug fixes and new features.

https://wiki.secondlife.com/wiki/Release_Notes/Second_Life_Server/13#13.01.25.269523

Scheduled Tuesday 2013-02-05 05:00-12:00 PST

 

Second Life RC BlueSteel:

Once again this week, BlueSteel gets the project that adds support for normal and specular maps.  This blog post has more information about the project.  The next step is to release a project viewer, which will allow users to try out this new functionality.

https://wiki.secondlife.com/wiki/Release_Notes/Second_Life_RC_BlueSteel/13#13.02.01.269843

Scheduled Wednesday 2013-02-06 07:00-11:00 PST

 

Second Life RC LeTigre:

LeTigre is getting a new maint-server project.  This update fixes some miscellaneous crash modes, and offers minor performance improvements.

https://wiki.secondlife.com/wiki/Release_Notes/Second_Life_RC_LeTigre/13#13.02.04.269945

Scheduled Wednesday 2013-02-06 07:00-11:00 PST

 

Second Life RC Magnum:

Magnum is getting an update to the interest list improvement project.  This project should reduce the bandwidth usage of viewers due to object updates, and should improve simulator performance, especially in sims with many connected avatars.

This week's update addresses a communication issue which led LibOMV bots to have high bandwidth usage.  Additionally, this update includes server support for normal and specular maps, which is detailed in the BlueSteel section.

https://wiki.secondlife.com/wiki/Release_Notes/Second_Life_RC_Magnum/13#13.02.01.269856

Scheduled Wednesday 2013-02-06 07:00-11:00 PST

 

We will be monitoring this thread as the code gets released, so feel free to note any observations you have about the server updates.  If you have a specific bug you'd like to report, please file a Jira.


How did the Wednesday RC rollouts go? It is 4PM SLT and I don't see a roll completed notice in Grid Status. Plus Blue Steel regions seem to be having problems.


Hi Nalates, the RC rolls finished roughly on schedule.  The status blog didn't have the rolls marked as 'Resolved', but this has been corrected.  What types of problems are happening in BlueSteel regions?


I don't have enough info to say it is JUST Blue Steel. Today at 3:45+/- Hippotropolis went down and disconnected me at the Open Source meeting.

While I could log back to SL I could not TP back to Hippotropolis. Hippo Hollow is a BS RC too. I didn't try to tp to the Hollow, but before my disconnect the map was showing avatars there but not after relogging. I assumed it was having the same problems.

I saw others disappearing before I disconnected. Avatar names stopped appearing in my chat. Then chat stopped. My chat text would appear to post. 

Things would catch up a bit when TD and FPS rates recovered. Some chat comments would post, but with no names.

Oddly the SIM FPS and Physics FPS were showing different values. Sometimes Physics FPS was 0 and Sim FPS was 5 or 6.

Time Dilation was spiking down to 0.5 and in a few seconds bouncing back to 1.0. 

I think all but 4 people (of 20+/- in region) were at the meeting. I did not see enough people entering or leaving to cause problems.


Maestro and Andrew,

I wanted to report on the bot's usage. Fixed!

Before this incident the bot's "normal" usage was 5 MB / hr. That is so normal no one would suspect anything.

But now it is 1 MB / hr! It has never been that low before, ever.

The improvement might be from the interest list changes, but since the bot is parked 3300m up with a very limited draw distance, I think it is from this UDP bug fix, and will help with more than just bots. :)

 


Hi Nalates, there have been reports of griefer activity in Hippotropolis recently, which may explain the issues you've seen.  Are other BlueSteel regions affected?


@Triple -- Thanks for the update.  I think the fix to the UDP packet acks would drop your bandwidth from the terrible 1GB/H back down to the normal 5MB/H.  Improvements beyond that must be attributable to other changes.  Some of the lower bandwidth you're seeing is the lack of packets updating the old cloud density layer (now gone), and a slower update for the wind layer, however I estimate the savings there to be only about 0.15 MB/H.  A possible source of savings would be from the improved server-side culling of updates for objects that are not visible.  If there are any objects that are moving around or changing appearance near the bot but are outside its view then update rates for those objects will be much attenuated in Magnum.

@Everybody else -- This behavior where updates for out-of-view objects are not sent until they are within view causes a visible glitch that some people have noticed.  In particular, if you have a pile of objects that all move around or change appearance while your view is turned away and then you turn around to look at them you'll briefly see them in their old state and then they will update in your view after about 1 second or less.  I haven't seen anyone complain about this glitch recently and I'm curious to know if it is still annoying anyone.  If anyone has an opinion about this glitch I'd like to hear it.

I've got a "fix" that reduces the glitch, but doesn't actually eliminate it.  I'm planning on submitting that work as its own RC project and it is probably a couple of weeks down the RC schedule. I'd like to know how much the glitch will annoy people between now and the release of the "fix".


I think the fix to the UDP packet acks would drop your bandwidth from the terrible 1GB/H back down to the normal 5MB/H.  Improvements beyond that must be attributable to other changes. 

I'm no MTU expert, but if a packet size limit below 1500 was also introduced with the fix, I believe that might account for a further drop in bandwidth on certain connections through certain networking hardware. In fact, I can imagine that even if the limit stayed at 1500, the risk of pushing that limit could be specific to which updates are sent, so something like reducing wind updates or dropping cloud layer updates might have a disproportionate positive effect.

Regarding the momentary glitch in updates from out-of-view objects, I don't think I've ever even noticed it, so evidently it's not annoying to me anyway. In general, I can't imagine it being more disruptive than the viewer-side glitch with painting the wrong side of the departing sim when entering a new one, or the soon-to-be-vanquished problem of the departed sim "blinking" invisible 50 seconds after leaving it (fixed with the interestlist work, I understand).



Qie Niangao wrote:

I'm no MTU expert, but if a packet size limit below 1500 was also introduced with the fix, I believe that might account for a further drop in bandwidth on certain connections through certain networking hardware

have seen this 1500 number mentioned at diff times. I understand is some kinda max limit for the protocol. so my question is: do linden assume that 1500 is the packet size for all conns and operate to this max limit?

I ask bc my Belkin N300 router is max 1454. if I try set this to larger then it don't go at all for anything interwebz. just fails completely and refuses to work

so am curious as to what happens if SL sends me a 1500. like is the 46 extra bytes just ignored/lost? or do the viewer invoke a lost packet something and request again? or do the 1500 get automagically chopped into 2 packets: 1454 and 46 somehow?

 


It's the third option. The packet gets split into two packets on a connection where the MTU is too small, and reassembled on the way. Which means that extra sets of packet headers have to be transmitted, and CPU cycles have to be spent on assembly and disassembly.

I am not going to pretend to understand all the messy details, but in the old days, when people used dial-up modems, MTU really mattered. It took a second or so to send a 1500-byte packet over a modem link.

1500 bytes is the maximum for standard Ethernet, but if you're using a broadband connection over a telephone line, that almost certainly uses PPPoE, which has an MTU of 1492 bytes. There are a few other things which can reduce that further, such as VPNs, but I reckon the bandwidth penalty for the 1492-byte MTU is insignificant compared to the cost of splitting packets. (0.54% penalty)

There's a strong argument for Linden Labs to use a 1492-byte MTU, which is only likely to affect object-rezzing. If automatic systems are used and working correctly, Linden Labs should only be sending 1492-byte packets anyway, but I think it is better to set this one explicitly. Packet size is always no bigger than the data needs.
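
To make the splitting concrete, here is a rough sketch of how many IPv4 fragments one UDP datagram needs for a given MTU, and of one way to arrive at the 0.54% figure. It assumes a 20-byte IPv4 header without options and the 8-byte UDP header; the function names are made up for illustration and have nothing to do with Linden Lab's code.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
UDP_HEADER = 8    # bytes

def fragments_needed(udp_payload: int, mtu: int) -> int:
    """Count the IPv4 fragments needed to carry one UDP datagram."""
    datagram = UDP_HEADER + udp_payload
    # Every fragment except the last must carry a multiple of 8 data bytes.
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    return -(-datagram // per_fragment)  # integer ceiling division

def relative_cost(small_mtu: int, large_mtu: int) -> float:
    """Extra packets needed, as a fraction, when using the smaller MTU."""
    return (large_mtu - small_mtu) / small_mtu

print(fragments_needed(1472, 1500))
print(fragments_needed(1472, 1492))
print(fragments_needed(1200, 1454))
print(round(relative_cost(1492, 1500) * 100, 2))
```

Under these assumptions a full 1472-byte UDP payload fits in one packet on plain Ethernet but splits in two over a 1492-byte PPPoE link, while a 1200-byte payload still fits in one packet with a 1454-byte MTU.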

In fourteen hundred and ninety-two

Columbus sailed the ocean blue.

He wasn't the first, there was nothing surprising,

But this is the number to set packet sizing.

 

 

 

 


Andrew, I have filed a Jira on a thing I've been seeing, as follows:

 

If I look around in Kuula, so as to get all the visible textures rezzed, and then stand with my back to one area for five or six minutes, when I turn around, many of the textures show as grey and then flash back to correct. These are NOT textures that change: they are constant. The effect hits signs, and even trees.

What it looks like is that textures left the cache. I'm not clear on how the changes you're describing here work but one way or another, visible stuff that I have seen is being repainted. It's not the end of the world but it does mess up the experience.

 

Thanks!


Janet, what you're describing sounds like a behavior of the viewer.  It is probably performing a lazy scan for textures that are not being rendered in view and slowly unloading those from memory -- in an effort to prepare the memory for any new textures that might show up, and to help keep your render frame rate high.

This is definitely not related to the server delaying updates for objects behind you since these objects are not changing and thus would not be getting updates anyway.


After looking through the SL UDP packet code a bit more I have a little better understanding of how it works than last time I spoke about it.

I learned that when we actually pack data for a UDP packet we limit the payload to 1200 bytes.  We're aiming for a max of 1500 bytes for the entire packet, counting UDP header and some ACK information that we add to the end of the normal 1200 byte payload, so we leave 300 bytes for that stuff.  Once the final payload is computed (including the initial data and postpended ACKs, but not including the UDP protocol header) we check the size and print a WARNING to logs if it exceeds 1500 bytes, but we send it anyway.

This is a little bugged.  We should be checking for "data_length > MTU - UDP_header_size" at this point.  This wouldn't be hard to fix -- a little googling reveals that UDP_header_size = 8 bytes (64 bits), with the IPv4 header adding another 20 bytes.  However, the question is:  when aiming for 1200 byte payloads, what is the maximum size of the true packet?  Most of SL's packet types easily fit into just a few hundred bytes, but the ones that stream object data are packed to the maximum -- if we're ever exceeding an MTU then it would occur when packing those.  I suppose I should survey the maximum packet sizes of a typical connection to a very full and busy region to see how big they are getting and if we are exceeding 1492, or maybe 1456, then maybe we should aim lower than 1200 bytes.
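
The size check described above could look roughly like the sketch below. This is only an illustration: the constant and function names are hypothetical and not taken from the SL server code, and it assumes IPv4 without options (20-byte IP header) plus the standard 8-byte UDP header.

```python
IPV4_HEADER_SIZE = 20   # bytes, assuming no IP options
UDP_HEADER_SIZE = 8     # bytes
TARGET_PAYLOAD = 1200   # payload limit mentioned in the post

def max_udp_data(mtu: int) -> int:
    """Largest UDP data length that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_SIZE - UDP_HEADER_SIZE

def exceeds_mtu(payload_len: int, ack_bytes: int, mtu: int = 1500) -> bool:
    """True if payload plus appended ACKs would not fit in a single packet."""
    return payload_len + ack_bytes > max_udp_data(mtu)

# Room left for appended ACK data after a full 1200-byte payload.
for mtu in (1500, 1492, 1454):
    print(mtu, max_udp_data(mtu) - TARGET_PAYLOAD)
```

With these numbers, a 1200-byte payload followed by a full 300 bytes of ACK data would already overflow a 1500-byte MTU, since only 272 bytes of headroom remain.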

The UDP packet size bug that was affecting Magnum last week would cause payloads to go over 3000 bytes!  Interestingly, most network routing hardware would transparently split these packets and reassemble them before they arrived at the client computer, but not all -- some people were seeing terrible packet loss.  If we are occasionally exceeding the true MTU of some connections then most people wouldn't be affected but it might show up as a steady background of packet loss for a few.



Andrew Linden wrote:

 

@Everybody else -- This behavior where updates for out-of-view objects are not sent until they are within view causes a visible glitch that some people have noticed.  In particular, if you have a pile of objects that all move around or change appearance while your view is turned away and then you turn around to look at them you'll briefly see them in their old state and then they will update in your view after about 1 second or less.  I haven't seen anyone complain about this glitch recently and I'm curious to know if it is still annoying anyone.  If anyone has an opinion about this glitch I'd like to hear it.

I'm seeing an issue whereby double doors, rather than single doors, aren't showing as closed whilst I'm in the room even though they are closed: when I try to walk outside I get blocked from doing so because the doors are closed. If I walk to another parcel and come back, the doors are closed.

I haven't checked this on other viewers but I have seen it on two different machines.


thanks for the info

i will play a bit more and look for the 1500+ warnings in the console/logs and see if i get any. and if i do what effect it has. re any packet loss i might get at the time


Thanks Andrew,

I would have thought that the only thing that would unload things from a cache would be a cache full situation. I would have thought that removing things from a cache would be inexpensive enough not to be something one did pre-emptively. Since I have my cache maxed, I would never expect things to be removed ahead of time.

I understand that this isn't a server issue and will keep digging to find someone better to mention this to. Thanks!


Janet, you mention that your cache is maxed, which sounds like you're talking about the cache on your local filesystem (hard drive), which can be configured to be large or small.  I was talking about storage in memory (RAM).  The viewer will take a lot of RAM, but there are limits to what it can get -- the rest of your operating system and other applications also need RAM to function.  Also, there is special RAM in your graphics card which is also limited.  If you are in a texture-heavy region (many large unique textures) then it may be that the viewer cannot hold all those visible AND those behind you, so it will swap out non-visible textures behind you to make room for the stuff it is trying to show in front.

Also, it probably doesn't do all of this switching at once because it takes time to sort through and move lots of data around and to do so in one solid chunk of work would cause your render frame rate to block for a significant fraction of a second -- a big lag event.  The work is probably parceled out across multiple frames so the scene will update in a timely fashion -- less lag over a longer period.
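
As a loose illustration of that idea (not the viewer's actual code; the function name and the batch size are invented here), a long job can be cut into small batches with one batch handled per rendered frame:

```python
from collections import deque

def process_in_slices(items, per_frame):
    """Yield one batch per frame, each holding at most per_frame items."""
    queue = deque(items)
    while queue:
        # Take only a bounded amount of work, then hand control back
        # so the caller can render a frame before the next batch.
        count = min(per_frame, len(queue))
        yield [queue.popleft() for _ in range(count)]

# Hypothetical usage: five textures to swap out, two per frame.
for frame, batch in enumerate(process_in_slices(range(5), 2)):
    print(frame, batch)
```

Each frame pays only for a small batch, so the total work takes longer to finish but no single frame stalls.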

You might try reproducing the problem in a region with lots of objects but much fewer textures (a big build still in its plywood stage).  You might also try to discover how much memory is on your video card and consider testing the texture-heavy scene on another system with a higher-grade graphics card (if you have access to such).


Thanks for continuing to help, Andrew. I know you have important things to do, so don't feel obligated to stick with this.

 

I'm running a 16GB iMac with 2GB graphics memory. I have texture cache maxed at 9948 or whatever it is. The SL process uses about 1.4GB real memory (900MB private, 77MB shared, 1GB virtual). The machine uses no swap space, and shows just under 5GB free. So it's clear SL isn't using the 9GB I've offered it. I'll try dialing that down and see what happens.

 

Thanks again,

