
Henri Beauchamp

Resident
  • Posts

    1,188
  • Joined

Posts posted by Henri Beauchamp

  1. 23 hours ago, Monty Linden said:

    [Viewer]  Request timeout fires.  Begins to commit to connection tear down.

    In fact, I could verify today that this scenario cannot happen at all in SL. I instrumented my viewer code with better DEBUG messages and a timer for event poll requests.

    Under normal conditions (no network issue, sim server running normally), event polls never time out in the agent region before an event comes in. Even in an empty sim, without any neighbour regions, the ParcelProperties message is always transmitted every 60 seconds (and for an agent region with neighbours within draw distance, you also get EnableSimulator for each neighbour every minute).

    Timeouts only occur for neighbour regions, when nothing happens in the latter, and after 293.8 seconds only.

    So, when a user requests a TP, the agent region will not risk seeing the poll request timing out just at the moment TeleportFinish arrives, causing a race condition in the HTTP connection tear down sequence, like you described.

    However, what would happen if, say, a ParcelProperties message (or any other event in the agent region) arrives milliseconds before the user triggers a TP request ?... The poll request N finishes with ParcelProperties, the TP request fires, and what if TeleportFinish is sent by the server just before the viewer can initiate poll request N+1 (reminder: llcore uses a thread for HTTP requests) ?... Maybe a race condition could happen here (depending on how events are queued server side, and how the Apache delays in connection building and tear down could lag/race things; this might explain why TeleportFinish is sometimes queued but never sent, maybe ?)...

    In any case, I would suggest reconsidering the way TeleportFinish is sent to viewers: what about restoring the old UDP reliable path for it ?... Or implementing a message for viewers to re-request it, when they did not get it ”in time”...

  2. 1 hour ago, Monty Linden said:

    There are several failure scenarios including one where the TeleportFinish message is queued but the simulator refuses to send it for reasons unknown yet.  A more elaborate scenario is this:

    • [Viewer]  Request timeout fires.  Begins to commit to connection tear down.
    • [Sim]  Outside of Viewer's 'light cone', queues TeleportFinish for delivery, commits to writing events to outstanding request.
    • [Sim]  Having prepared content for outstanding request, declares events delivered and clears queue.
    • [Apache]  Adds more delay and reordering between Sim and Viewer because that's what apache does.
    • [Viewer]  Having committed to tear down, abandons request dropping connection and any TeleportFinish event, re-issues request.
    • [Viewer]  Never sees TeleportFinish, stalls, then disconnects from Sim.

    Yes, this might indeed happen... I will have to try and log one such scenario (got nice DEBUG level messages for event polls and, now, server/viewer messaging)...

    The problem here is that we do not have a way for the viewer to acknowledge the server TeleportFinish message... The latter used to be a UDP ”reliable” message (with its own private handler), but got UDPDeprecated then UDPBlacklisted in favour of the event poll queue/processing... It was not the wisest move...

    A possible workaround would be to allow the viewer to (re)request TeleportFinish; in this case, a simple short (5 seconds or so) timeout could be implemented viewer side after a TP has started, and if TeleportFinish has not been received when it expires, then it would re-request it...
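
    A minimal sketch of that viewer-side timeout, in C++ and under stated assumptions: the class and method names are hypothetical (nothing like them is claimed to exist in the viewer sources), the re-request message itself does not exist today, and time is passed in as plain seconds so that the logic stays easy to test:

```cpp
#include <cassert>

// Hypothetical viewer-side watchdog for the TeleportFinish event.
// All names are illustrative: this is not code from any actual viewer.
class TeleportFinishWatchdog
{
public:
    explicit TeleportFinishWatchdog(double timeout_s = 5.0)
    :   mTimeout(timeout_s)
    {
        assert(timeout_s > 0.0);
    }

    // Call when the teleport request is sent to the departure sim.
    void onTeleportStarted(double now)
    {
        mActive = true;
        mDeadline = now + mTimeout;
    }

    // Call when TeleportFinish is finally received.
    void onTeleportFinish()
    {
        mActive = false;
    }

    // Call once per frame; returns true when a (hypothetical) re-request
    // should be sent. The deadline is re-armed on each expiry, so that the
    // re-request itself gets retried if it stays unanswered.
    bool shouldReRequest(double now)
    {
        if (!mActive || now < mDeadline)
        {
            return false;
        }
        mDeadline = now + mTimeout;
        return true;
    }

private:
    double mTimeout;
    double mDeadline = 0.0;
    bool   mActive = false;
};
```

    The caller would poll shouldReRequest() from its main loop and send the re-request whenever it returns true; the 5 seconds default only mirrors the ”5 seconds or so” figure given above.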

    EDIT: I'm also seeing a possible viewer-side workaround for such cases via the implementation of a ”teleport window” timer. That timer would be reset each time the viewer starts an event poll request for the agent region: when the user asks for a TP, the timer would be checked and if less than, say, 2 seconds are left before the timeout would fire (since it is set viewer side, at least in my code, for SL, it is known), the TP request would be delayed till the next poll is started...
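
    The ”teleport window” check could look roughly like this. It is only a sketch: the class name, its methods and the 2 seconds default margin are assumptions made for illustration, and the poll timeout is a parameter because its actual value depends on the viewer's HTTP policy:

```cpp
#include <cassert>

// Hypothetical "teleport window" timer; names and defaults are illustrative.
class TeleportWindow
{
public:
    // poll_timeout_s: viewer-side event poll timeout (known, since it is set
    //                 by the viewer itself).
    // margin_s:       minimum time that must be left before that timeout
    //                 fires for a TP request to be sent right away.
    explicit TeleportWindow(double poll_timeout_s, double margin_s = 2.0)
    :   mPollTimeout(poll_timeout_s),
        mMargin(margin_s)
    {
        assert(poll_timeout_s > margin_s && margin_s >= 0.0);
    }

    // Call each time an event poll request is started for the agent region.
    void onPollStarted(double now)
    {
        mPollStart = now;
    }

    // Seconds left before the current poll request would time out.
    double timeLeft(double now) const
    {
        return mPollStart + mPollTimeout - now;
    }

    // False when the TP request should be delayed until the next poll starts.
    bool canTeleportNow(double now) const
    {
        return timeLeft(now) >= mMargin;
    }

private:
    double mPollTimeout;
    double mMargin;
    double mPollStart = 0.0;
};
```

    When canTeleportNow() returns false, the TP request would simply be queued and sent from the next onPollStarted() call.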

    • Like 1
  3. 1 hour ago, Monty Linden said:

    Well....  you don't really know that unless what you've received matches what was intended to be sent.  And it turns out, they don't necessarily match and retries are part of the problem.  I already know event-get is one mode of TP/RC failure. 

    All the TP failure modes I get happen before event polls are even started: the UDP message from the arrival sim just never gets to the viewer, so the latter is ”left in the dark” and ends up timing out on the departure sim as well...

  4. 1 hour ago, Monty Linden said:

    This keeps getting worse the more I look.  So *both* viewer and simulator implement a 30-second timeout on these requests.  A guaranteed race condition.  Simulator's spans a different part of the request lifecycle than does the viewer's curl timeout.  More variability.

    Do have a look at the comments I added in linden/indra/newview/lleventpoll.cpp, in the Cool VL Viewer sources, for the various modifications I implemented to deal with both SL and OpenSim idiosyncrasies... In particular:

    	LLAppCoreHttp& app_core_http = gAppViewerp->getAppCoreHttp();
    	// NOTE: be sure to use this policy, or to set the timeout to what it used
    	// to be before changing it; using too large a viewer-side timeout would
    	// cause to receive bogus timeout responses from the server (especially in
    	// SL, where 502 replies may come in disguise of 499 or 500 HTTP errors)...
    	// HB
    	mHttpPolicy = app_core_http.getPolicy(LLAppCoreHttp::AP_LONG_POLL);
    	if (!gIsInSecondLife)
    	{
    		// In OpenSim, wait for the server to timeout on us (will report a 502
    		// error), while in SL, we now timeout viewer-side (in libcurl) before
    		// the server would send us a bogus HTTP error (502 error report HTML
    		// page disguised with a 499 or 500 error code in the header) on its
    		// own timeout... HB
    		mHttpOptions->setTransferTimeout(90);
    		mHttpOptions->setRetries(0);
    	}

    Yes, it is indeed as bad as it looks... This said, my modified code performs just fine in both SL and OpenSim now, and the failed TP issues still seen are not related to event polls anyway (event polls are simply retried on timeouts).

  5. 15 minutes ago, kyte Lanley said:

    Yes, but the problem is that I cannot edit my first post, because I do not have the edit option:

    https://imgur.com/a/0V2r3f8

    I only have the edit function on my latest message.

    It does indeed seem that after a certain time ((c)1955 Fernand Raynaud 😛) the edit option disappears... 😢

    Another suggestion: add a message to this thread with a short summary of it in English, along with the ”updated” question.

  6. 2 hours ago, kyte Lanley said:

    Yet the table I posted in my first message clearly shows that the AMD cards of the 7000 generation have very good OpenGL performance.

    A table taken from a benchmark that has nothing to do with SL and that says nothing about the test conditions. For example, which mode were the drivers running in ?  Compatibility profile (i.e. a profile supporting the deprecated/obsolete OpenGL commands) or strict profile (core profile) ?...

    With NVIDIA drivers, we observe, in the viewer, +50 to +100% performance (depending on the rendered scene) in core profile mode, unlike with AMD drivers, for which performance is almost identical in both modes. So, if the test was run in compatibility profile, the NVIDIA results look worse than they could be in comparison with AMD...

    Another point is the use of shared GL profiles, from which, here again, NVIDIA benefits more than AMD; this is an issue with the synchronization of the OpenGL command queue, which must be done in the main thread with AMD drivers, while it can take place in secondary threads with NVIDIA, avoiding ”hiccups” in the frame rate.

    Moreover, the test refers to OpenGL v4.5, while the version that really matters is the latest one, i.e. v4.6; one may therefore wonder about the extent of the tested features, in particular in shaders...

    Besides, performance is only one aspect of things. Robustness (no crashes with NVIDIA drivers, where AMD literally falls flat on its face), and compliance with the OpenGL standard (*) are two others, which NVIDIA wins hands down.

    (*) The viewers' code contains workarounds for bugs in AMD drivers (and Intel ones, for that matter), which NVIDIA drivers do not need thanks to their strict compliance with the OpenGL specification.

     

    2 hours ago, kyte Lanley said:

    This is why I would like to hear from someone who owns a 7900 (XT or XTX), so that they can tell me how their card behaves in Second Life.

    This is why the link I chose in my previous message points to the account of a user who tried AMD, was disappointed, and finally returned the card to get an NVIDIA one which, for its part, gave them satisfaction...

     

    Note that I have nothing against AMD (my latest PC even uses a Ryzen 7900X, which is a great CPU that I am very satisfied with and can only warmly recommend). I simply rely on my past experience (admittedly an old one) and on the feedback from users of the viewers (mine, and the others), which match perfectly.

     

    2 hours ago, kyte Lanley said:

    So, a call to the owners of these cards: I am waiting for your accounts.

    You would have a better chance of getting an answer by asking your question in English...

  7. 5 minutes ago, kyte Lanley said:

    Thank you for your answer; I will wait for other comments to see whether the other opinions also agree with yours.

    A search on the forum will let you gather other accounts that were already posted. For example this one.

    Note that the ”bug” aspect of the drivers is very important; as the author of the Cool VL Viewer, I got reports from several AMD card users experiencing crashes whose traces (crash dumps) pointed right into the code of AMD's drivers...

    • Like 1
  8. AMD's OpenGL drivers are still far inferior to NVIDIA's.

    Under Linux, with Mesa, the differences are less visible, but under Windows, there is no contest: NVIDIA beats AMD (at equal card cost and GPU generation) hands down !  Count on +30% performance with NVIDIA compared to AMD.

    On top of that, AMD's ”improved” drivers (Adrenalin), which made up for part of the gap in terms of performance, are riddled with bugs causing crashes, which is something you do not see with NVIDIA drivers...

    For OpenGL and Vulkan, there is no need to hesitate: NVIDIA is the logical and obvious choice.

    • Like 1
  9. You might be interested in the Cool VL Viewer v1.30.2.27 (or experimental branch v1.31.0.5) I released today: I revamped the messaging logging so that the DEBUG level ”Messaging” tag logs all the messages (with the exception of the super-spammy and pretty irrelevant ”PacketAck” one) exchanged between the viewer and the server.

    You may toggle the ”Messaging” debug tag when needed (before or after login/TP/region change/draw distance change), from ”Advanced” -> ”Debug tags” in the login screen menu or ”Advanced” -> ”Consoles” -> ”Debug tags” in the main (post-login) menu.

    Also, I implemented optional threaded object cache reads (whose toggle is ”Advanced”  -> ”Cache” -> ”Threaded object cache reads”), meaning the viewer won't block to read object cache file(s) after a ”RegionHandshake” message from the server and will keep processing other messages, replying with ”RegionHandshakeReply” only after the object cache file(s) (the 's' is for the PBR viewer branch) has(have) been read: this sheds light on an interesting ”aspect” (bug ?) of the server messaging algorithm. In particular, you will see that the server sends two (!) ”RegionHandshake” messages without even waiting for a first ”RegionHandshakeReply” or some timeout, except on login (where only one such message is sent by the login region server, unsurprisingly and as should ”normally” be the case)...

  10. 1 hour ago, TheFlyingSaltire said:

    I started using Firestorm years ago but, after an update (two years ago, maybe three) it became very laggy, even on low settings.  I experimented with a few alternative viewers and found that Singularity worked best.  I know that it hasn't had any recent updates, but it works and, to me, that's all that matters.

    Try the Cool VL Viewer...

    Or, if the legacy Windlight renderer works faster for you, and you are under Windows, try Genesis (a fork of Singularity).

    • Like 2
  11. The ”RegionHandshake” message is likely what you missed: it should be received by your viewer after connecting to a region, and your viewer should reply to it via ”RegionHandshakeReply”.

    I do not see those messages in the diagrams above...

    In the C++ viewers, the handler for ”RegionHandshake” is set in llstartup.cpp (like other handlers). That handler is implemented in LLWorld::processRegionHandshake() and calls LLViewerRegion::sendHandshakeReply() for the corresponding region, the latter itself sending the ”RegionHandshakeReply” message back to the simulator server, after it got the objects cache loaded for the region.
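
    To illustrate the flow described above, here is a toy, self-contained C++ sketch. None of the types below are the real viewer classes: ToyRegion, ToyMessageSystem and registerHandlers() are hypothetical stand-ins that only mimic the order of operations (register a handler by message name, load the object cache, then reply):

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a region: NOT the real LLViewerRegion class.
struct ToyRegion
{
    bool cache_loaded = false;
    std::vector<std::string> sent;  // Messages "sent" back to the simulator

    void loadObjectCache()
    {
        cache_loaded = true;
    }

    void sendHandshakeReply()
    {
        // The reply only goes out once the object cache has been loaded.
        assert(cache_loaded);
        sent.push_back("RegionHandshakeReply");
    }
};

// Toy stand-in for the message system: maps message names to handlers.
class ToyMessageSystem
{
public:
    using Handler = std::function<void(ToyRegion&)>;

    void setHandler(const std::string& name, Handler handler)
    {
        mHandlers[name] = std::move(handler);
    }

    // Returns false when no handler is registered for that message.
    bool dispatch(const std::string& name, ToyRegion& region) const
    {
        auto it = mHandlers.find(name);
        if (it == mHandlers.end())
        {
            return false;
        }
        it->second(region);
        return true;
    }

private:
    std::map<std::string, Handler> mHandlers;
};

// Plays the role of the handler registration done at viewer startup.
inline void registerHandlers(ToyMessageSystem& msg)
{
    msg.setHandler("RegionHandshake", [](ToyRegion& region)
    {
        region.loadObjectCache();
        region.sendHandshakeReply();
    });
}
```

    A client that never registers (or never answers) ”RegionHandshake” would correspond to dispatch() returning false here, which is the situation suspected above.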

  12. 3 hours ago, embe Binder said:

    Yes, that's the problem. Creators who design their content, use their alternative avatars to adjust the poses for their stuff, also mostly use the Firestorm, with the wrong size (btw. correctly measured according to the metric system!), with the effect that the regular Linden avatars (according to the Linden system measured) usually stand with their feet in the ground. The avatars are also getting smaller and smaller, so that with my viewer (Catznip), which measures according to the regular Linden system, with my 1.75m next to male avatars, who are then usually ”only” around 1.55m tall, I look like a giraffe, what annoys me. Now you could say i change my shape to 1.55 m  to adapt to the general trend, but then I sit in several armchairs like a doll with legs hanging in the air. This chaos is untenable, especially since I keep seeing avatars that by lawrights are more likely to be child sizes but are hanging around on Adultsims.

    I am sorry, but this is total nonsense (to stay polite) !

    Let's state facts:

    • First of all, the original Linden viewer never provided any measurement for the avatar height (you only got unit-less sliders, many of which do influence the full body height, and not just the ”Avatar height” one).
    • The LSL function for reporting the avatar size (llGetAgentSize()) is sadly crippled, because it reports the height between the feet joint (at the level of the sole) and the head joint (roughly at the level of the nose) and not the ”true size”, as measured when comparing your avatar height to in-world primitives height, from feet soles to top of (bald) head. This, of course, is plain silly (you do not measure your height this way in RL, and the result returned by that function therefore gives a false measure), but results from the fact that the server has no concept whatsoever of the head (or body) shape (this is entirely rendered viewer-side, from mesh files distributed with the viewer; see the character/*.llm files in the viewer installation directory), and only accounts for (only a few) joints. For a legacy (non-mesh, non-prim) avatar, the difference in height is approximately 17cm +/- 2cm (there are possible variations due to the head shape sliders values).
    • Many ”avatar size” scripted items have been distributed in SL: I made one myself, back in early 2007 (well, actually in late 2006, shortly after I joined SL, for personal use: it was one of the very first LSL scripts I wrote). Some of them do account for the 17cm ”offset” to add to llGetAgentSize(), and are the only valid ones. You can get a free copy of the ”True Height Detector & Comparator” I made from the Cool Shop in Hunburgh (adult sim).
    • As a result of the above mess, TPVs (and mine among the very first ones, if not the first), provided a way to report the actual body height in their ”Appearance floater”. In my implementation, I give the ”actual” (corrected) body height (not counting shoes), the shoes height, and the feet to pelvis height (see the attached screen shot).
    • The body height, be it measured by llGetAgentSize() or reported in any viewer, does not matter in the least for animations design: what counts is the feet to pelvis height, and only for non-standing/walking/running animations (i.e. only for kneeling, crawling, sitting and laying animations). The reason is that, for animations, the server physics engine automatically sets your avatar soles (feet soles or shoes soles, when the latter are worn) to the level of the ground/floor under your avatar, suitably for standing/walking/running animations; it does so by adding the feet to pelvis height to your avatar's position to calculate the position of the pelvis (which is the reference position sent to the viewer); however, the server has no concept whatsoever either of what an animation is (here again, the animations are only played viewer-side), and the correction it gives to your avatar's pelvis position must therefore be ”reverted” by the animation designer, or by another means, such as via the ”Avatar Z offset”, when that animation is not a standing/walking/running animation.
    • When they design an animation, most designers just use the default female or male avatar mesh (corresponding to the default female or male shape, with all sliders set to 50), as provided by LL (link in this Wiki article). The most advanced designers may provide several animations, designed for several avatar pelvis to feet heights. They therefore could not care less about what this or that viewer is displaying in the Appearance floater for the avatar height.
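
    The arithmetic behind the facts above can be sketched as follows, in C++ for illustration only. The function and constant names are hypothetical, the 0.17m offset is the approximate figure quoted above for a legacy avatar (so expect +/- 2cm), and pelvisZ() is a simplification of what the server does:

```cpp
#include <cassert>
#include <cmath>

// Approximate distance between the head joint and the top of the (bald) head
// for a legacy avatar, in meters (about 17cm, +/- 2cm depending on sliders).
constexpr double kHeadJointOffset = 0.17;

// agent_size_z: Z component of the vector returned by llGetAgentSize(),
// in meters (feet joint to head joint).
inline double trueHeight(double agent_size_z)
{
    assert(agent_size_z > 0.0);
    return agent_size_z + kHeadJointOffset;
}

// Simplified view of the server-side placement: the pelvis (the reference
// position sent to the viewer) is put at feet-to-pelvis height above the
// ground, so that the soles rest on the ground for standing animations.
inline double pelvisZ(double ground_z, double feet_to_pelvis)
{
    return ground_z + feet_to_pelvis;
}
```

    For example, an llGetAgentSize() height of 1.70m corresponds to a ”true” height of about 1.87m, and an avatar with a 1.0m feet to pelvis height standing on a floor at Z = 20m gets its pelvis placed at Z = 21m.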

    If you want more gory details about the animations offsets, you may also read what I wrote years ago, when I designed and implemented the @adjustheight Restrained Love command, to allow auto-adjusting via scripts the animations they play, to match any shape your avatar is currently wearing.

    Another hope I have is that the work done on the ”Puppetry” feature (currently on hold) will see one of its components (the Inverse Kinematics for animations) further developed and, one of these days, applied to the SL animations (well, we would actually need a new animation format/version, to keep backward compatibility).

     

    Here is what the avatar height reports look like in the Cool VL Viewer (do read the tool tips !), and what the ”True Height Detector & Comparator” looks like:

    AvatarHeight.thumb.png.9486865e5774ff56aa677460ed75c88b.png

     

    As for the avatars' size in SL, they are sadly influenced by the excessive and disproportionate default values (the ”50” of body shape sliders, which causes the default ”Ruth” female shape to be 1.87m tall !), resulting in giants invading SL, and causing frustration in people trying to keep things ”realistic” enough for their own avatar...

    At least, with the TPVs displaying the ”actual” height in their floaters, this should incite people to make their avatar look more realistic by properly adjusting the sliders. This is a Good Thing (TM), and certainly nothing to b1tch about !

    • Like 5
    • Thanks 3
  13. 2 hours ago, Jenna Huntsman said:

    It's worth noting that LL have itchy trigger fingers to remove features that aren't being used as intended

    To compensate, I have ”sticky fingers”, to keep or reimplement useful features that have been unduly removed... 😜

    • Thanks 1
    • Haha 3
  14. 11 hours ago, Monty Linden said:

    I'll share a recent finding to illustrate what this looks like.  If you've looked into the code you may be familiar with the TeleportFinish message.  This is sent to the viewer which triggers it to move into the destination region.  The discovery is that this message is often queued but never delivered to the viewer leading the viewer to disconnect

    That's a very interesting finding !...

    It should be rather easy to fix this issue on your side: checking on a non-empty queue should hopefully not prove too difficult... 🤣

    12 hours ago, Monty Linden said:

    (yes, it's the viewer that disconnects)

    Well, sadly, the viewer cannot guess the arrival sim IP and port which are sent via the TeleportFinish message... We could increase the timeout before disconnection up to the limit that would make the departure sim disconnect in its turn (3 minutes, I guess ?), but it would not improve the user experience in the least and would only frustrate users more for having waited 2 more minutes for nothing.

  15. 12 minutes ago, Zalificent Corvinus said:

    That doesn't really explain why I got 390ms from England to MELBOURNE Australia, and 200ms to London, if the test site is in the USA, maybe that would explain a 200 ms ping to London from northern England, but 390 to Australia, and 750 to Virginia, Eastern USA?

    You cannot predict what one of the many routers on your route from point A to point B in the World, will decide to direct your packets through: maybe one packet will take one of the cables from Melbourne to West US coast, and the next will take the route via South Africa/UK/East US coast...

    'traceroute' (UNIXes) or 'tracert' (Windoze), might tell you more about what actual route is taken between your PC and the server, but nothing about what route is taken for those latency testing sites...

  16. Using the Cool VL Viewer, you may create sub-folders in such ”v2-specific” folders...

    Still using a v1 folders layout (with the exceptions of the ”Current Outfit” and ”Marketplace” root folders), my viewer just considers those v2-specific folders as ”normal” folders, and you may manipulate them as you see fit (e.g. create sub-folders in them or outright delete them, which is what I always do after I log in with my viewer, returning from a v2+ viewer (testing) session, grumbling and pestering as those $h1tty v2+ viewers create useless, cumbersome stuff in my inventory I never asked or even permitted them to create !).

    For me, such folders are just noise in my inventory; since week one after I joined SL, I created an ”Avatars” root sub-folder, to store all my avatar's forms (and for each form, various outfits in sub-folders for that form); I therefore never needed a ”My Outfits” folder...

    • Like 1
    • Thanks 1
  17. 1 hour ago, AmeliaJ08 said:

    AWS pings very high for me too, seems to be global. Even servers very close to me are showing far higher (3-4x more than Azure) than would be expected.

    Azure seems to be fine, expected values given distance in all cases so doubt it's a subsea cable issue.

    AWS latency also reports too high a ”ping” (300-450ms, US West Oregon, i.e. not even Seattle's SL AWS servers location) for me, here (France) when compared to what I do get from the viewer Statistics (180-200ms), the latter being totally in line (genuine ping + server frame time / 2 + viewer frame time / 2, averaged) with what I get from Azure latency (West US 2: Washington, collocated with SL's AWS servers in Seattle: 160-180ms).

    This could totally be due to a broken/congested undersea cable from UK to US (should AWS' latency web site test go through UK, unlike Azure's), when compared to an intact/non-congested France to US cable: the variations you would see would just be due to Internet routers deciding, from time to time, that no, indeed, the shortest but broken route is just too slow, and  instead route your traffic via the longest but intact/non-congested route...

  18. 24 minutes ago, Merive Vermilion said:

    Hi, I just tried your ping time checker and all I'm getting is a 404 page.

    The page only loads if the URL is given without the ”latency” ending (which nonetheless gets added afterwards: the joys of Javascript): https://www.awsspeedtest.com/

    This said, the site does not provide a test for the Seattle location where AWS servers used by SL are located: I get 340ms from the web page, 180ms from the viewer Statistics...

    EDIT: the Azure latency seems in line (165ms) with the Statistics I get from the viewer (180ms, which includes the sim and viewer latencies), using ”US West 2” (Washington), which also corresponds to the sim server location.

  19. 9 hours ago, Aishagain said:

    you are not telling me anything I did not know.  That is not what I was seeing.  There were some sims that I regualrly go to that were clearly NOT running cleanly: rezzing was irregular and unreliable.  The issue of load on the AWS server may well have some relevance here though. One region that was running well last night is now being pole-axed by a high ”ping”, with no obvious explanation.

    What I am telling you is that there could be three explanations for the lengthened ”ping” times. Ruling out ”your side” (viewer not running any better or worse than usual), there still could be two possible causes: a lengthened ”genuine ping” time (the result of a slow routing or congestion on Internet between your location and AWS servers location), or an excessive load on the AWS server running your sim(s).

    Sadly, and since the AWS servers are not responding to the OS 'ping' command, it is harder to find out which explanation is the right one, though a look at the Statistics floater (CTRL SHIFT 1) could tell you whether the sim server is overloaded or not (if not, then the issue is likely a network routing one).

  20. The ”ping” time as measured by the viewer is a combination of several accumulated delays:

    1. The ”genuine ping” time (as you would measure via the 'ping' OS command), between your computer and the AWS server running the simulator.
    2. The time it takes for the simulator to acknowledge and reply to the (pseudo) ”ping” message sent by your viewer; this includes up to one simulator frame ”render” time (should your message be received just after the simulator has processed its message queue for the current frame).
    3. The time it takes for your viewer to process the ”pong” reply from the server, which here again may include up to one frame render time.

    So:

    1. The longer the route or the higher the congestion on the intercontinental cables, the higher the ”genuine ping” time (you can measure it via the 'ping' command: look at the AWS server IP for the sim in the About floater, and ping that IP). EDIT: nope, you cannot... AWS servers are apparently configured to drop pings. 😢
    2. The more loaded the AWS server running the sim, the slower its reply, the longer the SL ”ping” time.
    3. The slower your viewer, the slower it processes server messages, the longer the SL ”ping” time.
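
    As a worked illustration of the three delays listed above, here is a small C++ sketch. The struct and function names are made up for the example, and the ”average” figure assumes that messages arrive at a random point within each frame (hence half a frame lost on each side, on average), while the ”worst” figure is the upper bound of one full frame on each side:

```cpp
#include <cassert>
#include <cmath>

struct PingEstimate
{
    double average_ms;  // Half a frame lost on each side, on average
    double worst_ms;    // One full frame lost on each side
};

// genuine_ping_ms: network round trip time, as the OS 'ping' would report it.
// sim_frame_ms:    simulator frame time.
// viewer_frame_ms: viewer frame render time.
inline PingEstimate estimateViewerPing(double genuine_ping_ms,
                                       double sim_frame_ms,
                                       double viewer_frame_ms)
{
    assert(genuine_ping_ms >= 0.0 && sim_frame_ms >= 0.0 &&
           viewer_frame_ms >= 0.0);
    return { genuine_ping_ms + 0.5 * (sim_frame_ms + viewer_frame_ms),
             genuine_ping_ms + sim_frame_ms + viewer_frame_ms };
}
```

    With purely illustrative inputs of 160ms for the genuine ping, 22ms for the sim frame and 18ms for the viewer frame, this gives an average of 180ms and a worst case of 200ms.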

     

    • Like 1
  21. To avoid auto-adjustment to HDR tone mapping of legacy sky settings (the ones without a ”reflection probe ambiance” parameter), set the ”RenderSkyAutoAdjustLegacy” debug setting to FALSE.

    Also, LL changed the default Midday sky setting to include a ”reflection probe ambiance”, so if you want to see Midday without HDR tone mapping, use instead the ”Midday (legacy)” environment preset in the World menu.

    • Like 1
    • Thanks 5
  22. Note that you might encounter a slight issue, due to the number of objects in your inventory: with 330K items to scan, the Lua watchdog could kick in and prevent your Lua program from completing. There are ways around this issue (using Lua threads with sleep(0) called at each recursion, or implementing the program in the automation script itself and using CallbackAfter()), but they are not very user-friendly...

    For tomorrow's release, I implemented a relaxed (and configurable) watchdog timeout for one-shot Lua scripts loaded/executed from a file, as well as a new function to get the delay left for your program to complete before the watchdog would interrupt it.

    • Like 1