Henri Beauchamp

Resident
  • Posts

    1,218
Everything posted by Henri Beauchamp

  1. Thank you for a really useful paper ! It indeed explains a lot of things I could observe here with my improved LLEventPoll logging and my new debug settings for playing with poll request timeouts... As well as some so far ”unexplainable” TP failure modes that resist my new TP queuing code (queuing till the next poll has started, when too close to a timeout). Tomorrow's releases of the Cool VL Viewer (both the stable and experimental branches) will have all the code changes I made and will allow you to experiment with it. I will post details about the debug settings and log tags here after release. Looking forward to the server changes. I'll have a look at what duplicated messages could entail viewer side (especially CrossedRegion and TeleportFinish, which could possibly be problematic if received twice) and whether it would mandate viewer code changes or not.
  2. Unigine Superposition is far from optimized for OpenGL... You'd get better results under Windows and DirectX than under Linux and OpenGL, even though its OpenGL performance under Windows is indeed abysmal. So yes, better not to trust its results too much for OpenGL. The results of Valley are however perfectly in line with what I get with the viewer: around +10% fps in favour of Linux. In fact, you'd get better results with Windows 7/8 (less overhead than Win10 or Win11)... The problem being that you won't have valid drivers for it with such a modern GPU...
  3. When he is giving the finger, yes, he definitely looks like the stupidest man in the world... Linus Torvalds is no god, and while quite intelligent, he can also prove totally stupid, at times, like everyone (us included): giving the finger to people, for whatever reason, is one of the stupidest and most pointless things to do (and will likely achieve the exact opposite of the effect the person giving the finger would expect/hope) ! Oh, and what would it be, please ?... I have been using NVIDIA cards and their proprietary drivers for over 19 years (my first NVIDIA card was a 6600GT), and never missed a single feature ! Settle down yourself, pretty please... I am not the person who is spreading FUD... I already replied to this question, but of course, if you only read the first sentence of my previous post, you missed it... Read again: it was in the second sentence... 🫣
  4. This is only the case in the #RLV folder, and you may disable this behaviour... This is strictly how RLV is supposed to work for no-mod attachments and how Marine Kelley specified it (see the text after ”HOW TO SHARE NO-MODIFY ITEMS”); the Cool VL Viewer uses my own fork of Marine's implementation, which abides strictly by her specifications. Attachments get renamed when they are in #RLV, to add the joint name to their name (this avoids detaching attachments on locked joints by accident when you change outfits, and prevents the detach/auto-reattach sequence that would ensue and could break some scripts or trigger anti-detach alarms in some objects); for no-mod attachments (which cannot be renamed), RLV instead moves them into a newly created sub-folder bearing the joint name. However, since some people are used to the RLVa viewers' way of doing things (RLVa is a rewrite of RLV and differs in many subtle and less subtle ways from RLV), I implemented a setting to disable the auto-renaming of attachments in #RLV (which also stops the viewer from creating sub-folders for no-mod attachments): the toggle is ”Advanced” -> ”RestrainedLove” -> ”Add joint name to attachments in #RLV/”. A simple question on the Cool VL Viewer support forum would have given you the answer...
  5. Maligned rightly ?... Only by stupid people, I'm afraid... The proprietary drivers for NVIDIA under Linux work beautifully (and around 10% faster than under Windows), with first class and super-long-term support: all the bugs I reported to NVIDIA in the 19+ years I have been using their drivers have been addressed, most of them quite promptly (first class indeed, especially when compared with the AMD and ATI cards I owned in the distant past, for which Linux support was abysmal), and I can still run today my old GTX 460 (a 13-year-old card !) with the latest Linux LTS kernels and the latest Xorg version. They are also super-stable, and adhere strictly to the OpenGL specs. The Vulkan drivers and the CUDA stack are great too (with CUDA much faster and actually often better supported under Linux than OpenCL: e.g. with Blender, which only recently started implementing support for OpenCL when CUDA has been supported for years). It should also be noted that NVIDIA open-sourced their drivers for their recent GPUs and that, while AMD and Intel (used to) contribute more Open Source to Linux, they still rely on the Mesa folks for their Linux driver (meaning lower performance than a closed-source driver, because the Mesa maintainers do not have access to all the secret architecture details of the GPUs), and that you still need closed-source software ”blobs” to run their GPUs under Linux...
  6. What things exactly ? O.O My viewer is in fact MUCH safer than any other viewer, since it never touches your inventory structure unless you manually trigger an action it offers (such as consolidating base folders, or recreating missing calling cards), unlike what even LL's viewer is doing behind your back (consolidation and calling cards recreation are systematic at each login with LL's v2+ viewers and all the TPVs forked from them). It also has safeguards against the deletion of essential folders or their move to another folder (such as the COF: deleting or moving it could get you into BIG trouble), while allowing you to delete (if and only if you so wish) some unnecessary folders that got introduced with v2+ viewers and are just clutter for v1 viewer old timers like me. As for its consolidation algorithm (only triggered on demand), it is more elaborate than LL's and also able to, sometimes, repair ”broken” inventories (inventories with duplicate base folders, for example). Just because the inventory is presented differently (like a v1 viewer does) does not mean it got ”terrible things” done to it !
  7. In fact, you can use MFA in SL without a smartphone, but it is rather complicated and I wish LL would provide MFA via email... Here is the procedure I described in the opensource-dev mailing list (at the end of that archived email).
  8. This is likely due to an inventory server issue: the ”Marketplace Listings” folder is created in a merchant's inventory as soon as they connect to SL for the first time as a merchant. In LL's original code, any failure to create that folder (which may happen, in case of inventory server issues) causes an LL_ERRS(), which ”voluntarily” crashes the viewer... Rather user-unfriendly and not very helpful either. Try and connect with the Cool VL Viewer instead: it won't crash, and should it also fail to create the Marketplace Listings folder, you can try and disable AISv3 (an HTTP-based inventory protocol, which sometimes goes mad and stops working for a while) from the ”Advanced” -> ”Network” menu (un-check the ”Use AISv3 protocol for inventory” entry), then relog and check the ”Inventory” floater (there is no separate Marketplace floater: it's all v1-like UI with all inventory folders showing in the Inventory floater, even though you may choose to hide the Marketplace Listings folder via the corresponding entry in the ”Folder” menu of the Inventory floater). It will also log useful diagnostic messages (CTRL SHIFT 4 to toggle the log console) when something goes wrong, instead of crashing... Once the Marketplace Listings folder has been successfully created, you can relog with any other viewer and won't crash with it any more (at least not at this place in that crude code 😜 )
  9. Which would be an argument in favour of @animats' suggestion to send an empty array of events instead of letting the request time out... Of course, it means more work for the sim server (monitoring the request timing for each connected viewer and sending an empty event when it gets close to an HTTP timeout, to avoid the latter), but it should not prove too overwhelming either...
  10. As I wrote above, it has always been extremely finicky: some driver and monitor (*) combinations more or less work, others cause crashes (usually the stack trace points deep into the OpenGL driver), or a failure to set the proper resolution or image ratio: your screen shot looks to me as if the ratio is not properly set (look at the oblong shape given to the camera control ball, for example)... (*) Yup, it also depends on the monitor, which transmits (or not, or late) its characteristics to the driver via the EDID protocol. Here is what I get with the Cool VL Viewer in full screen mode under Linux: notice the proper aspect ratio via the circular UI elements (in the camera controls), the HUD radar at the bottom right, and the bicycle wheels.
  11. That won't be a timeout, but a periodic empty message sent by the server before the request would actually time out at the HTTP stack level. It means that, if it did not send any message to a viewer with an active request in the past 28 seconds or so (to avoid the 30s HTTP timeout, counting the ”ping” time, the frame time, and possible server side lag at next frame), it must send a reply with an empty events array. But yes, it would work with the current code used by viewers, and would definitely prevent some race conditions on TP (the race happening when the TP request is sent just as the server times out and libcurl silently retries the request, with TeleportFinish sent by the server too soon, before libcurl could reconnect).
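The keep-alive rule described in this post can be sketched as follows. This is only an illustration in Python, not simulator code: the function name, the reply layout and the choice to bump the ”id” on an empty reply are assumptions; only the 28s and 30s figures come from the post.

```python
HTTP_TIMEOUT_S = 30.0      # HTTP-level timeout quoted in the post
KEEPALIVE_AFTER_S = 28.0   # "28 seconds or so" safety margin quoted in the post


def build_poll_reply(pending_events, request_age_s, last_id):
    """Return a reply for an open poll request, or None to keep it waiting.

    pending_events: events queued for this viewer (hypothetical structure)
    request_age_s:  seconds since the viewer posted the current poll request
    last_id:        id used for the previous reply (assumed to be an integer)
    """
    if pending_events:
        # Normal case: something to deliver, answer immediately.
        return {"id": last_id + 1, "events": list(pending_events)}
    if request_age_s >= KEEPALIVE_AFTER_S:
        # Nothing to deliver, but answer anyway with an empty events array
        # so that the request never reaches the HTTP timeout.
        return {"id": last_id + 1, "events": []}
    return None  # Keep the long poll open.
```

For example, `build_poll_reply([], 10.0, 5)` keeps the request open, while `build_poll_reply([], 28.5, 5)` answers with an empty events array.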
  12. Well, the viewer indeed keeps the event poll alive after the agent has left the region, which is needed to keep region communications alive in the case when the region border was simply crossed, or in the case of a ”medium range” TP in a neighbour region and still within draw distance. Of course, in the ”far TP” case, the viewer will keep polling until it finds out the region is to be disconnected, so it might restart a poll after a far TP, acknowledging (again) the last received message Id (same Id as previous poll)... Double acks will also happen whenever a poll request ”fails” (or simply times out) for a live region, and the viewer restarts a second poll: here again, the ”id” of the last received message is repeated in the ”ack” field of the poll request.
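A minimal sketch of why double acks happen, in Python for brevity. The class and method names are hypothetical and this is not the LLEventPoll code; it only models the ”ack” and ”id” fields mentioned in the post.

```python
class PollClient:
    """Toy model of a viewer-side event poll (hypothetical names)."""

    def __init__(self):
        self.last_id = None  # "id" of the last reply received, if any

    def build_request(self):
        # Each new poll request acknowledges the last received reply id.
        return {"ack": self.last_id}

    def on_reply(self, reply):
        # Only a successful reply updates the id to acknowledge.
        self.last_id = reply["id"]

    def on_failure(self):
        # A failed or timed out request leaves last_id untouched, so the
        # next request repeats the same "ack" value.
        pass
```

With this model, a reply carrying id 7 followed by a timed out request yields two consecutive requests with `{"ack": 7}`.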
  13. IIRC, Ubuntu got Wayland enabled by default... Firestorm (like almost all other Linux viewers) is using X11, and the Xwayland compatibility layer provided for Wayland is known to be bogus in many aspects. What happens if you disable Wayland usage in Ubuntu 22 ? Note that the full screen mode has always been extremely finicky and crashy in SL viewers. For the Cool VL Viewer, I fully reworked it so that, when enabled, the viewer goes full screen from the very start, instead of attempting to switch from windowed to full screen (i.e. fully restart GL from scratch) on login: it solves many issues (mainly OpenGL driver level crashes, but also resolution detection issues). For Linux, it also got an optional ”full desktop” mode, instead of genuine full screen (i.e. the viewer uses your current desktop resolution with a decoration-less window, the bonus being that it can also run with other managed windows on top of it). Finally, it may be possible, depending on the window manager in use, to add a rule in the latter so that it does not decorate the viewer window and forces it full screen; you then could run the viewer in ”windowed” mode, but full screen and border-less, similar to what you would get in ”full desktop” mode.
  14. I'm more under the impression that you are looking for just one favourable testimony to use as an excuse to follow your personal feeling/belief that an AMD card would be better suited for you... Just go ahead, buy whatever suits your own needs/preferences, and take responsibility for it. Just don't come back here to complain that ”we” gave you bad advice, should you find out you made a mistake. 😜 As for graphics card prices, it might be wiser/smarter to wait a little bit: NVIDIA's cards are already seeing their prices adjusted as a result of AMD's newest card releases (competition is a Good Thing ™). It will take some time to propagate to France (but you could just as well buy a card from a more reactive German supplier), but prices are going to drop a bit in the coming weeks. The second half of October is usually a good moment to buy computer hardware (long enough after people's return from Summer vacations, soon enough before Christmas). There is also the option to wait for a sale/opportunity on the previous generation of cards (even an RTX 3070 is plenty powerful enough for SLing).
  15. These are very, very low settings for a Core-i9 and a RTX 3080, even mobile ones... You could easily push the graphics settings between High and Ultra, and yet keep your PC running cool enough by using the frame limiter at 60fps...
  16. Not for the full session, no, but typically up to two minutes or so after the ”departure” (actually after the region is out of draw distance): the event poll is terminated when it times out after the departure, then the region itself is removed from memory, within one minute, the last thing to go being the UDP data circuit, about two minutes after departure. Of course, should the avatar go back to the region, things still around may be reused...
  17. The viewer will use the highest possible OpenGL version available from your drivers. The higher the better (better optimized, faster, more features). The viewer currently does not need features specifically introduced in the latest OpenGL versions (this might change ”soon”: the PBR viewer already needs OpenGL v3.2 features), but drivers with v4.6 and a good core profile implementation (the core profile gets rid of the cruft from old OpenGL versions) are much faster. NVIDIA proprietary drivers in OpenGL v4.6 core profile are typically +50% to +100% faster than in compatibility profile, something that is not seen happening with AMD proprietary drivers.
  18. I'd say I am not surprised, given the absence of retries from the server part, when TeleportFinish is not received by viewers... Count me in among the ”guinea pigs”. 😜 I'll gladly help testing a fix (or several) for this years-long TP bug that is plaguing SL.
  19. Yes, I tried to set up a larger timeout than the SL servers', and to take into account the bogus ”502 in disguise” errors I get, considering them simple poll timeouts. I then observe a strange thing, which can only be explained by some weird libcurl internals: the timeouts occur after 61.25 seconds or so (instead of 30.25 seconds or so, which would correspond to the server timeout plus the ping time), and I do see in Wireshark libcurl retrying the connection once on the first server timeout (i.e. after ~30s) instead of passing the latter to the application, as instructed (setRetries(0)) !... Maybe it is due to that ”502 in disguise” issue (libcurl won't recognize a ”genuine” timeout and retries once ?)... So in the end, the only way for me to see a genuine timeout occurring is to set the viewer-side timeout below the server one... Also, everyone, you all can stop holding your breath: after stress-testing it (and despite more refinements brought to its code), my workaround does not prove robust enough, and I can still see TP failures happening sometimes (rarer than without it, but still happening nonetheless)... I will publish it in the next Cool VL Viewer releases (with debug settings for a kill switch and several knobs to play with, and that handy poll request age debug display for easy repros of TP failures), but it is not a solution to TP failures, sadly. 😢 So, we will have to wait for Monty to fix the server side of things... 😛
  20. The problem is that you do not get that when logged in to SL: you get a 499 or 500 error header (and ”502 error” printed in the body). Meaning that, somehow, the 502 error gets mutated into another one, and is then not recognized as such by the viewers. This is why you cannot let the server time out when connected to SL (everything works as expected when connected to OpenSim, where I do let the server time out in my code).
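As an illustration of the ”502 in disguise” case described above, here is a small Python sketch that treats a 499 or 500 status whose body mentions 502 as a simple poll timeout. The function names are hypothetical and the substring test on the body is an assumption; the posts only state that ”502 error” is printed in the body.

```python
def is_disguised_502(status, body):
    """True for a 499/500 status whose body mentions 502 (assumed check)."""
    return status in (499, 500) and "502" in body


def classify_poll_result(status, body, timed_out_locally=False):
    """Classify a poll result as 'ok', 'timeout' or 'error'."""
    if timed_out_locally:
        # The HTTP stack timed out on the viewer side.
        return "timeout"
    if status == 502 or is_disguised_502(status, body):
        # Genuine 502, or a 502 hidden behind a 499/500 header:
        # considered a simple poll timeout, to be retried.
        return "timeout"
    if 200 <= status < 300:
        return "ok"
    return "error"
```

With this sketch, `classify_poll_result(500, "502 error")` returns `"timeout"`, while a 500 with an unrelated body stays an `"error"`.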
  21. Success ! I managed to: (1) reliably reproduce the TP failure modes related to event poll request expiration and restart delays (race condition with the servers); (2) find and implement a robust workaround for those. The problem seen is indeed due to how a TP request by the user can be sent to the server while the poll request is about to time out, or was just closed and is being restarted as the result of the receipt of an event poll message. If the server queues the TeleportFinish message (or any message, but this one is unique and supposed to be 100% reliable, unlike ParcelProperties & Co) while the viewer is in the process of restarting a poll request, somehow that message will never be received. To confirm this, I use an LLTimer which is reset just before I post (and yield) the request in LLEventPoll. I also use a 25s timeout and no libcurl-level retries for those requests, so that they always time out on the viewer side and so that the said timeout is always seen happening by the LLEventPoll code. I also implemented a debug display for that timer in the viewer window, so that I can easily trigger a TP manually just before or just after the event poll request has expired or started; doing so, I can reliably reproduce the TP failures that so far seemed to happen ”randomly”. As for the workaround, it is implemented in the form of TP queuing and event poll timer window checking; whenever a TP request is made 500ms or less before the agent region poll request would time out, or after it has been restarted, the TP is queued (via a new LLAgent::TELEPORT_QUEUED state, which allows using the existing state machine implemented in llagent.cpp and llviewerdisplay.cpp), and the corresponding UDP message (either TeleportLocationRequest, TeleportLandmarkRequest or TeleportLureRequest) requesting the TP from the server is put on hold until the event poll request timer is again in the stable/established connection window, at which point the TP request message is sent. 
So far (stress-testing still in progress), it works wonders and I do not experience failed TPs any more. If everything runs as expected and I am satisfied with the stress-testing, this code will be implemented in the next releases of the Cool VL Viewer.
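The timer window check and TP queuing described in this post can be sketched as below. This is a Python illustration, not the code from llagent.cpp: the class and function names are hypothetical, and applying the same 500ms guard right after a poll restart is an interpretation of the post. The 25s timeout and 500ms figures are the ones quoted above.

```python
POLL_TIMEOUT_S = 25.0  # viewer-side poll timeout quoted in the post
GUARD_S = 0.5          # 500ms guard window quoted in the post


def tp_must_wait(poll_age_s, timeout_s=POLL_TIMEOUT_S, guard_s=GUARD_S):
    """True when the poll request is too young or too close to its timeout."""
    just_started = poll_age_s <= guard_s
    about_to_expire = poll_age_s >= timeout_s - guard_s
    return just_started or about_to_expire


class TeleportQueue:
    """Holds back a TP request until the poll is in its stable window."""

    def __init__(self, send):
        self._send = send    # callable sending the UDP TP request message
        self._queued = None  # at most one TP request on hold

    def request(self, message, poll_age_s):
        if tp_must_wait(poll_age_s):
            self._queued = message
            return "QUEUED"
        self._send(message)
        return "SENT"

    def on_poll_tick(self, poll_age_s):
        # Called regularly with the current age of the poll request.
        if self._queued is not None and not tp_must_wait(poll_age_s):
            self._send(self._queued)
            self._queued = None
```

A TP requested at a poll age of 24.8s is queued, stays queued while the restarted poll is only 0.2s old, and is sent once the age passes the guard window.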
  22. EEEK ! Don't do that: viewers would see those ugly ”502 in disguise” errors, which would be considered poll request failures by the current viewers' code, and only retried a limited number of times ! With the current viewer code and in SL (*), the poll request timeout must occur on the viewer side (yes, even though it is ”transparently” retried at the libcurl level: the important point is that the fake 502 error is not seen by the viewer code). If anything, increasing the server side timeout from 30s to 65s or so (so that a ”ParcelProperties” message would make it through before each request would time out) would reduce the opportunities for race conditions. (*) For OpenSim-compatible viewers, a (true) 502 error test is added, which is considered a timeout and retried like a viewer-side libcurl timeout, but this test is only performed while connected to OpenSim servers, which do not lie about 502 errors by disguising them as 499 or 500 ones in their header. Pretty please, make it so that these changes remain backward-compatible... One possible such change would be as follows: currently, viewers acknowledge the previous poll event ”id” on restarting a request, by setting the ”ack” request field equal to the previous result ”id”. It means that, for TeleportFinish, the server would normally see the ”id” used to transmit it coming back immediately in the ”ack” field of the request following its receipt by the viewer. If the server does not get it (because it does not get a new request posted by the viewer), then the TeleportFinish was not received and should be resent. To be 100% sure that the request is not just in flight or delayed, the server could send two different commands in a row on TP: TeleportFinish first, then, for example, ParcelProperties in a different message (different id); then, if no ”ack” for TeleportFinish has been received, re-issue it.
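The backward-compatible resend idea at the end of this post can be sketched as follows. This is a Python illustration, not simulator code: the class is hypothetical, ids are assumed to be increasing integers, and the rule used here (an ”ack” for a later id with no ”ack” ever seen for an earlier one means the earlier message was lost) is one possible reading of the proposal.

```python
class ReliableEventSender:
    """Toy server-side tracker for event poll replies awaiting an ack."""

    def __init__(self):
        self._next_id = 1
        self._unacked = {}  # reply id -> event not yet acknowledged

    def send(self, event):
        """Build a reply carrying one event and remember it until acked."""
        reply_id = self._next_id
        self._next_id += 1
        self._unacked[reply_id] = event
        return {"id": reply_id, "events": [event]}

    def on_poll_request(self, ack):
        """Handle the "ack" of a new poll request.

        Returns the events that must be re-issued: those sent with an id
        lower than the acknowledged one, but never acknowledged themselves.
        """
        if ack is None:
            return []
        self._unacked.pop(ack, None)
        missed_ids = [i for i in sorted(self._unacked) if i < ack]
        return [self._unacked.pop(i) for i in missed_ids]
```

If TeleportFinish is sent with id 1 and ParcelProperties with id 2, a request arriving with `ack == 2` while id 1 was never acknowledged returns `["TeleportFinish"]` for re-issue; repeated (double) acks are harmless.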
  23. The LLAppCoreHttp::AP_LONG_POLL policy group does not define the retry attempts, at least not in my viewer... But explicitly setting mHttpOptions->setRetries(0) causes ”502” errors in disguise (502 body, 500 or 499 header) to happen... However, setting mHttpOptions->setTransferTimeout(25) (25s timeouts, i.e. below the server timeout) with mHttpOptions->setRetries(0) seems to work just fine: libcurl then times out after 25s and the viewer fires a new poll, as expected (and no trace of retries in Wireshark)... This would eliminate a possible cause for a race condition. And I got an idea to avoid TP failures that would possibly be the result of a race between the processing of a received event, the triggering of a TP by the user just at that moment, the firing of a new poll, and the TeleportFinish transmission. I'll try to set an ”in flight” flag on starting the poll request, reset it on request return, and test that flag on TP: when not ”in flight”, yield to coroutines until the coroutine for the event poll can fire a new request (setting the flag); the TP would then be fired while the poll request is ”stable” and waiting for a server transmission.
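The ”in flight” flag idea can be illustrated with Python's asyncio standing in for the viewer's coroutines. Everything here is a hypothetical sketch (class, function names and the fake request), not the LLEventPoll implementation: the flag is set when a poll request starts, cleared when it returns, and a TP waits for the flag before being sent.

```python
import asyncio


class EventPoll:
    """Toy event poll exposing an 'in flight' flag."""

    def __init__(self):
        self.in_flight = asyncio.Event()

    async def run_once(self, do_request):
        self.in_flight.set()        # poll request is being posted
        try:
            return await do_request()
        finally:
            self.in_flight.clear()  # request returned (reply or timeout)


async def teleport(poll, send_tp):
    # Yield to other coroutines until a poll request is in flight,
    # then send the TP request.
    await poll.in_flight.wait()
    send_tp()


async def demo():
    poll = EventPoll()
    log = []

    async def fake_request():
        log.append("poll posted")
        await asyncio.sleep(0.05)   # stands for the long poll wait
        return {"id": 1, "events": []}

    tp = asyncio.create_task(teleport(poll, lambda: log.append("TP sent")))
    await asyncio.sleep(0)          # let teleport() start and block on the flag
    await poll.run_once(fake_request)
    await tp
    return log
```

Running `asyncio.run(demo())` shows the TP being held back until the poll request has been posted.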
  24. Yup, you are right... Can see this with Wireshark. The retry is likely done at libcurl level... More race condition opportunities ! 😢 Which only advocates for a return of reliable message events such as TeleportFinish to the ”reliable UDP” path provided by the viewer...