LSL String compression

Jenna Huntsman · November 11, 2020

Hey!

I've got a script which needs to pass a long list to another script via llRegionSayTo.

However, I noticed on the LSL wiki that scripts can't receive messages longer than 256 characters, so I wanted to look into compressing the content of the list and decompressing it on the other side.

How could I do this? I looked at llMD5String, but that seems useful only for authenticating the content of a message, not for compressing it.

Thanks!

EDIT: Would llXorBase64 be suitable for this kind of thing? I don't really need the security aspect but if it can compress a longer string into a shorter one that'd be really useful!

Edited November 11, 2020 by Jenna Huntsman

Profaitchikenz Haiku · November 11, 2020

From the wiki entry for the listen event

Chat on public channel and positive channels is truncated to 1023 bytes. Chat on negative channels is truncated to 254 bytes.

So if you send on a positive channel you shouldn't need to compress the messages, unless you're streaming a novel.

KT Kingsley · November 11, 2020

The wiki entry for the listen event at http://wiki.secondlife.com/wiki/Listen makes no mention of the length of messages: none of "1023", "256" or "254" appears on the page.

The wiki entry for llRegionSayTo at http://wiki.secondlife.com/wiki/LlRegionSayTo does specify a maximum message length of 1024 bytes; how many characters this can accommodate will depend on whether unicode is involved.

Edit: I did just check in-world, and I can confirm it's possible to send a 1024 byte message using llRegionSayTo to a script in another object which does receive it in full in its listen event, using a positive-numbered channel.

Edited November 11, 2020 by KT Kingsley

Wulfie Reanimator · November 11, 2020

LSL doesn't have reversible compression algorithms built-in. You'd have to implement your own.

Here's one example written by someone else: http://wiki.secondlife.com/wiki/LZW_LSL_Implementation

Mollymews · November 11, 2020

adding to the conversation

short string compression is best achieved with a dictionary similar to how Google Brotli does it

the fixed dictionary contains short codes for words that are used frequently in the app

the first instance of a word not found in the fixed dictionary is encoded verbatim and then appended to the dictionary with an assigned short code. When a second instance of the word is found in the string then this is encoded with its assigned short code

Brotli links:

https://github.com/google/brotli

https://en.wikipedia.org/wiki/Brotli
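A minimal sketch of the fixed-dictionary part of this idea follows. It is not how Brotli itself works internally and it leaves out the adaptive step (adding unseen words to the dictionary). The word list, the names DICT, BASE, dictEncode and dictDecode, and the choice of Private Use Area code points as short codes are all assumptions made for illustration. It also assumes the source text never contains characters from U+E000 upward, and it relies on llReplaceSubString and llChar being available.

```lsl
// Hypothetical fixed dictionary: placeholders for whatever words
// your own application sends most often.
list DICT = ["position", "rotation", "texture", "owner"];
integer BASE = 0xE000; // first code point of the Unicode Private Use Area

// Replace every dictionary word with its one-character code.
string dictEncode(string s)
{
    integer i;
    integer n = llGetListLength(DICT);
    for (i = 0; i < n; ++i)
        s = llReplaceSubString(s, llList2String(DICT, i), llChar(BASE + i), 0);
    return s;
}

// Replace every one-character code with its dictionary word.
string dictDecode(string s)
{
    integer i;
    integer n = llGetListLength(DICT);
    for (i = 0; i < n; ++i)
        s = llReplaceSubString(s, llChar(BASE + i), llList2String(DICT, i), 0);
    return s;
}

default
{
    state_entry()
    {
        string packed = dictEncode("owner position and rotation");
        llOwnerSay((string)llStringLength(packed) + " characters after encoding");
        llOwnerSay(dictDecode(packed));
    }
}
```

Two caveats on this design choice: codes in this range are no longer ASCII, so the output cannot be fed into the two-ASCII-chars-into-one packing described further down, and each code occupies 3 bytes when sent as UTF-8 chat, so only words longer than 3 bytes save anything in a message.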

after being run thru dictionary encoding then the string can potentially be compressed/packed further using a general compression algorithm. There are a few LSL implementations of general compression algorithms on the wiki. To find search on: LSL compression

when our source string only contains ASCII chars then a simple post-dictionary packing technique is to use Pedro Oval's Ord and Chr functions documented here: http://wiki.secondlife.com/wiki/Ord

with ASCII-only source strings, these functions let us effectively halve the length of the string. Example:

string source = "~ Some string ~";
                
// encode : pack 2 ascii chars into 1 UTF-16 char       
string encoded;
integer strlen = llStringLength(source);
integer i;
for (i = 0; i < strlen; i += 2)
{
  integer c1 = Ord(llGetSubString(source, i, i));
  integer c2 = Ord(llGetSubString(source, i + 1, i + 1));
  integer n = (c1 << 8) | c2;
  string e = Chr(n);
  encoded += e;
}   

// decode
strlen = llStringLength(encoded);
string decoded;
for (i = 0; i < strlen; i++)
{
  integer c = Ord(llGetSubString(encoded, i, i));
  integer n1 = (c >> 8) & 255;
  integer n2 = c & 255;
  string d = Chr(n1) + Chr(n2);
  decoded += d;
}

// check
if (decoded == source)
  llOwnerSay("we are good");

Profaitchikenz Haiku · November 11, 2020

1 hour ago, KT Kingsley said:

The wiki entry for the listen event at http://wiki.secondlife.com/wiki/Listen makes no mention of the length of messages: none of "1023", "256" or "254" appears on the page.

True, it was the function I had read that on, it's in the notes for llListen()

Colour me fatigued

KT Kingsley · November 11, 2020

29 minutes ago, Profaitchikenz Haiku said:

True, it was the function I had read that on, it's in the notes for llListen()

Colour me fatigued

I can confirm that, on both positive and negative channels and with both llSay and llRegionSayTo, a listen event will receive a full 1024-byte message.

Perhaps someone with wiki editing rights will see this and correct the wiki entry for llListen at https://wiki.secondlife.com/wiki/LlListen.

Phate Shepherd · November 12, 2020

52 minutes ago, KT Kingsley said:

I can confirm that, on both positive and negative channels and with both llSay and llRegionSayTo, a listen event will receive a full 1024-byte message.

Perhaps someone with wiki editing rights will see this and correct the wiki entry for llListen at https://wiki.secondlife.com/wiki/LlListen.

I verified that script to script allows 1024 bytes for both positive and negative channels. Public channel 0 chat was cropped at 1023 bytes. Tested both LSL and Mono.

Edited the Wiki.

Edited November 12, 2020 by Phate Shepherd

Phate Shepherd · November 12, 2020

3 hours ago, Jenna Huntsman said:

Hey!

I've got a script which needs to pass a long list to another script via RegionSayTo

However, I noticed on the LSL wiki that scripts can't receive messages longer than 256 characters, so I wanted to look into compressing the content of the list and decompressing it on the other side.

If you are trying to send more than 1024 bytes, I'd be looking at something simpler... splitting the string into smaller chunks, say 512 bytes each, and putting a header on the front of each one with "part x of y" and timestamp info. Then reassemble on the other end once all parts have arrived and the timestamps match. (Messages aren't guaranteed to arrive in order.)
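A rough sketch of that chunking idea, under stated assumptions: the channel number, CHUNK_SIZE, the "id|part|total|payload" header layout and all function and variable names are invented for illustration. The sender and the listener are shown in one script for brevity; in practice they would live in two different objects. Chunks are cut by character count, which equals the byte count only for ASCII text.

```lsl
integer CHANNEL = -7654321; // arbitrary example channel
integer CHUNK_SIZE = 512;   // characters per chunk (bytes only if ASCII)

string gId;     // id of the message currently being assembled
list gParts;    // one slot per part, filled in as parts arrive
integer gGot;   // number of distinct parts received so far

sendChunked(key target, string data)
{
    integer len = llStringLength(data);
    integer total = (len + CHUNK_SIZE - 1) / CHUNK_SIZE; // round up
    string id = (string)llGetUnixTime(); // same stamp on every part
    integer part;
    for (part = 0; part < total; ++part)
    {
        integer start = part * CHUNK_SIZE;
        string payload = llGetSubString(data, start, start + CHUNK_SIZE - 1);
        llRegionSayTo(target, CHANNEL,
            llDumpList2String([id, part, total, payload], "|"));
    }
}

default
{
    state_entry()
    {
        llListen(CHANNEL, "", NULL_KEY, "");
    }

    listen(integer channel, string name, key id, string msg)
    {
        list f = llParseStringKeepNulls(msg, ["|"], []);
        if (llGetListLength(f) < 4) return; // not one of our chunks
        string msgId = llList2String(f, 0);
        integer part = llList2Integer(f, 1);
        integer total = llList2Integer(f, 2);
        // The payload may itself contain "|", so rejoin the remainder.
        string payload = llDumpList2String(llList2List(f, 3, -1), "|");

        if (msgId != gId)
        {   // first part seen of a new message: reset the slots
            gId = msgId;
            gGot = 0;
            gParts = [];
            integer n;
            for (n = 0; n < total; ++n) gParts += [""];
        }
        if (part < 0 || part >= llGetListLength(gParts)) return;
        if (llList2String(gParts, part) == "")
        {   // store by index, so arrival order does not matter
            gParts = llListReplaceList(gParts, [payload], part, part);
            ++gGot;
        }
        if (gGot == llGetListLength(gParts))
        {
            string full = llDumpList2String(gParts, "");
            llOwnerSay("reassembled " + (string)llStringLength(full) + " characters");
            gId = "";
            gParts = [];
            gGot = 0;
        }
    }
}
```

This sketch tracks only one message at a time and uses a one-second timestamp as the id, so two messages sent within the same second, or parts from two senders interleaving, would need a richer id (for example the sender's key plus a counter).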

Lucia Nightfire · November 12, 2020

User chat via the viewer is 1023 bytes on non-negative channels.

User chat via the viewer is 254 bytes on negative channels.

Scripted chat is 1024 bytes on any channel.

I've been hoping that the user chat byte limit on negative channels could be raised, but so far it hasn't happened.

Lucia Nightfire · November 12, 2020

4 hours ago, Phate Shepherd said:

Edited the Wiki.

You might want to edit it back with mention that the 1023 & 254 limits are based on user chat, not scripted.

Profaitchikenz Haiku · November 12, 2020

3 hours ago, Lucia Nightfire said:

You might want to edit it back

Oh no, let's not start disputing the ~~election~~ editing. (It almost looks like somebody somewhere decided to put a limit on Para-RP)

Mollymews · November 12, 2020

8 hours ago, Phate Shepherd said:

If you are trying to send more than 1024 bytes, I'd be looking at something simpler... splitting the string into smaller chunks, say 512 bytes each, and putting a header on the front of each one with "part x of y" and timestamp info. Then reassemble on the other end once all parts have arrived and the timestamps match. (Messages aren't guaranteed to arrive in order.)

agree

this would be a lot more efficient than compression ever will be

Jenna Huntsman · November 13, 2020

Just building on this actually:

Did some modification to my scripts so now the strings are sent in blocks to solve the memory issue.

I've actually now run into another (mostly unrelated) issue wherein I have a long list of strings, however I'm running out of memory when that list gets too long. Each string is around 50 characters or so.

That Ord function looks like it might do the job of compressing that list well, would that be a good idea or should I look at another solution?

Thanks guys!

KT Kingsley · November 13, 2020

One thing I'd suggest is to have a second script that only deals with the stored data and communicates with the main script using link messages, perhaps serving up just one list at a time. How useful that'd be probably depends on how much of the main script's memory is used by the code, and how difficult it'd be to implement depends on how you're using the data in the main script.
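As a minimal sketch of such a storage script (the command numbers and names below are arbitrary assumptions; the main script would have to use the same values), something along these lines could hold the list and hand back one entry at a time:

```lsl
// Storage script sketch: holds the data so the main script doesn't have to.
integer CMD_STORE = 9001; // str = entry to append
integer CMD_FETCH = 9002; // str = index of the entry wanted
integer CMD_REPLY = 9003; // str = the requested entry, sent back

list gData;

default
{
    link_message(integer sender, integer num, string str, key id)
    {
        if (num == CMD_STORE)
        {
            gData += [str];
        }
        else if (num == CMD_FETCH)
        {
            // id is passed back unchanged so the caller can match replies
            llMessageLinked(sender, CMD_REPLY,
                llList2String(gData, (integer)str), id);
        }
    }
}
```

The main script would then call, for example, llMessageLinked(LINK_THIS, CMD_STORE, entry, NULL_KEY) to add an entry and llMessageLinked(LINK_THIS, CMD_FETCH, "42", NULL_KEY) to request one, handling CMD_REPLY in its own link_message event.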

Wulfie Reanimator · November 13, 2020

4 hours ago, Jenna Huntsman said:

I've actually now run into another (mostly unrelated) issue wherein I have a long list of strings, however I'm running out of memory when that list gets too long. Each string is around 50 characters or so.

50 characters is not a lot. How many strings are we talking about?

Jenna Huntsman · November 13, 2020

2 hours ago, Wulfie Reanimator said:

50 characters is not a lot. How many strings are we talking about?

Probably going on around 600. Script is running in Mono with no set memory limit

Wulfie Reanimator · November 13, 2020

6 minutes ago, Jenna Huntsman said:

Probably going on around 600. Script is running in Mono with no set memory limit

Ah okay, that's gonna be more memory than any single script can hold. Each string in a list is going to need 18 bytes + 2 per character, or 118 * 600 = 70,800 bytes in your case. That's ~69 KB (plus all the other code required to make use of that data), while Mono is limited to 64 KB.

You'll either have to start compressing things (the Ord suggestion seems pretty good), or add more scripts whose only job is to hold data and pass it back to the main scripts when needed. Both have their pros and cons. Compression allows you to work with fewer scripts, but adding more scripts is easier to scale with your data. You might even consider an external server if you're expecting the amount to keep growing over time.
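To put rough numbers on the two options, here is a small sketch that applies the 18-bytes-plus-2-per-character rule of thumb quoted above. The function name is made up, and the "packed" figure assumes the strings are pure ASCII so that the two-into-one packing halves them to 25 characters each; real usage will vary.

```lsl
// Estimate list memory using the rule of thumb from this thread:
// 18 bytes of overhead per string plus 2 bytes per character.
integer estimateListBytes(integer count, integer charsPerString)
{
    return count * (18 + 2 * charsPerString);
}

default
{
    state_entry()
    {
        // 600 strings of 50 characters: 600 * 118 = 70800 bytes
        llOwnerSay("plain:  " + (string)estimateListBytes(600, 50));
        // 600 strings packed to 25 characters: 600 * 68 = 40800 bytes
        llOwnerSay("packed: " + (string)estimateListBytes(600, 25));
    }
}
```

By this estimate the packed list would fit under the 64 KB (65,536 byte) Mono limit, but it leaves only about 24 KB for the code and everything else, so splitting the data across scripts may still be needed as the list grows.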

Jenna Huntsman · February 15, 2023

Hey all!

Had to come back here for a project that I'm working on. Did some updates to @Mollymews' Ord-based compression, to make use of the native LSL functions, and tighten up the function to use as little memory as possible.

string MemComp(string sInput, integer iDir)
{ //Compress 2 UTF-8 chars into UTF-16, or decompress UTF-16 to UTF-8. Credit: Jenna Huntsman, Mollymews
    string sOut;
    integer i;
    for (i = 0; i < llStringLength(sInput); ++i)
    {
        if(!iDir) //If iDir is 0, we're encoding - any other value will decode.
        { //Encode
            sOut += llChar((llOrd(sInput,i) << 8) | llOrd(sInput,i+1));
            ++i; //Iterate i by 2 and not 1 on encode.
        }
        else
        { //Decode
            integer c = llOrd(sInput,i);
            sOut += llChar((c >> 8) & 255) + llChar(c & 255);
        }
    }
    return sOut;
}

Quistess Alpha · February 15, 2023

42 minutes ago, Jenna Huntsman said:

tighten up the function to use as little memory as possible

Good job! Some really minor suggestions, but wouldn't flipping the conditional save doing a negation operation for every character? if(iDir) instead of if(!iDir). Come to think of it, you could just place the conditional outside the loop, and if I'm being over-optimizationalist, overload the input parameter as the loop variable:

string MemComp(string sInput, integer iDir)
{ //Compress 2 UTF-8 chars into UTF-16, or decompress UTF-16 to UTF-8. Credit: Jenna Huntsman, Mollymews
    string sOut;
    if(iDir) // decode for every non-0 value.
    { //Decode
      for (iDir = 0; iDir < llStringLength(sInput); ++iDir) // overload iDir to be a loop increment.
      {   integer c = llOrd(sInput,iDir);
          sOut += llChar((c >> 8) & 255) + llChar(c & 255);
      }
    }else // idir==0
    { //Encode
      for (iDir = 0; iDir < llStringLength(sInput); ++iDir) // overload iDir to be a loop increment.
      {   sOut += llChar((llOrd(sInput,iDir) << 8) | llOrd(sInput,iDir+1));
          ++iDir; //Advance iDir by 2, not 1, on encode.
          //(if you wanted to be fancy could embed ++ in the += line, but that would be harder to read and debug given lsl execution order, and no more efficient.)
          // something like llChar(llOrd(sInput,++iDir) | (llOrd(sInput,++iDir) << 8) ); with the increment removed from the for construction and initialize to -1.
      }
    }
    return sOut;
}

again though, really minor nitpicks!

Love Zhaoying · February 15, 2023

Glad this thread popped up today, I'm about to work on something that uses llRegionSayTo() and forgot I'll need to split my strings into segments (because some will definitely be longer than 1024)!

Jenna Huntsman · February 15, 2023

1 hour ago, Quistess Alpha said:

Good job! Some really minor suggestions, but wouldn't flipping the conditional save doing a negation operation for every character? if(iDir) instead of if(!iDir). Come to think of it, you could just place the conditional outside the loop, and if I'm being over-optimizationalist, overload the input parameter as the loop variable:

string MemComp(string sInput, integer iDir)
{ //Compress 2 UTF-8 chars into UTF-16, or decompress UTF-16 to UTF-8. Credit: Jenna Huntsman, Mollymews
    string sOut;
    if(iDir) // decode for every non-0 value.
    { //Decode
      for (iDir = 0; iDir < llStringLength(sInput); ++iDir) // overload iDir to be a loop increment.
      {   integer c = llOrd(sInput,iDir);
          sOut += llChar((c >> 8) & 255) + llChar(c & 255);
      }
    }else // idir==0
    { //Encode
      for (iDir = 0; iDir < llStringLength(sInput); ++iDir) // overload iDir to be a loop increment.
      {   sOut += llChar((llOrd(sInput,iDir) << 8) | llOrd(sInput,iDir+1));
          ++iDir; //Advance iDir by 2, not 1, on encode.
          //(if you wanted to be fancy could embed ++ in the += line, but that would be harder to read and debug given lsl execution order, and no more efficient.)
          // something like llChar(llOrd(sInput,++iDir) | (llOrd(sInput,++iDir) << 8) ); with the increment removed from the for construction and initialize to -1.
      }
    }
    return sOut;
}

again though, really minor nitpicks!

I actually started off in a similar manner, but figured I could tighten up the code into a single loop which I thought would save on memory a bit. Neat idea about using iDir though, I just put my own spin on it to compress it back into using a single loop again. See what you think!

string MemComp(string sInput, integer iDir) //iDir is a bool, so should be FALSE (0) for encode or TRUE (1) for decode
{ //Compress 2 UTF-8 chars into UTF-16, or decompress UTF-16 to UTF-8. Credit: Jenna Huntsman, Mollymews
    string sOut;
    for (; iDir < llStringLength(sInput)*2; iDir = iDir + 2)
    {
        if(((iDir % 2) == 0)) //If iDir is an even number (including 0), we're encoding - any other value will decode.
        { //Encode
            sOut += llChar((llOrd(sInput,iDir/2*2) << 8) | llOrd(sInput,(iDir/2*2)+1));
        }
        else
        { //Decode
            integer c = llOrd(sInput,iDir/2);
            sOut += llChar((c >> 8) & 255) + llChar(c & 255);
        }
    }
    return sOut;
}

Edited February 15, 2023 by Jenna Huntsman

Quistess Alpha · February 15, 2023

26 minutes ago, Jenna Huntsman said:

figured I could tighten up the code into a single loop which I thought would save on memory a bit.

I guess it depends on what you're trying to optimize. A single loop means the function itself will take up, maybe 10~50 bytes less script space? but will run incalculably slower. (because you're checking something you already know, for every character)

ETA: Also, I liked the first one better. % and / are both slow. Prefer &1 and >>1 (or *0.5 for floats) respectively.

(n/2)*2 == n&(~1) == n&(integer)(-2)

Edited February 15, 2023 by Quistess Alpha

Wulfie Reanimator · February 15, 2023

If the goal is absolutely minimal script memory usage, you could run it through @Sei Lisa's LSL PyOptimizer.

I'm not sure if there are any benefits in doing that since this is a user function, which should use a minimum 512 byte block of memory.

Love Zhaoying · February 15, 2023

1 minute ago, Wulfie Reanimator said:

If the goal is absolutely minimal script memory usage, you could run it through @Sei Lisa's LSL PyOptimizer.

I'm not sure if there are any benefits in doing that since this is a user function, which should use a minimum 512 byte block of memory.

Thanks, I had heard of this but never saw it until now.
