
ASCII string packing with new LSL functions llOrd and llChar


Mollymews

this is a very simple ASCII character string packer using the new functions llOrd and llChar

the idea is to pack two ASCII characters into one UTF-16 character in a form that can be transmitted as a CSV string

as written, it packs a list to a CSV string, then unpacks the CSV string back to a list
 

// basic string packing
// using new functions llOrd and llChar
//
// assumes the input is ASCII (up to 8-bit characters)

list lstData;
string strData;


pack() // data list to csv string
{  
    strData = ""; // clear csv string
    integer dLen = llGetListLength(lstData);
    integer i;
    for (i = 0; i < dLen; i++)
    {
        // pack list element
        string element = llList2String(lstData, i);
        integer eLen = llStringLength(element);
        integer j;
        for (j = 0; j < eLen; j += 2)
        {
            strData += llChar( (llOrd(element, j) << 8) | llOrd(element, j + 1)  );
        }
        if (i < dLen-1) strData += ",";
    }
    lstData = []; // mark for garbage collection to free mem    
}

unpack() // csv string to data list
{
    lstData = []; // clear data list
    while (strData)
    {
        // copy element to buffer
        string buffer;
        integer dPtr = llSubStringIndex(strData, ",");
        if (dPtr == -1) // end of strData
        {
            buffer = strData;
            strData = "";
        }
        else
        {
            buffer = llGetSubString(strData, 0, dPtr - 1);
            // remove element from csv string
            strData = llDeleteSubString(strData, 0, dPtr);
        }
        // unpack element and append to list
        integer bLen = llStringLength(buffer);
        string element;
        integer i;     
        for (i = 0; i < bLen; i++)
        {
           integer n = llOrd(buffer, i);
           element += llChar(n >> 8);
           n = n & 255;
           if (n > 0) element += llChar(n); // catches odd length strings  
        }
        lstData += [element];
    }
}    


default
{
    state_entry()
    {
        strData =  "a,song,happy,days,are,here,again,the,sky,above,is,clearer,again";
        llOwnerSay((string)llStringLength(strData) + " " + strData);
        // string length is 63
     
        // copy to data list
        lstData = llParseString2List(strData, [","], []);
        
        pack();  // list to string
        llOwnerSay((string)llStringLength(strData) + " " + strData);

        // packed string is 椀,獩湧,桡灰礀,摡祳,慲攀,桥牥,慧慩渀,瑨攀,獫礀,慢潶攀,楳,捬敡牥爀,慧慩渀
        // packed string length is 42
        
        unpack(); // string to list
        llOwnerSay(llDumpList2String(lstData, ","));
    }
}

 


i had a quick play. For the above packer script the range is best restricted to the 7-bit ASCII character set " " to "~". ASCII(32) to ASCII(126)

the outputted UTF-16 value range of this set is decimal [8224..32382] hex [0x2020..0x7E7E] for which there are valid UTF-16 chars
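The two endpoints of that range can be checked with a couple of lines, as a minimal sketch (the variable names lo and hi are just for illustration):

```lsl
default
{
    state_entry()
    {
        integer lo = (32 << 8) | 32;    // two spaces packed together, 0x2020
        integer hi = (126 << 8) | 126;  // two tildes packed together, 0x7E7E
        llOwnerSay((string)lo + " " + (string)hi); // 8224 32382
    }
}
```

Note that an odd trailing character is packed as c << 8 with a zero low byte, so those values fall in [0x2000..0x7E00] rather than inside the pair range.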

 

Edited by Mollymews
typs

I did a forum search, and did not find any posts saying that the new llOrd() and llChar() functions are live.

Maybe it was only announced in release notes (besides Rider's proposal quotes in Molly's other posts)?

Please let me know how this was announced.


i saw the proposal announcement in the transcript of the group meeting over on Inara Pey's blog. Then I waited for the LSL wiki to be updated with descriptions of the functions. When the wiki was updated, I typed them into a script and they worked

 


1 hour ago, Mollymews said:

i saw the proposal announcement in the transcript of the group meeting over on Inara Pey's blog. Then I waited for the LSL wiki to be updated with descriptions of the functions. When the wiki was updated, I typed them into a script and they worked

 

I was amazed to see it, because not only had I missed the plans and the new llFuncs() on the wiki, I had even used the llChar() library function just months ago! Would have taken a different path if they had existed!


I've been playing with these funcs since the day they came out and gotta say, it's both awesome that we don't have to rely on user-written code (as good as it usually is with advanced topics such as this), and weird to get used to everything being based on UTF-32 xD There's also llHash in the same pack o' new functions for anyone that wants a numeric hash of a string (it's a 32-bit hash, so collisions are possible and the numbers aren't guaranteed unique), though it's not intended for cryptography.

 

To Love Zhaoying, I only noticed it because I happened to be on the wiki's LSL Functions page at the time and noticed a new crop of functions with the (NEW!) icon next to them. I looked all over and the only other mention I could find was the original JIRA request asking for some sort of char/ordinal handling built into LSL.


On 7/6/2021 at 2:14 AM, Mollymews said:

i had a quick play. For the above packer script the range is best restricted to the 7-bit ASCII character set " " to "~". ASCII(32) to ASCII(126)

the outputted UTF-16 value range of this set is decimal [8224..32382] hex [0x2020..0x7E7E] for which there are valid UTF-16 chars

 

Was wondering what the range was, i'm far from being an expert on unicode and its quirks in LSL. Would the best way to restrict the range be simply to have a couple of if statements checking if it's above 32 or below 126 before continuing? Or is there a more elegant way, like with a bitmask or something? o:


8 hours ago, Ember Shuffle said:

Was wondering what the range was, i'm far from being an expert on unicode and its quirks in LSL. Would the best way to restrict the range be simply to have a couple of if statements checking if it's above 32 or below 126 before continuing? Or is there a more elegant way, like with a bitmask or something? o:

the question of checking for above 32 or below 126 is probably best discussed further in the LSL Scripting sub-forum, as there is more than one way to code this up
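One of those ways, as a minimal sketch using plain comparisons: isPackable is a hypothetical helper name, not something from this thread. A single bitmask cannot express the range 32 to 126 exactly, since masking can only test for "7-bit or not" and would still let control characters through.

```lsl
// hypothetical helper: TRUE if every character is in ASCII(32)..ASCII(126)
integer isPackable(string s)
{
    integer len = llStringLength(s);
    integer i;
    for (i = 0; i < len; i++)
    {
        integer c = llOrd(s, i);
        if (c < 32 || c > 126) return FALSE; // outside the printable 7-bit set
    }
    return TRUE;
}

default
{
    state_entry()
    {
        llOwnerSay((string)isPackable("Happy days")); // all printable ASCII
        llOwnerSay((string)isPackable("❶ and ❷"));    // contains non-ASCII chars
    }
}
```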

this said I did some further playing.  Seems the valid n == llOrd(llChar(n), 0) conversion table is:

[1..55295], [57345..65533], [65535..1114111]

so if we wanted to pack integers into a string for transmitting and our number range was [0..999999] (1 million) then for example we could add 65535 to our number. Pack c = llChar(n + 65535). Unpack n = llOrd(c, 0) - 65535
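A minimal sketch of that offset trick, assuming the conversion table above holds. packInt, unpackInt and OFFSET are hypothetical names used only for illustration:

```lsl
integer OFFSET = 65535; // shifts values past the ranges that do not round-trip

// pack one integer in [0..999999] into a single character
string packInt(integer n)
{
    return llChar(n + OFFSET);
}

// recover the integer from a character produced by packInt
integer unpackInt(string c)
{
    return llOrd(c, 0) - OFFSET;
}

default
{
    state_entry()
    {
        string c = packInt(123456);
        // reports the original number if the round trip holds
        llOwnerSay((string)unpackInt(c));
    }
}
```

With this offset the largest packed value is 999999 + 65535 = 1065534, which stays below the 1114111 upper limit in the table.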

 

just adding a UTF-16 info link so that we can see why the valid conversion table is what it is

https://en.wikipedia.org/wiki/UTF-16

 

Edited by Mollymews
n

On 7/8/2021 at 2:48 PM, Mollymews said:

the question of checking for above 32 or below 126

i had some time to put together a general purpose string packer/unpacker pair which can differentiate between packable and unpackable strings, where packable strings are those in the range ASCII(32) to ASCII(126)
 

// basic string packer/unpacker for ASCII(32) to ASCII(126)
// differentiating between packable and unpackable strings
//
string packStr(string source)
{
   string result = llChar(1);  // char(1) is our packed string symbol. Change to whichever to suit
   integer len = llStringLength(source);
   integer check = TRUE;
   integer i;
   while (i < len && check)
   {
      integer a = llOrd(source, i);
      integer b = llOrd(source, i + 1);
      // check a in range, check b in range, catch odd length string
      check = (a >= 32 && a <= 126) && ((b >= 32 && b <= 126) || (!b && i == len-1));
      if (check) result += llChar(a << 8 | b);
      i += 2;
   }
   if (check && len)
      return result;  // packed string
   else
      return source;  // unpacked string
}

string unpackStr(string source)
{
   if (llGetSubString(source, 0, 0) != llChar(1))  // is unpacked string
      return source;
   string result;  
   integer len = llStringLength(source);
   integer i;
   for (i = 1; i < len; i++) // start at 1 to skip the symbol
   {
      integer n = llOrd(source, i);
      result += llChar(n >> 8);
      n = n & 255;
      if (n > 0) result += llChar(n); // catch odd length string
   }
   return result;
}


default
{
    state_entry()
    {
        
        // s will pack
        string s = "A song: Happy days are here again,The sky above is clearer again.";
        llOwnerSay((string)llStringLength(s) + " " + s);
        string p = packStr(s);
        llOwnerSay((string)llStringLength(p) + " " + p);        
        string u = unpackStr(p);
        llOwnerSay((string)llStringLength(u) + " " + u);
        
        // s will not pack because chars ❶ ❷ ❸
        s = "A song: (tap)(tap)❶ and ❷ and ❸ Happy days are here again,The sky above is clearer again.";  
        llOwnerSay((string)llStringLength(s) + " " + s);
        p = packStr(s);
        llOwnerSay((string)llStringLength(p) + " " + p);
        u = unpackStr(p);
        llOwnerSay((string)llStringLength(u) + " " + u);
    }
}

 


I merged the two functions into one and tightened things up a touch (based on the assumption that, in use, strings will be packed when stored and unpacked when used)

Call the one function to try and pack a string. If a packed string is passed it will unpack.

This shaves 512 bytes off the total script size vs the two-function original (Mono), which is probably a good thing if packing strings becomes necessary.

// basic string packer/unpacker for ASCII(32) to ASCII(126)
// differentiating between packable and unpackable strings
// Originally by Mollymews, merged to single function by Coffee Pancake
// https://community.secondlife.com/forums/topic/473873-ascii-string-packing-with-new-lsl-functions-llord-and-llchar/
//
// Usage - pass a string. if unpacked will try and pack. if previously packed, will unpack.
string mollypackStr (string source)
{
    string result;
    integer len = llStringLength(source);
    integer i;
    integer a;
    // unpacked string
    if (llGetSubString(source, 0, 0) != llChar(1)) {
        result = llChar(1);  // char(1) is our packed string symbol. Change to whichever to suit
        integer check = TRUE;
        while (i < len && check)
        {
            a = llOrd(source, i);
            integer b = llOrd(source, i + 1);
            // check a in range, check b in range, catch odd length string
            check = (a >= 32 && a <= 126) && ((b >= 32 && b <= 126) || (!b && i == len-1));
            if (check) result += llChar(a << 8 | b);
            i += 2;
        }
        // unpackable !! return original
        if (check && len) ; else {result = source;}
    }
    // Packed String
    else {
       for (i = 1; i < len; i++) // start at 1 to skip the symbol
       {
            a = llOrd(source, i);
            result += llChar(a >> 8);
            a = a & 255;
            if (a > 0) result += llChar(a); // catch odd length string
       }
    }
    return result;
}


default
{
    state_entry()
    {
        llOwnerSay((string)llGetFreeMemory());
        // s will pack
        string s = "A song: Happy days are here again,The sky above is clearer again.";
        llOwnerSay((string)llStringLength(s) + " " + s);
        string p = mollypackStr(s);
        llOwnerSay((string)llStringLength(p) + " " + p);        
        string u = mollypackStr(p);
        llOwnerSay((string)llStringLength(u) + " " + u);
        
        // identify a packed string
        if (llGetSubString(p, 0, 0) == llChar(1)) {
            llOwnerSay("String p is packed!");
        }
        
        // s will not pack because chars ❶ ❷ ❸
        s = "A song: (tap)(tap)❶ and ❷ and ❸ Happy days are here again,The sky above is clearer again.";  
        llOwnerSay((string)llStringLength(s) + " " + s);
        p = mollypackStr(s);
        llOwnerSay((string)llStringLength(p) + " " + p);
        u = mollypackStr(p);
        llOwnerSay((string)llStringLength(u) + " " + u);
    }
}

 

Edited by Coffee Pancake

2 hours ago, Coffee Pancake said:

I merged the two functions into one and tightened things up a touch (based on the assumption that, in use, strings will be packed when stored and unpacked when used)

i like this ! Is nice and simple. I made some stylistic changes also, to keep with the idea of simple

if (check && len) ; else {result = source;}

style: if (!(len && check)) result = source;

if (a > 0) result += llChar(a);

style: if (a) result += llChar(a);

the other style change is using FOR loop in both pack and unpack

string coffeepackStr (string source)
{
    string result;
    integer len = llStringLength(source);
    integer i;
    integer a;
    // unpacked string
    if (llGetSubString(source, 0, 0) != llChar(1)) {
        result = llChar(1);  // char(1) is our packed string symbol. Change to whichever to suit
        integer check = TRUE;
        for (i = 0; i < len && check; i += 2)
        {
            a = llOrd(source, i);
            integer b = llOrd(source, i + 1);
            // check a in range, check b in range, catch odd length string
            check = (a >= 32 && a <= 126) && ((b >= 32 && b <= 126) || (!b && i == len-1));
            if (check) result += llChar(a << 8 | b);
        }
        // unpackable !! return original
        if (!(len && check)) result = source;
    }
    // Packed String
    else {
       for (i = 1; i < len; i++) // start at 1 to skip the symbol
       {
            a = llOrd(source, i);
            result += llChar(a >> 8);
            a = a & 255;
            if (a) result += llChar(a); // catch odd length string
       }
    }
    return result;
}

 

 
