Easy Scripting question - Handle Key Uniqueness

Love Zhaoying · July 5, 2022

I seem to remember that for many, possibly most purposes, storing just the last 4 characters of a "key" for comparison is "close enough".

Can you please confirm this is correct?

My use-case is: I am getting handle(s) for dataserver() calls - in this case for Experience key-value data - and don't want to keep the entire key if necessary.

Part of the reason is: In addition to "storing" the handle-key temporarily in memory, I plan to use the JSON functions to look it up later.

Example:

- hKey is the handle I want to use.

- OtherData is other data that I want to store with the key.

Storing OtherData with the handle-key:

string sJSONVars = llSetJsonValue(sJONVars, ["Handles", (string)hKey], OtherData]);

Retrieving OtherData with the handle-key:

OtherData = llGetJasonValue(sJONVars, ["Handles", (string)hKey]); // ==> Retrieves OtherData

So the question becomes: Is it "close enough" to just use the last 4 digits of the key in this case?

The dataserver() event would also just check for an entry with the last 4 digits of the key in the "Handles" JSON variable.

Other benefits include:

- Can store an Object as "OtherData" not just a value

- Can delete the associated JSON variable from "Handles" just knowing the last 4 digits of the key.

Thanks,

Love

Jenna Huntsman · July 5, 2022

IMO, the answer is: Maybe, but probably not.

It's worth noting that functions like llGenerateKey (likely) doesn't generate unique keys, so if you're using a generated key, then it's 100% possible that the last 4 digits could match, while being a different key.

I'd probably look into using a key compression algorithm to leverage more UTF characters to compress a full key, which can be decompressed later if needed.

(This thread may be of use, although you may want to replace Ord with llOrd

)

Quistess Alpha · July 5, 2022

IMO it would be a bit more 'Sane' to do something like

key k = ...; // get a key from somewhere.
string test = llGetSubString(llIntegerToBase64(llHash(k)),0,5); // 6 ascii characters that stand for our key.

which has 32 bits of entropy.

just taking the last 4 digits of a key only has 16 bits of entropy.

The key compression algorithm by Molly in the LSL library is more "exact" but can be a bit much depending.

Love Zhaoying · July 5, 2022

1 minute ago, Jenna Huntsman said:

IMO, the answer is: Maybe, but probably not.

It's worth noting that functions like llGenerateKey (likely) doesn't generate unique keys, so if you're using a generated key, then it's 100% possible that the last 4 digits could match, while being a different key.

I'd probably look into using a key compression algorithm to leverage more UTF characters to compress a full key, which can be decompressed later if needed.

(This thread may be of use, although you may want to replace Ord with llOrd

)

I am only storing these for "a little while" - a key compression algorithm is a good solution but, overkill for my use-case.

If keys aren't "quite unique enough", I can certainly store the entire key since this specific scenario is short-term.

Love Zhaoying · July 5, 2022

2 minutes ago, Quistess Alpha said:
IMO it would be a bit more 'Sane' to do something like
key k = ...; // get a key from somewhere.
string test = llGetSubString(llIntegerToBase64(llHash(k)),0,5); // 6 ascii characters that stand for our key.
which has 32 bits of entropy.

just taking the last 4 digits of a key only has 16 bits of entropy.

The key compression algorithm by Molly in the LSL library is more "exact" but can be a bit much depending.

Thanks!

Frionil Fang · July 5, 2022

Without getting into the details about how the keys are generated (I don't even remember), if you used the last 4 characters you were cutting the space of keys into 16 bits: only 2^16 = 65536 different ones, instead of 2^128. 65536 may sound big, but as per the Birthday problem and the square approximation formula there, it would take only ~36 entries to start getting into entire percents of collision chance. I wouldn't want a 1% collision chance.

Love Zhaoying · July 5, 2022

4 minutes ago, Frionil Fang said:

Without getting into the details about how the keys are generated (I don't even remember), if you used the last 4 characters you were cutting the space of keys into 16 bits: only 2^16 = 65536 different ones, instead of 2^128. 65536 may sound big, but as per the Birthday problem and the square approximation formula there, it would take only ~36 entries to start getting into entire percents of collision chance. I wouldn't want a 1% collision chance.

Thanks! While I'm only storing maybe 3-4 at a time for this case (and only 1 initially), you never know.

Profaitchikenz Haiku · July 5, 2022

The closest to what Love is contemplating I am aware of is a library function key2chan, which takes an avatar key and generates a -ve channel number from 8 bits of the key, but I can't recall if it's the top 8 or bottom 8.

eta

it uses 0x80000000 | (integer) ("0x" + (string) key)

Edited July 5, 2022 by Profaitchikenz Haiku

Love Zhaoying · July 5, 2022

I have a more clear memory now of where I saw "partial keys" used, and I believe it was for generating "pseudo-random" channel numbers.

A Parent object rezzes object, and passes the a channel # generated from parts of their key. Etcetera.

Love Zhaoying · July 5, 2022

2 minutes ago, Profaitchikenz Haiku said:

The closest to what Love is contemplating I am aware of is a library function key2chan, which takes an avatar key and generates a -ve channel number from 8 bits of the key, but I can't recall if it's the top 8 or bottom 8.

You beat me, while I was posting something similar!

Quistess Alpha · July 5, 2022

Math time! Skimming Wikipedia the relevant equation here is:

$n(p;d)\approx {\sqrt {2d\cdot \ln \left({\frac {1}{1-p}}\right)}}.$

That is, if as Frionil Fang suggests, you want a less than 1% chance of collision, and you're using the 'last 4 digits' method, then you need to keep your list of keys below

sqrt(2*(2^16)*l(1/(1-0.01))) = 36 entries; for less than a 0.1% chance, 11 entries.

Compare that to a 32 bit entropy which would allow for ~2931 entries with less than a 0.1% chance of collision.

Profaitchikenz Haiku · July 5, 2022

12 minutes ago, Love Zhaoying said:

If keys aren't "quite unique enough", I can certainly store the entire key since this specific scenario is short-term.

Given that any other method is going to require processing AND as has been pointed out, still runs a risk of getting it wrong, I'd personally go for the whole key.

That said, I've used that key2chan method and never (yet) had it talk to the wrong person.

Love Zhaoying · July 5, 2022

5 minutes ago, Quistess Alpha said:

Math time! Skimming Wikipedia the relevant equation here is:

$n(p;d)\approx {\sqrt {2d\cdot \ln \left({\frac {1}{1-p}}\right)}}.$

That is, if as Frionil Fang suggests, you want a less than 1% chance of collision, and you're using the 'last 4 digits' method, then you need to keep your list of keys below

sqrt(2*(2^16)*l(1/(1-0.01))) = 36 entries; for less than a 0.1% chance, 11 entries.

Compare that to a 32 bit entropy which would allow for ~2931 entries with less than a 0.1% chance of collision.

Oh dear, and I'm over here with Linden Lab, sitting on the "bad at maths" bench!

* pats bench invitingly * We have cookies!

Love Zhaoying · July 5, 2022

* me, bad at maths * "Solve for X? My Ex has too many problems to solve!"

Quistess Alpha · July 5, 2022

2 minutes ago, Love Zhaoying said:

"bad at maths" bench!

Breaking it down in simple terms (hopefully) there are 3 different variables that are all related to eachother:

D : How many different compressed keys exist? In the "last 4 characters" case, there are 4 letters, and because of how keys work, each character can be one of 16 different letters. 16*16*16*16 = 16^4 = 65536. the more possibilities, the less likely that 2 of them will be the same.
N : How many compressed keys are you keeping in a list? The more keys you store at once, the more probable that two of them will be the same.
P : The probability that 2 or more of the things in the list are the same. You want this to be low. How low? if you dump your list and fill it again 1/p times, you could expect about 1 collision.

If you only have 5~10 things in your list, taking the last 4 characters isn't as awful as I would have expected, but you're going to get collisions every 5000~1000 keys. depends on how important avoiding collisions is for you. the llHash method isn't too bad computationally and has a ~Much lower expectation for collisions.

Profaitchikenz Haiku · July 5, 2022

I got interested into the way key2chan works, especially since the wiki says this : Keys cannot be converted to integers,

So what seems to happen is

(integer) ("0x" + (string) key) generates a 32-bit integer by grabbing as much of the key as will not overflow 32 bits before then turning on the top (-ve sign) bit just in case it wasn't already on.

Which then fits into

28 minutes ago, Quistess Alpha said:

Compare that to a 32 bit entropy which would allow for ~2931 entries with less than a 0.1% chance of collision.

Suggesting that one day (out of how many?) one of my scripts is going to say something to the wrong person

Love Zhaoying · July 5, 2022

13 minutes ago, Quistess Alpha said:

the llHash method isn't too bad computationally and has a ~Much lower expectation for collisions.

..Referencing your previous suggestion, correct?

53 minutes ago, Quistess Alpha said:

key k = ...; // get a key from somewhere.
string test = llGetSubString(llIntegerToBase64(llHash(k)),0,5); // 6 ascii characters that stand for our key.

Quistess Alpha · July 5, 2022

16 minutes ago, Profaitchikenz Haiku said:

Suggesting that one day (out of how many?) one of my scripts is going to say something to the wrong person

Assuming there are always 64 people in any region (unrealistic, but just go with it.) (N=64) and given (integer)("0x"+(string)key) has 32 bits entropy, (D=2^32) we can expect your script to get it wrong once every: 1/p(N,D) ~= 1/(1-(((2^32)-1)/2^32)^((64*63)/2)) ~= 2,130,441 utterances.

Edited July 5, 2022 by Quistess Alpha

Wulfie Reanimator · July 6, 2022

9 hours ago, Love Zhaoying said:

In addition to "storing" the handle-key temporarily in memory, I plan to use the JSON functions to look it up later.

For this reason, storing partial keys (especially only 4 chars) doesn't seem like a good idea. How much later? How many keys do you have expect to search through?

For small domains, like avatars in a single sim, I will sometimes pass avatar keys as the integer for on_rez events, for example. For this you can use the first part of a key, which is conveniently 8 hex chars, which is enough to define a full 32-bit integer (though it won't have wide distribution). You can literally just typecast the key as integer.

Edited July 6, 2022 by Wulfie Reanimator

Love Zhaoying · July 6, 2022

1 hour ago, Wulfie Reanimator said:

10 hours ago, Love Zhaoying said:

In addition to "storing" the handle-key temporarily in memory, I plan to use the JSON functions to look it up later.

For this reason, storing partial keys (especially only 4 chars) doesn't seem like a good idea. How much later? How many keys do you have expect to search through?

This is for short-lifetime transactions. Later in this case means, just during the lifetime of the "handle=llFunction()" call, then the dataserver() response handler. Not very long at all. For this specific case, I'd probably expect no more than 4 or 5.

For some other later case, I'd not depend on the keys to still be in my JSON variable memory, unless I store the data for later (for example in an Experience key-value store).

Wulfie Reanimator · July 6, 2022

48 minutes ago, Love Zhaoying said:

This is for short-lifetime transactions. Later in this case means, just during the lifetime of the "handle=llFunction()" call, then the dataserver() response handler. Not very long at all. For this specific case, I'd probably expect no more than 4 or 5.

For some other later case, I'd not depend on the keys to still be in my JSON variable memory, unless I store the data for later (for example in an Experience key-value store).

In that case it should be fine, though personally (and this is totally a gut feeling), I would probably use more than 4 characters, just in case, especially with how convenient it is to use the first 8 instead. Comma.

Edited July 6, 2022 by Wulfie Reanimator

Mollymews · July 7, 2022

when we want to keep the JSON handles short and unique then is best to maintain the handle ident method ourselves, encoding this as the 1st element of the data returned in the dataserver event. Example:

dataserver(key id, string data)
{
   // JSON handle is the first two chars of data in the range ["10".."99"] which rolls over. Range managed by data source
   string handle = llGetSubString(data, 0, 1);


   ... write json ... handle ... data  
}

i would leave the handle in the data, so that should we want to extract the data from JSON as a chunk, the handle can readily accompany the data. Parsing the handle from the data chunk as necessary

i would also set the unique ident counter to 10 rather than 0, increment to 99 then rollover to start again at 10. So to not have to left pad a leading "0" for the numbers less than 10

Edited July 7, 2022 by Mollymews
_

Love Zhaoying · July 7, 2022

1 hour ago, Mollymews said:
when we want to keep the JSON handles short and unique then is best to maintain the handle ident method ourselves, encoding this as the 1st element of the data returned in the dataserver event. Example:
dataserver(key id, string data)
{
   // JSON handle is the first two chars of data in the range ["10".."99"] which rolls over. Range managed by data source
   string handle = llGetSubString(data, 0, 1);


   ... write json ... handle ... data  
}
i would leave the handle in the data, so that should we want to extract the data from JSON as a chunk, the handle can readily accompany the data. Parsing the handle from the data chunk as necessary

i would also set the unique ident counter to 10 rather than 0, increment to 99 then rollover to start again at 10. So to not have to left pad a leading "0" for the numbers less than 10

Hi Molly,

This is certainly an interesting solution, but I don't see a way to make it fit my "use-case".

Here are the specifics of my current "use-case" (others will be different):

1) I will be calling the following LSL Experience functions:

llReadKeyValue()

llUpdateKeyValue()

llCreateKeyValue()

llDeleteKeyValue()

Also, potentially but not relevant to the use-case: llDataSizeKeyValue(), llKeysKeyValue(), llKeyCountKeyValue().

2) All of the above functions return GUID's. (Unless I've gone mad and forgotten, again.)

When the dataserver() event fires, the "key id" parameter will contain a GUID. This is the same GUID value that was returned to the original caller of the llFunctions() above. (The same llFunctions() that then ultimately, asynchronously result in the dataserver() event being fired.)

3) The dataserver() event returns the "string data" parameter in the format shown for example in https://wiki.secondlife.com/wiki/LlReadKeyValue:

Poorly explained by the page:

The dataserver callback parameters are:

A key containing the handle returned from llReadKeyValue

A string containing a comma-delimited list (cdl). llDumpList2String([ integer success ] + components);

components vary depending upon success or failure of request.

Failure: cdl = llDumpList2String([ 0, integer error],",")

Success: cdl = llDumpList2String([ 1, string value ],",")

Anyway, the "string data" parameter will be:

Success/Error flag (1 character), + "," (Comma) + Error Code (if failure) / Experience call result Data (if success).

So, for this "use-case" there is no way to "maintain a handle identification method" as you say, unless I want to maintain a separate GUID-to-other-handle relationship, then have the dataserver() event add the not-GUID handle to the data instead of using the original GUID.

My current solution is that the dataserver() will call a custom "callback" function; I have to stick with this solution due to the overall architecture of my script.

Thanks,

Love

Mollymews · July 7, 2022

2 hours ago, Love Zhaoying said:

Anyway, the "string data" parameter will be:

Success/Error flag (1 character), + "," (Comma) + Error Code (if failure) / Experience call result Data (if success).

i see what you mean

you don't have a lot of choices

1) use the uuid as is (36 chars)

2) losslessly encode the uuid down to 9 chars

3) use a trim or hash of the uuid and gamble against a collision

if you can find a way to afford the 36 chars then go with 1)

i personally would not go with 3). I would not want to make gambling a component of this kind of application. A bet that will lose. A loss from which you will have to work out how to recover or settle for ignoring the loss as if it never happened

we can ignore losses in some situations. Like a visitor greeter that greets a visitor twice and ignores the other visitor. But am pretty sure you are not making a visitor greeter

Edited July 7, 2022 by Mollymews
typd

Love Zhaoying · July 7, 2022

29 minutes ago, Mollymews said:

i see what you mean

you don't have a lot of choices

1) use the uuid as is (36 chars)

2) losslessly encode the uuid down to 9 chars

3) use a trim or hash of the uuid and gamble against a collision

if you can find a way to afford the 36 chars then go with 1)

i personally would not go with 3). I would not want to make gambling a component of this kind of application. A bet that will lose. A loss from which you will have to work out how to recover or settle for ignoring the loss as if it never happened

we can ignore losses in some situations. Like a visitor greeter that greets a visitor twice and ignores the other visitor. But am pretty sure you are not making a visitor greeter

I think that in this case, I will go with "1" as the lifetime of the needed information is short: The specific object using the script will only need to get Experience key/value data for 1 years, the attached owner.

If I were coding a "server" object, on the Experienced-enabled land, and it needed to remember the GUID keyed data for many users at once, then I would shorten the key GUID's using one of the recommended methods.

Here's a potential sequence, where I would probably need to do something like you are suggesting - a "roll your own" handle:

- The User object can't call llExperience() functions because it is not on Experience-enabled land. The User object knows the GUID of the Server object.

- The Server object is on the Experienced-enabled land, so can call llExperience() functions for the User object.

1) User object sends request to Server object: In this case, the user object would not know the GUID key of the actual llExperience() request (read/create/update/delete key), as the Server is doing the actual llExperience() request.

The User object would need to generate and store it's own "handle" to match against the Server object's response - this looks like the use-case for what you were suggesting.

This "handle" could be the user's GUID + an incrementing number.

We'll call that a "roll your own" handle in the remaining steps.

2) Server gets request from "user object" (identifying the objec't's GUID) with the actual user's GUI and request information. (Is it a Read/Create/Update/Delete, what's the data, etc.)

3) Server calls llExperience() function - server must save the GUID key returned by the llExperience() function, and associate it with the original User object's "roll your own" handle.

4) Server's dataserver() event fires with the reply from the llExperience() function.

Server must lookup the "roll your own" handle information given the key passed in the dataserver() event.

5) Server returns response to the user object with the original "roll your own" handle.

Server object can immediately forget this "roll your own" handle information.

6) User object gets the server response, identified with the "roll your own" handle.

User object can immediately forget the "roll your own" handle information (after invoking a callback function with the response data).

Easy Scripting question - Handle Key Uniqueness

Recommended Posts

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Please sign in to comment

Linden Lab

Tilia

Second Life

Connect With Us

Partner With Us