Jump to content

Packing a Key into a String


agentronin
 Share

You are about to reply to a thread that has been inactive for 474 days.

Please take a moment to consider if this thread is worth bumping.

Recommended Posts

I am using linkset data to store information about an avatar. The way I have it working now is the avatar's key (aka UUID) is stored into a hexadecimal string, and that string used as a key in the linkset data. It is not memory efficient to convert this key's 16 bytes into a 36 character hexadecimal encoded string (4 of the characters are dashes). Is there a way to cast, and compress, an avatar key's 128 bit pattern into a string variable 16 bytes long that can be used as the key in the linked data set?

Link to comment
Share on other sites

Among other examples in the library section. For technical reasons (no direct access to binary memory, strings automatically handle character-encoding in a way that can be binary-inconvenient), achieving an exact 16 byte conversion in LSL is not possible

Edited by Quistess Alpha
Link to comment
Share on other sites

If you're only dealing with a small set of avatars, you can pretty safely just truncate the key to the length you want.

Sure, it's possible that another avatar (or multiple) have the exact same key for the first 8 characters, but the likelihood of both of those people interacting with the same object is very, very small.

Edited by Wulfie Reanimator
  • Like 1
  • Thanks 1
Link to comment
Share on other sites

On 3/14/2023 at 5:30 PM, agentronin said:

Is there a way to cast, and compress, an avatar key's 128 bit pattern into a string variable 16 bytes long

is not possible to encode the set of all keys into  a 16 byte length string and have the full set decodable in either UTF-8 or UTF-16. LSL follows the C convention for handling strings. Char(0) denotes end of string and on its own in LSL returns an empty string (length 0)

so the best we could ever do (assume all UTF-8 8-bit chars other than 0 could be decoded which is currently not the case) is 19 bytes. 128 bits / log2(127) = roundup(18.32 bytes) = 19 bytes

the (recent-ish) examples in the Scripting  Library  are more about showing different ways to encode/pack keys in the LSD space. But there is a hard lower limit to this as shown above. So any further work (should it happen) would be to make more efficient methods. Efficient in this case meaning faster

 

Link to comment
Share on other sites

On 3/14/2023 at 4:30 AM, agentronin said:

Is there a way to cast, and compress, an avatar key's 128 bit pattern into a string variable 16 bytes long that can be used as the key in the linked data set?

you could make up a string comprising the first 8 characters of the key and the first 8 characters of the name, going further from Wulfie's suggestion, as it even less likely that two avatars with the first 8 characters of their keys the same will also have the first 8 characters of the name the same (even Lindens are going to start diverging after the 6th letter).

Edited by Profaitchikenz Haiku
Link to comment
Share on other sites

 

48 minutes ago, Profaitchikenz Haiku said:

you could make up a string comprising the first 8 characters of the key and the first 8 characters of the name, going further from Wulfie's suggestion, as it even less likely that two avatars with the first 8 characters of their keys the same will also have the first 8 characters of the name the same (even Lindens are going to start diverging after the 6th letter).

is kinda interesting when we start thinking about usernames in the LSD space

LSD UTF-8 starts making it viable to use the username as the unique identifier. Max. username length is 32 chars (including the dot). As usernames are variable length then we may (most likely) find that over the set of all usernames we use less chars in the LSD space than the set of all packed keys

for example your username is 20 chars. Mine is 11 with dot as the last char "elleevelyn." Combined 31 chars as opposed to packing both our keys for 38 chars total

as a exercise, when we do the arithmetic then packing usernames gives a max. length of 25 chars and a min length of 2 chars. 26 and 3 if we include a separator char. Base37 to base2^32 to base121 to base128

Link to comment
Share on other sites

11 minutes ago, elleevelyn said:

LSD UTF-8 starts making it viable to use the username as the unique identifier.

Remember that usernames can change. Even if you use the full name of an avatar, you're reintroducing the old problem of systems relying on data that doesn't constantly identify the same avatar.

Compressing or shortening the key is the way to ensure that the item doesn't change/break when someone changes their name.

  • Like 1
Link to comment
Share on other sites

58 minutes ago, Wulfie Reanimator said:

Remember that usernames can change. Even if you use the full name of an avatar, you're reintroducing the old problem of systems relying on data that doesn't constantly identify the same avatar

yes we have to guard against this.  In the must always be valid case then all usernames (current and historical) for the same account, resolve to the same key

 

edit ps. I just clarify this.  In the imprecise case where we shorten the key then we can get a collision (meaning that 2 or more people can be seen as the same person). In the change username case then it can result in a duplicate of the same person

the imprecise method we use depends on the use case

for example: In a one gift per visitor giveaway store promotion then shortened key can result in a store visitor not receiving the gift. In the username  case then everybody receives the gift, and a person can receive more than one copy of the gift should they change their name during the life of the promotion. As the store owner we have to weigh the pros and cons of this. Everybody gets the gift and some people may get more than one copy, or some people may not get a gift at all 

edit more: In the case where we are monitoring unique visitors to our store then the shorten key method means that we may slightly under count the number. And the username method means that we may slightly over count. In this case as the store owner we may decide that slightly under counting is preferable so we go with the shorten key method  

Edited by elleevelyn
Link to comment
Share on other sites

Ok, so given the issue of changing usernames, either you would pick the first 8 characters of the legacy name which is as far as I know immutable, or, how about the first 8 and last 8 characters of the key? It's getting to a vanishingly small probability of two avatars visiting the same region albeit separated by time and having identical first 8 and last 8 segments of their keys?

Link to comment
Share on other sites

with shortened username then we probably better off using the middle X chars of the firstname rather than the first X chars. Without looking it up I am fairly confident that in the set of all name choices that people make with alphanumeric chars, then the set of the middle X chars is greater than the set of the first X chars or the last X chars

example. pick 3 chars

sam, sam123, sammy123

middle are: sam, am1, mmy

 

as an aside. With the usecase of a visitor greeter then a changed username can be beneficial for both the location owner and the visitor. Example: My name is Sam. Greeter: Welcome Sam!.On my next visit in the same interval my name is now Sammy. Greeter: Welcome Sammy!. If the greeter is using my key then it won't greet me as Sammy  (my new name which I paid $50 for)

 

Link to comment
Share on other sites

5 hours ago, elleevelyn said:

Example: My name is Sam. Greeter: Welcome Sam!.On my next visit in the same interval my name is now Sammy. Greeter: Welcome Sammy!. If the greeter is using my key then it won't greet me as Sammy  (my new name which I paid $50 for)

With a key you can use llGetUsername() and llGetDisplayName() to retrieve a resident's current username & display name, respectively.

Edit: And with some additional data, you can greet people whose name changed differently from people totally new to the parcel.

First timer: "Hello, firstname!"

Repeat visit: (no greeting)

Repeat visit but new name: "Hello, newfirstname! I see you have a new name now... 😎"

Edited by primerib1
Link to comment
Share on other sites

3 hours ago, primerib1 said:

First timer: "Hello, firstname!"

Repeat visit: (no greeting)

Repeat visit but new name: "Hello, newfirstname! I see you have a new name now... 😎"

just on this. We can't know that the username has changed when we have only saved the key

 

on the general matter

while I am into things like compression algorithms generally, is not always appropriate. Like it can be inappropriate to spend additional server cycles because of a desire to pack as much information as we can into a space

one of the issues that comes with LSD is that Linden choose to go with  UTF-8 in the LSD space, while the LSL space is UTF-16.  A thing about this is that every LSL<->LSD transfer has to convert from UTF-16 to UTF-8 and vice versa. And while each instance is time trivial I am not sure how that is going to scale as scripters get more and more into using LSD, but I leave it to Linden to worry about that

consider tho the impact of this on the specific issue being discussed

a UTF-8 encoded key (optimally 19 bytes in the LSD space) consumes 38 bytes in the LSL space. Which is 20 more bytes than Key2Packed (optimally 18 bytes in the LSL UTF-16 space ). Without having done any LSD testing at all, the arithmetic tho suggests/shows that a string of UTF-8 encoded keys will not contain the max. number of keys that we might be hoping. For the reason the string of keys has to fit in the LSL space 

so we might have to rethink about how to write/design the key packing algorithm for LSD, The optimal (optimal in this case meaning most useful) would that the length of the encoded key in bytes is not greater in the LSL space than it is in the LSD space. And I think that a method that uses 2 UTF-8 byte chars is maybe the way to do this, if we want the byte length to be the same in both spaces. And here is a thing. Your packer algorithm in the Scripting Library touches on doing this and I think with some little more work we may be able to get a key packer which is optimal. (optimal meaning most useful in terms of how space is used)

i haven't done any work on this myself. Naively tho, in the 2 byte case there are 11 bits available. 128 bits / 11 * 2 = 23.73 (24) bytes in both LSL and LSD space

 

Link to comment
Share on other sites

2 hours ago, elleevelyn said:

The optimal (optimal in this case meaning most useful) would that the length of the encoded key in bytes is not greater in the LSL space than it is in the LSD space

I disagree, for most use-cases, you should store the majority of keys in LSD and only 'pull' it into script memory when appropriate; as a single key in mono script memory is trivial, the only efficiency questions worth asking are 'how small is it in utf8?' and 'how fast/efficient is the conversion?'.

  • smallest size in utf8 is the 19 byte one I linked above.
  • smallest size in utf16 is first one molly posted a bit ago.
  • fastest compression with small size in utf8 might be a 20-byte compression with no carry. I don't think anyone's doe lots of in-depth comparisons.

 

  • Thanks 1
Link to comment
Share on other sites

Since most of my use-cases for Keys are "internal to my script process", obtained through llGenerateKey() - it would be great if there was a variant that returned "shorter keys"!

Or, if we had a consensus for the "llGenerateKey()" use-case that for instance, just using the "first 4" + "last 4" was sufficient.

 

Link to comment
Share on other sites

22 minutes ago, Love Zhaoying said:

llGenerateKey() - it would be great if there was a variant that returned "shorter keys"!

I'm sure you could cook something up with llChar(32+(integer)llFrand(95))+llChar(...)+llChar(...)+...; or so.

my back of the envelope math says 5 or 6 base 95 characters should probably avoid a collision for most LSL use-cases.

Edited by Quistess Alpha
  • Thanks 1
Link to comment
Share on other sites

16 hours ago, Profaitchikenz Haiku said:

Ok, so given the issue of changing usernames, either you would pick the first 8 characters of the legacy name which is as far as I know immutable, or, how about the first 8 and last 8 characters of the key? It's getting to a vanishingly small probability of two avatars visiting the same region albeit separated by time and having identical first 8 and last 8 segments of their keys?

Usernames and legacy names refer to the same thing (kinda). It's the avatar's name, like mine is Wulfie Reanimator.

The two terms exist because there's a small twist for accounts created after last names went away. When you create an account, you choose a username (like Username) and your legacy name will be Username Resident.

If you change your name, your username and legacy name changes. The name you log in with can be your avatar's full name (separated by space or dot), or just your first name if your last name is Resident.

Edited by Wulfie Reanimator
  • Thanks 1
Link to comment
Share on other sites

4 hours ago, elleevelyn said:

just on this. We can't know that the username has changed when we have only saved the key

Of course, you need to save the user's username as the value of the LSD KVP.

So:
    Key = "seen:<UUID>"
    Value = "<llGetUsername()>"

I mainly responded to your statement here:

14 hours ago, elleevelyn said:

If the greeter is using my key then it won't greet me as Sammy  (my new name which I paid $50 for)

Which is totally wrong as you can always get someone's current username using llGetUsername() as long as that resident is in the same region.

Link to comment
Share on other sites

4 hours ago, primerib1 said:

you can always get someone's current username using llGetUsername() as long as that resident is in the same region.

i am not giving enough explanation, so I do so now

in the context of packed keys. The purpose of packed keys storage in a greeter is so that during a period we will only greet a person one time. Person comes to our location, we greet them (typically the greet is some combination of greet message, give LM, invite to group, external website link, etc)

the number of keys we can store determines the upper limit of the period

Sam visits first time. Greeter: if (key not found in storage) {Welcome Sam! Save key to storage}

Sam changes name to Sammy. Greeter: if (key found in storage) {Do nothing as Sammy has the same key as Sam and the key is already in storage}

the next time Sam/Sammy is greeted as Sammy is when the next/new period begins. At start of new period Sam/Sammy is not in storage

leveling this up. to what you have mentioned

send a greetings message every time (Welcome! Sam then Welcome back! Sammy (as you mention), then only drop LMs, group invites, etc one time during the period

is the dropping of LMs, etc that greeters typically also do that I didn't explain well

when we are ok with dropping on our visitor every time they visit then is no need to pack and store keys 

Link to comment
Share on other sites

5 hours ago, Wulfie Reanimator said:

If you change your name, your username and legacy name changes.

This is confusing me. I have a friend who is about the same age as I, so he has a first name of his choice, but one of the pre-resident Linen last names. He has changed his name, but I see him as newname (oldfirstName LindenLastName) with the parentheses showing me what he was "born" as. To my knowledge, he can't change his born name?

 

Link to comment
Share on other sites

39 minutes ago, Profaitchikenz Haiku said:

This is confusing me. I have a friend who is about the same age as I, so he has a first name of his choice, but one of the pre-resident Linen last names. He has changed his name, but I see him as newname (oldfirstName LindenLastName) with the parentheses showing me what he was "born" as. To my knowledge, he can't change his born name?

You'd have to show me what you're seeing, or give the person's current name/key. You can PM that to me and I can probably explain the details. 🙂

There's no way to programmatically find out someone's previous/original name as far as I know.

Edited by Wulfie Reanimator
Link to comment
Share on other sites

1 hour ago, Profaitchikenz Haiku said:

newname (oldfirstName LindenLastName)

a few things.

  • Display names are free to change, and in fact you can change them weekly. They should ~never be used in identification as they are not unique. usually when you see someone's name in that format, "newname" is the display name, often used by people who want funky characters in their name to be "cool". If you see me in-world you might see my name as "Tessa (Quistess Alpha)"
  • When you pick a new name (by paying the 40$ fee) you may opt to change only your first name, and not your last name. "alpha" is no longer on the list, but if I really wanted to, I could change my name to "Tessa Alpha" (but I wouldn't because I don't like the double a)
  • Thanks 1
Link to comment
Share on other sites

8 hours ago, Quistess Alpha said:
  • fastest compression with small size in utf8 might be a 20-byte compression with no carry. I don't think anyone's doe lots of in-depth comparisons.

 

you might be right about this. If I get a bit of time I might have a go at measuring the offerings to date

Link to comment
Share on other sites

9 hours ago, Love Zhaoying said:

Since most of my use-cases for Keys are "internal to my script process", obtained through llGenerateKey() - it would be great if there was a variant that returned "shorter keys"!

Or, if we had a consensus for the "llGenerateKey()" use-case that for instance, just using the "first 4" + "last 4" was sufficient.

 

i have been following along your posts about your app

is unclear to me why you are not using ordinal order to construct unique identifiers. eg. getNext(0) getNext(1) ... getNext(MAX) , getNext(0), ...

MAX = 2^32 is a lot of unique keys until the counter rolls over. If need more that this, the getNext counter can use 2 or more integer types

we can use a feistel network algo to change the ordinal order arrangement so getNext returns a random-looking unigue value but I don't see any gain for the performance hit 

  • Thanks 1
Link to comment
Share on other sites

You are about to reply to a thread that has been inactive for 474 days.

Please take a moment to consider if this thread is worth bumping.

Please sign in to comment

You will be able to leave a comment after signing in



Sign In Now
 Share

×
×
  • Create New...