
Interpretative LSL Script - Determinism vs. Indeterminate Approach


Love Zhaoying

You are about to reply to a thread that has been inactive for 641 days.

Please take a moment to consider if this thread is worth bumping.


Greetings!

There are questions and a request for feedback at the end.

I'm making great progress on my new LSL interpreter script.

One of the things I'm having trouble deciding is: use a "deterministic" approach, or not?  (Yes, you can make fun of me and challenge my word choices and usage. Go for it!)

Here is the scenario:  The interpreter has a single "stream" for incoming commands and parameters.

This stream is parameter "first" - meaning, parameters come BEFORE commands:

- Anything coming in from the "stream" that is not a command goes straight onto a stack (let's call it the "parameter stack").

- When a command is detected in the "stream", the parameters already on the stack are used (and consumed, in most cases).
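As a minimal sketch of this model (my own Python approximation, not the author's actual LSL; the `ADD` and `SAY` commands are invented for illustration), the "check then push or execute" loop looks like:

```python
# Hypothetical sketch of the "parameters first" stream model:
# anything that isn't a recognized command is pushed onto a parameter
# stack; a command pops (consumes) the parameters it needs.

COMMANDS = {
    "ADD": lambda stack: stack.append(stack.pop() + stack.pop()),
    "SAY": lambda stack: print(stack.pop()),
}

def run(stream):
    stack = []
    for token in stream:
        if token in COMMANDS:          # the "check" step
            COMMANDS[token](stack)     # command: consume parameters
        else:
            stack.append(token)        # parameter: push it
    return stack

run([2, 3, "ADD"])  # leaves [5] on the stack
```

The `if token in COMMANDS` test is exactly the "indeterminate" check discussed below: nothing in the stream announces itself, so every token has to be inspected.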

Now the question is which option to choose:

1) "Determinism"

In this model - I would always "know" whether I am looking at a parameter - or a command, without checking.

If a Parameter: goes on the stack

If a Command: executes the command, using parameters from the stack

2) "Indeterminism"

In this model - I wouldn't really know whether I am looking at a parameter or a command, until I check it to see.

Only after I check it, would I know that I am looking at a parameter (goes on the stack), or a command (gets executed).

=============================

Original approach:  Early interpreter versions used commands which I had to actually "check". This is because they were "words" (mnemonics).

This method didn't have an option; it had to be "indeterministic", since I was committed to "parameters first" in the stream.  (Why is an important but unrelated question.)

Pros: It worked.

Cons:

- More overhead, in terms of both memory and processing.

- Pre-processing was required to check commands: look up the mnemonic in a list and get its list position, which became a "command number". The "command number" was checked thereafter instead of the mnemonic.

- The only initial "check" for a command was: "Is it a 3-character mnemonic?" and "Is it in the list?"

- If it was not in the list, then it was automatically assumed to be a parameter.

- The list had to be in the script! 

- Mnemonics took up more space (list + active script) than numeric commands would.
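A rough Python analogue of that original lookup (loosely what an LSL `llListFindList` call does; the specific mnemonics here are made up):

```python
# Sketch of the original approach: a mnemonic's position in the list
# becomes its "command number"; anything not found is assumed to be a
# parameter.
MNEMONICS = ["ADD", "SUB", "SAY"]  # hypothetical 3-character mnemonics

def to_command_number(token):
    # Initial check: is it a 3-character token that appears in the list?
    if len(token) == 3 and token in MNEMONICS:
        return MNEMONICS.index(token)
    return -1  # not found => automatically assumed to be a parameter

to_command_number("SAY")  # → 2
to_command_number("42")   # → -1 (treated as a parameter)
```

Note how the mnemonic list itself has to live in memory for the lookup to work, which is the "list had to be in the script" con above.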

Current approach:  Current interpreter versions use Pre-processing ("compiling") so that commands are easily identifiable.

This method still checks "command vs. parameter".  However, commands are easy to identify because of their unique format (3-digit numbers with a prefix character).

Pros: 

- MUCH Less overhead - both memory and processing.

- The list of mnemonics does not need to be in memory, or checked for position to get a command "number".

Cons:

- Still have to check "is this a command or not?"
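The remaining check is cheap, though. As an illustration (the `@` prefix is my invention; the post doesn't name the actual prefix character), it reduces to a simple format test:

```python
# Sketch of the "current" check, assuming commands are compiled to a
# prefix character plus a 3-digit number.
def is_command(token):
    return len(token) == 4 and token[0] == "@" and token[1:].isdigit()

is_command("@042")   # → True  (a command)
is_command("hello")  # → False (a parameter)
```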

Proposed approach:  This method would use a "new command" which basically means: "push the next N things from the stream onto the parameter stack".

Pros:

- The interpreter scripts become "deterministic".  This just means, "no more guessing whether something in the stream is a parameter or a command".

- Don't need to check if something is a command or not.

Cons:

- This would actually add more overhead, both processing and memory used by each script!

- This would add processing, versus just saying "else it's not a command, because it doesn't match the format of a command".

- This would add more commands to each script: one for each "push the next N things from the stream onto the parameter stack".

- This would prevent certain "flexibility" built into the current model. (Not a command? On the stack you go! A command will deal with you later.)
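For comparison, here is a sketch of the proposed deterministic model (again my own Python, with an invented `P<N>` token format): a push command announces how many parameters follow, so ordinary tokens never need to be checked at all.

```python
# Deterministic variant: "P2" means "push the next 2 stream items onto
# the parameter stack". Every stream position is announced in advance,
# so there is no guessing about parameter vs. command.
def run(stream):
    stack = []
    it = iter(stream)
    for token in it:
        if token[0] == "P":                 # push command, e.g. "P2"
            for _ in range(int(token[1:])):
                stack.append(next(it))      # consume N raw parameters
        elif token == "ADD":                # hypothetical command
            stack.append(stack.pop() + stack.pop())
    return stack

run(["P2", 2, 3, "ADD"])  # leaves [5]
```

The extra `P2` token in the stream is exactly the added memory and processing overhead listed in the cons.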

=============================

Opinions? Feedback?

So far, this is working better than anything I've created - most likely because I am taking a non-intuitive approach!

I was buying into the "Proposed approach" but the more I think about it, not so much now.

Thanks,

Love

 


I had a little spin in my mind about how I would implement a notecard-interpreted "language" after reading one of your earlier posts somewhere and glancing at some of Nanite Systems' toys. Here's the rough idea I came up with:

All values are prefixed with a one-letter abbreviation of their type and an underscore, except for strings. Examples: [ s_string, k_key, v_<a,b,c>, r_<0,0,0,1>, i_27, f_1.3 ]. (Don't add lists as an extra type; use hacky workarounds to avoid them as arguments.) Everything else is (potentially) a command.

Each notecard line is parsed by spaces (or llList2CSV, if that implementation ends up being more convenient) and then interpreted from right to left. If a value is encountered, cast it to its correct type and replace it in the list. If the command is a built-in, apply it to the arguments to its right on the stack (checking llGetListEntryType and halting with an error if the types are incorrect or there aren't enough arguments), replacing the function and arguments with the result. For "user-defined functions" (which are also variables), pull the definition from a JSON thing and put that on the stack over the call.
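That right-to-left scheme could be sketched roughly like this (my own Python approximation, not tested LSL; the `add` built-in and the subset of type prefixes are invented for the example):

```python
# Type-prefixed values plus right-to-left evaluation: built-ins consume
# the (already-cast) arguments to their right and are replaced in the
# token list by their result.
def cast(token):
    tag, _, body = token.partition("_")
    if tag == "i": return int(body)
    if tag == "f": return float(body)
    if tag == "s": return body
    return token  # anything else is (potentially) a command

BUILTINS = {"add": (2, lambda a, b: a + b)}  # name: (arg count, fn)

def eval_line(line):
    toks = line.split()                         # parse by spaces
    for i in range(len(toks) - 1, -1, -1):      # right to left
        if isinstance(toks[i], str):
            toks[i] = cast(toks[i])
        if toks[i] in BUILTINS:
            argc, fn = BUILTINS[toks[i]]
            args = toks[i + 1 : i + 1 + argc]   # arguments to the right
            toks[i : i + 1 + argc] = [fn(*args)]
    return toks

eval_line("add i_2 i_3")  # → [5]
```

Because evaluation runs right to left, nested calls like `add i_1 add i_2 i_3` resolve naturally: the inner `add` collapses first, leaving its result as an argument for the outer one.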

"if then else endif" gives enough clarity to do the obvious: when endif is encountered, jump to 'then' and keep reading, if the result after 'if' is non-zero, delete from else to endif, otherwise delete from then to else. jump back to endif.

"while do endwhile" would need a little more sophistication, but similar idea.

The best example of a stack-based language I know of to steal ideas from is PostScript, which I actually use once in a blue moon. Even though I'm pretty sure there are other, better vector-graphics languages, I haven't found them and am too lazy to really learn them... PostScript Language Reference Manual. An example of me being a nerd.

 


That's very Forth.

This is an old language design argument: verb-subject, or subject-verb. See this Wikipedia article.

In programming languages, it's seen as whether you have C-type syntax:

int x;

or Pascal-type syntax.

x: integer;

C-type syntax seems more intuitive at first. But when you can declare new types, you have to parse

foo bar;

which means you have to know what "foo" means at that point in the parse. Which means declaration must precede use. Which means parsing C and C++ requires reading all the include files just to parse the syntax. C backed into this. Originally, you couldn't declare new types. When that capability was added, parsing got much harder.

Too much language design theory for this forum.


1 hour ago, Quistess Alpha said:

I had a little spin in my mind about how I would implement a notecard-interpreted "language" after reading one of your earlier posts somewhere and glancing at some of Nanite Systems' toys. Here's the rough idea I came up with:

All values are prefixed with a one-letter abbreviation of their type and an underscore, except for strings. Examples: [ s_string, k_key, v_<a,b,c>, r_<0,0,0,1>, i_27, f_1.3 ]. (Don't add lists as an extra type; use hacky workarounds to avoid them as arguments.) Everything else is (potentially) a command.

In another go at it a few years ago, I used a special UTF-16 character as the "marker".  That wasn't very fun.  But that approach used lists.

1 hour ago, Quistess Alpha said:

Each notecard line is parsed by spaces (or llList2CSV, if that implementation ends up being more convenient) and then interpreted from right to left. If a value is encountered, cast it to its correct type and replace it in the list. If the command is a built-in, apply it to the arguments to its right on the stack (checking llGetListEntryType and halting with an error if the types are incorrect or there aren't enough arguments), replacing the function and arguments with the result. For "user-defined functions" (which are also variables), pull the definition from a JSON thing and put that on the stack over the call.

Since I used JSON for literally everything this time around, things are sooooo much easier than when I used lists in the previous go-arounds.

1 hour ago, Quistess Alpha said:

"if then else endif" gives enough clarity to do the obvious: when endif is encountered, jump to 'then' and keep reading, if the result after 'if' is non-zero, delete from else to endif, otherwise delete from then to else. jump back to endif.

"while do endwhile" would need a little more sophistication, but similar idea.

Since this specific language implementation is more like "assembly language", there are no "blocks"!

1 hour ago, Quistess Alpha said:

The best example of a stack-based language I know of to steal ideas from is PostScript, which I actually use once in a blue moon. Even though I'm pretty sure there are other, better vector-graphics languages, I haven't found them and am too lazy to really learn them... PostScript Language Reference Manual. An example of me being a nerd.

I had to deconstruct some (non-binary) PostScript for a RL work project.  Since the PostScript had been constructed by an old, really old, really-really old program (1994, I think), it was fugly.  But anyway, I got my program that re-used the old PostScript to work.  Luckily, for that project I had a consultant to do the "munging" of the deconstructed data (store it for re-use, etc.).

Thank you for your feedback!!!


13 minutes ago, animats said:

That's very Forth.

This is an old language design argument: verb-subject, or subject-verb. See this Wikipedia article.

I think that I remember having FORTH either on the Apple ][ or on the Commodore 64.

I don't remember doing FORTH on the mainframe, but we definitely did FORTRAN on it.

14 minutes ago, animats said:

In programming languages, it's seen as whether you have C-type syntax:

int x;

or Pascal-type syntax.

x: integer;

Interesting that you brought up Pascal.  For whatever reason, I like it a lot, so I chose it for my first RL job language (I was their first programmer, so I got to "choose").  I suppose about 1988 - Turbo Pascal, then later Borland Pascal.  New owners brought in VERY early C++ for Windows, etc., etc.

But I digress - I had forgotten the Pascal syntax. Cool!

 

16 minutes ago, animats said:

C-type syntax seems more intuitive at first. But when you can declare new types, you have to parse

foo bar;

which means you have to know what "foo" means at that point in the parse. Which means declaration must precede use. Which means parsing C and C++ requires reading all the include files just to parse the syntax. C backed into this. Originally, you couldn't declare new types. When that capability was added, parsing got much harder.

Too much language design theory for this forum.

The way I designed the new language - it is a) untyped (unless typing is needed for a conversion), and b) there are no declarations needed - just store your variable.  Everything is a String unless it isn't.  Since I'm using 99.9% JSON - it's all a string anyway.

So, I avoided the issues that come with declarations / typing.

Thank you for your feedback!!!


17 minutes ago, Love Zhaoying said:

I had forgotten the Pascal syntax.

Pascal is history now, but that declaration syntax lives on in Modula, Ada, Delphi, Go, and Rust. C syntax lives on in C++, C#, and D. And LSL. There's still a division on this.

28 minutes ago, Love Zhaoying said:

The best example of a stack based language I know of to steal ideas from is postscript,

Yes, PostScript works that way. PostScript is direct input to an interpreter, for which this sort of syntax makes sense: push data on the stack, execute operators which pop from the stack. But few people write PostScript; programs write PostScript. Keeping the stack usage correct is better done by programs.


10 minutes ago, animats said:

Pascal is history now, but that declaration syntax lives on in Modula, Ada

I remember my dad telling me at one point that all projects for NASA had to be in Ada (during those years). A couple of years later, I got an Ada compiler for the Commodore 64! I think it was only 2 disks.

I've been coding C++ since the very early 90's, and am the only person at my Fortune 500 company that uses it!

10 minutes ago, animats said:

Postscript works that way. PostScript is direct input to an interpreter

Luckily, the PostScript I worked with was not binary, yet. 


Sometimes I forget - when I started Second Life, I only had 20 years as a professional developer.

Now, I've had 35 years as a professional developer.

So, some of my additional "learning" / "improvements" in Second Life scripting may, in fact, be due to a large increase in my experience over the years.

 

