LSL Run-time Error: "System.NullReferenceException: String must not be null"


Love Zhaoying


I managed to create a nice LSL Run-Time error:   "System.NullReferenceException: String must not be null."

This happened because I was bypassing a string variable definition!  The function llJsonGetValue() threw the error because the string passed to it was null (declared but never initialized), not the literal text "NULL".

Yes, I was using "jump".  I have my reasons!

To reproduce, uncomment the line with "jump BypassVariableDefition" in the script below.

/*
    Run-Time Error Research
    
    Error if String variable is undefined, results in:
         System.NullReferenceException: String must not be null.   

    Full Error message:

JSON Parser Tester [script:New Script - JSON Failure] Script run-time error
System.NullReferenceException: String must not be null.
  at (wrapper managed-to-native) LindenLab.SecondLife.Library:llJsonGetValueInternal (string,object[],int)
  at LindenLab.SecondLife.Library.llJsonGetValue (System.String j, System.Collections.ArrayList list) [0x00000] in <filename unknown>:0 
  at ....gMain () [0x00000] in <filename unknown>:0 
  at ....edefaultstate_entry () [0x00000] in <filename unknown>:0 
  at LindenLab.SecondLife.UserScript.OnStateEntry () [0x00000] in <filename unknown>:0 
  at LindenLab.SecondLife.Script.UserEvent (ScriptEvent e) [0x00000] in <filename unknown>:0 
  at LindenLab.SecondLife.Script.Run (ScriptEvent evt) [0x00000] in <filename unknown>:0     
    
*/


// To reproduce error, uncomment "jump" line below
// For no error, comment out "jump" line below.
Main() {



    string sBypassing;
    sBypassing = "Yes";

// Next line bypasses "string sIdentifier", results in System.NullReferenceException: String must not be null.
//    jump BypassVariableDefition;
    
    sBypassing = "No";
   
    string sIdentifier;
        
@BypassVariableDefition;
    llSay(0, "Bypassing Variable definition: " + sBypassing + " : "  + llJsonGetValue(sIdentifier,["Level1","Level2"]));
    return; 

}



default
{
    

    state_entry()
    {
        Main();
    }
}
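
One way to sidestep the error in the script above is to declare and initialize every local before the first jump, so that no declaration can be skipped. The following is only a minimal sketch of that idea, not a definitive fix; the label name AfterDeclarations is made up for illustration, and the rest mirrors the test script.

```lsl
// Sketch: same test as above, but every local is declared before the jump.
Main()
{
    string sBypassing = "Yes";
    string sIdentifier = "";    // initialized up front, so it can never be null

    jump AfterDeclarations;     // hypothetical label name

    sBypassing = "No";          // skipped by the jump, which is harmless here

    @AfterDeclarations;
    llSay(0, "Bypassing assignment: " + sBypassing + " : "
        + llJsonGetValue(sIdentifier, ["Level1", "Level2"]));
}

default
{
    state_entry()
    {
        Main();
    }
}
```

With sIdentifier set to "" up front, llJsonGetValue() receives a real (empty) string instead of a null, whichever path the code takes.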

 

Edited by Love Zhaoying
Added LSL to title and first sentence

9 minutes ago, Frionil Fang said:

Definitely, thanks!

IMHO, variables should be defined at the "start" of a "scope", no matter where the user put the definition within that "scope".

ETA: Looks like there's no way to vote or comment on it, since it was already "accepted", etc. At least I could "follow" it.

 

Edited by Love Zhaoying

2 minutes ago, Love Zhaoying said:

IMHO, variables should be defined at the "start" of a "scope", no matter where the user put the definition within that "scope".

Bound to agree with that, and from what I recall most languages would consider a possibility of an undeclared variable being accessed an error at compile time (at least C and Java).

Side note, I fully approve of goto (jump) shenanigans, function calls are expensive in LSL so doing "tail calls" with jumps is cool.
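
A minimal sketch of the jump-as-tail-call idea, assuming a simple countdown. CountDown and the label again are hypothetical names chosen for illustration, not taken from anyone's actual script:

```lsl
// Hypothetical example: loop back with a jump instead of calling CountDown(n - 1).
CountDown(integer n)
{
    @again;
    if (n <= 0) return;
    llOwnerSay((string)n);
    --n;
    jump again;    // "tail call": reuse this invocation instead of making a new call
}

default
{
    state_entry()
    {
        CountDown(3);    // says 3, then 2, then 1 to the owner
    }
}
```

All variables used here are declared before the label, so the jump never skips a declaration.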


5 minutes ago, Frionil Fang said:

Side note, I fully approve of goto (jump) shenanigans, function calls are expensive in LSL so doing "tail calls" with jumps is cool.

Thanks, good at least one person will agree with my plans! 

I am using jump shenanigans to create some LSL "extensions". 

The way I see it, "under the hood" it's all "jump" statements anyway... and since they did not give us the functionality I want/need, darn them all!!!

I am surprised that LSL is so tightly "scoped" considering how primitive it is!  By this, I mean that you can only "jump" to a label within the same scope or higher.

 


I did not mention this, but before I reduced the code for my example down to the bare minimum...

The variable that actually caused the error was also defined as a global!

If I commented out the local copy of the variable (which was really there "by accident" / as a "side effect" of some parsing), the error did not happen.

So, if I had not defined the variable locally, I never would have seen the error.

@Frionil Fang- this deepens the mystery a LITTLE bit, because obviously the variable was indeed defined already - as a global!  To me, this just means that the variable is "defined" but not "initialized" until the code hits the line where you declare it.

In this case, "MyString" shown below causes the error even though it was defined as a global!

string MyString; // Global definition for the same identifier

Main() {

jump PastMe;

// The error would not occur if the next line was commented out

string MyString;  // Local definition for the same identifier

@PastMe;

llSay(0, "Error due to MyString = " + MyString);

}

 

 


23 minutes ago, Love Zhaoying said:

The variable that actually caused the error was also defined as a global!

That makes sense, even if it's unintuitive: LSL's scope resolution doesn't take jumps into account. Because a local definition of the variable appears earlier in the same scope, the erroneous llSay references the local version of the variable (which was never initialized because of the jump!). Consider also:

default
{   touch_start(integer total_number)
    {   if(FALSE) jump hook_1; // no error if TRUE.
        else jump hook_2;
        @hook_1; string s = "1";
            jump hook_out;
        @hook_2; //string s = "2"; // compile time error!
        @hook_out;
        llSay(0,s);
    }
}

Also noteworthy: it matters that the variable is a string. Integers seem to be initialized correctly to zero even if their declaration is jumped over.
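
For comparison, a minimal sketch of the integer case described above (the label name skip is arbitrary). Going by the observation in this post, the message should show 0 rather than 42, with no run-time error; this has not been checked against every simulator version, so treat it as an assumption.

```lsl
default
{
    touch_start(integer total_number)
    {
        jump skip;
        integer i = 42;    // both the declaration and the assignment are jumped over
        @skip;
        // Reported behavior: i reads as 0 here instead of raising an exception.
        llOwnerSay("i = " + (string)i);
    }
}
```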

Edited by Quistess Alpha

50 minutes ago, Quistess Alpha said:

that makes sense, even if unintuitive; LSL's scope detection doesn't take jumps into account; the fact that there exists a variable definition previously in the same scope means the erroneous say references the local version of the variable (which isn't declared because of the jump!). Consider also:

default
{   touch_start(integer total_number)
    {   if(FALSE) jump hook_1; // no error if TRUE.
        else jump hook_2;
        @hook_1; string s = "1";
            jump hook_out;
        @hook_2; //string s = "2"; // compile time error!
        @hook_out;
        llSay(0,s);
    }
}

also noteworthy is that the fact that the variable is a string is important; integers seem to be initialized correctly to zero even if their declaration is jumped over.

Note that I do not recall ever seeing runtime errors before. I suspect in this case it's because the JSON library expected a valid / initialized variable.


2016…?  At least.  More likely right from day 1…  I've known about it over a decade now…  (And also forgotten about it, and re-discovered it again in between… *mutters* )

10 hours ago, Frionil Fang said:

Bound to agree with that, and from what I recall most languages would consider a possibility of an undeclared variable being accessed an error at compile time (at least C and Java).

Side note, I fully approve of goto (jump) shenanigans, function calls are expensive in LSL so doing "tail calls" with jumps is cool.

Well, actually, C leaves local variables uninitialised.  So in C's case, you'll get whatever was in memory already.  That's not to say modern IDEs and compilers haven't added warnings; they've added a whole bunch of linting stuff these days, stuff that was left out originally because systems were much smaller.  But you can turn those off, and you'll get C's default behaviour of "whatever was in memory" (which is particularly fun for stack-allocated variables).

And ideas about wasting memory have changed, too, so global/static variables are allocated in zero-filled memory (i.e., they'll read as NULLs; the C standard guarantees this for ones without an explicit initialiser).  Pretty sure most loaders have the option to zero-fill sections (the executable just says, "give me X bytes of zero-filled memory"), but even still, you can see huge swaths of 0's just sitting in many executables because it gets initialised at compile time, and then the whole chunk is just slapped into memory by the loader (and the compiler couldn't be bothered splitting that space into zero'd and initialised loader sections).  You see this up close and personal if you've ever done any assembler, where you actually define the loader sections yourself.

Pretty sure Java is different, because nearly everything is an object, therefore nearly everything has an initialiser, and because it's a VM not "bare metal" code.  Also, does it even actually have statically initialised variables (not the same as static variables, I'm talking about the space the object will occupy having been pre-initialised by the compiler)?  I rather suspect every Java program starts with a flurry of object creation, where each and every variable/object initialises itself.  Java is also just plain newer, and the newer languages are burning not-so-precious-any-more memory and processor time with trivial niceties like initialising variables for you.  Mono is doing the same for LSL, by the way, also being VM-based (after all, it is an open-source implementation of .NET, which is literally "Microsoft's answer to Java")…  Just, it's initialising your variable to NULL, rather than "", which LSL isn't happy about.

Basically, the variable definition is almost certainly "hoisted" (to use a JavaScript term) to the top of the scope (maybe even the function as a whole), but then an assignment is placed at the spot where it's defined — since it may be initialised to a value other than "", and if that value is computed, then it doesn't actually exist yet.  So instead of trying to figure that out, or potentially wasting time initialising all your variables twice (three times if it's a for loop variable — how many of us are guilty of setting an already 0 variable to 0 at the front of our for loops?), it just puts an assignment at the definition spot and expects you not to do dumb things.


11 hours ago, Frionil Fang said:

Also, annoyingly, they say they review past bugs periodically and stuff…  But I had a Jira accepted, and then like the very next meeting a week later, the lindens present were talking about it like it was a completely new idea, and accepting suggestions, most of which I'd already covered in my Jira issue (because I tend to be rather on the verbose side).  So we do clearly need to remind them of these things from time to time…  But the topic did come up again just recently, so it's at least on their radar.  Though probably infrequent enough that they've just labelled it as "too hard" for now.


2 hours ago, Bleuhazenfurfle said:

That's not to say modern IDEs and compilers haven't added warnings,

Ah, yes, but I meant undeclared, not uninitialized. The compilers I checked just seem to keep better track of which variables are in scope than LSL does, preventing a jump over a declaration (and thus access to an undeclared variable) at compile time. Doing that would almost certainly be unintended, even if the actual memory for said variable was allocated regardless of where the declaration is.

Edit: maybe I was a little too drunk last night, but looking at it now I can actually make C happily jump over a variable declaration and still compile, with the expected uninitialized values on access. I.e., the below appears to work just like in LSL:
 


#include <stdio.h>

int main()
{
    int n;
    for(n=0;n<2;++n) {
        printf("n=%d\n", n);
        {
            if(n) goto skip1;
            int foo = 137;
            skip1:
            printf("foo=%d\n", foo);
        }
        {
            if(n) goto skip2;
            const char* bar = "137";
            skip2:
            printf("bar=%s\n", bar);
        }
    }
 
    return 0;
}

 

 

Edited by Frionil Fang
lack of sobriety

Also, also…  That inspired me to do a little test…  Looks like the variable is likely function-hoisted, not scope-hoisted.

default
{
    state_entry()
    {
        integer n;
        for ( ; n < 2 ; ++n )
        {
            llOwnerSay("N = " + (string)n);
            {
                llOwnerSay("Inner");
                if ( n ) jump skipInit;
                // change the above to !n, and watch it burn
                string foo = (string)n;
                @skipInit;
                llOwnerSay("foo = " + foo);
            }
            llOwnerSay("Outer");
            {
                llOwnerSay("Inner 2");
                if ( n ) jump skipInit2;
                string bar = "cows";
                @skipInit2;
                llOwnerSay("bar = " + bar);
            }
        }
    }
}

Which is mostly what I'd have expected.  All local variables are likely pre-allocated at function entry.  Though with how little memory we get, scope-hoisting would have been nicer.

The easiest example of this is in Python: when you inspect a function, its code object has a "co_varnames" tuple containing the names of all your function parameters, and then all your local variables.  There's a corresponding array of values created at function call (pretty sure Python just does a bulk zero-fill too, probably as simple as a call to calloc), and then the supplied arguments are simply plugged into the appropriate spots before the function is started; all local variables your function will define already exist in the array by that point.  And within the bytecode, variables and function parameters are simply referred to by their index into that array, rather than their names (the names are kept around for introspection purposes, but never used by the bytecode itself).

Python does it this way because it doesn't use the system stack for function calls (makes generators trivial, and Python loves generators).  Also, because everything in Python is an object (much like Java), all variables just happen to take up the exact same amount of space, too: a pointer to the actual value somewhere on the heap.  Of course, Python also doesn't have block scoping, but the same thing happens in languages that do.

Whatever the specifics, in compiled languages (C in particular) the local variables are generally "allocated" by simply adjusting the stack pointer by their total size (subtracting, since stacks generally grow downwards), and expecting the programmer to initialise them appropriately (this is the C way, before linting comes into play).

Anyhow…  Dunno whether LSL (Mono) does arguments in Python or C style (I think I saw something in this forum a bit back, suggesting Mono uses a system stack), but I'd be quite surprised if one way or another, your Mono locals weren't just sitting in a chunk of bulk zero-filled memory — hence the NULL exception error if you skip the formal initialisation.


27 minutes ago, Frionil Fang said:

Edit: maybe I was a little too drunk last night but I can actually make C happily jump over variable declaration and still compile when I look at it now, with the expected uninitialized values on access. I.e. the below appears to work just like in LSL

Yup, it is the C way.  And lols…  You must have been doing that test, while I was doing the exact same one in LSL.


10 minutes ago, Frionil Fang said:

I saw your post and was "huh" and turned it into C to look at it again, this wasn't convergent evolution!

Damn.  Oh wells.  Would have been funnier if it was.  heh

Also also also, if that was LSL, you'd have just done the "initialise a variable two (or three) times" thing I'd mentioned.

Edited by Bleuhazenfurfle

1 hour ago, Frionil Fang said:

with the expected uninitialized values on access. I.e. the below appears to work just like in LSL:

Two questions:

- What do you mean by "with the expected values on access"? (the term "on access" throws me off..)

- When you say, "the below appears to work just like in LSL", do you mean you get errors or undesired behavior in C?  If the local (or global) memory space WAS initialized with binary 0's, then the normal C "string" functions would see the string as "empty".

Please help me understand!

 


28 minutes ago, Love Zhaoying said:

Two questions:

- What do you mean by "with the expected values on access"? (the term "on access" throws me off..)

When you use the variable name, presumably.

28 minutes ago, Love Zhaoying said:

- When you say, "the below appears to work just like in LSL", do you mean you get errors or undesired behavior in C?  If the local (or global) memory space WAS initialized with binary 0's, then the normal C "string" functions would see the string as "empty".

Because dynamic strings can change in size, they're not "in" the variable.  The variable is a pointer to the actual string contents, which are stored somewhere else (typically the heap, but for string literals, that may be the bytecode/executable).  So a NULL (basically, 0) is a very, very different thing compared to an empty string: the string isn't empty, it fundamentally doesn't exist.  If you try to read it anyhow, C won't throw any errors (not counting any linting done by the IDE or compiler; C also doesn't have exceptions) unless the code explicitly checks first.  Most often the program just crashes with a page fault when it tries to read the start of memory (which it's not allowed to do), unless it's in DOS (or a similar non-protected environment), in which case it just reads whatever garbage happens to be there (neither case is terribly good).

Basically, in Mono (in which LSL runs), the variable is pre-filled with 0's, making it NULL, and then explicitly checked to make sure it's not still NULL when you go to use it (that's where the exception usually comes from; the fact they don't just swap it out for an empty value probably means the check is being done by either the VM or its standard library).  They could basically have just done the equivalent of: myVar ??= ""; (what I often do in TypeScript) but here we are…

Edited by Bleuhazenfurfle

1 hour ago, Love Zhaoying said:

What do you mean by "with the expected values on access"? (the term "on access" throws me off..)

- When you say, "the below appears to work just like in LSL", do you mean you get errors or undesired behavior in C?  If the local (or global) memory space WAS initialized with binary 0's, then the normal C "string" functions would see the string as "empty".

Mostly just saying the same as Bleuhazenfurfle, but anyway...

By access I basically just meant "when reading".

The C behavior appears similar to LSL, except LSL really doesn't like it when the variable is in a truly uninitialized state. Normally they would be guaranteed to get a zero-like value before access, but the jump across prevents that part it seems.

In C, the state of uninitialized memory is up to implementation but nothing stops you from using it, getting either garbage or zeroes. Strings being pointers, a null pointer is different from an empty string (a pointer to a zero byte), and dereferencing either a garbage or a null pointer is going to be a bad time.


52 minutes ago, Frionil Fang said:

Mostly just saying the same as Bleuhazenfurfle, but anyway...

By access I basically just meant "when reading".

The C behavior appears similar to LSL, except LSL really doesn't like it when the variable is in a truly uninitialized state. Normally they would be guaranteed to get a zero-like value before access, but the jump across prevents that part it seems.

In C, the state of uninitialized memory is up to implementation but nothing stops you from using it, getting either garbage or zeroes. Strings being pointers, a null pointer is different from an empty string (a pointer to a zero byte), and dereferencing either a garbage or a null pointer is going to be a bad time.

Makes perfect sense. 

I also was thinking earlier that the C behavior should depend on the implementation, but did not mention it in fear of getting shot down because you know - on a bad day, everyone's an expert but me! 🙂

===============

Current solution:

- For my current use-cases (recursive calls and "asynchronous" callbacks from Events that use "jump"), I decided to bypass any possibility of the issue by setting an Option whereby all user variables get promoted to Globals.  If not for this specific issue, I wouldn't have used that Option but left it up to the user. 

This SOUNDS like a kludge but after all, the current Parser Schema is for converting from something-very-much-like-BASIC to LSL.  And if you're old enough, you remember BASIC implementations where there are no "local variables". 🙂

- Later on, as I make the Parser use-cases more and more like "normal" LSL (such as allowing user functions, etc.), I'll do something else instead.  Such as, not allow jumping past a variable definition (just one of many solutions).
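
A rough sketch of the "promote to globals" option described above, assuming a single string variable. gIdentifier, Main and Past are illustrative names only. Globals receive their default value ("" for a string) when the script starts, so no jump can skip that step:

```lsl
string gIdentifier;    // global: starts out as "" rather than null

Main()
{
    jump Past;
    gIdentifier = "never assigned, because the jump skips this line";
    @Past;
    // Safe to pass along: the worst case is an empty string, not a null.
    llOwnerSay("Result: " + llJsonGetValue(gIdentifier, ["Level1", "Level2"]));
}

default
{
    state_entry()
    {
        Main();
    }
}
```

The trade-off is the usual one for globals: every function can see and change the variable, and it occupies script memory for the life of the script.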

 

Edited by Love Zhaoying

Out of curiosity I wanted to see more about this bug. Nothing majorly enlightening, useful or unexpected here considering the above discussions, but still mildly interesting.

Test 1: different variable types. It's not a huge surprise that keys and lists, like strings, can be null if the declaration is skipped. Integers, floats, vectors and rotations are not references, so they just behave as all 0s, without any further problems. This means for rotations you do *not* get the ZERO_ROTATION value -- that's <0, 0, 0, 1>.

Test 2: assignment. You can assign the null to another variable, not that it's any more useful.  Assigning something to a variable in null state works normally.

Test 3: comparisons. You can compare a null string normally, it's just equal to null but not other strings. You can coerce a null string directly into a TRUE value with "if(null_string)", suggesting it's equivalent to "if(null_string != "")". Trying to compare or coerce a null *list* throws the exception, much like llGetListLength(null_list) would.

Test 4: nulls in lists. You can put a null string in a list without a problem and that "neutralizes" it somewhat: it can be read and otherwise handled, behaving like an empty string for llList2* functions except llList2ListSlice, though llGetListEntryType identifies it as an invalid entry, not a string. llDumpList2String considers it an empty string. Most other list functions like llList2CSV and llListSort refuse to deal with it with a null reference exception. llList2Json doesn't turn it into JSON_NULL and dies, what a shame.

At least nothing crashed the server.
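
To make the rotation remark in Test 1 concrete, here is a small sketch comparing an all-zero rotation with ZERO_ROTATION. It builds the all-zero value explicitly rather than relying on the skipped-declaration bug; SafeRot and allZeros are made-up names, and the guard is only one possible way to handle such a value.

```lsl
// Hypothetical guard: swap an all-zero rotation for the identity rotation.
rotation SafeRot(rotation r)
{
    rotation allZeros = <0.0, 0.0, 0.0, 0.0>;
    if (r == allZeros) return ZERO_ROTATION;
    return r;
}

default
{
    state_entry()
    {
        // What a jumped-over rotation declaration reportedly contains.
        rotation skipped = <0.0, 0.0, 0.0, 0.0>;

        llOwnerSay("skipped       = " + (string)skipped);
        llOwnerSay("ZERO_ROTATION = " + (string)ZERO_ROTATION);
        llOwnerSay("after guard   = " + (string)SafeRot(skipped));
    }
}
```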

Edited by Frionil Fang

4 minutes ago, Frionil Fang said:

Most other list functions like llList2CSV and llListSort refuse to deal with it with a null reference exception. llList2Json doesn't turn it into JSON_NULL and dies, what a shame.

My reading is, the above were the only cases in your tests where you got an exception?

 


  • 4 weeks later...
On 8/24/2023 at 4:24 PM, Frionil Fang said:

I got an email stating this bug was closed because it was Accepted!  Good news! 

https://jira.secondlife.com/browse/BUG-11377

Technically, this means I don't "have" to move all my variables to Globals and/or to one place for declaration before an initial "jump". 

In reality, I still need to add a "model" that will save/restore variables to and from JSON for a re-entrant code use-case.  (Which I need to do anyway.)

But still, "not crashing" is GOOD!

 
