
Recognising words in a llSubStringIndex


Loki Eliot


Hi, I've made a robot that listens to conversations and checks them against a list of bad words.

I know listeners are laggy, but it's fun...

The issue I have is that when it compares words with its list, it finds words within words. For example, it finds the word ASS in GLASS.

How can I prevent this from happening?

Any help would be gratefully appreciated.

Here is my current script:

 

integer gLsn;
key notecardQueryId;
string notecardName = "Badwords";
list Badwords = [];
integer notecardLine;
 
say(string inputString)
{
    llOwnerSay(inputString);
}
 
default
{
    state_entry()
    {
        // llGetInventoryKey returns NULL_KEY if the notecard is missing or has never been saved.
        if (llGetInventoryKey(notecardName) == NULL_KEY)
        {
            llOwnerSay("Notecard '" + notecardName + "' missing or unwritten");
            return;
        }
        notecardQueryId = llGetNotecardLine(notecardName, notecardLine);
        llSay(0, "Loading words, please wait until confirmation that the notecard has been read...");
    }

    dataserver(key query_id, string data)
    {
        if (query_id == notecardQueryId)
        {
            if (data == EOF)
            {
                say("Done reading notecard, read " + (string) notecardLine + " notecard lines.");
                state ready;
            }
            else
            {
                ++notecardLine;
                Badwords = Badwords + [llToLower(data)];
                notecardQueryId = llGetNotecardLine(notecardName, notecardLine);
            }
        }
    }
}

state ready
{
    state_entry()
    {
        // Listen to everyone on public chat. llDetectedKey(0) has no meaning
        // in state_entry, so NULL_KEY is used as the "any speaker" filter.
        gLsn = llListen(0, "", NULL_KEY, "");
    }

    listen(integer channel, string name, key id, string msg)
    {
        integer i;
        while (i < llGetListLength(Badwords))
        {
            // Substring match: this is what finds "ass" inside "glass".
            if (~llSubStringIndex(llToLower(msg), llList2String(Badwords, i)))
            {
                llSay(0, (string)llKey2Name(id) + " has said a BAD WORD");
            }
            ++i;
        }
    }
}


27 minutes ago, Loki Eliot said:

thank you for your replies but i don't understand either of you

Hmmm... How else can I say it?  Instead of looking for "WORD", which is a character string that might appear anywhere in the message, search for " WORD " (notice the blanks before and after WORD?).  That will match only cases where the character string is freestanding instead of part of another word.  Wulfie is suggesting a different way to do the same thing.  He's saying to split the message into a list first:

list temp = llParseString2List(message, [" "],[]);

which will create a list of all the chunks within the message, separated at the blank spaces between them.  That's actually a better solution than mine, because it will also catch "WORD" at the start or end of a sentence.  So, you can search the list and see whether the "WORD" you are looking for is there:

if ( ~llListFindList(temp,["WORD"]) )
{
    llShout(0,"I found one!");
}

Both Wulfie's method and mine will miss the case where the string "WORD" is followed immediately by punctuation (like "WORD."), but you can add filters for those few special cases.


OK, so I'm trying to add an empty space after each word in the Badwords list. Do I add that as the bad word list is being read, or during the comparison of the bad word list with the message that was just said?

I ask because I have two lists: one, the list of bad words, and two, the list of words being heard in chat.


That depends on whether you are using my approach or Wulfie's.  If you use my approach, your list of bad words should be something like this:

list lBadWords = [" Darn ", " Rats ", " Crud ", " Phooey ", " Diddlydang "];

in which each word in the list has a blank before it and another after it.  If you're using Wulfie's approach, the list shouldn't contain any blanks at all.
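For the padded-word approach, one place to add the blanks is while the notecard is being read. The following is only a sketch under assumptions: it reuses the "Badwords" notecard name from the original script, stores lowercase words (so the incoming message would need llToLower as well), and skips empty notecard lines; the variable names are illustrative.

```lsl
string notecardName = "Badwords"; // same notecard name as the original script
key notecardQueryId;
integer notecardLine;
list lBadWords;

default
{
    state_entry()
    {
        notecardQueryId = llGetNotecardLine(notecardName, notecardLine);
    }

    dataserver(key query_id, string data)
    {
        if (query_id != notecardQueryId) return;

        if (data == EOF)
        {
            llOwnerSay("Loaded " + (string)llGetListLength(lBadWords) + " words.");
            return;
        }

        // Trim stray whitespace, lowercase, then add exactly one blank on each side.
        string word = llToLower(llStringTrim(data, STRING_TRIM));
        if (word != "")
            lBadWords += [" " + word + " "];

        // Request the next notecard line.
        notecardQueryId = llGetNotecardLine(notecardName, ++notecardLine);
    }
}
```

Padding once at load time means the comparison loop does not have to rebuild each search string for every chat message.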

No matter how you do it, I'm sure that you already know this is going to be a slow, clunky script that may add to chat lag.  Personally, I would never write something like this.  Every time that the script hears any message, it will have to take the message apart and look at every blessed word and see if it matches one on your "bad words" list. It's hard to imagine a task that's slower and clunkier than that.


5 hours ago, Rolig Loon said:

If you use my approach, your list of bad words should be something like this:

list lBadWords = [" Darn ", " Rats ", " Crud ", " Phooey ", " Diddlydang "];

in which each word in the list has a blank before it and another after it.

My only criticism of this method is that you won't detect bad words at the start or end of a sentence, eg: "Darn Rats" contains no bad words.
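One possible way around the start/end problem, while keeping the padded list, is to pad the message itself with a blank on each side before searching. This is a sketch rather than anything proposed in the thread: the helper name containsPaddedWord and the lowercase list contents are assumptions, and it still misses words followed directly by punctuation.

```lsl
// Hypothetical list in the padded style, lowercase to match llToLower below.
list lBadWords = [" darn ", " rats ", " crud "];

integer containsPaddedWord(string msg)
{
    // Pad the message so its first and last words are also surrounded by blanks.
    string padded = " " + llToLower(msg) + " ";

    integer count = llGetListLength(lBadWords);
    integer i;
    for (i = 0; i < count; ++i)
    {
        if (~llSubStringIndex(padded, llList2String(lBadWords, i)))
            return TRUE;
    }
    return FALSE;
}

default
{
    state_entry()
    {
        llListen(0, "", NULL_KEY, "");
    }

    listen(integer channel, string name, key id, string msg)
    {
        if (containsPaddedWord(msg))
            llOwnerSay(name + " said a listed word.");
    }
}
```

With the padding, a message such as "Darn Rats" is searched as " darn rats ", so both the first and last words can match.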


7 minutes ago, Wulfie Reanimator said:

My only criticism of this method is that you won't detect bad words at the start or end of a sentence, eg: "Darn Rats" contains no bad words.

Exactly.  That's why I said I like yours better.  ;)

As I pointed out, though, your approach will still miss a bad word followed by a comma or a period.  Better, but not perfect.


20 minutes ago, Rolig Loon said:

Exactly.  That's why I said I like yours better.  ;)

As I pointed out, though, your approach will still miss a bad word followed by a comma or a period.  Better, but not perfect.

Most of the common cases can be handled by llParseString2List itself, since you can use multiple separators. That way you can get rid of commas, periods, exclamation/question marks, and other special characters.

Here's a basic example. It's not necessarily how I'd structure the code myself, but this should be easier to follow:

list bad_words = ["bleep", "bloop"];
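default
{
    state_entry()
    {
        // Without an active listener, the listen event below never fires.
        llListen(0, "", NULL_KEY, "");
    }

    listen(integer channel, string name, key id, string message)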
    {
        message = llToLower(message); // Make everything lowercase for simpler comparisons.

        list words = llParseString2List(message, [" ", ",", ".", "!", "?", "'", "\"", ":"], []);
        integer count = llGetListLength(words);

        integer i;
        while (i < count) // "Iterate" through every word.
        {
            list word = llList2List(words, i, i);
            integer index = llListFindList(bad_words, word);
            if (index != -1)
            {
                // Bad word was found!
                return;
            }
            ++i; // Move on to the next word.
        }
    }
}
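As far as I know, llParseString2List only honours the first eight separators, and the example above already uses eight. If more characters need stripping, one option is to split, rejoin with blanks, and split again. The sketch below assumes that limit; the helper name splitWords and the second set of separators are illustrative choices, not part of the example above.

```lsl
// Hypothetical helper: split text into words using two passes of separators.
list splitWords(string text)
{
    // First pass: the same eight separators as the example above.
    list pass1 = llParseString2List(text, [" ", ",", ".", "!", "?", "'", "\"", ":"], []);

    // Rejoin with blanks, then split again on a further set of characters.
    string joined = llDumpList2String(pass1, " ");
    return llParseString2List(joined, [" ", ";", "(", ")", "-", "*", "\n"], []);
}

default
{
    state_entry()
    {
        // Show the resulting words for a sample sentence.
        llOwnerSay(llList2CSV(splitWords("well (bleep); that's *bloop*-ish")));
    }
}
```

Note that splitting on the apostrophe and hyphen also breaks up contractions and hyphenated words, which may or may not be wanted.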

 


Another way, which can simplify the punctuation checks, is to preprocess the text by replacing all non-letters with whitespace.

Example:
 

// global
list Badwords = [
   "list", "of", "bad", "words", "in", "lower", "case"
];

integer hasBadword(string text)
{
   // convert all non-letters to whitespace
   text = llToLower(text);
   integer len = llStringLength(text);   
   string prep;
   integer i;
   for (i = 0; i < len; i++)
   {
       string char = llGetSubString(text, i, i);
       // lookup list is faster than a lookup string
       if (!~llListFindList(
          ["a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z"],
          [char])
       ) char = " ";  // replace with " " when char is not a letter
       prep += char;
   }

   // parse on whitespace to get whole words
   list words = llParseString2List(prep, [" "], []);
   
   // find the first badword if any
   len = llGetListLength(words);
   for (i = 0; i < len; i++)
   {
       if(~llListFindList(Badwords, [llList2String(words, i)]))
          return TRUE; // badword
   }    
   return FALSE; // no badword
}

default
{
    state_entry()
    {
        llListen(0, "", NULL_KEY, "");
    }
    
    listen(integer channel, string name, key id, string text)
    {
        if (hasBadword(text))
            llSay(0, llGetDisplayName(id) + " has said a BAD WORD!");
    }
}

 
