
How to fix the inworld search


Phil Deakins

You are about to reply to a thread that has been inactive for 4508 days.

Please take a moment to consider if this thread is worth bumping.


I understand that LL has dumped the GSA and gone over to an open source search engine for the inworld search. It means that they have access to the code, which they never had with the GSA. With the GSA, they were limited to external customisations but now it's possible to create a system that actually does the job that SL needs - a round peg in a round hole search system, instead of the square peg in a round hole that the GSA was.

 

The Problem

A major problem is that parcels that are highly relevant to a search often aren't even listed in the top 1000, when most of the top 1000 results aren't anywhere near as relevant. Add a filter, such as Places, et voilà: the parcel is ranked on the first page, above most of the parcels that made the top 1000 when it wasn't listed at all. Something clearly isn't right.

There is a reason it happens. It necessarily happens with all web search engines but, with those, people aren't paying to be shown in the results so it doesn't matter. In SL, everyone pays for their parcels to be shown in search, or the parcel isn't included in the engine's index. And, if a parcel isn't even listed for an All search, but ranked very highly for a Places search, the owner of the parcel isn't getting what s/he is paying for.

I wrote why it happens in another current thread and it's a bit lengthy, so I won't repeat it all here. To understand why my suggested solution would work, read why the problem happens here.

The result is that people are paying for their parcels to be shown in the results but often the parcels are not shown when they really should be shown.

 

The Solution

Leasing GSA machines is expensive, so LL couldn't really use this solution. As it was, they did lease three GSAs and the inworld search ran on one of them, meaning that the whole database/index for the inworld search was contained in one GSA, and that's the reason why the problem occurs - the whole index is in one search engine machine. With the open source engine now in use, it can be done another way.

Suppose the index is divided into sections, each of which is stored in a separate search engine machine, and each machine is an autonomous search engine in its own right. For instance, suppose there are 100,000 items (parcels, people, events, etc.) in the whole index, and the index is split across 10 machines, each containing around 10,000 items. As well as those 10 machines, there is a Controller machine. When a search query arrives at the Controller, it passes the query to all 10 engines. Each engine processes the query and returns its top 500 or 1000 results according to relevancy, along with a relevancy score for each one. If there aren't 500 or 1000 matches, it returns as many as there are.

The Controller receives all the results and sorts them into relevancy order according to the scores the machines provided for each result. Then it returns the top 1000 for display to the searcher.
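To make the Controller idea concrete, here is a minimal sketch in Python. It is only an illustration: the scoring (counting how often the query appears in an item's text) and the names `Hit`, `search_shard` and `controller_search` are made up for the example, and it assumes the scores coming back from the different machines are directly comparable.

```python
import heapq
from typing import NamedTuple


class Hit(NamedTuple):
    score: float   # relevancy score from the engine machine (assumed comparable across machines)
    item_id: str


def search_shard(shard: dict, query: str, limit: int) -> list:
    """Toy stand-in for one engine machine: score = number of times the query occurs."""
    q = query.lower()
    hits = [Hit(float(text.lower().count(q)), item_id) for item_id, text in shard.items()]
    hits = [h for h in hits if h.score > 0]
    # Return at most `limit` results, best first; fewer if there aren't that many matches.
    return heapq.nlargest(limit, hits)


def controller_search(shards: list, query: str,
                      per_shard_limit: int = 1000, final_limit: int = 1000) -> list:
    """Send the query to every machine, then merge all results by score."""
    all_hits = []
    for shard in shards:
        all_hits.extend(search_shard(shard, query, per_shard_limit))
    return heapq.nlargest(final_limit, all_hits)


shards = [
    {"a": "shoes shoes shoes", "b": "hats"},
    {"c": "shoes", "d": "shoes shoes"},
]
results = controller_search(shards, "shoes", per_shard_limit=2, final_limit=3)
print([h.item_id for h in results])
```

In a real setup the Controller would query the machines in parallel over the network rather than in a loop, but the merge step would look much the same.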

So, for the cost of a few computers, that major search problem could be done away with. It may not be absolutely 100% perfect 100% of the time, but it would be as near as makes very little difference.

The problem is that, with a large index in a single engine, the engine often fills its "results set" quota before much of the index is reached, thereby leaving some very relevant results out altogether. If the index in the engine is much smaller (one tenth of the size in the example), then all relevant results would be reached in the index before the results set quota is filled. That would happen with all 10 engine machines, so all relevant parcels would be scored on relevancy, and no relevant parcel would be left out when it belongs in.
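A toy model of that quota effect, in Python. This is only a sketch of the behaviour as I've described it, not how any particular engine really works: the hypothetical `capped_search` walks the index in order, stops once it has collected `quota` matches, and only ranks what it collected.

```python
def capped_search(index, query, quota):
    """index: list of (item_id, text) in index order.

    Collects the first `quota` matches, then ranks only those (assumed behaviour).
    """
    q = query.lower()
    candidates = []
    for item_id, text in index:
        score = text.lower().count(q)
        if score > 0:
            candidates.append((score, item_id))
            if len(candidates) >= quota:
                break  # quota filled: the rest of the index is never looked at
    return sorted(candidates, reverse=True)


# Nine weak matches, with the most relevant item last in the index.
index = [(f"weak{i}", "shoes") for i in range(9)] + [("best", "shoes shoes shoes")]

# One big index: the quota fills before "best" is reached.
single = capped_search(index, "shoes", quota=5)

# Same items split over two smaller indexes: every match gets scored.
halves = [index[:5], index[5:]]
merged = sorted(
    (hit for half in halves for hit in capped_search(half, "shoes", quota=5)),
    reverse=True,
)[:5]

print("best" in [item_id for _, item_id in single])  # False
print(merged[0])                                     # (3, 'best')
```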

It would be important not to separate the index by type into Places, Events, People, etc. machines, or it wouldn't solve the problem, as the Places machine would always come up short for searches that have a very large number of matches. So it would be important to divide the whole index evenly between the machines, and to mix the different types.
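One simple way to get an even, mixed split is to deal the items out like cards. A small Python sketch, where `split_round_robin` and the sample items are invented for the example:

```python
def split_round_robin(items, n_machines):
    """Deal items out one at a time, so every machine gets an equal share of each type."""
    return [items[i::n_machines] for i in range(n_machines)]


# (type, id) pairs; the list is grouped by type, as a worst case.
items = (
    [("place", f"p{i}") for i in range(6)]
    + [("event", f"e{i}") for i in range(3)]
    + [("person", f"u{i}") for i in range(3)]
)

machines = split_round_robin(items, 3)
for n, machine in enumerate(machines):
    print(n, len(machine), sorted({kind for kind, _ in machine}))
```

Each of the three machines ends up with four items and all three types. Assigning items by a hash of their ID would be another option, and probably easier to keep balanced as listings are added and removed, but that is a design choice I haven't thought through here.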

Link to comment
Share on other sites

How about listing Search using ABCs and 123s, as is found in phone books?

Delete all ads, listings, etc. greater than 30 days old. Clean house.

Linden Lab should host Search for free. No ads, no traffic, no variables. Let the best be the best.

All this Gaming is becoming unmanageable. What a tangled web we have woven.

 
