
Marketplace search showing thousands of unrelated results for multiple keywords. This started about 2 days ago...


Ample Clarity

You are about to reply to a thread that has been inactive for 475 days.

Please take a moment to consider if this thread is worth bumping.


39 minutes ago, Love Zhaoying said:

Should be a quick technical fix.

Put a trace on the database search call, investigate indexes used, reindex and/or add new indexes as needed.

Or, if a change was recently made to the database calls for searching, investigate those changes, then back them out and/or remediate.

Easy peasy!

 

Quick question:  How does a search engine convert a glob of word salad into a searchable index?  How will the results of one glob of word salad compare to another glob of word salad?  How are these indexes searched by keywords and how are the search results ranked?

When I think about it, I begin to understand why my searches for phrases in descriptions always completely fail.  Globs are tokenized by white-space separation, and identical tokens are counted, likely with very common small words thrown out.  When a search string comes along, it is likewise tokenized, deduplicated and compared to recent digested search strings for similarities.  Any similarities are used to upward-bias the trending accumulators that seed the index searches to reduce workload that would be incurred by searching for all keywords in the query string.  Unique items that are rarely found by this process are likewise rarely clicked on and thus will remain at the bottom of the elevator shaft.  Apparently, the only way this soup-ladle can find them is when the glob of text describing the rare item contains a rare word and that rare word is the only word in the search query.
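To make the tokenize-and-count idea concrete, here is a minimal sketch of an inverted index in Python. It is only an illustration of the general technique, not how the Marketplace actually works: the stop-word list, the sample listings, and the function names (`tokenize`, `build_index`, `search`) are all made up for the example, and the trending bias described above is left out.

```python
import re
from collections import Counter, defaultdict

# Assumed stop-word list, purely for illustration.
STOP_WORDS = {"a", "an", "the", "and", "of", "for", "with"}

def tokenize(text):
    """Lowercase, split on non-alphanumerics, and drop stop words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]

def build_index(listings):
    """Map each token to {listing_id: number of occurrences}."""
    index = defaultdict(dict)
    for listing_id, text in listings.items():
        for token, count in Counter(tokenize(text)).items():
            index[token][listing_id] = count
    return index

def search(index, query):
    """Return ids of listings containing every query token,
    ranked by total token count (highest first)."""
    tokens = set(tokenize(query))  # deduplicate the query
    if not tokens:
        return []
    postings = [index.get(t, {}) for t in tokens]
    matches = set(postings[0]).intersection(*postings[1:])

    def score(listing_id):
        return sum(p[listing_id] for p in postings)

    return sorted(matches, key=lambda lid: (-score(lid), lid))

# Hypothetical sample data.
listings = {
    1: "Red velvet sofa with red cushions",
    2: "The velvet gown and a velvet cape",
    3: "Antique brass astrolabe",
}
index = build_index(listings)
```

With this toy data, a query for a common word like "velvet" returns several listings ranked by raw count, while the rare item only turns up when its rare word ("astrolabe") is in the query.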

Edited by Ardy Lay

21 minutes ago, Ardy Lay said:

Quick question:  How does a search engine convert a glob of word salad into a searchable index?

That totally depends on the implementation!

If you were to buy and use some specialized software for the purpose, the implementation would be "proprietary" - they won't tell you how it works.  In fact, once upon a time you'd have to buy/rent/lease an "appliance" from them that did the work.

If you had to "roll your own" and did the bare minimum - just search description for words, you could use "LIKE" in your queries. Very inefficient.
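As a rough sketch of that bare-minimum approach, here is what a "LIKE" search could look like using Python's built-in sqlite3 module. The table, the sample rows, and the helper names are assumptions made up for the example. The query plan shows why it is inefficient: a pattern with a leading wildcard cannot use an ordinary index, so the database scans every row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, description TEXT)")
conn.executemany(
    "INSERT INTO items (description) VALUES (?)",
    [("Red velvet sofa",), ("Velvet gown",), ("Brass astrolabe",)],
)

def like_search(conn, words):
    """Return ids of rows whose description contains every word.
    This is a substring match; '%' and '_' in words are not escaped."""
    if not words:
        return []
    where = " AND ".join("description LIKE ?" for _ in words)
    params = [f"%{w}%" for w in words]
    rows = conn.execute(
        f"SELECT id FROM items WHERE {where} ORDER BY id", params
    )
    return [r[0] for r in rows]

def query_plan(conn):
    """Ask SQLite how it would run a single-word LIKE search."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT id FROM items WHERE description LIKE ?",
        ("%velvet%",),
    )
    return [r[3] for r in rows]  # the fourth column holds the plan text
```

Calling `query_plan(conn)` reports a full table scan, which is fine for three rows and painful for millions.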

But, since the MarketPlace uses everything from Keywords, Relevance, and - as you say, "word salad" with NO way to select "which way to search" (LOOKY HERE, MAYBE THAT WAS A BAD DESIGN CHOICE!!) - it is probably a huge, nasty mess.  Search all the words + keywords + try to prioritize relevance + add the user's "AND NOT", etc. Yuck.

Anyway - Proper indexing is... "key". (Ironically.)

Back in the old SQL days, you could "force" an index by supplying "index hints" in your query - telling it which index to use.
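For anyone who has not seen an index hint, here is a small sketch using SQLite's `INDEXED BY` clause through Python's sqlite3 module (MySQL spells the same idea `FORCE INDEX`). The table, index names, and rows are invented for the example; nothing here reflects the Marketplace schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, category TEXT, price INTEGER)"
)
conn.execute("CREATE INDEX idx_items_category ON items (category)")
conn.execute("CREATE INDEX idx_items_price ON items (price)")
conn.executemany(
    "INSERT INTO items (category, price) VALUES (?, ?)",
    [("furniture", 250), ("clothing", 100), ("furniture", 900)],
)

def plan(sql, params=()):
    """Return SQLite's query plan for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params)
    return " | ".join(r[3] for r in rows)

# The hint: INDEXED BY makes SQLite use the named index,
# or fail with an error if it cannot.
hinted = (
    "SELECT id FROM items INDEXED BY idx_items_price "
    "WHERE category = ? AND price < ? ORDER BY id"
)

hinted_plan = plan(hinted, ("furniture", 500))
hinted_rows = [r[0] for r in conn.execute(hinted, ("furniture", 500))]
```

Without the hint the planner is free to pick either index. With it, the plan text names `idx_items_price`, and the results are the same either way.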

These days, I believe techies are moving more towards things like "NOSQL" which ain't SQL atall.  So, that will / would work differently.

If *I* had to do it - with my own level of SQL skills:

- I'd add "let the user select prioritization of relevancy vs. word salad", etc.

- I'd index words vs. keyword spam and let the user choose whether to search one, the other, or both

- Based on these criteria, the search engine would know "which way" to search, then prioritize the answers not just based on the results, but based on what the user wanted.
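The list above could be sketched roughly like this in Python: two separate word indexes (one for descriptions, one for seller keywords) and a `mode` argument that says which to search. Everything here, including the function names, the mode values, and the sample items, is hypothetical and only meant to show the shape of the idea.

```python
from collections import defaultdict

MODES = ("description", "keywords", "both")

def build_index(texts):
    """Map each lowercase word to the set of item ids containing it."""
    index = defaultdict(set)
    for item_id, text in texts.items():
        for word in text.lower().split():
            index[word].add(item_id)
    return index

def search(query, desc_index, kw_index, mode="both"):
    """Return ids of items matching every query word in the chosen index(es)."""
    if mode not in MODES:
        raise ValueError(f"mode must be one of {MODES}")
    words = query.lower().split()
    if not words:
        return []

    def lookup(word):
        hits = set()  # fresh set, so the indexes are never mutated
        if mode in ("description", "both"):
            hits |= desc_index.get(word, set())
        if mode in ("keywords", "both"):
            hits |= kw_index.get(word, set())
        return hits

    result = lookup(words[0])
    for word in words[1:]:
        result &= lookup(word)
    return sorted(result)

# Hypothetical sample data: item 2 is a chair with keyword spam.
descriptions = {1: "red velvet sofa", 2: "plain wooden chair"}
keywords = {1: "furniture sofa", 2: "sofa velvet gown hair shoes"}
desc_index = build_index(descriptions)
kw_index = build_index(keywords)
```

Searching for "velvet" in "description" mode finds only the actual velvet sofa, while "keywords" mode finds only the spammed chair, which is exactly the choice the user would be making.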

The drawback of "doing MP search well" - if LL ever "does it well" some day in the future is:  Those keyword-spam and irrelevant items will get fewer hits, fewer sales.  Boo-hoo for them!

I hope you enjoyed my answer/non-answer!

But to answer your ACTUAL question, "[converting] word salad into a searchable index" depends on the database implementation. Some can just "search the data" based on the database implementation.  For others (old school), you'd have to split words into separate indexes. That gives you an option to ignore "garbage" words (a, an, the, like, etc.).

But I digress.

And no, I ain't a Tigress.

Edited by Love Zhaoying
Removed extra blank lines from the end

20 minutes ago, Love Zhaoying said:

That totally depends on the implementation!


Not related to the thread topic at all... but for small databases, you will get much better results using a phonetic function like "metaphone" in PHP (or an equivalent in whatever programming language you are using), then running MATCH ... AGAINST queries on subqueries (MySQL). Simple, easy, fast and "relevant" by default. Of course, your mileage will vary, but on a well-built, smartly indexed database (category, total sales, views, views and sales... using SL as a context here, but use indexes relevant to your application), and with partitions depending on the case, it works like a charm without "voodoos".
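A rough sketch of the phonetic-key idea in Python follows. Note the swap: this uses classic Soundex rather than Metaphone, because Soundex is short enough to write out in full, and it uses a plain dictionary instead of MySQL's MATCH ... AGAINST. The item names and helper functions are made up for the example.

```python
from collections import defaultdict

# Soundex digit for each consonant; vowels, h, w and y have no digit.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit

def soundex(word):
    """Return the 4-character Soundex key for a word ('' if no letters)."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    encoded = []
    prev = CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate repeated codes
        code = CODES.get(ch, "")
        if code and code != prev:
            encoded.append(code)
        prev = code  # a vowel resets prev, allowing a repeat afterwards
    return (first.upper() + "".join(encoded) + "000")[:4]

def build_phonetic_index(names):
    """Map each word's Soundex key to the ids of items containing it."""
    index = defaultdict(set)
    for item_id, name in names.items():
        for word in name.split():
            index[soundex(word)].add(item_id)
    return index

def phonetic_search(index, word):
    """Return ids of items with a word that sounds like the query word."""
    return sorted(index.get(soundex(word), set()))

# Hypothetical sample data.
names = {1: "velvet sofa", 2: "brass lamp"}
index = build_phonetic_index(names)
```

The payoff is that a misspelled query such as "sopha" produces the same key as "sofa" and still finds the item, which an exact word match would miss.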

Edited by Andred Darwin

6 hours ago, Yoofaloof Pacer said:

Surely it must be hitting the big players in their pockets hard.

Funny, just got another "tweet" hehehe ... even at the "you know which" website, seems like one of the areas is now "hiding" the brackets... but the data still seems to be coming through the socket....

[
  {
    "bracket": 1700000,
    "count": 1
  },
  {
    "bracket": 1350000,
    "count": 1
  },
  {
    "bracket": 1250000,
    "count": 1
  },
  {
    "bracket": 1200000,
    "count": 1
  },
  {
    "bracket": 1000000,
    "count": 1
  }
]

The said top brackets were above 2,000,000 in that guessed sample! (Still awesome for whoever is there!)

 

Edited by Andred Darwin

13 minutes ago, Love Zhaoying said:

Surely, this is very dark majick!

I would not say there is anything dark about it. Every company wants some control over search and to steer it toward its goals, and LL would not be unique if they do. "Desk" professionals sometimes need to "prove" they are doing something, or at least have something to say, and in many cases that's where the train derails (many ideas, a bunch of distractions, no results, i.e. voodoos!).

Edited by Andred Darwin

1 minute ago, Andred Darwin said:
9 minutes ago, Love Zhaoying said:

Surely, this is very dark majick!

I would not say there is anything dark about it. Every company wants some control over search and to steer it toward its goals, and LL would not be unique if they do. "Desk" professionals sometimes need to "prove" they are doing something, or at least have something to say, and in many cases that's where the train derails.

Is joke.

You: "Works like a charm without voodoos".  

Charms are one type of "magic".
