User talk:Rtanglao/SearchRankingNotes22June2012
From MozillaWiki
- normally we'd take the top 200 search queries and rate each one pass/fail (pass == best answers in the top 5 search results), take notes, and decide whether to do content tuning (tweak title or summary) or tweak the search engine itself, e.g. the weights
- can't automate this because it requires judgement in the face of changing content and hot topics e.g. flash for ff13
- today however we don't have that data
- so we are comparing current search versus unified search to see which one is better
- Is unified search (which has weights) doing better than current search (which has no separation)?
- to test unified search append "&esunified=1"
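A minimal sketch of the comparison described above, assuming a query "passes" when at least one hand-picked best answer shows up in the top 5 results. The best answers still come from human judgement; the helper names (`search_url`, `passes`, `pass_rate`) are made up for illustration and are not part of Kitsune.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://support.mozilla.org/en-US/search"


def search_url(query, unified=False):
    """Build the search URL; unified search is selected with esunified=1."""
    params = {"q": query}
    if unified:
        params["esunified"] = "1"
    return SEARCH_URL + "?" + urlencode(params)


def passes(ranked_results, best_answers, top_n=5):
    """Assumption: pass == any best answer appears in the top_n results."""
    return any(result in best_answers for result in ranked_results[:top_n])


def pass_rate(rated_queries):
    """rated_queries: list of (ranked_results, best_answers) pairs."""
    if not rated_queries:
        return 0.0
    passed = sum(1 for ranked, best in rated_queries if passes(ranked, best))
    return passed / len(rated_queries)
```

Running `pass_rate` once over results from the current search and once over results from the unified search gives two numbers to compare for the same set of queries.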
- Cookie issues
- Why does the iphone article show up as #1 instead of the cookies article? Is it because of indexing issues? Maybe the cookies article has too many words in the title? Is it possible to see how the scoring happens? It is hard to get the full equation. Could this be made available as a tool in the admin panel, or displayed inline in search results (perhaps with a flag in the URL)? WillKG to look into this and to make the weights available
- Hopefully after the first couple of search ranking tuning iterations, you'd reach a steady state in tuning that requires minimal tweaking; tweaking is always required for new content, hot topics, changing content
- the current search data is inaccurate because it doesn't handle multiple word search aggregation/normalization
- to be done: add synonym support to search aggregation/normalization
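One way the multi-word aggregation/normalization with synonyms could look, as a rough sketch only. The synonym table and the function names are hypothetical; real query logs would need more cleanup than lowercasing and collapsing whitespace.

```python
from collections import Counter

# Hypothetical synonym table: maps a variant to its canonical form.
SYNONYMS = {"cookies": "cookie"}


def normalize(query):
    """Lowercase, collapse whitespace, and map each word to its canonical form."""
    words = query.lower().split()
    return " ".join(SYNONYMS.get(word, word) for word in words)


def aggregate(queries):
    """Count queries after normalization so variants are tallied together."""
    return Counter(normalize(query) for query in queries)
```

With this, "Clear Cookies", "clear  cookie" and "clear cookies " are all counted under the single key "clear cookie".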
- Perhaps we are giving too much weight to title matches in forum results, which is why this forum thread erroneously appears in the top 10: https://support.mozilla.org/en-US/questions/824656?s=cookies&r=6&as=s ; this forum post is relevant but it's not a helpful result since it's a thread about an edge case
- Workaround: use a managed answer (put certain threads at the top to give people frequently useful links)
- Need to lock or remove old threads - maybe need to increase weight for solved topics
- KB should be weighted over Forum if we have KB content coverage
- the weights are here: http://kitsune.readthedocs.org/en/latest/searchchapter.html#search-scoring http://kitsune.readthedocs.org/en/latest/searchchapter.html#searching-on-the-site
- bug: contributor forum threads should NOT appear in search results
- bug: unsolved old forum threads should NOT appear
- we are now using elastic search ; sphinx is disabled
- kludge tactic: add the keyword multiple times, aka "keyword bombing"; since the weight is 4x, this will help things! This should index in real time; it worked for cookies (added the keyword "cookie" 10 times to the cookie article and this bumped it ahead of the iphone cookie article)
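A toy illustration of why keyword bombing moves an article up, and of the kind of per-field score breakdown asked about earlier in these notes. This is not how elastic search actually scores documents; it is a plain weighted term count. The field names, the assumption that the 4x weight applies to a keywords field, and the sample articles are all made up.

```python
# Hypothetical weights: assumes the 4x weight applies to the keywords field.
FIELD_WEIGHTS = {"title": 1, "summary": 1, "keywords": 4}


def score_breakdown(doc, term):
    """Per-field contribution: field weight times occurrences of the term."""
    term = term.lower()
    return {
        field: weight * doc.get(field, "").lower().split().count(term)
        for field, weight in FIELD_WEIGHTS.items()
    }


def score(doc, term):
    """Total toy score for one document and one search term."""
    return sum(score_breakdown(doc, term).values())


# Made-up sample documents, not the real articles.
iphone_article = {"title": "cookie settings on iphone", "summary": "cookie cookie"}
cookie_article = {"title": "enable and disable cookie storage", "summary": ""}
bombed_article = dict(cookie_article, keywords=" ".join(["cookie"] * 10))
```

In this toy model the iphone article outscores the plain cookie article for "cookie", and the bombed version jumps ahead because each of the ten keyword hits counts four times.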
- https://support.mozilla.org/en-US/search?q=where+is+the+firefox+button+at+the+top+of+page versus https://support.mozilla.org/en-US/search?q=where+is+the+firefox+button+at+the+top+of+page&esunified=1
- unified is better since it shows the right kb article as result #3 instead of result #5
- this is a tricky one since many articles have the word "firefox" or "button" in them
- is higher priority given to items with the words in the right order?
- should we make "firefox" a stop word but make "firefox button" and other firefox + other word phrases special search terms, e.g. synonyms or tailored searches, or use proximity matches if they are available in elastic search?
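The stop word idea above could be prototyped roughly like this, in plain Python rather than in the search engine's own analysis settings. The stop word list, the phrase list, and the `analyze` function are hypothetical; the point is only that special phrases have to be recognized before "firefox" on its own is dropped.

```python
# Hypothetical lists for illustration.
STOP_WORDS = {"firefox"}
PHRASES = ["firefox button"]


def analyze(query):
    """Split a query into terms, keeping known phrases and dropping stop words."""
    words = query.lower().split()
    terms = []
    i = 0
    while i < len(words):
        for phrase in PHRASES:
            parts = phrase.split()
            if words[i:i + len(parts)] == parts:
                # Keep the whole phrase as a single search term.
                terms.append(phrase)
                i += len(parts)
                break
        else:
            # No phrase starts here: keep the word unless it is a stop word.
            if words[i] not in STOP_WORDS:
                terms.append(words[i])
            i += 1
    return terms
```

For the query "where is the firefox button at the top of page" this keeps "firefox button" as one term, while "how can i download an old version of firefox" loses the bare "firefox".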
- https://support.mozilla.org/en-US/search?q=how+can+i+download+an+old+version+of+firefox&esunified=1 versus non unified
- unified is completely wrong; possibly an indexing failure?