I am trying to use Solr 5's suggester. Suggestions work, but I am getting recurring (duplicate) suggestions. I tried to use grouping on the suggestions, but that does not work. How can I prevent recurring suggestions?
Here are the relevant parts of my schema.xml:
<field name="name" type="suggest" indexed="true" stored="true" multiValued="false"/>
...
<fieldType name="suggest" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="15"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
My solrconfig.xml:
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mysuggester</str>
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <str name="suggestAnalyzerFieldType">suggest</str>
    <str name="exactMatchFirst">true</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">name</str>
    <str name="weightField">price</str>
    <str name="buildOnCommit">true</str>
    <str name="buildOnStartup">false</str>
    <str name="preserveSep">false</str>
  </lst>
</searchComponent>
<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">5</str>
    <str name="suggest.dictionary">mysuggester</str>
    <str name="suggest.collate">true</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
    <str>query</str>
  </arr>
</requestHandler>
Example output for "acer" suggestions, with these params:
/suggest?&suggest.dictionary=mysuggester&suggest.q=acer
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">6</int>
  </lst>
  <lst name="suggest">
    <lst name="mysuggester">
      <lst name="acer">
        <int name="numFound">5</int>
        <arr name="suggestions">
          <lst>
            <str name="term"><b>acer</b> v3-772g-5421121tmakk intel core i5 4210u 1.7ghz 12gb 1tb 17.3"</str>
            <long name="weight">2369</long>
            <str name="payload"/>
          </lst>
          <lst>
            <str name="term"><b>acer</b> v3-772g-5421121tmakk intel core i5 4210u 1.7ghz 12gb 1tb 17.3"</str>
            <long name="weight">2369</long>
            <str name="payload"/>
          </lst>
          <lst>
            <str name="term"><b>acer</b> v3-772g-5421121tmakk intel core i5 4210u 1.7ghz 12gb 1tb 17.3"</str>
            <long name="weight">2350</long>
            <str name="payload"/>
          </lst>
          <lst>
            <str name="term"><b>acer</b> v3-772g-542081tmamm intel core i5 4200m 2.5ghz / 3.1ghz 8gb 1tb 17.3"</str>
            <long name="weight">2099</long>
            <str name="payload"/>
          </lst>
          <lst>
            <str name="term"><b>acer</b> v3-772g-542081tmamm intel core i5 4200m 2.5ghz / 3.1ghz 8gb 1tb 17.3"</str>
            <long name="weight">2000</long>
            <str name="payload"/>
          </lst>
        </arr>
      </lst>
    </lst>
  </lst>
  <result name="response" numFound="0" start="0"/>
</response>
As you can see, the suggestion acer v3-772g-5421121tmakk intel core i5 4210u 1.7ghz 12gb 1tb 17.3" appears 3 times.
Grouping does not work either:
suggest?&suggest.dictionary=mysuggester&suggest.q=acer&group=true&group.field=name
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">90</int>
  </lst>
  <lst name="suggest">
    <lst name="mysuggester">
      <lst name="acer">
        <int name="numFound">5</int>
        <arr name="suggestions">
          <lst>
            <str name="term"><b>acer</b> v3-772g-5421121tmakk intel core i5 4210u 1.7ghz 12gb 1tb 17.3"</str>
            <long name="weight">2369</long>
            <str name="payload"/>
          </lst>
          <lst>
            <str name="term"><b>acer</b> v3-772g-5421121tmakk intel core i5 4210u 1.7ghz 12gb 1tb 17.3"</str>
            <long name="weight">2369</long>
            <str name="payload"/>
          </lst>
          <lst>
            <str name="term"><b>acer</b> v3-772g-5421121tmakk intel core i5 4210u 1.7ghz 12gb 1tb 17.3"</str>
            <long name="weight">2350</long>
            <str name="payload"/>
          </lst>
          <lst>
            <str name="term"><b>acer</b> v3-772g-542081tmamm intel core i5 4200m 2.5ghz / 3.1ghz 8gb 1tb 17.3"</str>
            <long name="weight">2099</long>
            <str name="payload"/>
          </lst>
          <lst>
            <str name="term"><b>acer</b> v3-772g-542081tmamm intel core i5 4200m 2.5ghz / 3.1ghz 8gb 1tb 17.3"</str>
            <long name="weight">2000</long>
            <str name="payload"/>
          </lst>
        </arr>
      </lst>
    </lst>
  </lst>
  <lst name="grouped">
    <lst name="name">
      <int name="matches">0</int>
      <arr name="groups"/>
    </lst>
  </lst>
</response>
You're using the DocumentDictionaryFactory dictionary implementation. It stores suggested terms against each document. Hence, if the same suggestion term is present in multiple documents, all of those instances are served.
To prevent this, you can:
- write an intercepting API that reads more suggestions from Solr than you need (e.g. 30 at a time) and deduplicates them before returning the data
- use a different dictionary, such as FileDictionaryFactory or HighFrequencyDictionaryFactory
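
The first option might look roughly like the sketch below. It assumes the suggestions have already been fetched from Solr as JSON (for example with wt=json and a larger suggest.count such as 30) and parsed into a list of dicts with "term" and "weight" keys. The function name dedupe_suggestions and the shortened sample terms are hypothetical and only for illustration.

```python
from typing import Dict, Iterable, List


def dedupe_suggestions(suggestions: Iterable[Dict], limit: int = 5) -> List[Dict]:
    """Return up to `limit` suggestions, keeping the first entry seen per term.

    The input order is preserved, so if Solr returns suggestions sorted by
    weight, the highest-weighted copy of each term is the one that is kept.
    """
    seen = set()
    unique: List[Dict] = []
    for suggestion in suggestions:
        if len(unique) >= limit:
            break
        term = suggestion["term"]
        if term in seen:
            continue  # same term already served by another document
        seen.add(term)
        unique.append(suggestion)
    return unique


# Hypothetical, shortened stand-in for the "suggestions" array in the response.
raw = [
    {"term": "<b>acer</b> v3-772g-5421121tmakk", "weight": 2369},
    {"term": "<b>acer</b> v3-772g-5421121tmakk", "weight": 2369},
    {"term": "<b>acer</b> v3-772g-5421121tmakk", "weight": 2350},
    {"term": "<b>acer</b> v3-772g-542081tmamm", "weight": 2099},
    {"term": "<b>acer</b> v3-772g-542081tmamm", "weight": 2000},
]

top = dedupe_suggestions(raw, limit=5)
```

Because duplicates are dropped after the fetch, the intercepting layer should request more suggestions than it intends to return; otherwise the deduplicated list can come back shorter than the requested count.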