autosuggest - Solr suggester duplicate suggestions


I am trying to use the Solr 5 suggester. Suggestions work, but I am getting recurring suggestions. I tried to use grouping on the suggestions, but it does not work. How can I prevent recurring suggestions?

Here are the necessary parts of my schema.xml:

<field name="name" type="suggest" indexed="true" stored="true" multiValued="false"/>
...
<fieldType name="suggest" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="15"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

My solrconfig.xml:

<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mysuggester</str>
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <str name="suggestAnalyzerFieldType">suggest</str>
    <str name="exactMatchFirst">true</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">name</str>
    <str name="weightField">price</str>
    <str name="buildOnCommit">true</str>
    <str name="buildOnStartup">false</str>
    <str name="preserveSep">false</str>
  </lst>
</searchComponent>

<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">5</str>
    <str name="suggest.dictionary">mysuggester</str>
    <str name="suggest.collate">true</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
    <str>query</str>
  </arr>
</requestHandler>

Example output for "acer" suggestions, with these params:

/suggest?&suggest.dictionary=mysuggester&suggest.q=acer

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">6</int>
  </lst>
  <lst name="suggest">
    <lst name="mysuggester">
      <lst name="acer">
        <int name="numFound">5</int>
        <arr name="suggestions">
          <lst>
            <str name="term"><b>acer</b> v3-772g-5421121tmakk intel core i5 4210u 1.7ghz 12gb 1tb 17.3"</str>
            <long name="weight">2369</long>
            <str name="payload"/>
          </lst>
          <lst>
            <str name="term"><b>acer</b> v3-772g-5421121tmakk intel core i5 4210u 1.7ghz 12gb 1tb 17.3"</str>
            <long name="weight">2369</long>
            <str name="payload"/>
          </lst>
          <lst>
            <str name="term"><b>acer</b> v3-772g-5421121tmakk intel core i5 4210u 1.7ghz 12gb 1tb 17.3"</str>
            <long name="weight">2350</long>
            <str name="payload"/>
          </lst>
          <lst>
            <str name="term"><b>acer</b> v3-772g-542081tmamm intel core i5 4200m 2.5ghz / 3.1ghz 8gb 1tb 17.3"</str>
            <long name="weight">2099</long>
            <str name="payload"/>
          </lst>
          <lst>
            <str name="term"><b>acer</b> v3-772g-542081tmamm intel core i5 4200m 2.5ghz / 3.1ghz 8gb 1tb 17.3"</str>
            <long name="weight">2000</long>
            <str name="payload"/>
          </lst>
        </arr>
      </lst>
    </lst>
  </lst>
  <result name="response" numFound="0" start="0"/>
</response>

You can see that the suggestion acer v3-772g-5421121tmakk intel core i5 4210u 1.7ghz 12gb 1tb 17.3" appears 3 times.

Grouping does not work either:

suggest?&suggest.dictionary=mysuggester&suggest.q=acer&group=true&group.field=name

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">90</int>
  </lst>
  <lst name="suggest">
    <lst name="mysuggester">
      <lst name="acer">
        <int name="numFound">5</int>
        <arr name="suggestions">
          <lst>
            <str name="term"><b>acer</b> v3-772g-5421121tmakk intel core i5 4210u 1.7ghz 12gb 1tb 17.3"</str>
            <long name="weight">2369</long>
            <str name="payload"/>
          </lst>
          <lst>
            <str name="term"><b>acer</b> v3-772g-5421121tmakk intel core i5 4210u 1.7ghz 12gb 1tb 17.3"</str>
            <long name="weight">2369</long>
            <str name="payload"/>
          </lst>
          <lst>
            <str name="term"><b>acer</b> v3-772g-5421121tmakk intel core i5 4210u 1.7ghz 12gb 1tb 17.3"</str>
            <long name="weight">2350</long>
            <str name="payload"/>
          </lst>
          <lst>
            <str name="term"><b>acer</b> v3-772g-542081tmamm intel core i5 4200m 2.5ghz / 3.1ghz 8gb 1tb 17.3"</str>
            <long name="weight">2099</long>
            <str name="payload"/>
          </lst>
          <lst>
            <str name="term"><b>acer</b> v3-772g-542081tmamm intel core i5 4200m 2.5ghz / 3.1ghz 8gb 1tb 17.3"</str>
            <long name="weight">2000</long>
            <str name="payload"/>
          </lst>
        </arr>
      </lst>
    </lst>
  </lst>
  <lst name="grouped">
    <lst name="name">
      <int name="matches">0</int>
      <arr name="groups"/>
    </lst>
  </lst>
</response>

You're using the DocumentDictionaryFactory dictionary implementation. It stores suggested terms against each document. Hence, if the same suggestion term is present in multiple documents, all instances are served.

To prevent this, you can either:

  1. Write an intercepting API that reads suggestions from Solr (e.g. 30 at a time) and deduplicates them before returning the data
  2. Use a different dictionary implementation, such as FileDictionaryFactory or HighFrequencyDictionaryFactory
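
The first option could look roughly like the Python sketch below. It is only an illustration under some assumptions: the function names, the `base_url` argument, and the `overfetch` default are made up for this example, and the JSON layout (`suggest` -> dictionary name -> query -> `suggestions`) is assumed to mirror the XML response shown in the question when requesting `wt=json`. The idea is to ask Solr for more suggestions than needed and keep only the first occurrence of each term, which preserves Solr's weight ordering.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def dedupe_suggestions(suggestions, limit=5):
    """Keep the first occurrence of each term, preserving the incoming order."""
    seen = set()
    unique = []
    for suggestion in suggestions:
        if len(unique) >= limit:
            break
        key = suggestion["term"].strip()
        if key in seen:
            continue  # same term already served from another document
        seen.add(key)
        unique.append(suggestion)
    return unique


def fetch_unique_suggestions(base_url, dictionary, query, limit=5, overfetch=30):
    """Hypothetical wrapper: over-fetch from the /suggest handler, then dedupe.

    base_url is assumed to point at the Solr core, and the response layout
    is assumed to match the XML shown in the question.
    """
    params = urlencode({
        "suggest": "true",
        "suggest.dictionary": dictionary,
        "suggest.q": query,
        "suggest.count": overfetch,
        "wt": "json",
    })
    with urlopen(f"{base_url}/suggest?{params}") as response:
        body = json.load(response)
    suggestions = body["suggest"][dictionary][query]["suggestions"]
    return dedupe_suggestions(suggestions, limit)
```

With the response from the question, `dedupe_suggestions` would reduce the five entries to the two distinct product names, keeping the highest-weighted copy of each because Solr already returns them sorted by weight.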
