kNN - User-defined term vectors in Elasticsearch


How (if at all possible) can one insert a term vector into an Elasticsearch index?

ES computes term vectors behind the scenes in order to carry out its text-mining tasks, but it would be useful to be able to enter a list of (term, weight) pairs instead.

Why?

Well, for instance, though ES enables kNN (k-nearest-neighbors) for k=2, in the context of geographic proximity, it doesn't have explicit k>2 functionality. If we were able to insert our own term vectors, we could hack k>2 functionality by harnessing ES's built-in text-indexing methods.

Any indications on this issue?

As far as I know, there's no way to do this in Elasticsearch (I'm still looking for the fastest approach to real-time kNN search, and Elasticsearch is one of the choices).

Elasticsearch is based on an inverted index: each term in a term vector (which may come from a sentence) is indexed in a sorted list. When we search for a query, the query is analyzed into a term vector, and Elasticsearch (Lucene, actually) searches the indices for each term.
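To make the idea concrete, here is a minimal Python sketch of an inverted index. It is a toy model, not how Lucene is actually implemented: the whitespace analyzer and the names `analyze`, `build_inverted_index` and `search` are assumptions made up for illustration.

```python
from collections import defaultdict

def analyze(text):
    # Hypothetical analyzer: lowercase and split on whitespace.
    return text.lower().split()

def build_inverted_index(docs):
    # Map each term to the sorted list of document ids that contain it.
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}

def search(index, query):
    # Analyze the query into terms, then collect every document
    # that appears in the posting list of at least one query term.
    hits = set()
    for term in analyze(query):
        hits.update(index.get(term, []))
    return sorted(hits)

docs = {1: "fast knn search", 2: "inverted index search", 3: "geo distance query"}
index = build_inverted_index(docs)
print(search(index, "knn index"))  # [1, 2]
```

Note that a document can only be returned if it shares at least one term with the query.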

But kNN requires calculating the distance between two vectors that may not share any term, and the traditional inverted index is not designed for this requirement.
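A small Python sketch of the mismatch, assuming vectors are stored as (term, weight) dictionaries and that distance is Euclidean (both are assumptions for illustration; `euclidean` and `shares_term` are hypothetical helpers):

```python
import math

def euclidean(a, b):
    # Distance over the union of terms; a missing term counts as weight 0.
    terms = set(a) | set(b)
    return math.sqrt(sum((a.get(t, 0.0) - b.get(t, 0.0)) ** 2 for t in terms))

def shares_term(a, b):
    # True if a term lookup in an inverted index could connect the two vectors.
    return bool(set(a) & set(b))

query = {"x": 1.0}
near = {"y": 1.0}   # no term in common with the query
far = {"x": 10.0}   # shares the term "x" with the query

print(shares_term(query, near), euclidean(query, near))
print(shares_term(query, far), euclidean(query, far))
```

Here `near` is the closer vector, yet a term lookup on "x" would only ever reach `far`, so the true nearest neighbor is never considered as a candidate.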

As you have said, Elasticsearch implements real-time kNN search when k = 2 via geo queries, but I don't think it supports k > 2.

By the way, if you have found an approach to implement real-time kNN search where k may be a large number (100000?), on a huge data set (number of vectors), please tell me, thanks :)

