java - Elasticsearch : Disable IDF completely for search result scoring -

I have this sample data in Elasticsearch, along with the query I am running against it:

    {
      "_index": "12_index",
      "_type": "skill_strings",
      "_id": "avkv-km4axmy3feczw9t",
      "_source": {
        "str": "php php php"
      }
    },
    {
      "_index": "12_index",
      "_type": "skill_strings",
      "_id": "avkv-knfaxmy3feczw9u",
      "_source": {
        "str": "javascript php javascript javascript"
      }
    }

    "bool": {
      "must": [
        // conditions
        {"match_phrase": {"str": "php"}}
      ],
      "should": [
        {"match_phrase": {"sentences": "javascript"}}
      ]
    }

I have norms disabled.

In the result set, php (with 16 occurrences) gets a score of 13.65 (rounded off), whereas javascript, with the same number of occurrences in its document, gets a lower score of 9.58.

For my use case, irrespective of how rare a word is or how short or long the field is, I want the same score for the same term frequency.

How can I do this?

If you literally want the first document to score 3.0 for str:php (before score normalization), and the second to score 3.0 for str:javascript (before score normalization), [you should use script_score][1], using the [tf() function][2].

This will bypass (1) length normalization, (2) consideration of "rarity" (IDF), and (3) normalization of the term frequency (TF).
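
As a rough sketch of what such a request could look like, the Python below builds a `function_score` body whose score is replaced by the raw term frequency. It assumes an older Elasticsearch release (1.x/2.x) where Groovy scripting is enabled and the `_index['FIELD']['TERM'].tf()` text-scoring expression is available; later versions removed it, so treat the script string as an assumption to check against your version. The helper names `tf_only_query` and `raw_tf` are hypothetical and only serve the illustration. `raw_tf` mimics what `tf()` would return for the two sample documents, assuming the analyzer splits on whitespace.

```python
import json


def tf_only_query(field, term):
    """Build a function_score body that scores by raw term frequency only."""
    # Assumed Groovy-era text-scoring expression. Field and term are
    # interpolated naively, so they must not contain quote characters.
    script = "_index['{0}']['{1}'].tf()".format(field, term)
    return {
        "query": {
            "function_score": {
                # The inner query only selects documents; its score is discarded.
                "query": {"match_phrase": {field: term}},
                "script_score": {"script": script},
                # "replace" drops the TF-IDF score and keeps the script value.
                "boost_mode": "replace",
            }
        }
    }


def raw_tf(text, term):
    """Count whitespace-separated occurrences of term in text."""
    return text.split().count(term)


if __name__ == "__main__":
    print(json.dumps(tf_only_query("str", "php"), indent=2))
    # Both sample documents contain their term three times.
    print(raw_tf("php php php", "php"))
    print(raw_tf("javascript php javascript javascript", "javascript"))
```

With `boost_mode` set to `replace`, the score coming from the inner `match_phrase` query is ignored, so only the script value contributes. The printed JSON can be sent as the body of a `_search` request against the index.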
