Solr 5.1.0 and 5.2.1: Creating parent-child docs using DIH


I am trying an import with Solr 5.1.0 and 5.2.1, using a data-config that should produce documents with the following structure:

<parentdoc>
    <someparentstuff/>
    <childdoc>
        <somechildstuff/>
    </childdoc>
</parentdoc>

From what I understand from one of the answers on the question "Nested entities in DIH", these versions of Solr should be able to create the above structure with the following data-config.xml:

<dataConfig>
    <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
            url=""
            user=""
            password=""
            batchSize="-1"
    />
    <document name="">
        <entity rootEntity="true" name="parent" pk="parent_id" query="SELECT * FROM parent">
            <field column="parent_id" name="parent_id" />
            <entity child="true" name="child" query="SELECT * FROM child WHERE parent_id='${parent.parent_id}'">
                <field column="parent_id" name="parent_id" />
                <field column="item_status" name="item_status" />
            </entity>
        </entity>
    </document>
</dataConfig>

However, when I perform a full-import, I get:

<result name="response" numFound="2" start="0">
  <doc>
    <long name="parent_id">477</long> <!-- child -->
    <str name="item_status">ws</str>
  </doc>
  <doc>
    <long name="parent_id">477</long> <!-- parent -->
  </doc>
</result>

which, as I understand it, is the denormalized layout you are supposed to get pre-5.1.0; however, I expected:

<result name="response" numFound="1" start="0">
    <doc>
        <long name="parent_id">477</long>
        <doc>
            <long name="parent_id">477</long>
            <str name="item_status">ws</str>
        </doc>
    </doc>
</result>

What do I need to do to get the desired document structure? Or am I misunderstanding what nested entities in DIH are supposed to do?

Unless someone tells me otherwise, it seems I have misunderstood the creation of parent-child docs in Solr 5.1.0+. I was expecting to be able to nest documents and have them returned that way, but that's not possible with Solr (at least at this point. The future is a mystery.)

Solr has a flat document model. This means it doesn't model parent-child relationships in the way I wanted in the original question. There is no nesting. Everything is flat and denormalized.

What Solr does is add any number of child documents next to the parent in a contiguous block. For example:

childdoc1 childdoc2 childdoc3 parent

This structure is reflected in the documents I was "mistakenly" getting returned from Solr:

<result name="response" numFound="2" start="0">
  <doc>
    <long name="parent_id">477</long> <!-- child -->
    <str name="item_status">ws</str>
  </doc>
  <doc>
    <long name="parent_id">477</long> <!-- parent -->
  </doc>
</result>
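The contiguous-block layout described above can be pictured with a small sketch. This is only an illustration in plain Python, not anything Solr exposes: it assumes each document carries a marker field named type (hypothetical at this point in the answer) and that the list is in index order, with each parent closing its own block. The second block's values are made-up sample data.

```python
from typing import Dict, List

Doc = Dict[str, object]


def split_into_blocks(index_order: List[Doc]) -> List[List[Doc]]:
    """Split a flat, index-ordered list into blocks that each end with a parent."""
    blocks: List[List[Doc]] = []
    current: List[Doc] = []
    for doc in index_order:
        current.append(doc)
        if doc.get("type") == "parent":  # the parent closes its block
            blocks.append(current)
            current = []
    # Children with no closing parent are dropped; a valid block never has them.
    return blocks


# Children come first, their parent comes last, and blocks sit back to back.
docs = [
    {"type": "child", "parent_id": 477, "item_status": "ws"},
    {"type": "parent", "parent_id": 477},
    {"type": "child", "parent_id": 478, "item_status": "ok"},
    {"type": "child", "parent_id": 478, "item_status": "ws"},
    {"type": "parent", "parent_id": 478},
]
blocks = split_into_blocks(docs)
```

The point of the sketch is that nothing in a block is nested: the relationship is carried only by position, which is why a filter that identifies parents is needed to find the block boundaries.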

The nested document support available in DIH after Solr 5.0 is an add-on to, or outright replacement for, the old way people used to have to index nested documents, and it seems to take care of updating child + parent docs at the same time for you. Very convenient!

So, then, how do you express a parent-child relationship when Solr destroys the nice, nested document model you had planned? You have to get the parent docs and child docs and manage the relationship in your application. How do you get the parents and children?

The answer is block joins.

Use block joins at query time, and in your application, process the documents into the desired structure. Let's look at two block join queries, because they can be a bit weird at first.
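As a rough idea of what "process the documents into the desired structure" could look like in the application, here is a minimal Python sketch. It assumes you already have two lists of plain dicts (one of parent docs, one of child docs, for example from two separate queries) and that both share the parent_id field from the question. The function name and the children key are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List

Doc = Dict[str, object]


def nest_children(parents: List[Doc], children: List[Doc],
                  key: str = "parent_id") -> List[Doc]:
    """Attach each child doc to its parent under a 'children' list."""
    by_parent = defaultdict(list)
    for child in children:
        by_parent[child[key]].append(child)

    nested = []
    for parent in parents:
        doc = dict(parent)  # copy so the caller's dicts are not mutated
        doc["children"] = by_parent.get(parent[key], [])
        nested.append(doc)
    return nested


parents = [{"parent_id": 477}]
children = [{"parent_id": 477, "item_status": "ws"}]
nested = nest_children(parents, children)
# nested == [{"parent_id": 477,
#             "children": [{"parent_id": 477, "item_status": "ws"}]}]
```

Grouping by a shared key keeps the work linear in the number of documents, which matters if a parent query returns many rows.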

{!parent which='type:parent'}item_id:5918307

This block join query says, "Get me the parent document that has one or more child documents with an item_id of 5918307."

{!child of='type:parent'} (fielda:term^100.0 OR fieldb:term^100.0 OR fieldc:term OR (fieldd:term^20.0)) AND (instock:true^9999.0)

This block join query says, "Get me one or more child documents whose parent documents contain the word 'term' and are in stock."
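If you build these queries in application code, a small helper can keep the local-params syntax in one place. The following is a sketch only: the helper names, the core name mycore, and the localhost URL are assumptions for illustration, and it does nothing more than assemble the strings shown above and URL-encode them for the standard /select handler.

```python
from urllib.parse import urlencode


def parent_query(parent_filter: str, child_clause: str) -> str:
    """Build a {!parent} query: match children, return their parents."""
    return "{!parent which='%s'}%s" % (parent_filter, child_clause)


def child_query(parent_filter: str, parent_clause: str) -> str:
    """Build a {!child} query: match parents, return their children."""
    return "{!child of='%s'}%s" % (parent_filter, parent_clause)


def select_url(base_url: str, q: str) -> str:
    """URL-encode the query for the /select handler (base_url is hypothetical)."""
    return base_url + "/select?" + urlencode({"q": q, "wt": "json"})


q_parents = parent_query("type:parent", "item_id:5918307")
q_children = child_query("type:parent", "instock:true")
url = select_url("http://localhost:8983/solr/mycore", q_parents)
```

Note that the filter passed as parent_filter has to identify all parent documents, since that is what tells Solr where each block ends.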

Do not search on child fields when doing !child queries. So, referencing the first example, do not search on item_id, because it will give you a 500 error.

You'll notice the type field in these queries. You have to add it to the schema and data-config yourself. In the schema, it looks like this:

<!-- Use this field to differentiate between parent and child docs -->
<field name="type" type="string" indexed="true" stored="false" />

Then in data-config.xml, add the following to the SELECT for the parent:

IF ('true' = 'true', 'parent', 'parent') AS type

And do the same for the child, except substitute "child" where you put "parent" before.

So in the end you might end up making two queries, but it doesn't seem that adding the block join parser adds much query time. I'm seeing maybe 50 or 100ms per query.

You can bypass needing nested documents by being smart with joins. I've discovered, however, that because child documents mingle with parent documents, you have one "copy" of each parent with specific child information in the index. In that situation, grab the known parent fields from the first document, along with its child fields. For the rest of the documents, grab only the child fields.
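One way to read the "first document, then the rest" approach is sketched below in Python. It assumes every returned document repeats the parent fields alongside one child's fields, and that you know up front which field names belong to the parent. The parent_name field and its value are made-up sample data, and the function name is hypothetical.

```python
from typing import Dict, List, Optional

Doc = Dict[str, object]


def fold_documents(docs: List[Doc], parent_fields: List[str]) -> Optional[Doc]:
    """Take parent fields from the first doc and child fields from every doc."""
    if not docs:
        return None
    first = docs[0]
    # Parent fields are repeated on every doc, so reading them once is enough.
    parent = {name: first[name] for name in parent_fields if name in first}
    # Whatever is not a known parent field is treated as child data.
    parent["children"] = [
        {k: v for k, v in doc.items() if k not in parent_fields}
        for doc in docs
    ]
    return parent


docs = [
    {"parent_id": 477, "parent_name": "widget", "item_status": "ws"},
    {"parent_id": 477, "parent_name": "widget", "item_status": "ok"},
]
folded = fold_documents(docs, ["parent_id", "parent_name"])
```

This only works when all documents in the list belong to the same parent; with mixed results you would group by parent_id first.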

Another option, when you want just the parent doc and don't want a whole bunch of other docs being returned, is to use grouping queries; however, I wouldn't recommend it. When I tried it on a query that returned a large number of results, I saw query times go from the 10ms - 250ms range all the way to the 500ms - 1s range.
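For reference, a grouping request of the kind mentioned above might be assembled like this. It is a sketch that assumes grouping on the parent_id field from the question and uses Solr's standard result grouping parameters (group, group.field, group.limit). The helper name and the sample query are made up, and the performance caveat above still applies.

```python
from urllib.parse import urlencode
from typing import Dict


def grouping_params(q: str, group_field: str, per_group: int = 1) -> Dict[str, object]:
    """Request parameters that collapse results into one group per field value."""
    return {
        "q": q,
        "group": "true",             # turn result grouping on
        "group.field": group_field,  # one group per distinct value
        "group.limit": per_group,    # docs returned inside each group
        "wt": "json",
    }


params = grouping_params("item_status:ws", "parent_id")
query_string = urlencode(params)
```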

