java - How do I create a schema for Hive to parse deeply nested JSON (Azure Application Insights output) using SerDe?
I'm trying to create a schema for Hive to parse JSON; however, I am having trouble creating the schema when the JSON document has the following structure:
{ "context": { "custom": { "dimensions": [{ "action": "getfilters" }, { "userid": "12345678" }] } } }
I am using the Hadoop emulator for Azure's HDInsight on Windows (8.1), and I am using Java (1.8.0_73). I compiled the SerDe with Maven. I would think the following would work:

add jar ../lib/json-serde-1.1.9.9-hive1.2-jar-with-dependencies.jar;
drop table events;
create external table events (
  context struct<custom:struct<dimensions:array<struct<action:string>,struct<userid:string>>>>
) row format serde 'org.openx.data.jsonserde.JsonSerDe'
location '/json/event';

When I take out the nested array type, the schema parses OK, but with it in, I get the following exception:
MismatchedTokenException(282!=9)
at org.antlr.runtime.BaseRecognizer.recoverFromMismatchedToken(BaseRecognizer.java:617)
at org.antlr.runtime.BaseRecognizer.match(BaseRecognizer.java:115)
at org.apache.hadoop.hive.ql.parse.HiveParser.columnNameColonType(HiveParser.java:34909)
at org.apache.hadoop.hive.ql.parse.HiveParser.columnNameColonTypeList(HiveParser.java:33113)
at org.apache.hadoop.hive.ql.parse.HiveParser.structType(HiveParser.java:36331)
at org.apache.hadoop.hive.ql.parse.HiveParser.type(HiveParser.java:35334)
at org.apache.hadoop.hive.ql.parse.HiveParser.colType(HiveParser.java:35054)
at org.apache.hadoop.hive.ql.parse.HiveParser.columnNameColonType(HiveParser.java:34914)
at org.apache.hadoop.hive.ql.parse.HiveParser.columnNameColonTypeList(HiveParser.java:33085)
at org.apache.hadoop.hive.ql.parse.HiveParser.structType(HiveParser.java:36331)
at org.apache.hadoop.hive.ql.parse.HiveParser.type(HiveParser.java:35334)
at org.apache.hadoop.hive.ql.parse.HiveParser.colType(HiveParser.java:35054)
at org.apache.hadoop.hive.ql.parse.HiveParser.columnNameType(HiveParser.java:34754)
at org.apache.hadoop.hive.ql.parse.HiveParser.columnNameTypeList(HiveParser.java:32951)
at org.apache.hadoop.hive.ql.parse.HiveParser.createTableStatement(HiveParser.java:4544)
at org.apache.hadoop.hive.ql.parse.HiveParser.ddlStatement(HiveParser.java:2144)
at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1398)
at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1036)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:199)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:409)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:323)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:980)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1045)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:916)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:906)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:793)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
FAILED: ParseException line 2:69 missing > at ',' near 'struct' in column specification
line 2:76 mismatched input '<' expecting : near 'struct' in column specification
hive>
That external table looked fine to me. Maybe try downloading a distribution of the JSON SerDe. I have had success with: http://www.congiu.net/hive-json-serde/
I have had success in HDInsight 3.2 with http://www.congiu.net/hive-json-serde/1.3/cdh5/, but you might try a newer build for HDP.
The documentation is here: https://github.com/rcongiu/hive-json-serde
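It may also be worth looking at the type declaration itself, since the ParseException is raised by Hive's parser before the SerDe is ever involved. The error points at the comma inside array<...>: a Hive array takes a single element type, so two comma-separated structs are rejected. A minimal sketch of one possible rewrite, assuming every element of dimensions can share one struct whose fields are the union of the keys seen in the sample document (fields missing from a given element would then be expected to come back as NULL); this has not been confirmed by anyone in the thread:

```sql
-- Sketch only: same table and location as in the question.
-- Assumption: one struct type describes every element of "dimensions".
drop table events;
create external table events (
  context struct<custom:struct<dimensions:array<struct<action:string,userid:string>>>>
) row format serde 'org.openx.data.jsonserde.JsonSerDe'
location '/json/event';

-- Individual elements are then addressed by index:
select context.custom.dimensions[0].action,
       context.custom.dimensions[1].userid
from events;
```

If the set of keys is not known up front, declaring the column as array<map<string,string>> is another option that parses, at the cost of looking values up by key instead of by field name.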