Simplifying JSON Queries with XPath-Style Paths  Hao Wei, 2020/01/22

1. Requirements

Suppose we have the following JSON data:

a ='''{
    "ver": "6.8",
    "dcid": "477",
    "head": {
        "cid": "",
        "ctok": "",
        "cver": "1.0",
        "lang": "01",
        "sid": "1",
        "syscode": "09",
        "auth": "",
        "refs": ["paper 2015",  "paper1"],
        "extension":{
            "cver": "1.0",
            "lang": "chs",
            "info": ["V2032",  "JD03-1"]
        } 
    },
    "contentType": "json"
}'''

There are usually two ways to access a value in it:

  • jsonstr['head']['sid']
  • jsonstr.get('head').get('sid')

Both forms are verbose, and they become cumbersome when the data is deeply nested.
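The two styles also differ in how they fail when a key is missing. The following is a minimal sketch on a trimmed-down copy of the data above; the variable name `data` and the missing key `body` are chosen here purely for illustration:

```python
import json

# Trimmed-down copy of the sample data above.
data = json.loads('{"ver": "6.8", "head": {"sid": "1", "syscode": "09"}}')

# Both styles return the same value when every key exists.
print(data['head']['sid'])
print(data.get('head').get('sid'))

# With a missing key, subscripting raises KeyError ...
try:
    data['body']['sid']
except KeyError as exc:
    print('KeyError:', exc)

# ... while chained .get() fails one step later, because
# data.get('body') is None and None has no .get method.
try:
    data.get('body').get('sid')
except AttributeError as exc:
    print('AttributeError:', exc)
```

Either way, every level of nesting needs its own guard, which is the inconvenience a path-based helper is meant to remove.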

2. Solution

To simplify JSON access, this article presents a function that lets you write eval_xpath(json_data, '/head/sid') instead.

Here are some more examples:

# Queries
queries = [ '/ver',
            '/head/syscode',
            '/head/refs',
            '/head/extension/lang',
            '/head/extension/info',
            '/head/extension/extension']

Running the function on these queries prints:

/ver                           6.8
/head/syscode                  09
/head/refs                     ['paper 2015', 'paper1']
/head/extension/lang           chs
/head/extension/info           ['V2032', 'JD03-1']
/head/extension/extension      None

3. Source code

# -*- coding: utf-8 -*-
"""
Created on Fri Jan 22 08:09:33 2021

@author: pc
"""

import json
import xmltodict # Note: third-party library (pip install xmltodict); only json_to_xml needs it

# Parse a JSON string
#class jsonprase(object):
#    def __init__(self, json_value):
#        try:
#            eval(json_value)
#            self.json_value = json.loads(json_value)
#        except Exception:
#            raise ValueError('must be a json str value')


def datalength(json_data, xpath="/"):
    return len(eval_xpath(json_data, xpath))

def json_to_xml(json_value):
    root = {"root": json_value}
    xml = xmltodict.unparse(root, pretty=True)
    return xml


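def eval_xpath(json_value, xpath):
    # Follow a '/'-separated path through nested dicts.
    # Returns None when the path cannot be resolved.
    elem = json_value
    # Skip empty segments so that "/" returns the root itself.
    nodes = [key for key in xpath.strip("/").split("/") if key]
    for key in nodes:
        try:
            elem = elem.get(key)
        except AttributeError:
            # elem is not a dict (a list, a string, or None), so the path ends here.
            return None
    return elem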

a ='''{
    "ver": "6.8",
    "dcid": "477",
    "head": {
        "cid": "",
        "ctok": "",
        "cver": "1.0",
        "lang": "01",
        "sid": "1",
        "syscode": "09",
        "auth": "",
        "refs": ["paper 2015",  "paper1"],
        "extension":{
            "cver": "1.0",
            "lang": "chs",
            "info": ["V2032",  "JD03-1"]
        } 
    },
    "contentType": "json"
}'''

jdata = json.loads(a)
queries = [
        '/ver',
        '/head/syscode',
        '/head/refs',
        '/head/extension/lang',
        '/head/extension/info',
        '/head/extension/extension'
        ]

for q in queries:
    r = eval_xpath(jdata, q)
    print(q.ljust(30), r)
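
The function above only descends through dictionaries, so a path cannot reach an individual list element such as the first entry of "refs". One possible extension, which is not part of the original code, is sketched below. The name `eval_xpath_indexed` is hypothetical, and the zero-based indexing is an assumption made here (real XPath positions are one-based):

```python
def eval_xpath_indexed(json_value, xpath):
    """Hypothetical variant of eval_xpath that also steps into lists by index."""
    elem = json_value
    for key in (k for k in xpath.strip("/").split("/") if k):
        if isinstance(elem, dict):
            elem = elem.get(key)
        elif isinstance(elem, list) and key.isdecimal() and int(key) < len(elem):
            # Assumption: numeric segments are zero-based list indices.
            elem = elem[int(key)]
        else:
            # The path cannot be followed any further.
            return None
    return elem


sample = {"head": {"refs": ["paper 2015", "paper1"],
                   "extension": {"info": ["V2032", "JD03-1"]}}}

print(eval_xpath_indexed(sample, "/head/refs/0"))            # paper 2015
print(eval_xpath_indexed(sample, "/head/extension/info/1"))  # JD03-1
print(eval_xpath_indexed(sample, "/head/refs/5"))            # None
```

Returning None for an out-of-range index keeps the behaviour consistent with a missing dictionary key.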

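4. Additional notes

When analyzing JSON, an online viewer such as https://www.json.cn helps you grasp the structure of a document quickly. Its output looks like this (the JSON data used is available for download):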

image
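
The structure can also be inspected locally, without an online tool. The following is a minimal sketch, not part of the original code, that lists every path that could be passed to eval_xpath; the helper name `iter_paths` and the small sample document are made up for illustration:

```python
import json


def iter_paths(value, prefix=""):
    """Yield (path, leaf) pairs for every non-dict value in a nested dict."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from iter_paths(child, prefix + "/" + key)
    else:
        # Lists are treated as leaves, matching what eval_xpath returns for them.
        yield prefix or "/", value


doc = json.loads('{"ver": "6.8", "head": {"sid": "1", "extension": {"lang": "chs"}}}')

for path, leaf in iter_paths(doc):
    print(path.ljust(30), leaf)
```

Each printed line pairs a path with the value stored there, which gives a quick overview of how deeply a document is nested.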
