How to Parse JSON in Python

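This guide shows how to import json and use it to parse JSON in Python, with the help of a handy JSON-to-Python conversion table. Whether you are an experienced Python developer or just getting started, this step-by-step tutorial will teach you how to parse JSON like a pro!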

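This tutorial covers an introduction to JSON in Python, parsing JSON data from strings, API responses, and files, converting Python data back to JSON, and the limitations of the json standard module.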

An Introduction to JSON in Python

Before digging into JSON parsing with Python, let's understand what JSON is and how to use it in Python.

What Is JSON?

JSON, short for JavaScript Object Notation, is a lightweight data-interchange format. It is simple for humans to read and write and easy for machines to parse and generate. This makes it one of the most popular data formats. Specifically, JSON has become the “language of the web” because it is commonly used for transmitting data between servers and web applications via APIs.

Below is an example of JSON:


{
  "name": "Maria Smith",
  "age": 32,
  "isMarried": true,
  "hobbies": ["reading", "jogging"],
  "address": {
    "street": "123 Main St",
    "city": "San Francisco",
    "state": "CA",
    "zip": "12345"
  },
  "phoneNumbers": [
    {
      "type": "home",
      "number": "555-555-1234"
    },
    {
      "type": "work",
      "number": "555-555-5678"
    }
  ],
  "notes": null
}

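As you can see, JSON consists of key-value pairs. Each key is a string, and each value is a string, a number, a boolean, null, an array, or an object. JSON resembles a JavaScript object, but it can be used with any programming language, including Python.

How to Handle JSON in Python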

Python natively supports JSON through the json module, which is part of the Python Standard Library. This means that you do not need to install any additional library to work with JSON in Python. You can import json as follows:

import json

The built-in Python json library exposes a complete API to deal with JSON. In particular, it has two key functions: load and loads. The loads function allows you to parse JSON data from a string. Note that despite its name appearing to be plural, the ending “s” stands for “string.” So, it should be read as “load-s.” On the other hand, the load function parses JSON data from a file-like object, such as an open text or binary file.

Through those two functions, json gives you the ability to convert JSON data to equivalent Python objects like dictionaries and lists. The companion dump and dumps functions go the other way, from Python objects to JSON. Plus, the json module allows you to create custom encoders and decoders to handle specific data types.
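As a minimal sketch of the difference between the two functions, the snippet below parses the same JSON text once with loads and once with load. The io.StringIO object is only a stand-in for an open file, and the sample data is made up for illustration:

```python
import io
import json

json_text = '{"name": "Maria Smith", "age": 32, "notes": null}'

# loads ("load string") parses JSON held in a str
from_string = json.loads(json_text)

# load parses JSON read from a file-like object;
# io.StringIO stands in for an open file here
from_file_like = json.load(io.StringIO(json_text))

print(from_string == from_file_like)  # True
print(from_string["notes"])           # None
```

Both calls produce the same dictionary, so the choice between them depends only on where the JSON text comes from.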

Keep reading and find out how to use the json library to parse JSON data in Python!

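Parsing JSON Data with Python

Let's look at some real-world examples and learn how to parse JSON data from different sources into Python data structures.

Converting a JSON String to a Python Dictionary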


Assume that you have some JSON data stored in a string and you want to convert it to a Python dictionary. This is what the JSON data looks like:

{
  "name": "iPear 23",
  "colors": ["black", "white", "red", "blue"],
  "price": 999.99,
  "inStock": true
}

And this is its string representation in Python:

smartphone_json = '{"name": "iPear 23", "colors": ["black", "white", "red", "blue"], "price": 999.99, "inStock": true}'

Consider using Python's triple-quote convention to store long, multi-line JSON strings.
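For instance, the same document could be stored as a triple-quoted string like this. The variable name smartphone_json_multiline is only illustrative, and the rest of this section keeps using the single-line smartphone_json string:

```python
import json

# the same JSON document, spread over several lines
# inside a triple-quoted string
smartphone_json_multiline = """
{
  "name": "iPear 23",
  "colors": ["black", "white", "red", "blue"],
  "price": 999.99,
  "inStock": true
}
"""

# whitespace around and inside the document is ignored by the parser
print(json.loads(smartphone_json_multiline)["name"])  # iPear 23
```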

You can verify that smartphone_json contains a valid Python string with the line below:

print(type(smartphone_json))

The output is:

<class 'str'>

str stands for “string” and means that the smartphone_json variable has the text sequence type.

Parse the JSON string contained in smartphone_json into a Python dictionary with the json.loads() function, as follows:

import json

# JSON string
smartphone_json = '{"name": "iPear 23", "colors": ["black", "white", "red", "blue"], "price": 999.99, "inStock": true}'
# from JSON string to Python dict
smartphone_dict = json.loads(smartphone_json)

# verify the type of the resulting variable
print(type(smartphone_dict)) # dict

Running this snippet produces the following result:

<class 'dict'>

Fantastic! smartphone_dict now contains a valid Python dictionary!

Thus, all you have to do to convert a JSON string to a Python dictionary is to pass a valid JSON string to json.loads().

You can now access the fields of the resulting dictionary as usual:

name = smartphone_dict['name'] # iPear 23
price = smartphone_dict['price'] # 999.99
colors = smartphone_dict['colors'] # ['black', 'white', 'red', 'blue']

Keep in mind that the json.loads() function will not always return a dictionary. Specifically, the returned data type depends on the input string. For example, if the JSON string contains a flat value, it will be converted to the corresponding Python primitive value:

import json
 
json_string = '15.5'
float_var = json.loads(json_string)

print(type(float_var)) # <class 'float'>

Similarly, a JSON string containing an array becomes a Python list:

import json
 
json_string = '[1, 2, 3]'
list_var = json.loads(json_string)
print(type(list_var)) # <class 'list'>

Take a look at the conversion table below to see how JSON values are converted to Python data by json:

| JSON Value | Python Data |
| --- | --- |
| string | str |
| number (integer) | int |
| number (real) | float |
| true | True |
| false | False |
| null | None |
| array | list |
| object | dict |
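The table can be checked with a short sketch that parses one value of each JSON type and prints the resulting Python type. The keys and values in sample are made up for illustration:

```python
import json

sample = '{"s": "text", "i": 42, "f": 3.14, "t": true, "n": null, "a": [1, 2], "o": {"k": "v"}}'
parsed = json.loads(sample)

# print the Python type that each JSON value was converted to
for key, value in parsed.items():
    print(key, type(value).__name__)
```

Note that both true and false map to Python's bool type, and null maps to None, whose type is NoneType.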

Converting a JSON API Response to a Python Dictionary

Consider that you need to make an API call and convert its JSON response to a Python dictionary. In the example below, we will call the following API endpoint from the {JSON} Placeholder project to get some fake JSON data:

https://jsonplaceholder.typicode.com/todos/1

That RESTful API returns the following JSON response:

{
  "userId": 1,
  "id": 1,
  "title": "delectus aut autem",
  "completed": false
}

You can call that API with the urllib module from the Standard Library and convert the resulting JSON to a Python dictionary as follows:

import urllib.request
import json

url = "https://jsonplaceholder.typicode.com/todos/1"

with urllib.request.urlopen(url) as response:
    body_json = response.read()

body_dict = json.loads(body_json)
user_id = body_dict['userId'] # 1

urllib.request.urlopen() performs the API call and returns an HTTPResponse object. Its read() method is then used to get the response body body_json, which holds the JSON document as bytes. Finally, json.loads() parses it into a Python dictionary as explained earlier, since it accepts bytes as well as strings.
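To see the bytes handling without a network call, here is a minimal offline sketch. The body_json literal is a stand-in for what response.read() would return:

```python
import json

# stand-in for what response.read() returns: raw bytes, not str
body_json = b'{"userId": 1, "id": 1, "title": "delectus aut autem", "completed": false}'

# json.loads() accepts bytes directly (UTF-8, UTF-16 or UTF-32 encoded)
body_dict = json.loads(body_json)

# decoding first is equivalent and makes the encoding explicit
body_dict_explicit = json.loads(body_json.decode("utf-8"))

print(body_dict == body_dict_explicit)  # True
print(body_dict["completed"])           # False
```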

Similarly, you can achieve the same result with the third-party requests library, which you can install with pip install requests:

import requests

url = "https://jsonplaceholder.typicode.com/todos/1"
response = requests.get(url)

body_dict = response.json()
user_id = body_dict['userId'] # 1

Note that the .json() method automatically transforms the response object containing JSON data into the respective Python data structure.

Great! You now know how to parse a JSON API response in Python with both urllib and requests.

Loading a JSON File into a Python Dictionary

Suppose you have some JSON data stored in a smartphone.json file as below:

{
  "name": "iPear 23",
  "colors": ["black", "white", "red", "blue"],
  "price": 999.99,
  "inStock": true,
  "dimensions": {
    "width": 2.82,
    "height": 5.78,
    "depth": 0.30
  },
  "features": [
    "5G",
    "HD display",
    "Dual camera"
  ]
}

The goal is to read this JSON file and load it into a Python dictionary. The snippet below achieves that:

import json

with open('smartphone.json') as file:
  smartphone_dict = json.load(file)

print(type(smartphone_dict)) # <class 'dict'>
features = smartphone_dict['features'] # ['5G', 'HD display', 'Dual camera']

The built-in open() function allows you to load a file and get its corresponding file object. The json.load() function then deserializes the text file or binary file containing a JSON document to the equivalent Python object. In this case, smartphone.json becomes a Python dictionary.
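The sketch below illustrates the text-file and binary-file cases side by side. It writes a small throwaway file to a temporary directory so that it runs anywhere. The path and the shortened file content are assumptions made for the example:

```python
import json
import os
import tempfile

# create a throwaway JSON file so the example is self-contained
# (hypothetical path; the article uses smartphone.json)
path = os.path.join(tempfile.mkdtemp(), "smartphone.json")
with open(path, "w", encoding="utf-8") as file:
    file.write('{"name": "iPear 23", "features": ["5G", "HD display"]}')

# text mode, with an explicit encoding
with open(path, encoding="utf-8") as file:
    from_text = json.load(file)

# binary mode works too, because json.load() only needs a .read() method
with open(path, "rb") as file:
    from_binary = json.load(file)

print(from_text == from_binary)  # True
print(from_text["features"])     # ['5G', 'HD display']
```

Passing encoding="utf-8" explicitly in text mode avoids depending on the platform's default encoding.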

Perfect! With just a few lines of code, you parsed a JSON file in Python.

From JSON Data to a Custom Python Object

Now, you want to parse some JSON data into a custom Python class. This is what your custom Smartphone Python class looks like:

class Smartphone:
    def __init__(self, name, colors, price, in_stock):
        self.name = name    
        self.colors = colors
        self.price = price
        self.in_stock = in_stock

Here, the goal is to convert the following JSON string to a Smartphone instance:

{
  "name": "iPear 23 Plus",
  "colors": ["black", "white", "gold"],
  "price": 1299.99,
  "inStock": false
}

To accomplish this task, you need to create a custom decoder. In detail, you have to extend the JSONDecoder class and set the object_hook parameter in the __init__ method. Assign it the method containing the custom parsing logic. The decoder calls that method with the standard dictionary it builds for each JSON object, and you can use the values in that dictionary to instantiate a Smartphone object.

Define a custom SmartphoneDecoder as below:

import json
 
class SmartphoneDecoder(json.JSONDecoder):
    def __init__(self, **kwargs):
        # always decode JSON objects through the custom object_hook method
        kwargs['object_hook'] = self.object_hook
        super().__init__(**kwargs)

    # instance method containing the
    # custom parsing logic
    def object_hook(self, json_dict):
        new_smartphone = Smartphone(
            json_dict.get('name'), 
            json_dict.get('colors'), 
            json_dict.get('price'),
            json_dict.get('inStock'),            
        )

        return new_smartphone

Note that you should use the get() method to read the dictionary values within the custom object_hook() method. This ensures that no KeyError is raised if a key is missing from the dictionary. Instead, None is returned.

You can now pass the SmartphoneDecoder class to the cls parameter in json.loads() to convert a JSON string to a Smartphone object:

import json

# class Smartphone:
# ...

# class SmartphoneDecoder(json.JSONDecoder): 
# ...

smartphone_json = '{"name": "iPear 23 Plus", "colors": ["black", "white", "gold"], "price": 1299.99, "inStock": false}'

smartphone = json.loads(smartphone_json, cls=SmartphoneDecoder)
print(type(smartphone)) # <class '__main__.Smartphone'>
name = smartphone.name # iPear 23 Plus

Similarly, you can use SmartphoneDecoder with json.load():

smartphone = json.load(smartphone_json_file, cls=SmartphoneDecoder)

There you go! You now know how to parse JSON data into a custom Python object!

Converting Python Data to JSON
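One detail worth keeping in mind is that an object hook runs for every JSON object in the document, including nested ones. The sketch below is one possible way to handle that, not the approach used above. It passes a plain function through the object_hook parameter of json.loads() and converts only the dictionaries that look like a smartphone. The smartphone_hook name and the key check are assumptions made for illustration:

```python
import json

class Smartphone:
    def __init__(self, name, colors, price, in_stock):
        self.name = name
        self.colors = colors
        self.price = price
        self.in_stock = in_stock

def smartphone_hook(json_dict):
    # hypothetical guard: convert only objects that look like a smartphone
    if "name" in json_dict and "price" in json_dict:
        return Smartphone(
            json_dict.get("name"),
            json_dict.get("colors"),
            json_dict.get("price"),
            json_dict.get("inStock"),
        )
    # any other object (for example a nested "dimensions" object) stays a dict
    return json_dict

smartphone_json = '{"name": "iPear 23", "price": 999.99, "dimensions": {"width": 2.82}}'
smartphone = json.loads(smartphone_json, object_hook=smartphone_hook)

print(type(smartphone).__name__)           # Smartphone
print(smartphone.colors)                   # None, the key is missing
print(smartphone_hook({"width": 2.82}))    # {'width': 2.82}
```

Without such a guard, a nested object like "dimensions" would also be turned into a Smartphone instance with all fields set to None.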

You can also go the other way around and convert Python data structures and primitives to JSON. This is possible thanks to the json.dump() and json.dumps() functions, which follow the conversion table below:

| Python Data | JSON Value |
| --- | --- |
| str | string |
| int | number (integer) |
| float | number (real) |
| True | true |
| False | false |
| None | null |
| list | array |
| dict | object |
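As a quick check of this table, the sketch below serializes one value of each Python type with json.dumps(). The dictionary keys are made up for illustration:

```python
import json

python_data = {
    "text": "hello",
    "integer": 7,
    "real": 2.5,
    "yes": True,
    "no": False,
    "nothing": None,
    "items": [1, 2],
    "nested": {"k": "v"},
}

# True/False/None become true/false/null in the JSON output
print(json.dumps(python_data))
```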

json.dump() allows you to write a JSON string to a file, as in the following example:

import json

user_dict = {
    "name": "John",
    "surname": "Williams",
    "age": 48,
    "city": "New York"
}

# serializing the sample dictionary to a JSON file
with open("user.json", "w") as json_file:
    json.dump(user_dict, json_file)

This snippet will serialize the Python user_dict variable into the user.json file.

Similarly, json.dumps() converts a Python variable to its equivalent JSON string:

import json

user_dict = {
    "name": "John",
    "surname": "Williams",
    "age": 48,
    "city": "New York"
}

user_json_string = json.dumps(user_dict)

print(user_json_string)

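Running this snippet produces:

{"name": "John", "surname": "Williams", "age": 48, "city": "New York"}

This is the exact JSON representation of the Python dictionary.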

Note that you can also specify a custom encoder, but showing how to do it is not the purpose of this article. Follow the official documentation to learn more.

Is the json Standard Module the Best Resource for Parsing JSON in Python?

As is true in general for data parsing, JSON parsing comes with challenges that cannot be overlooked. For example, in case of invalid, broken, or non-standard JSON, the Python json module would fall short.
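To make this concrete, json.loads() raises json.JSONDecodeError when the input is not valid JSON. The sketch below wraps the call in a hypothetical parse_or_none helper, which is not part of the json module, to show one way to handle that case:

```python
import json

def parse_or_none(text):
    """Hypothetical helper: return parsed data, or None when text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as error:
        # msg, lineno and colno describe where parsing failed
        print(f"invalid JSON: {error.msg} (line {error.lineno}, column {error.colno})")
        return None

print(parse_or_none('{"name": "iPear 23"}'))   # {'name': 'iPear 23'}
print(parse_or_none("{'name': 'iPear 23'}"))   # None, single quotes are not valid JSON
```

Catching the error tells you that the input is broken, but it does not repair it. Recovering data from non-standard JSON still requires custom logic or a more tolerant parser.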

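You also need to be careful when parsing JSON data from untrusted sources, because a malicious JSON string can crash your parser or consume a large amount of resources. This is just one of the challenges a Python JSON parser has to take into account.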

You could introduce custom logic to deal with these particular cases. At the same time, that might take too long and result in complex and unreliable code. For this reason, you should consider a commercial tool that makes JSON parsing easier, such as Web Scraper IDE.

Web Scraper IDE is designed specifically for developers and comes with a wide range of features for parsing JSON content and more. The tool can save you a lot of time and help you parse JSON safely. It also comes with Bright Data's unblocking proxy capabilities, so you can call JSON APIs anonymously.

If you are in a hurry, you might also be interested in our Data as a Service offer. Through this service, you can ask Bright Data to provide you with a custom dataset that fits your specific needs. Bright Data will take care of everything, from performance to data quality.

Parsing JSON data has never been easier!

Conclusion

Python enables you to natively parse JSON data through the json standard module. This exposes a powerful API to serialize and deserialize JSON content. Specifically, it offers the json.load() and json.loads() functions to deal with JSON files and JSON strings, respectively. Here, you saw how to use them to parse JSON data in Python in several real-world examples. At the same time, you also learned about the limitations of this approach. This is why you may want to try a cutting-edge, fully-featured, commercial solution for data parsing, such as Bright Data’s Web Scraper IDE.