Reading and Writing JSON in Python

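Python and JSON are a natural match. Whether you are building a REST API with FastAPI or Django, processing a data pipeline, or reading a configuration file, you will run into JSON constantly. The good news is that the json module in Python's standard library covers everything you need for everyday work. There is nothing to install with pip.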

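The Four Functions You Actually Use

The json module provides four functions for day-to-day work:

json.loads(str): parse a JSON string into a Python object
json.dumps(obj): convert a Python object into a JSON string
json.load(file): parse JSON directly from a file object
json.dump(obj, file): write a Python object to a file as JSON

The s in loads / dumps stands for string. The functions without the s work with file objects. Once you know that rule, the names are easy to remember.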
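The pairing described above can be sketched in a few lines. This example uses io.StringIO as a stand-in for a real file so that it runs without touching disk; the sample record is made up for illustration.

```python
import io
import json

record = {"id": 7, "tags": ["a", "b"]}  # hypothetical sample data

# String pair: dumps() produces a str, loads() parses it back.
text = json.dumps(record)
from_text = json.loads(text)

# File pair: dump() writes to a file-like object, load() reads from one.
# io.StringIO stands in for a file opened in text mode.
buffer = io.StringIO()
json.dump(record, buffer)
buffer.seek(0)  # rewind before reading
from_file = json.load(buffer)

print(from_text == record, from_file == record)  # True True
```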

json.loads(): Parsing a JSON String

python

import json

json_string = '{"name": "Alice", "age": 30, "active": true, "score": 98.5}'

user = json.loads(json_string)

print(user["name"])    # Alice
print(user["age"])     # 30
print(user["active"])  # True
print(type(user))      # <class 'dict'>

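Note the type mapping: JSON true becomes Python True, JSON false becomes Python False, and JSON null becomes Python None. A JSON object becomes a Python dict and a JSON array becomes a Python list. JSON numbers become int when they have no fractional part or exponent, and float otherwise.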
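To see the whole mapping at once, here is a small sketch that parses a document containing every JSON type and records the Python type of each value. The document and its key names are invented for this example.

```python
import json

# Hypothetical document that touches every JSON type.
doc = (
    '{"obj": {}, "arr": [1, 2], "str": "x", "int": 3, '
    '"real": 3.5, "yes": true, "no": false, "nothing": null}'
)
parsed = json.loads(doc)

# Map each key to the name of the Python type it was parsed into.
types = {key: type(value).__name__ for key, value in parsed.items()}
for key, type_name in types.items():
    print(key, "->", type_name)
```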

json.dumps(): Serializing to a JSON String

python

import json

user = {
    "name": "Bob",
    "age": 25,
    "roles": ["admin", "editor"],
    "active": True,
    "extra": None
}

# Single-line form (suited to network transfer; pass separators=(",", ":") to drop the spaces as well)
compact = json.dumps(user)
print(compact)
# {"name": "Bob", "age": 25, "roles": ["admin", "editor"], "active": true, "extra": null}

# Pretty-printed (suited to logs and human readers)
pretty = json.dumps(user, indent=2)
print(pretty)
# {
#   "name": "Bob",
#   "age": 25,
#   "roles": [
#     "admin",
#     "editor"
#   ],
#   "active": true,
#   "extra": null
# }

The type mapping works in reverse as well: Python True becomes JSON true, and Python None becomes JSON null. Python handles this automatically.
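One thing worth knowing about the reverse mapping is that a round trip is not always lossless, because JSON has fewer types than Python. A minimal sketch with made-up data: tuples are written as arrays and come back as lists, and non-string dict keys such as int are written as strings.

```python
import json

original = {"point": (1, 2), 1: "int key"}  # hypothetical sample data

# Encode to JSON text, then decode again.
restored = json.loads(json.dumps(original))

print(restored)              # {'point': [1, 2], '1': 'int key'}
print(restored == original)  # False: the tuple and the int key did not survive
```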

Reading JSON from a File

One of the most common use cases is reading a configuration or data file at startup:

python

import json

# Read and parse in one step
with open("config.json", "r", encoding="utf-8") as f:
    config = json.load(f)

print(config["database"]["host"])  # localhost
print(config["database"]["port"])  # 5432

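Always pass encoding="utf-8" when you open a JSON file. RFC 8259 requires UTF-8 for JSON exchanged between systems, and if you leave the argument out, open() falls back to the platform's locale encoding. On Windows that is often cp1252, in which case non-ASCII text can be misread or fail to decode.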

Writing JSON to a File

python

import json

results = {
    "timestamp": "2024-01-15T09:30:00Z",
    "total": 1523,
    "processed": 1521,
    "failed": 2,
    "errors": [
        {"id": 42, "reason": "missing field"},
        {"id": 99, "reason": "invalid format"}
    ]
}

with open("results.json", "w", encoding="utf-8") as f:
    json.dump(results, f, indent=2)

print("Results saved to results.json")

Handling Errors Properly

json.loads() raises json.JSONDecodeError (a subclass of ValueError) when the input is not valid JSON, and json.load() does the same. Always handle it when you parse data you do not control:

python

import json

def safe_parse(json_str):
    try:
        return json.loads(json_str)
    except json.JSONDecodeError as e:
        print(f"Invalid JSON at line {e.lineno}, column {e.colno}: {e.msg}")
        return None

data = safe_parse('{"name": "Alice"}')   # works
bad  = safe_parse('not json at all')     # prints the error, returns None
also_bad = safe_parse('{"key": }')       # prints the error with its position

JSONDecodeError reports the exact line and column where parsing failed, which is useful when you are debugging a large JSON file.
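Besides lineno, colno, and msg, the exception also carries pos (the character offset) and doc (the text being parsed), which is enough to print the text around the failure. The following is a rough sketch; describe_error and its context parameter are hypothetical names chosen for this example.

```python
import json

def describe_error(json_str, context=10):
    """Return a short description of the first syntax error, or None if valid."""
    try:
        json.loads(json_str)
    except json.JSONDecodeError as e:
        # e.pos is the character offset into e.doc where parsing failed.
        start = max(e.pos - context, 0)
        snippet = e.doc[start:e.pos + context]
        return f"{e.msg} at line {e.lineno}, column {e.colno}, near {snippet!r}"
    return None

print(describe_error('{"a": 1,\n "b": }'))
print(describe_error('{"a": 1}'))  # None
```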

Useful Options for dumps()

python

import json

data = {
    "z_key": 1,
    "a_key": 2,
    "price": 9.999999999
}

# Sort keys alphabetically (useful for reproducible output and diffs)
print(json.dumps(data, sort_keys=True, indent=2))
# {
#   "a_key": 2,
#   "price": 9.999999999,
#   "z_key": 1
# }

# Keep non-ASCII characters as-is (the default escapes them as \uXXXX)
data2 = {"city": "München", "greeting": "こんにちは"}
print(json.dumps(data2, ensure_ascii=False))
# {"city": "München", "greeting": "こんにちは"}

# With ensure_ascii=True (the default):
print(json.dumps(data2))
# {"city": "M\u00fcnchen", "greeting": "\u3053\u3093\u306b\u3061\u306f"}

When I write JSON files that contain non-ASCII text, I always add ensure_ascii=False. The escaped version is still technically valid JSON, but it is much harder to read in a text editor.
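When writing to a file, ensure_ascii=False only works reliably if the file itself is opened with an encoding that can represent the characters. Here is a minimal sketch that writes to a temporary directory and reads the result back; the file name is a placeholder.

```python
import json
import tempfile
from pathlib import Path

data = {"greeting": "こんにちは"}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "greeting.json"  # hypothetical file name

    # The file encoding must be able to represent the unescaped characters.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False)

    raw_text = path.read_text(encoding="utf-8")
    with open(path, "r", encoding="utf-8") as f:
        round_tripped = json.load(f)

print(raw_text)               # {"greeting": "こんにちは"}
print(round_tripped == data)  # True
```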

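Serializing Custom Objects

By default, json.dumps() cannot serialize custom class instances or datetime objects and raises TypeError. Two approaches are shown here: subclass json.JSONEncoder, or pass a conversion function through the default= parameter. (Converting the object to a dict yourself before calling dumps() also works.)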

python

import json
from datetime import datetime, date

# Option 1: a custom encoder class
class AppEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, (datetime, date)):
            return obj.isoformat()
        return super().default(obj)

data = {"name": "Alice", "created_at": datetime(2024, 1, 15, 9, 30)}
print(json.dumps(data, cls=AppEncoder, indent=2))
# {
#   "name": "Alice",
#   "created_at": "2024-01-15T09:30:00"
# }

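# Option 2: the default= parameter (simpler for one-off conversions)
# str() is applied to unknown types, so the datetime becomes "2024-01-15 09:30:00" (a space, not "T")
print(json.dumps(data, default=str, indent=2))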
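For custom classes, one way to "convert to a dict first" without doing it by hand at every call site is a default= function that knows about dataclasses. The sketch below assumes the objects are dataclasses; User and to_jsonable are hypothetical names introduced only for this example.

```python
import json
from dataclasses import asdict, dataclass, is_dataclass
from datetime import date, datetime

@dataclass
class User:  # hypothetical class for illustration
    name: str
    joined: date

def to_jsonable(obj):
    """Fallback passed as default=; called only for objects json cannot encode."""
    if is_dataclass(obj) and not isinstance(obj, type):
        return asdict(obj)  # dataclass instance -> plain dict
    if isinstance(obj, (datetime, date)):
        return obj.isoformat()
    # Mirror the standard behavior for anything still unsupported.
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

encoded = json.dumps(User("Alice", date(2024, 1, 15)), default=to_jsonable)
print(encoded)  # {"name": "Alice", "joined": "2024-01-15"}
```

Unlike default=str, this fallback raises TypeError for types it does not recognize, so unexpected objects are not silently turned into strings.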

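A Practical Pattern: Loading a Config File

Here is a practical pattern that turns up in almost every Python project: a config loader that reads a JSON configuration file and falls back to sensible defaults: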

python

import copy
import json
from pathlib import Path

DEFAULTS = {
    "database": {"host": "localhost", "port": 5432},
    "debug": False,
    "log_level": "INFO"
}

def load_config(path="config.json"):
    # Deep copy so that merging never mutates the nested dicts inside DEFAULTS
    config = copy.deepcopy(DEFAULTS)

    config_path = Path(path)
    if config_path.exists():
        with open(config_path, "r", encoding="utf-8") as f:
            try:
                user_config = json.load(f)
                # One-level merge: user settings override the defaults
                for key, value in user_config.items():
                    if isinstance(value, dict) and isinstance(config.get(key), dict):
                        config[key].update(value)
                    else:
                        config[key] = value
            except json.JSONDecodeError as e:
                print(f"Warning: {config_path} is invalid ({e.msg}), using defaults")

    return config

config = load_config()
print(config["database"]["host"])  # localhost (or the overridden value)
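The loader above merges nested dictionaries only one level deep. If a configuration nests further than that, a recursive merge is one option. The following is a sketch under that assumption; deep_merge is a hypothetical helper, not part of the json module, and the sample settings are invented.

```python
import copy

def deep_merge(base, override):
    """Return a new dict: override's values win, nested dicts merge recursively."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged

# Hypothetical defaults and user settings with two levels of nesting.
defaults = {
    "database": {"host": "localhost", "port": 5432, "pool": {"size": 5, "timeout": 30}},
    "debug": False,
}
user = {"database": {"pool": {"size": 20}}, "debug": True}

merged = deep_merge(defaults, user)
print(merged["database"]["pool"])  # {'size': 20, 'timeout': 30}
print(merged["database"]["host"])  # localhost
```

Because the function returns a new dictionary, neither the defaults nor the user settings are modified.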

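Wrapping Up

Python's json module gives you everything you need without any dependencies. The key rules: use loads()/dumps() for strings and load()/dump() for files, always handle JSONDecodeError when parsing external data, and add ensure_ascii=False when your data contains non-ASCII characters. When you are debugging JSON data, a JSON formatter and a JSON validator will save you a lot of time.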
