I often work with complex, structured JSON responses that have multiple levels of nesting. The recurring problem is that the API returns nested JSON, and I have to walk several levels to reach the data using chains like x.get('y')[0].get(...).
For this reason, I went looking for a cleaner, safer approach that would at least spare me from calling get() over and over.
The Problem
Let’s say you’re working with user data from an API:
user_data = {
    "user": {
        "profile": {
            "name": "John Doe",
            "contacts": [
                {"type": "email", "value": "[email protected]"},
                {"type": "phone", "value": "123-456-7890"}
            ],
            "preferences": {
                "notifications": {
                    "email": True,
                    "sms": False
                }
            }
        }
    }
}
To extract the email, you typically write:
# The verbose way I just mentioned.
try:
    email = user_data["user"]["profile"]["contacts"][0]["value"]
except (KeyError, IndexError, TypeError):
    email = None
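The chained get() style mentioned in the introduction fails in similar ways. The sketch below is only an illustration: the trimmed payload, the placeholder address john@example.com, and the `partial` response are assumptions made up for the example, not data from a real API.

```python
# Trimmed sample payload; the email address is a placeholder.
user_data = {
    "user": {
        "profile": {
            "contacts": [{"type": "email", "value": "john@example.com"}],
        }
    }
}

# Works only while every level is present.
email = user_data.get("user").get("profile").get("contacts")[0].get("value")

# Hypothetical response where "profile" is missing entirely.
partial = {"user": {}}

try:
    partial.get("user").get("profile").get("contacts")
    chain_error = None
except AttributeError as exc:
    # .get("profile") returned None, and None has no .get() method.
    chain_error = type(exc).__name__

# Passing {} and [] as defaults avoids the AttributeError,
# but indexing the resulting empty list still raises IndexError.
try:
    partial.get("user", {}).get("profile", {}).get("contacts", [])[0]
    index_error = None
except IndexError as exc:
    index_error = type(exc).__name__

print(email, chain_error, index_error)
```

Either way, every lookup ends up wrapped in a try/except or padded with defaults, which is the repetition the helper in the next section is meant to remove.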
A Clean Solution
This simple helper function makes nested data traversal much easier:
def get_nested_value(data, keys, default=None):
    """
    Extract value from nested dict/list using a path of keys.

    Args:
        data: The nested data structure (dict/list)
        keys: List of keys representing the path to the value
        default: Value to return if path doesn't exist

    Returns:
        The value at the specified path, or default if not found
    """
    try:
        for key in keys:
            if isinstance(data, dict):
                data = data.get(key)
                if data is None:
                    return default
            elif isinstance(data, list):
                if isinstance(key, int) and 0 <= key < len(data):
                    data = data[key]
                elif len(data) > 0 and isinstance(data[0], dict):
                    # For non-integer keys on lists, try first dict element
                    data = data[0].get(key)
                    # Treat a missing key here like any other missing key
                    if data is None:
                        return default
                else:
                    return default
            else:
                return default
        return data
    except (AttributeError, TypeError, IndexError):
        return default
Now extracting data becomes simple:
# Clean and readable
email = get_nested_value(user_data, ["user", "profile", "contacts", 0, "value"])
name = get_nested_value(user_data, ["user", "profile", "name"])
email_notifications = get_nested_value(user_data, ["user", "profile", "preferences", "notifications", "email"])
print(f"Name: {name}")
print(f"Email: {email}")
print(f"Email notifications: {email_notifications}")
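It may help to see how the helper behaves when the path is incomplete. The sketch below restates the function so that it runs on its own; the `sample` payload, the "nickname" and "address" keys, and the placeholder address john@example.com are assumptions invented for the illustration.

```python
def get_nested_value(data, keys, default=None):
    # Restated so the snippet runs on its own; a missing key in the
    # list fallback is treated like any other missing key.
    try:
        for key in keys:
            if isinstance(data, dict):
                data = data.get(key)
                if data is None:
                    return default
            elif isinstance(data, list):
                if isinstance(key, int) and 0 <= key < len(data):
                    data = data[key]
                elif len(data) > 0 and isinstance(data[0], dict):
                    data = data[0].get(key)
                    if data is None:
                        return default
                else:
                    return default
            else:
                return default
        return data
    except (AttributeError, TypeError, IndexError):
        return default


# Hypothetical payload with an explicit None and a falsy value.
sample = {
    "user": {
        "profile": {
            "nickname": None,
            "contacts": [{"type": "email", "value": "john@example.com"}],
            "preferences": {"notifications": {"sms": False}},
        }
    }
}

# A missing branch returns the supplied default.
city = get_nested_value(sample, ["user", "profile", "address", "city"], default="unknown")

# A string key applied to a list falls back to the first dict in that list.
first_value = get_nested_value(sample, ["user", "profile", "contacts", "value"])

# A stored None is indistinguishable from a missing key.
nickname = get_nested_value(sample, ["user", "profile", "nickname"], default="n/a")

# Falsy values other than None are returned as they are.
sms = get_nested_value(sample, ["user", "profile", "preferences", "notifications", "sms"], default=True)

print(city, first_value, nickname, sms)
```

Because the helper compares against None rather than testing truthiness, values such as False, 0, or an empty string survive, while an explicit null in the JSON is replaced by the default.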
Handling Multiple Items from Lists
Sometimes you need all items from a list, not just the first one:
def get_all_nested_values(data, keys, target_key):
    """
    Extract values from all items in a nested list.

    Args:
        data: The nested data structure
        keys: Path to the list
        target_key: Key to extract from each list item

    Returns:
        List of values found
    """
    try:
        # Navigate to the list
        for key in keys:
            if isinstance(data, dict):
                data = data.get(key)
                if data is None:
                    return []
            else:
                return []
        # Extract target_key from each item in the list
        if isinstance(data, list):
            return [
                item.get(target_key)
                for item in data
                if isinstance(item, dict) and target_key in item
            ]
        return []
    except (AttributeError, TypeError):
        return []
# Get all contact values
contacts = get_all_nested_values(user_data, ["user", "profile", "contacts"], "value")
print(f"All contacts: {contacts}") # ['[email protected]', '123-456-7890']
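A few edge cases of this second helper are worth spelling out. The sketch below restates the function so that it runs on its own; the `payload` contents, including the malformed entries and the placeholder address john@example.com, are assumptions made up for the illustration.

```python
def get_all_nested_values(data, keys, target_key):
    # Restated so the snippet runs on its own.
    try:
        for key in keys:
            if isinstance(data, dict):
                data = data.get(key)
                if data is None:
                    return []
            else:
                return []
        if isinstance(data, list):
            return [
                item.get(target_key)
                for item in data
                if isinstance(item, dict) and target_key in item
            ]
        return []
    except (AttributeError, TypeError):
        return []


# Hypothetical payload with some irregular list entries.
payload = {
    "user": {
        "profile": {
            "contacts": [
                {"type": "email", "value": "john@example.com"},
                {"type": "phone"},               # no "value" key: skipped
                "not-a-dict",                    # non-dict item: skipped
                {"type": "fax", "value": None},  # key present: None is kept
            ]
        }
    }
}

path = ["user", "profile", "contacts"]
values = get_all_nested_values(payload, path, "value")
types = get_all_nested_values(payload, path, "type")

# A missing path, or a path that does not end at a list, yields [].
missing = get_all_nested_values(payload, ["user", "settings"], "value")
not_a_list = get_all_nested_values(payload, ["user", "profile"], "value")

# Unlike get_nested_value, the path here only walks dicts, so a list
# index inside the path also yields [].
indexed = get_all_nested_values(payload, path + [0], "value")

print(values, types, missing, not_a_list, indexed)
```

Note that this helper checks for the presence of the key rather than for None, so an explicit null inside a list item is kept in the result.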
Conclusion
This approach provides a clean, safe way to traverse nested data structures. It’s particularly useful when the structure might change or some fields might be optional.
I use this helper function pretty much everywhere in my Django REST Framework serializers: I feed the JSON directly to the serializer, which then runs this function to pull out the fields it needs.