Searching on JSON Values in MySQL 7

The MySQL 5.7.7 JSON Lab release introduced a native JSON datatype, which opened the door for handling JSON data in ways that were previously impossible. In the Working with JSON Data in MySQL 7 tutorial, I presented a general overview of the new functions that were introduced to make working with the JSON data type easier.  Several new functions assist in the selecting of data from a JSON document.  We covered a few of them, such as JSON_EXTRACT(), JSON_ARRAY(), JSON_OBJECT(), and JSON_MERGE().  In today’s article, we’ll learn about some additional functions for searching on JSON values in MySQL 7.

Getting Information about Stored JSON Documents

Due to the complex structure of JSON objects, you’ll often want to get the lay of the land before attempting to retrieve any data from a JSON field.  There are a couple of functions to help with that.

JSON_LENGTH()

The JSON_LENGTH() function tells us how many elements each stored document has.  Using the following data as an example:

contacts table
+----------------------------------------------------------------------------+
| basic_info                                                                 |
+----------------------------------------------------------------------------+
| {"first_name": "Tom", "last_name": "Hardy", "age": 35, "sex": "Male"}      |              
+----------------------------------------------------------------------------+
| {"first_name": "Nancy", "last_name": "Taylor", "age": 50, "sex": "Female"} |  
+----------------------------------------------------------------------------+
| {"first_name": "Bill", "last_name": "Summers", "age": 25, "sex": "Male"}   |  
+----------------------------------------------------------------------------+

Here is how we would utilize JSON_LENGTH() to find out how many attributes each contact has:

mysql> select json_length(basic_info) from contacts as obj_length;
+----------------+
| obj_length     |
+----------------+
|              3 |
|              3 |
|              3 |
+----------------+

JSON_LENGTH() also works on JSON arrays, so that if the above data were contained in an array:

clique table
+------------------------------------------------------------------------------+
| members                                                                      |
+------------------------------------------------------------------------------+
| [{"first_name": "Tom", "last_name": "Hardy", "age": 35, "sex": "Male"},      |            
|  {"first_name": "Nancy", "last_name": "Taylor", "age": 50, "sex": "Female"}, |  
|  {"first_name": "Bill", "last_name": "Summers", "age": 25, "sex": "Male"}]   |
+------------------------------------------------------------------------------+
| [{"first_name": "Fred", "last_name": "Jones", "age": 59, "sex": "Male"},     |            
| {"first_name": "Brad", "last_name": "Johnson", "age": 44, "sex": "Male"}]    |   
+------------------------------------------------------------------------------+
| [{"first_name": "Tina", "last_name": "Summers", "age": 55, "sex": "Female"}] |           
+------------------------------------------------------------------------------+

…it would now return one value of 3, which represents the number of array elements:

mysql> select json_length(basic_info) 
from contacts as obj_length;
+----------------+
| obj_length     |
+----------------+
|              3 |
|              2 |
|              1 |
+----------------+

What does this tell us?  That Tina needs to get out and make some friends.

To find the length of nested values, the JSON_LENGTH()  function may accept a path as the second argument. Supplying a path tells JSON_LENGTH()  to return the length of the value within the document identified by the path:

mysql> SELECT JSON_LENGTH('{"item1": "A", "item2": {"nested_item1": 99}}', '$.item2')         
                                                                             as item2_length;
+---------------+
| item2_length  |
+---------------+
|             1 |
+---------------+

JSON_DEPTH()

Similar to the JSON_LENGTH() function, JSON_DEPTH() can be utilized to retrieve the maximum depth in a document.  The depth count begins at the document root, so that an empty array/object or scalar value has a depth of 1:

mysql> SELECT JSON_DEPTH('{}'), JSON_DEPTH('[]'), JSON_DEPTH('test val');
+------------------+------------------+------------------------+
| JSON_DEPTH('{}') | JSON_DEPTH('[]') | JSON_DEPTH('test val') |
+------------------+------------------+------------------------+
|                1 |                1 |                      1 |
+------------------+------------------+------------------------+

A non-empty array containing either an empty object or scalar element values will have a depth of 2:

mysql> SELECT JSON_DEPTH('[1, 2, 3, 4]'), JSON_DEPTH('[[], {}, "test"]');
+----------------------------+--------------------------------+
| JSON_DEPTH('[1, 2, 3, 4]') | JSON_DEPTH('[[], {}, "test"]') |
+----------------------------+--------------------------------+
|                          2 |                              2 |
+----------------------------+--------------------------------+  
 
 

On a single document, the depth starts at 1 and increases for each nested object.  Hence, the following document returns a depth of 3:

mysql> SELECT JSON_DEPTH('{"item1": "A", "item2": {"nested_item1": 99}');
+------------------------------------------------------------+
| JSON_DEPTH('{"item1": "A", "item2": {"nested_item1": 99}') |
+------------------------------------------------------------+
|                                                          3 |
+------------------------------------------------------------+

Note that the JSON_DEPTH() function does not accept a path as the second argument, so the reported depth is always relative to the document root.

JSON_KEYS()

Perhaps you have some experience with the PHP array_keys() function?  JSON_KEYS() is the near equivalent for JSON objects.  It returns the keys from the top-level of a JSON document as a JSON array. It does support the path argument, so including one will then return the top-level keys from the selected path:

mysql> SELECT JSON_KEYS('{"item1": "A", "item2": {"nested_item1": 99}');
+------------------------------------------------------------+
| JSON_KEYS('{"item1": "A", "item2": {"nested_item1": 99}')  |
+------------------------------------------------------------+
| ["item1", "item2"]                                         |
+------------------------------------------------------------+

Note that the return value does not include keys from nested objects. To get those, include the path argument:

mysql> SELECT JSON_KEYS('{"item1": "A", "item2": {"nested_item1": 99}', '$.item2');
+------------------------------------------------------------+
| JSON_KEYS('{"item1": "A", "item2": {"nested_item1": 99}')  |
+------------------------------------------------------------+
| ["nested_item1"]                                           |
+------------------------------------------------------------+

JSON_CONTAINS_PATH()

Similar to PHP’s array_key_exists() function, JSON_CONTAINS_PATH() returns a value of 1 if a JSON document contains data at a given path (or paths).  Otherwise it returns 0.  The reason that I added paths inside parentheses is that  JSON_CONTAINS_PATH() accepts a variable number of arguments to accommodate multiple paths:

JSON_CONTAINS_PATH(json_doc, one_or_all, path[, path] ...)

With regards to the handling of multiple paths, the return value depends on the second one_or_all argument:

  • ‘one’: return 1 if at least one path exists within the document, 0 otherwise.
  • ‘all’: return 1 if all paths exist within the document, 0 otherwise.
mysql> SELECT JSON_CONTAINS_PATH('{"item1": "A", "item2": {"nested_item1": 99}',      
                                 'one', 
                                 '$.item1', '$.item2', '$.Item3') AS 'Contains 3 Items?';
+---------------------------------------------+
| Contains 3 Items?                           |
+---------------------------------------------+
|                                           1 |
+---------------------------------------------+

In the above query, JSON_CONTAINS_PATH() returns 1 because the JSON document contains at least one of the path arguments. In fact, it contains two out of three passed paths.

Change the one_or_all argument to ‘all’ and now it returns 0, because there is no ‘$.Item3’:

mysql> SELECT JSON_CONTAINS_PATH('{"item1": "A", "item2": {"nested_item1": 99}',      
                                 'all', 
                                 '$.item1', '$.item2', '$.Item3') AS 'Contains 3 Items?';
+---------------------------------------------+
| Contains 3 Items?                           |
+---------------------------------------------+
|                                           0 |
+---------------------------------------------+

JSON_CONTAINS()

And that brings us to the main topic of this article: searching for JSON values.   The JSON_CONTAINS() function returns 1 if a specific value is contained within a JSON document, or, if a path argument is supplied, at a specific path within the document.  A value of 0 is returned if the value is not found.

mysql> SELECT JSON_CONTAINS('{"item1": "A", "item2": {"nested_item1": 99}', "A");
+--------------------------------------------------------------------+
| JSON_CONTAINS('{"item1": "A", "item2": {"nested_item1": 99}', "A") |
+--------------------------------------------------------------------+
|                                                                  1 |
+--------------------------------------------------------------------+

-- this call returns 0 because JSON_CONTAINS() does not search nested objects
mysql> SELECT JSON_CONTAINS('{"item1": "A", "item2": {"nested_item1": 99}', 99);
+--------------------------------------------------------------------+
| JSON_CONTAINS('{"item1": "A", "item2": {"nested_item1": 99}', 99)  |
+--------------------------------------------------------------------+
|                                                                  0 |
+--------------------------------------------------------------------+

Seems simple, but unlike JSON keys, which are always strings, JSON values can be any number of types.  That makes testing for equality a risky proposition.  To avoid comparing apples to oranges, JSON_CONTAINS() employs a few ground rules:

  1. Scalars:
    Values of types INTEGER and DECIMAL are comparable to each other.  Other scalar values must have the same JSON_TYPE() types before they may be considered for equality.
  2. Arrays:
    A candidate array is contained in a target array if and only if every element in the candidate is contained in some element of the target.
    A candidate non-array is contained in a target array if and only if the candidate is contained in some element of the target.
  3. Objects:
    A candidate object is contained in a target object if and only if for each key in the candidate there is a key with the same name in the target and the value associated with the candidate key is contained in the value associated with the target key.

Conclusion

Thanks to MySQL 7’s support for the JSON data type, we no longer have to store our JSON data as strings and resort to regular expressions for searching their contents.  Still a work in progress, expect to see more functions for searching and managing your JSON data in future releases.

See all articles by Rob Gravelle

Robert Gravelle
Robert Gravelle
Rob Gravelle resides in Ottawa, Canada, and has been an IT guru for over 20 years. In that time, Rob has built systems for intelligence-related organizations such as Canada Border Services and various commercial businesses. In his spare time, Rob has become an accomplished music artist with several CDs and digital releases to his credit.

Get the Free Newsletter!

Subscribe to Cloud Insider for top news, trends & analysis

Latest Articles