Regex for parsing single key: values out of JSON in Javascript
I'm trying to see if it's possible to lookup individual keys
out of a JSON
string in Javascript and return it's Value
with Regex
. Sort of like building a JSON
search tool.
Imagine the following JSON
"{
"Name": "Humpty",
"Age": "18",
"Siblings" : ["Dracula", "Snow White", "Merlin"],
"Posts": [
{
"Title": "How I fell",
"Comments": [
{
"User":"Fairy God Mother",
"Comment": "Ha, can't say I didn't see it coming"
}
]
}
]
}"
I want to be able to search through the JSON
string and only pull out individual properties.
lets assume it's a function
already, it would look something like.
function getPropFromJSON(prop, JSONString){
// Obviously this regex will only match Keys that have
// String Values.
var exp = new RegExp("\""+prop+"\"\:[^\,\}]*");
return JSONString.match(exp)[0].replace("\""+prop+"\":","");
}
It would return the substring of the Value
for the Key
.
e.g.
getPropFromJSON("Comments")
> "[
{
"User":"Fairy God Mother",
"Comment": "Ha, can't say I didn't see it coming"
}
]"
If your wondering why I want to do this instead of using JSON.parse()
, I'm building a JSON document store around localStorage
. localStorage
only supports key/value pairs, so I'm storing a JSON
string of the entire Document
in a unique Key
. I want to be able to run a query on the documents, ideally without the overhead of JSON.parsing()
the entire Collection
of Documents
then recursing over the Keys
/nested Keys
to find a match.
I'm not the best at regex
so I don't know how to do this, or if it's even possible with regex
alone. This is only an experiment to find out if it's possible. Any other ideas as a solution would be appreciated.
Solution 1:
I would strongly discourage you from doing this. JSON is not a regular language as clearly stated here: https://cstheory.stackexchange.com/questions/3987/is-json-a-regular-language
To quote from the above post:
For example, consider an array of arrays of arrays:
[ [ [ 1, 2], [2, 3] ] , [ [ 3, 4], [ 4, 5] ] ]
Clearly you couldn't parse that with true regular expressions.
I'd recommend converting your JSON to an object (JSON.parse) & implementing a find function to traverse the structure.
Other than that, you can take a look at guts of Douglas Crockford's json2.js parse method. Perhaps an altered version would allow you to search through the JSON string & just return the particular object you were looking for without converting the entire structure to an object. This is only useful if you never retrieve any other data from your JSON. If you do, you might as well have converted the whole thing to begin with.
EDIT
Just to further show how Regex breaks down, here's a regex that attempts to parse JSON
If you plug it into http://regexpal.com/ with "Dot Matches All" checked. You'll find that it can match some elements nicely like:
Regex
"Comments"[ :]+((?=\[)\[[^]]*\]|(?=\{)\{[^\}]*\}|\"[^"]*\")
JSON Matched
"Comments": [ { "User":"Fairy God Mother", "Comment": "Ha, can't say I didn't see it coming" } ]
Regex
"Name"[ :]+((?=\[)\[[^]]*\]|(?=\{)\{[^\}]*\}|\"[^"]*\")
JSON Matched
"Name": "Humpty"
However as soon as you start querying for the higher structures like "Posts", which has nested arrays, you'll find that you cannot correctly return the structure since the regex does not have context of which "]" is the designated end of the structure.
Regex
"Posts"[ :]+((?=\[)\[[^]]*\]|(?=\{)\{[^\}]*\}|\"[^"]*\")
JSON Matched
"Posts": [ { "Title": "How I fell", "Comments": [ { "User":"Fairy God Mother", "Comment": "Ha, can't say I didn't see it coming" } ]