1
$\begingroup$

I am trying to use a datasource that has an SDMX-JSON endpoint. I am currently running my tests on Mathematica 10.3.1 running on 64-bit Linux.

I started off by familiarising myself with the tools to import JSON in Mathematica. It seams that there is at least some built in support :)

After reading the documentation I stumbled upon this post http://mathgis.blogspot.com/2010/03/extract-elevation-data-with-google.html

That had the example query of

json=Import["http://maps.google.com/maps/api/elevation/json?locations=39.7391536,-104.9847034&sensor=false"]

If used ToExpression@StringCases[json, NumberString] to parse the response. It was cleaver but I get mixed data types from my source so I would need something more general.

Let's look at OECD data for the sake of having a public dataset to play around with starting of with a simple

OECDendpoint = "http://stats.oecd.org/sdmx-json/data/QNA/AUS+AUT.GDP+B1_GE.CUR+VOBARSA.Q/all?startTime=2009-Q2&endTime=2011-Q4&dimensionAtObservation=allDimensions";

Import[OECDendpoint]
TableForm[Apply[List, %, {1}]]

Returns

TableForm[{{
  "header", {
   "prepared" -> "2018-07-29T10:02:49.3001013Z", 
    "id" -> "b1b796ff-408e-4040-9cda-64090e1b629f", "test" -> False, 
    "links" -> {{
      "href" -> "https://stats.oecd.org:443/sdmx-json/data/QNA/AUS+\
AUT.GDP+B1_GE.CUR+VOBARSA.Q/all?startTime=2009-Q2&endTime=2011-Q4&\
dimensionAtObservation=allDimensions", "rel" -> "request"}}, 
    "sender" -> {
     "id" -> "OECD", 
      "name" -> "Organisation for Economic Co-operation and \
Development"}}}, {
  "dataSets", {{
    "action" -> "Information", 
     "observations" -> {
      "0:0:0:0:0" -> {1.326648266393*^6, 0, Null, 0, 0, 0}, 
       "0:0:0:0:1" -> {1.330051435134*^6, 0, Null, 0, 0, 0}, 
       "1:0:0:0:0" -> {288766.840976, 0, Null, 1, 0, 0}, 
       "0:0:0:0:2" -> {1.33914308436*^6, 0, Null, 0, 0, 0}, 
       "0:0:0:0:10" -> {1.421921730294*^6, 0, Null, 0, 0, 0}, 
       "1:0:0:0:1" -> {289389.688978, 0, Null, 1, 0, 0}, 
       "0:0:0:0:3" -> {1.345735769993*^6, 0, Null, 0, 0, 0}, 
       "1:0:0:0:2" -> {292485.252988, 0, Null, 1, 0, 0}, 
       "0:0:0:0:4" -> {1.354602321735*^6, 0, Null, 0, 0, 0}, 
       "1:0:0:0:3" -> {290541.336982, 0, Null, 1, 0, 0}, 
       "0:0:0:0:5" -> {1.363201808666*^6, 0, Null, 0, 0, 0}, 
       "1:0:0:0:4" -> {295413.484998, 0, Null, 1, 0, 0}, 
       "0:0:0:0:6" -> {1.376917494343*^6, 0, Null, 0, 0, 0}, 
       "1:0:0:0:5" -> {298238.645008, 0, Null, 1, 0, 0}, 
       "0:0:0:0:7" -> {1.373125174019*^6, 0, Null, 0, 0, 0}, 
       "1:0:0:0:6" -> {299451.813012, 0, Null, 1, 0, 0}, 
       "0:0:0:0:8" -> {1.387909118942*^6, 0, Null, 0, 0, 0}, 
       "1:0:0:0:7" -> {304121.997028, 0, Null, 1, 0, 0}, 
       "0:0:0:0:9" -> {1.406428156015*^6, 0, Null, 0, 0, 0}, 
       "1:0:0:0:8" -> {304508.401029, 0, Null, 1, 0, 0}, 
       "1:0:0:0:9" -> {304961.841031, 0, Null, 1, 0, 0}, 
       "1:0:0:0:10" -> {305627.561033, 0, Null, 1, 0, 0}}}}}, {
  "structure", {
   "description" -> "Quarterly National Accounts", 
    "name" -> "Quarterly National Accounts", 
    "links" -> {{
      "href" -> "https://stats.oecd.org/sdmx-json/dataflow/QNA/all", 
       "rel" -> "dataflow"}}, 
    "dimensions" -> {
     "observation" -> {{
        "id" -> "LOCATION", "keyPosition" -> 0, 
         "values" -> {{"id" -> "AUS", "name" -> "Australia"}, {
           "id" -> "AUT", "name" -> "Austria"}}, "name" -> "Country", 
         "role" -> "REF_AREA"}, {
        "id" -> "SUBJECT", "keyPosition" -> 1, 
         "values" -> {{
           "id" -> "B1_GE", 
            "name" -> "Gross domestic product - expenditure \
approach"}}, "name" -> "Subject"}, {
        "id" -> "MEASURE", "keyPosition" -> 2, 
         "values" -> {{
           "id" -> "VOBARSA", 
            "name" -> "National currency, volume estimates, OECD \
reference year, annual levels, seasonally adjusted"}}, 
         "name" -> "Measure"}, {
        "id" -> "FREQUENCY", "keyPosition" -> 3, 
         "values" -> {{"id" -> "Q", "name" -> "Quarterly"}}, 
         "name" -> "Frequency", "role" -> "FREQ"}, {
        "id" -> "TIME_PERIOD", 
         "values" -> {{"id" -> "2009-Q2", "name" -> "Q2-2009"}, {
           "id" -> "2009-Q3", "name" -> "Q3-2009"}, {
           "id" -> "2009-Q4", "name" -> "Q4-2009"}, {
           "id" -> "2010-Q1", "name" -> "Q1-2010"}, {
           "id" -> "2010-Q2", "name" -> "Q2-2010"}, {
           "id" -> "2010-Q3", "name" -> "Q3-2010"}, {
           "id" -> "2010-Q4", "name" -> "Q4-2010"}, {
           "id" -> "2011-Q1", "name" -> "Q1-2011"}, {
           "id" -> "2011-Q2", "name" -> "Q2-2011"}, {
           "id" -> "2011-Q3", "name" -> "Q3-2011"}, {
           "id" -> "2011-Q4", "name" -> "Q4-2011"}}, 
         "name" -> "Period", "role" -> "TIME_PERIOD"}}}, 
    "annotations" -> {{
      "text" -> "", "title" -> "Copyright OECD - All rights reserved",
        "uri" -> ""}, {
      "text" -> "", "title" -> "Terms and Conditions", 
       "uri" -> "http://www.oecd.org/termsandconditions/"}, {
      "text" -> "", "title" -> "Privacy Policy", 
       "uri" -> "http://www.oecd.org/privacy/"}, {
      "text" -> "", "title" -> "MyOECD", 
       "uri" -> "https://www.oecd.org/login"}, {
      "text" -> "", "title" -> "Contact Us", 
       "uri" -> "http://www.oecd.org/contact/"}}, 
    "attributes" -> {
     "dataSet" -> {}, "series" -> {}, 
      "observation" -> {{
        "id" -> "TIME_FORMAT", 
         "values" -> {{"id" -> "P3M", "name" -> "Quarterly"}}, 
         "name" -> "Time Format"}, {
        "id" -> "OBS_STATUS", "values" -> {}, 
         "name" -> "Observation Status"}, {
        "id" -> "UNIT", 
         "values" -> {{
           "id" -> "AUD", "name" -> "Australian Dollar"}, {
           "id" -> "EUR", "name" -> "Euro"}}, "name" -> "Unit", 
         "role" -> "UNIT_MEASURE"}, {
        "default" -> "0", "id" -> "POWERCODE", 
         "values" -> {{"id" -> "6", "name" -> "Millions"}}, 
         "name" -> "Unit multiplier", "role" -> "UNIT_MULT"}, {
        "id" -> "REFERENCEPERIOD", 
         "values" -> {{"id" -> "2010", "name" -> "2010"}}, 
         "name" -> "Reference period", "role" -> "BASE_PER"}}}}}}]

As we can see from the result the response has quite a bit of structure in it so simply parsing the string would get us too close to the sun and reminds me of this post (https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454).

Luckily we can hint Import that we are dealing with JSON.

Import[OECDendpoint, "JSON"]

I am interested in particular fields of the response. So let's try to get the elements available to import.

Import[OECDendpoint, {"JSON", "Elements"}]

We see that there is only one option {"Data"}

Ok so we use "Data". Now I would like to extract "dataSets" and "structure"

Import[OECDendpoint, {"JSON", "Data", {"dataSets", "structure"}}]

This works as expected, but when I try to extract "observations" and "dimensions" I can not figure out the syntax.

Import[OECDendpoint, {"JSON", "Data", {"dataSets", {"observations"}, "structure", {"dimensions"}}}]

Question: Is this the best practice for constructing queries or should I simply do something in the lines of ReplaceAll for finding the values I need?

myResponse = Import[OECDendpoint, "JSON"]
{"observations" /. ("dataSets" /. myResponse), "dimensions" /. ("structure" /. myResponse)}

Note: Just before pressing the "Post Your Question" button I also found this related question with an answer from Szabolcs which seams related. (Format google places JSON output)

$\endgroup$

1 Answer 1

5
$\begingroup$

Query

The answer is in your question, I would suggest to download the whole data parsed in "RawJSON" format and then use Query offline for any sort of data analysis you may want.

oecddata = Import[OSCDendpoint, {"RawJSON", "Data"}];

Query["dataSets", 1, "observations"]@oecddata

Mathematica graphics

Query["dataSets", 1, "observations", Values, First]@oecddata
(* {1.32665*10^6, 1.33005*10^6, 1.33914*10^6, 1.34574*10^6, 
 1.3546*10^6, 1.3632*10^6, 1.37692*10^6, 1.37313*10^6, 1.38791*10^6, 
 1.40643*10^6, 1.42192*10^6, 288767., 289390., 292485., 290541.,
 295413., 298239., 299452., 304122., 304508., 304962., 305628.} *)
$\endgroup$
1
  • $\begingroup$ I fixed a typo I had in my variable name OECDendpoint. (I can not edit single char in your post) I can also give version specific feedback. When running Import[OSCDendpoint, {"RawJSON", "Data"}] on Mathematica 10.3.1 then you will run into the error "The Import element \"\!(\"Data\")\" is not present when importing \ as \!(\"RawJSON\").". When checking elements then the list is empty so one has to omit the "Data" leaving Import[OSCDendpoint, {"RawJSON"}]. I also double checked and the wolfram development platform does have the "Data" element so it must be fixed in some future version. $\endgroup$ Commented Aug 1, 2018 at 16:21

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.