0

I have a long JSON file made up of many repeated entries with a similar structure.

Because the original JSON file is too long to paste, I am sharing a link to it instead. It is a result generated by a database called RegulomeDB.

Direct link to the JSON file

I would like to extract the eQTL entries, i.e. every "value": "xxxx" that belongs to "method": "eQTLs", and put them into two tab-delimited columns exactly as shown below. Note: the "value": "xxxx" to extract is the one that comes right after "method": "eQTLs" is detected.

eQTLs   firstResult, secondResult, thirdResult, ...

In this example, the desired output is:

eQTLs   EIF3S8, EIF3CL

I tried the following Python script, but it does not work:

import json
with open('file.json') as f:
    f_json = json.load(f)
    print 'f_json[0]['"method": "eQTLs"'] + "\t" + f_json[0]["value"]

Thank you for your kind help.

3
  • Do you have a preferred language for doing this? Commented Nov 8, 2022 at 14:36
  • Hi @NickODell, no I don't. But bash would be good. Commented Nov 8, 2022 at 14:38
  • Cross-posted at bioinformatics.stackexchange.com/questions/19978/… Commented Nov 9, 2022 at 10:01

2 Answers

1

Maybe you'll find the JSON parser xidel useful. It can open URLs and manipulate strings any way you want:

$ xidel -s "https://regulomedb.org/regulome-search/?regions=chr16:28539847-28539848&genome=GRCh37&format=json" \
  -e '"eQTLs	"||join($json("@graph")()[method="eQTLs"]/value,", ")'
eQTLs   EIF3S8, EIF3CL

Or with the XPath/XQuery 3.1 syntax:

-e '"eQTLs	"||join($json?"@graph"?*[method="eQTLs"]?value,", ")'
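
For anyone who prefers Python, as attempted in the question, here is a minimal sketch of the same selection. It assumes, based on the xidel query above, that the response has a top-level "@graph" array whose objects carry "method" and "value" keys. The function name eqtl_line and the inline sample data are made up for illustration and are not part of RegulomeDB.

```python
import json


def eqtl_line(data, method="eQTLs"):
    """Build 'method<TAB>value1, value2, ...' from a parsed JSON response.

    Assumes data["@graph"] is a list of objects with "method" and "value"
    keys, as the xidel query suggests.
    """
    values = [
        str(entry["value"])
        for entry in data.get("@graph", [])
        if entry.get("method") == method and "value" in entry
    ]
    return method + "\t" + ", ".join(values)


# Tiny stand-in for the real file; the "other" entry is hypothetical.
sample = json.loads(
    '{"@graph": ['
    '{"method": "eQTLs", "value": "EIF3S8"},'
    '{"method": "other", "value": "ignored"},'
    '{"method": "eQTLs", "value": "EIF3CL"}'
    ']}'
)

print(eqtl_line(sample))
```

With a downloaded file, you would replace the sample with something like `with open('file.json') as f: data = json.load(f)` and pass `data` to the function.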

0

Try this:

cat file.json | grep -iE '"method":\s*"eQTLs"[^}]*' -o | cut -d ',' -f 1,5 | sed -r 's/"|:|method|value//gi' | sed 's/\s*eqtls,\s*//gi' | tr '\n' ',' | sed 's/,$/\n/g' | sed 's/,/, /g' | xargs echo -e 'eQTLs\x09'

6 Comments

Hi @SaSkY, thank you for trying. However, I am getting the following error: grep: invalid option -- P usage: grep [-abcdDEFGHhIiJLlMmnOopqRSsUVvwXxZz] [-A num] [-B num] [-C[num]] [-e pattern] [-f file] [--binary-files=value] [--color=when] [--context[=num]] [--directories=action] [--label] [--line-buffered] [--null] [pattern] [file ...]
@austin7923 I updated the answer; please try the command again.
Thanks! It works! But it would be good if the results in the final output could be joined with commas, exactly like the one shown in the post.
@austin7923 I updated the answer again. Can you try it and tell me if it works as you expected?
Hi @SaSkY, I had to make some edits to your command, and it works flawlessly now. Thank you! cat file.json | grep -iE '"method":\s*"eQTLs"[^}]*' -o | cut -d ',' -f 1,5 | sed -r 's/"|:|method|value//gi' | sed 's/\s*eqtls,\s*//gi' | tr '\n' ',' | sed 's/,$/\n/g' | sed 's/,/, /g' | xargs echo 'eQTLs
