File parsing takes too long. #98
Replies: 6 comments
-
The jm script (https://github.com/pkoppstein/jm) is based on JSON Machine, and could be used as follows to find the value of a specific key:
Even if you don't want to use jm itself, you could examine it to see how it accomplishes what you want. Alternatively, you might consider using the --stream option of jq (https://github.com/stedolan/jq), which is designed for just this kind of problem:
-
If you need to do it from inside PHP, just use a simple foreach:

    use JsonMachine\Items;

    foreach (Items::fromFile('500gb.json') as $key => $item) {
        if ($key === "PTHT0012803") {
            // your code
        }
    }

Keep in mind that a file of this size might take hours to parse with JSON Machine. I would guess 2-4 hours, depending on the machine and PHP configuration. You might also be interested in #97.
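The lookup described in this reply can be wrapped in a small function that stops reading as soon as the key is found. This is only a sketch: it assumes JSON Machine was installed with Composer, and `findTopLevelKey()` is a hypothetical helper name, not part of the library. In the worst case (the key sits near the end of the file, or is missing) the whole file is still scanned.

```php
<?php
// Assumes JSON Machine was installed with Composer (vendor/autoload.php exists).
require __DIR__ . '/vendor/autoload.php';

use JsonMachine\Items;

/**
 * Return the value stored under one top-level key, or null if it is absent.
 * findTopLevelKey() is a hypothetical helper, not part of JSON Machine.
 */
function findTopLevelKey(string $file, string $wantedKey)
{
    // Items::fromFile() yields one top-level key/value pair at a time,
    // so only the current item is held in memory.
    foreach (Items::fromFile($file) as $key => $item) {
        if ($key === $wantedKey) {
            // Returning here ends the loop, so the rest of the file is not read.
            return $item;
        }
    }

    return null; // key not present in the document
}
```

It could then be called as `findTopLevelKey('5gb.json', 'PTHT0012803')`, where the file name is a placeholder for your own file.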
-
Sorry, I read 500 GB instead of just 5 GB. Then it should be a matter of minutes. Make sure Xdebug is disabled and JIT is enabled. Also increase your PHP time limit if you parse from the browser.
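A quick way to check these three settings from the script itself might look like the sketch below. It uses only built-in PHP functions; the variable names are illustrative. It assumes PHP 8 or later, since `opcache_get_status()` only reports a `jit` entry there. Note that `set_time_limit(0)` lifts PHP's own limit only; the web server may still enforce its own timeout.

```php
<?php
// Lift PHP's execution time limit for this request (0 means no limit).
// The web server or FastCGI layer may still apply its own timeout.
set_time_limit(0);

// Xdebug slows execution considerably when loaded, even if idle.
$xdebugLoaded = extension_loaded('xdebug');

// opcache_get_status() returns false when OPcache is unavailable or disabled.
$status = function_exists('opcache_get_status') ? @opcache_get_status(false) : false;

// On PHP 8+, the status array contains a 'jit' entry with an 'enabled' flag.
$jitEnabled = is_array($status) && !empty($status['jit']['enabled']);

printf("Xdebug loaded: %s\n", $xdebugLoaded ? 'yes' : 'no');
printf("JIT enabled:   %s\n", $jitEnabled ? 'yes' : 'no');
```

Run it through the same entry point (browser or CLI) that you use for parsing, because the two often load different PHP configurations.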
-
Thank you for your help! I will try the method you mentioned later.
-
Thank you for your reply. I used code like this before, but it took too long. I think it may be because the foreach loop takes too much time. Is there any way to avoid this in JSON Machine? If not, I will try splitting the large file.
-
If you split them, it will take about the same time anyway. There is no faster solution in JSON Machine for now. Keep an eye on #97, which should bring some speedup.
-
Hi,
This may be a stupid question, but this problem needs to be solved urgently, so I opened this issue. I have a very large JSON file, about 5 GB with 30 million lines. I tried parsing it with JSON Machine, but it took so long that I got an Internal Server Error in the browser. I noticed in the README that 100 GB files can also be parsed, but I'm not sure how to write the code since I'm not very good at PHP.
My JSON file format is roughly as follows:
{ "head":{...}, "PTHT0000001":{"CDD":[...],"SMART":[...]}, ..., "PTHT0012803":{"CDD":[...],"SMART":[...]} }
My goal is to find a unique PTHTxxxxxxx and extract its value. How should I parse it?
Thank you very much!