node.js - Splitting a really long file with a JSON array


I have a big file containing a JSON array (~8 GB). I need to split it into a group of small files, each containing a part of the array.

The array contains objects.

I decided to implement the following algorithm:

  • read the file symbol by symbol
  • add each symbol to a buffer
  • try to parse the buffer as a JSON object
  • if it parses, write the object to a file
  • when the file reaches a size limit, switch to a new file

I tried to implement it myself and got this far (a fuller sketch of the complete approach follows below):

var fs = require('fs');

var readable = fs.createReadStream("walmart.dump", {
    encoding: 'utf8',
    fd: null,
});

var chunk, buffer = '', counter = 0;

readable.on('readable', function () {
    readable.read(1); // meant to skip the opening '[' (but runs on every 'readable' event)
    while (null !== (chunk = readable.read(1))) {
        buffer += chunk; // chunk is a single character
        console.log(buffer.length);
        if (chunk !== '}') continue; // only a '}' can complete an object
        try {
            var res = JSON.parse(buffer);
            console.log(res);
            // skip the separator between objects (',' plus whitespace)
            readable.read(1);
            readable.read(1);
            readable.read(1);
            //Array.apply(null, {length: 10}).map(function () { return readable.read(1); });
            buffer = '{';
        } catch (e) {
            // buffer is not a complete object yet; keep reading
        }
    }
});
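To make the goal concrete, here is a rough, untested sketch of what I am trying to end up with. One change from my attempt above: instead of calling JSON.parse on the buffer after every character, it tracks brace depth plus string/escape state and cuts the buffer only when a top-level object closes. The part-N.json names and the 100 MB limit are placeholders, and it assumes the dump really is one big JSON array of objects:

var fs = require('fs');

var MAX_SIZE = 100 * 1024 * 1024;   // rotate output after ~100 MB (placeholder)
var fileIndex = 0;
var out = null, written = 0, first = true;

function rotate() {
    // close the current output file (if any) and open the next one
    if (out) out.end('\n]\n');
    out = fs.createWriteStream('part-' + (fileIndex++) + '.json'); // placeholder name
    out.write('[\n');
    written = 0;
    first = true;
}

function writeObject(text) {
    if (out === null || written >= MAX_SIZE) rotate();
    out.write((first ? '' : ',\n') + text);
    first = false;
    written += Buffer.byteLength(text, 'utf8');
}

// Track brace depth and string/escape state so that braces inside
// string values do not count; cut whenever a top-level object closes.
var depth = 0, inString = false, escaped = false, buffer = '';

var readable = fs.createReadStream('walmart.dump', { encoding: 'utf8' });

readable.on('data', function (chunk) {
    for (var i = 0; i < chunk.length; i++) {
        var c = chunk[i];
        if (depth > 0) buffer += c;
        if (inString) {
            if (escaped) escaped = false;
            else if (c === '\\') escaped = true;
            else if (c === '"') inString = false;
        } else if (c === '"') {
            inString = true;
        } else if (c === '{') {
            if (depth === 0) buffer = '{';
            depth++;
        } else if (c === '}') {
            depth--;
            if (depth === 0) writeObject(buffer); // one complete top-level object
        }
    }
});

readable.on('end', function () {
    if (out) out.end('\n]\n');
});

The depth tracking is what keeps this from re-scanning the whole buffer on every character; an occasional JSON.parse on the finished buffer could be added as a sanity check.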

Has anyone solved a similar problem?

The clarinet module (https://github.com/dscape/clarinet) looks quite promising to me. It's based on sax-js, so it should be quite robust and well tested.
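For example (a sketch only, based on the event names in the clarinet README — openobject receives the object's first key; key, value, closeobject, openarray and closearray follow): rebuild each element of the root array from the SAX events and emit completed top-level objects, so the whole 8 GB never sits in memory at once. The emit function is a placeholder where the file writing and rotation from the sketch above would go:

var fs = require('fs');
var clarinet = require('clarinet');

var stream = clarinet.createStream();
var stack = [];   // open containers; the root array sits at index 0
var keys = [];    // pending key for each open object (arrays push nothing)

function emit(obj) {
    // placeholder: write obj to the current output file, rotating on size
    console.log(JSON.stringify(obj));
}

// Route a finished value to its parent container, or emit it when its
// parent is the root array, so top-level elements are never retained.
function place(value) {
    if (stack.length === 0) return;                  // the root array itself closed
    if (stack.length === 1) { emit(value); return; } // direct child of the root array
    var parent = stack[stack.length - 1];
    if (Array.isArray(parent)) parent.push(value);
    else parent[keys[keys.length - 1]] = value;
}

stream.on('openobject', function (firstKey) { stack.push({}); keys.push(firstKey); });
stream.on('key',         function (k) { keys[keys.length - 1] = k; });
stream.on('openarray',   function ()  { stack.push([]); });
stream.on('value',       function (v) { place(v); });
stream.on('closeobject', function ()  { var o = stack.pop(); keys.pop(); place(o); });
stream.on('closearray',  function ()  { place(stack.pop()); });
stream.on('error',       function (e) { console.error(e); });

fs.createReadStream('walmart.dump').pipe(stream);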

