Parse large JSON file in Node.js: FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
I am trying to populate a Strapi database with a large JSON file (4.6 MB).
It works locally, but when I deploy to Heroku (Hobby Basic) I get this error:
src/index.js
"use strict";
const path = require("path");
const dataDirectory = path.resolve(process.cwd(), "data");
const JsonPath = path.join(dataDirectory, "places.json");
const StreamArray = require("stream-json/streamers/StreamArray");
const fs = require("fs");
module.exports = {
async bootstrap({ strapi }) {
if (!(await strapi.entityService.count("api::place.place"))) {
strapi.log.info("Create place 🚀");
const pipeline = fs
.createReadStream(JsonPath)
.pipe(StreamArray.withParser());
pipeline.on("data", async (data) => {
await strapi.entityService.create("api::place.place", {
data: data.value,
});
});
} else {
strapi.log.info("Place ready 🚀");
}
},
};
Console error:
2022-01-18T00:40:24.566747+00:00 app[web.1]: [2022-01-18 00:40:24.566] info: Create place 🚀
2022-01-18T00:40:40.028542+00:00 app[web.1]:
2022-01-18T00:40:40.028550+00:00 app[web.1]: <--- Last few GCs --->
2022-01-18T00:40:40.028551+00:00 app[web.1]:
2022-01-18T00:40:40.028566+00:00 app[web.1]: [22:0x646cfd0] 16102 ms: Mark-sweep (reduce) 249.6 (258.2) -> 247.6 (258.2) MB, 803.9 / 0.0 ms (average mu = 0.165, current mu = 0.089) allocation failure scavenge might not succeed
2022-01-18T00:40:40.028567+00:00 app[web.1]: [22:0x646cfd0] 16974 ms: Mark-sweep (reduce) 248.6 (258.2) -> 247.9 (258.2) MB, 867.2 / 0.0 ms (average mu = 0.088, current mu = 0.005) allocation failure scavenge might not succeed
2022-01-18T00:40:40.028567+00:00 app[web.1]:
2022-01-18T00:40:40.028567+00:00 app[web.1]:
2022-01-18T00:40:40.028567+00:00 app[web.1]: <--- JS stacktrace --->
2022-01-18T00:40:40.028568+00:00 app[web.1]:
2022-01-18T00:40:40.028574+00:00 app[web.1]: FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
2022-01-18T00:40:40.029368+00:00 app[web.1]: 1: 0xb00d90 node::Abort() [node]
2022-01-18T00:40:40.029964+00:00 app[web.1]: 2: 0xa1823b node::FatalError(char const*, char const*) [node]
2022-01-18T00:40:40.030637+00:00 app[web.1]: 3: 0xcedbce v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node]
2022-01-18T00:40:40.031291+00:00 app[web.1]: 4: 0xcedf47 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node]
2022-01-18T00:40:40.031941+00:00 app[web.1]: 5: 0xea6105 [node]
2022-01-18T00:40:40.032555+00:00 app[web.1]: 6: 0xea6be6 [node]
2022-01-18T00:40:40.033131+00:00 app[web.1]: 7: 0xeb4b1e [node]
2022-01-18T00:40:40.033724+00:00 app[web.1]: 8: 0xeb5560 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [node]
2022-01-18T00:40:40.034294+00:00 app[web.1]: 9: 0xeb84de v8::internal::Heap::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [node]
2022-01-18T00:40:40.034878+00:00 app[web.1]: 10: 0xe7990a v8::internal::Factory::NewFillerObject(int, bool, v8::internal::AllocationType, v8::internal::AllocationOrigin) [node]
2022-01-18T00:40:40.035544+00:00 app[web.1]: 11: 0x11f2f06 v8::internal::Runtime_AllocateInYoungGeneration(int, unsigned long*, v8::internal::Isolate*) [node]
2022-01-18T00:40:40.036316+00:00 app[web.1]: 12: 0x15e7819 [node]
2022-01-18T00:40:40.061945+00:00 app[web.1]: Aborted
2022-01-18T00:40:40.205401+00:00 heroku[web.1]: Process exited with status 134
2022-01-18T00:40:40.365837+00:00 heroku[web.1]: State changed from starting to crashed
Thank you very much for any suggestions or ideas to resolve this error.
Solution 1:
A data event is emitted each time a chunk is parsed, regardless of whether the previous entity has finished inserting. So data events are emitted much faster than Strapi can process them, and the pending inserts accumulate in memory until the heap is exhausted. See Backpressuring in Streams for more information.
You should use the core pipeline function with a custom writable stream and call done once the entity has been created:
const { Writable } = require("stream");
const { pipeline } = require("stream/promises");

await pipeline(
  fs.createReadStream(JsonPath),
  StreamArray.withParser(),
  new Writable({
    objectMode: true,
    write(data, _encoding, done) {
      strapi.entityService
        .create("api::place.place", {
          data: data.value,
        })
        .then(() => done())
        .catch(done); // propagate insert errors so the pipeline rejects
    },
  })
);
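Since Node 10, readable streams are also async iterable, so an equivalent way to get the same backpressure without a custom Writable (a minimal sketch, not from the original answer) is to consume the parser with for await...of; the loop does not pull the next chunk until the current insert resolves:

const parser = fs
  .createReadStream(JsonPath)
  .pipe(StreamArray.withParser());

// The stream is paused between iterations, so only one
// insert is in flight at a time.
for await (const data of parser) {
  await strapi.entityService.create("api::place.place", {
    data: data.value,
  });
}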
Note that you can also use oleoduc (I'm the author). This tiny library provides utilities to stream data easily.
const { oleoduc, writeData } = require("oleoduc");
await oleoduc(
  fs.createReadStream(JsonPath),
  StreamArray.withParser(),
  writeData((data) => {
    return strapi.entityService.create("api::place.place", {
      data: data.value,
    });
  })
);
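Whichever variant you pick, both pipeline and oleoduc return a promise that rejects if any stage fails, so it is worth catching the error in bootstrap so a bad record logs instead of crashing the dyno. A minimal sketch (the log messages are illustrative):

async bootstrap({ strapi }) {
  try {
    // ... the pipeline/oleoduc call from above ...
    strapi.log.info("Place ready 🚀");
  } catch (err) {
    // Log and continue booting instead of exiting with status 134.
    strapi.log.error(`Import failed: ${err.message}`);
  }
},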