C++ - Concurrency parallel_for using too much memory
I have a vector of gray scale images (loaded using OpenCV 3). Each image is 2560x64800 pixels and the vector can hold 23 images, so the maximum size of the vector is approximately 3.6 GB. I need to do "blob detection" on each image and, to speed things up, I want to run the image analysis using parallel processing of the images. For each detected blob a small 40x40 pixel image should be returned. The entire data set consists of 13 sets of images, so I have chosen to make the outer loop a normal sequential for-loop:
for (size_t cn = 0; cn < numberOfChannels.size(); ++cn) {
    vector<Mat> imageVector = readImages(...);

    // Loop through images in "imageVector"
    concurrency::parallel_for(size_t(0), size_t(imageVector.size()), [&](size_t ii) {
        // Detect blobs
        vector<KeyPoint> keypoints = detectBlobs(imageVector[ii], false);
        // Return result
        theSnips[cn][ii] = collectImageSnips(imageVector[ii], keypoints, 40);
    });
}
The declarations of the two functions "detectBlobs" (based on the OpenCV 3 "SimpleBlobDetector") and "collectImageSnips" are:
vector<KeyPoint> detectBlobs(Mat image, bool useTestImage)
vector<Mat> collectImageSnips(Mat image, vector<KeyPoint> keypoints, int rectSize)
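For context on the by-value Mat parameters: cv::Mat is reference counted, so copying a Mat (or a vector<Mat>) normally copies only the small header and bumps a reference count; the pixel buffer itself stays shared until an explicit clone() or copyTo() is made. A minimal sketch illustrating the difference (the image size is just a placeholder):

#include <opencv2/core.hpp>
#include <cassert>

int main()
{
    // One large single-channel image (placeholder size, ~166 MB).
    cv::Mat big(64800, 2560, CV_8UC1, cv::Scalar(0));

    cv::Mat shallow = big;          // header copy only, shares the pixel buffer
    cv::Mat deep    = big.clone();  // allocates and copies the whole pixel buffer

    assert(shallow.data == big.data);   // same underlying memory
    assert(deep.data != big.data);      // separate allocation
    return 0;
}

So passing Mat by value into detectBlobs/collectImageSnips should not, by itself, duplicate the pixel data; any deep copies would have to come from explicit clone()/copyTo() calls or from buffers allocated inside the detection code.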
When I run the supplied "parallel_for" loop I expect (hope) that 3.6 GB of RAM will be used to store the loaded "imageVector" and that each thread will take up approximately an additional 160 MB (+ minor overhead from the blob detection and from storing the small images of the detected blobs) for its result. Instead, memory usage explodes: each thread takes up an additional 3.6 GB (+ additional overhead), and Windows stops responding.
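For reference, a back-of-the-envelope estimate of the figures above, assuming 8-bit (CV_8UC1) grayscale images:

#include <cstddef>
#include <cstdio>

int main()
{
    const std::size_t width  = 2560;
    const std::size_t height = 64800;   // pixels per image
    const std::size_t images = 23;      // images held by the vector

    const double perImageMB = width * height / 1e6;          // ~166 MB per image
    const double totalGB    = images * width * height / 1e9; // ~3.8 GB for the whole vector

    std::printf("per image: %.0f MB, vector: %.1f GB\n", perImageMB, totalGB);
    return 0;
}

That is roughly 160 MB per image and about 3.8 GB (3.6 GiB) for all 23 images, which matches the numbers quoted above.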
As a novice C++ coder I may have missed something obvious, so is this the expected behavior? And, more importantly, is there a way to avoid duplicating the entire imageVector in each thread?
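For what it is worth, with the [&] capture used in the loop the lambda only refers to imageVector and theSnips rather than copying them, and even a by-value capture of a vector<Mat> would copy only the Mat headers (the pixel buffers stay shared, as sketched above). A small illustration of the two capture modes, using a hypothetical helper:

#include <vector>
#include <opencv2/core.hpp>

void captureExample(const std::vector<cv::Mat>& imageVector)
{
    // Capturing by reference: the lambda stores a reference, nothing is copied.
    auto byRef = [&imageVector](size_t ii) { return imageVector[ii].cols; };

    // Capturing by value: the vector and the Mat headers are copied,
    // but the pixel buffers are still shared (reference counted).
    auto byVal = [imageVector](size_t ii) { return imageVector[ii].cols; };

    (void)byRef;
    (void)byVal;
}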
NB! Since the images I want to analyse are reconstructed from a large number of smaller images, reading one image at a time without a large disk I/O overhead is not possible.
NB! Even though reducing the allowed number of threads would solve the problem of Windows crashing, it is not an acceptable solution, since the CPU resources would then not be fully utilized.
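For reference only (as stated above, this is not an acceptable final solution): the usual way to cap the number of worker threads used by the Concurrency Runtime is to attach a scheduler with a MaxConcurrency policy before running parallel_for. A minimal sketch, where the limit of 4 is an arbitrary placeholder:

#include <ppl.h>
#include <concrt.h>

void runWithLimitedConcurrency()
{
    // Cap the Concurrency Runtime at 4 worker threads (placeholder value).
    concurrency::SchedulerPolicy policy(1, concurrency::MaxConcurrency, 4);
    concurrency::CurrentScheduler::Create(policy);

    concurrency::parallel_for(size_t(0), size_t(100), [](size_t /*ii*/) {
        // ... per-image work would go here ...
    });

    // Detach the scheduler created above.
    concurrency::CurrentScheduler::Detach();
}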