Async: Way to determine unprocessed items from async.map() after error is emitted

Created on 10 Jul 2017  ·  5 Comments  ·  Source: caolan/async

Hi,

When using map(...), is there a way, while handling an error emitted from the iteratee function, to determine how many items from the supplied collection have not yet been processed?

I am using map to perform multiple outgoing HTTP requests (using the node request library) for an array of different URLs, params, etc. Part way through any of these requests I may get a particular error from the target server, which I can handle, but I then want to re-process the current item and any remaining ones that map has not yet picked up.

I did wonder about maybe setting a flag on each item in my collection that has been worked on successfully (without error), and then when the error gets emitted that I am interested in, handle it accordingly. Then perhaps create a new array from the items with the flag set to false for those not yet processed and perform a further map over these, making sure I invoke the original final callback from the original map.

Not sure if this makes any sense, but is there any way to achieve what I have described above?

question

All 5 comments

Hi @parky128, thanks for the question!

Would wrapping the iteratee with reflect work? reflect always passes a result object to the callback, so even if one of the iteratee functions errors, map would finish. Then you could just iterate over the results object from map, check which ones have an error property, and then handle it accordingly. You wouldn't have to reprocess any items that map may not have picked up.

async.map(coll, async.reflect(function(val, callback) {
  // your request code; call `callback(err)` or `callback(null, result)`
}), function(err, results) {
  // err will always be null now
  results.forEach(function(result, i) {
    if (result.error) {
      // your code for handling errors
      // if `coll` is an array, you could access the value that caused 
      // this error through `coll[i]` as `map` preserves the order for arrays 
    } else {
      // otherwise `result.value` will contain the result of that `iteratee` call
    }
  });
});

Otherwise, to answer your question, map always passes an array of results to the final callback. You could iterate over that array and check which values are undefined. Those will correspond to items that either errored, passed undefined to their callback, were in progress when the error occurred, or had not started. The reflect approach is probably the safer option though, as undefined may be a valid result from an iteratee call.
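As a rough sketch of the post-processing step described above, the helper below splits the results of a reflect-wrapped map into successful values and items to retry. The name `partitionReflected` is made up for illustration and is not part of async. It only assumes that `coll` is an array and that each result is an object carrying either an `error` or a `value` property, which is the shape reflect produces.

```javascript
// Hypothetical helper (not part of async): split reflect-style results
// into successful values and the original items that failed.
function partitionReflected(coll, results) {
  var succeeded = [];
  var failed = [];
  results.forEach(function (result, i) {
    if (result.error) {
      // map preserves order for arrays, so coll[i] is the failing input
      failed.push({ item: coll[i], error: result.error });
    } else {
      succeeded.push(result.value);
    }
  });
  return { succeeded: succeeded, failed: failed };
}

// Stand-in data shaped like the output of a reflect-wrapped map
var coll = ['a', 'b', 'c'];
var results = [{ value: 1 }, { error: new Error('boom') }, { value: 3 }];
var parts = partitionReflected(coll, results);

// Only 'b' failed, so it is the only item queued for a retry
console.log(parts.failed.map(function (f) { return f.item; }));
```

The `failed` list could then be fed into a second map call, with the original final callback invoked once that retry pass completes.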

Thanks for taking the time to offer a potential solution. I would actually want the async.map final callback to get invoked as soon as one of the iteratee functions errors with the particular error case I am looking to catch.

Looking at the docs I see I can achieve this by passing an error to the iteratee callback function, but I'm wondering if in the final callback that async.map will invoke, whether I could just use that results object to compare with the original collection and see what is left to process.

I just don't want async.map to attempt processing any other requests as soon as one of these returns the error case I am interested in.

> I just don't want async.map to attempt processing any other requests as soon as one of these returns the error case I am interested in.

Assuming your iteratee is asynchronous, async.map will have started processing all of the items when the final callback is invoked. You could potentially compare the results object to the collection to see which items have not yet finished processing, but that also has its gotchas. For example, you would have to do it synchronously, on the same tick the final callback is invoked, as the results object will be updated as iteratees resolve.
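One way to sidestep comparing the results object against the collection is to record completions explicitly, along the lines of the flag idea from the original question. The sketch below is only an illustration: `trackCompleted` and `fakeRequest` are hypothetical names, not part of async, and it assumes the items are distinct objects so they can be tracked by identity.

```javascript
// Hypothetical wrapper (not part of async): remembers which items completed
// without error, so the rest can be retried later.
function trackCompleted(iteratee) {
  var completed = new Set();
  function tracked(item, callback) {
    iteratee(item, function (err, result) {
      if (!err) completed.add(item);
      callback(err, result);
    });
  }
  // Items with no recorded success: failed, still in flight, or not started
  tracked.remaining = function (coll) {
    return coll.filter(function (item) { return !completed.has(item); });
  };
  return tracked;
}

// Stand-in iteratee that fails for one item; a real one would make the request
function fakeRequest(item, callback) {
  if (item.url === '/b') return callback(new Error('rate limited'));
  callback(null, item.url.toUpperCase());
}

var items = [{ url: '/a' }, { url: '/b' }, { url: '/c' }];
var tracked = trackCompleted(fakeRequest);

// With async this would be async.map(items, tracked, finalCallback).
// Here the iteratee is driven by hand to keep the sketch dependency-free.
tracked(items[0], function () {});
tracked(items[1], function () {});

// '/b' failed and '/c' never started, so both are still outstanding
var remaining = tracked.remaining(items);
console.log(remaining.map(function (item) { return item.url; }));
```

Note that `remaining` also reports requests that are still in flight when it is called, so retrying everything it returns could send some requests twice.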

You could try mapSeries. mapSeries only runs one request at a time, i.e. it only calls the iteratee for the next item once the current one has finished processing (as opposed to starting all of them at once). This means that when an error occurs and the final callback is invoked, no more iteratees will run. You could then compare the results to the collection to see which items have not yet been processed. This is still a bit of a workaround, but it's nicer than using async.map. The main drawback of this approach is that requests are no longer handled in parallel.

For example, if your collection is an array:

async.mapSeries(coll, function(val, callback) {
  // your iteratee function
}, function(err, results) {
  if (err) {
    // unprocessedItems will include the item that errored.
    var unprocessedItems = coll.slice(results.length - 1);
    // handle the unprocessedItems
  }
});

Hmm, OK. Yes, my iteratee function is asynchronous, as it uses the request library to make an outgoing HTTP request and I call back from it with the result. So say I had 10 items to iterate over, 4 succeeded, and the 5th failed, at which point I would invoke the iteratee callback with an error param. Would that mean the results object in the final callback to async.map contains only the 4 successful results?

If it does, then for now I will live with the fact that it will still make all the outgoing calls. Down the line I may be able to split my array into smaller arrays and perform smaller async.maps within a mapSeries, to minimise the initial hit on the target server.
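The batching idea could look roughly like the following. `chunk` is a hypothetical helper written for this sketch, not something async provides, and the wiring shown in the trailing comment is an untested outline that assumes an `iteratee` like the one discussed in this thread.

```javascript
// Hypothetical helper (not part of async): split an array into batches
// of at most `size` items each.
function chunk(items, size) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer');
  }
  var batches = [];
  for (var i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// 10 items in batches of 4 gives batches of 4, 4 and 2 items
var batches = chunk([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 4);
console.log(batches.length);

// Possible wiring with async (untested outline):
// async.mapSeries(batches, function (batch, done) {
//   async.map(batch, iteratee, done); // one batch in parallel at a time
// }, function (err, resultsPerBatch) {
//   // resultsPerBatch holds one results array per batch
// });
```

With this arrangement an error stops further batches from starting, although requests in the current batch have already been issued.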

@hargasinski - I have ended up using the async.reflect approach and this is working nicely for me, gives me full visibility of all items that gave an error :+1:

Thanks!
