The multithreaded shard search can be slightly improved by using a CompletionService. Currently, the multithreaded search creates a SearchCall for every shard and submits each one to the thread pool. The search function then waits for the submitted SearchCalls in the order they were submitted rather than in the order they finish. In the worst case, the first SearchCall takes much longer than the later ones, and every SearchCall that has already finished stays unprocessed until the first one completes. This can increase memory consumption (results are held longer and garbage-collected later) and slow down processing.
The attached patch fixes this by using ExecutorCompletionService.
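
To illustrate the idea, here is a minimal, self-contained sketch of collecting results in completion order with ExecutorCompletionService. It is not the attached patch: the class name CompletionOrderSketch, the helper shardCall (a stand-in for a per-shard SearchCall that just sleeps), and collectInCompletionOrder are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CompletionOrderSketch {

    // Hypothetical stand-in for a per-shard SearchCall:
    // sleeps to simulate search work, then returns its label.
    static Callable<String> shardCall(String label, long millis) {
        return () -> {
            Thread.sleep(millis);
            return label;
        };
    }

    // Submits all calls, then collects results as they finish
    // instead of in submission order.
    static <T> List<T> collectInCompletionOrder(ExecutorService pool, List<Callable<T>> calls)
            throws InterruptedException, ExecutionException {
        CompletionService<T> service = new ExecutorCompletionService<>(pool);
        for (Callable<T> call : calls) {
            service.submit(call);
        }
        List<T> results = new ArrayList<>(calls.size());
        for (int i = 0; i < calls.size(); i++) {
            // take() blocks until any one of the submitted calls has completed.
            results.add(service.take().get());
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            List<Callable<String>> calls = Arrays.asList(
                    shardCall("shard-0", 400),  // slowest, submitted first
                    shardCall("shard-1", 10),
                    shardCall("shard-2", 150));
            System.out.println(collectInCompletionOrder(pool, calls));
        } finally {
            pool.shutdown();
        }
    }
}
```

With one thread per call, the slow shard-0 would normally be collected last even though it was submitted first, so the faster results can be processed and released while it is still running. In a real search, the merge step would take the place of results.add(...).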