Here is a patch for one problem at least.
I have applied this patch to trunk 473, and the unit tests hang in the MasterTest.
The Master.shutdown is hanging in the join with the DistributeShardsThread, which is in _updateLock.getUpdatedCondition().await(); I am guessing that await is not honoring the interrupt. Jstack output follows:
Attaching to process ID 82990, please wait...
No deadlocks found.
Thread t@34051: (state = BLOCKED)
Thread t@34307: (state = BLOCKED)
Thread t@34563: (state = IN_NATIVE)
Thread t@34819: (state = BLOCKED)
Thread t@35075: (state = IN_NATIVE)
Thread t@35331: (state = BLOCKED)
Thread t@35587: (state = BLOCKED)
Thread t@35843: (state = BLOCKED)
Thread t@36099: (state = IN_NATIVE)
Thread t@36355: (state = BLOCKED)
Thread t@36611: (state = BLOCKED)
Thread t@36867: (state = BLOCKED)
Thread t@37123: (state = BLOCKED)
Thread t@37379: (state = BLOCKED)
This failure may be intermittent. The next time I ran the test suite, the tests completed normally.
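For what it's worth, Condition.await() is documented to throw InterruptedException when the waiting thread is interrupted, so one possible explanation is that the waiting loop catches the exception and goes back to waiting. This is an assumption and has not been checked against the Katta source. A minimal sketch of that pattern follows; the class, field, and method names are hypothetical stand-ins, not the real DistributeShardsThread:

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a thread that waits on an "updated" condition.
class InterruptibleWaitSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition updated = lock.newCondition();
    private volatile boolean stopped = false;

    // Starts a worker that waits on the condition in a loop. If
    // swallowInterrupt is true, the InterruptedException is caught inside
    // the loop and the thread goes straight back to await().
    Thread startWorker(final boolean swallowInterrupt) {
        Thread t = new Thread(new Runnable() {
            public void run() {
                lock.lock();
                try {
                    while (!stopped) {
                        try {
                            updated.await();
                        } catch (InterruptedException e) {
                            if (!swallowInterrupt) {
                                return; // honor the interrupt and exit
                            }
                            // swallowed: fall through and wait again
                        }
                    }
                } finally {
                    lock.unlock();
                }
            }
        });
        t.setDaemon(true); // do not keep the JVM alive if the worker hangs
        t.start();
        return t;
    }

    // Returns true if the worker terminated within one second of interrupt().
    static boolean exitsAfterInterrupt(boolean swallowInterrupt) {
        InterruptibleWaitSketch sketch = new InterruptibleWaitSketch();
        Thread worker = sketch.startWorker(swallowInterrupt);
        try {
            Thread.sleep(100); // give the worker time to reach await()
            worker.interrupt();
            worker.join(1000); // bounded join, unlike a plain join()
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("exits when interrupt is honored: " + exitsAfterInterrupt(false));
        System.out.println("exits when interrupt is swallowed: " + exitsAfterInterrupt(true));
    }
}
```

If the real loop looks like the swallowing variant, an unbounded join() in shutdown would block exactly as described. Setting a stop flag before interrupting, or returning from the catch block, would be two ways to let the thread exit.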
java -version host environment: Mac OS X Leopard
Hi Peter, I think that with the zkclient refactoring this kind of issue should be solved.
Do you think we can close this issue?
There is a related issue: if a client can't broadcast to all nodes because the search configuration has not yet been updated, the entire search is lost rather than just the exception being recorded. This is also related to the bug about restructuring the ZK data so that ZK can delete all state when a node is lost, rather than depending on the master to do that (I am merging this issue into
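To illustrate the related broadcast issue mentioned above, here is a rough sketch of a client-side broadcast that records a per-node exception and keeps the results from the nodes that did answer, instead of failing the whole search. Everything in it (NodeCall, BroadcastResult, the node names) is hypothetical and is not Katta's actual client API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PartialBroadcastSketch {
    // Hypothetical per-node call; stands in for whatever the client sends.
    interface NodeCall {
        List<String> search(String node) throws Exception;
    }

    // Holds the hits that arrived plus the failure for each node that failed.
    static class BroadcastResult {
        final List<String> hits = new ArrayList<String>();
        final Map<String, Exception> errors = new LinkedHashMap<String, Exception>();
    }

    // Calls every node; a failing node is recorded and the loop continues.
    static BroadcastResult broadcast(List<String> nodes, NodeCall call) {
        BroadcastResult result = new BroadcastResult();
        for (String node : nodes) {
            try {
                result.hits.addAll(call.search(node));
            } catch (Exception e) {
                result.errors.put(node, e); // record instead of aborting the search
            }
        }
        return result;
    }

    // Small demo: node "b" fails, node "a" answers with one hit.
    static BroadcastResult demo() {
        NodeCall call = new NodeCall() {
            public List<String> search(String node) throws Exception {
                if (node.equals("b")) {
                    throw new IllegalStateException("node not reachable: " + node);
                }
                return Arrays.asList("hit-from-" + node);
            }
        };
        return broadcast(Arrays.asList("a", "b"), call);
    }

    public static void main(String[] args) {
        BroadcastResult result = demo();
        System.out.println("hits: " + result.hits);
        System.out.println("failed nodes: " + result.errors.keySet());
    }
}
```

The caller can then decide whether a partial result is acceptable, rather than having that decision made by the first exception.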
I can see how to change "exists", but many others are more delicate because I can't tell if they are idempotent or not.