Monday, December 04, 2006

"Concurrency Violation"

After someone started up some process on the other node, VCS reports a "Concurrency Violation", and tries to offline that process. What is this, and is it bad?


A Concurrency Violation is reported when the Agent of a resource reports
that the same resource or process is running on another node. The Agent will
then try to run the offline script for that resource on that other node. This
is to prevent split brain.
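
As a rough illustration of the check described above (this is not VCS code;
the function name, node names and state strings are made up for the example),
the idea is: given what the monitor reports on each node, find every node
where the resource is online but should not be.

```python
from typing import Dict, List


def nodes_in_violation(states: Dict[str, str], intended_node: str) -> List[str]:
    """Return the nodes where the resource is reported ONLINE
    even though it is only supposed to run on intended_node.

    states maps a node name to the state its monitor reported
    (hypothetical values: "ONLINE" or "OFFLINE").
    """
    return sorted(
        node
        for node, state in states.items()
        if state == "ONLINE" and node != intended_node
    )


# Someone started the process by hand on node2 while it runs on node1.
reported = {"node1": "ONLINE", "node2": "ONLINE"}
for node in nodes_in_violation(reported, intended_node="node1"):
    print(f"Concurrency Violation: resource also online on {node}, offline it there")
```

Note that the result is only as trustworthy as the states fed into it: if the
monitor on node2 reports online for the wrong reason, the violation is
reported anyway.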

If the Agent cannot offline the process on the other node, then you may
want to manually offline the process or change the Agent's monitoring.

Sometimes a Concurrency Violation is more or less a "false alarm", because
it depends heavily on how good your monitoring is. You need to find out
exactly how your Agent is monitoring the resource. If it is an Application
Agent resource, look at the MonitorProgram script, or look at MonitorProcesses.
If the Agent is only checking for something very superficial, change the
monitoring so that it checks something more meaningful. If you are changing
the monitoring in production, you may want to freeze the Service Group or
make the resource non-Critical first.

Some agents have a "second level" or "deep" monitor feature, either built
into the agent, or requiring you to write a custom script. If you write one,
make it more thorough than the first level monitor, which is obviously
superficial if it reports online when the resource is really offline.
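
Here is a minimal sketch of what a deeper custom monitor could look like,
written in Python. Everything specific in it is an assumption for
illustration: the process pattern "myappd", the port 8080, and the exit codes
(110 for online and 100 for offline, which is how I understand the
Application Agent's MonitorProgram convention; check the agent documentation
for your version). The point is that it only reports online when the process
exists and the service actually answers.

```python
import socket
import subprocess

ONLINE = 110   # assumed MonitorProgram exit code: online, full confidence
OFFLINE = 100  # assumed MonitorProgram exit code: offline


def process_running(pattern: str) -> bool:
    """First level check: does any process command line match the pattern?"""
    result = subprocess.run(
        ["pgrep", "-f", pattern],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def port_answers(host: str, port: int, timeout: float = 3.0) -> bool:
    """Second level check: does the service accept a TCP connection?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def monitor_exit_code(proc_ok: bool, service_ok: bool) -> int:
    """Report online only if both the process and the service checks pass."""
    return ONLINE if (proc_ok and service_ok) else OFFLINE


def main() -> int:
    # "myappd" and port 8080 are placeholders for your own application.
    return monitor_exit_code(
        process_running("myappd"),
        port_answers("127.0.0.1", 8080),
    )
```

To use something like this as a monitor script, finish the file with
`import sys` and `sys.exit(main())` so the exit code is what the Agent sees.
A stray process with a matching name but no working service behind it would
then be reported as offline rather than triggering a violation.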
