We continue our tutorial on Concuerror, explaining the errors it can detect and the options that you can use to filter out those that are not important for your application.
Read also the first part!
Index
Symbolic names
Concuerror makes an effort to replace every Erlang PID that appears in its report with a symbolic name. The first process spawned is labelled P and every other process is named after the process that spawned it, with an integer denoting the order in which it was spawned i.e. P’s first “child” is named P.1, P.1’s third child is named P.1.3 and so on.
If you prefer to see the raw PIDs you can use
--symbolic_names false
. Concuerror is reusing the same processes, so the
results should be consistent across different interleavings.
Abnormal exits
Every time a process exits abnormally, Concuerror will mark the interleaving as erroneous. This means that any exception that escapes to the top-level will trigger a warning. In our example, processes P, P.1.1 and P.1.1.1 exited abnormally.
Ignoring “after” timeouts
If we take a look at the trace we can see that P triggered the standard
timeout clause of a gen
call:
As explained here, Concuerror by default assumes that any receive statement may trigger the after clause, unless it is impossible for a matching message not to have already arrived.
However, let’s assume that we don’t care about such timeouts. We can use the
--after_timeout 1000
to treat any timeout higher than 1000ms as
infinity. Notice that the one of the tips we got earlier suggests the same
course of action:
Treating abnormal exit reasons as normal
The other two processes exited abnormally because they were terminated by events
generated by the delivery of a stop
message to a supervisor. The supervisor
then sent shutdown
exit signals to the relevant processes. Again, let’s assume
that this is acceptable behaviour in our context, using --treat_as_normal
shutdown
(also suggested by a tip).
A report without problems
We now run Concuerror again, using the new options:
This time the output finishes in:
Concuerror finished the exploration, finding all the different states that the
program can reach. None of these interleavings had any processes terminating
abnormally with a reason different from shutdown
, nor any other errors.
Interleavings as a graph
Concuerror can also produce a
DOT file,
including all the interleavings. This feature can be enabled with --graph <filename>
.
You can then use dot
to convert
the DOT file to an image.
Racing events
The report includes a section where each pair of events that are racing is reported, to justify each additional interleaving. Let’s take a look at the output:
We can see e.g. that the delivery of an exit signal from a linked process races
with the delivery of an arbitrary message, (since P.1.1
is trapping exits), a
similar race appears between a message from a timer and finally a process
exiting races with the cancellation of a monitor, as the monitor message will be
sent in one case but not in the other.
The same pair of events may be reported multiple times, but this should happen only if we are exploring the reverse interleaving of a different race.
If you see a pair of events that you believe should not be racing, feel free to report the discrepancy! Concuerror’s race detection accuracy is the most sensitive component regarding the total number of interleavings explored!
“Deadlocked” processes
Since we called poolboy:stop/1
, all processes were properly stopped. What
would happen if we had let a process lingering? We already have such a test,
pool_startup_no_stop/0
(we need to use poolboy:start/1
instead of
start_link
, as the normal exit signal from the initial process also causes a
shutdown).
Running this test, yields:
The output file now contains the following description:
Concuerror reports any processes that have not exited, when no other processes
are available to run. Such processes are blocked at a receive statement and
are considered “deadlocked”. These warnings can be turned off by --ignore_error
deadlock
.
Concuerror will always reset the test to the initial state, so leaving some processes running may be a better alternative to suffering from all the races between exit signals that appear at termination.