RAII, locks and clang-tidy
A colleague of mine spent some time chasing a bug in a C++ library related to concurrency.
At the end it all boiled down to a silly declaration that wasn’t one.
RAII
There is a salient feature of C++ called Resource Acquisition Is Initialization (RAII) in which it is possible to bind resource lifetimes to the lifetime of declarations in C++.
This is very useful because the C++ object model guarantees that objects are destroyed when the scope in which they were declared ends. This way it is possible to acquire the resource in the constructor and release it in the destructor. This is even more useful in the presence of exceptions, whose unusual control flow is often the source of resources not beeing released under exceptional flows. Because it ties resources to lifetimes of objects and given that C++ gives guarantees regarding destruction of objects when exceptions are thrown, makes RAII a very convenient and practical technique.
However there are still some confusing cases due to the C++ syntax, which is sometimes surprising.
Temporaries
There is a number of situations in which a temporary needs to be created in a C++ program. A common one is when we bind a reference to a prvalue.
Because passing arguments is modeled in the same way as initialization, a similar case is as follows.
Temporaries are in general destroyed when they are not needed any longer. In
the call to foo
above this happens at the end of the expression (conceptually
at the semicolon).
However, in the first case, we don’t want the references x
and w
to point
to objects that have been freed. So in that case (and a number of others) the
lifetime of the temporary is extended such that the temporary is destroyed when
the reference goes out of scope.
Locks
C++ has a concurrency primitive class std::mutex
that is used to implement
mutual exclusion between threads. Often mutexes are of interest to link to
scopes, hence using them as RAII. For that purpose there are the classes
std::lock_guard
and std::scoped_lock
.
A common way to acquire a lock for the duration of a scope is as follows.
My colleague was using this idiom, so what was wrong for him? Well, as usually he thought he was using the idiom, instead the code looked like this.
I hope you can see the problem here, the lock is acquired and released immediately upon destruction of the temporary. Part of the problem is that RAII involves a declaration. It often happens that the name of the declaration is not relevant, hence one might accidentally omit it. Unfortunately, in one of those typical C++ syntactic flukes, we end not having a declaration anymore: we’re creating a temporary.
It looks like this case is almost always an error. I fail to see what is the usefulness of acquiring a lock to just release it afterwards. So, barring memory ordering consequences, this is just an expensive no-operation.
Finally note that there is a convoluted case where this syntax might be OK.
But we will happily ignore that one, because frankly, a code like this serves no favour to the reader. And code is more often read than written.
There is still a case where we could argue that this idiom is iseful. Class
std::scoped_lock
allows locking more than one lock at a time. So if we manage
to acquire all of them (even if we release all of them immediately) we may know
that the protected resources were at some point available. For instance we could
log that fact, but the logging process itself may not demand exclusive access
to the processes. However it is always possible to write something like this.
More wordy, yes, but an unusual usage that we can still reduce to the syntax of the usual case.
Attributes
Given a user-defined class that we can modify, such as A
above, its
constructors can be marked with the standard attribute nodiscard
.
However we cannot do that with std::lock_guard
as it is a class that is
defined by the C++ library and it may not include such annotation (it may not
be desirable to introduce false positives in some of the unusual cases).
Linting with clang-tidy
One thing we can do is use a linting tool such as clang-tidy
. This
clang-based tool provides a practical mechanism to add new diagnostics. Most
checks are designed around the clang ASTMatcher library that is very
convenient to match clang ASTs.
Adding a new check
Let’s see how we can add a check to clang-tidy
. In this post we will see
the basics, so check the documentation for greater details.
First let’s define a header inside the concurrency
namespace. There are a few
namespaces but this one feels right for our purpose.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
#ifndef LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_CONCURRENCY_USELESSLOCK_H
#define LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_CONCURRENCY_USELESSLOCK_H
#include "../ClangTidyCheck.h"
namespace clang {
namespace tidy {
namespace concurrency {
class UselessLockCheck : public ClangTidyCheck {
public:
UselessLockCheck(StringRef Name, ClangTidyContext *Context)
: ClangTidyCheck(Name, Context) {}
void registerMatchers(ast_matchers::MatchFinder *Finder) override;
void check(const ast_matchers::MatchFinder::MatchResult &Result) override;
};
} // namespace concurrency
} // namespace tidy
} // namespace clang
#endif // LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_CONCURRENCY_USELESSLOCK_H
This header defines a class that inherits from ClangTidyCheck
and we override
a member function called registerMatcher
and check
.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
#include "UselessLockCheck.h"
#include "clang/AST/ASTContext.h"
#include "clang/ASTMatchers/ASTMatchFinder.h"
#include "clang/Lex/Preprocessor.h"
using namespace clang::ast_matchers;
namespace clang {
namespace tidy {
namespace concurrency {
void UselessLockCheck::registerMatchers(MatchFinder *Finder) {
Finder->addMatcher(
cxxTemporaryObjectExpr(isExpansionInMainFile(),
hasType(asString("std::lock_guard<std::mutex>")))
.bind("uselesslock"),
this);
}
void UselessLockCheck::check(const MatchFinder::MatchResult &Result) {
const auto *MatchedExpr = Result.Nodes.getNodeAs<Expr>("uselesslock");
diag(MatchedExpr->getBeginLoc(), "this lock is not protecting anything");
}
} // namespace concurrency
} // namespace tidy
} // namespace clang
Member function registerMatcher
will
register the matcher (or matchers) that will then be passed onto function
check
. These use the ASTMatcher library, which is very powerful, but basically
we say: match all CXXTemporaryObjectExpr
nodes (we’ll see below why) using the
matcher constructor cxxTemporaryObjectExpr
. But of those CXXTemporaryObjectExpr
,
only match those that appear in the main file (not in a header) and they are
trying to construct a type std::lock_guard<std::mutex>
. Finally bind a node that
matches all these properties to the name uselesslock
.
Once a node matches it is passed to check
. Here we can do some additional
filtering if that can’t easily be expressed using matchers. We don’t have to do
anything like that so we just diagnose that the lock is useless.
Finally, the required plumbing to compile and register the check follows (in diff format for clarity).
Trying the check
Let’s consider the following input
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
#include <mutex>
std::mutex protect_resource;
struct A
{
explicit A(int);
};
void critical() {
{
std::lock_guard<std::mutex>{protect_resource};
}
{
A{3};
A(3);
}
}
Now we can run clang-tidy
, enabling explicitly our new check (ignore the
complaint about the compilation database, this is kind of expected given that
we haven’t created any).
Voilà, here we have our new diagnostic. Now we can add this to CI if needed. At this point I’ll stop but we can extend it with a fix-it (so clang suggests how to ammend the issue).
How do we know what to match?
There are two mechanisms here. First we can use clang-query
which allows us
to use the C++ syntax in a REPL-like tool. But before we can know what we want
to match we may need to see the AST as clang
represents it. We can do that
doing -Xclang -ast-dump
.
In the example above
This prints a giant tree, after all header mutex
will include lots of things,
but the interesting bits are at the end.
`-FunctionDecl 0x12506090 <line:10:1, line:18:1> line:10:6 critical 'void ()' `-CompoundStmt 0x12507588 <col:17, line:18:1> |-CompoundStmt 0x12506fe0 <line:11:3, line:13:3> | `-ExprWithCleanups 0x12506fc8 <line:12:6, col:50> 'std::lock_guard<std::mutex>':'std::lock_guard<std::mutex>' | `-CXXBindTemporaryExpr 0x12506fa8 <col:6, col:50> 'std::lock_guard<std::mutex>':'std::lock_guard<std::mutex>' (CXXTemporary 0x12506fa8) | `-CXXTemporaryObjectExpr 0x12506f68 <col:6, col:50> 'std::lock_guard<std::mutex>':'std::lock_guard<std::mutex>' 'void (std::lock_guard<std::mutex>::mutex_type &)' list | `-DeclRefExpr 0x12506368 <col:34> 'std::mutex':'std::mutex' lvalue Var 0x125056e0 'protect_resource' 'std::mutex':'std::mutex' `-CompoundStmt 0x12507568 <line:14:3, line:17:3> |-CXXTemporaryObjectExpr 0x125073c8 <line:15:7, col:10> 'A' 'void (int)' list | `-IntegerLiteral 0x12507008 <col:9> 'int' 3 `-CXXFunctionalCastExpr 0x12507540 <line:16:7, col:10> 'A' functional cast to struct A <ConstructorConversion> `-CXXConstructExpr 0x12507510 <col:7, col:10> 'A' 'void (int)' `-IntegerLiteral 0x125074f0 <col:9> 'int' 3
As you can see this is a representation of our function critical
. It has two
compound statements of which we care about the one including that
CXXTemporaryObjectExpr
. This is exactly the node we matched above in the
matcher. To avoid matching other things like A{3}
we need to ensure the
expression has type std::lock_guard<std::mutex>
, again a fact we stated in
the matcher.
Cool right? 😃