Faster code-completion with clang
I am investigating potential code-completion speedups while using clang's code-completion mechanism. The flow described below is what I found in rtags, by Anders Bakken.
Translation units are parsed by a daemon that monitors files for changes. This is done by calling clang_parseTranslationUnit and related functions (reparse*, dispose*). When the user requests a completion at a given line and column in a source file, the daemon passes the cached translation unit for the last saved version of the source file, together with the current source file, to clang_codeCompleteAt (Clang CodeComplete docs).
The flags passed to clang_parseTranslationUnit (from CompletionThread::process, line 271) are CXTranslationUnit_PrecompiledPreamble|CXTranslationUnit_CacheCompletionResults|CXTranslationUnit_SkipFunctionBodies. The flags passed to clang_codeCompleteAt (from CompletionThread::process, line 305) are CXCodeComplete_IncludeMacros|CXCodeComplete_IncludeCodePatterns.
The call to clang_codeCompleteAt is very slow: it takes around 3-5 seconds to obtain a completion, even when the completion location is a legitimate member access, which is a subset of the intended use case mentioned in the documentation of clang_codeCompleteAt. This seems far too slow by IDE code-completion standards. Is there a way of speeding this up?
The problem with clang_parseTranslationUnit is that the precompiled preamble is not reused the second time code completion is invoked. Computing the precompiled preamble takes more than 90% of the total time, so you should arrange for the precompiled preamble to be reused as soon as possible.
By default it is reused from the third call to parse/reparse the translation unit.
Take a look at the variable 'PreambleRebuildCounter' in ASTUnit.cpp.
Another problem is that this preamble is saved in a temporary file. You can keep the precompiled preamble in memory instead of in a temporary file, which would be faster. :)
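If the preamble really is only reused after a few parse/reparse calls, one way to act on that without patching clang is to pay for those calls up front, right after the initial parse, instead of during the user's first completion request. The following is only a sketch of that idea. The function name warm_up_preamble and the default of two rounds are assumptions for illustration, and tu is assumed to be any object with a reparse(unsaved_files) method, such as a TranslationUnit from the clang.cindex Python bindings.

```python
import time


def warm_up_preamble(tu, unsaved_files=None, rounds=2):
    """Reparse `tu` `rounds` times and return the seconds each reparse took.

    `tu` is assumed to expose reparse(unsaved_files); the number of rounds
    needed before the preamble is reused is an assumption, not a measured value.
    """
    timings = []
    for _ in range(rounds):
        start = time.perf_counter()
        tu.reparse(unsaved_files)
        timings.append(time.perf_counter() - start)
    return timings
```

With the Python bindings this would be called once on the freshly created translation unit, before the first tu.codeComplete(...) request. In the C API the equivalent call is clang_reparseTranslationUnit. If the returned timings drop sharply after the first or second round, that suggests the preamble is being reused from that point on.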
Sometimes delays of this magnitude are due to timeouts on network resources (NFS or CIFS shares on a file search path, or sockets). Try monitoring the time each system call takes to complete by prefixing the process you run with strace -Tf -o trace.out. Look at the numbers in angle brackets in trace.out for the system call that takes a long time to complete.
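Scanning trace.out by eye gets tedious for a large trace, so a small script can pull out the slowest calls. This is a minimal sketch, assuming each completed call in the log ends with its duration in angle brackets, as strace -T writes it. The function name slowest_syscalls and the sample lines are illustrative, not output captured from a real run.

```python
import re

# `strace -T` appends the time spent in each completed call as "<seconds>".
DURATION_RE = re.compile(r"<(\d+\.\d+)>\s*$")


def slowest_syscalls(lines, top=5):
    """Return up to `top` (seconds, line) pairs, slowest call first."""
    timed = []
    for line in lines:
        match = DURATION_RE.search(line)
        if match:  # lines without a duration (signals, exit notes) are skipped
            timed.append((float(match.group(1)), line.rstrip("\n")))
    timed.sort(key=lambda pair: pair[0], reverse=True)
    return timed[:top]


# Illustrative sample lines, not taken from a real trace.
SAMPLE = [
    '4242 open("/usr/include/stdio.h", O_RDONLY) = 3 <0.000021>',
    '4242 stat("/mnt/nfs/include/vector", 0x7ffd) = -1 ENOENT (No such file or directory) <2.913204>',
    '4242 read(3, "..."..., 4096) = 4096 <0.000047>',
    '4242 +++ exited with 0 +++',
]

for seconds, line in slowest_syscalls(SAMPLE, top=2):
    print(f"{seconds:.6f}s  {line}")
```

For a real trace, pass the open file instead of SAMPLE, for example slowest_syscalls(open("trace.out")).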
You can also monitor the time between system calls to see which file takes too long to process. To do this, prefix the process you run with strace -rf -o trace.out. Look at the number before each system call to find long intervals between system calls. Go backwards from that point looking for open calls to see which file was being processed.
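The backwards search for the preceding open call can also be scripted. The sketch below assumes the strace -rf -o layout of an optional PID column, then the relative timestamp, then the call. The function name find_gaps, the threshold value and the sample lines are assumptions for illustration.

```python
import re

# Optional PID column, then the relative timestamp written by `strace -r`.
LINE_RE = re.compile(r"^\s*(?:\d+\s+)?(\d+\.\d+)\s+(.*)$")
# Path argument of open("...") or openat(dirfd, "...").
OPEN_RE = re.compile(r'\bopen(?:at)?\((?:[^,]*,\s*)?"([^"]*)"')


def find_gaps(lines, threshold=1.0):
    """Return (gap_seconds, call, last_opened_path) for each long interval.

    The timestamp on a line is the time since the previous call started, so
    the file of interest is the last one opened before that line.
    """
    gaps = []
    last_opened = None
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue
        gap, call = float(match.group(1)), match.group(2)
        if gap >= threshold:
            gaps.append((gap, call, last_opened))
        opened = OPEN_RE.search(call)
        if opened:  # remember the path only after checking the gap
            last_opened = opened.group(1)
    return gaps


# Illustrative sample lines, not taken from a real trace.
SAMPLE = [
    '4242      0.000310 openat(AT_FDCWD, "/usr/include/c++/9/vector", O_RDONLY) = 3',
    '4242      0.000052 read(3, "..."..., 4096) = 4096',
    '4242      3.204118 close(3) = 0',
    '4242      0.000040 openat(AT_FDCWD, "/usr/include/stdio.h", O_RDONLY) = 4',
]

for gap, call, path in find_gaps(SAMPLE):
    print(f"{gap:.6f}s before {call!r}; last file opened: {path}")
```

The reported path is only a hint: with -f, calls from several threads or child processes are interleaved, so it can help to filter the log by PID first.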
If this doesn't help, you can profile your process to see where it spends most of its time.