Remove comments from C/C++ code
Is there an easy way to remove comments from a C/C++ source file without doing any preprocessing? (I.e., I think you can use gcc -E, but this will expand macros.) I just want the source code with comments stripped; nothing else should be changed.
EDIT:
Preference towards an existing tool. I don't want to have to write this myself with regexes, I foresee too many surprises in the code.
Solution 1:
Run the following command on your source file:
gcc -fpreprocessed -dD -E test.c
Thanks to KennyTM for finding the right flags. Here’s the result for completeness:
test.c:
#define foo bar
foo foo foo
#ifdef foo
#undef foo
#define foo baz
#endif
foo foo
/* comments? comments. */
// c++ style comments
gcc -fpreprocessed -dD -E test.c:
#define foo bar
foo foo foo
#ifdef foo
#undef foo
#define foo baz
#endif
foo foo
Solution 2:
It depends on how perverse your comments are. I have a program, scc, to strip C and C++ comments. I also have a test file for it, and I tried GCC (4.2.1 on MacOS X) with the options in the currently selected answer, and GCC doesn't seem to do a perfect job on some of the horribly butchered comments in the test case.
NB: This isn't a real-life problem - people don't write such ghastly code.
Consider this subset (36 of 135 lines) of the test case:
/\
*\
Regular
comment
*\
/
The regular C comment number 1 has finished.
/\
\/ This is not a C++/C99 comment!
This is followed by C++/C99 comment number 3.
/\
\
\
/ But this is a C++/C99 comment!
The C++/C99 comment number 3 has finished.
/\
\* This is not a C or C++ comment!
This is followed by regular C comment number 2.
/\
*/ This is a regular C comment *\
but this is just a routine continuation *\
and that was not the end either - but this is *\
\
/
The regular C comment number 2 has finished.
This is followed by regular C comment number 3.
/\
\
\
\
* C comment */
On my Mac, the output from GCC (gcc -fpreprocessed -dD -E subset.c) is:
/\
*\
Regular
comment
*\
/
The regular C comment number 1 has finished.
/\
\/ This is not a C++/C99 comment!
This is followed by C++/C99 comment number 3.
/\
\
\
/ But this is a C++/C99 comment!
The C++/C99 comment number 3 has finished.
/\
\* This is not a C or C++ comment!
This is followed by regular C comment number 2.
/\
*/ This is a regular C comment *\
but this is just a routine continuation *\
and that was not the end either - but this is *\
\
/
The regular C comment number 2 has finished.
This is followed by regular C comment number 3.
/\
\
\
\
* C comment */
The output from 'scc' is:
The regular C comment number 1 has finished.
/\
\/ This is not a C++/C99 comment!
This is followed by C++/C99 comment number 3.
/\
\
\
/ But this is a C++/C99 comment!
The C++/C99 comment number 3 has finished.
/\
\* This is not a C or C++ comment!
This is followed by regular C comment number 2.
The regular C comment number 2 has finished.
This is followed by regular C comment number 3.
The output from 'scc -C' (which recognizes double-slash comments) is:
The regular C comment number 1 has finished.
/\
\/ This is not a C++/C99 comment!
This is followed by C++/C99 comment number 3.
The C++/C99 comment number 3 has finished.
/\
\* This is not a C or C++ comment!
This is followed by regular C comment number 2.
The regular C comment number 2 has finished.
This is followed by regular C comment number 3.
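Why the test case is so awkward: in C and C++, a backslash immediately followed by a newline is removed (translation phase 2) before comments are recognized (phase 3). So /\ followed by *\ on the next line really does open a comment, while /\ followed by \/ does not start a // comment. The following is a minimal sketch of that splicing step only, not scc's actual implementation; splice_lines is a hypothetical name, and a real comment stripper would also have to keep the splices in the code it retains, which this sketch does not attempt.

```c
/*
 * Hypothetical helper (an assumption for illustration, not taken from scc):
 * copy src to dst with every backslash-newline pair removed, mimicking
 * translation phase 2. dst must have room for strlen(src) + 1 bytes.
 */
static void splice_lines(const char *src, char *dst)
{
    while (*src != '\0')
    {
        if (src[0] == '\\' && src[1] == '\n')
            src += 2;            /* drop the line splice */
        else
            *dst++ = *src++;     /* copy everything else unchanged */
    }
    *dst = '\0';
}
```

Applied to the first six lines of the test case, this yields text that starts with /* and ends with */, which is why "regular C comment number 1" counts as a comment even though no line of the file contains a literal /* or */.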
Source for SCC now available on GitHub
The current version of SCC is 6.60 (dated 2016-06-12), though the Git versions were created on 2017-01-18 (in the US/Pacific time zone). The code is available from GitHub at https://github.com/jleffler/scc-snapshots. You can also find snapshots of the previous releases (4.03, 4.04, 5.05) and two pre-releases (6.16, 6.50); these are all tagged release/x.yz.
The code is still primarily developed under RCS. I'm still working out how I want to use sub-modules or a similar mechanism to handle common library files like stderr.c and stderr.h (which can also be found in https://github.com/jleffler/soq).
SCC version 6.60 attempts to understand C++11, C++14 and C++17 constructs such as binary constants, numeric punctuation, raw strings, and hexadecimal floats. It defaults to C11 mode operation. (Note that the meaning of the -C flag, mentioned above, flipped between version 4.0x described in the main body of the answer and version 6.60, which is currently the latest release.)
Solution 3:
gcc -fpreprocessed -dD -E did not work for me, but this program does the job for /* ... */ comments (it reads stdin, writes stdout, and leaves // comments alone):
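#include <stdio.h>

/* Copy f to stdout, replacing each C-style comment with a single space. */
static void process(FILE *f)
{
    int c;
    while ((c = getc(f)) != EOF)
    {
        if (c == '\'' || c == '"')            /* string or character literal */
        {
            int q = c;
            do
            {
                putchar(c);
                if (c == '\\' && (c = getc(f)) != EOF)
                    putchar(c);               /* copy the escaped character verbatim */
                c = getc(f);
            } while (c != q && c != EOF);     /* stop at the closing quote or at EOF */
            if (c != EOF)
                putchar(c);
        }
        else if (c == '/')                    /* opening comment ? */
        {
            c = getc(f);
            if (c != '*')                     /* no, recover */
            {
                putchar('/');
                ungetc(c, f);                 /* ungetc(EOF, f) is a harmless no-op */
            }
            else
            {
                int p;
                putchar(' ');                 /* replace comment with space */
                c = 0;                        /* so that the text "/*" followed by "/" is not taken as a complete comment */
                do
                {
                    p = c;
                    c = getc(f);
                } while (c != EOF && (c != '/' || p != '*'));
            }
        }
        else
        {
            putchar(c);
        }
    }
}

int main(void)
{
    process(stdin);
    return 0;
}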
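The program above only strips /* ... */ comments. As a rough sketch of how the same state-machine idea could be extended to // comments as well, here is a variant that works on an in-memory string so it is easy to test. The name strip_comments and its buffer-based interface are assumptions made for this illustration, not part of the answer above; it deliberately ignores line splices, trigraphs, and C++11 raw strings.

```c
#include <stddef.h>

/*
 * Hypothetical helper (an assumption for illustration): copy src to dst with
 * comments removed. dst must have room for strlen(src) + 1 bytes.
 * A block comment becomes one space; a // comment is dropped up to, but not
 * including, the newline. String and character literals are copied verbatim.
 */
static void strip_comments(const char *src, char *dst)
{
    enum { CODE, STRING, CHARLIT, BLOCK, LINE } state = CODE;
    size_t i = 0;

    while (src[i] != '\0')
    {
        char c = src[i];
        char next = src[i + 1];    /* safe: at worst this is the terminating NUL */

        switch (state)
        {
        case CODE:
            if (c == '/' && next == '*') { state = BLOCK; *dst++ = ' '; i += 2; continue; }
            if (c == '/' && next == '/') { state = LINE; i += 2; continue; }
            if (c == '"')  state = STRING;
            if (c == '\'') state = CHARLIT;
            *dst++ = c;
            break;
        case STRING:
        case CHARLIT:
            *dst++ = c;
            if (c == '\\' && next != '\0') { *dst++ = next; i += 2; continue; }
            if ((state == STRING && c == '"') || (state == CHARLIT && c == '\''))
                state = CODE;      /* closing quote reached */
            break;
        case BLOCK:
            if (c == '*' && next == '/') { state = CODE; i += 2; continue; }
            break;                 /* comment text is discarded */
        case LINE:
            if (c == '\n') { state = CODE; *dst++ = c; }
            break;                 /* comment text is discarded */
        }
        i++;
    }
    *dst = '\0';
}
```

A caller would pass the file contents and an output buffer of the same size, for example strip_comments(text, out). Because the output is never longer than the input, no reallocation is needed.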
Solution 4:
There is a stripcmt program that can do this:
StripCmt is a simple utility written in C to remove comments from C, C++, and Java source files. In the grand tradition of Unix text processing programs, it can function either as a FIFO (First In - First Out) filter or accept arguments on the command line.
(per hlovdal's answer to: question about Python code for this)
Solution 5:
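This is a Perl script to remove //one-line and /* multi-line */ comments:
#!/usr/bin/perl
use strict;
use warnings;
undef $/;                        # slurp mode: read the whole input at once
my $text = <>;
$text =~ s{//[^\n\r]*}{}g;       # strip //one-line comments, keeping the line ending
$text =~ s{/\*.*?\*/}{}gs;       # strip /* multi-line */ comments (non-greedy, across lines)
print $text;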
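It requires your source file as a command line argument. Save the script to a file, say remove_comments.pl, and call it using the following command: perl -w remove_comments.pl [your source file]
Note that the regexes know nothing about string literals, so a // or /* that appears inside a string is removed as well.
Hope this helps.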