Declaring variables inside loops, good practice or bad practice?

Question #1: Is declaring a variable inside a loop a good practice or bad practice?

I've read the other threads about whether or not there is a performance issue (most said no), and that you should always declare variables as close to where they are going to be used. What I'm wondering is whether or not this should be avoided or if it's actually preferred.

Example:

for(int counter = 0; counter <= 10; counter++)
{
   string someString = "testing";

   cout << someString;
}

Question #2: Do most compilers realize that the variable has already been declared and just skip that portion, or does it actually create a spot for it in memory each time?

This is excellent practice.

By creating variables inside loops, you ensure their scope is restricted to inside the loop. It cannot be referenced nor called outside of the loop.

This way:

If the name of the variable is a bit "generic" (like "i"), there is no risk to mix it with another variable of same name somewhere later in your code (can also be mitigated using the -Wshadow warning instruction on GCC)
The compiler knows that the variable scope is limited to inside the loop, and therefore will issue a proper error message if the variable is by mistake referenced elsewhere.
Last but not least, some dedicated optimization can be performed more efficiently by the compiler (most importantly register allocation), since it knows that the variable cannot be used outside of the loop. For example, no need to store the result for later re-use.

In short, you are right to do it.

Note however that the variable is not supposed to retain its value between each loop. In such case, you may need to initialize it every time. You can also create a larger block, encompassing the loop, whose sole purpose is to declare variables which must retain their value from one loop to another. This typically includes the loop counter itself.

{
    int i, retainValue;
    for (i=0; i<N; i++)
    {
       int tmpValue;
       /* tmpValue is uninitialized */
       /* retainValue still has its previous value from previous loop */

       /* Do some stuff here */
    }
    /* Here, retainValue is still valid; tmpValue no longer */
}

For question #2: The variable is allocated once, when the function is called. In fact, from an allocation perspective, it is (nearly) the same as declaring the variable at the beginning of the function. The only difference is the scope: the variable cannot be used outside of the loop. It may even be possible that the variable is not allocated, just re-using some free slot (from other variable whose scope has ended).

With restricted and more precise scope come more accurate optimizations. But more importantly, it makes your code safer, with less states (i.e. variables) to worry about when reading other parts of the code.

This is true even outside of an if(){...} block. Typically, instead of :

    int result;
    (...)
    result = f1();
    if (result) then { (...) }
    (...)
    result = f2();
    if (result) then { (...) }

it's safer to write :

    (...)
    {
        int const result = f1();
        if (result) then { (...) }
    }
    (...)
    {
        int const result = f2();
        if (result) then { (...) }
    }

The difference may seem minor, especially on such a small example. But on a larger code base, it will help : now there is no risk to transport some result value from f1() to f2() block. Each result is strictly limited to its own scope, making its role more accurate. From a reviewer perspective, it's much nicer, since he has less long range state variables to worry about and track.

Even the compiler will help better : assuming that, in the future, after some erroneous change of code, result is not properly initialized with f2(). The second version will simply refuse to work, stating a clear error message at compile time (way better than run time). The first version will not spot anything, the result of f1() will simply be tested a second time, being confused for the result of f2().

Complementary information

The open-source tool CppCheck (a static analysis tool for C/C++ code) provides some excellent hints regarding optimal scope of variables.

In response to comment on allocation: The above rule is true in C, but might not be for some C++ classes.

For standard types and structures, the size of variable is known at compilation time. There is no such thing as "construction" in C, so the space for the variable will simply be allocated into the stack (without any initialization), when the function is called. That's why there is a "zero" cost when declaring the variable inside a loop.

However, for C++ classes, there is this constructor thing which I know much less about. I guess allocation is probably not going to be the issue, since the compiler shall be clever enough to reuse the same space, but the initialization is likely to take place at each loop iteration.

Generally, it's a very good practice to keep it very close.

In some cases, there will be a consideration such as performance which justifies pulling the variable out of the loop.

In your example, the program creates and destroys the string each time. Some libraries use a small string optimization (SSO), so the dynamic allocation could be avoided in some cases.

Suppose you wanted to avoid those redundant creations/allocations, you would write it as:

for (int counter = 0; counter <= 10; counter++) {
   // compiler can pull this out
   const char testing[] = "testing";
   cout << testing;
}

or you can pull the constant out:

const std::string testing = "testing";
for (int counter = 0; counter <= 10; counter++) {
   cout << testing;
}

Do most compilers realize that the variable has already been declared and just skip that portion, or does it actually create a spot for it in memory each time?

It can reuse the space the variable consumes, and it can pull invariants out of your loop. In the case of the const char array (above) - that array could be pulled out. However, the constructor and destructor must be executed at each iteration in the case of an object (such as std::string). In the case of the std::string, that 'space' includes a pointer which contains the dynamic allocation representing the characters. So this:

for (int counter = 0; counter <= 10; counter++) {
   string testing = "testing";
   cout << testing;
}

would require redundant copying in each case, and dynamic allocation and free if the variable sits above the threshold for SSO character count (and SSO is implemented by your std library).

Doing this:

string testing;
for (int counter = 0; counter <= 10; counter++) {
   testing = "testing";
   cout << testing;
}

would still require a physical copy of the characters at each iteration, but the form could result in one dynamic allocation because you assign the string and the implementation should see there is no need to resize the string's backing allocation. Of course, you wouldn't do that in this example (because multiple superior alternatives have already been demonstrated), but you might consider it when the string or vector's content varies.

So what do you do with all those options (and more)? Keep it very close as a default -- until you understand the costs well and know when you should deviate.

I didn't post to answer JeremyRR's questions (as they have already been answered); instead, I posted merely to give a suggestion.

To JeremyRR, you could do this:

{
  string someString = "testing";   

  for(int counter = 0; counter <= 10; counter++)
  {
    cout << someString;
  }

  // The variable is in scope.
}

// The variable is no longer in scope.

I don't know if you realize (I didn't when I first started programming), that brackets (as long they are in pairs) can be placed anywhere within the code, not just after "if", "for", "while", etc.

My code compiled in Microsoft Visual C++ 2010 Express, so I know it works; also, I have tried to to use the variable outside of the brackets that it was defined in and I received an error, so I know that the variable was "destroyed".

I don't know if it is bad practice to use this method, as a lot of unlabeled brackets could quickly make the code unreadable, but maybe some comments could clear things up.

Declaring variables inside loops, good practice or bad practice?

Complementary information

Related

Recent Posts