Initialize all the elements of an array to the same number
There are many ways to fill an array with the same value, and if you are concerned about performance then you need to measure.
C++ has a dedicated function for filling an array with a value, and I would use this (after #include <algorithm> and #include <iterator>):
std::fill(std::begin(A), std::end(A), 3);
You shouldn't underestimate what optimizing compilers can do with something like this.
If you are interested in seeing what the compiler does, then Matt Godbolt's Compiler Explorer is a very good tool if you're prepared to learn a little bit of assembler. As you can see from here, compilers can optimize the fill call to twelve (and a bit) 128-bit stores with any loops unrolled. Because compilers have knowledge of the target environment, they can do this without encoding any target-specific assumptions in the source code.
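For anyone who wants to paste something into Compiler Explorer, here is a minimal sketch of the std::fill approach wrapped in a function. The name fill_all and the template parameter N are my own, chosen for illustration; they are not from the question.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>

// Hypothetical helper (name is illustrative): sets every element of a
// built-in array of short to the same value using std::fill.
template <std::size_t N>
void fill_all(short (&a)[N], short value) {
    std::fill(std::begin(a), std::end(a), value);
}
```

Calling fill_all(A, 3) on a short A[100] sets all 100 elements to 3; what the generated code looks like depends on your compiler and flags, so check it there rather than taking any particular instruction count for granted.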
He assumes that long is four times longer than short (that is not guaranteed; he should use int16_t and int64_t).
He takes that longer memory space (64 bits) and fills it with four short (16 bits) values. He is setting up the values by shifting bits by 16 spaces.
Then he wants to treat an array of shorts as an array of longs, so he can set up 100 16-bit values by doing only 25 loop iterations instead of 100.
That's the way your teacher thinks, but as others said this cast is undefined behavior.
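To make the shifting step concrete, here is a small sketch of just the packing part, written with fixed-width types so the 16/64-bit assumption is explicit. The function name pack4 is hypothetical, and I use the unsigned types std::uint16_t and std::uint64_t (rather than the signed ones) so that the shifts are well defined for every input. It does not touch the array, so it does not involve the problematic cast.

```cpp
#include <cstdint>

// Hypothetical helper: builds one 64-bit word that holds four copies
// of the same 16-bit value, one copy per 16-bit slot.
std::uint64_t pack4(std::uint16_t value) {
    std::uint64_t word = 0;
    for (int i = 0; i < 4; ++i) {
        // Make room for one more copy, then put it in the low 16 bits.
        word = (word << 16) | value;
    }
    return word;
}
```

With value 3 this produces 0x0003000300030003, which is the pattern the teacher's code then writes 25 times.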
What an absolute load of hogwash.
For starters,
- v will be computed at compile time.
- The behaviour of dereferencing B following long *B = (long*)A; is undefined as the types are not related. B[i] is a dereference of B.
- There's no justification whatsoever for the assumption that a long is four times larger than a short.
Use a for loop in the simple way and trust the compiler to optimise. Pretty please, with sugar on top.
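Something along these lines is what I mean by the simple way. The function name fill_loop and the template parameter are mine, just for illustration.

```cpp
#include <cstddef>

// Hypothetical helper: assigns the same value to every element
// with an ordinary indexed loop, no casts and no bit tricks.
template <std::size_t N>
void fill_loop(short (&a)[N], short value) {
    for (std::size_t i = 0; i < N; ++i) {
        a[i] = value;
    }
}
```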
The question has the C++ tag (no C tag), so this should be done in C++ style:
// C++ 03
std::vector<int> tab(100, 3);
// C++ 11
auto tab = std::vector<int>(100, 3);
auto tab2 = std::array<int, 100>{};
tab2.fill(3);
Also, the teacher is trying to outsmart the compiler, which can do mind-blowing things. There is no point in doing such tricks, since the compiler can do it for you if configured properly:
- Your code assemblies
- Your code assemblies with trick removed
- Array approach
- Vector approach
As you can see, the -O2 result code is (almost) the same for each version. In case of -O1, tricks give some improvement.
So the bottom line, you have to make a choice:
- Write hard-to-read code and do not use compiler optimizations
- Write readable code and use -O2
Use the Godbolt site to experiment with other compilers and configurations. See also the latest CppCon talk.
As explained by other answers, the code violates type aliasing rules and makes assumptions that are not guaranteed by the standard.
If you really wanted to do this optimization by hand, this would be a correct way that has well-defined behaviour:
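// Needs #include <climits> (CHAR_BIT) and #include <cstring> (std::memcpy).
long v = 0;  // must start at zero before the copies are shifted in
for(int i=0; i < sizeof v / sizeof *A; i++) {
    v = (v << sizeof *A * CHAR_BIT) + 3;
}
for(int i=0; i < sizeof A / sizeof v; i++) {
    // A is an array of short, so step by the number of shorts per long
    std::memcpy(A + i * (sizeof v / sizeof *A), &v, sizeof v);
}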
The unsafe assumptions about the sizes of the objects were fixed by the use of sizeof, and the aliasing violation was fixed by using std::memcpy, which has well-defined behaviour regardless of the underlying type.
That said, it's probably best to keep your code simple and let the compiler do its magic instead.
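If you want to experiment with the manual version anyway, here is a sketch of the same idea as a standalone function. The name fill_packed is hypothetical. Compared with the snippet above it makes a few choices of its own: it packs into an unsigned long so the shifts cannot overflow a signed type, it handles array lengths that are not a multiple of the word size, and it assumes short has no padding bits (true on mainstream platforms, but an assumption).

```cpp
#include <climits>
#include <cstddef>
#include <cstring>

// Hypothetical helper: fills a[0..n) with value by copying one packed
// word at a time via std::memcpy, which avoids the aliasing violation.
void fill_packed(short* a, std::size_t n, short value) {
    static_assert(sizeof(unsigned long) > sizeof(short),
                  "packing needs a word wider than short");
    const std::size_t per_word = sizeof(unsigned long) / sizeof(short);

    // Build a word holding per_word copies of the value's bit pattern.
    const unsigned short bits = static_cast<unsigned short>(value);
    unsigned long word = 0;
    for (std::size_t i = 0; i < per_word; ++i) {
        word = (word << (sizeof(short) * CHAR_BIT)) | bits;
    }

    // Copy whole words while at least per_word elements remain.
    std::size_t i = 0;
    for (; i + per_word <= n; i += per_word) {
        std::memcpy(a + i, &word, sizeof word);
    }
    // Set any leftover elements one by one.
    for (; i < n; ++i) {
        a[i] = value;
    }
}
```

Usage would be fill_packed(A, 100, 3) for the array in the question. I would still measure this against std::fill before believing it is any faster.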
Why I need the left shift operator?
The point is to fill a bigger integer with multiple copies of the smaller integer. If you write a two-byte value s to a big integer l, then shift the bits left for two bytes (my fixed version should be clearer about where those magic numbers came from) then you'll have an integer with two copies of the bytes that constitute the value s. This is repeated until all pairs of bytes in l are set to those same values. To do the shift, you need the shift operator.
When those bytes are copied over an array of two-byte integers, a single copy sets several elements at once to the bytes of the larger object. Since each pair of bytes has the same value, so will the smaller integers of the array.
Why I need another array of long?
There are no arrays of long. Only an array of short.