Is violation of DRY principle always bad? [closed]
Those are entirely valid reasons to violate DRY. I should add a third: performance. It's rarely a big deal, but it can make a difference, and abstraction can risk slowing things down.
Actually, I'll add a fourth: wasting time and potentially introducing new bugs by changing two (or more) parts of a codebase that might already be working just fine. Is it worth the cost of figuring out how to abstract these things if you don't have to and it probably won't save any or much time in the future?
Typically, duplicated code is not ideal, but there are certainly compelling reasons to allow it, probably including reasons beyond those the OP and I have suggested.
Let's try to understand why DRY is important, and then we can understand where breaking the rule is reasonable:
DRY should be used to avoid the situation where two pieces of code are conceptually doing some of the same work, so whenever you change the code in one place you have to change the code in the other place. If the same logic is in two separate places, then you always have to remember to change the logic in both places, which can be quite error-prone. This can apply at any scale. It can be an entire application that is being duplicated or it can be a single constant value. There may not be any repeated code at all; it may just be a repeated principle. You have to ask: "If I were to make a change in one place, would I necessarily need to make an equivalent change somewhere else?" If the answer is "yes", then the code is violating DRY.
Imagine that you have a line like this in your program:
cost = price + price*0.10; // account for sales tax
and somewhere else in your program, you have a similar line:
x = base_price*1.1; // account for sales tax
If the sales tax changes, you are going to need to change both of those lines. There is almost no repeated code here, but a change in one place requires a change in the other, and that is what makes the code not DRY. What's more, it may be very difficult to realize that you have to make the change in two places. Maybe your unit tests will catch it, but maybe not, so getting rid of the duplication is important. Maybe you would factor the value of the sales tax into a separate constant that can be used in multiple places:
cost = price + price*sales_tax;
x = base_price*(1.0+sales_tax);
or maybe create a function to abstract it even more:
cost = costWithTax(price);
x = costWithTax(base_price);
Either way, it is very likely to be worth the trouble.
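As a rough, runnable sketch of the two refactorings above, here is one way they might look in Python. The names SALES_TAX_RATE and cost_with_tax are illustrative choices, not anything prescribed; the 10% rate and the sample prices are assumptions taken from or invented for the example.

```python
# Hypothetical names; the 10% rate mirrors the example in the text.
SALES_TAX_RATE = 0.10  # the only place the rate is written down


def cost_with_tax(price: float) -> float:
    """Return the price with sales tax added."""
    return price * (1.0 + SALES_TAX_RATE)


# Both call sites now go through the same logic (sample prices are made up).
cost = cost_with_tax(20.0)
x = cost_with_tax(50.0)
```

If the tax rate changes, only the constant needs to be edited, and both call sites pick up the new value.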
Alternatively, you may have code that looks very similar but isn't violating DRY:
x = base_price * 1.1; // add 10% markup for premium service
If you were to change the way sales tax is calculated, you wouldn't want to change that line of code, so it isn't actually repeating any logic.
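To make that distinction concrete, here is a small Python sketch in which the two values happen to be equal but are kept as separate constants. All names are hypothetical, and both rates are assumed to be 10% only because that matches the example.

```python
# Two constants that happen to share a value today but change for
# different reasons. All names here are hypothetical.
SALES_TAX_RATE = 0.10       # changes when tax law changes
PREMIUM_MARKUP_RATE = 0.10  # changes when pricing policy changes


def cost_with_tax(price: float) -> float:
    """Return the price with sales tax added."""
    return price * (1.0 + SALES_TAX_RATE)


def premium_price(base_price: float) -> float:
    """Return the base price with the premium-service markup added."""
    return base_price * (1.0 + PREMIUM_MARKUP_RATE)
```

Merging the two into one shared constant would remove a few characters of apparent duplication, but it would also tie the markup to the tax rate, which is exactly the coupling the text warns against.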
There are also cases where having to make the same change in multiple places is okay. For example, maybe you have code like this:
a0 = f(0);
a1 = f(1);
This code isn't DRY in a few ways. For example, if you were to change the name of function f, you would have to change two places. You could perhaps make the code more DRY by creating a small loop and turning a into an array. However, this particular duplication isn't a big deal. First, the two changes are very close together, so accidentally changing one without changing the other is unlikely. Second, if you are in a compiled language, then the compiler will most likely catch the problem anyway. If you are not in a compiled language, then hopefully your unit tests will catch it.
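For comparison, the loop-and-array version mentioned above might look like this in Python. The body of f is a stand-in, since the original example does not say what f does.

```python
def f(n: int) -> int:
    """Stand-in for the f in the example; its body is an assumption."""
    return n * n


# Original form: two near-identical lines, each naming f.
a0 = f(0)
a1 = f(1)

# Loop form: f is named once and the results live in a list.
a = [f(i) for i in range(2)]
```

Whether the second form is an improvement is a judgment call. For two adjacent lines, the original is arguably just as easy to read and maintain.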
There are many good reasons to make your code DRY, but there are also many good reasons not to.
Yes, certain code duplications are notoriously difficult to factor out without making the readability significantly worse. In such situations I leave a TODO in comments as a reminder that there is some duplication but at the time of writing it seemed better to leave it like that.
Usually what happens is what you describe in your first point: the duplications diverge and are no longer duplications. It also happens that the duplication is a sign of a design issue, but that only becomes clear later.
Long story short: try to avoid duplication; if the duplication is notoriously difficult to factor out and at the time of writing harmless, just leave a comment as a reminder.
See also 97 Things Every Programmer Should Know:
p. 14. Beware the Share by Udi Dahan
The fact that two wildly different parts of the system performed some logic in the same way meant less than I thought. Up until I had pulled out those libraries of shared code, these parts were not dependent on each other. Each could evolve independently. Each could change its logic to suit the needs of the system’s changing business environment. Those four lines of similar code were accidental—a temporal anomaly, a coincidence.
In that case, he created a dependence between two parts of the system that were better kept independent. The solution was essentially duplication.