Why are private fields private to the type, not the instance?

In C# (and many other languages) it's perfectly legitimate to access private fields of other instances of the same type. For example:

public class Foo
{
    private bool aBool;

    public void DoBar(Foo anotherFoo)
    {
        if (anotherFoo.aBool) ...
    }
}

As the C# specification (sections 3.5.1, 3.5.2) states, access to private fields is granted per type, not per instance. I've been discussing this with a colleague and we're trying to come up with a reason why it works like this (rather than restricting access to the same instance).

The best argument we could come up with is for equality checks where the class may want to access private fields to determine equality with another instance. Are there any other reasons? Or some golden reason that absolutely means it must work like this or something would be completely impossible?
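To make the equality argument concrete, here is a minimal sketch. The Point class and its fields are hypothetical names chosen for illustration; the only thing it is meant to show is that Equals can compare private state directly, without any public accessor.

```csharp
using System;

// Hypothetical example type: its coordinates are never exposed publicly.
public sealed class Point : IEquatable<Point>
{
    private readonly int x;
    private readonly int y;

    public Point(int x, int y)
    {
        this.x = x;
        this.y = y;
    }

    // Reads the private fields of another instance. This compiles because
    // 'other' has the same type, even though it is a different object.
    public bool Equals(Point other)
    {
        return !ReferenceEquals(other, null) && x == other.x && y == other.y;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Point);
    }

    public override int GetHashCode()
    {
        // unchecked: overflow in the multiplication simply wraps around.
        unchecked
        {
            return (x * 397) ^ y;
        }
    }
}
```

With instance-level privacy, the same comparison would presumably need a public or otherwise widened accessor for x and y, which exposes them to every other class as well.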


Solution 1:

I think one reason it works this way is that access modifiers are enforced at compile time. At that point, determining whether a given object is also the current object isn't easy to do. For example, consider this code:

public class Foo
{
    private int bar;

    public void Baz(Foo other)
    {
        other.bar = 2;
    }

    public void Boo()
    {
        Baz(this);
    }
}

Can the compiler necessarily figure out that other is actually this? Not in all cases. One could argue that this just shouldn't compile then, but that means we have a code path where a private instance member of the correct instance isn't accessible, which I think is even worse.

Only requiring type-level rather than object-level visibility ensures that the problem is tractable, as well as making a situation that seems like it should work actually work.

EDIT: Daniel Hilgarth's point that this reasoning is backwards does have merit. Language designers can create the language they want, and compiler writers must conform to it. That being said, language designers do have some incentive to make it easier for compiler writers to do their job. (Though in this case, it's easy enough to argue that private members could then only be accessed via this (either implicitly or explicitly)).

However, I believe that makes the issue more confusing than it needs to be. Most users (myself included) would find it unnecessarily limiting if the above code didn't work: after all, that's my data I'm trying to access! Why should I have to go through this?

In short, I think I may have overstated the case for it being "difficult" for the compiler. What I really meant to get across is that the above situation seems like one that the designers would like to have work.

Solution 2:

Because the purpose of the kind of encapsulation used in C# and similar languages* is to lower mutual dependence of different pieces of code (classes in C# and Java), not different objects in memory.

For example, if you write code in one class that uses some fields in another class, then these classes are very tightly coupled. However, if you are dealing with code in which you have two objects of the same class, then there is no extra dependency. A class always depends on itself.
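As a rough sketch of that point, consider a hypothetical Account class (the name, the TransferTo method and its validation rules are all assumptions made for illustration). One instance modifies the private field of another, yet no code outside the class learns anything about how the balance is stored.

```csharp
using System;

// Hypothetical example type used to illustrate same-class access.
public class Account
{
    private decimal balance;

    public Account(decimal openingBalance)
    {
        balance = openingBalance;
    }

    // Read-only view of the balance; callers cannot assign to it.
    public decimal Balance
    {
        get { return balance; }
    }

    // Writes to the private field of another Account. No new dependency is
    // created: if the representation of 'balance' changes, only this class
    // has to be edited.
    public void TransferTo(Account target, decimal amount)
    {
        if (target == null) throw new ArgumentNullException("target");
        if (amount < 0 || amount > balance) throw new ArgumentOutOfRangeException("amount");

        balance -= amount;
        target.balance += amount;
    }
}
```

If TransferTo lived in a different class, it would need some public way to change a balance, and that class would then be coupled to Account's interface in a way this version avoids.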

However, all this theory about encapsulation fails as soon as someone creates properties (or get/set pairs in Java) and exposes all the fields directly, which makes classes as coupled as if they were accessing fields anyway.

*For clarification on kinds of encapsulation see Abel's excellent answer.

Solution 3:

Quite a few answers have already been added to this interesting thread, but I didn't find the real reason why this behavior is the way it is. Let me give it a try:

Back in the day

Somewhere between Smalltalk in the 80's and Java in the mid 90's, the concept of object orientation matured. Information hiding, which was not originally thought of as a concept exclusive to OO (it was first mentioned in 1978), was introduced in Smalltalk in this form: all data (fields) of a class is private and all methods are public. During the many new developments of OO in the 90's, Bertrand Meyer tried to formalize many of the OO concepts in his landmark book Object Oriented Software Construction (OOSC), which has since been considered an (almost) definitive reference on OO concepts and language design.

In the case of private visibility

According to Meyer, a method should be made available to a defined set of classes (pages 192-193). This obviously gives very fine-grained control over information hiding. For example, the following feature is available to classA and classB and all their descendants:

feature {classA, classB}
   methodName

In the case of private he says the following: unless a feature is explicitly declared visible to its own class, you cannot access that feature (method/field) in a qualified call. That is, if x is a variable, x.doSomething() is not allowed. Unqualified access is allowed, of course, inside the class itself.

In other words: to allow access by an instance of the same class, you have to allow the method access by that class explicitly. This is sometimes called instance-private versus class-private.

Instance-private in programming languages

I know of at least two languages currently in use that use instance-private information hiding as opposed to class-private information hiding. One is Eiffel, a language designed by Meyer that takes OO to its utmost extremes. The other is Ruby, a far more common language nowadays. In Ruby, private means: "private to this instance".

Choices for language design

It has been suggested that allowing instance-private would be hard for the compiler. I don't think so, as it is relatively simple to just allow or disallow qualified calls to methods. If for a private method, doSomething() is allowed and x.doSomething() is not, a language designer has effectively defined instance-only accessibility for private methods and fields.
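The distinction between qualified and unqualified calls can be sketched in C# itself. The Counter class below is a hypothetical example; all three calls compile in real C#, and the comments mark which one an instance-private rule of the kind described above would reject.

```csharp
using System;

// Hypothetical example type contrasting unqualified and qualified calls.
public class Counter
{
    private int count;

    private void Increment()
    {
        count++;
    }

    public int Count
    {
        get { return count; }
    }

    public void Bump(Counter other)
    {
        Increment();        // unqualified call: allowed under either rule
        this.Increment();   // explicitly through 'this': still the current instance
        other.Increment();  // qualified call on another instance: legal in C#,
                            // but rejected under an instance-private rule
    }
}
```

The check needed for the instance-private variant is purely syntactic (is the receiver absent or this?), so it requires no runtime knowledge of whether other happens to refer to the current object.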

From a technical point of view, there's no reason to choose one way or the other. Considering that Eiffel.NET can do this with IL, even with multiple inheritance, there's no inherent reason not to provide this feature.

Of course, it's a matter of taste and, as others have already mentioned, quite a few methods might be harder to write without class-level visibility of private methods and fields.

Why C# allows only class encapsulation and not instance encapsulation

If you look at internet threads on instance encapsulation (a term sometimes used to refer to the fact that a language defines the access modifiers on instance level, as opposed to class level), the concept is often frowned upon. However, the fact that some modern languages use instance encapsulation, at least for the private access modifier, suggests that it can be and is of use in the modern programming world.

However, C# has admittedly looked hardest at C++ and Java for its language design. While Eiffel and Modula-3 were also in the picture, considering how many features of Eiffel are missing from C# (multiple inheritance, for example), I believe the designers chose the same route as Java and C++ when it came to the private access modifier.

If you really want to know the why, you should try to get hold of Eric Lippert, Krzysztof Cwalina, Anders Hejlsberg or anyone else who worked on the C# standard. Unfortunately, I couldn't find a definitive note in the annotated The C# Programming Language.

Solution 4:

This is only my opinion, but pragmatically, I think that if a programmer has access to the source of a class, you can reasonably trust them with access to the private members of that class's instances. Why bind a programmer's right hand when you've already put the keys to the kingdom in their left?