VB.NET What is the purpose of a class or module?
Newbie sauce here... So, I tried to find the answer but couldn't.
What is the purpose of having a class or module? Everything I read tries to tell me what it is, but not what it's for. Why would I need to make one?
Everything I read seems to make assumptions about the person reading the tutorial, as if I already know a lot.
A module is really very similar to just a class containing only shared members. In fact, in C#, there is no such construct as a "module". You cannot write any application without having at least one module or class, so I suspect your real question is not "why use classes and modules", but rather "why use multiple classes and modules and when is it appropriate to start a new one". Since modules and classes are essentially the same thing, I'll just focus on why you would have multiple classes at all. There are essentially four main reasons to create a new class:
- Store data in discrete items
- Organize your code
- Provide seams in your code
- Divide your code into layers and support n-tiers
Now, let's look at each one in more detail:
Store Data in Discrete Items
Often you need to store several pieces of data about a single item and pass that data around between methods as a single object. For instance, if you write an application which works with a person, you will likely want to store several pieces of data about the person, such as their name, age, and title. You could obviously store these three pieces of data as three separate variables, and pass them as separate parameters to methods, such as:
Public Sub DisplayPerson(name As String, age As Integer, title As String)
    Label1.Text = name
    Label2.Text = age.ToString()
    Label3.Text = title
End Sub
However, it's often more convenient to pass all the data as a single object. For instance, you could create a MyPersonClass, like this:
Public Class MyPersonClass
    Public Name As String
    Public Age As Integer
    Public Title As String
End Class
And then you could pass all the data about a person in a single parameter, like this:
Public Sub DisplayPerson(person As MyPersonClass)
    Label1.Text = person.Name
    Label2.Text = person.Age.ToString()
    Label3.Text = person.Title
End Sub
Doing it this way makes it much easier to modify your person in the future. For instance, if you needed to add the ability to store a skill for the person, and you had not put the person data in a class, you would have to go to every place in the code that passes person data and add the additional parameter. In a large project, it could be very difficult to find all of those places to fix, which could lead to bugs. However, the need for the class becomes even more apparent when you start needing to store a list of multiple people. If you need to store the data for 10 different people without a class, you would need a separate list or array for each piece of data, for instance:
Dim names(9) As String
Dim ages(9) As Integer
Dim titles(9) As String
It is, of course, not at all obvious that names(3) and ages(3) both store data for the same person. That is something you just have to know, or you have to write it in a comment so you don't forget. However, this is much cleaner and easier to do when you have the class to store all the data for a person:
Dim persons(9) As MyPersonClass
Now, it's completely obvious that persons(3).Name and persons(3).Age are both data for the same person. In that way, it is self-documenting. No comment is needed to clarify your logic. As a result, again, the code will be less bug-prone.
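One detail worth knowing: an array declared as persons(9) starts out with ten empty slots (each is Nothing) until you assign an object to each one. As a minimal sketch, assuming you would rather use a List(Of MyPersonClass) that grows as needed, filling and reading the collection could look like this (the sample names and values are made up for illustration):

```vbnet
Imports System
Imports System.Collections.Generic

Public Class MyPersonClass
    Public Name As String
    Public Age As Integer
    Public Title As String
End Class

Module Program
    Sub Main()
        ' A List grows as needed, so the number of people need not be known up front.
        Dim persons As New List(Of MyPersonClass)()

        ' Sample data, invented for this example.
        persons.Add(New MyPersonClass With {.Name = "Ann", .Age = 34, .Title = "Dr."})
        persons.Add(New MyPersonClass With {.Name = "Bob", .Age = 51, .Title = "Mr."})

        ' Every piece of data for one person travels together in one object.
        For Each person As MyPersonClass In persons
            Console.WriteLine(person.Title & " " & person.Name & ", age " & person.Age.ToString())
        Next
    End Sub
End Module
```

Adding a new field later, such as a skill, only means touching the class and the places that actually use the new field.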
Often, classes will contain not only the data for a particular item, but also the methods that act on that data. This is a convenient mechanism. For instance, you may want to add a GetDescription method to the person class, such as:
Public Class MyPersonClass
    Public Name As String
    Public Age As Integer
    Public Title As String
    Public Function GetDescription() As String
        Return Title & " " & Name
    End Function
End Class
Then you can use it like this:
For Each person As MyPersonClass In persons
    MessageBox.Show("Hello " & person.GetDescription())
Next
Which, as I'm sure you'll agree, is much cleaner and easier than doing something like this:
For i As Integer = 0 To 9
    MessageBox.Show("Hello " & GetPersonDescription(titles(i), names(i)))
Next
Now let's say you want to store multiple nicknames for each person. As you can easily see, persons(3).Nicknames(0) is far simpler than some crazy two-dimensional array, such as nicknames(3)(0). And what happens if you need to store multiple pieces of data about each nickname? As you can see, not using classes would get messy very fast.
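To make the nickname idea concrete, here is a rough sketch. The Nickname class and its GivenBy field are names I made up for illustration; the point is only that each nickname can become its own small class once it needs to carry more than one piece of data:

```vbnet
Imports System
Imports System.Collections.Generic

' Hypothetical class: one nickname plus some extra data about it.
Public Class Nickname
    Public Text As String
    Public GivenBy As String
End Class

Public Class MyPersonClass
    Public Name As String
    ' Each person owns their own list of nicknames.
    Public Nicknames As New List(Of Nickname)()
End Class

Module Program
    Sub Main()
        Dim person As New MyPersonClass With {.Name = "Robert"}
        person.Nicknames.Add(New Nickname With {.Text = "Bob", .GivenBy = "family"})
        person.Nicknames.Add(New Nickname With {.Text = "Robbie", .GivenBy = "coworkers"})

        ' Reads naturally: the first nickname of this person, and who uses it.
        Console.WriteLine(person.Nicknames(0).Text & " (used by " & person.Nicknames(0).GivenBy & ")")
    End Sub
End Module
```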
Organize Your Code
When you write a lengthy program, it can become very messy very quickly and lead to very buggy code if you do not properly organize your code. The most important weapon you have in this battle against spaghetti code is to create more classes. Ideally, each class will contain only the methods that are logically directly related to each other. Each new type of functionality should be broken out into a new well-named class. In a large project, these classes should be further organized into separate namespaces, but if you don't at least split them out into classes, you are really going to make a mess. For instance, let's say you have the following methods all thrown into the same module:
GetPersonDescription
GetProductDescription
FirePerson
SellProduct
I'm sure you'd agree that it's much easier to follow the code if these methods are broken out into separate classes, such as:
- Person
  - GetDescription
  - Fire
- Product
  - GetDescription
  - Sell
And that's just a very, very simple example. When you have thousands of methods and variables dealing with many different items and different types of items, I'm sure you can easily imagine why classes are important to help organize and self-document the code.
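As a rough sketch of what that split might look like in code: the fields and method bodies below are placeholders I invented (the answer only names the methods), so treat them as assumptions rather than a real design:

```vbnet
Imports System

Public Class Person
    Public Name As String
    Public Title As String
    Public IsEmployed As Boolean = True   ' hypothetical field

    Public Function GetDescription() As String
        Return Title & " " & Name
    End Function

    Public Sub Fire()
        IsEmployed = False
    End Sub
End Class

Public Class Product
    Public Name As String
    Public QuantityInStock As Integer     ' hypothetical field

    Public Function GetDescription() As String
        Return Name & " (" & QuantityInStock.ToString() & " in stock)"
    End Function

    Public Sub Sell()
        ' Placeholder rule: selling removes one unit if any are left.
        If QuantityInStock > 0 Then
            QuantityInStock -= 1
        End If
    End Sub
End Class

Module Program
    Sub Main()
        Dim employee As New Person With {.Name = "Smith", .Title = "Mr."}
        Dim widget As New Product With {.Name = "Widget", .QuantityInStock = 3}

        widget.Sell()
        employee.Fire()

        ' Both classes can have a GetDescription without any naming clash.
        Console.WriteLine(employee.GetDescription())
        Console.WriteLine(widget.GetDescription())
    End Sub
End Module
```

Notice that the method names got shorter: the class name already says what is being described, fired, or sold.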
Provide Seams in Your Code
This one may be a bit more advanced, but it's very important, so I'll take a stab at trying to explain it in simple terms. Let's say you create a trace-logger class which writes log entries to a trace log file. For instance:
Public Class TraceLogger
    Public Sub LogEntry(text As String)
        ' Append the time-stamp to the text
        ' Write the text to the file
    End Sub
End Class
Now, let's say you want the logger class to be able to write to a file or to a database. At this point it becomes obvious that writing the log entry to the file is really a separate type of logic which should have been in its own class all along, so you can break it out into a separate class, like this:
Public Class TextFileLogWriter
    Public Sub WriteEntry(text As String)
        ' Write to file
    End Sub
End Class
Now, you can create a common interface and share it between two different classes. Both classes will handle writing log entries, but they will each perform the functionality in entirely different ways:
Public Interface ILogWriter
    Sub WriteEntry(text As String)
End Interface
Public Class TextFileLogWriter
    Implements ILogWriter
    Public Sub WriteEntry(text As String) Implements ILogWriter.WriteEntry
        ' Write to file
    End Sub
End Class
Public Class DatabaseLogWriter
    Implements ILogWriter
    Public Sub WriteEntry(text As String) Implements ILogWriter.WriteEntry
        ' Write to database
    End Sub
End Class
Now that you have broken that data-access logic out into its own classes, you can refactor your logger class like this:
Public Class TraceLogger
    Public Sub New(writer As ILogWriter)
        _writer = writer
    End Sub
    Private _writer As ILogWriter
    Public Sub LogEntry(text As String)
        ' Append the time-stamp to the text
        _writer.WriteEntry(text)
    End Sub
End Class
Now, you can reuse the TraceLogger class in many more situations without having to ever touch that class. For instance, you could give it an ILogWriter object that writes the entries to the Windows event log, or to a spreadsheet, or even to an email--all without ever touching the original TraceLogger class. This is possible because you have created a seam in your logic between the formatting of the entries and the writing of the entries.
The formatting doesn't care how the entries get logged. All it cares about is how to format the entries. When it needs to write an entry, it just asks a separate writer object to do that part of the work. How and what that writer actually does internally is irrelevant. Similarly, the writer doesn't care how the entry is formatted; it just expects that whatever is passed to it is an already-formatted, valid entry that needs to be logged.
As you may have noticed, not only is the TraceLogger now reusable to write to any kind of log, but also, the writers are reusable for writing any type of log to those types of logs. You could reuse the DatabaseLogWriter, for instance, to write both trace logs and exception logs.
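To show the seam end to end, here is a small runnable sketch. The ConsoleLogWriter class is a stand-in I made up so the example runs without a file or database, and the time-stamp format is just one reasonable choice:

```vbnet
Imports System

Public Interface ILogWriter
    Sub WriteEntry(text As String)
End Interface

' Hypothetical writer for illustration: sends entries to the console.
Public Class ConsoleLogWriter
    Implements ILogWriter

    Public Sub WriteEntry(text As String) Implements ILogWriter.WriteEntry
        Console.WriteLine(text)
    End Sub
End Class

Public Class TraceLogger
    Private ReadOnly _writer As ILogWriter

    Public Sub New(writer As ILogWriter)
        _writer = writer
    End Sub

    Public Sub LogEntry(text As String)
        ' Formatting lives here: prefix the entry with a sortable time-stamp.
        Dim entry As String = DateTime.Now.ToString("s") & " " & text
        ' Writing lives elsewhere: hand the finished entry to whatever writer was supplied.
        _writer.WriteEntry(entry)
    End Sub
End Class

Module Program
    Sub Main()
        ' Swapping the destination means passing a different ILogWriter here;
        ' TraceLogger itself does not change.
        Dim logger As New TraceLogger(New ConsoleLogWriter())
        logger.LogEntry("Application started")
    End Sub
End Module
```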
A Little Rant Regarding Dependency Injection
Just humor me, a little, as I make this answer a little bit longer with a rant about something important to me... In that last example, I used a technique called dependency injection (DI). It's called dependency injection because the writer object is a dependency of the logger class and that dependency object is injected into the logger class via the constructor. You could accomplish something similar without dependency injection by doing something like this:
Public Class TraceLogger
    Public Sub New(mode As LoggerModeEnum)
        If mode = LoggerModeEnum.TextFile Then
            _writer = New TextFileLogWriter()
        Else
            _writer = New DatabaseLogWriter()
        End If
    End Sub
    Private _writer As ILogWriter
    Public Sub LogEntry(text As String)
        ' Append the time-stamp to the text
        _writer.WriteEntry(text)
    End Sub
End Class
However, as you can see, if you do it that way, you'll need to modify that logger class every time you create a new type of writer. And then, just to create a logger, you have to have references to every different type of writer. When you write code this way, pretty soon, any time you include one class, you suddenly have to reference the whole world just to do a simple task.
Another alternative to the dependency injection approach would be to use inheritance to create multiple TraceLogger classes, one per type of writer:
Public MustInherit Class TraceLogger
    Public Sub New()
        _writer = NewLogWriter()
    End Sub
    Private _writer As ILogWriter
    ' Each derived class decides which writer to create
    Protected MustOverride Function NewLogWriter() As ILogWriter
    Public Sub LogEntry(text As String)
        ' Append the time-stamp to the text
        _writer.WriteEntry(text)
    End Sub
End Class
Public Class TextFileTraceLogger
    Inherits TraceLogger
    Protected Overrides Function NewLogWriter() As ILogWriter
        Return New TextFileLogWriter()
    End Function
End Class
Public Class DatabaseTraceLogger
    Inherits TraceLogger
    Protected Overrides Function NewLogWriter() As ILogWriter
        Return New DatabaseLogWriter()
    End Function
End Class
Doing it with inheritance, like that, is better than the mode-enumeration approach, because you don't have to reference all the database logic just to log to a text file, but, in my opinion, dependency injection is cleaner and more flexible.
Back to a Summary of Logic Seams
So, in summary, seams in your logic are important for reusability, flexibility, and interchangeability of your code. In small projects, these things are not of the utmost importance, but as projects grow, having clear seams can become critical.
Another big benefit of creating seams is that it makes the code more stable and testable. Once you know that the TraceLogger works, there is a big advantage to being able to extend it for future uses, such as writing logs to a spreadsheet, without having to touch the actual TraceLogger class. If you don't have to touch it, then you don't risk introducing new bugs and potentially compromising the rest of the code that already uses it. Also, it becomes far easier to test each piece of your code in isolation. For instance, if you wanted to test the TraceLogger class, you could just, for your test, make it use a fake writer object which just logs to memory, or the console, or something.
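Here is a minimal sketch of that testing idea. InMemoryLogWriter is a fake I invented for the example, and the check in Main is a hand-rolled stand-in for whatever test framework you actually use:

```vbnet
Imports System
Imports System.Collections.Generic

Public Interface ILogWriter
    Sub WriteEntry(text As String)
End Interface

' Hypothetical fake writer: keeps entries in memory so a test can inspect them.
Public Class InMemoryLogWriter
    Implements ILogWriter

    Public ReadOnly Entries As New List(Of String)()

    Public Sub WriteEntry(text As String) Implements ILogWriter.WriteEntry
        Entries.Add(text)
    End Sub
End Class

Public Class TraceLogger
    Private ReadOnly _writer As ILogWriter

    Public Sub New(writer As ILogWriter)
        _writer = writer
    End Sub

    Public Sub LogEntry(text As String)
        ' Prefix a time-stamp, then pass the entry to the injected writer.
        _writer.WriteEntry(DateTime.Now.ToString("s") & " " & text)
    End Sub
End Class

Module Program
    Sub Main()
        Dim fakeWriter As New InMemoryLogWriter()
        Dim logger As New TraceLogger(fakeWriter)

        logger.LogEntry("first")
        logger.LogEntry("second")

        ' No file or database is involved; the test only looks at what the logger handed over.
        If fakeWriter.Entries.Count = 2 AndAlso fakeWriter.Entries(0).EndsWith(" first") Then
            Console.WriteLine("TraceLogger passed both entries to its writer")
        Else
            Console.WriteLine("Unexpected result")
        End If
    End Sub
End Module
```

This only works because the writer is passed in through the constructor; with the mode-enumeration version there would be no way to slip the fake in.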
Divide Your Code Into Layers and Support N-Tiers
Once you have properly organized your code into separate classes, where each class is only responsible for one type of task, then you can start to group together your classes into layers. Layers are just a high-level organization of your code. There's nothing specific in the language that makes something technically a layer. Since there's nothing directly in the language that makes it clear where each layer starts and ends, people will often put all the classes for each layer into separate namespaces. So, for instance, you may have namespaces that look like this (where each namespace is a separate layer):
MyProduct.Presentation
MyProduct.Business
MyProduct.DataAccess
Typically, you want at least two layers in your code: the presentation or user-interface layer and the business-logic layer. If your application does any data access, that is typically put in its own layer as well. Each layer should be, as much as possible, independent and interchangeable. So, for instance, if our TraceLogger class in the above example is in a business layer, it should be reusable by any kind of UI.
Layers expand upon all of the previous topics by providing further organization, self-documentation, reusability, and stability. However, another major benefit for layers is that it becomes far easier to split your application into multiple tiers. For instance, if you need to move your business and data access logic into a web service, it will be very simple to do so if you have already written your code cleanly into defined layers. If, however, all of that logic is intermingled and interdependent, then it will be a nightmare to try and break just the data access and business logic out into a separate project.
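As a very small sketch of the namespace-per-layer idea, using the namespace names from above: the classes inside them (PersonRepository, GreetingService, ConsoleView) are hypothetical, and the data-access class just returns a canned value instead of talking to a real database:

```vbnet
Imports System

Namespace MyProduct.DataAccess
    ' Hypothetical data-access class: knows where data comes from, nothing else.
    Public Class PersonRepository
        Public Function LoadName(id As Integer) As String
            ' Placeholder: a real implementation would query a database.
            Return "Person #" & id.ToString()
        End Function
    End Class
End Namespace

Namespace MyProduct.Business
    ' Hypothetical business class: applies the rules, and knows nothing about the UI.
    Public Class GreetingService
        Private ReadOnly _repository As DataAccess.PersonRepository

        Public Sub New(repository As DataAccess.PersonRepository)
            _repository = repository
        End Sub

        Public Function BuildGreeting(id As Integer) As String
            Return "Hello " & _repository.LoadName(id)
        End Function
    End Class
End Namespace

Namespace MyProduct.Presentation
    ' Hypothetical presentation class: only decides how to show the result.
    Public Class ConsoleView
        Public Sub ShowGreeting(service As Business.GreetingService, id As Integer)
            Console.WriteLine(service.BuildGreeting(id))
        End Sub
    End Class
End Namespace

Module Program
    Sub Main()
        Dim repository As New MyProduct.DataAccess.PersonRepository()
        Dim service As New MyProduct.Business.GreetingService(repository)
        Dim view As New MyProduct.Presentation.ConsoleView()
        view.ShowGreeting(service, 7)   ' prints "Hello Person #7"
    End Sub
End Module
```

Because the business layer never mentions the console, the same GreetingService could sit behind a Windows Forms UI or a web service without changes.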
The End of What I Have to Say
In short, you never strictly need to create more than one class or module. It's always going to be possible to write your entire application in a single class or module. Entire operating systems and software suites were developed, after all, before object-oriented languages were even invented. However, there is a reason why object-oriented programming (OOP) languages are so popular. For many projects, object-orientation is incredibly beneficial.
A class is the mechanism to encapsulate state (data) and behaviour (methods).
You need to use classes if you want to have any sort of abstraction in your code - ways to organize your code for usage.
Not having them means your code is all over the place and it will degenerate into something that is hard to change and maintain.
The most basic form a computer program can take is called a Procedure: you write a list of instructions (lines of code) for the computer to perform and then the program exits.
Many computer programs, however, are intended to run independently of the individual clicking "run" each time it is needed. It is this concept of reusing code that is central to most discussions about writing software. A Module allows you to store code in a container and refer to it in other parts of your program.
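As a small sketch of that, assuming a made-up module called Greetings: the function is written once inside the module and can then be called from anywhere else in the program.

```vbnet
Imports System

' Hypothetical module: a container for code that other parts of the program can call.
Module Greetings
    Public Function BuildGreeting(name As String) As String
        Return "Hello, " & name & "!"
    End Function
End Module

Module Program
    Sub Main()
        ' Call the function through the module name...
        Console.WriteLine(Greetings.BuildGreeting("Fido"))
        ' ...or directly, since module members are available without qualification.
        Console.WriteLine(BuildGreeting("Rex"))
    End Sub
End Module
```

There is only ever one Greetings; you never create copies of a module, which is the main difference from the class described next.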
A Class is a more general concept across object-oriented programming, which allows you to define a "thing" and then create it multiple times while your program is running.
Suppose you wanted to create a Virtual Pet game in Visual Basic. You allow the user to add as many different animals as they want, and you begin to realise that keeping track of this is very complicated. By using classes this becomes really easy...
Class PetDog
    Private _id As Integer
    Public Name As String
    Public Breed As String
    Public Sub Bark()
        Console.WriteLine(Name + " says Woof!")
    End Sub
End Class
Once you've written this code, allowing the user to add another Dog to their petting zoo becomes as simple as this:
Dim Dog1 As New PetDog()
Dim Dog2 As New PetDog()
Now you can interact with Dog1 and Dog2 independently of each other, despite defining the PetDog class only once in your code.
Dog1.Name = "Fido"
Dog2.Breed = "Poodle"
Dog1.Bark()
The above snippet would print "Fido says Woof!".
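Since the user can add as many animals as they want, you probably won't want a separate variable per dog. A rough sketch, assuming you keep the dogs in a List(Of PetDog) (the second dog's name is made up):

```vbnet
Imports System
Imports System.Collections.Generic

Class PetDog
    Public Name As String
    Public Breed As String

    Public Sub Bark()
        Console.WriteLine(Name & " says Woof!")
    End Sub
End Class

Module Program
    Sub Main()
        ' The list can hold however many dogs the user adds while the program runs.
        Dim dogs As New List(Of PetDog)()
        dogs.Add(New PetDog With {.Name = "Fido", .Breed = "Beagle"})
        dogs.Add(New PetDog With {.Name = "Rex", .Breed = "Poodle"})

        ' Each dog barks with its own name, even though Bark was written only once.
        For Each dog As PetDog In dogs
            dog.Bark()
        Next
    End Sub
End Module
```

This would print "Fido says Woof!" followed by "Rex says Woof!".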
Hope that helps :)