"It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, it was the epoch of belief, it was the epoch of incredulity, it was the season of light, it was the season of darkness, it was the spring of hope, it was the winter of despair"
Have you ever seen a warning from the C# compiler that looks like this:
Perhaps you have, and you've ignored it along with the hundreds of other warnings in your project. I hope that, by the end of this article, you'll decide to do some housekeeping: those warnings are there because people a lot cleverer than me know that issues can arise from ignoring them!
To dive into what this warning means (and why we should care), we have to take a step back and examine three keywords: virtual, override and new. But even before that, we have to look (generally) at how object references work (and this is true for pretty much all OO environments).
Object member resolution.
1. The simple case
Let's say you have a class Animal, and you create an object from it, animal. Let's imagine that that class has a single member, which we'll keep as a method, to keep this simple, for now:
using System;

public class Animal
{
    public string MakeNoise()
    {
        return "generic animal sound";
    }
}

public static class Program
{
    public static void Main(string[] args)
    {
        var animal = new Animal();
        var noise = animal.MakeNoise();
        Console.WriteLine(noise); // prints out "generic animal sound"
    }
}
So far, so good. No surprises here. What's happening under the hood is that, at compile-time, the class is compiled as a "template", with the MakeNoise method compiled into the result, and the address of that method within the assembly is stored alongside the "template" for Animal.
At run-time, the program asks for a new animal, so the template is used to allocate memory for the object, and that object carries a reference back to its type, where the addresses of all the members (in this case, the single MakeNoise method) are kept. This is a simplified picture, but the upshot is that when you invoke animal.MakeNoise(), the memory address for that method is already on-hand. That method was actually compiled with 1 parameter: what is going to be this during the call, and we can get an idea of how the runtime invokes it by doing the same with reflection:
var animal = new Animal();
var method = typeof(Animal).GetMethod("MakeNoise");
method.Invoke(animal, new object[0]);
Note that, even though there are no parameters to MakeNoise, the reflection invocation still takes an argument list: an empty array here (passing null also works for a parameterless method).
Side-notes:
- when invoking a static member method, the this argument is null
- this is analogous to the JavaScript .apply() method on function objects
- most OO languages hide this from you. Python, on the other hand, doesn't -- member methods must have a first argument which is the this pointer, most often called self
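Another way to see the hidden this parameter, beyond the reflection call above, is an "open" delegate: a delegate created from an instance method without binding it to any object, so the instance has to be supplied as an explicit first argument. The following is a minimal sketch that reuses the Animal class from this section; the variable names are only for illustration.

```csharp
using System;

public class Animal
{
    public string MakeNoise()
    {
        return "generic animal sound";
    }
}

public static class Program
{
    public static void Main(string[] args)
    {
        var method = typeof(Animal).GetMethod("MakeNoise");

        // An "open" delegate: no instance is bound up front, so the
        // instance becomes the delegate's explicit first parameter.
        var makeNoise = (Func<Animal, string>)Delegate.CreateDelegate(
            typeof(Func<Animal, string>), method);

        var animal = new Animal();

        // 'animal' is passed where the method expects 'this'
        Console.WriteLine(makeNoise(animal)); // prints "generic animal sound"
    }
}
```

The delegate type Func<Animal, string> has one parameter even though MakeNoise declares none, which is the same shape Python makes you write out by hand with self.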
2. Hiding methods
In the example above, we can see we've set up a base Animal class. We'd perhaps like to make Dogs that "woof" and Cats that "meow", e.g.:
public class Dog: Animal
{
    public string MakeNoise()
    {
        return "woof";
    }
}

public class Cat: Animal
{
    public string MakeNoise()
    {
        return "meow";
    }
}

public static class Program
{
    public static void Main(string[] args)
    {
        var animal = new Animal();
        var dog = new Dog();
        var cat = new Cat();
        // will print out:
        // generic animal sound
        // woof
        // meow
        Console.WriteLine(animal.MakeNoise());
        Console.WriteLine(dog.MakeNoise());
        Console.WriteLine(cat.MakeNoise());
    }
}
All well and good. But we probably want to refactor: each animal simply has its unique sound printed out to the console. What if we did this:
public static class Program
{
    public static void Main(string[] args)
    {
        var animal = new Animal();
        var dog = new Dog();
        var cat = new Cat();
        PrintNoises(new[]
        {
            animal, dog, cat
        });
    }

    private static void PrintNoises(Animal[] animals)
    {
        foreach (var animal in animals)
        {
            Console.WriteLine(animal.MakeNoise());
        }
    }
}
Well, we'd find that instead of getting different sounds, we get the same message ("generic animal sound") three times!
Let's look at Dog to get an idea of what's going on here:
The compiled Dog type actually has two MakeNoise methods which we can find by reflection:
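public void Show()
{
    foreach (var method in typeof(Dog).GetMethods())
    {
        // GetMethods() also returns the public methods inherited from
        // System.Object (ToString, Equals, ...), so skip everything else
        if (method.Name != "MakeNoise") continue;
        Console.WriteLine($"{method.DeclaringType}.{method.Name}");
    }
}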
This prints out two lines:
Dog.MakeNoise
Animal.MakeNoise
So the method that's invoked on the dog object depends entirely on what type it's posing as at the point of calling:
(dog as Animal).MakeNoise(); // generic animal sound
(dog as Dog).MakeNoise(); // woof
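To make that concrete, here is a small self-contained sketch of the same situation. It assumes the warning this article opened with is CS0108 ("hides inherited member ... Use the new keyword if hiding was intended"), which is what the compiler reports for the Dog class below. The variable name sameDog is only for illustration.

```csharp
using System;

public class Animal
{
    public string MakeNoise() => "generic animal sound";
}

public class Dog : Animal
{
    // No 'new' and no 'override': this compiles, but with warning CS0108
    public string MakeNoise() => "woof";
}

public static class Program
{
    public static void Main(string[] args)
    {
        Dog dog = new Dog();
        Animal sameDog = dog; // the very same object, viewed as an Animal

        Console.WriteLine(ReferenceEquals(dog, sameDog)); // True
        Console.WriteLine(dog.MakeNoise());               // woof
        Console.WriteLine(sameDog.MakeNoise());           // generic animal sound
    }
}
```

There is only one object here; the type of the variable it is reached through, not the object itself, decides which of the two methods gets called.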
This is rather inconvenient, but there's an easy way to resolve this:
3. virtual and override
If we change our Animal class a little:
public class Animal
{
    public virtual string MakeNoise()
    {
        return "generic animal sound";
    }
}
First we should see a different compiler warning:
(and if we do nothing about it, the result is the same as if we added the 'new' keyword)
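The warning at this point is presumably CS0114, which asks you to add either override or new. A short sketch of the "do nothing" case, next to an explicit new, shows that the two behave the same way; here Dog leaves the keyword out and Cat spells out new, purely for comparison.

```csharp
using System;

public class Animal
{
    public virtual string MakeNoise() => "generic animal sound";
}

public class Dog : Animal
{
    // Neither 'override' nor 'new': warning CS0114, treated as hiding
    public string MakeNoise() => "woof";
}

public class Cat : Animal
{
    // Explicit 'new': no warning, same hiding behavior
    public new string MakeNoise() => "meow";
}

public static class Program
{
    public static void Main(string[] args)
    {
        Animal dog = new Dog();
        Animal cat = new Cat();

        Console.WriteLine(dog.MakeNoise()); // generic animal sound
        Console.WriteLine(cat.MakeNoise()); // generic animal sound
    }
}
```

Marking the base method virtual is therefore not enough on its own: the derived classes still have to opt in with override.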
Now we update our derivatives:
public class Dog: Animal
{
    public override string MakeNoise()
    {
        return "woof";
    }
}

public class Cat: Animal
{
    public override string MakeNoise()
    {
        return "meow";
    }
}
If we re-run the refactored program, we should see the desired result:
public static void Main(string[] args)
{
    var animal = new Animal();
    var dog = new Dog();
    foreach (var method in typeof(Dog).GetMethods())
    {
        // skip the public methods inherited from System.Object
        if (method.Name != "MakeNoise") continue;
        // note that this now only prints out _one_ method:
        // Dog.MakeNoise
        Console.WriteLine($"{method.DeclaringType}.{method.Name}");
    }
    var cat = new Cat();
    // will print out:
    // generic animal sound
    // woof
    // meow
    PrintNoise(new[]
    {
        animal, dog, cat
    });
}

private static void PrintNoise(Animal[] animals)
{
    foreach (var animal in animals)
    {
        Console.WriteLine(animal.MakeNoise());
    }
}
What's happening here is that calls to MakeNoise on Dog, Cat and Animal no longer have the memory address of one fixed implementation baked in at compile-time. Instead, there's a bit of logic there which boils down to: "at run-time, look at the actual type of the object, and for an object that is a result of new Dog(), have the MakeNoise method always point to the override from the Dog class". Now when that object is up-cast to the type Animal, the Dog.MakeNoise method is still invoked. This is likely to be the desired behavior in 99.99% of the cases where you're deriving from classes and implementing methods with the same name.
Remember also that properties are implemented with backing fields and methods, even when they are auto-props, eg:
public class AutoFoo
{
    public int Id { get; set; }
}

// is the same as:
public class ManualFoo
{
    public int Id
    {
        get => _id; // getter method
        set => _id = value; // setter method
    }
    private int _id;
}
So the same discussion about virtual/override and new applies to properties.
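A brief sketch of both cases side by side, using two made-up read-only properties (Name and Kind are illustrative and don't appear elsewhere in this article):

```csharp
using System;

public class Animal
{
    public virtual string Name => "animal";
    public string Kind => "animal";
}

public class Dog : Animal
{
    public override string Name => "dog"; // replaces the getter
    public new string Kind => "dog";      // hides the getter
}

public static class Program
{
    public static void Main(string[] args)
    {
        Animal pet = new Dog();

        Console.WriteLine(pet.Name);        // dog
        Console.WriteLine(pet.Kind);        // animal
        Console.WriteLine(((Dog)pet).Kind); // dog
    }
}
```

The overridden property follows the object, while the hidden one follows the type of the reference, exactly as with methods.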
When people talk about this virtual table of addresses, you may hear the term "vtable" used.
Conclusion:
- We should pay attention to compiler warnings -- they can save us from unexpected runtime behaviors!
- We should prefer to make members virtual when we intend to override behavior in derived classes
- If we really can't make members virtual and override, then we need to keep in mind that the new keyword simply hides the ancestor member, and we have to be careful about the cast type of the object when that member is invoked
You may wonder why you'd ever use new on purpose! Sometimes you don't have a choice:
1. the new member has a different signature
   - property with a different type
   - method with different return type and same parameters
2. the class we're deriving from is in an assembly not under our control, so we can't make the base class member virtual
In the case of (1), this should be a "code smell" -- an indication that the code is doing something poorly, and should be refactored to be better. In the case of (2), we could also refactor to have a new facade class shielding the original, alien type and exposing the new property that we want. In both cases, choosing to use the new keyword or ignoring the compiler warning can lead to unexpected behaviors at runtime.
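As a rough sketch of the facade idea for case (2): ThirdPartyAnimal below stands in for a type from an assembly we can't change, and AnimalFacade and DogFacade are hypothetical names for our own wrappers. The rest of our code would depend only on the facade, where we are free to use virtual and override.

```csharp
using System;

// Stand-in for a class in an assembly we don't control (hypothetical)
public class ThirdPartyAnimal
{
    public string MakeNoise() => "generic animal sound"; // not virtual
}

// Our own facade: wraps the alien type instead of deriving from it
public class AnimalFacade
{
    private readonly ThirdPartyAnimal _inner;

    public AnimalFacade(ThirdPartyAnimal inner)
    {
        _inner = inner;
    }

    // We own this member, so we can make it virtual
    public virtual string MakeNoise() => _inner.MakeNoise();
}

public class DogFacade : AnimalFacade
{
    public DogFacade(ThirdPartyAnimal inner) : base(inner)
    {
    }

    public override string MakeNoise() => "woof";
}

public static class Program
{
    public static void Main(string[] args)
    {
        var animals = new AnimalFacade[]
        {
            new AnimalFacade(new ThirdPartyAnimal()),
            new DogFacade(new ThirdPartyAnimal())
        };
        foreach (var animal in animals)
        {
            // prints: generic animal sound, woof
            Console.WriteLine(animal.MakeNoise());
        }
    }
}
```

The trade-off is that a DogFacade is no longer a ThirdPartyAnimal, so this only works where callers can be changed to accept the facade type.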