Designing for Reuse with Inheritance

May 7, 1998
by Paul Hemmer

As a software developer you are a designer. The tools of Object Oriented Design will allow you to conceptualize and implement complete systems which are reliable and maintainable. Central to the idea of Object Oriented Programming and Design is the notion of Inheritance. Often a buzzword among those who are "Object Oriented", Inheritance is generally perceived by those who have not yet grasped the notion as mysterious and overly complicated. Once understood, however, Inheritance provides the basis on which software can be designed for reusability and extendibility. That is, if it is used properly. Although the unique nature of Director as a development environment allows you to use as much or as little Lingo as you like, this only holds true for small scale projects. Increasingly, as developers we are harnessing the power of Director as an environment for larger scale application development.

Polymorphism, Encapsulation, Inheritance, Decentralization

These are some terms which you will stumble across when reading about Object Oriented programming and design. Several syllables long and somewhat indirect in nature, these terms naturally push people away from further exploring their meanings. In essence, each of these terms refers to a relatively simple concept. If you understand the idea that objects are containers for related pieces of data and the operations (methods) that work directly with that data, you understand the idea of Encapsulation. Encapsulation simply represents the idea of bundling related data and operations into a single package. Inheritance is the tool that allows objects to share commonalities. Polymorphism is what allows objects to "tweak" their commonalities to better suit their own needs. Decentralization refers to the idea that there is no main controller or system manager from which parts of the system send or receive messages and instructions. These concepts will be further illustrated below.

Picture a system containing many "bundles" of functionality working together and communicating as a complete system. It sounds simple enough, but it's all too easy to approach the design of these intercommunicating modules in haste and as an afterthought to initial code writing. Doing so will most likely lead to design errors that of course can be avoided. The basic idea behind having distinct "bundles" of functionality is the notion of decentralization. In older procedural programming paradigms, systems were "controlled" by some kind of main procedure. This main procedure would accept messages and send messages to different procedures and functions on a system wide level. The entire program was controlled by a central, "main", procedure. The idea of a system wide controller object goes directly against the notion of a decentralized system. The idea here is to break free from the mindset which requires programs to be controlled from a central control mechanism.

To break free from this mindset, you must begin visualizing systems as a collection of independently owned and operated "bundles" of functionality which can talk to each other, but do not rely on some central controller to tell them what to do and when to do it. If you find that your design includes a global controller class, you have not fully realized the abstract nature of Object Orientation and aren't designing your classes to be fully self contained units of operation. The use of a global controller class is a clear sign that you are still thinking procedurally, only decorating your thoughts with object syntax. The goal here is to aid you in realizing the level of abstraction necessary to design Object Oriented systems that work.

Classes and objects are not one and the same. The difference between classes and objects lies in a simple idea that is vital to the proper design of Object Oriented systems. During the runtime of a program, an instance of a class is called an object. For example, if you have a Lingo parent script called person, the script itself is a class. Think of it as a template for an object. Only when you've created an instance of that class with the following statement do you have an object:

set personObject = new(script "person")

In the previous example, the object is personObject and it is an instance of the class defined by the parent script person. The difference seems trivial at first yet is extremely important when designing reusable and extendible software. The object is a distinct runtime instance of the class from which it was created. Sometimes this point is only realized for the fact that each instance can have unique values for common properties. While this is certainly correct, there is more to it from a design perspective. This narrow view of the distinction between objects and classes often leads to design errors, illustrated as follows.
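As a minimal sketch of the distinction, assume a hypothetical version of the person parent script with a single pName property and an accessor for it (both are illustrative and not part of the example above). Two objects created from the same class each carry their own value:

```lingo
-- parent script : person (hypothetical minimal version)
property pName

on new me, dName
  set pName = dName
  return me
end

on mGetName me
  return pName
end

-- movie script : two distinct objects, one class
on startMovie
  set personA = new(script "person", "Ann")
  set personB = new(script "person", "Bob")
  -- each object answers with its own name
  put mGetName(personA)
  put mGetName(personB)
end
```

The script "person" is written once; personA and personB are separate runtime instances of it.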

Picture yourself conceptualizing the Object Oriented Design for a new software system. You are doing what is often described as "finding classes." A common way of finding possible classes is by reading the specification for the software and looking for nouns. Nouns represent "things" and so do classes. Words like "person", "boy", "account", "tool palette", "schedule", "mailbox", "scrollbar" etc., are nouns and therefore are listed as possible classes. Often when doing an Object Oriented Design, this list of words becomes the final list of classes which later get implemented. The cause? Maybe it's an instinctual "itch" to start writing code. This often leads to design errors involving many wrong and missing classes. Such errors cause increased complexity and decreased reusability. In such a case, it can be said that the level of abstraction which the design has reached is not as high as it should be. There is an abundance of unnecessary classes.

Simplicity is Designed

Abstraction represents the idea of breaking down ideas and tasks into simpler ideas and subtasks that are easier to comprehend and implement. On a physical level, this is something people do every day. If you've written any kind of computer program in a procedural language, you have certainly used abstraction to some extent to break a problem down. In Object Oriented Programming and Design, abstraction represents finding commonalities between related ideas, and breaking ideas down into distinct groups of related data and operations. For example, a system with only two classes representing cop and robber is not at its highest level of abstraction. Without also having a class representing person from which cop and robber inherit commonalities, duplicate code would need to be written in both cop and robber to implement what should only be implemented once in a more abstract class.

Sure classes for cop and robber could have been written "stand alone." However, in so doing what would happen to the reusability and extendibility of the software? If you decided to add a baby as a new type of person, it wouldn't make much sense to have the baby class be a descendant of cop or robber. The same code common to every person would have to be rewritten yet again, and then again for every other new "type of person" you decide to add to the system. This defeats the notion of reusability and extendibility through design. This may seem like a simple and obvious abstraction. If it does, that's because it is. Many of the abstractions needed in a complex system, however, are not so obvious. Because of this, it is of the utmost importance that the system be designed in its entirety before writing the first line of code.
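The person abstraction described above might be sketched as follows. The property and method names (pName, pBadgeNumber, pLoot and their accessors) are assumptions made up for illustration; only the idea of cop and robber sharing a person ancestor comes from the discussion:

```lingo
-- parent script : person (hypothetical)
-- everything common to every kind of person lives here
property pName

on new me, dName
  set pName = dName
  return me
end

on mGetName me
  return pName
end

-- parent script : cop (hypothetical)
property ancestor, pBadgeNumber

on new me, dName, dBadgeNumber
  set ancestor = new(script "person", dName)
  set pBadgeNumber = dBadgeNumber
  return me
end

on mGetBadgeNumber me
  return pBadgeNumber
end

-- parent script : robber (hypothetical)
property ancestor, pLoot

on new me, dName
  set ancestor = new(script "person", dName)
  set pLoot = []
  return me
end

on mSteal me, dItem
  append pLoot, dItem
end

-- movie script
on startMovie
  set dCop = new(script "cop", "Joe", 1234)
  -- mGetName is written once, in person, and reused
  put mGetName(dCop)
end
```

A baby class added later would set the same ancestor and implement only what is unique to a baby.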

Adding functionality in a descendant class (a class which inherits from another class) is a common use for Inheritance. However, you might want to simply change the functionality provided by the parent class. For example, maybe the method written in person for "walk" could be rewritten in baby to reflect crawling. The method in baby would still be called "walk" but its implementation would be different. What was just described represents Polymorphism, a long and ugly word describing a descendant class that changes functionality found in its parent class to better suit its own needs. At runtime it may seem "magical" but it's really just designed functionality and key to the idea of software reuse by design. Notice also that Polymorphism intends to eliminate the need for case statements in your design. If you find you're using case statements to determine what "kind" of object something is before sending it a message, it is a clear sign that more levels of Inheritance are needed. Polymorphism represents a "magical" automatic case statement for objects.
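A rough sketch of the walk example, assuming hypothetical person and baby parent scripts whose mWalk method simply returns a descriptive string (the method name and return values are illustrative):

```lingo
-- parent script : person (hypothetical)
on new me
  return me
end

on mWalk me
  return "walking"
end

-- parent script : baby (hypothetical)
property ancestor

on new me
  set ancestor = new(script "person")
  return me
end

-- same message name, different implementation
on mWalk me
  return "crawling"
end

-- movie script
on startMovie
  set dPeople = [new(script "person"), new(script "baby")]
  repeat with dPerson in dPeople
    -- no case statement : each object decides how it walks
    put mWalk(dPerson)
  end repeat
end
```

The calling code never asks what kind of object it is holding; it sends the same message to each one.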

Before moving on from Polymorphism, a common design pitfall must be illustrated. The problem exists when a descendant class is "changing" derived functionality to make it do nothing. Can you see the conceptual error in such an example? Remember what Inheritance means. If a class B inherits from a class A, then B is a special version of A. This means that B is everything that is A, plus some. If you're writing an empty method in B to override the method of the same name found in A, clearly you have made a conceptual error. In such a case, common sense says that either B is not really a special version of A, or there are methods and properties in A that are not really common to all instances of A.

The following illustrates a trivial example. If a class Person has a method called talk and you add a descendant class called mutePerson, one approach to eliminating the ability for a mutePerson to talk would be to create an empty talk method within mutePerson so that a call to mutePerson to talk won't do anything. But does this make sense? Granted it will work just fine, but doesn't it make more sense that talk simply is not a common attribute of every person? By further abstracting the Inheritance hierarchy to include an ancestor class which truly encapsulates what is common to all people, and deriving from that more specialized types of people, this type of error can be easily avoided.
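One way the reworked hierarchy might look is sketched below. The speakingPerson class and the mSign method are hypothetical names introduced only for illustration; the point is that talk moves out of person, so mutePerson never has to override it with an empty method:

```lingo
-- parent script : person (hypothetical)
-- holds only what is common to every person
property pName

on new me, dName
  set pName = dName
  return me
end

on mGetName me
  return pName
end

-- parent script : speakingPerson (hypothetical)
property ancestor

on new me, dName
  set ancestor = new(script "person", dName)
  return me
end

on mTalk me
  return "Hello, my name is" && mGetName(me)
end

-- parent script : mutePerson (hypothetical)
property ancestor

on new me, dName
  set ancestor = new(script "person", dName)
  return me
end

-- adds its own behaviour instead of blanking out mTalk
on mSign me
  return "(signs) my name is" && mGetName(me)
end
```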

So you're still faced with the task of properly finding out what classes will need to be written and how they will interact. Often the nouns found in the specification will provide you with a good start in finding classes. It would follow that adjectives describe Inheritance, because an adjective represents a "special version" of something else. However, for each possible class, you must carefully consider the abstraction which is really being described, as well as the data structure which provides its foundation.

Recall the list of possible classes described above (person, boy, account, schedule, mailbox, and scrollbar.) Think of what types of data and operations would be implemented if each of those words represented a class. You might say the obvious things for person would be properties for sex, age, weight and height etc., as well as methods available to get and set each of those properties. Next consider the use of the term boy as a class name. Obviously this class would have a person as its ancestor, after all a boy also has a sex, age, height, weight etc. Should boy be a distinct abstraction? It is a noun in the specification, but does boy represent nothing more than a male person with a young age? If so, the abstraction should not be made. For example :


set boy = new(script "person")
mSetAge(boy, 7)
mSetSex(boy, "male")

Clearly boy is an instance of person and not a distinct class. The difference between a class and an object is not always obvious through nouns in a specification. If it were, the mistake of abstracting boy as a separate class would not have been made. In fact, relying on nouns alone in a specification tends to blur the distinction between what are truly abstractions and what should simply be instances of a common abstraction. When a descendant class adds nothing to the class which it inherits from, it is an unnecessary abstraction and will only add complexity to the system.

Another warning sign that a class represents a bad abstraction is when it is a descendant of another class and there is only going to be one instance of it in the system. This will happen if you miss the distinction between a class and an object and create a distinct class to represent something which is simply an instance of another class. For example, while carCompany could be a possible class name, BMW would not be a class which inherited from carCompany, it would be a specific instance of carCompany. Also, as stated earlier, beware of class abstractions which represent "system controllers" or the like.

How do you find the classes that don't jump right out as nouns? The process involves taking a "noun" as a language dependent "thing" and considering what makes this "thing" what it is. Consider the terms account, schedule and mailbox. Can you think of any abstractions which might be missing even without a software specification to read? It might help to describe what each of these terms is meant to represent.

An account represents a history of transactions and a balance value.
A schedule represents a list of appointments.
A mailbox represents a list of messages.

Having found only these nouns in the (small and purely illustrative) system specification one might begin stubbing out scripts for these classes. Maybe beginning with the following :


-- parent script : classMailbox
property pMessageList

on new me
  set pMessageList = []
  return me
end

-- methods (stubs) :
on mGetNextMessage me
end
on mGetFirstMessage me
end
on mGetLastMessage me
end
on mAddMessage me
end
on mRemoveMessage me
end

-- parent script : classBankAccount
property pTransactionHistory, pBalance

on new me
  set pTransactionHistory = []
  set pBalance = 0
  return me
end

-- methods (stubs) :
on mViewTransactionHistory me
end
on mGetBalance me
end
on mUpdateBalance me
end

on mAddTransaction me, dDate, dAmount, dType
  -- yes, I'm implementing this one,
  -- it'll serve a point - try and guess
  -- what the point will be.
  set dNewTransaction = [dDate, dAmount, dType]
  add pTransactionHistory, dNewTransaction
  mUpdateBalance me
end

etc... Does the previous example seem like a fairly good representation of a mailbox and a bank account? Although it might not jump right out at you, there is a fundamental abstraction missing in this situation. When finding classes, try to conceptualize what the "noun" actually represents. Try to think in terms of data abstractions, and data structures, rather than "labels" which are specific to your spoken language. For instance, while a mailbox in the English language represents a physical box that holds pieces of mail, the abstract notion of a mailbox is that of a container holding things. The same notion applies to a bank account. The bank account is a container for things we call transactions and balance information. Notice the striking similarities between a mailbox and a bank account? From an English language representation, a bank account and a mailbox are quite different things. However, in an abstract representation, the fundamental underlying structure is the same.

Yes, both do have unique characteristics which must be distinctly addressed and so they do represent distinct classes. However, the point which must be made clear is that from the specification we found the nouns bank account and mailbox, but the data structure which drives them did not appear as a possible class. The notion of a list is the common abstract data type providing the foundation for these classes. Finding these types of classes requires a cyclical approach to design as described in a later section. We must also come up with a class representing what this "container" is going to hold. For example, in the previous example, the stub script for class bank account uses the following piece of code to create a new transaction and add it to the transaction history :


set dNewTransaction = [dDate, dAmount, dType]
add pTransactionHistory, dNewTransaction

Looks simple enough. The transaction is a list itself containing a value for date, amount and type of transaction. This three item list is then added to the list of transactions. Think about the types of operations that will most likely be performed on each item in the transaction list. Don't limit the scope of the operations you think about to the current system specification. Think long term. Think about what this "thing" should be able to do - in any instance. It's easy enough to parse out values from a two-dimensional list, but that means code needs to be written within bank account to do that. Does list parsing functionality seem like it should be the responsibility of bank account? Common sense says no.
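To make the cost concrete, here is a sketch of the kind of handler bank account would need just to read the amount of one stored transaction. The handler name mGetAmountOfTransaction is hypothetical; pTransactionHistory and the [date, amount, type] layout come from the stub above:

```lingo
-- inside parent script : classBankAccount (illustrative only)
on mGetAmountOfTransaction me, dIndex
  set dTransaction = getAt(pTransactionHistory, dIndex)
  -- position 2 holds the amount only by convention;
  -- every reader of the list must know that convention
  return getAt(dTransaction, 2)
end
```

Every such handler ties bank account to the internal layout of a transaction, which is knowledge it should not need to have.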

Should a transaction then be a list also since that's how it was represented here? Absolutely not. Instead, create a new class transaction which has distinct properties for date, amount and type as well as methods to work on those properties. Then bank account could inherit from list and contain any number of transactions. Class bank account then will only have to implement what is truly unique to a bank account. In this minimalist example, that would be the balance property. In other words, bank account and its related classes could have the following implementation :


-- parent script : classList
property pList, pCursor

on new me
  set pList = []
  set pCursor = 1
  return me
end

on mAdd me, dVal
  append pList, dVal
end

-- Other list functions would be implemented here
-- as well. Notice that this class has a cursor, a
-- reference to the "current" item. Therefore methods
-- such as mGetCurrent, mGoNext, mGoPrevious, mGoFirst
-- and mGoLast would also be implemented in this class.
-- Please read any book on data structures to
-- realize the full potential here, as well
-- as for many similar abstract data types.

-- parent script : bankaccount
property ancestor, pBalance

on new me
  set ancestor = new(script "classList")
  set pBalance = 0
  return me
end

on mUpdateBalance me
  -- stub : recalculate pBalance from the
  -- transactions held in the list
end

-- parent script : transaction
property pDate, pAmount, pType

on new me, dDate, dAmount, dType
  set pDate = dDate
  set pAmount = dAmount
  set pType = dType
  return me
end

on mGetDate me
  return pDate
end

on mGetAmount me
  return pAmount
end

on mGetType me
  return pType
end

A possible runtime implementation could be as follows :


global gBankAccount
set gBankAccount = new(script"bankaccount")
set dTransaction = new(script"transaction", ¬
  "03/28/98", 1000.00, "deposit")
mAdd(gBankAccount, dTransaction)

Or, in the case of a similar implementation for mailbox :


global gMailBox
set gMailBox = new(script "mailbox")
set dNewPieceOfMail = new(script "mailMessage", ¬
  "FromMe", "ToYou", "Date")
mAdd(gMailBox, dNewPieceOfMail)

Notice that all list functionality (mAdd as illustrated) becomes available to all descendants of list. The list class is a highly reusable software component. The list adds simplicity, as a call to mAdd is all that is needed to add anything to any class which is derived from list. It makes no difference if it's a bank account, a mailbox or a schedule book. Without the list abstraction we'd need methods such as mAddMessage, mAddTransaction, mGoNextMessage, etc. in each class. This would be redundant and less efficient. It would also decrease reusability and increase overall complexity.
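The same reuse can be sketched for the schedule mentioned earlier. The schedule and appointment scripts below, along with pOwner and the accessor names, are hypothetical; the sketch assumes the classList parent script shown above is present in the cast:

```lingo
-- parent script : appointment (hypothetical)
property pDate, pDescription

on new me, dDate, dDescription
  set pDate = dDate
  set pDescription = dDescription
  return me
end

on mGetDate me
  return pDate
end

on mGetDescription me
  return pDescription
end

-- parent script : schedule (hypothetical)
-- inherits all list behaviour from classList
property ancestor, pOwner

on new me, dOwner
  set ancestor = new(script "classList")
  set pOwner = dOwner
  return me
end

on mGetOwner me
  return pOwner
end

-- movie script
on startMovie
  set dSchedule = new(script "schedule", "Paul")
  set dAppointment = new(script "appointment", "05/07/98", "Design review")
  -- mAdd is not written in schedule; classList handles it
  mAdd(dSchedule, dAppointment)
end
```

Nothing list related had to be written again; schedule implements only what is unique to a schedule.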

At this point you might be wondering how the classes communicate with each other. Notice that bank account will have to know about the interface to transaction in order to get values from each transaction it holds. The interface represents the list of methods which are available for use by other objects. It's like the menu board above the cash registers in a fast food restaurant. "Here's what we can give you." The fact that a bank account is going to need to know the interface to transaction represents a line of communication, or a coupling between bank account and transaction. However, transaction doesn't have to know anything about bank account or even that bank account exists in order to work. This is perfectly acceptable, and commonplace. In fact, it's the goal. After all, a system with objects that don't talk to each other won't do anything!
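As a sketch of that one-way line of communication, bank account's mUpdateBalance could be written purely in terms of the transaction interface. The list accessors mGetCount and mGetItem are hypothetical additions to classList, and treating every transaction whose type is not "deposit" as a withdrawal is an assumption made to keep the example short:

```lingo
-- additions to parent script : classList
-- (hypothetical accessors)
on mGetCount me
  return count(pList)
end

on mGetItem me, dIndex
  return getAt(pList, dIndex)
end

-- in parent script : bankaccount
on mUpdateBalance me
  set dTotal = 0
  repeat with i = 1 to mGetCount(me)
    set dTransaction = mGetItem(me, i)
    -- ask the transaction; do not reach into it
    if mGetType(dTransaction) = "deposit" then
      set dTotal = dTotal + mGetAmount(dTransaction)
    else
      -- assumption : anything else is a withdrawal
      set dTotal = dTotal - mGetAmount(dTransaction)
    end if
  end repeat
  set pBalance = dTotal
end
```

Bank account depends on mGetType and mGetAmount, while transaction contains no reference to bank account at all.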

Ask, Don't Take

It is very important that as a designer, you take the time to carefully document the interface to each class you write. It is also important that you try to keep the size of the interface as small as possible. In other words, only put methods in a class when it makes sense that an "outsider" would need to use them. Don't blindly write a get and set method (accessor and mutator methods respectively) for every property of the object.
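A small interface might look like the following sketch of a simplified, hypothetical bankaccount script. It deliberately offers mGetBalance and mDeposit but no mSetBalance, so the only way to change the balance is to ask the object, and the object can refuse (the validation rule shown is an assumption for illustration):

```lingo
-- parent script : bankaccount (hypothetical simplified version)
property pBalance

on new me
  set pBalance = 0
  return me
end

on mGetBalance me
  return pBalance
end

-- there is no mSetBalance : outsiders ask for a deposit
on mDeposit me, dAmount
  if dAmount > 0 then
    set pBalance = pBalance + dAmount
  end if
end

-- movie script
on startMovie
  set dAccount = new(script "bankaccount")
  mDeposit(dAccount, 100)
  -- a negative "deposit" is ignored by the object itself
  mDeposit(dAccount, -1000000)
  put mGetBalance(dAccount)
end
```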

Lingo provides two functions called setAProp and getAProp that are generally used on property lists. Unfortunately, these functions will also work for setting and getting properties from within an object. Using these functions with objects is a very bad idea, as they directly violate an important facet of Object Oriented Design called Information Hiding. If you need to get the value of some property of an object, you must ask the object to give it to you. The same applies to setting a value in an object. Using setAProp and getAProp is like breaking into a class and tampering with what is not yours. In other words, ask, don't take. This is like a security system for an object. The object is responsible for itself and must be able to protect itself if it's going to be reliable. What if you could simply set the balance of your bank account to a million dollars without having to go through a teller to deposit any money? Sounds like a fine idea! What if somebody else could set the balance in your bank account to zero by taking all your money without having to go through a teller? Get the picture? The design and documentation of a good class interface is extremely important for reusability and extendibility. For example, consider the following :


property pMyProperty
on mGetMyProperty me
  return pMyProperty
end

In this implementation, the class has a value for pMyProperty. If other parts of your system accessed this value with getAProp, it would work but consider what would happen if the implementation changed at some point in the future. What if pMyProperty was no longer a "simple value" and instead had to be calculated or pulled from a database as in the following example :


property pMyProperty

on mGetMyProperty me
  -- this method remains the outsider's
  -- "teller window" to pMyProperty
  -- i.e. although the implementation
  -- has changed, users of the object
  -- don't need to know about the change,
  -- and shouldn't care anyway.
  return mCalculateNewValue(me)
end

on mCalculateNewValue me
  -- This method would NOT become
  -- a part of the Interface.
  -- It is used as an implementation
  -- change without affecting the
  -- public interface.
  -- (the call below is a placeholder for
  -- the real calculation or query)
  set pMyProperty = some_calculation_or_database_query()
  return pMyProperty
end

If such changes were made, all parts of your system that were simply "taking" pMyProperty with the getAProp function would die a sudden death. This is because they don't know about the changes that were made. But they shouldn't have to know about changes in other parts of the system. Using setAProp throughout a system would wreak just as much havoc when implementation changes are made. Instead, by having accessor and mutator methods in a class interface, such as mGetMyProperty, other parts of the system would not need to know about the change in implementation and could continue working as usual. This is fundamental to Object Oriented Design. For the same reason, it is important that once the interface to a class has been defined, it not be changed, ever. In languages such as Eiffel and C++ one can explicitly declare a method as public or private. Unfortunately Lingo does not provide this functionality, but it can be mimicked to a certain extent by defining a method within a class without the me parameter.

For example :


-- a class which holds two values,
--  and always knows their sum.
 
property pProp1, pProp2, pSum
on mSetProp1 me, dVal 
  set pProp1 = dVal
  set pSum = mAdd(pProp1, pProp2)  
end
on mSetProp2 me, dVal
  set pProp2 = dVal
  set pSum = mAdd(pProp1, pProp2)  
end
on mAdd dVal1, dVal2
  return dVal1 + dVal2
end

The previous example is trivial, but take notice of what is happening here. The method mSetProp1 is defined with me as the first argument. This is the normal way of defining methods in a class. Within mSetProp1 the class properties can be accessed directly. This is what the me parameter is there for. Notice that there is no me parameter being passed into the mAdd method. Two important things happen when a method is defined like this. Firstly, the method is not accessible by outside objects. In effect it is hidden, or private. For example, if myObj is an instance of the class defined in the previous example, a call :

mAdd(myObj, 1, 1)

would result in an error. However, because this is somewhat of a Lingo "workaround" to the notion of private methods, properties of the class are not directly accessible from within the method either. Not passing in a me parameter in effect isolates the method from its own class.

For example, the following will not work :


property pSum
.
.
on mAdd dVal1, dVal2
  set pSum =  dVal1 + dVal2
end

In the previous example, the property pSum will not be available inside mAdd because the mAdd method has no knowledge of "me" and therefore no knowledge of the property of me called pSum.

The goal in all this is to reduce each class to a very concise and distinct purpose which can be easily implemented and highly reused with as few dependencies on other classes as possible. Classes should be designed for reuse not only within the context of the current system (as a parent to many classes for example), but in future systems as well. Pay particular attention to frameworks of classes also. A framework represents a collection of closely related classes. Frameworks should be just as reusable in new systems as individual classes. After all, that's the point of all this, right?

Developing classes which work like this does require a lot of thought and an understanding of Object Oriented Design principles. Once grasped, the rules and strategies of Object Oriented Design provide a powerful framework from which robust and reusable systems can be designed and implemented. The time it takes, however, is clearly well spent. Learning the syntax for OOP is an afterthought, and a minimal one at best. Thinking in terms of long term extendibility and reusability is key in this type of design.

Designing Object Oriented systems is cyclical at first. The process starts with finding possible classes. Responsibilities and interactions between classes are then charted out and analyzed. Dependencies (established lines of communication) between classes are minimized. Commonalities are found and new classes are abstracted while "wrong" classes are abandoned. This cycle repeats until one feels the system design is cohesive, extendible, reusable and highly abstract. Remember, classes should represent an abstract idea, not a single function or specific value. Also remember that classes should not represent "controllers" or "managers" of other classes for reasons already made clear. The interface to each class is then documented. Then and only then does coding begin. Coding involves first stubbing out the classes. Then, one by one, methods are implemented and tested.

Finally, when each class has been implemented, fully tested and fully documented, implementation of runtime objects with values specific to the system begins. Keep in mind that code and classes which are highly specific to the system at hand will exist. This is OK. If you took the time to design the system properly, the amount of code which is specific to the current system will be minimal, while reusable classes and frameworks will comprise the bulk of the system. More reusable classes and frameworks of classes mean faster design and development on future projects. That, my friends, is what Object Oriented Programming and Design is all about.

Recommended for further insight into the Object Oriented Paradigm:

1. "Object Oriented Software Construction" 2nd Edition by Bertrand Meyer
An excellent reference, covers everything in a very accessible fashion.

2. "Object Oriented Design Heuristics" by Arthur J. Riel
This book provides many clear "rules of thumb" for designing OO systems.

3. "Lingo Sorcery" by Peter Small
Provides a unique, forward thinking, bottom up approach to Lingo OOPS.

Paul Hemmer received a BS in Information Technology from Rochester Institute of Technology. He is Senior Developer for a Rochester NY multimedia company. He primarily does Object Oriented Lingo development and lately has been doing quite a bit of ASP programming. Paul is very much an advocate of OOP and always tries to push Director beyond its limits. He is committed to increasing awareness of the power and importance of the object oriented mindset and he does his best to make sure everybody on DirectL knows how he feels. Paul's real love is music. He is a bass player who happens to enjoy slinging Lingo by day. ;)