Adherents of Domain Driven Design (DDD) often struggle with how to structure the “layers” of their application with respect to where object repositories should be placed. As usual in software development, the answer to this quandary is “It depends”. In this posting I’ll discuss several things that should be taken into consideration when making this decision.
Recall that an object repository is responsible for managing the life cycle of an object once it has been created (and added to the repository). This means that, by convention, the application will only obtain, store, and delete existing objects through their associated repositories. Objects are created using one factory model or another (usually factory methods or factory objects). The object repository encapsulates functions such as storing and retrieving objects from persistent storage and caching of retrieved objects.
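To make those responsibilities concrete, here is a minimal sketch of a repository that stores, retrieves, deletes, and caches objects. Every name in it (Customer, ICustomerStore, CountingCustomerStore, CustomerRepository) is hypothetical and chosen only for illustration; a real repository would sit on top of a database or an ORM rather than a dictionary.

```csharp
using System.Collections.Generic;

// Hypothetical domain object, used only for illustration.
public class Customer
{
    public Customer(int id, string name) { Id = id; Name = name; }
    public int Id { get; }
    public string Name { get; }
}

// Hypothetical stand-in for persistent storage (database, files, ...).
public interface ICustomerStore
{
    void Save(Customer customer);
    Customer Load(int id);   // returns null when nothing is stored under id
    void Delete(int id);
}

// Trivial store that counts loads so the caching behavior is observable.
public class CountingCustomerStore : ICustomerStore
{
    private readonly Dictionary<int, Customer> rows = new Dictionary<int, Customer>();
    public int LoadCalls { get; private set; }

    public void Save(Customer customer) { rows[customer.Id] = customer; }
    public Customer Load(int id)
    {
        LoadCalls++;
        return rows.TryGetValue(id, out Customer found) ? found : null;
    }
    public void Delete(int id) { rows.Remove(id); }
}

// The repository manages existing customers: it stores, retrieves, and
// deletes them, and caches what it has already retrieved.
public class CustomerRepository
{
    private readonly ICustomerStore store;
    private readonly Dictionary<int, Customer> cache = new Dictionary<int, Customer>();

    public CustomerRepository(ICustomerStore store) { this.store = store; }

    public void Add(Customer customer)
    {
        store.Save(customer);
        cache[customer.Id] = customer;
    }

    public Customer Get(int id)
    {
        if (cache.TryGetValue(id, out Customer cached))
            return cached;                  // served from the cache
        Customer loaded = store.Load(id);   // fall back to persistent storage
        if (loaded != null)
            cache[id] = loaded;
        return loaded;
    }

    public void Remove(int id)
    {
        store.Delete(id);
        cache.Remove(id);
    }
}
```

Note that creating a Customer is left to the caller (or to a factory); the repository only takes over once the object exists.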
But where should we place these repositories and how should we access them? Should they reside in the Domain with the rest of the business objects? Should they reside in the Data Access Layer (after all, it manages access to the “data”)? Or should they reside in some other layer (or layers)? How do objects that use them get access to them?
Fundamentals
Usage
The first set of questions that must be answered is how you expect the repositories to be used. Will they be used by multiple clients? Will the repository manage multiple types of “persistence”? For example, will the objects be stored in multiple formats (relational, XML)?
Design Principles
Once the usage questions have been answered, we must consider the application of a number of principles and practices that can guide this architectural decision. These principles and practices include the use of interfaces and factories, Dependency Injection, the Dependency Inversion Principle (DIP), YAGNI (You Aren’t Gonna Need It), and DRY (Don’t Repeat Yourself).
Interfaces
When you have multiple implementations of an object (in our case the repositories), interfaces provide a way to insulate client objects from those implementation changes. The interface provides a set of signatures (syntax and semantics) that implementers of the interface must adhere to. By only using those signatures through the interface, the client object remains (willfully) ignorant of the actual implementation of that interface and is therefore protected from changes to the implementation.
interface ISomeInterface
{
    void MethodA();
    void MethodB();
}

class SomeInterfaceImplementedOneWay : ISomeInterface
{
    public void MethodA()
    {
        // do something useful
    }

    public void MethodB()
    {
        // do something useful
    }
}

class SomeInterfaceImplementedAnotherWay : ISomeInterface
{
    public void MethodA()
    {
        // do something useful, but in a different way
    }

    public void MethodB()
    {
        // do something useful, but in a different way
    }
}
// client code can use ISomeInterface without understanding how it is implemented…
…
ISomeInterface someInterface = null;
…
… // acquire a reference to ISomeInterface from somewhere somehow
…
someInterface.MethodB();
…
someInterface.MethodA();
…
But how does the client get a reference to the interface? Two common methods are by passing an object that implements the interface as a parameter to the client (Dependency Injection) or by having the client obtain the reference from a factory.
Dependency Injection
An object can be protected from changes in its dependencies by having the objects it depends on passed into it by its caller through a reference to an interface. This can be done as parameters on the object’s constructor or by passing parameters into the particular methods that have the dependencies. If done in the constructor the object generally retains a reference to the “depended upon” interface. This is particularly useful if multiple methods use the depended upon interface.
class Client
{
    private ISomeInterface itsSomeInterfaceImp;

    public Client(ISomeInterface someInterfaceImplementation)
    {
        itsSomeInterfaceImp = someInterfaceImplementation;
    }

    public void Method1()
    {
        itsSomeInterfaceImp.MethodA();
    }

    public void Method2()
    {
        itsSomeInterfaceImp.MethodB();
    }
}
If done as a parameter to a method, the object generally does not retain the reference but merely uses the reference within the scope of the method.
class Client
{
    public void Method3(ISomeInterface someInterfaceImp)
    {
        someInterfaceImp.MethodA();
    }
}
While Dependency Injection provides a way to insulate an object from changes in its dependencies, it raises the question, “How does the object that injects the dependency get the object that implements the dependency?” And it is in the answer to this question that the cost of Dependency Injection can be seen. The answer is often “It is injected into the object that, in turn, injects it into the object that depends upon it.” The end result is a call chain where the depended-upon object is passed as a parameter from method to method. Whether this is onerous or not depends upon factors such as the number of other parameters being passed, the use of tools to manage refactorings, and the designer’s sense of aesthetics. An alternative is to use factories to obtain the dependencies.
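The call chain described above can be sketched roughly as follows. The Workflow, Application, and RecordingImplementation classes are hypothetical names introduced only for this illustration; the point is that Workflow never uses ISomeInterface itself, yet still has to accept it in order to hand it on.

```csharp
using System.Collections.Generic;

public interface ISomeInterface { void MethodA(); }

// Hypothetical implementation that records calls so the effect is visible.
public class RecordingImplementation : ISomeInterface
{
    public List<string> Calls { get; } = new List<string>();
    public void MethodA() { Calls.Add("MethodA"); }
}

// Innermost object: the only one that actually uses the dependency.
public class Client
{
    public void Method3(ISomeInterface someInterfaceImp)
    {
        someInterfaceImp.MethodA();
    }
}

// Middle object: never calls ISomeInterface itself, yet must accept it
// solely to pass it along to Client.
public class Workflow
{
    private readonly Client client = new Client();

    public void Run(ISomeInterface someInterfaceImp)
    {
        client.Method3(someInterfaceImp);
    }
}

// Outermost object: receives the chosen implementation and starts the chain.
public class Application
{
    public void Start(ISomeInterface someInterfaceImp)
    {
        new Workflow().Run(someInterfaceImp);
    }
}
```

With only one intermediate layer this is hardly a burden, but each additional layer between the composition point and the real client adds another parameter that exists only to be forwarded.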
Factories
The use of factories to obtain objects is prevalent within the development community. The client object uses the factory to obtain a reference to an object that implements the interface that it depends upon. The factory object determines which object should be returned to the client. This determination can be made in a number of different ways including reading a configuration file to obtain the name of the object that should be created.
class SomeInterfaceFactory
{
    public ISomeInterface Get(…)
    {
        string typeName = GetTypeNameFromConfigFile();
        if (typeName == "SomeInterfaceImplementedOneWay")
            return new SomeInterfaceImplementedOneWay();
        else if (typeName == "SomeInterfaceImplementedAnotherWay")
            return new SomeInterfaceImplementedAnotherWay();
        else
            throw new Exception("Don't know one way from another");
    }

    private string GetTypeNameFromConfigFile()
    {
        string typeName = null;
        // code that gets the name from the config file
        return typeName;
    }
}
Alternatively, the object implementation can be passed into the factory at application initialization:
class SomeInterfaceFactory
{
    static ISomeInterface theSomeInterfaceImp = null;

    public static ISomeInterface SomeInterface
    {
        get
        {
            if (theSomeInterfaceImp == null)
                throw new Exception("factory not initialized");
            else
                return theSomeInterfaceImp;
        }
        set
        {
            if (theSomeInterfaceImp != null)
                throw new Exception("Already initialized");
            else
                theSomeInterfaceImp = value;
        }
    }
}
The attentive reader will note that the last method is just another example of Dependency Injection. This time the dependency is injected into the factory.
One of the costs of factories can be dealing with the issues surrounding the implementation of the factory as a Singleton, which has ramifications for testability and system coupling. An alternative approach that should be considered is to implement the factories using the Monostate pattern (Martin, 2003), with static setter methods to inject the dependencies.
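A minimal sketch of what a Monostate factory might look like is shown here. The method names SetImplementation and Get are my own assumptions for the illustration, not taken from (Martin, 2003). Any number of factory instances can be created, but they all read the same static state.

```csharp
using System;

public interface ISomeInterface { void MethodA(); }

public class SomeInterfaceImplementedOneWay : ISomeInterface
{
    public void MethodA() { /* do something useful */ }
}

// Monostate: callers create and use ordinary instances, but every
// instance shares the same static state.
public class SomeInterfaceFactory
{
    private static ISomeInterface theSomeInterfaceImp;

    // Static setter used at start-up (or by a test) to inject the dependency.
    public static void SetImplementation(ISomeInterface implementation)
    {
        theSomeInterfaceImp = implementation;
    }

    // Instance method that reads the shared state.
    public ISomeInterface Get()
    {
        if (theSomeInterfaceImp == null)
            throw new InvalidOperationException("factory not initialized");
        return theSomeInterfaceImp;
    }
}
```

Unlike the static property shown earlier, this sketch lets the implementation be replaced after it has been set. That is a deliberate choice to make it easy for each test to install its own implementation; a production factory may prefer to reject a second initialization.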
Dependency Inversion Principle (DIP) and Packaging
Once a decision has been made to use interfaces, the next question that must be addressed is where the interface definition should live, i.e., which separately deployable unit (e.g., assembly or package) should contain the interface definition. The Dependency Inversion Principle speaks to this question. Robert Martin defined the principle and has written extensively about it in (Martin, 2003). He states it, in part, as follows:
Abstractions should not depend upon details. Details should depend upon abstractions.
An example provides a practical way to examine this principle. Imagine an application that records and displays changes in stock prices. One possible design of this system could include the following classes:

In this design (an implementation of the Observer pattern), two concrete observers (LogWriter and UIAlertManager) register with a StockPrice class and obtain notifications if the StockPrice changes. The LogWriter keeps an audit trail of price changes and the UIAlertManager updates a display that tracks stock prices on the user’s display. The abstraction in this design is the notion that a StockPrice can be observed and that the IStockPriceObserver can be notified. The detail in this design is that one type of observer writes these observations to a log and another updates a user interface element. One could easily imagine other types of StockPriceObservers (perhaps a VolatileStockWatcher that notifies the SEC if a price fluctuates by more than 10% over a given trading session). Note that StockPrice does not depend in any way upon LogWriter or UIAlertManager. The only dependency is upon the interface IStockPriceObserver.
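In code, the design described above might look something like the following sketch. The class and interface names come from the description; the member names (Register, PriceChanged, Symbol, Entries, LastDisplayedPrice) are assumptions of mine, and the observers simply record what they receive instead of writing a real log or updating a real display.

```csharp
using System.Collections.Generic;

// The abstraction: something that can be told about a price change.
public interface IStockPriceObserver
{
    void PriceChanged(string symbol, decimal newPrice);
}

// Depends only on IStockPriceObserver, never on a concrete observer.
public class StockPrice
{
    private readonly List<IStockPriceObserver> observers = new List<IStockPriceObserver>();
    private decimal price;

    public StockPrice(string symbol, decimal initialPrice)
    {
        Symbol = symbol;
        price = initialPrice;
    }

    public string Symbol { get; }

    public void Register(IStockPriceObserver observer) { observers.Add(observer); }

    public decimal Price
    {
        get { return price; }
        set
        {
            if (value == price) return;   // notify only on an actual change
            price = value;
            foreach (IStockPriceObserver observer in observers)
                observer.PriceChanged(Symbol, price);
        }
    }
}

// The details: what each observer does with a notification.
public class LogWriter : IStockPriceObserver
{
    public List<string> Entries { get; } = new List<string>();

    public void PriceChanged(string symbol, decimal newPrice)
    {
        Entries.Add(symbol + " -> " + newPrice);   // stand-in for an audit log
    }
}

public class UIAlertManager : IStockPriceObserver
{
    public decimal LastDisplayedPrice { get; private set; }

    public void PriceChanged(string symbol, decimal newPrice)
    {
        LastDisplayedPrice = newPrice;             // stand-in for a display update
    }
}
```

A new observer such as the VolatileStockWatcher could be added by implementing IStockPriceObserver, without touching StockPrice at all.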
How should these components be packaged? In particular, where does the IStockPriceObserver interface get packaged (i.e., in what separately releasable unit should the interface be placed)?
In general, the proper place to put the interface definition is as close as possible to the clients (versus the implementers) of the interface. Unless the interface is shared by multiple clients, this will be in the same deployable unit as the client.
The rationale for these decisions is rooted in dependency management. If the interface is placed in the package that contains the implementation(s) of the interface, then any change to the implementation will require a recompilation of the code that uses the interface and may require redeployment of the package containing the code that uses the interface, because this packaging scheme introduces a cyclic dependency between the two packages. A better solution is to package IStockPriceObserver with StockPrice and thereby eliminate the cyclic dependency.
Note that the Observers themselves might be broken out into separate packages based upon the needs of the application or organization.
YAGNI (You Aren’t Gonna Need It) and DRY (Don’t Repeat Yourself)
Those who practice XP or many of the other Agile methods subscribe to the idea that you shouldn’t add an element to the system until it is actually needed to implement the current requirement. Rather than attempting to predict a solution structure up front, proponents of these methods expect the structure of the system to emerge as the code is written. The dynamic that prevents this from creating a poorly structured system is the use of the DRY (Don’t Repeat Yourself) principle (Hunt & Thomas, 2000) in conjunction with the use of design principles and patterns such as those described above. If the process of “write the test, make it run, and refactor to remove any duplication” is followed rigorously, an agile developer will begin to refactor the code toward the design patterns that solve the problems that present themselves as duplicated code. See (Kerievsky, 2005) for an excellent reference on how this is done.
Application of these principles with respect to the preceding discussion means the agile developer will not introduce an interface until multiple implementations of a class are needed and will not add separate deployable units to contain those implementations until the need arises.
The concern that experienced developers will have at this point is how much it will cost to make these types of changes once a substantial code base exists. The answer is that the level of agility the developer can exhibit will be directly impacted by the tools that they have available to manage these sometimes large refactorings. To the extent these tools are not available, the developer will be driven to attempt predictive design.
So…, Where Do Those Repositories Live?
Again, it depends! But now we have some tools to come up with answers based upon the usage of the repositories:
1. Are you writing automated tests around the repositories? If so, then you’ll probably have multiple implementations of your repositories. You’ll be driven to this by the overhead of executing the tests. If your repositories do significant file I/O or use a relational database, you will probably end up using some form of in-memory representation of those data stores so that your tests will execute quickly.
If you have multiple implementations of the repositories, you’ll then need an interface definition for the repository and a separate deployable unit for the representations (since you won’t want to ship the in-memory repositories as part of your application).
You’ll also have to address the issue of how the implementations will be obtained by the client objects, so you’ll choose between (or mix and match) Dependency Injection and Factories.
You will probably package the in-memory data stores in a separate assembly from the “real” repository implementations.
2. Will the repositories be used by multiple clients? If so, then you’ll need to look at a separate deployable unit that contains only the Repository interfaces.
3. Do your domain objects use the repositories directly? This is a running debate in the DDD world that is influenced by how you decide aggregate objects are rehydrated and stored as well as what capabilities you assume exist in your infrastructure (e.g., are you using an ORM (Object Relational Mapper) like Hibernate, or are you rolling your own implementation). For example, one strategy for lazy loading is to have the aggregate call a repository to load the objects when they’re needed. A Customer object might call the OrdersRepository to obtain the customer’s orders for the last year.
If your domain objects do use repositories directly, then the interface definitions for the repositories will be depended upon by your domain objects and will reside within your domain package (or in a separate package if the answer to number 2 was yes).
If not, then repository interface definitions will reside in whatever component passes the domain objects to the repositories.
4. Are you letting the design emerge as you go (YAGNI)? If so, you’ll evolve your system to these types of structures, only adding them when you need them. If not, then you’ll probably start with a typical Domain/Data-Access Layer structure.
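As a rough illustration of where these answers can lead, the sketch below places a repository interface in the domain next to the Customer that uses it, and puts an in-memory implementation intended for tests in a separate namespace. The namespaces stand in for separately deployable assemblies, and all of the names (Domain, Testing.InMemory, Order, IOrderRepository, InMemoryOrderRepository) are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace Domain
{
    public class Order
    {
        public Order(int customerId, DateTime placedOn)
        {
            CustomerId = customerId;
            PlacedOn = placedOn;
        }

        public int CustomerId { get; }
        public DateTime PlacedOn { get; }
    }

    // The interface lives with its client (Customer), not with its implementers.
    public interface IOrderRepository
    {
        void Add(Order order);
        IList<Order> GetOrders(int customerId, DateTime fromDate, DateTime toDate);
    }

    public class Customer
    {
        private readonly IOrderRepository orders;

        public Customer(int id, IOrderRepository orders)
        {
            Id = id;
            this.orders = orders;
        }

        public int Id { get; }

        // Lazy loading: orders are fetched only when they are asked for.
        public IList<Order> GetOrders(DateTime fromDate, DateTime toDate)
        {
            return orders.GetOrders(Id, fromDate, toDate);
        }
    }
}

namespace Testing.InMemory
{
    using Domain;

    // Test-only implementation; a database-backed implementation would
    // live in its own deployable unit and depend on Domain in the same way.
    public class InMemoryOrderRepository : IOrderRepository
    {
        private readonly List<Order> stored = new List<Order>();

        public void Add(Order order) { stored.Add(order); }

        public IList<Order> GetOrders(int customerId, DateTime fromDate, DateTime toDate)
        {
            return stored
                .Where(o => o.CustomerId == customerId
                            && o.PlacedOn >= fromDate
                            && o.PlacedOn <= toDate)
                .ToList();
        }
    }
}
```

The dependency arrows all point toward Domain: the in-memory implementation references the domain, while the domain knows nothing about how or where orders are actually stored.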
Bibliography
Hunt, A., & Thomas, D. (2000). The Pragmatic Programmer: From Journeyman to Master. Reading, MA: Addison Wesley Longman, Inc.
Kerievsky, J. (2005). Refactoring To Patterns. Boston, MA: Addison-Wesley.
Martin, R. C. (2003). Agile Software Development. Upper Saddle River, NJ: Prentice Hall (Pearson Education, Inc.).
Comments
First of all, you wrote a very nice article. I think you are touching very fundamental issues. I am studying DDD now, and having trouble with a few scenarios.
Where do Repositories live? I feel that I am very limited if the repositories live within the Application layer only. I think lazy loading becomes a must at some point, because you don’t want to bring too much data from the database.
I also feel that if you were to implement the Specification pattern (Evans), then again you need the Repositories. Having said that, I think you should only use interfaces to them, and never use their true implementation.
When coding I sometimes apply “the million rule”: what if there are a million orders for a customer? Or what if there are a million history records for a transaction? We should never bring that amount of data down. We should probably use the Query pattern, which will work with Repositories.
Now I wonder, if I were to create Customer and Order classes… does this make sense?
List orders = customer.GetOrders(); // use Repository to implement this
Or
List orders = customer.GetOrders(fromDate); // use Repository with Query pattern
What about adding an order?
customer.AddNewOrder(order);
Is code like that DDD? If yes, the Repositories do live within the domain…
What is your take on this?
(Let’s assume that you can have a million orders per customer)
Thanks for the compliment on the article.
You raise an interesting (and common) problem. If I were to implement lazy loading here to deal with the customer who had placed a million orders, I would allow the Customer class to interact with an interface to an order repository (IOrderRepository). It would obtain an implementation of that interface by calling a repository factory object to get the concrete implementation and then use that to retrieve a smaller subset of the data. Using your example of a date range, I would implement the customer get order method like so:
public class Customer
{
    public IList GetOrders(DateTime from, DateTime to)
    {
        return RepositoryFactory.OrderRepository.GetOrders(this, from, to);
    }
}
Now since the Customer class is really referring to an IOrderRepository, I'd say that the Repository interface lives in the domain (with its client), and the concrete implementations live in their own namespaces (and probably assemblies).
Note that although I'm using a RepositoryFactory in this example, I might provide the indirection by just passing the IOrderRepository to the Customer when I created it. (This implies that whatever object created the Customer object would get the concrete implementation from the RepositoryFactory because we still want that run-time configurability.) I usually determine whether or not to pass the repository instances to the object on its constructor or have it use the RepositoryFactory directly based upon whether the object uses multiple repositories. If it uses one or two different repositories, I pass the instances, otherwise I just have the object use the RepositoryFactory to get the instances.
Does that make sense to you?
Thank you for taking the time to respond. Everything you said makes perfect sense, and it is nice to have this type of validation when coding DDD.
I believe passing the Repository to the constructor makes more sense because we should argue that the Customer does not have the “knowledge” to know which repository to create, and therefore it is logical for CustomerFactory to do this job (and it works better with unit tests). So I agree with all your points.
At this point, I am trying to use NHibernate to implement a prototype of this idea. When I am done, I will post it on my blog (http://mikeperetz.blogspot.com/). I would love your feedback on it.