Notes on Domain-Driven Design

Categories: Architecture

Overview

While I’ve been developing software for a long time, and have been generally aware of the concepts of Domain-Driven Design, I only recently read a couple of the classic/well-known books on this topic: Domain-Driven Design by Eric Evans and Implementing Domain-Driven Design by Vaughn Vernon. This article is a series of things I found noteworthy from these texts.

I wrote this article primarily to clear up some concepts for myself, and as notes to refer to in future years. It may be useful for you, but I make no promises..

There are three possible audiences for a book on DDD:

  • a software-developer/modeller (and I include architects here too; Evans makes a very good argument that “ivory tower architecture” is dangerous and that anyone with the role of architect should also be actively involved in modelling and development of the project)
  • a business-expert/modeller
  • project managers

This review is primarily from the perspective of the first audience. For the other cases, none of the books reviewed here are really appropriate; specific chapters (particularly the first few from Evans) might be partially helpful but a dedicated book would be best. Reviews of Vernon’s DDD Distilled suggest that perhaps that would be appropriate for the business/management view of DDD; I can’t confirm as I haven’t read it. A few alternate books (which I also haven’t read yet) are suggested in the Further Reading section at the end of this article.

Unless text below is a direct quote from a book (indented text), or placed in quotes prefaced by “Evans says..” or “Vernon says..” then all statements below are my own opinion/view/interpretation and may very well be wrong or incorrectly interpreting the author’s intentions. Read the books and draw your own conclusions!

About The Books

Evans’ book well deserves its reputation; it elegantly puts the case for DDD and defines many important concepts. Despite having been published in 2003, large parts of it are still very relevant. Evans argues that each problem domain needs its own language (vocabulary). Interestingly, DDD attacks the problem of software development by developing/defining a language related to exactly this problem-domain; terms include Entity, Service, Aggregate, Layered Architecture, Domain, Bounded Context, Context Map, and many more. This is a nice example of self-reference! Sadly, Evans fails to sufficiently define some of these terms. This is surprising given the book’s emphasis on the use of natural language. In particular, the later parts of the book which discuss dealing with larger projects and “strategic patterns” do not clearly define or consistently use words such as domain, subdomain, and module. Terminology in “modern DDD” also appears to interpret some words differently. This is discussed later in this article. This book provides no code-examples at all - which is good in that it applies to projects in any (object-oriented) language, and focuses on the ideas. However this often leads to very abstract descriptions whose possible concrete implementations are not clear.

Vernon’s book is newer, having been published in 2013. It has some useful contributions in it, in particular clarifying some of the larger-scale concepts that are ambiguous in Evans. It also adds information on Domain Events - a concept that was developed after Evans was published. Sadly this book is extremely verbose, with excursions into many topics that aren’t really DDD-related such as generic unit-testing techniques, options for allocating unique IDs, or how to implement the “transactional inbox” pattern - things that are well documented elsewhere. Through its extensive code-examples, it does make clear how various concepts can be implemented - but this also means the content has dated far faster than the Evans book. I also found the attempts at humour and the “chatty” style more annoying than entertaining, particularly given the amount of space they take up. This book is around the same size as Evans, so clearly with its examples and padding is not nearly as information-dense.

There are also parts of Vernon which I simply disagree with - eg pages 100-108, which talk about open host services, RPC, and REST. This is all a very strange diversion into architecture rather than design, and at the least very debatable content.

In short, I’m still looking for a truly good book on DDD. Nevertheless I learned a lot from both Evans and Vernon.

If you only have time to read one book, I would recommend Vernon. If you have time for both, I would recommend reading the first parts of Evans in detail, skimming the remainder (strategic patterns), then reading Vernon. Note that the books are of similar size (both around 500 pages) but there is far more padding in Vernon; the useful information density of Evans is higher.

I can also highly recommend this video presentation which shows the importance of the concept of bounded contexts (though I’m not convinced by the final conclusion).

Book Structure

Evans parts 1 and 2 describe what effect DDD can have on code of a smaller project, ie what the benefits of a model are. Part 3 talks about how to iteratively improve an initial model. Part 4 talks about how to apply DDD to larger projects.

I have heard that Evans himself has expressed some regret about the structure of this book, and suggested that the concept of “bounded contexts” belongs first and foremost, with the entire set of tactical patterns moved to an appendix. From the view of a pure architect, the latter parts focused on bounded contexts and structuring the model of larger projects might indeed be of more interest. The strategic patterns certainly have more impact; low-level code decisions can be refactored far more easily than large-scale structure. However as a “coding architect” I found the layout very appropriate; the first two parts shake a coder out of their traditional thought-processes with concrete advice. Later parts are more abstract; the book convinces first with practicality and encourages the (technical) reader to read further - a good idea IMO. There are so many development process books around, and many are vague and abstract; this motivated me at least (as a pragmatist) to commit attention and time to later parts.

Vernon uses the alternative approach of starting with the high-level items and working down. The result is that the text is filled with “forward references” to concepts that are defined only later in the book. These are all nicely marked so you know when that is happening, but it is still awkward - the reader really needs to read the book twice.

The Benefits of DDD

The core goal of DDD is to improve communication - between domain expert and developer, between developer and developer, between developer and tester, and between anyone else involved. The point of any project is to produce software that solves a problem - but without a shared understanding of what is being built, that won’t happen effectively. The developers need to know what the inputs are, what outputs are needed, and how to derive those outputs from the inputs. However the domain experts cannot provide that knowledge as algorithms - otherwise they would write the code themselves. And the domain experts cannot express the problem in their own language, as the developers won’t understand. An intermediate form is needed.

That intermediate form is exactly what domain driven design is about.

Or in other words, in any project with non-trivial business logic, the most important part is the communication between domain experts and software developers. Almost as important is the communication between developers, between developers and testers, and any other person involved. A decent model of the problem domain can also be useful for users, ie that model can also be represented in the user interface(s).

DDD is also about choosing good models - finding out which concepts are the most important for each group of stake-holders and then putting those concepts at the center of their interactions with the software. A good model can even improve understanding of the problem domain at the business level, potentially revealing better ways to do business, and even additional features that could be offered. Evans refers to this as “refactoring for deeper understanding”. Good communication is the most important tool for discovering good models.

None of this communication or model creation happens just once; it is an iterative process from the start of the project until release of the first version - and then continuing into the future as that software gets extended and improved. DDD is in agreement with Agile principles, including the expectation that the initial software requirements are never what is actually needed; what is needed gets discovered as users get their hands on software. It is also pointed out that developers can discover important concepts/insights/flaws related to the model, leading potentially to new features or a better user experience. Therefore feedback loops and a willingness to change in the presence of these discoveries are important.

As the communication doesn’t happen just once, and isn’t uni-directional, a requirements or design specification which doesn’t map relatively directly to an implementation is difficult to handle. A model is a “collection of abstract ideas organised into some kind of structure” - ie a possible representation of the project requirements or design. Any problem can be modelled in multiple ways, and at least some of those ways can map relatively directly to software; these (and only these) support long-term communication between the parties involved. The foundation of this model is an ubiquitous language - a vocabulary of terms which allows people to talk about the problem concisely and without ambiguity. The language often adopts terminology used by domain experts, but additional concepts can be defined for the purpose of defining the problem to be solved. However some words may have different meanings when talking about different parts of the problem; each bounded context is a “namespace” in which terms can be associated with different meanings.

From Evans page 49:

Model-driven design discards the dichotomy of analysis model and design to search out a single model that serves both purposes.

The problem domain almost always consists of things or people which have roles/responsibilities ie perform operations. Occasionally experts will express themselves in the passive voice, eg “this output is produced from this input by application of this operation”; this could be added to the “requirements” as is, but can also relatively easily be mapped to the active voice - we humans do this very naturally.

Evans points out that there are many different forms that a model can take; only a subset of these also make sense as code. If a project needs to solve a specific problem only once, and then it is done, then pretty much any form of model can be used to represent the problem, and developers can map this to just about any kind of code-style. However projects are seldom like this; instead:

  • the problem evolves as development continues ie the requirements get modified due to feedback from initial software versions
  • the initial problem is usually only “part 1” of a larger task
  • developers often need to understand the purpose of a piece of code (often years later) - ie “why is the code doing this?”
  • developers may run into problems implementing the domain-model as-is and need to propose changes ie “if we did this instead, would that make sense to you and still address the problem?”
  • developers also need to communicate with each other (agree on how to build the solution) and the model is a way to assist that

In all of these cases, if the code is structured entirely differently from the initial problem model then such communication is hard; the developers and the domain experts have no shared language or model. This can lead to slow development, incorrect results, and simply inflexible software that cannot easily be extended in interesting ways in later phases. An implementable model is important, otherwise developers will implement something that does not correspond to the model.

DDD provides a set of patterns for creating and managing models, and for creating code which works well with such models. The patterns are generally divided into two categories:

  • strategic (large-scale) patterns - suggestions on how to manage abstract models, large models, and the interactions between multiple models
  • tactical (small-scale) patterns - low-level suggestions on how to model concepts that map very directly to code

These patterns in turn apply to models useful at different levels; models representing strategic aspects of the overall project (I call these strategic domain models), and models addressing tactical aspects (I call these tactical domain models). Strategic domain models correspond in some ways to architecture, while tactical domain models map directly to sourcecode.

A model can exist only in the heads of the relevant team and the corresponding source-code. It can also be a bunch of temporary notes and sketches. However in general, it is written down in a combination of diagrams and text.

In DDD, the chosen models must make sense to domain experts (with someone technical as guide) while also being representable in software (with a reasonably straight-forward mapping). A model is not a complete representation of the real world, but instead a minimal set of concepts needed to discuss and implement specific use-cases. However a well-designed model does reflect “reality” in some way; this makes it more likely to be applicable to not-yet-specified requirements.

Modelling is the process of structuring information about the problem domain. Design is the process of structuring code. Domain-driven design happens when a domain model works simultaneously as a good representation of the problem domain and a good representation of the sourcecode.

A system that has the needed functionality, but no meaningful abstractions, is the exact opposite of DDD, and is likely to be hard to maintain and extend.

Each model is a work-in-progress. It is unlikely that the first version is correct; instead feedback is needed. In addition, it will evolve over time as new requirements are added. The models are therefore iteratively improved - driven often by developers who also understand the domain noticing opportunities for refactoring in the code. The fact that the model corresponds both to business concepts and code concepts helps this bi-directional iterative process.

The concepts of bounded contexts, ubiquitous languages, tactical domain models, and context maps are pretty much the core of DDD; projects in which these are not central are probably not “doing DDD”. In addition, it is understood that all of these things are aids to communication and will evolve over the lifetime of a project - discovery of relevant requirements and appropriate models is an iterative process including lots of feedback based on experience. Just about everything else is optional; additional patterns are defined which can be applied to improve communication and make the requirements understandable - but should only be applied if they do that for the particular project.

Some Core Language Terms

Before discussing DDD in detail, it is important to understand the meanings of the dozen or so terms that are very important to DDD. I’ll do that here for two reasons - first, to make this article self-standing, but more importantly because sadly both of the DDD books referenced above (Evans and Vernon) fail to define some of these core terms clearly - or define them differently. Here I do my best to combine available information on these terms into consistent definitions, show the relations between terms, and point out cases where the term has different meanings in different contexts. Of course this is just my current understanding of these concepts.

The problems with definitions occur primarily in the “large scale” (strategic) concepts. A bounded context is reasonably clear, and the “small scale” (tactical) patterns applied within a context are reasonably obvious. The larger-scale patterns applied across contexts are unfortunately rather fuzzily defined - both in these books and in other DDD literature.

Both Evans and Vernon have implemented multiple projects successfully using the approach documented in their books, so it must work out somehow. However some more official definitions of various terms would really be helpful. Maybe some day…

Unfortunately it’s difficult to find an appropriate order for the definitions below; as a result there are quite a few “forward references” to things not yet defined. There is also some repeated information, eg domain models are discussed as part of defining domains and vice versa; this seems unavoidable.

Context and Model

Evans (page 336) states that a model is a collection of abstract ideas organised in some kind of structure, and is valid only in a specific context. A context can be a meeting, a throwaway prototype, a piece of example code. Many contexts, and thus models, are temporary ie useful only for a very short time.

Or in other words, the meaning of anything can be context dependent, so the context should be specified at the start of any discussion, document, etc. And in particular, any model should define the context in which it applies.

The context that we most often talk about is a subset of business functionality with a self-consistent view of business concepts which is implemented via a piece of software. In this case we talk about a bounded context with its ubiquitous language and associated domain model. The term domain model is defined nowhere as far as I can tell, but seems to mean “a model consisting of business concepts from the subject domain”. However the terms context and model are more general-purpose; a project may have many contexts and models of which only some are bounded contexts and domain models.

The term “the model” is sometimes used to mean (roughly) “all the concepts associated with the whole project”, ie a collection of all the (bounded context) domain-models, the context-maps, the domain/subdomain definitions, the layer definitions, and everything else.

Models (of any kind) can get large and hard to understand. There is a set of suggested techniques for dealing with this, including modules, layering, generic domains, abstract core, and more (see later notes on large-scale structure).

Domain and Subdomain

The words domain and subdomain have multiple meanings.

Evans (page 2) states:

Every software program relates to some activity or interest of its user. That subject area to which the user applies the program is the domain of the software.

Let’s call that the domain of a system under design.

However an organisation typically has multiple domains (things the company does to earn money and things it needs to do to stay in business). Twitter has one primary domain - managing user messages. However it also manages advertising, and has internal systems for things such as payroll and property management. Let’s call these organisational domains. Given that each organisational domain is typically supported by software, the domain of a system under design is likely to be a part of an organisational domain.

The DDD term domain model refers to a model that represents business concepts within some part of the domain of the system under design. However that can be on any scale; it might be a “tactical domain model” which is near to the code, or might be a “strategic domain model” in which the terms are high-level (ie deliberately vague and generic), giving an overview only. The term domain model by itself doesn’t imply that the model includes things such as entities or that it is appropriate to give to junior developers for implementation. Note that these terms “tactical domain model” and “strategic domain model” are my own invention; they aren’t used in either book. Instead both authors often use the term “domain model” or “the model” in ways that seem inconsistent with a domain model containing tactical concepts only; the only way I can see to make sense of this is to introduce the concept of a “high-level strategic domain model” with a high-level ubiquitous language; see later for more info on this.

In some cases an organisational domain has obvious parts to it (from the business view). These are often called subdomains. However sometimes a subdomain can actually be found in multiple domains, ie is a shared concept. When a subdomain is “generic” (multipurpose) then Evans tends to just call it a domain - and does this even for quite small blocks of functionality (for example a “date handling domain” or a “currency handling domain”). In Vernon, it seems that a subdomain is simply “a supporting domain” in the context of a project, ie what may be a domain (ie domain of a system under design) for one team (because they develop it) is a subdomain for another team because they use it.

The term domain layer refers to code that implements specific entities and related types defined by a tactical domain model. Within a deployable artifact that contains multiple bounded contexts, the domain layer is the union of all such code for all contained bounded contexts.

Then there is the term core domain, which represents “the most important parts of the domain this software is primarily addressing” (ie the domain of the system under design). This is described as the part of the system that really generates “business value”. The “core domain” is also intended to be small enough to summarize the overall project.

The interaction between organisational domains and bounded contexts isn’t entirely clear in Evans. Vernon explains it by saying that “domains are part of the problem space while bounded contexts are part of the solution space”, although various other DDD experts aren’t convinced by this definition and I tend to agree with them; software is a solution to a problem, and so I don’t see why these “spaces” would be different.

Vernon (page 57) states:

It is a desirable goal to align Subdomains one-to-one with Bounded Contexts. .. In practice this is not always possible.

One possible interpretation is that the identified bounded contexts reveal the “real” subdomains and domains of the business, but that due to business inertia the “legacy” view of domains and subdomains cannot be modified - ie any failure for bounded contexts to nest cleanly within the identified subdomains/domains is solely due to “legacy thinking” within the business. Or in other words, perhaps organisational domains and subdomains represent blocks of functionality as business experts see them, while bounded contexts represent blocks of functionality as the code implements them. Given that bounded contexts are based on discussions between the developers and business experts, there is likely to be a strong correlation but they may not be identical; as a result occasionally a bounded context might end up implementing parts of multiple identified subdomains.

Another way of seeing domains and subdomains might be that these are stable regardless of the IT systems that implement them, ie if a company has multiple overlapping systems that implement parts or all of domains, the domain/subdomain definitions remain stable, regardless of how the software components draw their boundaries.

In the end, the concepts of domain and subdomain may not be all that important. What really matters is delivered code, and that code is based on bounded contexts and the interactions between bounded contexts. Domains and subdomains are useful at the start of a project for identifying bounded contexts; they do often nest into subdomains and business experts are typically good at describing the domains and subdomains of an existing business - but after the bounded contexts have been identified, the domain/subdomain definitions may not have a lot of other value. One effect to consider though is that domains and subdomains often also align with organisation management structures, ie lines of reporting, budgets, schedules, and employee-teams. Refactoring a codebase or a design is often easier than changing management structures, so pragmatically domain/subdomain boundaries might indeed influence how bounded contexts are defined.

Evans states that well-written code reveals the model - ie there is a clear mapping of code types to business concepts. Here model presumably means technical domain model, ie low-level concepts belonging to a concise bounded context, and not organisational domain. Whether code can reveal anything about the identified organisational domains is not clear - perhaps that is represented only in external documentation?

The relation between domains/subdomains/contexts and modules is currently unclear to me - and I don’t appear to be the only one.

Bounded Context and Ubiquitous Language

A bounded context is the context in which a specific model is valid. The context can include the “responsible team”, the “code repository”, the module within a monolithic code-base, etc. However in most cases a bounded context is a cohesive set of business functionality.

An ubiquitous language is a set of words and expressions which have a consistent and specific meaning within a bounded context. More than just a “dictionary”, it is a tool for talking about the problem and the code, ie defines everything needed to talk about the bounded context’s behaviour. For each bounded context there is likely to be some document that acts as a dictionary/glossary, defining the meaning of specific nouns and verbs within that context as precisely as possible.

A bounded context can also be defined by “listening to the ubiquitous language”; if a particular name represents two different things (different properties/operations/relations) then either the meanings need to be unified or there are actually two different bounded contexts here, each with a distinct language.

Unfortunately Vernon states (page 25) that “there is one ubiquitous language per bounded context” while Evans appears to instead consider the whole system under design to have one ubiquitous language with each bounded context having its unique dialect of that language. It’s not a big deal in practice; just pick one interpretation for your project. However this inconsistency does make communication harder than it needs to be.

Vernon’s usage appears to be common industry practice. However I can imagine that when a group of people work on multiple bounded contexts, a set of “shared terms” will naturally evolve - ie the dialect model seems to be more natural to me.

One thing all definitions agree on is that a single set of meanings can never unambiguously capture all business concepts for an organisation or industry. There needs to be a “scope” for the language - a bounded context. Whether the resulting language is considered a dialect or a separate language is not important.

And although an ubiquitous language is applicable to a bounded context, it isn’t intended to be used just when choosing names for things in the code-base; it is there to be used in all written and verbal discussions involving any combination of business experts, developers, testers, acceptance testers, etc. It’s how people express what the application should do, what it does do, and how it does it.

Bounded contexts can have different levels of abstraction; the entire project can be considered a bounded context while something fine-grained such as “the currency-handling module” is also a bounded context; it’s just a matter of what is defined as being “in scope” for the context. A large-scoped bounded context must have a high-level ubiquitous language (otherwise definition conflicts will start to occur) and a generic model. At the largest scale, it is possible to have a single ubiquitous language for an entire project/domain (ie the bounded context is the whole project) as long as the language limits itself to high-level definitions; it can contain terms such as “customer” or “invoice” as long as they are defined generically. Of course high-level languages are not appropriate for actually writing code; only more tightly-scoped bounded contexts will have a language precise enough to define a model which in turn is precise enough to be coded.

Different bounded contexts are generally referred to as “{name} context”, eg “the booking context” or “the inventory context”.

A bounded context usually corresponds to a particular set of domain experts (though some experts will contribute to multiple bounded contexts).

A bounded context usually corresponds to a set of use-cases. Those use-cases are probably not duplicated elsewhere, ie the set of domain experts for one bounded context are usually talking about different operations than those for a different bounded context.

A bounded context often supports “a set of workflows” - and it is not uncommon for a workflow to involve multiple deployed artifacts (aka physical subsystems) - particularly when some are “legacy” components.

A bounded context can also be considered an “IT system”, with a project that consists of multiple contexts being equivalent to a set of different interacting systems. This doesn’t mean each context has to be implemented as an independently deployable artifact; multiple contexts/systems could be multiple artifacts deployed simultaneously or could be a single modular artifact. Thinking of contexts as systems also suggests the option of phased delivery, where the most important bounded contexts are implemented and released to production first.

Because a bounded context can correspond to a distinct set of concepts (language and model), a distinct set of experts, a distinct set of use-cases and a distinct set of workflows, each bounded context often also has a distinct set of users with their own view of how that software should work to make their lives easier. This also makes the case that a bounded context is an “IT system” dedicated to that set of users. Failing to separate these contexts/models/systems means creating a single system whose model is not appropriate for some (or maybe all) of its users.

A bounded context should never be defined using technical terms (eg process, socket, database); use only business terms.

Pre-existing systems (internal or external) might have a natural model (forming a bounded context) or might not - in which case it isn’t a bounded context.

Although the language defines a maximum size for any bounded context, it may be possible to divide it further. Whether this is useful can depend upon the team and project. A small highly productive team might be able to handle a lot of types in a single model, while a project with more staff might need multiple contexts to avoid conflicts.

Favouring larger bounded contexts:

  • Flow between user tasks is smoother when more is handled with a unified model
  • It is easier to understand one coherent model than two distinct ones plus mappings
  • Translation between two models can be difficult (sometimes impossible)
  • Shared language fosters clear team communications

Favouring smaller bounded contexts:

  • Communication overhead between developers is reduced
  • Continuous integration is easier with smaller teams and code-bases
  • Larger contexts may call for more versatile abstract models
  • Different models can cater to special needs

In general the recommendation is to size bounded contexts so they can be managed by a team of 10 people. Some bounded contexts are naturally smaller, with little coupling to other contexts (a small API) in which case smaller is ok.

Oddly, Vernon seems to assume that each bounded context is a system process. In Chapter 13, “Integrating Bounded Contexts” he talks only about inter-process communication - as if the “modulith” concept did not exist. However it may be that he thinks “integrating contexts” in a single process is too trivial to mention, and is just being casual with his terminology.

Domain Model

A domain model is a structured representation of the business concepts from a specific bounded context at a level of detail that corresponds to the level of the bounded context.

A bounded context has an associated ubiquitous language and the names for things in the domain model are expected to be taken from that language.

From Vernon (page 62):

A domain model expresses an Ubiquitous Language as a software model.

However large-scoped bounded contexts and their associated generic ubiquitous languages produce only imprecise domain models - things that may be useful on their own, but are not precise enough to map to code. Earlier I dared to define a new term for this: a tactical domain model is one in which the bounded context and its ubiquitous language are precise enough that the model contains items from the DDD tactical patterns. This is the level at which they can guide the production of executable code.

A tactical domain model defines concepts relatively close to the code; typical items in such a model are entities, value objects, services, and aggregates, together with the operations these things support, the relations between them, and constraints on their state or behaviour.

Of course the corresponding term strategic domain model can be used to specify the more abstract, high-level types of model which still represent business concepts.

The term “context” is often used as a synonym for bounded context. The term “model” is often used as a synonym for domain model.

Any time something within a domain model (eg a type or operation) cannot be given a clear and unambiguous name, then that suggests that the responsibilities of that thing perhaps need to be factored into multiple parts and/or moved to a different bounded context/domain model. In fact, a bounded context can better be defined by what it does not have than what it does; it does not have any terms whose meaning is ambiguous or inconsistent. Any time that happens, that conflict is resolved by moving responsibilities to the model of a different bounded context.

The appropriate size for a bounded context (and thus its domain model) depends very much on the circumstances. A bounded context has a clear upper limit: when conflicting definitions of the same approximate concept are encountered, one of them has to be pushed out into another context, together with the other terms that interact with it. A bounded context also has a clear lower limit: when a discussion of some term in the context’s language requires some other term, then that other term must also be defined within the language. However that leaves a lot of play in between; sometimes a term can be deliberately defined in an abstract way with the details of its internals pushed into another bounded context. And non-overlapping/non-interacting sets of terms can be included within a single bounded context or placed in separate contexts. The guiding principle here is: what is easiest to understand, and what contributes most to discussions? Obviously, a focused minimal bounded context makes understanding the context easier and helps discussions of the responsibilities of that context. In addition, changes to a domain model typically require agreement from the whole team maintaining that context - making smaller contexts beneficial. However “switching contexts” has its own price. So here some pragmatic aspects can be allowed.

It is highly beneficial when a bounded context (with a tactical domain model) is limited in size to be manageable by a single team of developers, eg 10 people or so. It is also often useful for each bounded context to be a dedicated codebase with reasonable build and test times. However an excessive number of codebases has its own cost, as does each “boundary crossing” between bounded contexts/codebases. Refactoring is also easier within a single context/team - and easier within a single code repository.

A tactical domain model isn’t necessarily equal to a deployable artifact. An artifact can contain multiple bounded contexts (see modulith), or part of one (see fine-grained microservice). It can also be built from one (modularised) codebase/repository or from multiple codebases/repositories (via libraries). An artifact which contains parts of more than one bounded context is possible but dubious; it has the disadvantages of a modulith (team conflict, release cycles) with the disadvantages of a distributed architecture (complexity). One case where a context spans multiple components may be in integration with a legacy system, if the legacy system is well designed enough that its (tactical) domain model can be adopted by the calling bounded context.

Modules are a way to internally structure a model. In higher-level domain models, these are purely a way of making a complex model easier to understand. In a technical domain model modules can be reflected in the underlying language - eg Java packages or C++ namespaces, or even as distinct libraries.
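
For example (an illustrative layout of my own invention; the shipping flavour merely echoes Evans’ running example), in Java each module of a tactical domain model can literally be a package:

    // Modules of one "shipping" bounded context mapped onto Java packages:
    //
    //   com.example.shipping.routing  - the "routing" module
    //   com.example.shipping.cargo    - the "cargo" module
    //
    // A type declares its module membership via its package statement:
    package com.example.shipping.routing;

    public class Voyage {
        // schedule, legs, ports, etc.
    }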

There are of course code concepts which don’t get reflected in a domain model, eg some aspects of error-handling.

While there have been some attempts in the past to promote “model-driven code generation”, ie defining a model in some tool which then generates compilable code, this is generally considered an impractical/failed approach. While a domain model should map relatively directly to code (to promote long-term bidirectional communication), there are still significant technical aspects of behaviour which are needed in addition. The textual representation of code can also be considered as the real model, ie rather than “model generating the code”, we have “code revealing the model”; sometimes the best tool to use to define a domain model is in fact text/code - possibly with supplemental external diagrams.

Context Maps and Bounded Context Interactions

When a system under design is decomposed into multiple bounded contexts then it can be hard to understand the whole.

There are a number of suggested documents that can be created to resolve this. One is the context map which shows all the contexts and the nature of their interactions. This map may simply show the relation between the contexts, or might show the correspondence between particular types in different contexts.

A bounded context usually has some kind of API, ie has a set of types and operations that will be used by other contexts which interact with it. Evans (and Vernon) define different ways in which that interaction can occur. Even when multiple bounded contexts are embedded within the same (monolithic) artifact, this boundary still exists and must be respected.

Different kinds of inter-context relation imply different kinds of team interaction as well as different implications for code update and release. Having these documented clearly on a context map helps teams know who to involve in decision-making and time-planning. Collaboration patterns between contexts include:

  • unified model - a single bounded context with all developers/experts involved in defining and updating it
  • shared kernel - in which a shared context is extended by multiple other contexts
  • customer/supplier - in which one context defines types that another context relies on/extends
  • conformist - in which a “providing” context has a model good enough for a consuming context to just adopt as-is (and possibly extend)
  • anticorruption layers - in which a consuming context isolates itself from another context it must collaborate with by mapping model types
  • published language - a dedicated model just for the “api”
  • separate ways - no collaboration on model design, minimal links between them

The list above is sorted from highly-cooperative to highly-decoupled.

Of course the simplest solution is when the entire project has just one bounded context with a single associated domain model. Here, there are no cross-context issues and nothing needs to be done.

Bounded contexts may have a shared kernel, in which the exchanged types are identical in both contexts - same meaning, same state. This should only be done if the business meaning of the types in that shared kernel really are identical in all “importing” bounded contexts. This relationship obviously has benefits for the implementation (simplified communication), but also has costs: any change to that shared kernel needs to be agreed-on by the owners of all bounded contexts that rely on that shared kernel - and all implementations of those contexts then need to be updated appropriately. Note that this doesn’t necessarily mean those shared concepts are shared code - there might be a “shared module” or “shared library” defining those types, but other options are possible including a shared openapi specification from which the (exchanged) types are generated, or simply separate but compatible implementations. The context map tells anyone considering changing a shared kernel who they need to collaborate with.

Bounded contexts may alternatively have a customer/supplier relationship, in which case the “upstream” domain model belongs to one team, but changes are made in consultation with the “downstream” systems that rely on those definitions. Again, the context map tells the upstream team who to consult when they wish to change something, and tells the downstream team who they need to talk to if they want to request changes or new features.

The conformist relationship type is simpler: “downstream” users of types from a bounded context have no influence, and simply need to accept the model as it is. The context map tells the upstream team who they should inform when they change their model, but this is typically a one-way communication. This doesn’t mean the upstream domain model owners can arbitrarily make “breaking changes”; they should consider the consequences on downstream systems and perhaps provide backwards-compatibility mechanisms. While this pattern sounds “weaker” than the more collaborative approach, it can save a lot of headaches related to inter-team collaboration - and sometimes there just isn’t an option anyway.

In all of these kinds of inter-bounded-context relations, some types from an external context are “adopted” (more or less) into a different context. This doesn’t mean sharing code, just concepts and compatible implementations. In domain model diagrams, the “shared kernel” approach generally just means referencing the shared model documentation as “also part of this model”. For the others, the same approach may be used, or the relevant types could be copy/pasted into the downstream model’s documentation. Note that this isn’t just about implementation; relying on an external model is also a conceptual change that affects how a team talks about and thinks about the problems within their own bounded context.

Where the implementations of two bounded contexts need to communicate, but a single model of the exchanged data just doesn’t work effectively for both, then an anti-corruption layer can be used. This is just a block on the context map but means quite a lot for the implementation of the “downstream” model; the downstream model defines (in both docs and code) whatever types it finds most appropriate to address its problem, and a potentially complex piece of code is then responsible for translating/mapping between the API which the other context offers, and the local representation. This anti-corruption layer is so named because it prevents corruption of the model concepts of the downstream domain. The implementation is not part of the domain model itself; it is considered a “technical detail” even though this can be a significant amount of work.
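
A minimal Java sketch of such a layer, assuming a hypothetical upstream “legacy billing” context (all type names here are my own invention): the downstream context defines its own Invoice model, and the translator confines all knowledge of the upstream representation to one place.

    // Type as exposed by the upstream context's API (not under our control).
    public record LegacyBillingRecord(String custRef, long totalPence, String stateCode) {}

    // The downstream context's own model, named in its own ubiquitous language.
    public record CustomerId(String value) {}
    public record Money(long amountInCents, String currency) {}
    public enum InvoiceStatus { OPEN, PAID }
    public record Invoice(CustomerId customerId, Money total, InvoiceStatus status) {}

    // The anti-corruption layer: translation logic lives here and nowhere
    // else, so upstream model changes never leak into the downstream model.
    public class BillingTranslator {
        public Invoice toInvoice(LegacyBillingRecord record) {
            InvoiceStatus status = "P".equals(record.stateCode())
                    ? InvoiceStatus.PAID
                    : InvoiceStatus.OPEN;
            return new Invoice(
                    new CustomerId(record.custRef()),
                    new Money(record.totalPence(), "GBP"),
                    status);
        }
    }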

A published language is, as far as I can tell, a dedicated (and stable) domain model for an API. A system which exports data in a published language can conform to that language, ie can internally model data in the same way that the API represents it; alternatively it can use the anti-corruption layer pattern, ie have an internal model which it then maps to the API’s model when exporting that data. Similarly, a system which needs to use the API (consume that data) can conform (use the same model internally) or use an anti-corruption layer to map that API’s model to a form more appropriate for its internal use. The most important part is that regardless of which approach each context’s implementation uses, and even if those implementations change their models over time, the API remains stable. Often published languages are official standards. The point of the pattern is really that design in this case is “api first”, with an appropriate model chosen, then the data producers and consumers can choose to adopt the same model or map it to a different internal model.

The separate ways pattern tries to eliminate relations between two bounded contexts.

All of these patterns apply even to “modulith” applications containing multiple domains, ie there might be an anti-corruption layer used when making direct in-memory function calls between code in separate bounded contexts. In a monolithic codebase, if code from one bounded context just reaches across programming-language-module boundaries to refer to types from a different context, then this effectively creates an informal and undocumented shared kernel between the contexts - something that should be an official decision, not an implementation detail.

A unified model requires all team members to be “peers” designing a single model together. A shared kernel requires collaborating teams to act as peers for the shared part while developing the models of their own bounded contexts independently. Customer/supplier requires the supplier team to take change requests from the customer seriously; the supplier team owns its context’s model but needs to take others into account and the customer teams can then “import” the model of the supplying context into their context. Conformist is used when the supplier team does not collaborate at all, but does provide a model that is useful to the consuming context; the relationship is “take it or leave it” and sometimes “take it” is an acceptable option. Anti-corruption is used when collaboration on model design is not occurring (as with conformist), and the supplier’s model is quite different from the desired representation in the customer; a (possibly complex) translation layer is created to isolate the consuming context from the providing context. In separate-ways, there is no data-flow/communication/dependency between the projects at all; each implements functionality and sources data independent of the other. Sometimes multiple disconnected systems are indeed better than one integrated one.

Both Evans and Vernon also discuss a pattern called open host service but frankly I can make little sense of their descriptions. It seems initially as if they are assuming that a software component that provides an API creates a separate api for every other component that may interact with it - but why would anyone take such a labour-intensive approach, rather than create just a single general-purpose API? Perhaps they are also assuming some underlying technology such as EJBs which are not really “open” to clients using other technologies? Is this “api per consumer” pattern perhaps related to weaknesses in old coding approaches to authorization-management, using a different API per external system in order to be able to expose only operations that are permitted for that system, rather than using a generic API with fine-grained access-control? Any suggestions on what Evans and Vernon mean by this pattern would be very welcome..

Note that models are object-oriented concepts, and in-memory APIs are (usually) also object-oriented, while remote APIs might not be. REST APIs do have a somewhat OO flavour, as the request path typically specifies a resource which is equivalent to an aggregate root (a type and an id, eg /users/123). The operation of course is typically limited to GET/PUT/POST/etc rather than specifying a method-name. Other API types (eg gRPC, GraphQL, or async messages) have a non-OO style and therefore tend to have a somewhat different implicit model. However there is still typically a model “revealed” via an API.
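
A tiny hypothetical sketch of that correspondence (all names are mine): the path /users/123 denotes an aggregate root by type and id, and the handler resolves it via a repository fetch.

    import java.util.Optional;

    public record UserId(String value) {}
    public record User(UserId id, String name) {} // aggregate root

    public interface UserRepository {
        Optional<User> fetchById(UserId id); // returns aggregate roots only
    }

    public class UserResource {
        private final UserRepository users;

        public UserResource(UserRepository users) { this.users = users; }

        // Conceptually handles GET /users/{id}; framework wiring omitted.
        public User get(String id) {
            return users.fetchById(new UserId(id))
                    .orElseThrow(() -> new IllegalArgumentException("no such user: " + id));
        }
    }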

Core Domain

In some cases the label “core domain” is used to mark one or more of the domains of an organisation as “the most important ones” - at least for the purpose of this project. In other cases, it is a subset of the most important concepts from the most important domain (see Distillation).

In either case, this is one of the things that anyone new to the project will consult to get an “overview” of the project as a whole.

Vernon points out that which domain is “core” depends upon your viewpoint; different teams may see different domains as “core” depending on what they are working on - it is context dependent.

It is suggested that a vision statement be written for the core domain of a project, as a guiding principle for design: what problem is actually being solved?

Distillation and Large Scale Structure

Interestingly, Evans dedicates a whole chapter to distillation, while Vernon doesn’t mention it at all.

Evans defines this (page 512) as:

the abstraction of key elements in a model, or the partitioning of a larger system to bring the core domain to the fore.

Complicated problems can lead to quite large and complicated models. Evans mentions one project which started by licensing an “industry standard” model from a consortium which was 200 pages long.

DDD introduces two complementary approaches to make the overall design easier to grasp:

  • distilling
  • structure

In tactical domain models, distillation can be used to simplify the model, discarding things not needed for the problems to be solved (eg Evans page 13).

In strategic domain models, a different kind of distillation is applied, in which concepts of secondary importance are moved to subdomains or modules, thus making the most important concepts - the core domain and its core model - more obvious (Evans chapter 15). It is not about removing detail, but instead about making the important bits stand out. It can be as simple as marking specific types as important, but otherwise involves pushing less-relevant (but not useless) details into their own parts of the documentation/code so they don’t clutter the big picture. A domain vision statement is helpful in defining the overall problem domain in a short way.

When the process of distillation isn’t enough, ie the project is so complex that understanding the core and its relation to the other parts of the (overall) model is still hard, then Evans suggests a number of approaches to add structure to improve understandability.

In both cases, the point is to allow people to talk about specific parts of the overall model without hand-waving, and to find parts of the model which are relevant for any particular discussion - or when implementing a new feature.

Bounded contexts are obvious starting points; these typically have only a small number of commonalities with other parts of the model, so can be represented separately. A context map can then be used to make the links that do exist more understandable. However contexts are still part of the overall model.

Types with high cohesion can be grouped into modules, and this is a building-block of both distillation and structure but is not in itself sufficient for larger projects. As I understand it, modules in tactical domain models are usually real programming language features (also called packages or namespaces in some languages) while modules in strategic domain models are simply a grouping mechanism. I suspect “subdomain” and “module in a strategic domain model” are pretty close to the same thing.

Structure can be applied within contexts and modules, or over them.

Some domains naturally fall into layers, in which case it can be helpful to structure types or modules of types in this way. Sometimes layering applies just within a single bounded context, while in other cases the same layering might apply to multiple (or all) bounded contexts in a domain. These layers are domain concepts and should not be confused with technical layers such as a presentation-layer or persistence-layer.

A system metaphor can perhaps suggest an appropriate structure. We use the metaphor of a “desktop” for graphical UIs on personal PCs for example; if there is a suitable metaphor for the problem domain then defining modules according to this metaphor can make code discovery easier.

There is quite a lot of interesting information on these topics, and no summary can do them justice. See the books!

Physical Subsystem

This is a term from Vernon rather than Evans, and means a “deployable artifact”, a running application.

There is no direct relation between physical subsystems, bounded contexts, and domains/subdomains. A model of an existing system may draw boundaries that do not align with physical subsystems, and greenfield projects may choose to create physical subsystems that do not align with domain/subdomain/bounded-context boundaries. Nevertheless, as typically bounded-context == team, there might be some benefits to having a 1:1 relation between bounded context and physical subsystem (to simplify release processes) if that doesn’t conflict with other architectural goals.

Entities and Value Objects

There are a number of concepts defined for use in tactical domain models; in general these are well known in DDD and are well defined in both books so I won’t spend too much effort on them here. However for completeness, here’s a quick recap.

Entities and value-objects represent “things” in the problem domain, and have state (data) as well as behaviour. Entities have a concept of “identity” separate from their state, while value objects do not (two value objects with the same state are in every way the same thing).
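
As a minimal Java sketch (the Customer, CustomerId, and Money types are my own illustration, not taken from either book): an entity compares by identity, a value object by state.

    // Entity: identity is separate from state. Two Customers with the same
    // name are still different customers; equality is based on id alone.
    public class Customer {
        private final CustomerId id; // identity: stable for the object's lifetime
        private String name;         // state: may change over time

        public Customer(CustomerId id, String name) {
            this.id = id;
            this.name = name;
        }

        public CustomerId id() { return id; }

        public void rename(String newName) { this.name = newName; }

        @Override
        public boolean equals(Object o) {
            return o instanceof Customer other && id.equals(other.id);
        }

        @Override
        public int hashCode() { return id.hashCode(); }
    }

    // Value objects: no identity; two instances with the same state are
    // interchangeable. Java records provide state-based equality for free.
    public record CustomerId(String value) {}
    public record Money(long amountInCents, String currency) {}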

Aggregates and Repositories

An aggregate is a set of entities and value-objects consisting of a root and some associated “child” objects. Aggregates are used to define:

  • atomic units of update
  • collections of objects which may have inter-object invariants
  • units of in-memory navigation

The root object can be referenced “by id” while other objects in the aggregate are only accessible via navigation (using normal features of the programming language) from the root.

A tactical domain model groups entities and value-objects into aggregates to represent their time-centric transactional relationships (something otherwise hard to show on a diagram).
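
As a hedged Java sketch (the Order/OrderLine model is my own example, reusing the CustomerId record from the entity sketch above): the root is the only object referenced by id from outside, child objects are reachable only by navigation from the root, and entities in other aggregates are held as IDs only.

    import java.util.ArrayList;
    import java.util.List;

    // Aggregate root: the atomic unit of update and the only member that
    // may be referenced "by id" from outside the aggregate.
    public class Order {
        private final OrderId id;
        private final CustomerId customerId; // entity in another aggregate: id only
        private final List<OrderLine> lines = new ArrayList<>(); // children

        public Order(OrderId id, CustomerId customerId) {
            this.id = id;
            this.customerId = customerId;
        }

        public OrderId id() { return id; }
        public CustomerId customerId() { return customerId; }

        // Inter-object invariant enforced by the root: at most one line per product.
        public void addLine(ProductId productId, int quantity) {
            if (lines.stream().anyMatch(l -> l.productId().equals(productId))) {
                throw new IllegalArgumentException("product already in order: " + productId);
            }
            lines.add(new OrderLine(productId, quantity));
        }

        // Navigation within the aggregate is fine, but mutable internals must not leak.
        public List<OrderLine> lines() { return List.copyOf(lines); }
    }

    // Child object: only reachable via the Order root.
    public record OrderLine(ProductId productId, int quantity) {}

    public record OrderId(String value) {}
    public record ProductId(String value) {}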

A repository is a kind of service which provides persistence for an aggregate. It is generally considered unacceptable for entities and value-objects to perform persistence operations (ie access repository interfaces); the reasons for this are discussed in detail later. Repositories are sometimes accessed from domain services - eg business logic which is performing “bulk updates” or generating “data summaries”. However mostly repositories are used by application services to load appropriate aggregates which they then either pass to domain services or invoke methods on directly. Because repositories are sometimes accessed from domain services, their interfaces are considered part of the domain model. Their implementations, however, are not part of the domain model - and not even part of the domain layer.

A repository’s API is intended to mimic an in-memory collection of objects: put, fetch-by-id, and fetch-by-criteria; other operations are generally not needed. Fetch methods only return aggregate roots. For testing, a repository can be replaced with a real in-memory collection. Repositories map very naturally to object-oriented databases (document stores), but can be mapped to relational databases via various frameworks (eg for Java: JPA or JDO implementations such as Hibernate or DataNucleus). Evans (page 114) briefly discusses the advantages of separating the mechanism of persistence from the repository domain model level concept; sadly Evans says only “I won’t go into solutions to that problem, but they do exist”. Presumably he meant technologies such as JPA.
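
A sketch of that collection-like API, continuing the hypothetical Order aggregate from above (the interface and method names are my own, not from the books); the in-memory variant is exactly what the test replacement looks like.

    import java.util.List;
    import java.util.Map;
    import java.util.Optional;
    import java.util.concurrent.ConcurrentHashMap;

    // The repository interface belongs to the domain model; note that the
    // fetch methods return only aggregate roots, never child objects.
    public interface OrderRepository {
        void put(Order order);
        Optional<Order> fetchById(OrderId id);
        List<Order> fetchByCustomer(CustomerId customerId); // fetch-by-criteria
    }

    // In-memory implementation, usable directly in tests. A production
    // implementation (eg via JPA/Hibernate) lives outside the domain layer.
    public class InMemoryOrderRepository implements OrderRepository {
        private final Map<OrderId, Order> store = new ConcurrentHashMap<>();

        public void put(Order order) {
            store.put(order.id(), order);
        }

        public Optional<Order> fetchById(OrderId id) {
            return Optional.ofNullable(store.get(id));
        }

        public List<Order> fetchByCustomer(CustomerId customerId) {
            return store.values().stream()
                    .filter(o -> o.customerId().equals(customerId))
                    .toList();
        }
    }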

An aggregate is loaded and persisted as a single unit. When using a relational database this occurs in a “transaction”; when using a “document store” database, each aggregate is typically a single document which is by nature “atomic”.

An aggregate is considered to be a “consistent set” of objects, potentially obeying invariants. The methods of the individual types making up the aggregate need to enforce this. It can be useful to implement instantiation of an aggregate in a factory class or factory method in order to properly ensure invariants without having to add lots of complicated code to constructors; stateful model types often have complicated enough behaviour without being cluttered with once-per-lifetime initialisation code.

Factories might be represented as a “createFoo” operation on a domain model item, but are sometimes not relevant parts of the domain model. In either case, they are part of the domain layer in the code.
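
For example, continuing the hypothetical Order aggregate, a small factory can establish all invariants before the object is handed out, keeping the constructor trivial:

import java.util.UUID;

class Orders {
    // Factory method: the aggregate is never observable in an invalid state,
    // and once-per-lifetime setup stays out of the Order class itself.
    static Order open(ProductId firstProduct, int quantity) {
        Order order = new Order(new OrderId(UUID.randomUUID()));
        order.addLine(firstProduct, quantity); // reuses the same invariant checks as later updates
        return order;
    }
}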

Many articles about domain modelling state that there should be “rich relationships” between entities in the domain models. This is true on a logical level; if some entity is related to another entity then it is important to know and represent that. However such relationships should be defined only if they are relevant for the problem being solved, as such relationships can complicate code - and therefore shouldn’t be represented when not relevant. And more importantly, the existence of a relationship between type A and B doesn’t mean that the code for A has to provide a method that returns the associated instance(s) of B; that can have all sorts of complicated consequences. Often, simply storing the id of associated entities is sufficient.

A general rule for aggregates is to make them as small as possible; each aggregate is updated as a transaction and so the smaller they are, the less likely update-conflicts will be. An aggregate is also loaded as a (logically) atomic unit, meaning that smaller aggregates result in less database traffic. When using an object-store database, each aggregate is typically a single “document”, so obviously smaller is better. When using a relational database, “lazy” relations can potentially allow parts of an aggregate to be loaded on-demand, but such loads still need to be within the same transaction in order to produce a consistent view of the aggregate state - which has consequences for updates. Often an aggregate is just one type, ie the aggregate root is the entire aggregate.

When modelling relationships, it is important to be as constrained as possible. If a relation does not have to be bi-directional in order to solve the problem being addressed, then it is helpful to specify that in the domain model. In fact, as the model is implemented, that is likely a point the developers will raise as feedback on the model, as bi-directional navigation is harder to implement. This applies to both inter-aggregate and intra-aggregate relations.

Within an aggregate, references to other entities not belonging to the aggregate are stored only as IDs; this obviously makes loading of an aggregate from persistent storage reasonable. Types belonging to the aggregate may provide methods that return references to other objects within the same aggregate, but should not provide methods to return references to things outside of the aggregate; instead they just return the IDs and leave it up to the caller to map such IDs to objects if needed. This ensures that developers don’t accidentally write code that updates more than one aggregate at a time (by updating fields, then navigating via a reference to an object in a different aggregate and updating that too); each aggregate is a transactional unit so updating multiple aggregates always risks inconsistency (one transaction committing, the other failing).
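
In code this simply means holding an ID-typed field rather than an object reference. A variant of the earlier Order sketch (again, hypothetical names):

import java.util.UUID;

record CustomerId(UUID value) {}

class Order {
    private final OrderId id;

    // The customer belongs to a different aggregate, so only its ID is stored -
    // never an in-memory reference to a Customer object.
    private final CustomerId customer;

    Order(OrderId id, CustomerId customer) {
        this.id = id;
        this.customer = customer;
    }

    // Callers get the ID back; mapping it to an object is their decision.
    CustomerId customer() { return customer; }
}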

As each aggregate is updated in a transaction, the overall system should not offer APIs that require multiple aggregates to be updated simultaneously; it just isn’t possible. What is possible is to update one aggregate and post commands or events that trigger asynchronous updates to other aggregates (ie eventual consistency).

Small aggregates also support distributed data storage; all objects belonging to an aggregate must be in the same database instance in order to support transactional update, but those referenced only by ID may be stored elsewhere.

Navigation can trigger database access if a relation is lazy, but that is a side-effect that is invisible to the calling code; the interaction pattern from outside is still navigation and not lookup.

From Evans (page 149):

Any object internal to an aggregate is prohibited from access except by traversal from the root.

An aggregate may contain multiple entities (ie things with a unique database ID), but all IDs except for that on the aggregate root should be considered “for internal use only”. Other entities should only reference an external aggregate via the ID of an aggregate root. To remember this, just think of the case where storage is an object database (document store).

To summarize: an aggregate should contain a small number of objects (ideally just one). Its members may reference others by ID, but should not hold in-memory references to external objects.

Services

The word “service” is sadly rather overloaded. It can mean:

  • part of the “application layer” which acts as a facade over domain model types to connect incoming requests to the domain logic
  • a technical service, eg an interface to an external SMTP server for the purpose of sending emails
  • a “stateless” piece of business logic

Application-layer services are not part of a domain model. Technical services may be represented as an interface definition within the domain model, with the implementation done external to the domain model layer. Only the last type (stateless business logic) is considered fully part of a domain model, with both interface and implementation in the domain layer.

A domain model needs to represent operations (in addition to data and relations) and the recommendation is to bind these tightly to entities and value-objects where possible, with (domain) services used only when no more appropriate place exists for the operation. This assists the discussions that the model exists in order to support.

The three primary “nouns” found in a tactical domain model are: services, entities, and value objects. Domain services (holding domain logic) therefore have an important place in models - though generally secondary to entities and value objects.

From Evans page 82:

[definition of entity and value object]. Then there are those aspects of the domain that are more clearly expressed as actions or operations, rather than as objects. Although it is a slight departure from object-oriented modelling tradition, it is often best to express these as services rather than forcing responsibility for an operation onto some entity or value object. A service is something that is done for a client on request. In the technical layers of the software, there are many services. They emerge in the domain also, when some activity is modeled that corresponds to something the software must do, but does not correspond with state.

In addition, Evans’ book has a whole section dedicated to (domain) services (see pages 104 - 108). There is a lot of text that could be quoted; here are a few examples:

In some cases, the clearest and most pragmatic design includes operations that do not conceptually belong to any object. Rather than force the issue, we can .. include services explicitly in the model.

There are important domain operations that can’t find a natural home in an entity or value object. Some of these are intrinsically activities or actions, not things. [..] Now, the more common mistake is to give up too easily on fitting the behaviour into an appropriate object, gradually slipping towards procedural programming. But when we force an operation into an object .. the object loses its conceptual clarity and becomes hard to understand or refactor. Complex operations can easily swamp a simple object, obscuring its role. And because these operations often draw together many domain objects, coordinating them and putting them into action, the added responsibility will create dependencies on all those objects, tangling concepts that could be understood independently.

Some concepts from the domain aren’t natural to model as objects. Forcing the required domain functionality to be the responsibility of an entity or value object either distorts the definition of a model-based object or adds meaningless artificial objects.

Services are a common pattern in technical frameworks, but they can also apply in the domain layer.

Fine-grained domain objects can contribute to knowledge leaks from the domain into the application layer, where the domain object’s behaviour is coordinated. The complexity of highly detailed interaction ends up being handled in the application layer, allowing domain knowledge to creep into the application or user interface code, where it is lost from the domain layer.

This use of services provides one mechanism for avoiding the blurring of application-services with domain logic; where the operations in an application service start to embody business rules, push them into a new service class and add that class to the domain model. It is at least now visible - and can perhaps then be given an appropriate name and become a useful concept. A useful guideline is to consider if a piece of code could usefully be discussed with business experts.

The strategy pattern can be used to pass policies to methods on entity and value types; such policies can be represented as domain services.
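
A minimal sketch of that idea (all names invented): the policy is a small domain service interface, and the entity method accepts it as a parameter rather than hard-coding any particular rule:

import java.math.BigDecimal;

// A policy modelled as a domain service (the strategy pattern).
interface DiscountPolicy {
    BigDecimal discountFor(Invoice invoice);
}

class Invoice {
    private final BigDecimal grossAmount;

    Invoice(BigDecimal grossAmount) { this.grossAmount = grossAmount; }

    BigDecimal grossAmount() { return grossAmount; }

    // The entity stays ignorant of the concrete rule; callers pass the policy in.
    BigDecimal amountDue(DiscountPolicy policy) {
        return grossAmount.subtract(policy.discountFor(this));
    }
}

// One concrete policy; others can be swapped in without touching Invoice.
class TenPercentOverOneHundred implements DiscountPolicy {
    public BigDecimal discountFor(Invoice invoice) {
        return invoice.grossAmount().compareTo(BigDecimal.valueOf(100)) > 0
                ? invoice.grossAmount().multiply(new BigDecimal("0.10"))
                : BigDecimal.ZERO;
    }
}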

Suggestion from Evans: name services using verbs not nouns.

Domain Events

Vernon spends quite a lot of time addressing the topic of events emitted by domain models. This isn’t mentioned in Evans and appears to be a subject that was created/invented in the interval between these books.

As noted, an aggregate is the unit of transactional persistence. However business workflows often require multiple such aggregates to be updated, sometimes in the same system and database, and sometimes in different databases owned by different systems. These kinds of problems can potentially be solved by having aggregates emit asynchronous events as part of their updates, and having other systems listen and respond to these. See Vernon for more details on this.
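
Vernon describes several implementation strategies; as a minimal hedged sketch (names invented), an aggregate can record events while it is being updated, and the application layer can then publish them once the enclosing transaction has committed:

import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// A domain event: an immutable record of something that has happened.
record OrderShipped(OrderId order, Instant when) {}

class Order {
    private final OrderId id;
    private final List<Object> pendingEvents = new ArrayList<>();

    Order(OrderId id) { this.id = id; }

    void ship() {
        // ... state changes and invariant checks go here ...
        pendingEvents.add(new OrderShipped(id, Instant.now()));
    }

    // Drained by the application service after a successful commit, then handed
    // to whatever messaging mechanism (message bus, outbox table, etc) is in use.
    List<Object> drainEvents() {
        List<Object> events = List.copyOf(pendingEvents);
        pendingEvents.clear();
        return events;
    }
}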

Other Interesting Topics

Alternatives to Object Orientation

Although the books, and most of the literature, on DDD are focused on object-oriented programming languages, the principles still seem usable with other programming styles.

What is important is that the same model be useful for representing knowledge of the domain experts and representing significant features of the codebase that is developed to solve the problems. The model doesn’t need to map 1:1 to either (and particularly to code) but the mapping should be simple and obvious and navigable in either direction (given an item in the model identify the corresponding piece of code, or given a piece of code identify the corresponding item in the model).

When the language used is Object Oriented, then the model is centered around entities, value-objects, and services. And in fact this seems to be the most natural shared model between domain-experts and developers. This is no surprise as object-oriented programming actually has its roots in system modelling; the language SIMULA-67 was designed exactly for that (eg modelling ships, cargo, and ports to optimise harbour management) and the ideas followed into Smalltalk and then other OO languages such as Java/C#/Python which are commonly used to build business systems today. Note however that even in the presented OO-centric form, DDD doesn’t rely on polymorphism/virtual-dispatch or interface/implementation distinctions, ie doesn’t really need an OO language, just an “OO-like” one (“OO-ish”?). Evans discusses the dominance of OO (and the reasons for it) on page 116; as well as being a good common ground between experts and devs, the extensive community, resources, and tools for doing DDD in OO make it a good choice - but it doesn’t seem the only one.

Evans (page 119) does briefly mention the possibility of doing DDD using Prolog; presumably such domain models would look quite different (based on facts and rules) than the ones that are developed to support development in an OO language (based on entities and operations), but a “common ground” between experts and developers seems possible and therefore a single model can support both problem-description and code-description which is the important thing.

Evans does spend quite a lot of time talking about the benefits of side-effect-free functions, ie is at least sold on the concept of “functional in the small, OO in the large” in the context of DDD - something that hybrid languages such as Kotlin or Scala support.

A simple internet search shows that there is quite a lot of information (videos and books) on the subject of domain-driven-design in functional languages; I haven’t personally investigated yet but it seems entirely feasible to develop a model that works both for business and for functional code; see presentations from Marco Emrich or Scott Wlaschin for example.

Although Evans seems convinced that “procedural languages” are not practical to use with the concepts of DDD, I’m not 100% convinced it is impossible. It’s probably a moot point, as it’s likely not a good idea to do it, but what DDD even in its traditional format (as presented in the book) relies on is:

  • types representing a set of properties
  • operations on the types - which could be methods or plain functions
  • modularity, ie some way of grouping types and operations together (modules/namespaces) - classes are only one type of module
  • a way of enforcing invariants on types (methods/property-accessors are only one way to do that)

The primary benefit of OO, in the context of DDD, seems to be that classes provide “automatic” modularity, grouping sets of data and operations on that data together. This gives business and developers the opportunity to consider these “modules” as a concept (a user, an invoice, etc). As a secondary benefit, it provides a mechanism for enforcing invariants. However such modularity is available in non-OO languages in ways other than classes.

It seems to me that even “C” could achieve this, with a moderate amount of discipline. Each entity in the domain could be an opaque type definition in a header file, and the associated operations can be function declarations in that header file taking that type as its first parameter. The associated implementation file then defines the fields of the type and the function implementations - ie provides data-hiding so that the fields remain private. Alternatively, the fields could be declared in the header-file, but code-check tools could ensure they are never directly referenced - ie “private” data is enforced by an external tool rather than the compiler. This approach seems to make a “model” feasible - though it does seem to be using C as “pseudo object oriented”. I’m not seriously proposing doing DDD in “C”, just pointing out that if “C” could do it, then probably other non-OO languages can do it better.

In some places, Evans seems to use the expression “procedural” as a synonym for large functions that contain multiple concepts. This isn’t an intrinsic part of procedural programming though, or something that cannot happen in an OO method. Different tasks should be separated into different functions, and those functions should be assigned to appropriate modules, giving developers and domain-experts a chance to refer to them by name. That applies to all programming languages.

As we are talking about problems with significant domain logic, the programming language used will typically need to fulfil these anyway:

  • mainstream acceptance
  • developer performance over runtime performance
  • debuggability over runtime performance

In practice this probably means an OO language such as C#, Java, Kotlin, Python, or Javascript. However things such as the Lisp family would also theoretically be candidates; they seem to have the necessary features to also represent types, operators on those types, and modules. I have had a long discussion with someone doing DDD in F# and being very happy with it.

The Modelling Process

Much of Evans’ book (the later parts) talks about the process of finding/discovering/developing a good model. While there is lots of good info here, it’s hard to summarize.

As an architect or developer:

  • Read the literature for the domain
  • Listen to the domain experts
  • Ask lots of questions
  • Look for places where descriptions of things in the model are controversial (not everyone agrees), confusing, or vague.
  • Try to separate technical coordination from business logic; there may be things that can be “pushed down” into the model

Note: “design” is the technical code, “modelling” is the business aspect. The design should “reveal” the model; the model should “map to” the design.

Businesses used to be centered around paper forms before automation started. First a form (with various fields) would be filled out; a workflow would then be applied to this immutable form, in which the form flowed from department to department until reaching some end state. This suggests that “functional programming” is in some senses a better model for many businesses than object-oriented design.

However it would be a poor idea to structure a domain model as a passive entity plus a set of “department” objects, each with a set of operations representing the workflows they can apply to a specific entity type. The reason is that departments are not stable; they regularly get re-organised. Workflows are far more stable than management structures, so instead consider creating modules consisting of the forms (entities) grouped together with the workflows that can be applied to those forms (operations). This is effectively equivalent to the OO representation: data grouped together with the operations that apply to that data. In other words, whether you are a fan of functional or OO programming, the domain-model representation is effectively the same.

A model does not need to be mapped 1:1 to code; it just needs to be an obvious and consistent transformation (and one which can be followed in both directions). Data and associated operations need to be in some kind of module/namespace in order to be easily understood; a class is one kind of module/namespace but others are fine too.

Development Processes, Team Structures, and Human Interaction

Because a successful project needs a domain model and language which directly connects the domain experts and the developers, the role of “business analyst” (BA) becomes questionable. While traditional (non-coding) business analysts are good at helping domain experts develop a model, that model isn’t necessarily one that is helpful to the developers. Therefore either BAs need to become “coaches” to the model development process, or the developer representatives also need to play the role of BA. Evans (page 48) goes into depth on the problems with a separated “analysis model”. To repeat a quote:

Model-driven design discards the dichotomy of analysis model and design to search out a single model that serves both purposes.

Evans and Vernon talk a lot about models evolving during development, and rightly so. This means that everybody involved in the project is in some sense doing analysis ie domain model refinement. However every project needs to have at least a partial model before coding starts, ie there will be an initial phase with a smaller pool of experts and just a few representatives from the software development side. In general, these developer representatives probably have the job-title of architect. However whoever is doing this must be:

  • willing to learn at least the core concepts of the problem domain
  • competent in domain modelling
  • and able to represent the developers properly here, ie is still a competent developer.

An architect who is an active developer is also helpful during the initial coding phase; even with a good model developed, there is still a potential communication gap between the model and the developers. Only a few domain experts were involved, and they had deep contact with the architect during creation of the model, ie no communication issues should remain there. But a larger pool of coders now need to try to map that model into code - and therefore may need help in understanding it. That help is best done by someone who truly speaks their language; and the help is often best provided via example code. It may also well be the case that the current model is not easily mappable to code, in which case the model needs to be revised. A “coding architect” is in the best place to tell whether implementation problems are due to coders misunderstanding the model, or an incorrect model.

As noted, developers are also expected to be what Evans terms “hands-on modellers”; responsibility for design is shared and communication from implementers up to influence the model is not only allowed but critical for success. When developers truly understand the model, then factoring out of duplicate code results in new and helpful concepts (items in the model) and not just technical code reuse. Those new concepts can lead to “breakthroughs” in functionality and performance.

Model development is iterative - particularly in the case where a model must function both as a description of the problem domain and a skeleton for the code.

Evans is no fan of generating code from diagrams; the diagrams are just not expressive enough. At the same time, they have too much detail for many uses. A single diagram of “the entire system” can be overwhelming. Instead, he tends to write documents with lots of embedded diagrams each describing an aspect of the system. And the code is the “complete spec”; documents exist to provide additional info and context for the code, and to prompt the common understanding of all participants already established through discussions.

Having a ubiquitous language allows documents to be more concise; a single well-defined term can be used rather than a long vague reference.

General advice to developer-modellers (Evans page 321) is to keep trying to improve the model while implementing. To achieve this:

  • Live in the domain
  • Keep looking at things in a different way
  • Maintain an unbroken dialog with domain experts

Continuous integration (aka “trunk-based development”) is recommended in order to ensure that different developers don’t evolve the model in incompatible ways. Developers should merge their changes regularly (eg daily) into the “release branch” and then this branch should be built and all automated tests applied to it. Assuming that existing code has good unit tests that verify the desired behaviour, any change to the model that breaks an earlier developer’s assumptions will be detected at this point (if not earlier) and a discussion about the desired model behaviour can be had - first between developers and then if needed with domain experts. The result is either a correction of misunderstanding or an improved domain model.

Layered Architecture aka Isolating the Domain

While developers and domain experts need to talk about business concepts, there is a lot of code in an application which is purely technical and doesn’t need to be discussed. It is therefore important to structure the code-base to separate this technical infrastructure part from the domain-related part. Without this, “it’s hard to see the forest for the trees”. Automated tests also become awkward; business logic typically requires large numbers of tests and there really is no need to mock infrastructure components when doing this (as long as separation is present).

Domain-driven design therefore often applies the Hexagonal Architecture (aka Ports and Adapters) pattern, or the similar Onion Architecture pattern. Both of these isolate the implementation of business logic from other concerns.

The Application Layer

DDD makes a useful distinction between the domain layer, the application layer, and “non-domain code”.

The application layer is where “use cases” are implemented by orchestrating domain layer operations. The domain layer is a kind of “toolbox” from which the use-cases are composed. Note however that the application layer methods should be simple and obvious - suggesting that complex use-cases should be seriously analysed to find logical operations which can then be pushed down into the domain layer. There does seem to be a fuzzy border between what is a “usecase in the application layer” and what should be a domain service.

It’s also likely that use-cases will be discussed with business experts, yet the application layer is not considered a topic for discussion with business experts - only the domain model. So this seems somewhat inconsistent.

In any case, “adapters” (in hex arch terms) such as embedded UI layers or remote service layers interact with the application layer and not the domain model directly. In particular, persistence transaction handling is part of the application layer meaning that at least all commands must go through the application layer.

A use-case might also potentially require orchestrating operations from multiple bounded contexts, so use-case discussions are about more than just a single bounded context. Note however that the aggregate is the unit of transactionality, so requests typically should not modify objects in more than one context; see section on aggregates for more info.

One of the points of modelling is to define a structure that is more representative of the problem domain than just a block-of-code-per-use-case. When developers are simply given a set of implementation tickets of form “given input X the system should return Y”, then the end result may be a system that fulfils exactly those requirements - but it will not have an elegant and minimal code-base, it will be difficult to understand, and it will not be easily adaptable to future requirements. In effect, the entire system will be implemented in the application layer. Software based upon a suitable model of the problem domain will be better in all these aspects - and that model needs to be kept separate from non-domain code.

The term “transaction script” is sometimes used for this approach where each use-case is implemented directly, with few or no “model” concepts present. This can be useful for simple/trivial applications but doesn’t scale well to complex domains.

Persistence

This article has already talked about aggregates and repositories in terms of “what DDD recommends”; however how to actually implement that is a non-trivial topic.

The whole of Evans chapter 6 is dedicated to “the lifecycle of a domain object”, including persistence issues.

The most recommended pattern (as far as I can tell) is for all persistence-related operations to be performed at the application service layer before the domain model types are invoked. The expected code pattern is roughly as follows (a Java sketch appears after the list):

  • a network-api-handler decodes incoming data into an appropriate data-structure then calls an application service
  • the application service
    • starts a database transaction
    • loads an aggregate-root by id (via a repository)
    • calls an operation on that object
    • commits the transaction
    • returns the result to the network-api handler
  • the network-api-handler then converts that result into a suitable network-format response
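
Sketched in Java (the transaction abstraction and all names are placeholders of mine, not from either book):

// Minimal placeholder abstraction over whatever transaction mechanism is in use.
interface TransactionRunner {
    void run(Runnable work);
}

class ShippingApplicationService {
    private final OrderRepository orders; // repository interface from the domain model
    private final TransactionRunner tx;

    ShippingApplicationService(OrderRepository orders, TransactionRunner tx) {
        this.orders = orders;
        this.tx = tx;
    }

    // One use-case: pure orchestration, nothing business-relevant.
    void shipOrder(OrderId id) {
        tx.run(() -> {
            Order order = orders.byId(id).orElseThrow();
            order.ship();      // all business logic lives in the aggregate
            orders.put(order); // may be a no-op for frameworks with change-tracking
        });
    }
}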

The api-handler, application service, and repository are outside of the domain model, ie are not part of the domain model documentation and are not referred to in discussions with domain experts. They should therefore do nothing business-relevant; if they aren’t part of discussions with the experts then the reasons for that should be obvious.

Sometimes application services need to load more than one aggregate, and sometimes they call a domain service rather than a method on an aggregate root. However these sections of code really must do nothing business-relevant. This also ensures that such methods have “consistent level of abstraction” ie don’t mix low-level and high-level concepts.

In general, methods on aggregates only access other objects on the same aggregate, so the need for an aggregate to access other objects is rare. Where this is needed, they can generally be passed in by the application service as a parameter. Where the object is determined based on data, the application service can potentially pass in a “provider” object (a kind of special-purpose repository) that can return the relevant object on demand.

The fact that this “application-level code” controls the database transaction is what gives it the alternative name “transaction script” - although that name typically refers to code that does all the work itself, whereas an application service instead delegates to the domain model.

From Vernon page 266 (sadly no reason for this guideline is given here):

As a rule of thumb, we should try to avoid the use of Repositories from inside Aggregates.

And from Vernon page 279:

A Service in the domain is welcome to use Repositories as needed

Having entities access repositories is generally unnecessary and suggests a design problem as aggregates are “atomic units” which should be concerned only with their own state. An entity might possibly provide a “factory method” which returns a new entity (eg “cloning”) - in which case the caller (an application service or domain service) is responsible for registering that new object with the appropriate repository. Services might need more sophisticated access, eg methods which produce summaries/reports - those encode business logic but perform access across multiple entities. In this case, a Repository interface may provide suitable “bulk” operations which the domain service then invokes.

Having entities access repositories is particularly difficult as they somehow need access to these “stateless” repository service(s); doing “dependency injection” into stateful types such as entities and value objects is difficult and generally not done. Passing repository references in to methods is possible, but ugly. Doing dependency injection on stateless domain services is, on the other hand, not a problem.

It is generally assumed that entities do not “save” changes. With document-centric databases, the caller (a service) should know whether the invoked method mutates something, and if so will save the entity by just storing the whole aggregate (document). With a relational database it is assumed that some framework is used which does some kind of “change tracking” so that the appropriate fields which need writing back to the database can be determined automatically.

It is of course possible to write arbitrary SQL statements that update data in a database directly, eg inserting a contact-phone-number for a customer without loading the customer object and its associated map of contact-phone-numbers. However comprehension and maintainability should override performance; load the aggregate (the customer profile), invoke methods on it, and persist it again rather than directly messing with its persistent state. This ensures the proper validation, logic, and invariants implemented by the model types are applied. Note: this is also more portable to an OO database if that is an option in the future; OO databases naturally persist aggregates and (unlike relational DBs) don’t allow random data to be updated.

Interestingly, on page 160 Evans suggests that it may be worth making some compromises in the domain model to make the mapping to a database “transparent”. This doesn’t mean that model-to-tables needs to be 1:1; some transformation is acceptable but it should be an understandable and consistent one. Alternatively, it is acceptable to have quite different database and memory models - but that should be a deliberate choice. One issue to watch out for is that refactoring of code is easy, but schema updates are not.

Dependency Injection into Domain Model Types

Domain services may well need references to other services - including Repository implementations. As these stateless services are singletons, this is pretty easy to do - either just manually (pass the objects to the domain service constructors) or via a standard dependency injection framework.

Stateful domain types (entities and value objects) should not need access to repositories (as explained earlier), but if they have complex logic in their methods then they might well need access to other domain services or to “infrastructure services” of various types. There are several options to deal with this:

  1. pass the required services in as parameters to methods which need them
  2. put logic requiring access to services only in other services, never in stateful types
  3. perform dependency injection on entities and value-objects as they are instantiated
  4. use global references to services (“lookup”)

Option (1) can be appropriate in some cases. However it does force internal implementation details to be exposed via the model type’s API; the fact that a particular service is relied on might otherwise be an “implementation detail”. The necessary parameter may also need to be passed through multiple methods to reach the place where it is actually used.
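
As a small illustration of option (1) (names invented), note how the reliance on the service becomes part of the entity’s public API:

import java.math.BigDecimal;

interface TaxCalculator {
    BigDecimal taxOn(BigDecimal netAmount);
}

class Invoice {
    private final BigDecimal netAmount;

    Invoice(BigDecimal netAmount) { this.netAmount = netAmount; }

    // Option (1): the caller supplies the service. The dependency on
    // TaxCalculator is now exposed rather than being an implementation detail.
    BigDecimal grossAmount(TaxCalculator taxes) {
        return netAmount.add(taxes.taxOn(netAmount));
    }
}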

Option (2) does move the domain model somewhat in the direction of an “anemic model”. The majority of methods can still be defined directly on the stateful types, but some things that might otherwise be on such types will be domain services instead - which might in turn require those stateful types to expose more of their internal state than they would otherwise do. This isn’t the end of the world in my opinion, ie can be appropriate in places.

Option (3) seems tempting, but is actually rather hard to implement. It requires hooking some dependency injection framework into the persistence framework (to inject as objects are loaded via repositories) and requires every other place where instances of the type are created to also use the dependency injection framework (eg using a DI-framework-provided factory such as JPA’s support for Provider<T>).

Option (4) is one I find horrible. This is the pattern of accessing services via global variables - also known as “static fields” in some languages. The static singleton pattern, in which a type has a global/static method that returns the value of an (internal) global/static field, is just the same thing. This does allow any code to obtain a reference to any service at any time, but has a number of nasty consequences: it hides dependencies, it makes it impossible for different parts of the code to use different instances of a service, and it greatly complicates unit testing. Sadly it appears quite frequently in DDD examples (Evans even suggests applying this pattern on page 108). If this pattern is used, then I would recommend having just one global “service provider” object which in turn can return all the different services for which “lookup” is supported; this at least allows unit tests to configure just this one global field with an instance that provides only the services that the unit test is expected to need - and which fails on any attempt to fetch an unexpected service. Note that the static singleton pattern as originally described by the “gang of four” is fine; it’s only a problem when the instance is set by something else.
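
A sketch of that single-provider mitigation (my own construction, not something from the books):

import java.util.Map;

// One global access point instead of many scattered static singletons.
final class ServiceProvider {
    private static volatile ServiceProvider instance;

    private final Map<Class<?>, Object> services;

    private ServiceProvider(Map<Class<?>, Object> services) {
        this.services = Map.copyOf(services);
    }

    // Production code installs the full set of services; a unit test installs
    // only the services it expects, so any unexpected lookup fails fast.
    static void install(Map<Class<?>, Object> services) {
        instance = new ServiceProvider(services);
    }

    static <T> T lookup(Class<T> type) {
        Object service = (instance == null) ? null : instance.services.get(type);
        if (service == null) throw new IllegalStateException("no service registered for " + type);
        return type.cast(service);
    }
}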

Here’s the kind of question and answer that leads to anaemic domain models due to persistence issues: if persistent model types are expected to hold complex logic which relies on services (a rich domain model), then dependency injection into entities seems like it would be helpful. Sadly, as noted above, this just isn’t standard practice, and is in fact hard enough to get right that, despite such dependency injection feeling like the right solution (to me), options (1) and (2) are probably “less bad” given the lack of out-of-the-box injection support for entities. Certainly in the question referenced above, the developer fell back to option (2) when option (1) would probably have been better.

Interestingly, the Apache Causeway framework supports injection into entity types. However it appears to use this to allow entity types to obtain repository references in order to do persistence operations - something that we discussed above and that DDD appears to discourage - so maybe this shouldn’t be taken as best practice for DDD.

The following subsection looks at how option (3), ie dependency injection into entities, might be implemented in Java environments. This doesn’t mean it is recommended - in fact, I would personally suggest choosing either (1) or (2) on a case-by-case basis, whichever happens to be the “least worst” for that particular method.

Dependency Injection for Entities with JPA, Hibernate, or JDO

The previous section looks at whether dependency injection for stateful types (DDD entities and value objects) is sensible. This section looks at how this could be done (if you so choose). Note that I haven’t personally ever set up dependency injection for entities, just researched the possibility.

JPA has the concept of entity listeners (since early days) and these are CDI-enabled (since 2.1), ie can themselves rely on dependency-injection, including the ability to inject the “dependency injection context” itself. Listener logic can be executed in phase PostLoad, ie cannot control the instantiation of an entity, but can at least do field-injection after the entity has been created and populated with attributes from the database. Entity classes can be annotated with the appropriate listener as shown below, or a “default entity listener” for the whole persistence context can be defined, ie the EntityListeners annotation is no longer required on entities.

@Entity
@EntityListeners(DependencyInjectionListener.class)
public class ...
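
What such a listener might look like, as a hedged sketch (I haven’t set this up end-to-end myself; the domain service and marker interface below are hypothetical placeholders):

import jakarta.inject.Inject;
import jakarta.persistence.PostLoad;

// Hypothetical placeholder types for the sketch.
interface SomeDomainService {}
interface ServiceAwareEntity { void setDomainService(SomeDomainService service); }

// Runs after an entity has been instantiated and populated from the database.
public class DependencyInjectionListener {

    @Inject // the listener itself can be CDI-injected (JPA 2.1+)
    private SomeDomainService someDomainService;

    @PostLoad
    public void injectInto(Object entity) {
        if (entity instanceof ServiceAwareEntity aware) {
            aware.setDomainService(someDomainService);
        }
    }
}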

When using EJBs or spring-boot, injection support for EntityListener classes is auto-enabled. In other cases, you may need to pass a property to the persistence-manager during setup. That property is jakarta.persistence.bean.manager (legacy name: javax.persistence.bean.manager) and the value must be a CDI BeanManager. Alternatively if using Hibernate then pass property org.hibernate.resource.beans.container referencing an object of type org.hibernate.resource.beans.container.spi.ExtendedBeanManager.

Spring 5.1 added class org.springframework.orm.hibernate5.SpringBeanContainer which is the necessary implementation to support the above behaviour (in earlier versions you need to implement that yourself, though it isn’t a big job). And recent versions of spring-boot include class LocalContainerEntityManagerFactoryBean which sets this up automatically, meaning that entity listener classes have normal Spring lifecycles. This is all available since 2019. It seems that spring-boot expects a bean of name “entityManagerFactory” of type LocalContainerEntityManagerFactoryBean, and wires this into the persistence-manager as it is being set up - ie the DI integration can be overridden reasonably easily by just providing a custom bean with this name.
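
A sketch of that wiring for a plain (non-boot) Spring setup - untested by me, and the package name is a placeholder:

import javax.sql.DataSource;
import org.hibernate.cfg.AvailableSettings;
import org.springframework.beans.factory.config.ConfigurableListableBeanFactory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.orm.hibernate5.SpringBeanContainer;
import org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean;
import org.springframework.orm.jpa.vendor.HibernateJpaVendorAdapter;

@Configuration
class JpaConfig {
    // Hands Spring's bean factory to Hibernate so that entity listeners (and
    // other Hibernate-managed beans) are created via Spring dependency injection.
    @Bean
    LocalContainerEntityManagerFactoryBean entityManagerFactory(
            DataSource dataSource, ConfigurableListableBeanFactory beanFactory) {
        var emf = new LocalContainerEntityManagerFactoryBean();
        emf.setDataSource(dataSource);
        emf.setJpaVendorAdapter(new HibernateJpaVendorAdapter());
        emf.setPackagesToScan("com.example.domain"); // placeholder package
        emf.getJpaPropertyMap()
           .put(AvailableSettings.BEAN_CONTAINER, new SpringBeanContainer(beanFactory));
        return emf;
    }
}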

Further hints for JPA, Spring and Hibernate setups can be found here (particularly last comment) and here.

If you wish to, or need to, hook into the Hibernate framework more directly, the Session type is created via a SessionFactory which is initialised via a Configuration instance, and this supports setInterceptor. The Interceptor has a method instantiate which appears to be a good point at which to do field injection.

Hibernate supports config option hibernate.cdi.extensions (ALLOW_EXTENSIONS_IN_CDI) which is documented as “create beans other than converters and listeners”. It isn’t clear what this means, and searching the internet provides no useful hits. If you wish to dependency-inject entities, it might be worth looking into. See also Hibernate class AvailableSettings for config options such as JAKARTA_CDI_BEAN_MANAGER, BEAN_CONTAINER and ALLOW_EXTENSIONS_IN_CDI.

Note that even if creation of entities can be delegated to a DI framework, it still isn’t possible to create immutable entities with JPA; JPA insists on setting fields directly or calling setter-methods. Constructors taking the fields of the object are not supported. This means that fields cannot be marked final. It is of course possible to provide no setters, making the fields at least logically immutable.

JDO also supports entity-lifecycle-listeners. And because JDO is more “api-driven”, there is an API for registering these listeners. It is therefore not necessary for the JDO framework as such to “support injection” into these types; just create them with the appropriate parameters (eg a reference to the DI context) and then pass them to PersistenceManager.addInstanceLifecycleListener (or PersistenceManagerFactory) on persistence startup. However if you wish, CDI integration in JDO can also be done by passing the CDI framework as a config-param. Setup appears to be done via property datanucleus.cdi.bean.manager in the PersistenceManagerFactory properties. The same disadvantages as listed for JPA above apply: the bean is created with new and then loaded, and then can be injected with dependencies - workable, but not the most elegant solution.
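
A sketch of that API-driven registration (the JDO types and the registration call are real; the listener body is left as a placeholder):

import javax.jdo.PersistenceManager;
import javax.jdo.listener.InstanceLifecycleEvent;
import javax.jdo.listener.LoadLifecycleListener;

// Invoked by JDO after an instance has been loaded from the datastore.
class InjectingLoadListener implements LoadLifecycleListener {
    @Override
    public void postLoad(InstanceLifecycleEvent event) {
        Object entity = event.getSource(); // the freshly loaded instance
        // ... perform field injection on 'entity' here, eg via a CDI BeanManager ...
    }
}

class PersistenceSetup {
    static void registerListeners(PersistenceManager pm) {
        // A null class array applies the listener to all persistent classes.
        pm.addInstanceLifecycleListener(new InjectingLoadListener(), (Class[]) null);
    }
}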

There is of course always the manual approach: put a facade over the repository classes and do the injection there. However this doesn’t work with lazy-loading; lazily-loaded entities are not fetched via the facade, and therefore no dependency injection will occur for them.

Possibly Hibernate custom UserType mappings could be used to inject stuff, though it looks non-trivial.

Here is an interesting comment by Gavin King, the primary inventor of Hibernate:

I really think you’re setting up a false dichotomy here. I’ve heard this same argument before from other people. It is claimed that:

  • it is very much more OO to put business behavior with data, and
  • our entities hold most of the interesting data, so
  • therefore, we need dependency injection in our entities.

But this argument is missing a couple of steps. First, you need to show that:

  • the kind of interesting business behavior that makes sense to package with the persistent data is the same logic that we usually want to implement in injectable beans, and
  • there is no other natural way to get injectable beans into an entity other than injecting them directly.

But I’m simply incredibly skeptical on both of these points. IMO, the kind of business logic that really belongs on your entities is the stuff that uses the entity, and other entities that are related to it by graph traversal, and not “random other stuff” that you get by injection. And if you really can find me some cases where you need to use the random other stuff in an operation that truly does belong on an entity, I don’t see why you can’t just pass that stuff as method parameters from something else that you can inject into.

I wouldn’t necessarily agree; DDD is primarily about business applications, and I’m not sure how much Gavin has worked with these. Note that we are talking about injecting domain services into entities, not application services; the DDD book makes the difference between these very clear. However Gavin’s view here makes it clear why Hibernate and JPA don’t (easily) support injection into entity beans. IMO it’s a big call for a library (such as Hibernate/JPA) to dictate to an application’s architect how their code will be structured, when adding support for instantiating entity classes via a DI framework is almost trivial.

DDD, Distributed Systems, and Microservices

While Evans was clearly aware of the potential for DDD-based systems to be distributed, he appears to consider this mostly to be an architectural detail which is “out of scope” for DDD. There is a brief discussion of packaging software for deployment on page 387 but it doesn’t help much. Evans does mention (page 108) that a service may act as facade over a set of domain objects to represent an interaction point between systems, reducing the “chattiness” needed to interact with remote objects.

Although written 10 years later, Vernon doesn’t spend much time talking about the interactions between DDD and distributed systems either - though the discussions on domain events sometimes lean in this direction. However there is a vast amount of information on DDD and microservices available on the internet - possibly too much.

In Evans part 3 a lot of text is dedicated to the concept of iteratively improving the domain model. However this is most effective within a single code-base; if in the first phase of a project you divide a system into modules and create a microservice for each module, each with its own database, then this hinders such “model-level refactoring”. Moving code within a single code-base is relatively easy; moving data between tables is harder but doable. Moving code between repositories and changing component remote APIs is hard, and moving data between databases is harder still. This suggests:

  1. start any project with a monolith, and move to microservices as late as reasonably possible. This gives you a chance to learn about the domain, ie you get a better shot at a good subdomain partitioning.

  2. make the microservices as large as possible. This still gives you easy refactoring within each code-base.

As soon as components are separated by a network and have different release-cycles, a lot of effort needs to be put into designing stable and upgradable interfaces between them. Postponing this as long as possible seems like a good idea.

In my article on distributed read models I suggest passing “current state” messages between components using Kafka compacted topics. This does of course also complicate any model changes; these messages are aggregates (not events) and as recommended anyway by DDD, aggregates should be as small as reasonably possible. Given that the transactional constraints of a system are unlikely to change (are stable requirements), these messages should be reasonably stable as long as they truly are minimal sets of entities that need to be atomically updated.

In the case where all system code is built into a modulith, it doesn’t matter too much whether bounded contexts represent “vertical” slices of a system or “horizontal” ones. Vertical slices typically cause lower coupling between the teams which own the contexts but that depends on the system under design. However if creating a distributed system then it is important that the artifacts (which are based on bounded contexts) provide vertical slices of user-facing functionality, and not horizontal slices. The horizontal approach will result in either deep synchronous call chains, or deployable artifacts containing fragments of multiple bounded contexts. Given that bounded context == team, and deployable artifact == team, that obviously is not consistent. DDD’s focus on “domain experts” doesn’t necessarily address that - it depends on whether these are experts on user-facing behaviour or “back office” behaviour.

Supple Design Concepts

Evans dedicates a whole chapter (chapter 10) to the concept of “supple design”, ie making code nicely refactorable/recombinable/understandable. Below are a few “tactical” patterns related to supple design which I found particularly interesting.

Intention-revealing interfaces (Evans page 246): Name classes and operations to describe their effect and purpose, not their implementation. Names should use the ubiquitous language so that readers can understand its meaning. In the public interfaces of the domain model, state relationships and rules, but not how they are enforced; describe actions and events but not how they are carried out; formulate the equation but not the algorithm.

Side-effect-free-functions (Evans page 250): Use pure functions where possible. Keep concepts of commands and queries separate. Any functional-programming fan can tell you about the benefits of pure functions!

Assertions (Evans page 255): Where side-effects occur, document the changes via assertions. They include pre-conditions, post-conditions (what side-effects are expected), and invariants (what is always true). Assertions may be part of the model (particularly for aggregates). They should be validated either via explicit checks in the code, or unit-tests.

Evans documents several additional patterns which are all useful, but of less impact than the above.

Evolving Order

This term is used regularly through the later part of Evans’ book. The point being made is that nothing about the model, from its fine details (entities etc) through to its large-scale structure (eg layers), should be considered sacred and immutable. Creating an executable program is a process of design and discovery and not just mechanical implementation. No design will be correct on the first (or fifth) attempt. When problems are discovered, this needs to trigger re-evaluation of the model (which is equivalent to the code structure). Improvements in the model then need to trigger changes in the code structure (ie refactoring) and improvements in the code structure (refactoring) needs to trigger updates to any model documentation. This is particularly important in the large-scale structure used to represent the model.

This is actually why simply saying “the code is the model” is tempting in some ways. However this does make discussion with non-programmers more difficult. It can also be difficult to see the important parts and avoid the “clutter” when a model is simply “the code”.

Handling Duplication of Data and Logic

It is very important that each data item in a system has only one owner. Sadly this is something that Evans does not address at all; Vernon does do so implicitly (via discussions of domain events, CQRS, etc). Any time two different contexts have an entity with the same field, one of those fields needs to be marked as “owner” and the other(s) as “view-only”. The consequences of two different contexts thinking they have the right to modify that field are obvious - that can break invariants defined by the other entity, and cause race-conditions when using one database and divergence when using multiple.

Repositories and Aggregates don’t seem to resolve this. Yes, a repository returns a specific Entity. However there seems no rule that prevents a different bounded context from defining a repository that loads/saves a different Entity that happens to have similar attributes which map to the same tables.

Somewhat related, but less important, is duplication of operations. When some business rule is implemented separately in two different contexts, and that rule is changed, then what happens? In some senses, this can be considered acceptable: the rule might actually only be coincidentally the same in both contexts, and the fact that the rule in one context has evolved doesn’t necessarily mean it has evolved in the other.

A bounded context corresponds to a set of use-cases, and to a set of domain experts. It is unlikely that use-cases are duplicated, ie experts generally know that “this case is handled over there”. In particular, updating particular data fields hopefully belongs to a set of use-cases that naturally fall within a single bounded context, so that conflict over data ownership (who can write) is not common.

Nevertheless this suggests that fine-grained bounded contexts should be treated with care.

The Entity Service Antipattern

One danger waiting for people doing DDD is to think that a set of logical functionality automatically corresponds to a deployable artifact. For example, a large number of projects will have a concept of invoices and logic associated with issuing them and tracking payment for them. However mapping this directly to an artifact which provides APIs for creating and updating invoices may not be an optimal solution.

In general, it is best to create systems which are vertical slices of functionality from the user’s point of view, with a single request handled with as few inter-process communication points as possible. However following DDD without careful thought can sometimes lead to a horizontally sliced system in which handling a user’s request requires making a chain of synchronous RPC calls through a set of layered components. This creates single points of failure, and causes implementation of new requests (user features) to cross component boundaries and thus involve multiple teams (assuming components belong to teams). This problem is sometimes called the Entity Service Antipattern.

It may be possible to partially resolve the single-point-of-failure by making interactions with such components asynchronous, but that still doesn’t address the development-time coupling. It is therefore worth considering whether that central concept (eg invoices) can actually be made domain-specific, ie each domain (set of requests) may deal with that concept in its own way rather than rely on a central component. One possible approach is to design components (and possibly domains) around the concept of “value streams” or “customer journeys” (sets of customer-facing use-cases).

Nice Quotes from Evans

Evans has a very nice writing style, and the ability to create very effective statements. Here are a few that stood out for me..

This is a design book, but I believe that design and process are inextricable. Design concepts must be implemented successfully or else they will dry up into academic discussion. (Preface, xxii)

Continuous refactoring is a series of small redesigns; developers without solid design principles will produce a code base that is hard to understand or to change - the opposite of agility. And although fear of unanticipated requirements often leads to overengineering, the attempt to avoid overengineering can develop into another fear: a fear of doing any deep design thinking at all.

Every model represents some aspect of reality or an idea that is of interest. A model is a simplification. It is an interpretation of reality that abstracts the aspects relevant to solving the problem at hand and ignores extraneous detail.

A domain model is not a particular diagram; it is the idea that the diagram is intended to convey. It is not just the knowledge in a domain expert’s head; it is a rigorously organized and selective abstraction of that knowledge. A diagram can represent and communicate a model, as can carefully written code, as can an English sentence.

Domain modeling is not a matter of making as “realistic” a model as possible. It is more like moviemaking, loosely representing reality to a particular purpose.

A well-written implementation should be transparent, revealing the model underlying it.

If the (implemented) design, or some central part of it, does not map to the (analyst’s) domain model, that model is of little value, and the correctness of the software is suspect. At the same time, complex mappings between models and design functions are difficult to understand and, in practice, impossible to maintain as the design changes. A deadly divide opens between analysis and design so that insight gained in each of those activities does not feed into the other.

The natural course of events is for (context) boundaries to follow the contours of team organisation. People who work together closely will naturally share a model context. Most project managers intuitively recognise these factors and broadly organise teams around subsystems. (page 344)

Decisions about whether to expand (an existing model) or to partition (into) bounded contexts should be based on the cost-benefit trade-off between the value of independent team action and the value of direct and rich integration. In practice, political relationships (..) often determine how systems are integrated. (page 382)

References and Further Reading

Some potentially useful links..