Domain-Driven Design

What is a domain?

The field for which a system is built. Airport management, insurance sales, coffee shops, orbital flight, you name it.

It's not unusual for an application to span several different domains. For example, an online retail system might be working in the domains of shipping (picking appropriate ways to deliver, depending on items and destination), pricing (including promotions and user-specific pricing by, say, location), and recommendations (calculating related products by purchase history).

What is a model?

"A useful approximation to the problem at hand." -- Gerry Sussman

An Employee class is not a real employee. It models a real employee. We know that the model does not capture everything about real employees, and that's not the point of it. It's only meant to capture what we are interested in for the current context.

Different domains may be interested in different ways to model the same thing. For example, the salary department and the human resources department may model employees in different ways.

What is a domain model?

A model for a domain.

What is Domain-Driven Design (DDD)?

It is a development approach that deeply values the domain model and connects it to the implementation. DDD was coined and initially developed by Eric Evans.

What is the blue book that everyone is talking about?

The blue book is "Domain-Driven Design: Tackling Complexity in the Heart of Software" by Eric Evans, the founder of DDD. It is the defining text on Domain-Driven Design, and it comes highly recommended.

What is a ubiquitous language?

A set of terms used by all people involved in the domain, domain model, implementation, and backends. The idea is to avoid translation, because as Eric Evans points out,

Translation blunts communication and makes knowledge crunching anemic.

That is, every time we have to translate concepts between people — "oh, you're using 'user' in these cases where I'm using 'account'" — we lose a direct ability to think clearly about the thing we are building and to let new knowledge flow back and forth between domain and implementation.

Investing in a ubiquitous language pays off in that it makes communication clearer, and allows teams to see more opportunities.

What is a bounded context?

A division of a larger system that has its own ubiquitous language and domain model. The pricing, shipping, and recommendations aspects of an online retailer would count as separate bounded contexts, as they have significantly different concerns.

As with other DDD concepts, bounded contexts are most valuable when carried through into the implementation.

How do I go about identifying bounded contexts?

Some common things to look for are:

  • The natural boundaries in an organization (within a bounded context, most often you'll find that people collaborate and communicate closely; between bounded contexts the communication is less, and often asynchronous)
  • Where the same word is given different meanings (product to pricing is a thing with a price; product to shipping is a thing with a weight and dimensions, etc.)

Generally, good bounded contexts look like products (a pricing strategy product, a shipping calculation product, a product recommendation engine product, etc.). This aligns well with the products-over-projects team structure.

How isolated should a bounded context be from the rest of my system?

Quite strongly. In general, direct dependencies are best avoided. For example, in .NET, separate assemblies would be fairly sensible. In a distributed paradigm, such as SOA or microservices, it would be normal to put process boundaries between bounded contexts.

How can I communicate between bounded contexts?

Exclusively in terms of their public API. This could involve subscribing to events coming from another bounded context. Or one bounded context could act like a regular client of another, sending commands and queries.

What are entities? What are value objects?

Entities or reference types are characterized by having an identity that's not tied to their attribute values. All attributes in an entity can change and it's still "the same" entity. Conversely, two entities might be equivalent in all their attributes, but will still be distinct.

Value objects have no separate identity; they are defined solely by their attribute values. Though we are typically talking of objects when referring to value types, native types are actually a good example of value types. It is common to make value types immutable. For example, String in many languages is immutable, and every time you want to "change" a string, you derive a new one.
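The distinction can be sketched in C#. The `Customer` and `Money` types below are hypothetical and only illustrate the two kinds of equality; they are not taken from any particular library.

```csharp
using System;

// Entity: equality is based on identity only (the Id), never on attributes.
public class Customer
{
    public Guid Id { get; }
    public string Name { get; set; }   // attributes may change freely

    public Customer(Guid id, string name)
    {
        Id = id;
        Name = name;
    }

    public override bool Equals(object obj) => obj is Customer other && other.Id == Id;
    public override int GetHashCode() => Id.GetHashCode();
}

// Value object: immutable, and equal whenever all attribute values are equal.
public sealed class Money : IEquatable<Money>
{
    public decimal Amount { get; }
    public string Currency { get; }

    public Money(decimal amount, string currency)
    {
        Amount = amount;
        Currency = currency;
    }

    // "Changing" a value object derives a new one, as with strings.
    public Money Add(decimal extra) => new Money(Amount + extra, Currency);

    public bool Equals(Money other) =>
        other != null && other.Amount == Amount && other.Currency == Currency;
    public override bool Equals(object obj) => Equals(obj as Money);
    public override int GetHashCode() => (Amount, Currency).GetHashCode();
}
```

Two `Customer` objects with the same `Id` are the same customer even if their names differ, while two `Money` objects are interchangeable as soon as amount and currency match.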

From an event sourcing perspective, both entities and value objects play important roles in the domain, but only entities need be persisted, since only these change.

Commands and events

What is an event?

An event represents something that took place in the domain. Events are always named with a past-participle verb, such as OrderConfirmed. It's not unusual, but not required, for an event to name an aggregate or entity that it relates to; let the domain language be your guide.

Since an event represents something in the past, it can be considered a statement of fact and used to take decisions in other parts of the system.

What is a command?

People request changes to the domain by sending commands. Commands are named with a verb in the imperative mood and may include the aggregate type, for example ConfirmOrder. Unlike an event, a command is not a statement of fact; it's only a request, and thus may be refused. (A typical way to convey refusal is to throw an exception.)

What does a command or an event look like?

Commands and events are simply data structures that contain data for reading, and no behavior. We call such structures "Data Transfer Objects" (DTOs). The name indicates the purpose. In many languages they are represented as classes, but they are not true classes in the real OO sense. Here's an example of a command:

// Command: a request to confirm the order with the given ID.
public class ConfirmOrder {
    public Guid OrderId;
}

And here's an example of an event:

// Event: the fact that the order was confirmed, and when.
public class OrderConfirmed {
    public Guid     OrderId;
    public DateTime ConfirmationDate;
}

What is the difference between a command and an event?

Their intent.

What is immutability? Why are commands and events immutable?

For the purpose of this question, immutability is not having any setters, or other methods which change internal state. The string type in Java and C# is a familiar example; you never actually change an existing string value, you just create new string values based on old ones.

Commands are immutable because their expected usage is to be sent directly to the domain model side for processing. They do not need to change during their projected lifetime in traveling from client to server.

Events are immutable because they represent domain actions that took place in the past. Unless you're Marty McFly, you can't change the past, and sometimes not even then.
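One way to get this kind of immutability in C# is to use get-only properties that are assigned once in the constructor. The sketch below is only an illustration of that approach, using the ConfirmOrder and OrderConfirmed names from this FAQ.

```csharp
using System;

// Immutable command: values are fixed at construction and exposed read-only.
public sealed class ConfirmOrder
{
    public Guid OrderId { get; }

    public ConfirmOrder(Guid orderId)
    {
        OrderId = orderId;
    }
}

// Immutable event: a fact about the past, so there are no setters at all.
public sealed class OrderConfirmed
{
    public Guid OrderId { get; }
    public DateTime ConfirmationDate { get; }

    public OrderConfirmed(Guid orderId, DateTime confirmationDate)
    {
        OrderId = orderId;
        ConfirmationDate = confirmationDate;
    }
}
```

Any attempt to assign to `OrderId` after construction is rejected by the compiler, so a message cannot change on its way from client to server.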

What is command upgrading?

Upgrading commands becomes necessary when new requirements cause existing commands not to be sufficient. Maybe a new field needs to be added, for example, or maybe an existing field should really have been split into several different ones.

How do I upgrade my commands?

How you do the upgrade depends on how much control you have over your clients. If you can deploy your client updates and server updates together, just change things in both and deploy the updates. Job done. If not, it's usually best to have the updated command be a new type and have the command handler accept both for a while.

Could you give an example of names of some versioned commands?

Sure.

UploadFile
UploadFile_v2
UploadFile_v3

It's just a convention, but a sane one.
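As a rough sketch of a handler that accepts both versions for a while: the fields on the commands, the `FileUploaded` event, and the default content type are all assumptions made up for this example.

```csharp
using System;
using System.Collections.Generic;

// Original command shape (hypothetical field, for illustration only).
public sealed class UploadFile
{
    public string Path { get; }
    public UploadFile(string path) { Path = path; }
}

// Newer version: adds a content type that old clients cannot send.
public sealed class UploadFile_v2
{
    public string Path { get; }
    public string ContentType { get; }
    public UploadFile_v2(string path, string contentType)
    {
        Path = path;
        ContentType = contentType;
    }
}

public sealed class FileUploaded
{
    public string Path { get; }
    public string ContentType { get; }
    public FileUploaded(string path, string contentType)
    {
        Path = path;
        ContentType = contentType;
    }
}

public class UploadHandler
{
    // Old clients: translate to the new command with an assumed default.
    public IReadOnlyList<FileUploaded> Handle(UploadFile c)
    {
        return Handle(new UploadFile_v2(c.Path, "application/octet-stream"));
    }

    // New clients: the single place where the real logic lives.
    public IReadOnlyList<FileUploaded> Handle(UploadFile_v2 c)
    {
        if (string.IsNullOrWhiteSpace(c.Path))
            throw new ArgumentException("Path is required.");
        return new[] { new FileUploaded(c.Path, c.ContentType) };
    }
}
```

Once no client sends the old command any more, the first `Handle` overload and the `UploadFile` type can simply be deleted.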

Command/Query Responsibility Segregation

What is CQRS?

CQRS means "Command-query responsibility segregation". We segregate the responsibility between commands (write requests) and queries (read requests). The write requests and the read requests are handled by different objects.

That's it. We can further split up the data storage, having separate read and write stores. Once that happens, there may be many read stores, optimized for handling different types of queries or spanning many bounded contexts. Though separate read/write stores are often discussed in relation to CQRS, this is not CQRS itself. CQRS is just the first split of commands and queries.

CQRS sounds like one of those newfangled diets. Who made up the term?

Greg Young.

He has been complaining for years about search engines innocently asking "Did you mean CARS?" when one searches for CQRS.

I've heard there's something called CQS too. What is it, and how does it relate to CQRS?

CQS means "Command-query separation". It was introduced by Bertrand Meyer as part of his work on the Eiffel programming language.

It means that a method is either a command performing an action, or a query that returns data, but not both. Being purely action-performing methods, commands always have a void return type. Queries, on the other hand, should not have any observable side effect on the system itself.

Originally, CQRS was called "CQS", too. But it was determined that the two are different enough for CQRS to have its own name. The main distinguishing feature is this:

  • CQS puts commands and queries in different methods within a type.
  • CQRS puts commands and queries on different objects.
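The difference can be sketched in a few lines of C#. The shopping-cart types here are hypothetical, and the shared `CartStore` is an assumption that reflects the point that CQRS does not require separate read and write stores.

```csharp
using System;
using System.Collections.Generic;

// CQS: commands and queries are separate methods on the same type.
public class Cart
{
    private readonly List<string> _items = new List<string>();

    public void AddItem(string sku) { _items.Add(sku); }   // command: returns void
    public int ItemCount() { return _items.Count; }        // query: no side effects
}

// Shared storage used by the two CQRS objects below.
public class CartStore
{
    public List<string> Items { get; } = new List<string>();
}

// CQRS: the write requests are handled by one object...
public class CartCommands
{
    private readonly CartStore _store;
    public CartCommands(CartStore store) { _store = store; }
    public void AddItem(string sku) { _store.Items.Add(sku); }
}

// ...and the read requests by a different one.
public class CartQueries
{
    private readonly CartStore _store;
    public CartQueries(CartStore store) { _store = store; }
    public int ItemCount() { return _store.Items.Count; }
}
```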

Can CQRS ever be a simplification?

Sure. Generic repositories are a common sight in many systems. They work well in CRUD scenarios - typically, those you may not be applying DDD to. They tend to work out fine for creates, updates, deletes, and reading individual entities. But as soon as there's a query that spans multiple entities, where should it go?

Rather than agonizing over it, and trying to shoe-horn queries into the generic repository arrangement, it's far easier to put them on a separate object. There is no question about where they go, and they can return simple, lightweight DTOs of data.

CQRS doesn't have to mean doing event sourcing, introducing commands, events, read sides, sagas, and so forth.

Will CQRS not make my application more complex?

A typical CQRS + Event Sourcing system will seemingly have more components, since commands, events, exceptions, and queries become part of the public interface. Aggregates, command handlers, read side projections, sagas, and clients further contribute to the proliferation of components.

However, each component is neatly uncoupled from the rest. Originally, "complex" means "braided together". The components in a CQRS+ES system are independent in a way that favors reasoning about the system, and responding to changing requirements:

  • The public interface of message types forms a layer of your application that encourages you to think in terms of user intent, not updating data.

  • The division of the system into client, write side, and read side makes it easy to divide work between various teams.

  • Perhaps most importantly, testing becomes very natural, even of the most important and complex parts of the business logic.

Should the write side always be independent of the read side?

No. But it often helps - for example, by enabling event sourcing to be used on the write side, which can offer a lot of benefits.

Event sourcing

What is event sourcing?

Storing all the changes (events) to the system, rather than just its current state.

Why haven't I heard of event stores before?

You have. Almost all transactional RDBMS systems use a transactional log for storing all changes applied to the database. In a pinch, the current state of the database can be recreated from this transaction log. This is a kind of event store. Event sourcing just means following this idea to its conclusion and using such a log as the primary source of data.
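To make the idea concrete, here is a minimal sketch in which the current state is never stored, only derived by replaying events. The bank-account events are hypothetical examples, and a plain list stands in for the event store.

```csharp
using System;
using System.Collections.Generic;

public abstract class AccountEvent { }

public sealed class MoneyDeposited : AccountEvent
{
    public decimal Amount { get; }
    public MoneyDeposited(decimal amount) { Amount = amount; }
}

public sealed class MoneyWithdrawn : AccountEvent
{
    public decimal Amount { get; }
    public MoneyWithdrawn(decimal amount) { Amount = amount; }
}

public static class Account
{
    // Current state is never stored; it is derived by replaying the events.
    public static decimal BalanceFrom(IEnumerable<AccountEvent> history)
    {
        decimal balance = 0m;
        foreach (var e in history)
        {
            switch (e)
            {
                case MoneyDeposited d: balance += d.Amount; break;
                case MoneyWithdrawn w: balance -= w.Amount; break;
            }
        }
        return balance;
    }
}
```

Replaying only a prefix of the list gives the balance as it was at that earlier point in time.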

What are some advantages of event sourcing?

  • Ability to put the system in any prior state. Useful for debugging. (E.g. what did the system look like last week?)
  • Having a true history of the system. Gives further benefits such as audit and traceability. In some fields this is required by law.
  • We mitigate the negative effects of not being able to predict future needs, by storing all events and being able to create arbitrary read-side projections as needed. This allows for more nimble responses to new requirements.
  • The kinds of operations made on an event store are very limited, making the persistence very predictable and thus easing testing.
  • Event stores are conceptually simpler than full RDBMS solutions, and it's easy to scale up from an in-memory list of events to a full-featured event store.

Is event sourcing a requirement to do CQRS?

No. You can save your aggregates in any form you like. However, event sourcing works well with CQRS, and brings a number of additional benefits.

What if an event in the event queue turns out to be wrong?

In an event queue, new events are added to the end of the queue. Events are never removed or changed. (Just as in an accountant's ledger, incidentally.) Compensating actions are what you can add in order to correct actual mistakes. They are simply events which cancel out earlier events.
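A small sketch of the ledger analogy, with hypothetical `Ledger` and `LedgerEntry` types: the only write operation is an append, and a mistake is corrected by appending an entry that cancels it out.

```csharp
using System.Collections.Generic;

public sealed class LedgerEntry
{
    public string Description { get; }
    public decimal Amount { get; }
    public LedgerEntry(string description, decimal amount)
    {
        Description = description;
        Amount = amount;
    }
}

public class Ledger
{
    private readonly List<LedgerEntry> _entries = new List<LedgerEntry>();

    public IReadOnlyList<LedgerEntry> Entries => _entries;

    // The only write operation: append to the end.
    public void Append(LedgerEntry entry) { _entries.Add(entry); }

    // Correct a mistake by appending an entry that cancels it out.
    public void Compensate(LedgerEntry mistake)
    {
        Append(new LedgerEntry("Reversal of: " + mistake.Description, -mistake.Amount));
    }

    public decimal Total()
    {
        decimal sum = 0m;
        foreach (var e in _entries) sum += e.Amount;
        return sum;
    }
}
```

After a compensation the mistaken entry is still there, so the history shows both the error and its correction.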

Won't the use of event sourcing make my system slow?

No.

Applying events to build up the current state does take some time. But processors are really fast; applying events takes on the order of microseconds. For most domains, performance isn't a problem.

Furthermore, the tight aggregate boundaries that come hand in hand with event sourcing should lead to systems that will scale well horizontally.

What is snapshotting?

An optimization where a snapshot of the aggregate's state is also saved (conceptually) in the event queue every so often, so that event application can start from the snapshot instead of from scratch. This can speed things up. Snapshots can always be discarded or re-created as needed, since they represent computed information from the event stream.

Typically, a background process, separate from the regular task of persisting events, takes care of creating snapshots.

Snapshotting has a number of drawbacks related to re-introducing current state in the database. Rather than assume you will need it, start without snapshotting, and add it only after profiling shows you that it will help.
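For reference, the mechanics can be sketched as follows. The counter aggregate is hypothetical, and the sketch assumes a snapshot records how many events it already includes.

```csharp
using System.Collections.Generic;

public sealed class CounterIncremented
{
    public int By { get; }
    public CounterIncremented(int by) { By = by; }
}

// Snapshot: computed state plus the number of events it already includes.
public sealed class CounterSnapshot
{
    public int Value { get; }
    public int Version { get; }
    public CounterSnapshot(int value, int version)
    {
        Value = value;
        Version = version;
    }
}

public static class Counter
{
    // Start from the snapshot (if any) and apply only the later events.
    public static int Load(IReadOnlyList<CounterIncremented> stream, CounterSnapshot snapshot)
    {
        int value = snapshot == null ? 0 : snapshot.Value;
        int start = snapshot == null ? 0 : snapshot.Version;
        for (int i = start; i < stream.Count; i++)
            value += stream[i].By;
        return value;
    }

    // Snapshots are derived data, so they can be re-created at any time.
    public static CounterSnapshot TakeSnapshot(IReadOnlyList<CounterIncremented> stream)
    {
        return new CounterSnapshot(Load(stream, null), stream.Count);
    }
}
```

Loading with or without the snapshot must give the same result; the snapshot only changes how many events have to be applied.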

How do I version/upgrade my events?

You leave them as-is in the event store, because it is conceptually an append-only list. However, both write side and read side can "upgrade" incoming events in their handlers. An event can always be upgraded to a newer version... if not, it was probably not a newer version after all, but a completely different event type.
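A sketch of such an upgrade step, applied as events are read. The `CustomerRegistered` event, its two versions, and the name-splitting rule are all assumptions invented for this example.

```csharp
using System;

// Version 1 as originally stored (hypothetical shape, for illustration).
public sealed class CustomerRegistered
{
    public string FullName { get; }
    public CustomerRegistered(string fullName) { FullName = fullName; }
}

// Version 2 splits the name into two fields.
public sealed class CustomerRegistered_v2
{
    public string FirstName { get; }
    public string LastName { get; }
    public CustomerRegistered_v2(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }
}

public static class EventUpgrader
{
    // Called on events as they are read; the stored event is never modified.
    public static object Upgrade(object storedEvent)
    {
        if (storedEvent is CustomerRegistered v1)
        {
            // Assumed rule: everything after the first space is the last name.
            int space = v1.FullName.IndexOf(' ');
            return space < 0
                ? new CustomerRegistered_v2(v1.FullName, "")
                : new CustomerRegistered_v2(v1.FullName.Substring(0, space),
                                            v1.FullName.Substring(space + 1));
        }
        return storedEvent;   // already the latest version
    }
}
```

With a step like this in front of them, handlers only ever need to know about the newest version of each event.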

How do I handle a growing/large event store over time?

Events are usually quite small, and you can easily store, index, and search millions of them on a low-end relational database.

That said, it's always good to plan ahead, and pick a serialization format that serves you well in terms of size. JSON tends to be smaller than the corresponding XML, for example.

If you need your events to be smaller still, algorithmic compression is also an option, as is a compact binary serialization format such as Google's Protocol Buffers.

For the cases where you actually literally run out of hard drive space: disks are cheap nowadays. Consider saving historical events in some permanent storage. The events carry important business value; do not throw them away.

If the event store outgrows a single machine, then it is easy to shard first by aggregate type, and with a little content-based routing even at the level of aggregates themselves.

Could I persist commands, too?

It's often useful to log your commands, because they contain important information about the requests made on the domain model.

But commands are not events, and they don't belong in the event store. Simply consider logging of the commands as an additional aspect to be wrapped around your command handlers.

CAP and eventual consistency

What is the CAP theorem?

The CAP theorem states that in a distributed system, you can have at most two of the following three properties at a given point in time:

  • Consistency
  • Availability
  • Partition tolerance

To understand why, imagine what happens when two nodes on either side of a partition try to update the whole system.

At what level does CAP apply?

CAP is fine-grained. You can make different choices in different parts of your system. For example, for accepting orders, usually availability is desirable as you don't want to lose the orders!

What is eventual consistency?

A de-emphasizing of immediate consistency (that is, everything having the same view of the data all the time) in a system, in exchange for higher availability and greater autonomy of components.

Messaging

How do I handle duplicate command/event issues?

In the transport layer.

Should I use push or pull when publishing my events?

Push has the advantage that events can be pushed as they happen. Pull has the advantage that read sides can be more active and independent. Pull with a local event cache on the read side seems to us to be the nicest and most scalable solution. Push can work nicely with reactive programming and web sockets, however. Again, you needn't make the same choice everywhere in a system.

Testing

How can I test my CQRS application?

Using exclusively the commands, events, and exceptions.

What is behavioral testing?

Testing purely based on an object's behavior, without talking about its state. Concretely, this means that we only ever call methods. This fits well with testing in terms of commands and events, since applying events and handling commands are part of an aggregate's public API.

What does "Tell, don't ask" mean?

Decisions should be made inside of encapsulation boundaries, where the data is. The object or aggregate is the "expert", and things on the outside shouldn't ask for its state and then make decisions for it.

"Tell, don't ask" is considered a good principle of object-oriented design.

The testing encouraged by a CQRS application is an excellent example of "Tell, don't ask". The only thing we can do to test the behavior of our aggregates is to set them up (using events), tell them to do something (using a command), and then observe the results (more events, or an exception).

How do I know a command failed for the right reason?

Use typed exceptions to indicate the mode of failure, and expect that type of exception in the test.

So I know I get the correct event, but how do I know it meant something?

Testing that a given command leads to an expected event is only half the job. To make sure the event's application actually means something, write a test with that event in the history. For example, to test that an event indicating an appointment was made actually took effect, put it in the history and try to make a conflicting appointment.
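Both halves of the job can be sketched with the appointment example. The `AppointmentBook` aggregate and its message types are hypothetical, and the tests are plain methods returning a bool so that the sketch does not depend on any particular test framework.

```csharp
using System;
using System.Collections.Generic;

public sealed class MakeAppointment
{
    public DateTime Slot { get; }
    public MakeAppointment(DateTime slot) { Slot = slot; }
}

public sealed class AppointmentMade
{
    public DateTime Slot { get; }
    public AppointmentMade(DateTime slot) { Slot = slot; }
}

public sealed class SlotAlreadyTaken : Exception { }

public class AppointmentBook
{
    private readonly HashSet<DateTime> _taken = new HashSet<DateTime>();

    // Applying an event only updates state; it never fails.
    public void Apply(AppointmentMade e) { _taken.Add(e.Slot); }

    // Handling a command validates it and returns the resulting events.
    public IReadOnlyList<AppointmentMade> Handle(MakeAppointment c)
    {
        if (_taken.Contains(c.Slot)) throw new SlotAlreadyTaken();
        return new[] { new AppointmentMade(c.Slot) };
    }
}

public static class AppointmentBookTests
{
    // Given no history, when an appointment is made, then the event is produced.
    public static bool CommandProducesEvent()
    {
        var slot = new DateTime(2014, 5, 1, 10, 0, 0);
        var events = new AppointmentBook().Handle(new MakeAppointment(slot));
        return events.Count == 1 && events[0].Slot == slot;
    }

    // Given the event in the history, a conflicting command must be refused.
    public static bool EventInHistoryBlocksConflict()
    {
        var slot = new DateTime(2014, 5, 1, 10, 0, 0);
        var book = new AppointmentBook();
        book.Apply(new AppointmentMade(slot));
        try { book.Handle(new MakeAppointment(slot)); }
        catch (SlotAlreadyTaken) { return true; }   // the expected typed exception
        return false;
    }
}
```

Neither test ever inspects the aggregate's internal state; they only feed in events and commands and look at what comes out.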

Aggregates

What is an aggregate?

A larger unit of encapsulation than just a class. Every transaction is scoped to a single aggregate. The lifetimes of the components of an aggregate are bounded by the lifetime of the entire aggregate.

Concretely, an aggregate will handle commands, apply events, and have a state model encapsulated within it that allows it to implement the required command validation, thus upholding the invariants (business rules) of the aggregate.

What is the difference between an aggregate and an aggregate root?

The aggregate forms a tree or graph of object relations. The aggregate root is the "top" one, which speaks for the whole and may delegate down to the rest. It is important because it is the one that the rest of the world communicates with.

I know aggregates are transaction boundaries, but I really need to transactionally update two aggregates in the same transaction. What should I do?

You should re-think the following:

  • Your aggregate boundaries.
  • The responsibilities of each aggregate.
  • What you can get away with doing in a read side or in a saga.
  • The actual non-functional requirements of your domain.

If you write a solution where two or more aggregates are transactionally coupled, you have not understood aggregates.

Why is the use of GUID as IDs a good practice?

Because they are (reasonably) globally unique, and can be generated either by the server or by the client.

How can I get the ID for newly created aggregates?

It's an important insight that the client can generate its own IDs.

If the client generates a GUID and places it in the create-the-aggregate command, this is a non-issue. Otherwise, you have to poll from the appropriate read side, where the ID will appear in an eventually consistent time frame. Clearly this is much more fragile than just generating it in the first place.
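In C#, this takes one call to `Guid.NewGuid()` on the client. The `CreateCustomer` command and the `Client` helper below are hypothetical names used only for this sketch.

```csharp
using System;

public sealed class CreateCustomer
{
    public Guid CustomerId { get; }
    public string Name { get; }
    public CreateCustomer(Guid customerId, string name)
    {
        CustomerId = customerId;
        Name = name;
    }
}

public static class Client
{
    // The client picks the ID, so it knows it before the command is even sent.
    public static CreateCustomer NewCustomerCommand(string name)
    {
        return new CreateCustomer(Guid.NewGuid(), name);
    }
}
```

The client can keep `CustomerId` from the command it built and use it straight away, for example to navigate to the new customer's page.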

Should I allow references between aggregates?

In the sense of an actual "memory reference", absolutely not.

On the write side, an actual memory reference from one aggregate to another is forbidden and wrong, since aggregates by definition are not allowed to reach outside of themselves. (Allowing this would mean an aggregate is no longer a transaction boundary, meaning we can no longer sanely reason about its ability to uphold its invariants; it would also preclude sharding of aggregates.)

Referring to another aggregate using a string identifier is fine. It is of no use on the write side, where the identifier must be treated as an opaque value because aggregates cannot reach outside of themselves. Read sides may freely use such information, however, to do interesting correlations.

How can I validate a command across a group of aggregates?

This is a common reaction to not being able to query across aggregates anymore. There are several answers:

  • Do client-side validation.
  • Use a read side.
  • Use a saga.
  • If those are all completely impractical, then it's time to consider if you got your aggregate boundaries correct.

How can I guarantee referential integrity across aggregates?

You're still thinking in terms of foreign-key relations, not aggregates. See the previous question. Also, remember that just because something would be two tables in a relational design does not in any way suggest it should be two aggregates. Designing in aggregates is different.

How can I make sure a newly created user has a unique user name?

This is a commonly occurring question since we're explicitly not performing cross-aggregate operations on the write side. We do, however, have a number of options:

  • Create a read-side of already allocated user names. Make the client query the read-side interactively as the user types in a name.
  • Create a reactive saga to flag down and inactivate accounts that were nevertheless created with a duplicate user name. (Whether by extreme coincidence or maliciously or because of a faulty client.)
  • If eventual consistency is not fast enough for you, consider adding a table on the write side, a small local read-side as it were, of already allocated names. Make the aggregate transaction include inserting into that table.

How can I verify that a customer ID really exists when I place an order?

Assuming customer and order are aggregates here, it's clear that the order aggregate cannot really validate this, since that would mean reaching out of the aggregate.

Checking up on it after the fact, in a saga or just in a read side that records "broken" orders, is one option. After all, the most important thing about an order is actually recording it, and presumably any interesting data about the recipient of the order is being copied into the order aggregate (referring to the customer to find the address is bad design; the order was always made to be delivered to a particular address, whether or not that customer changes their address in the future).

Being able to use what data was recorded in this broken order means you've a chance to rescue it and rectify the situation - which makes a good bit more business sense than dropping the order on the floor because a foreign key constraint was violated!

How can I update a set of aggregates with a single command?

A single command can't act on a set of aggregates. It just can't.

First off, ask yourself whether you really need to update several aggregates using just one command. What in the situation makes this a requirement?

However, here's what you could do. Allow a new kind of "bulk command", conceptually containing the command you want to issue, and a set of aggregates (specified either explicitly or implicitly) that you want to issue it on. The write side isn't powerful enough to make the bulk action, but it's able to create a corresponding "bulk event". A saga captures the event, and issues the command on each of the specified aggregates. The saga can do rollback or send an email, as appropriate, if some of the commands fail.

There are some advantages to this approach: we store the intent of the bulk action in the event store. The saga automates rollback or equivalent.

Still, having to resort to this solution is a strong indication that your aggregate boundaries are not drawn correctly. You might want to consider changing your aggregate boundaries rather than building a saga for this.

What is sharding?

A way to distribute large numbers of aggregates across several write-side nodes. We can shard aggregates easily because they are completely self-reliant: they don't have any external references.

Can an aggregate send an event to another aggregate?

No.

The factoring of your aggregates and command handlers will typically already make this idea impossible to express in code. But there's a deeper philosophical reason: go back and re-read the first sentence in the answer to "What is an aggregate?". If you manage to circumvent command handlers and just push events into another aggregate somehow, you will have taken away that aggregate's chance to participate in validation of changes. That's ultimately why we only allow events to be created as a result of commands validated by a command handler on an aggregate.

Can I call a read side from my aggregate?

No.

How do I send e-mail in a CQRS system?

In an event handler outside of the aggregate. Do not do it in the command handler: if the events fail to be persisted because the command lost a race with another command, the email will have been sent on a false premise.

Command handlers

What does a command handler do?

A command handler receives a command and brokers a result from the appropriate aggregate. "A result" is either a successful application of the command, or an exception.

This is the common sequence of steps a command handler follows:

  1. Validate the command on its own merits.
  2. Validate the command on the current state of the aggregate.
  3. If validation is successful, create 0..n events (1 is common).
  4. Attempt to persist the new events. If there's a concurrency conflict during this step, either give up, or retry things.
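The four steps can be sketched as follows. This is a single-threaded illustration only: the dictionary is a hypothetical in-memory stand-in for an event store, the exception type is made up, and the sketch does not check that the order was ever placed.

```csharp
using System;
using System.Collections.Generic;

public sealed class ConfirmOrder
{
    public Guid OrderId { get; }
    public ConfirmOrder(Guid orderId) { OrderId = orderId; }
}

public sealed class OrderConfirmed
{
    public Guid OrderId { get; }
    public OrderConfirmed(Guid orderId) { OrderId = orderId; }
}

public sealed class OrderAlreadyConfirmed : Exception { }

public class ConfirmOrderHandler
{
    // Hypothetical in-memory stand-in for an event store, keyed by aggregate ID.
    private readonly Dictionary<Guid, List<object>> _streams =
        new Dictionary<Guid, List<object>>();

    public IReadOnlyList<object> StreamFor(Guid id) =>
        _streams.TryGetValue(id, out var s) ? s : new List<object>();

    public void Handle(ConfirmOrder command)
    {
        // 1. Validate the command on its own merits.
        if (command.OrderId == Guid.Empty)
            throw new ArgumentException("OrderId is required.");

        // 2. Validate against the aggregate's current state, rebuilt from its events.
        bool confirmed = false;
        foreach (var e in StreamFor(command.OrderId))
            if (e is OrderConfirmed) confirmed = true;
        if (confirmed) throw new OrderAlreadyConfirmed();

        // 3. Validation passed: create the resulting event(s).
        var newEvents = new List<object> { new OrderConfirmed(command.OrderId) };

        // 4. Persist them (no concurrency check in this single-threaded sketch).
        if (!_streams.ContainsKey(command.OrderId))
            _streams[command.OrderId] = new List<object>();
        _streams[command.OrderId].AddRange(newEvents);
    }
}
```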

Should a command handler affect one or several aggregates?

Only one.

Do I put logic in command handlers?

Yes. Exactly what logic depends on your factoring.

The logic for validating the command on its own merits always goes in the command handler. If the command handler is just a method on the aggregate, then the next step is simply to use the state of the aggregate to do further validation. In a more functional factoring, where the aggregate exists independently of the command handlers, the next step would be to load the aggregate and do validation against it.

Provided validation is successful, the command handler should then produce events. Depending on the factoring, it may also take a further step to try and persist them.

In the Edument CQRS starter kit, command handlers are methods that return events. The loading of events, building up of the aggregate, and persisting of events is completely factored out of command handlers. This keeps them very clean and focused, and thus completely decoupled from persistence mechanisms.

However you have it, the logic boils down to validation and some sequence of steps that lead to the command becoming an exception or event(s). If you're tempted to go beyond this, see the rest of the questions in this section.

Can I call a read side from my command handler?

No.

Can I do logging, security, or auditing in my command handlers?

Yes. The decorator pattern comes in handy here to separate those concerns neatly.
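A minimal sketch of a logging decorator. The `ICommandHandler<TCommand>` interface and the `PingHandler` example are assumptions of this sketch, not an API from any specific framework.

```csharp
using System;

public interface ICommandHandler<TCommand>
{
    void Handle(TCommand command);
}

// Decorator: adds logging around any handler without touching its logic.
public class LoggingHandler<TCommand> : ICommandHandler<TCommand>
{
    private readonly ICommandHandler<TCommand> _inner;
    private readonly Action<string> _log;

    public LoggingHandler(ICommandHandler<TCommand> inner, Action<string> log)
    {
        _inner = inner;
        _log = log;
    }

    public void Handle(TCommand command)
    {
        _log("Handling " + typeof(TCommand).Name);
        try
        {
            _inner.Handle(command);
            _log("Handled " + typeof(TCommand).Name);
        }
        catch (Exception ex)
        {
            _log("Refused " + typeof(TCommand).Name + ": " + ex.GetType().Name);
            throw;   // the caller still sees the original exception
        }
    }
}

// Trivial handler used only to show the wiring.
public class PingHandler : ICommandHandler<string>
{
    public int Handled;
    public void Handle(string command)
    {
        if (command == null) throw new ArgumentNullException(nameof(command));
        Handled++;
    }
}
```

Wiring it up is a matter of wrapping, for example `new LoggingHandler<string>(new PingHandler(), Console.WriteLine)`. Security and auditing decorators can be stacked in the same way.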

How are conflicts between concurrent commands handled in the command handler?

The place where the new events for the aggregate are persisted is the only place in the system where we need to worry about concurrency conflicts. The event store knows the sequence number of the latest event applied on that aggregate, and the command handler knows the sequence number of the last event it read. If these numbers do not agree, it means some other thread or process got there first. The command handler can then load up the events again and make a new attempt.
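The sequence-number check and the retry can be sketched like this. `EventStream` is a hypothetical in-memory stream for a single aggregate, and the retry limit is an assumption of the sketch; a real event store would do the comparison inside its own transaction.

```csharp
using System;
using System.Collections.Generic;

public sealed class ConcurrencyConflict : Exception { }

// Hypothetical in-memory event store for a single aggregate's stream.
public class EventStream
{
    private readonly List<object> _events = new List<object>();
    private readonly object _lock = new object();

    public int Version { get { lock (_lock) return _events.Count; } }

    // Append succeeds only if the caller has seen every event stored so far.
    public void Append(int expectedVersion, object newEvent)
    {
        lock (_lock)
        {
            if (_events.Count != expectedVersion) throw new ConcurrencyConflict();
            _events.Add(newEvent);
        }
    }
}

public static class RetryingHandler
{
    // Re-read and retry on conflict, giving up after maxAttempts.
    public static bool TryHandle(EventStream stream, Func<int, object> decide, int maxAttempts)
    {
        for (int attempt = 0; attempt < maxAttempts; attempt++)
        {
            int seen = stream.Version;        // "load" the current history
            object newEvent = decide(seen);   // run the command logic again
            try
            {
                stream.Append(seen, newEvent);
                return true;
            }
            catch (ConcurrencyConflict)
            {
                // Someone else got there first; loop round and try again.
            }
        }
        return false;
    }
}
```

Because the command logic may run more than once, it should stay free of side effects other than producing events.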

Should I do things that have side-effects in the outside world (such as sending email) in a command handler?

No, since a concurrency conflict will mean the command handler logic will be run again. Do such things in an event handler.

Read sides

What is a read side?

A read side listens to events published from the write side, projects those events down as changes to a local model, and allows queries to be made on that model.

What practical problems do read sides solve?

They move the cost of correlating model data (called JOIN in SQL lingo) from being per-read to being per-write. A query on a read side is just a straight SELECT, because data is already in the shape the client wants.

This is a net win, because the ratio of reads to writes in a system is usually 10 or more. The idea is quite similar to "views" in SQL databases.

What if my domain has more writes than reads?

Are you sure? Make sure you measure before replying in the affirmative.

Some domains (telecommunications, for example) are very write-intense during short periods, and require much from the write side. But then the read side usually catches up and reads take over.

Some domains (real-time stock markets, for example) are completely dominated by incoming data, and the write side has to be optimized to apply commands in real time.

What is a projection?

A set of event handlers that work together to build and maintain a read model.
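A small sketch of a read side and its projection. The events and the "open orders" model are hypothetical, and a dictionary stands in for whatever storage the read side would really use.

```csharp
using System;
using System.Collections.Generic;

public sealed class OrderPlaced
{
    public Guid OrderId { get; }
    public string CustomerName { get; }
    public OrderPlaced(Guid orderId, string customerName)
    {
        OrderId = orderId;
        CustomerName = customerName;
    }
}

public sealed class OrderShipped
{
    public Guid OrderId { get; }
    public OrderShipped(Guid orderId) { OrderId = orderId; }
}

// Read side: a denormalized list of open orders, already shaped for the client.
public class OpenOrdersReadSide
{
    private readonly Dictionary<Guid, string> _open = new Dictionary<Guid, string>();

    // Projection: event handlers that keep the read model up to date.
    public void Handle(OrderPlaced e) { _open[e.OrderId] = e.CustomerName; }
    public void Handle(OrderShipped e) { _open.Remove(e.OrderId); }

    // Query: a plain lookup with no joins and no business logic.
    public IReadOnlyCollection<string> CustomersWithOpenOrders()
    {
        return new List<string>(_open.Values);
    }
}
```

Since the model is built purely from events, a fresh instance fed the same events in the same order ends up with the same contents.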

What if I build a read side and the projections turn out to be wrong somehow?

If you can't easily correct it in-flight, then build a new version of the read side with fixed projections, deploy it, have it re-process all the events from the event store so it's up with the latest data, and switch queries over to using it.

Sagas

What is a saga?

An independent component that reacts to domain events in a cross-aggregate, eventually consistent manner. Time can also be a trigger. Sagas are sometimes purely reactive, and sometimes represent workflows.

From an implementation perspective, a saga is a state machine that is driven forward by incoming events (which may come from many aggregates). Some states will have side effects, such as sending commands, talking to external web services, or sending emails.
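As an illustration, here is a tiny saga written as an explicit state machine. The order-fulfilment workflow, the message types, and the `sendCommand` delegate are assumptions of this sketch, which also assumes one saga instance per order.

```csharp
using System;

public sealed class OrderPlaced
{
    public Guid OrderId { get; }
    public OrderPlaced(Guid orderId) { OrderId = orderId; }
}

public sealed class PaymentReceived
{
    public Guid OrderId { get; }
    public PaymentReceived(Guid orderId) { OrderId = orderId; }
}

public sealed class ShipOrder
{
    public Guid OrderId { get; }
    public ShipOrder(Guid orderId) { OrderId = orderId; }
}

public class OrderFulfilmentSaga
{
    private enum State { AwaitingOrder, AwaitingPayment, Done }

    private State _state = State.AwaitingOrder;
    private readonly Action<object> _sendCommand;

    public OrderFulfilmentSaga(Action<object> sendCommand)
    {
        _sendCommand = sendCommand;
    }

    // Events drive the state machine forward.
    public void Handle(OrderPlaced e)
    {
        if (_state == State.AwaitingOrder) _state = State.AwaitingPayment;
    }

    public void Handle(PaymentReceived e)
    {
        if (_state != State.AwaitingPayment) return;   // ignore out-of-order or repeated events
        _sendCommand(new ShipOrder(e.OrderId));        // side effect: a command to the write side
        _state = State.Done;
    }
}
```

The saga only consumes events and emits commands, so it never touches the internals of any aggregate.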

Isn't a saga just leaked domain logic?

No.

Sagas are doing things that no individual aggregate can sensibly do. Thus, it's not a logic leak since the logic didn't belong in an aggregate anyway. Furthermore, we're not breaking encapsulation in any way, since sagas operate with commands and events, which are part of the public API.

How can I make my saga react to events that did not happen?

The saga, besides reacting to domain events, can be "woken up" by recurrent internal alarms. Implementing such alarms is easy. See cron in Unix, for example.

How does the saga interact with the write side?

By sending commands to it.

Occasionally connected systems

What about offline clients?

Clients can be made to work offline, allowing you to issue commands locally, which are synchronized with the write side when reconnecting.

A client has a tendency to pull in features of the write side (for doing local validation) and of the read sides (for updating faster than eventual consistency allows). In some sense, since the client is the user's window to the system, it always has a tendency to grow until it looks like a small copy of the whole system including write side and read sides.

What is command merging?

Sometimes in a highly collaborative domain, commands arrive "too late" and the current state of an aggregate has already changed so that the command does not apply cleanly. Command merging is the act of extracting the underlying intent from the command, and then creating and applying a new command from that intent.

How can command merging be done in practice in an occasionally connected client?

The git merging model seems an appropriate one to steal.

About this FAQ

Authors

Tore Nestenius started Programmers Heaven - a portal with over 750 000 monthly users. He's behind several other successful projects like CodePedia - a wiki for developers, the Open Source project TNValidate, and the C# School e-book with over 100 000 downloads. These projects, among others, have helped Tore build a unique network with providers of development tools and components.

Whether it's architecting, coding, teaching or mentoring, Jonathan Worthington is at home. He has an eye for elegance and simplicity. He loves to take something that seems difficult, get to the heart of the complexity and make an easy to understand solution.

Carl Mäsak is an open source software developer with a decidedly agile bent. Finds CQRS and DDD fascinating. Considers himself on a constant journey to be a better programmer, improving tools, techniques, and his own knowledge.

Licence

This FAQ is licensed under a Creative Commons Attribution licence. This means that you can freely use the material, as long as you provide a link back to this document.

Feedback

We are happy to receive comments and further questions about the contents of this FAQ. Feel free to write to us at info@cqrs.nu

Copyright © 2014 Edument AB