Introduce Data Transfer Object

 Note: This is a preview release and subject to change. Feedback welcome! Contact Information and Background (PDF)

also known as: Wrap Representation Structure

Context and Motivation

An API offers one or more endpoints that return data elements from their operations. For example, the API of a customer relationship management service might contain an endpoint that returns detailed information about customers and the interactions with them. The structure of these responses might have been derived directly from the API implementation classes and data structures, possibly with deeply nested structures of elements. For instance, the API implementation might use an Object-Relational Mapper (ORM) to manage the data and return the ORM classes in the response messages.

As an API provider, I want to encapsulate my internal data structures so that I can freely change them without breaking backwards compatibility for my clients. In domain-driven design terms, I want to keep the integration-level “published language” separate from the application-level “ubiquitous language”.

Stakeholder Concerns (including Quality Attributes and Design Forces)

#modifiability
API providers want the freedom to change API implementation details without revealing these changes to clients.
#developer-experience
API clients want to navigate to the data they require with minimum effort. This should take as few coding steps (such as statements and expressions) as possible.

Initial Position Sketch

The following excerpt shows a prototypical implementation of a Java Spring Boot endpoint that fetches an entity from a database repository and directly returns it as the response:

@GetMapping(value = "/{id}")
public ResponseEntity<MyEntity> getMyEntity(
        @ApiParam(value = "the entity's unique id") 
            @PathVariable MyEntityId id) {

    MyEntity myEntity = myEntityRepository.getMyEntity(id);
    if (myEntity == null) {
        throw new ResponseStatusException(
            HttpStatus.NOT_FOUND, "Entity Not Found");
    }
    return ResponseEntity.ok(myEntity);
}

The MyEntity class is also used in the object-relational mapping (here: Java Persistence API, JPA):

@Entity
@Table(name = "my_entities")
public class MyEntity {
  ...
}

The refactoring is applicable to any API operation that uses an implementation type (for example, a domain class or a database entity) in its response message. Note that the implementation type can also be nested somewhere within the response message structure.

Smells / Drivers

Leaky encapsulation
Domain layer classes and classes defined in the persistence layer are directly exposed in the API. Such leaky or even completely missing encapsulation of API internal data structures makes an API harder to evolve because it introduces coupling and harms backwards compatibility.
Tight coupling of data contract
According to Hyrum’s Law, anything that is exposed will eventually be depended on by some client. Hence, leaky encapsulation causes undesired coupling, which in turn may slow down development and decrease #modifiability and #flexibility.

Instructions (Steps)

  1. Create a new data-centric class (or equivalent abstraction in non-object oriented languages) that mirrors the attributes of the current response message. Data Transfer Objects (DTOs) are typically implemented as immutable value objects with structural, value-based equality.
  2. Depending on the implementation framework, additional serialization logic or configuration might be needed. An example is the JSON-to-Java data binding offered by libraries such as Jackson.
  3. Write unit tests for the DTOs and the mapping logic. If the code is generated by the framework, such tests might be unnecessary (or already provided by the framework).
  4. Adjust the implementation of the operation to create an instance of the DTO and fill it with the necessary data.
  5. Return the DTO from the API operation, adjusting any method return types if necessary.
  6. Adapt and run the tests to make sure the structure of the message was not changed accidentally.
  7. Document the API change in the external documentation if it is visible there.
  8. Align sample data in API documentation or tutorials to the new structure.
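
Steps 1, 4, and 5 can be sketched as follows. This is a minimal illustration under stated assumptions rather than a reference implementation: the attributes of MyEntity (id and name) are hypothetical because the body of that class is not shown here, and a Java record (Java 16 or later) is assumed as one way to obtain an immutable value object with value-based equality.

```java
// Hypothetical stand-in for the persistence-layer class. The id and name
// attributes are assumptions made for this sketch only.
class MyEntity {
    private final long id;
    private final String name;

    MyEntity(long id, String name) {
        this.id = id;
        this.name = name;
    }

    long getId() { return id; }
    String getName() { return name; }
}

// Step 1: data-centric class that mirrors the attributes of the current
// response message. A record is immutable and receives generated
// equals, hashCode, and toString methods based on its components.
record MyEntityDto(long id, String name) {

    // Step 4: mapping from the implementation type to the DTO.
    static MyEntityDto toDto(MyEntity entity) {
        return new MyEntityDto(entity.getId(), entity.getName());
    }
}

class DtoMappingSketch {
    public static void main(String[] args) {
        MyEntity entity = new MyEntity(42L, "Example");
        // Step 5: the operation would return this DTO instead of the entity.
        MyEntityDto dto = MyEntityDto.toDto(entity);
        System.out.println(dto); // prints MyEntityDto[id=42, name=Example]
    }
}
```

Because the DTO mirrors the existing attributes one to one, the serialized message should stay the same; the tests in Step 6 are what actually confirm this.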

A variant of this refactoring is to wrap domain type(s) already implemented in the API in a new DTO instead of creating an entirely new one with copies of all attributes; the attribute values then have to be mapped to and from the new DTO. This is an acceptable solution if all that is needed is a way to return additional data. However, it resolves fewer of the underlying coupling/information hiding issues, and the wrapper also changes the response message structure.
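
The wrapper variant might look like the following sketch. The Customer domain type, the retrievedAt attribute, and the wrap method are hypothetical names chosen for illustration; they are not taken from any particular code base.

```java
import java.time.Instant;

// Hypothetical domain type that the API already exposes.
record Customer(long id, String name) {}

// Wrapper DTO: adds data next to the domain type instead of copying its
// attributes. The customer is now nested one level deeper in the response
// message (a structural change), and the Customer type itself is still
// visible to clients.
record CustomerResponseDto(Customer customer, Instant retrievedAt) {

    static CustomerResponseDto wrap(Customer customer, Instant retrievedAt) {
        return new CustomerResponseDto(customer, retrievedAt);
    }
}
```

The sketch makes the trade-off visible: returning extra data becomes easy, but any change to Customer still reaches clients through the wrapper.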

Target Solution Sketch (Evolution Outline)

In the initial position sketch, MyEntity was returned directly; now a MyEntityDto is returned instead:

@GetMapping(value = "/{id}")
public ResponseEntity<MyEntityDto> getMyEntity(
        @ApiParam(value = "the entity's unique id") 
            @PathVariable MyEntityId id) {

    MyEntity myEntity = myEntityRepository.getMyEntity(id);
    if (myEntity == null) {
        throw new ResponseStatusException(
            HttpStatus.NOT_FOUND, "Entity Not Found");
    }
    MyEntityDto myEntityDto = MyEntityDto.toDto(myEntity);
    return ResponseEntity.ok(myEntityDto);
}

Now that the internal implementation has been decoupled from the response entity, the DTO can also be used to transfer additional data such as Links or Metadata. If the internal implementation changes, the DTO can implement more complex mapping logic to maintain backwards compatibility.
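
To illustrate the backwards-compatibility point, assume (hypothetically) that the response message has always contained a single name attribute and that the internal entity was later changed to store the name in two parts. The field names below are assumptions made for this sketch; the mapping logic absorbs the internal change so that the published structure stays the same.

```java
// Hypothetical entity after an internal change: the name is now split.
class MyEntity {
    private final long id;
    private final String firstName;
    private final String lastName;

    MyEntity(long id, String firstName, String lastName) {
        this.id = id;
        this.firstName = firstName;
        this.lastName = lastName;
    }

    long getId() { return id; }
    String getFirstName() { return firstName; }
    String getLastName() { return lastName; }
}

// The published structure still exposes a single "name" attribute.
record MyEntityDto(long id, String name) {

    // The mapping hides the internal split from API clients.
    static MyEntityDto toDto(MyEntity entity) {
        String name = entity.getFirstName() + " " + entity.getLastName();
        return new MyEntityDto(entity.getId(), name);
    }
}
```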

Example(s)

The JHipster application generator has an option to generate the Spring Boot code with DTOs. Enabling this results in the following change in the service class:

-    public Optional<Customer> findOne(Long id) {
+    public Optional<CustomerDTO> findOne(Long id) {
         log.debug("Request to get Customer : {}", id);
-        return customerRepository.findById(id);
+        return customerRepository.findById(id).map(customerMapper::toDto);
     }

The CustomerDTO is a simple Java bean with attributes, getters, and setters. For the mapping from entity to DTO and vice versa, JHipster uses MapStruct, an annotation processor that frees the developer from writing trivial mapping code:

@Mapper(componentModel = "spring", uses = {})
public interface CustomerMapper extends EntityMapper<CustomerDTO, Customer> {}

Hints and Pitfalls to Avoid

Do not over-eagerly apply this refactoring to all API operations, but rather use it when its value is higher than its cost (general advice that applies in many cases). A good reason might be that the implementation is changing, but this change should not be reflected in the response message structure.

DTO classes and mappings are often straightforward to create, so various libraries and code generation tools exist to automate this task. In the Java ecosystem, for example, Lombok is another option in addition to the MapStruct library mentioned above. The recently introduced Java records also address this topic. When using code generation, make sure you understand what is going on behind the scenes, because surprises could be waiting for you (for example, see the third bullet item in How DTOs work in JHipster).

When receiving DTOs, be a Tolerant Reader by making “minimum assumptions about the structure” and only consuming the data you need. This has the advantage that your code will not be affected if unused parts of the DTO change.
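
A client-side Tolerant Reader could be sketched as follows. The example assumes that the response has already been parsed into a generic map (how that happens depends on the JSON library in use), and the CustomerName type and its from method are hypothetical names.

```java
import java.util.Map;
import java.util.Optional;

// Client-side view holding only the data this client actually needs.
record CustomerName(String name) {

    // Reads "name" from an already parsed response and ignores every other
    // key, so changes to unused parts of the DTO do not affect the client.
    static Optional<CustomerName> from(Map<String, Object> parsedResponse) {
        Object name = parsedResponse.get("name");
        if (name instanceof String s) {
            return Optional.of(new CustomerName(s));
        }
        return Optional.empty(); // required attribute missing or of unexpected type
    }
}
```

When a data binding library is used instead of manual extraction, a similar effect is usually achieved by configuring it to ignore unknown properties, for instance with Jackson's @JsonIgnoreProperties(ignoreUnknown = true) annotation.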

Another motivation for the refactoring can be that additional (meta-)data has to be returned, for example when applying Introduce Pagination.
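
As a rough sketch of such a DTO, the following page structure carries the items together with pagination metadata. The attribute names (offset, limit, totalCount) and the hasMore helper are assumptions for illustration; Introduce Pagination may prescribe a different structure.

```java
import java.util.List;

// Hypothetical item DTO.
record CustomerDto(long id, String name) {}

// Page DTO: the items plus the metadata a client needs to request more.
record CustomerPageDto(List<CustomerDto> items, int offset, int limit, long totalCount) {

    CustomerPageDto {
        items = List.copyOf(items); // defensive copy keeps the DTO immutable
    }

    // True if further items exist beyond this page.
    boolean hasMore() {
        return offset + items.size() < totalCount;
    }
}
```

Returning a bare list of entities would leave no place for this metadata, which is why the DTO is a natural companion to pagination.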

There is a potential risk of introducing memory leaks, particularly in API implementations where developers allocate and release memory manually, but also in garbage-collected environments. Marshalling and unmarshalling of request and response data is often handled by frameworks (for example, JSON to Java and Java to JSON by Jackson when using Spring), and caching might take place. Unit tests usually do not catch memory bugs; finding them requires dedicated reliability tests.

The DTOs and their mapping logic are usually placed in a Service Layer [Fowler 2002].

Domain-Driven Design (DDD) offers additional techniques and patterns for structuring your domain classes. See this brief introduction to Tactic DDD.

Step 4 of the Stepwise Service Design in [Zimmermann and Stocker 2021] advises to “foresee a Remote Facade that exposes Data Transfer Objects (DTOs) in the request and response messages of its API operations to decouple the (published) languages of frontends and backends and to optimize the message exchange over the network w.r.t exchange frequency and message size”. The Remote Facade that is mentioned in the quote helps to “translate coarse-grained methods onto the underlying fine-grained objects”. Here, this means that the DTO can be used to reformat the data in such a way that clients can easily interact with it.

Martin Fowler describes the code-level refactoring Introduce Parameter Object in “Refactoring” [Fowler 2018]. The Refactoring.Guru website also features this refactoring.

References

Fowler, Martin. 2002. Patterns of Enterprise Application Architecture. Boston, MA, USA: Addison-Wesley Longman Publishing Co., Inc.

———. 2018. Refactoring: Improving the Design of Existing Code. 2nd ed. Addison-Wesley Signature Series (Fowler). Boston, MA: Addison-Wesley.

Zimmermann, Olaf, and Mirko Stocker. 2021. Design Practice Reference - Guides and Templates to Craft Quality Software in Style. LeanPub. https://leanpub.com/dpr.