Introduce Data Transfer Object

Published: EuroPLoP 2023

also known as: Map and Wrap Representation Structure, Ubiquitous Language Wrapper

Context and Motivation

An API offers one or more operations that return Data Elements [Zimmermann et al. 2022]. For example, the API of a customer relationship management service might contain an endpoint that returns detailed information about customers and the interactions with them. The structure of these responses might have been derived from the API implementation classes and domain model data structures, with possibly deeply nested structures of elements [Singjai, Zdun, and Zimmermann 2021]. For example, an API implementation might use an Object-Relational Mapper (ORM) to manage the data and might return serializations of instances of the ORM classes in the response messages.

As an API provider, I want to encapsulate my internal data structures so that I can freely change them without breaking backward compatibility of my clients. In domain-driven design terms, I want to keep the integration-level published language separate from the application-level ubiquitous language (see footnote 1).

Stakeholder Concerns (including Quality Attributes and Design Forces)

#modifiability, #evolvability, and #information-hiding
API providers want the freedom to change the API implementation without revealing such changes to clients. Such information hiding [Parnas 1972] is crucial for the independent evolvability of API providers and clients.
#developer-experience
API clients want to navigate the required data with minimum effort, taking as few coding steps (expressed as statements and expressions) as possible.
#cohesion, #coupling
API providers strive for low coupling and high cohesion in their endpoint, operation, and message designs.

Initial Position Sketch

The refactoring is eligible for any API operation that uses an implementation type (for example, a domain class or a database-mapped entity) in a request or response message. Note that this implementation type may also appear as a subordinate element within a message structure hierarchy.

For instance, an API provider might return data from a repository directly to the client, as shown in Figure 1. The structure and content of the data are not changed; it is simply passed through.

Figure 1: Introduce Data Transfer Object: Initial Position Sketch. The API provider responds to a client request (1) with a message (2) that contains some data elements.

Design Smells

Leaky encapsulation
Domain layer language constructs (e.g., classes) or abstractions defined in the persistence layer are directly exposed in the API. Such permeable or even completely missing encapsulation of API internal data structures makes an API harder to evolve because it introduces coupling and harms backward compatibility.
Tight coupling of data contract
According to Hyrum’s Law, anything that is exposed will eventually be depended on by some client. Hence, leaky encapsulations cause undesired coupling, which may slow development and decrease modifiability and flexibility.
Confetti design
Clients might have to issue many requests to get all the needed data. Fine-grained APIs that rain many small Data Elements on clients are rather flexible but can be tedious to use.

Instructions (Steps)

To replace an implementation type in a response message with a DTO, perform the following steps:

  1. Create a new data-centric wrapper (e.g., a class in object-oriented languages) that mirrors the attributes of the current message representation. Such Data Transfer Objects (DTOs) [Fowler 2002] are typically implemented as immutable Value Objects [Evans 2003] with structural, value-based equality.
  2. Depending on the implementation framework, add additional serialization logic or mapping configuration information. An example is the JSON-to-Java data binding offered by libraries such as Jackson.
  3. Write unit tests for the DTOs and the mapping logic. If the framework generates the code, such tests might be unnecessary (or already provided by the framework).
  4. Adjust the implementation of the operation to create an instance of the DTO and fill it with the necessary data.
  5. Return the DTO from the operation implementation, adjusting any return types if necessary (so that the new serialization logic can pick them up).
  6. Run the tests to ensure the message structure was not changed accidentally. For example, an integration test might check whether all attributes expected in a JSON object are present.
  7. Include the API change in the API Description if it is visible there; for example, adjust the JSON-Schema part of the OpenAPI description of the API.
  8. Align the sample data in supplemental API documentation artifacts such as tutorials so that the new structure is featured.

These instructions assume that the DTO is introduced in a response message. If a request message is the target, steps 4 and 5 must be adapted. Instead of creating and returning a DTO, adjust the implementation of the operation to take the DTO as a parameter and convert it back to the API-internal data representation. This refactoring is fully backward-compatible because it only changes the implementation, not the structure of the message.
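For the request direction, the adapted steps 4 and 5 might look like the following minimal sketch. All names (`Customer`, `CustomerRequestDto`, `createCustomer`) are hypothetical, the list stands in for a repository, and the framework binding that would deserialize the request body into the DTO is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

final class RequestDtoSketch {

    // API-internal type (hypothetical); free to change without affecting clients.
    static final class Customer {
        private final String firstName;
        private final String lastName;

        Customer(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }

        String firstName() { return firstName; }
        String lastName() { return lastName; }
    }

    // Request DTO (hypothetical): mirrors the message structure, not the domain model.
    record CustomerRequestDto(String firstName, String lastName) {
        CustomerRequestDto {
            Objects.requireNonNull(firstName, "firstName");
            Objects.requireNonNull(lastName, "lastName");
        }

        // Converts the DTO back to the API-internal data representation.
        Customer toEntity() {
            return new Customer(firstName, lastName);
        }
    }

    // Stand-in for a repository (assumption made for this sketch).
    private final List<Customer> store = new ArrayList<>();

    // The operation takes the DTO as its parameter (adapted steps 4 and 5).
    Customer createCustomer(CustomerRequestDto request) {
        Customer customer = request.toEntity();
        store.add(customer);
        return customer;
    }

    public static void main(String[] args) {
        RequestDtoSketch api = new RequestDtoSketch();
        Customer created = api.createCustomer(new CustomerRequestDto("Ada", "Lovelace"));
        System.out.println(created.firstName() + " " + created.lastName());
    }
}
```

Because only the DTO appears in the operation signature, the `Customer` class can be restructured later as long as `toEntity` is adjusted accordingly.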

Target Solution Sketch (Evolution Outline)

When the refactoring has been applied, a mapping step takes place that copies the data to/from the DTO structure. The mapping can preserve the structure or adjust it, depending on the information needs of the message recipient. Figure 2 shows a refactored response message.

Figure 2: Introduce Data Transfer Object: Target Solution Sketch. Instead of passing through the data for the client’s request (1), an additional DTO mapper transforms the data elements before they are returned (2). The implementation types can be changed without affecting the API.

Now that the internal implementation has been decoupled from the response entity, the DTO can transfer additional data, such as Embedded Entities, Link Elements, or Metadata Elements. Such richer messages help against confetti design.
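As an illustration, a DTO that carries an embedded entity and a link element could be sketched as follows. The types, attribute names, and the URI layout are assumptions made for this example and are not prescribed by the refactoring.

```java
import java.util.List;

final class RicherDtoSketch {

    // API-internal types (hypothetical).
    record Address(String city) {}
    record Customer(long id, String name, Address address) {}

    // Link Element: tells the client where related data can be found.
    record LinkDto(String rel, String href) {}

    // Embedded Entity: the address travels inside the customer representation.
    record AddressDto(String city) {}

    record CustomerDto(long id, String name, AddressDto address, List<LinkDto> links) {
        static CustomerDto toDto(Customer customer) {
            return new CustomerDto(
                customer.id(),
                customer.name(),
                new AddressDto(customer.address().city()),
                // The URI layout is an assumption made for this sketch.
                List.of(new LinkDto("interactions",
                    "/customers/" + customer.id() + "/interactions")));
        }
    }

    public static void main(String[] args) {
        Customer customer = new Customer(42L, "Ada Lovelace", new Address("London"));
        System.out.println(CustomerDto.toDto(customer));
    }
}
```

With the address embedded, a client does not need a second request to retrieve it, and the link points to data that is deliberately left out of the response.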

If the internal implementation type evolves, the DTO can implement more complex mapping logic to maintain backward compatibility. Additional metadata, such as a Version Identifier, can also be added to the DTO.
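The following sketch shows one way such a compatibility-preserving mapping might look. It assumes a hypothetical `Customer` whose single `name` field was split into `firstName` and `lastName`; the version string and its format are also assumptions.

```java
final class EvolvingMappingSketch {

    // Evolved internal type (hypothetical): the former single "name" field
    // has been split into two fields.
    record Customer(String firstName, String lastName) {}

    // The DTO keeps the published "name" attribute and adds a version identifier.
    record CustomerDto(String name, String version) {
        static final String API_VERSION = "1.1"; // assumed versioning scheme

        static CustomerDto toDto(Customer customer) {
            // The mapping absorbs the internal change so that clients still see "name".
            String name = customer.firstName() + " " + customer.lastName();
            return new CustomerDto(name, API_VERSION);
        }
    }

    public static void main(String[] args) {
        System.out.println(CustomerDto.toDto(new Customer("Ada", "Lovelace")));
    }
}
```

Clients that read `name` keep working unchanged, while the additional `version` attribute is a purely additive change to the message.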

Example(s)

The following excerpt from a Java Spring Boot controller shows an implementation of an operation getMyEntity that fetches an entity from a database repository and directly returns it in its response:

@GetMapping(value = "/{id}")
public ResponseEntity<MyEntity> getMyEntity(
        @ApiParam(value = "the entity's unique id") 
            @PathVariable MyEntityId id) {

    MyEntity myEntity = myEntityRepository.getMyEntity(id);
    if (myEntity == null) {
        throw new ResponseStatusException(
            HttpStatus.NOT_FOUND, "Entity Not Found");
    }
    return ResponseEntity.ok(myEntity);
}

The Spring GetMapping annotation turns the getMyEntity method into an API operation (HTTP GET) that receives an id parameter and returns a ResponseEntity. The ResponseEntity class has helper methods – such as ok – to generate HTTP messages (200 OK in this case).

The MyEntity class is also used in the object-relational mapping in this example. More specifically, the Java Persistence API (JPA) is used:

@Entity
@Table(name = "my_entities")
public class MyEntity {
  String attribute;
  ...
}

A sample response could look like this:

HTTP/1.1 200
Content-Type: application/json;charset=UTF-8
{
  "attribute" : "1c184cf1-a51a-433f-979b-24e8f085a189"
}

When the refactoring has been applied, a MyEntityDto is returned (instead of returning MyEntity directly):

@GetMapping(value = "/{id}")
public ResponseEntity<MyEntityDto> getMyEntity(
        @ApiParam(value = "the entity's unique id") 
            @PathVariable MyEntityId id) {

    MyEntity myEntity = myEntityRepository.getMyEntity(id);
    if (myEntity == null) {
        throw new ResponseStatusException(
            HttpStatus.NOT_FOUND, "Entity Not Found");
    }
    MyEntityDto myEntityDto = MyEntityDto.toDto(myEntity);
    return ResponseEntity.ok(myEntityDto);
}

The MyEntityDto DTO is implemented as follows:

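public class MyEntityDto {
  // Attributes that mirror those in MyEntity 
  String attribute;
  ...
  
  static MyEntityDto toDto(MyEntity myEntity) {
    // Copy attributes from myEntity to a new DTO instance
    // (direct field access assumes that both classes share a package)
    MyEntityDto dto = new MyEntityDto();
    dto.attribute = myEntity.attribute;
    ...
    return dto;
  }
}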

Because the DTO mirrors the attributes of MyEntity, the resulting HTTP response remains unchanged.
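A lightweight way to guard this property could be a check that compares the attribute names of both classes, as in the sketch below. The classes are simplified stand-ins for the ones above, and `otherAttribute` is made up for the example. Field names only approximate the serialized JSON, so such a check would complement an integration test on the actual response rather than replace it.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Set;
import java.util.TreeSet;

final class MirrorCheckSketch {

    // Simplified stand-ins for the classes above (otherAttribute is hypothetical).
    static class MyEntity {
        String attribute;
        String otherAttribute;
    }

    static class MyEntityDto {
        String attribute;
        String otherAttribute;
    }

    // Collects the names of all non-static, non-synthetic fields declared by a class.
    static Set<String> attributeNames(Class<?> type) {
        Set<String> names = new TreeSet<>();
        for (Field field : type.getDeclaredFields()) {
            if (!Modifier.isStatic(field.getModifiers()) && !field.isSynthetic()) {
                names.add(field.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Set<String> entity = attributeNames(MyEntity.class);
        Set<String> dto = attributeNames(MyEntityDto.class);
        if (!entity.equals(dto)) {
            throw new AssertionError("DTO does not mirror entity: " + entity + " vs " + dto);
        }
        System.out.println("DTO mirrors entity attributes: " + dto);
    }
}
```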

Another example comes from the rapid prototyping framework JHipster. The application generator provides the option to generate the Spring Boot code with DTOs. Enabling this option changes the signature of the service class (the + and - stand for added and removed lines, respectively):

-    public Optional<Customer> findOne(Long id) {
+    public Optional<CustomerDTO> findOne(Long id) {
         log.debug("Request to get Customer : {}", id);
         return customerRepository
-            .findById(id);
+            .findById(id).map(customerMapper::toDto);
     }

The CustomerDTO that replaces Customer as the response type in this example is a simple Java bean with attributes, getters, and setters. For the mapping from entity to DTO and vice-versa, JHipster uses MapStruct, an annotation processor that frees the developer from writing trivial mapping code:

import org.mapstruct.Mapper;
@Mapper(componentModel = "spring", uses = {})
public interface CustomerMapper 
    extends EntityMapper<CustomerDTO, Customer> {}

Note that the refactoring can also be applied to request messages.

Hints and Pitfalls to Avoid

Do not over-eagerly apply this refactoring to all API operations; use it only when its value is higher than its cost (note: this is general advice that makes sense in most cases). A good reason might be that the implementation data structures change often and these changes should not be reflected in the structure of the API-level response messages.

DTO classes and mappings are often straightforward to create, so various libraries and code-generation tools exist to automate this task. For example, in the Java ecosystem, MapStruct generates the mapping code, and Lombok can generate the getters, setters, constructors, and equality methods of the DTO classes. Java Records, available since Java 16, also address this topic.
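To illustrate the last point, a record gives a DTO immutability and value-based equality without hand-written boilerplate. The `CustomerDto` below is a hypothetical example.

```java
final class RecordDtoSketch {

    // The compiler generates the constructor, accessors, equals, hashCode, and toString.
    record CustomerDto(long id, String name) {}

    public static void main(String[] args) {
        CustomerDto a = new CustomerDto(1L, "Ada");
        CustomerDto b = new CustomerDto(1L, "Ada");
        System.out.println(a.equals(b)); // prints true: equality is based on the values
        System.out.println(a == b);      // prints false: two distinct instances
        System.out.println(a.name());    // accessor generated by the compiler
    }
}
```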

When using code generation, make sure you understand what is going on behind the scenes, because surprises could be waiting for you (for example, see the third bullet item in How DTOs work in JHipster).

When receiving data, be a Tolerant Reader by making “minimum assumptions about the structure” and only consuming the data you need. This approach has the advantage that recipient code will not be affected if unused parts of the DTO change.
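A Tolerant Reader could be sketched as follows, assuming the message has already been parsed into a generic map. The attribute names (`name`, `loyaltyTier`, `version`) are hypothetical.

```java
import java.util.Map;
import java.util.Optional;

final class TolerantReaderSketch {

    // Reads only the attribute the recipient needs; unknown attributes in the
    // payload are ignored, and a missing or mistyped one yields an empty result.
    static Optional<String> readCustomerName(Map<String, Object> payload) {
        Object name = payload.get("name");
        return name instanceof String s ? Optional.of(s) : Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, Object> older = Map.of("name", "Ada");
        // A later message version adds attributes that this reader does not know.
        Map<String, Object> newer = Map.of("name", "Ada", "loyaltyTier", "gold", "version", "1.1");

        System.out.println(readCustomerName(older));
        System.out.println(readCustomerName(newer));
    }
}
```

Both calls return the same result, so the recipient code is unaffected by the attributes added in the newer message.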

Another motivation for the refactoring can be that additional (meta-)data has to be returned, for example, when applying the Introduce Pagination refactoring.
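For instance, a DTO that wraps a page of results together with pagination metadata might look like the sketch below. The offset/limit scheme and all names are assumptions; Introduce Pagination discusses the actual design options.

```java
import java.util.List;

final class PaginationDtoSketch {

    record CustomerDto(long id, String name) {}

    // Wraps the page content together with pagination metadata.
    record PageDto<T>(List<T> items, int offset, int limit, int totalCount) {
        boolean hasMore() {
            return offset + items.size() < totalCount;
        }
    }

    // Cuts one page out of the full result list; out-of-range values are clamped.
    static PageDto<CustomerDto> page(List<CustomerDto> all, int offset, int limit) {
        int from = Math.min(Math.max(offset, 0), all.size());
        int to = (int) Math.min((long) from + Math.max(limit, 0), all.size());
        return new PageDto<>(List.copyOf(all.subList(from, to)), from, limit, all.size());
    }

    public static void main(String[] args) {
        List<CustomerDto> all = List.of(
            new CustomerDto(1, "Ada"), new CustomerDto(2, "Grace"), new CustomerDto(3, "Edsger"));
        PageDto<CustomerDto> first = page(all, 0, 2);
        System.out.println(first.items().size() + " items, more available: " + first.hasMore());
    }
}
```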

There is a potential risk of introducing memory leaks in API implementations when developers allocate and release memory manually. Marshaling and unmarshaling of request and response data is often handled by frameworks (for example, JSON to Java and Java to JSON in Jackson when using Spring); caching might occur. Unit tests usually will not catch memory bugs; this requires dedicated reliability tests.

DTOs are meant for communicating with external clients and should not be used internally in the API implementation. If the API implementation needs to pass around data internally, it should use the existing data structures directly. See the article about Internal Data Transfer Objects by Phil Calçado for reasons why DTOs should not be used internally.

The DTOs and related mapping logic are usually placed in an implementation-level Service Layer [Fowler 2002].

Chapter 8, “Evolve APIs”, in [Zimmermann et al. 2022] discusses evolution strategies for APIs. Refactorings such as Introduce Version Identifier, Introduce Version Mediator, Relax Evolution Strategy, and Tighten Evolution Strategy provide guidance on refactoring an API towards those patterns.

Domain-Driven Design (DDD) offers additional techniques and patterns to structure domain classes. A brief introduction to Tactic DDD can be found in the Design Practice Repository (DPR) on GitHub and the corresponding DPR eBook [Zimmermann and Stocker 2021].

Many Enterprise Integration Patterns [Hohpe and Woolf 2003] are related. For instance, Content Enricher and Content Filter can be used to wrap and map implementation-internal data.
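In a DTO mapping, these two patterns could be combined as in the following sketch: the mapper filters out an internal attribute and enriches the result with data from a second source. The attribute names and the lookup map are hypothetical.

```java
import java.util.Map;

final class FilterEnrichSketch {

    // Internal type with an attribute that must not leave the API implementation.
    record Customer(long id, String name, String internalRiskScore) {}

    // Published representation: the risk score is filtered out, a count is added.
    record CustomerDto(long id, String name, int interactionCount) {}

    static CustomerDto toDto(Customer customer, Map<Long, Integer> interactionCounts) {
        // Content Filter: internalRiskScore is simply not copied.
        // Content Enricher: the count comes from a second data source.
        int count = interactionCounts.getOrDefault(customer.id(), 0);
        return new CustomerDto(customer.id(), customer.name(), count);
    }

    public static void main(String[] args) {
        Customer customer = new Customer(42L, "Ada Lovelace", "B+");
        System.out.println(toDto(customer, Map.of(42L, 7)));
    }
}
```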

Step 4 of the Stepwise Service Design in DPR [Zimmermann and Stocker 2021] advises to “foresee a Remote Facade that exposes Data Transfer Objects (DTOs) in the request and response messages of its API operations to decouple the (published) languages of frontends and backends and to optimize the message exchange over the network w.r.t exchange frequency and message size.” The Remote Facade that is mentioned in the quote helps to “translate coarse-grained methods onto the underlying fine-grained objects.” This means that the DTO can be used to restructure the data so that clients can easily interact with it while using the network efficiently.

Martin Fowler describes the code-level refactoring Introduce Parameter Object in “Refactoring – Improving the Design of Existing Code” [Fowler 2018]. The Refactoring.Guru website features this refactoring as “Introduce Parameter Object”.

References

Evans, Eric. 2003. Domain-Driven Design: Tackling Complexity in the Heart of Software. Addison-Wesley.

Fowler, Martin. 2002. Patterns of Enterprise Application Architecture. Boston, MA, USA: Addison-Wesley Longman Publishing Co., Inc.

———. 2018. Refactoring: Improving the Design of Existing Code. 2nd ed. Addison-Wesley Signature Series (Fowler). Boston, MA: Addison-Wesley.

Hohpe, Gregor, and Bobby Woolf. 2003. Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions. Addison-Wesley.

Parnas, D. L. 1972. “On the Criteria to Be Used in Decomposing Systems into Modules.” Commun. ACM 15 (12): 1053–58. https://doi.org/10.1145/361598.361623.

Singjai, Apitchaka, Uwe Zdun, and Olaf Zimmermann. 2021. “Practitioner Views on the Interrelation of Microservice APIs and Domain-Driven Design: A Grey Literature Study Based on Grounded Theory.” In 18th IEEE International Conference on Software Architecture (ICSA 2021). https://doi.org/10.5281/zenodo.4493865.

Zimmermann, Olaf, and Mirko Stocker. 2021. Design Practice Reference - Guides and Templates to Craft Quality Software in Style. LeanPub. https://leanpub.com/dpr.

Zimmermann, Olaf, Mirko Stocker, Daniel Lübke, Uwe Zdun, and Cesare Pautasso. 2022. Patterns for API Design: Simplifying Integration with Loosely Coupled Message Exchanges. Addison-Wesley Signature Series (Vernon). Addison-Wesley Professional.

  1. Ubiquitous Language is one of the core patterns in Domain Driven Design [Evans 2003].