Extract Information Holder

 Note: This is a preview release and subject to change. Feedback welcome!

also known as: Extract Resource Representation

Context and Motivation

An API operation returns multiple related, possibly deeply nested data structures to provide clients with a rich dataset in a single response. The Microservice API Patterns (MAP) call such nested data elements Embedded Entities. For example, in an e-commerce application, the request for the profile of a customer might also return their complete purchasing history. While such an API is very convenient for clients that process all the information at once, it is not appropriate for all use cases. A Linked Information Holder might suit some clients better because they can retrieve selected data on demand through subsequent individual requests.

As an API client, I want to be able to retrieve related data elements on demand instead of receiving large structured data sets in a single message so that I can process responses and the data in them more quickly.

Stakeholder Concerns (including Quality Attributes and Design Forces)

#performance
Assembling, transferring, and processing a response utilizes resources both on the provider and on the client side. These resources should not be wasted but treated with care. Bandwidth and computing power are examples of precious (and sometimes costly) resources.
#usability including #developer-experience
The implementation effort on the client side decreases if fewer requests and less client-side state management are required to fetch the desired data.
#evolvability
Systems and components evolve at different speeds. Hence, they should not depend on each other unless this is justified in the business requirements. Data dependencies often introduce undesired, hard-to-spot coupling.
#data-currentness
Data returned by an API might age at different rates. In the e-commerce shop scenario, for instance, the master data of customers (e.g., names, shipping addresses) will change less frequently than transactional data such as orders. API clients might want to cache some of the data retrieved, which is harder if faster-changing data is embedded in slower-changing data.
#security
Not all API clients have the same access privileges. More fine-grained data retrieval operations make it easier to enforce related controls and rules, avoiding the risk that restricted data “slips through” accidentally. To revisit the e-commerce scenario, what if the shop software also includes public ratings of products that show the name and picture of the rating customer? Here, only limited and carefully selected information about the customer should be returned.
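
The performance and usability forces pull in opposite directions, which a small sketch can make tangible. The following Python snippet compares the serialized size of a customer profile with an embedded purchase history to one that only carries a link. The sample data, the field values, and the link path `/customers/{id}/purchase-history` are assumptions made for illustration; they do not come from a real API.

```python
import json

def make_orders(count):
    """Create `count` made-up purchase orders (illustrative data only)."""
    return [{"orderId": f"o-{i}", "total": 10.0 * (i + 1)} for i in range(count)]

def embedded_profile(customer_id, orders):
    """Profile with the purchase history embedded (one large response)."""
    return {"id": customer_id, "givenName": "Ada", "familyName": "Lovelace",
            "purchaseHistory": orders}

def linked_profile(customer_id):
    """Profile that only links to the purchase history (hypothetical URI)."""
    return {"id": customer_id, "givenName": "Ada", "familyName": "Lovelace",
            "purchaseHistory": f"/customers/{customer_id}/purchase-history"}

def payload_size(message):
    """Size in bytes of the JSON-serialized message."""
    return len(json.dumps(message).encode("utf-8"))

orders = make_orders(500)
embedded_size = payload_size(embedded_profile("c-1", orders))
linked_size = payload_size(linked_profile("c-1"))
history_size = payload_size(orders)
```

A client that only needs the name transfers `linked_size` bytes. A client that needs everything transfers slightly more in total than with the embedded variant and pays for a second round trip, which is the usability cost named above.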

Initial Position Sketch

The API implementation returns Data Elements that contain further nested data, represented by the various icons from MAP [Zimmermann et al. 2020].

The refactoring targets the response messages of operations and their data representation elements.

Smells / Drivers

God endpoint
The endpoint offering this operation might have to access many data sources or backend systems to assemble the response. A large number of such dependencies on external systems and data makes the API implementation harder to evolve.
Data lifetime mismatches
Conflating data elements with different lifetimes makes caching, and especially cache invalidation, harder. This may happen when slow-changing master data contains fast-changing transactional data, but also if the relation is reversed and transactional data that is often refreshed by clients contains embedded master data that changes infrequently.
Overfetching
Clients receive, and have to process, more data than they require because the operation does not offer any way to define the targeted representation elements (publishing parts or all of a domain model’s entities and their attributes).
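
The data lifetime mismatch can be illustrated with a minimal client-side cache sketch. The `TtlCache` class, the cache keys, and the two lifetimes below are hypothetical and chosen for illustration only; the time is passed in explicitly to keep the example deterministic.

```python
class TtlCache:
    """Tiny time-to-live cache; `now` is passed in to keep the sketch deterministic."""

    def __init__(self):
        self._entries = {}  # key -> (value, expires_at)

    def put(self, key, value, ttl_seconds, now):
        self._entries[key] = (value, now + ttl_seconds)

    def get(self, key, now):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        return value if now < expires_at else None

# Assumed lifetimes in seconds, chosen for illustration only.
PROFILE_TTL = 3600   # master data: changes rarely
HISTORY_TTL = 60     # transactional data: changes often

cache = TtlCache()

# Embedded: the combined response can only live as long as its
# fastest-changing part.
cache.put("profile+history",
          {"givenName": "Ada", "purchaseHistory": ["o-1"]},
          min(PROFILE_TTL, HISTORY_TTL), now=0)

# Linked: each resource is cached with its own lifetime.
cache.put("profile",
          {"givenName": "Ada", "purchaseHistory": "/customers/c-1/purchase-history"},
          PROFILE_TTL, now=0)
cache.put("history", ["o-1"], HISTORY_TTL, now=0)

# Two minutes later the combined entry and the history have expired,
# while the separately cached profile is still usable.
after_two_minutes = 120
combined_hit = cache.get("profile+history", now=after_two_minutes)
profile_hit = cache.get("profile", now=after_two_minutes)
history_hit = cache.get("history", now=after_two_minutes)
```

With the embedded representation, the slow-changing profile data is evicted together with the order data; with separate resources, only the purchase history has to be fetched again.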

Instructions (Steps)

Preparation/Preconditions:

  1. Ensure that the API offers a separate Retrieval Operation for the linked data. If this is not already the case, apply the Split Operation refactoring first.
  2. Add a Linked Information Holder to the refactored response message so that clients know how to fetch the linked data [1].
  3. If the API operation does not already use a dedicated Data Transfer Object (DTO), apply the Introduce Data Transfer Object refactoring.

Depending on how deep the Embedded Entity is nested in the response data structure, the refactoring may have to be applied several times.

To replace an Embedded Entity with a Linked Information Holder:

  1. Add a Link Element to the DTO.
  2. Adjust the tests to the new response structure and run them to observe the changed responses.
  3. Deprecate or remove the Embedded Entity.
  4. Clean up the implementation code. For example, service/utility classes or repositories previously used to retrieve the embedded data might not be required anymore.
  5. Check security policies to ensure that clients can still access the linked data.
  6. If under your control, adjust API clients to issue additional API calls to retrieve the data available at the endpoint referenced in the new Link Element.
  7. Update API description, version number, sample code, tutorials, etc. as required. API directories and gateways might have to be updated as well.
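
The DTO-related steps can be sketched on the provider side as follows. This is a minimal illustration under stated assumptions, not the catalog's reference implementation: the dataclass layout, the helper `build_history_link`, the mapping function `to_profile_dto`, and the URI scheme are all hypothetical. The DTO carries the new Link Element (step 1) and, during a transition phase, still carries the deprecated Embedded Entity (step 3).

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class PurchaseOrderDTO:
    order_id: str

@dataclass
class CustomerProfileDTO:
    id: str
    given_name: str
    family_name: str
    # Step 1: the new Link Element.
    purchase_history_link: str
    # Step 3, transition phase: the embedded data is kept but deprecated;
    # it is None once the Embedded Entity has been removed.
    purchase_history: Optional[List[PurchaseOrderDTO]] = None

def build_history_link(customer_id: str) -> str:
    """Hypothetical URI scheme for the linked purchase history."""
    return f"/customers/{customer_id}/purchase-history"

def to_profile_dto(customer: dict,
                   orders: Optional[List[dict]] = None) -> CustomerProfileDTO:
    """Map a customer record to the DTO.

    Pass `orders` only during the deprecation phase. After step 3 the
    caller no longer loads them, so the order repository is not needed
    to assemble this response (step 4).
    """
    embedded = None
    if orders is not None:
        embedded = [PurchaseOrderDTO(order_id=o["id"]) for o in orders]
    return CustomerProfileDTO(
        id=customer["id"],
        given_name=customer["givenName"],
        family_name=customer["familyName"],
        purchase_history_link=build_history_link(customer["id"]),
        purchase_history=embedded,
    )

customer = {"id": "c-1", "givenName": "Ada", "familyName": "Lovelace"}
transitional = to_profile_dto(customer, orders=[{"id": "o-1"}])  # deprecated field still filled
final = to_profile_dto(customer)                                 # link only
```

Keeping both fields for a while lets existing clients continue to work while new clients switch to the link; the tests from step 2 can assert on either shape.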

Target Solution Sketch (Evolution Outline)

The client can use the link returned in the response to the initial request to retrieve the related data.

To reap the full benefits of this refactoring, backwards compatibility has to be given up. In a first step, the Embedded Entity could be marked as deprecated to give the clients time to adjust.
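
From the client's perspective, the target solution might look like the following sketch. An in-memory dictionary stands in for the provider so that the example runs without a network; a real client would issue HTTP requests instead. All paths, payloads, and function names are assumptions made for illustration.

```python
# In-memory stand-in for the provider (hypothetical paths and payloads).
FAKE_API = {
    "/customers/c-1": {
        "id": "c-1",
        "givenName": "Ada",
        "familyName": "Lovelace",
        "purchaseHistory": "/customers/c-1/purchase-history",
    },
    "/customers/c-1/purchase-history": [
        {"orderId": "o-1"},
        {"orderId": "o-2"},
    ],
}

request_log = []

def get(path):
    """Simulate one API call and record it."""
    request_log.append(path)
    return FAKE_API[path]

def display_name(profile):
    """Use case that needs master data only: no further request."""
    return f'{profile["givenName"]} {profile["familyName"]}'

def order_ids(profile):
    """Use case that needs the history: follow the link on demand."""
    history = get(profile["purchaseHistory"])
    return [order["orderId"] for order in history]

profile = get("/customers/c-1")
name = display_name(profile)
requests_for_name = len(request_log)

ids = order_ids(profile)
requests_for_history = len(request_log)
```

The client treats the link as an opaque value taken from the response rather than constructing the path itself, so the provider remains free to change where the purchase history lives.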

Example(s)

The following API description shows an endpoint to retrieve the CustomerProfileDTO, which includes the Embedded Entity PurchaseOrderDTOs.

API description ECommerceAPI

data type CustomerProfileId {"id": ID<string>}

data type CustomerProfileDTO {
  "id": CustomerProfileId,
  "givenName": Data<string>,
  "familyName": Data<string>,
  <<Embedded_Entity>> "purchaseHistory": PurchaseOrderDTO*
} 

data type PurchaseOrderDTO "ToBeContinued"

endpoint type CustomerProfileEndpoint serves as INFORMATION_HOLDER_RESOURCE
exposes 
  operation getCustomerProfile with responsibility RETRIEVAL_OPERATION
    expecting payload CustomerProfileId
    delivering payload CustomerProfileDTO
    
API provider ECommerceAPIProvider
  offers CustomerProfileEndpoint

API client ECommerceClient
  consumes CustomerProfileEndpoint

Having applied the refactoring, the client will now receive a link (notice the purchaseHistory link in CustomerProfileDTO).

API description ECommerceAPI

data type CustomerProfileId {"id": ID<string>}

data type CustomerProfileDTO {
  "id": CustomerProfileId,
  "givenName": Data<string>,
  "familyName": Data<string>,
  <<Linked_Information_Holder>> "purchaseHistory": Link<string>
} 

data type PurchaseOrderDTO "ToBeContinued"

endpoint type CustomerProfileEndpoint serves as INFORMATION_HOLDER_RESOURCE
exposes 
  operation getCustomerProfile with responsibility RETRIEVAL_OPERATION
    expecting payload CustomerProfileId
    delivering payload CustomerProfileDTO
    
endpoint type PurchaseHistoryEndpoint serves as INFORMATION_HOLDER_RESOURCE
exposes
  operation getPurchaseHistory with responsibility RETRIEVAL_OPERATION
    expecting payload CustomerProfileId
    delivering payload PurchaseOrderDTO*
    
API provider ECommerceAPIProvider
  offers CustomerProfileEndpoint 
  offers PurchaseHistoryEndpoint

API client ECommerceClient
  consumes CustomerProfileEndpoint

Hints and Pitfalls to Avoid

Comparing the Target Solution sketch with Initial Position shows that the first resource now accesses fewer repositories to assemble the response message. This enables further architectural refactorings such as Split Application Kernel.

A deeper discussion of the benefits and liabilities of the two approaches can be found in the descriptions of the Embedded Entity and Linked Information Holder patterns.

The inverse API refactoring is Inline Information Holder.

If there is no operation to retrieve the linked data, the Split Operation refactoring can be used to create one.

After a Split Operation refactoring, Extract Information Holder can be used to “split” the response messages of the operations.

The Wish List and Wish Template patterns in MAP (and related refactorings) offer alternative solutions to the problem of how an API client can inform the API provider at runtime about the data it is interested in.

References

Zimmermann, Olaf, Mirko Stocker, Daniel Lübke, Cesare Pautasso, and Uwe Zdun. 2020. “Introduction to Microservice API Patterns (MAP).” In Joint Post-Proceedings of the First and Second International Conference on Microservices (Microservices 2017/2019), edited by Luís Cruz-Filipe, Saverio Giallorenzo, Fabrizio Montesi, Marco Peressotti, Florian Rademacher, and Sabine Sachweh, 78:4:1–17. OpenAccess Series in Informatics (OASIcs). Dagstuhl, Germany: Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik. https://doi.org/10.4230/OASIcs.Microservices.2017-2019.4.

  1. A Linked Information Holder comes as a Link Element pointing at another endpoint, typically an Information Holder Resource.