Refactorings by Smells

There is also a Refactorings by Stakeholder Concerns index.

This index lists refactorings by the smells they address:

API does not get to the point

  • Add Wish List

    According to the POINT principles for API design, any operation should have a purpose. It should also be T-shaped (that is, both broad and deep). Underfetching and overfetching indicate that these two principles are violated or only partially met.

  • Merge Endpoints

    The I in POINT stands for Isolation. API operations should be free of unexpected side effects; they should not interfere with calls to other operations in the same or other APIs. See the blog post “APIs should get to the POINT” for further explanations.

Atomicity and consistency management issues

  • Bundle Requests

    Some clients might want to make sure that either all or none of the requests and corresponding responses complete successfully. This can be much harder if they arrive one by one; most remote APIs do not offer global transactions (for instance, distributed two-phase commits) nowadays — for good reasons such as operation overhead, testing effort, and technical risk over time.
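
A minimal sketch of the all-or-nothing behavior that Bundle Requests aims for, assuming a hypothetical in-memory provider state. `BundledRequest` and `process_bundle` are illustrative names, not part of the catalog or of any library. The provider applies the bundled requests to a working copy and commits only if every one of them succeeds:

```python
import copy
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class BundledRequest:
    """One request inside a bundle (hypothetical structure)."""
    name: str
    apply: Callable[[dict], None]  # mutates the working copy of the state


def process_bundle(state: dict, bundle: List[BundledRequest]) -> Tuple[dict, List[str]]:
    """Apply all requests to a copy; return the new state only if all succeed."""
    working = copy.deepcopy(state)
    for request in bundle:
        try:
            request.apply(working)
        except Exception as exc:
            # One failure rejects the whole bundle; the original state is untouched.
            return state, [f"{request.name}: {exc}"]
    return working, []


def credit(state: dict) -> None:
    state["balance"] += 50


def debit(state: dict) -> None:
    if state["balance"] < 500:
        raise ValueError("insufficient funds")
    state["balance"] -= 500


account = {"balance": 100}
result, errors = process_bundle(
    account, [BundledRequest("credit", credit), BundledRequest("debit", debit)]
)
```

Here the second request fails, so `result` still equals the original state and `errors` names the failing request. This is a local, single-provider simplification; it does not replace a distributed transaction.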

Change log jitter

  • Rename Representation Element

    The name has been, and continues to be, frequently changed according to the logs kept by the version control system. Frequent changes indicate that the domain language is not yet stable or has not yet been defined, communicated, and agreed upon sufficiently.

  • Split Operation

    The operation has been, and continues to be, modified frequently, according to the commit logs kept by the version control system. Frequent changes may indicate that the operation has too many responsibilities and is not focused enough.

Client community smaller than expected

  • Tighten Evolution Strategy

    The provider receives less client traffic than expected and hoped for. Client feedback indicates that the API functionality is appreciated, but API usage is not considered a viable design option due to missing stability and support guarantees.

Cloud-native traits violated

Combinatorial explosion of input options

  • Split Operation

    Boolean parameters or other flags that determine the execution path lead to a combinatorial explosion of possibilities. Explaining these options bloats the API Description and is problematic for the client, who has to understand this complex option space to prepare valid requests, and the provider, who has to validate and process the parameter handling. API testing on the client and provider side is also complicated.
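
To illustrate how quickly the option space grows, the following sketch enumerates all combinations of a few boolean flags. The flag names are hypothetical and only serve the example:

```python
from itertools import product
from typing import Dict, List

# Hypothetical boolean request parameters of a single, overloaded operation.
FLAGS = ["include_history", "dry_run", "notify_customer", "async_mode"]


def execution_paths(flags: List[str]) -> List[Dict[str, bool]]:
    """Enumerate every flag combination that has to be documented, validated, and tested."""
    return [
        dict(zip(flags, values))
        for values in product([False, True], repeat=len(flags))
    ]


paths = execution_paths(FLAGS)
assert len(paths) == 2 ** len(FLAGS)  # four flags already yield 16 combinations
```

Each additional flag doubles the number of paths, which is one way to see why splitting the operation can pay off.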

Confetti design

  • Introduce Data Transfer Object

    Clients might have to issue many requests to get all the needed data. Fine-grained APIs that rain many small Data Elements on clients are rather flexible but can be tedious to use.

Cryptic or misleading name

  • Rename Endpoint

    The chosen name is not straightforward to understand or is not part of the domain terminology. It raises wrong, unexpected or undesired associations.

  • Rename Representation Element

    The chosen element name is difficult to understand for stakeholders unfamiliar with the API implementation. For instance, it might not be part of the agreed-upon domain vocabulary or unveil implementation details such as column names in database tables. It might also be ambiguous and overloaded with different meanings (in the same context).

Curse of knowledge

  • Rename Operation

    The operation name is easy to comprehend, but only for the developers of the API implementation on the provider side. Client developers, in contrast, lack the required context information. The term “curse of knowledge” originates from technical writing (see, for instance, hint 5 in “Technical Writing Tips and Tricks” and the video lecture by Steven Pinker referenced in that post).

Data lifetime mismatches

  • Extract Information Holder

    Conflating Data Elements with different lifetimes makes caching, and especially cache invalidation, harder. This may happen when slow-changing master data contains fast-changing transactional data (for example, in an Operational Data Holder), but also if transactional data that is often refreshed by clients contains embedded master data that changes infrequently.

  • Move Operation

    Two operations in an endpoint deal with data that changes differently, both on the data definition and on the data manipulation level. For instance, one operation may expose master data and the other one may expose operational data. This causes undesired constraints on endpoint evolution.

Endpoint implementation spaghetti

  • Move Operation

    There are several n to m relations between endpoints and implementation parts. At least one endpoint works with many implementation parts that evolve independently.

Evolution strategy does not meet client expectations

Extreme decomposition

  • Merge Endpoints

    The desire to decompose an API and its implementation into independently deployable units went too far. There are numerous endpoints exposing narrowly-scoped operations that call each other, or have to be orchestrated on the client side.

Feature/release inertia a.k.a. stale roadmap

  • Relax Evolution Strategy

    Providers are reluctant to introduce new features because of the commitments made and guarantees given to existing API users (for example, in an SLA).

  • Segregate Commands from Queries

    An endpoint provides both read and write operations; there might be many read, but only a few write operation calls. These types of operations evolve at different speeds; possibly, different development teams are responsible for them. For instance, new query options in a customer relationship management application may be introduced in every two-week iteration in response to frequently arriving customer inquiries and client insights. In contrast, commands evolve with a frequency imposed by a master data management or Enterprise Resource Planning (ERP) package in the backend. The operations also differ in the amount of design and test work required; write operations change state and therefore may have nontrivial “given” preconditions and “then” postconditions and require consistency management. The conceptual integrity of the endpoint and all of its read and write operations has to be preserved during each evolution step. As a result, it takes longer than desired to introduce new features, new queries in particular.

God endpoint

  • Extract Information Holder

    The endpoint offering this operation might have to access many data sources or backend systems to assemble the response. A large number of such dependencies on external systems and data makes the API implementation harder to operate and evolve.

  • Move Operation

    One endpoint contains a large number of operations that do not serve related purposes.

High coupling

  • Merge Operations

    Two or more operations perform narrowly focused, rather low-level activities. Clients have to understand and combine all of these activities to achieve higher goals, leading to a degraded developer experience and coordination needs. This causes these operations to be coupled with each other implicitly.

High latency/poor response time

  • Bundle Requests

    Responses take a long time to be returned to clients because many small requests cause high workload for the communications infrastructure and protocol endpoint(s) on the provider side.

  • Introduce Pagination

    Responses take a long time to arrive at the client because a lot of data has to be assembled and transmitted. This might be evident in a provider-side log file analysis or client-side performance metrics.

  • Make Request Conditional

    Load on the API provider is unnecessarily high because the same data is processed and transferred many times over.

  • Segregate Commands from Queries

    Poor performance may be caused by overly tight coupling between operations. Expensive queries slow down the execution of write operations (for instance, operations performing state creation or transition). Transactional isolation is insufficient.

Lack of trust and confidence

  • Tighten Evolution Strategy

    Client developers decide not to use the API because it comes across as unstable and likely to change often (or disappear).

Large and/or partially unknown user base

  • Introduce Version Mediator

    API providers are not in control of their users and lack information about them. The less information and control a provider has, the higher the risk of impacting clients negatively (or losing them) when making breaking changes in upgrades.

Leaky encapsulation

  • Inline Information Holder

    The implementation data model is leaking through the API. For example, the relational database schema has been exposed with one endpoint per table, and now clients have to resolve the foreign key relationships themselves.

  • Introduce Data Transfer Object

    Domain layer language constructs (e.g., classes) or abstractions defined in the persistence layer are directly exposed in the API. Such permeable or even completely missing encapsulation of API internal data structures makes an API harder to evolve because it introduces coupling and harms backward compatibility.

  • Rename Representation Element

    Program-internal names or identifiers might accidentally have leaked into the API. For example, the initial API could have been generated from internal classes. For API clients, such internal names might be hard to understand.
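
The Introduce Data Transfer Object and Rename Representation Element entries above can be pictured with a small sketch. It assumes a hypothetical persistence-layer class whose column-style names should not reach clients; all class and field names are invented for illustration:

```python
from dataclasses import asdict, dataclass


@dataclass
class CustomerRecord:
    """Hypothetical persistence-layer class; its names mirror database columns."""
    cust_id_pk: int
    fname_col: str
    lname_col: str
    password_hash: str  # internal detail that must not be exposed


@dataclass(frozen=True)
class CustomerDto:
    """Hypothetical Data Transfer Object using domain vocabulary only."""
    customer_id: int
    full_name: str


def to_dto(record: CustomerRecord) -> CustomerDto:
    """Map the internal record to the published representation."""
    return CustomerDto(
        customer_id=record.cust_id_pk,
        full_name=f"{record.fname_col} {record.lname_col}",
    )


published = asdict(to_dto(CustomerRecord(7, "Ada", "Lovelace", "not-for-clients")))
```

With this mapping in place, the internal class can be renamed or restructured without changing what clients receive.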

Overfetching

  • Add Wish List

    Clients throw away large parts of the received data because the API design follows a one-size-fits-all approach, and the provider includes all data in responses that any present or future client might be interested in. For example, in an e-commerce API, product procurement information might only interest a few clients, while most want to learn about current prices and items in stock. Another phenomenon is “sell what is on the truck”: implementation data is exposed just because it is there, without any client-side use case.

  • Add Wish Template

    Clients may throw away large parts of the received data because the API design follows a one-size-fits-all approach and includes all data that any present or future client might be interested in (according to the smell description in Add Wish List). Another phenomenon is “sell what is on the truck”: implementation data is exposed just because it is there, without a client-side use case.

  • Extract Information Holder

    Clients call multiple API operations to get all data they require because these calls do not offer any way to define the targeted representation elements (publishing parts or all of a domain model’s entities and their attributes).

  • Introduce Pagination

    A client may not need all data (at once or at all) and truncate an overly large dataset. Since this truncation happens on the client side, data was unnecessarily processed and transmitted.
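
A minimal sketch of provider-side pagination, assuming a simple offset-based variant (other variants, such as cursor-based pagination, exist). The function name and the response field names are hypothetical:

```python
from typing import Any, Dict, List, Optional


def paginate(items: List[Any], offset: int = 0, limit: int = 10) -> Dict[str, Any]:
    """Return one page of items plus the metadata needed to request the next page."""
    if offset < 0 or limit < 1:
        raise ValueError("offset must be >= 0 and limit must be >= 1")
    page = items[offset:offset + limit]
    end = offset + limit
    next_offset: Optional[int] = end if end < len(items) else None
    return {
        "items": page,
        "offset": offset,
        "limit": limit,
        "total": len(items),
        "next_offset": next_offset,  # None signals the last page
    }


orders = list(range(25))  # stand-in for a large result set
first_page = paginate(orders, offset=0, limit=10)
last_page = paginate(orders, offset=20, limit=10)
```

The client now decides how much data it retrieves, so truncation no longer happens after the full dataset has been assembled and transmitted.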

Polling proliferation

  • Make Request Conditional

    Clients that participate in long-running conversations and API call orchestrations ping the server for the current status of processing (“are you done?”). They do so more often than the provider-side state advances.
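
A conditional request lets the provider answer "nothing changed" cheaply. The sketch below assumes a fingerprint-based approach, comparable in spirit to HTTP entity tags with `If-None-Match` and a `304 Not Modified` response; the function names and the dictionary-based response are hypothetical simplifications:

```python
import hashlib
import json
from typing import Optional


def fingerprint(resource: dict) -> str:
    """Compute a stable fingerprint of the current resource state."""
    payload = json.dumps(resource, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def get_status(resource: dict, client_fingerprint: Optional[str] = None) -> dict:
    """Return the full state only if it differs from what the client already has."""
    current = fingerprint(resource)
    if client_fingerprint == current:
        return {"status": 304, "fingerprint": current}  # not modified, no body
    return {"status": 200, "fingerprint": current, "body": dict(resource)}


job = {"id": "job-1", "state": "running"}
first = get_status(job)                         # full response
second = get_status(job, first["fingerprint"])  # unchanged, no body sent
job["state"] = "done"
third = get_status(job, first["fingerprint"])   # changed, full response again
```

Polling still takes place, but repeated "are you done?" calls no longer cause the same data to be serialized and transferred again.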

Quality-of-service (QoS) fragmentation and scattering

  • Externalize Context Representation

    One or more protocols might be used in the same system. API clients and providers have to go to multiple places to gather or produce all required non-functional metadata. This might be error-prone and time-consuming and cause technical debt.

REST principle(s) violated

  • Merge Operations

    The “Uniform interface” is an important design constraint imposed by the REST style that many HTTP APIs employ. REST mandates using the standard HTTP verbs (POST, GET, PUT, PATCH, DELETE, etc.), which are associated with additional constraints. For instance, GET and PUT requests must be idempotent to be cachable [@Allamaraju:2010]. Sometimes, mismatches between the API semantics and the REST constraints can be observed; sometimes, the REST constraints limit extensibility (for instance, when a resource identified by a single URI runs out of verbs) [@Serbout:2021].

  • Rename Endpoint

    Abstract API endpoints correspond to resources identified by URIs in RESTful HTTP APIs. Their names should convey the meaning and role of the resources. For instance, using verbs as relative paths might be considered an antipattern; resources should be named with nouns that can be traced back to domain entities (note that these entities and their resource representations can be represented as Processing Resources and/or data-oriented Information Holder Resources, both in REST and in APIs leveraging other integration styles).

Resistance to change caused by uncertainty

  • Introduce Version Identifier

    A provider might hesitate to implement necessary API changes due to a lack of clarity in its strategy for evolving the API. Clients might be reluctant to upgrade to new versions because they are unable to assess the imposed changes on their side.

  • Introduce Version Mediator

    One or more breaking changes of the API have happened, or the lifetime guarantee has been softened. While this should generally be avoided, it is not always possible; the provider might have good reasons to do it. However, clients are unwilling or unable to migrate to the latest version immediately. They might fear the effort and risk of the migration, or they might lack confidence and trust in the new version.

  • Relax Evolution Strategy

    A provider has made many assurances/guarantees to clients and is now reluctant to apply other API refactorings because it is afraid of disrupting these clients.

Responsibility mishmash

  • Extract Endpoint

    The operations in the endpoint deal with multiple, not necessarily related domain concepts. Consequently, the endpoint has more than one reason to change during its evolution. It serves multiple stakeholder groups and/or its implementation is developed and maintained by multiple teams.

Responsibility spread

  • Merge Endpoints

    The context description of the refactoring indicates that the single responsibility principle is violated. For instance, one stakeholder group might have to work with a large number of endpoints to satisfy its information needs.

  • Merge Operations

    Endpoint roles and/or operation responsibilities are rather diffuse; the Single Responsibility Principle is violated. For instance, API clients serving a particular stakeholder have to call multiple operations to satisfy their information needs. Another example would be that a choreographed or orchestrated business process implementation has to consult too many distributed operations to fulfill its job.

Role and/or responsibility diffusion

  • Extract Endpoint

    The endpoint is both an Information Holder Resource and a Processing Resource. An Information Holder exposes different types of data, for instance, both master data and operational data. The endpoint operations have rather diverse functional and technical responsibilities. Hence, it is hard to explain the endpoint purpose.

  • Move Operation

    The endpoint is both an Information Holder Resource and a Processing Resource. It is hard to explain what it does coherently. For instance, some of the exposed data lives long and changes rarely; other data goes through many create, update, delete operations in a short time span.

  • Rename Operation

    The operation does something, but the effects of the operation execution are not clear. For instance, it is not specified whether it reads and/or writes provider-side data and application state. The domain model abstractions/concepts that it works with remain fuzzy. Precision might be harmed and ambiguities introduced.

  • Split Operation

    An operation does (too) many things. Clients have to understand all these things to use the operation correctly. Its request message is rather deeply structured and may contain optional, generic, or variable parts to express diverse input options. This complexity may lead to errors and a degraded developer experience. The internal cohesion of the operation is low.

Same backend system and/or domain data processed by multiple endpoints

Security by obscurity

  • Rename Representation Element

    Sometimes, it is argued that unlabeled, undocumented data is harder to tamper with. But such a tactic alone does not qualify as a sound security solution. It also harms maintainability: the lack of clarity affects maintainers and auditors, not just attackers, and increases the risk of introducing bugs.

Sloppy or ill-motivated naming conventions

  • Rename Operation

    Knowing something that nobody else knows might be seen as a pragmatic approach to achieving job security; obscuring operation names might be part of such a strategy. However, the attitude driving such naming decisions can be considered unprofessional (or even unethical); API design and documentation should be seen as a service provided to the client community (and other stakeholders). Another example of good intentions gone wrong is trying to be funny when naming program(ming) artifacts; endpoint and operation names are not the most suited places for humor or irony because they distract from the facts.

Spike load

  • Introduce Pagination

    Regular requests for large amounts of data cause Periodic Workload [@Fehling:2014] for CPU and memory, for instance, when a large JSON object has to be constructed (on the provider side) and read (on the client side). For example, the “Time-Bound Report” variant of a Retrieval Operation might lead to relatively large responses, depending on the time interval size chosen.

  • Make Request Conditional

    Regular requests for large amounts of data cause Periodic Workload for CPU and memory, for instance, when a rather large JSON object has to be constructed (on the provider side) and read (on the client side).

Structured artifact serialized and therefore strangled

  • Add Wish Template

    Some data representation that is nested is serialized into a custom string format that is hard to parse and keeps on causing surprises in testing and production. In object-oriented programming, for instance, objects that contain other objects often have to be mapped to and from text notations such as JSON. The reserved characters of the notations (for instance, curly braces and double quotes in JSON) have to be escaped during serialization, which can be tedious.
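
The escaping burden can be made visible with a short sketch. It assumes a hypothetical request that carries a nested template either as a pre-serialized string or as a regular nested structure; the field names are invented for illustration:

```python
import json

# Hypothetical nested template describing the data a client is interested in.
template = {"name": "", "address": {"city": ""}}

# Smell: the nested structure is serialized first and then embedded as a string,
# so its double quotes have to be escaped inside the outer JSON document.
strangled = json.dumps({"customerId": "c1", "template": json.dumps(template)})

# Alternative: the template stays a nested structure in the request message.
nested = json.dumps({"customerId": "c1", "template": template})

# The receiver of the strangled variant needs a second parsing step.
recovered_from_string = json.loads(json.loads(strangled)["template"])
recovered_from_nested = json.loads(nested)["template"]
```

Both variants carry the same information, but only the string variant contains escaped quotes and forces the receiver to parse twice.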

Tacit semantic changes up to incompatibilities creep in

  • Introduce Version Identifier

    While the technical API contract remains unchanged, the meaning of the received or returned data might change over time. Such semantic mismatches between older and newer versions should be documented in the API Description and examples, and then caught during testing, ideally in an automated fashion. Implicit versioning and tolerant reading might hide such changes and their impact for quite some time.

Tight coupling of data contract

Tight coupling to a communication protocol

  • Externalize Context Representation

    Most networking and communication protocols define their own header formats; HTTP is an example. Some of these protocols support custom headers, others do not. Using protocol-specific headers locks the communication participant in; this can be positive or negative, depending on context and requirements.

Too coarse-grained security or data privacy rules

  • Move Operation

    Some operations in an endpoint have more advanced quality requirements than others. For instance, some work with sensitive personal information that has to be protected, while others merely operate on public data. Other mismatches in quality requirements might exist as well, for instance, regarding availability and scalability.

  • Segregate Commands from Queries

    The security and data protection requirements of commands and queries differ. They are specified on the endpoint rather than the operation level. Hence, generalization has to take place that bears risks such as under-specification and over-engineering.

Underfetching

  • Add Wish List

    Clients have to call multiple API operations to get all data they require because these operations do not offer any way to define the targeted Data Elements.

  • Add Wish Template

    Clients may have to call multiple API operations to obtain all required data because the request messages of these operations do not offer any way to define the targeted representation elements (thus publishing parts or all of the entities of a domain model and their attributes).

  • Inline Information Holder

    Clients have to issue many requests to get the data they require, harming performance.
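
For the Add Wish List entry in the list above, a minimal sketch of how a provider might let clients name the Data Elements they want. The function name, the parameter, and the sample resource are hypothetical:

```python
from typing import Dict, Iterable, Optional


def apply_wish_list(resource: Dict[str, object],
                    wish_list: Optional[Iterable[str]] = None) -> Dict[str, object]:
    """Return only the wished-for elements; without a wish list, return everything."""
    if not wish_list:
        return dict(resource)
    # Unknown names are ignored here; a real provider might report them as errors.
    return {name: resource[name] for name in wish_list if name in resource}


product = {
    "id": "p-42",
    "price": 19.99,
    "in_stock": 7,
    "procurement_info": {"supplier": "hypothetical supplier"},
}

slim = apply_wish_list(product, ["price", "in_stock"])
full = apply_wish_list(product)
```

A single operation can then serve clients with different information needs, which addresses underfetching and overfetching at the same time.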

Commit chaos

  • Rename Representation Element

    The name has been, and continues to be, frequently changed according to the logs kept by the version control system. Frequent changes indicate that the domain language is not yet stable or has not yet been defined, communicated, and agreed upon sufficiently.

  • Split Operation

    The operation has been, and continues to be, modified frequently, according to the commit logs kept by the version control system. Frequent changes may indicate that the operation has too many responsibilities and is not focused enough.

High coupling

  • Rename Representation Element

    Program-internal names or identifiers might accidentally have leaked into the API. For example, the initial API could have been generated from internal classes. For API clients, such internal names might be hard to understand.

Low cohesion

  • Split Operation

    An operation does (too) many things. Clients have to understand all these things to use the operation correctly. Its request message is rather deeply structured and may contain optional, generic, or variable parts to express diverse input options. This complexity may lead to errors and a degraded developer experience. The internal cohesion of the operation is low.