Categories: Architecture

Introduction

My colleagues and I, working as architects in collaboration with development teams, created a list of architectural characteristics that we considered important for business-tier software components (deployable artifacts). This list was used to assess existing components, ie to determine which were most in need of a refresh, and to evaluate proposals for new components.

The name Component Maturity Model¹ is related to the topic of Process Capability Maturity Models, but has a slightly different focus.

The reasons behind each item are sometimes complex, and the justification does not fit well into a checklist document. However, most of them are well documented in the general literature on software architecture.

We attempt in this document to specify goals and not technologies, allowing individual teams to select any technology which achieves the goals that we believe are important. This is, however, a difficult task.

I have left out any items which are specific to our architecture, or turned them into general items.

If you find this list useful, please let me know.

Applying the Checklist

Checklists are not a great way to guide architecture; developers generally don’t read them unless they are pushed to do so, as day-to-day work tends to take priority. This meant that we (as architects) needed to schedule meetings with development teams to review existing components against this checklist, and needed to be aware of all new components so that we could review them before coding started. Ideally the ideas in this checklist would instead be automated, ie violation of the rules by existing components would be automatically detected, and new components which don’t follow the rules would simply not be deployable. However, automating all these checks is non-trivial.

An (internal) tool was developed to present the items below as a form/survey/questionnaire. Each item can be answered as fulfilled/not-fulfilled/not-applicable, and notes can be added (eg a link to relevant documentation, dashboards, etc). An “overall score” can then be computed. Typically when performing a review, an appropriate number of tickets are raised to resolve the not-fulfilled items which have the greatest benefit/cost ratio. Future reviews then focus only on the still-unfulfilled items.
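The internal tool itself is not shown here. As a rough sketch of how such an “overall score” could be computed, assume the score is simply the share of applicable items that are fulfilled. The names `Answer` and `overall_score`, and the formula itself, are illustrative assumptions rather than the tool's actual implementation.

```python
from enum import Enum
from typing import Dict, Optional


class Answer(Enum):
    FULFILLED = "fulfilled"
    NOT_FULFILLED = "not-fulfilled"
    NOT_APPLICABLE = "not-applicable"


def overall_score(answers: Dict[str, Answer]) -> Optional[float]:
    """Return the share of applicable items that are fulfilled (0.0 to 1.0).

    Not-applicable items are excluded from the denominator. Returns None
    when no item is applicable, because no meaningful score exists then.
    """
    applicable = [a for a in answers.values() if a is not Answer.NOT_APPLICABLE]
    if not applicable:
        return None
    fulfilled = sum(1 for a in applicable if a is Answer.FULFILLED)
    return fulfilled / len(applicable)


# Hypothetical review of one component (assumed answers, for illustration).
review = {
    "Code Ownership": Answer.FULFILLED,
    "Schema Independence": Answer.NOT_FULFILLED,
    "Containerized Deployment": Answer.NOT_APPLICABLE,
    "Rollback": Answer.FULFILLED,
}
score = overall_score(review)  # two of the three applicable items are fulfilled
```

A weighted variant (eg weighting items by benefit/cost ratio) would be an equally plausible design; the unweighted form is used here only because it is the simplest.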

One advantage of this checklist is that it is at least easier to digest than an “architecture guidelines” document - developers really don’t read those. By going through this checklist for a few projects, and discussing where necessary why some items are present in this list, a lot of knowledge was shared about what we (as architects) would like to see and why.

Developing this list also acted as a good focus for discussions (with all developers) about what is important to us as an organisation and software development group.

Note: The term “tribe” refers to a cross-functional software development team responsible for several software domains. The term “domain” means “a set of business functionality belonging to a team/tribe”; others might call this a subdomain, and in many cases it is equivalent to a bounded context.

Topics and Items

Topic 1: Domain Independence
Topic 2: Datastores and External Services
Topic 3: Availability
Topic 4: Deployability
Topic 5: Interoperability
Topic 6: Feature Toggles
Topic 7: Security
Topic 8: Observability
Topic 9: Traceability
Topic 10: Alerts and Notifications
- Alert Policy
Topic 11: Tech Fitness KPIs
- KPI Policy
- Measures KPIs
Topic 12: Communication
- Contact Channel
Topic 13: Workflow
- Trunk-based Development

Topic 1: Domain Independence

Topic Goal: Each back-end (business-tier) component in a domain (typically there is just one) is decoupled from components in other domains during development, testing, deployment and at runtime. This allows development to scale linearly with the number of components.

Notes: In the case where a single domain is implemented as multiple back-end components, then a higher level of coupling between these components can be tolerated as the same development team is responsible for all of them. Component independence is nevertheless helpful even in this scenario.

Code Ownership

Goal: The component owner and only the component owner decides what code goes into their components (as long as that is consistent with organisation architectural requirements). There is no “cross coding” from other tribes/domains and no non-library code is shared with other tribes.

Hint: The list of contributors in the VCS (version control system) can indicate whether contributions come mostly from the owners. This does not mean that people outside the tribe cannot contribute to a component they do not own. However, changes to a component’s codebase must always be approved by the component owner.

Note: Pull-requests from other teams are acceptable, and even encouraged, as long as the component owners have full rights to accept or decline.

Code Quality Standards Independence

Goal: The component owner agrees on and enforces code quality standards (as long as that is consistent with the tech KPI goals set by the organisation).

Hint: This item can be fulfilled by setting a quality gate in a tool such as SonarQube that checks for bugs and code smells.

Code Independence

Goal: New features can be implemented without waiting for changes in other component codebases.

Prerequisite: No libraries are shared with other projects.

Prerequisite: Minimal coupling to other components via APIs or messaging (general-purpose APIs and messages are better than purpose-specific ones).

Deployment Independence

Goal: The component can be deployed at any time without the need to coordinate with other tribes.

Prerequisite: All interface changes are backwards-compatible, ie integration-points of a component which are visible to other software have to remain stable. As a result, there are no strict dependencies between application deployments regarding the deployment order. This also enables rollback of an unsuccessful deployment.

Prerequisite: The codebase for the component is small enough to be maintained by a single tribe.

Prerequisite: Private datastores (not readable by any other component); see section “Datastores”.

Runtime Independence

Goal: The deployed component remains largely functional during outages of other components.

Hint: Dependencies on authentication/authorization systems are excepted from this rule.

Prerequisite: There is no synchronous communication between backend components that perform business logic (ideally), or appropriate fallback behaviour is in place.
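Where a synchronous call cannot be avoided, “appropriate fallback behaviour” might look like the following minimal sketch. The function name `with_fallback` and the choice of falling back first to a last-known value and then to a default are assumptions for illustration; the right fallback depends on the business case.

```python
from typing import Callable, Optional


def with_fallback(fetch: Callable[[], str], cached: Optional[str], default: str) -> str:
    """Call another component, degrading gracefully if it is unavailable."""
    try:
        return fetch()
    except (ConnectionError, TimeoutError):
        # The other component is down or slow: prefer the last known value,
        # otherwise a safe default, rather than failing the whole request.
        return cached if cached is not None else default


def unavailable_service() -> str:
    """Stand-in for a remote call to a component that is currently down."""
    raise ConnectionError("remote component is not reachable")


result = with_fallback(unavailable_service, cached=None, default="standard-rate")
```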

Persistence Independence

Goal: The component guarantees the integrity and privacy of its own persistent data.

Prerequisite: Component uses a private datastore (not writable by any other component); see section “Datastores”.

No Shared Functionality

Goal: The component’s codebase does not share functionality with any other domain (eg via common libraries encoding business logic).

Hint: Code sharing between components within the same domain can be acceptable to a certain degree.

Topic 2: Datastores and External Services

Topic Goal: Data is persisted in a way that allows development to scale linearly with the number of components.

Note: A datastore can be anything that holds data. Examples: databases, memcached, file-caching-servers.

Schema Independence

Goal: Ability to change data schemas without the need to coordinate with any other tribe.

Hint: This is the case if no other component directly accesses the database and any messages emitted from the component are appropriately decoupled from the storage schema.

Links: Integration Database, Shared Database

Cache Ownership

Goal: Ability to cache database results without the risk that the underlying data is modified by another component.

Hint: This is especially risky when using things such as a shared cache-server; ensure keys are appropriately namespaced.
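As a small illustration of key namespacing on a shared cache server, the sketch below prefixes every key with the owning component's name. The key layout `component:entity:id` and the component names are assumptions; a plain dict stands in for the shared cache.

```python
from typing import Dict


def cache_key(component: str, entity: str, entity_id: str) -> str:
    """Build a cache key prefixed with the owning component's name."""
    return f"{component}:{entity}:{entity_id}"


# A plain dict stands in for a shared cache server such as memcached.
shared_cache: Dict[str, str] = {}
shared_cache[cache_key("billing", "customer", "42")] = "billing view of customer 42"
shared_cache[cache_key("shipping", "customer", "42")] = "shipping view of customer 42"
```

Without the prefix, both components would write to the same `customer:42` entry and silently overwrite each other's data.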

Data Access Permission Ownership

Goal: Ability to enforce rules on data access (only the component owners can grant and revoke permissions, no one else).

Isolation from External Datastores

Goal: Stability of this component regardless of changes made to data persistence in other components.

Private Record Keys

Goal: No internal (potentially DB specific) surrogate keys are exposed (and thereby used by other components).

Note: When an external system depends upon an internal key, then that field cannot be changed in future without breaking backwards compatibility.

Prerequisite: Every business entity should get an organisation-wide artificial unique ID that does not change if the leading component or any implementation detail changes (e.g. by re-inserting entries into the database and thus re-generating auto-allocated keys).

Hint: This is only relevant for data that is owned by this component and under your control. Data that you process as part of a read model needs to be fixed by the owner of the data. Also, exposing randomly generated UUIDs is fine (as they are neither auto-incremented nor numeric).
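The difference between an internal surrogate key and a stable public ID can be sketched with an in-memory SQLite table. The table and column names (`customer`, `public_id`) are hypothetical; the point is only that re-inserting a row changes the auto-allocated key while the public ID stays the same.

```python
import sqlite3
import uuid

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE customer ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"  # internal surrogate key, never exposed
    " public_id TEXT NOT NULL UNIQUE,"        # stable ID that other components may use
    " name TEXT NOT NULL)"
)


def surrogate_key(public_id: str) -> int:
    """Look up the internal key for a given public ID."""
    row = db.execute("SELECT id FROM customer WHERE public_id = ?", (public_id,)).fetchone()
    return row[0]


public_id = str(uuid.uuid4())
db.execute("INSERT INTO customer (public_id, name) VALUES (?, ?)", (public_id, "Ada"))
old_key = surrogate_key(public_id)

# Simulate a migration that deletes and re-inserts the row.
db.execute("DELETE FROM customer WHERE public_id = ?", (public_id,))
db.execute("INSERT INTO customer (public_id, name) VALUES (?, ?)", (public_id, "Ada"))
new_key = surrogate_key(public_id)  # differs from old_key; public_id is unchanged
```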

External Resource Addresses are Configurable

Goal: The component accesses external services (including databases) via a configurable address, i.e. backing services are attached resources. This is fulfilled if endpoints and their connection properties are configurable and not hardcoded.

Link: 12 Factor App: IV. Backing services
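A minimal sketch of configurable backing-service addresses, assuming the configuration arrives via environment variables. The variable names `DATABASE_URL` and `BROKER_URL` and the `BackingServices` class are illustrative assumptions, not a prescribed convention.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class BackingServices:
    database_url: str
    broker_url: str


def load_backing_services(env=os.environ) -> BackingServices:
    """Read backing-service addresses from configuration, failing fast if one is missing."""
    try:
        return BackingServices(
            database_url=env["DATABASE_URL"],
            broker_url=env["BROKER_URL"],
        )
    except KeyError as missing:
        raise RuntimeError(f"missing required setting: {missing.args[0]}") from None


# Example with an explicit mapping instead of the real process environment.
services = load_backing_services({
    "DATABASE_URL": "postgresql://db.test.example:5432/orders",
    "BROKER_URL": "kafka://broker.test.example:9092",
})
```

Passing the mapping as a parameter keeps the function testable and makes it obvious that nothing is hardcoded.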

External Resource Access Rights are Minimised

Goal: Application users and administrative users (that perform db migrations, for instance) are separated to limit the damage if this component has vulnerabilities.

Hint: For each (account ID, credential) pair provided as configuration for this component, are the privileges associated with that account truly as low as possible?

Topic 3: Availability

Topic Goal: To provide a service which is available “around the clock” and which has no user-visible downtime.

SLOs and SLAs

Goal: The component has defined SLO/SLAs based upon business requirements and they are reflected in alerting.

Backup Policy

Goal: Component has a documented backup policy. The policy might be implemented by another tribe (eg one responsible for infrastructure), but the component’s needs should be defined by the owning tribe in line with the respective SLO.

Datastore Availability

Goal: Component relies only on datastores that are highly available, or uses less available datastores only opportunistically (eg a cache, where unavailability does not lead to an SLO miss).

Disposability

Goal: Component is “disposable” - an instance can be terminated and replaced by a new one without significant system impact.

Prerequisite: The component should begin providing service (e.g. serving requests or starting batch processing) within a few seconds of starting. It should also shut down cleanly when it receives a SIGTERM signal.

Links: 12 Factor App: IX. Disposability
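Clean shutdown on SIGTERM can be sketched as follows: the handler only sets a flag, and the work loop checks that flag between items so that the current item finishes before the process exits. The function names are assumptions, and a real component would also close connections and flush buffers after `run` returns.

```python
import signal
import threading

stop_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Signal handler: ask the work loop to finish its current item and stop."""
    stop_requested.set()


def install_handler():
    """Register the handler; must be called from the main thread."""
    signal.signal(signal.SIGTERM, handle_sigterm)


def run(process_one_item, idle_seconds=0.1):
    """Process work until shutdown is requested, then return so cleanup can run."""
    while not stop_requested.is_set():
        process_one_item()
        # Sleep between items, but wake immediately if shutdown is requested.
        stop_requested.wait(idle_seconds)
```

In a real process, `install_handler()` would be called once at startup, followed by `run(...)` with the component's own work function.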

Statelessness

Goal: Component is stateless and share-nothing; any data that needs to persist must be stored in a stateful backing service, typically a database.

Prerequisite: The component does not rely on sticky sessions; any instance can process any request.

Links: 12 Factor App: VI. Processes

Startup Dependency Isolation

Goal: Component starts up even when external services are unavailable (private databases are excluded).

Hint: Requiring an external service on startup can lead to circular dependencies, which make it impossible to bootstrap the platform after a complete platform outage.

Rapid Startup

Goal: Component starts rapidly.

Prerequisite: Avoids designs that need long warmup times on new deployments (e.g. to build up large cache structures).

Hint: This question deals with long startup times on new deployments, and is not related to performance when serving requests.

Supports Readiness

Goal: Accept requests only when an instance is ready to process them.

Hint: Many runtimes rely on the component responding appropriately to a “readiness check”.
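The logic behind a readiness check can be sketched without any web framework: the instance reports “not ready” until its startup work is done, and the runtime only routes traffic to instances that report “ready”. The class name and the use of HTTP status codes 200 and 503 are assumptions about how the check is exposed.

```python
import threading


class Readiness:
    """Tracks whether this instance is ready to accept requests."""

    def __init__(self):
        self._ready = threading.Event()

    def mark_ready(self):
        """Call once startup work (connections, migrations, etc) has completed."""
        self._ready.set()

    def mark_not_ready(self):
        """Call when the instance should stop receiving traffic, eg during shutdown."""
        self._ready.clear()

    def check(self):
        """Return (HTTP status, body) for a readiness probe."""
        if self._ready.is_set():
            return 200, "ready"
        return 503, "not ready"


readiness = Readiness()
before_startup = readiness.check()
readiness.mark_ready()
after_startup = readiness.check()
```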

Reports Status

Goal: Provides health, readiness and status data (see Observability).

Hint: It isn’t enough to be available; availability must also be measurable.

Topic 4: Deployability

Topic Goal: To achieve rapid delivery of features, the component and associated infrastructure must be easily and rapidly deployable.

Continuous Integration

Goal: Component is built automatically whenever changes are pushed to the version control system (Continuous Integration).

Automated Deployment of Non-main Branches

Goal: Component can be deployed with no manual steps (other than authorizing the deployment) - to both test and production environments (Continuous Delivery).

Automated Deployment of Main Branch

Goal: Component is automatically deployed to production on merge to the main branch.

Alert Generation

Goal: Alerts are generated automatically on deployment failure.

Versioned Build Pipeline

Goal: Build-pipeline configuration is stored together with the code (under version control).

Versioned Deployment Pipeline

Goal: Deployment-pipeline configuration is stored together with the code (under version control).

Pipeline Independence

Goal: Build and deployment pipelines are isolated from the performance and stability of other components. Expressed differently: build and deployment are possible even when other services are not currently available.

Prerequisites: Integration tests mock all external components.

Hint: This is a primary use case for Pact (consumer-driven contract testing).
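The idea of mocking external components can be shown with Python's standard `unittest.mock`. This is plain mocking rather than Pact, which additionally verifies the mocked contract against the provider. The function `order_total` and the `get_prices` client method are hypothetical.

```python
from unittest.mock import Mock


def order_total(order_id: str, pricing_client) -> float:
    """Business logic under test; pricing_client wraps another component's API."""
    prices = pricing_client.get_prices(order_id)
    return sum(prices)


# In the pipeline the real pricing component is replaced by a mock,
# so the test passes even when that component is unavailable.
pricing_client = Mock()
pricing_client.get_prices.return_value = [10.0, 5.5]
total = order_total("order-1", pricing_client)
```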

Containerized Deployment

Goal: Support deployment to modern environments by packaging the component as a container image and supporting the organisation’s container-management system in deployment pipelines.

Rollback

Goal: Supports rapid rollback of a deployment.

Zero Downtime Deployment

Goal: Supports zero-downtime deployments, ie the component can be deployed during normal working hours.

Progressive Rollout

Goal: Supports gradual rollout of a deployment driven by health metrics (eg blue-green or automated canary deployments).

Note: This is an advanced capability: desirable, but not expected of every component.

Links: BlueGreen Deployment, Canary Releases.

Trusted Dependencies

Goal: Builds rely only on trusted image repositories (e.g. no downloads from arbitrary URLs in the build process).

Hint: Downloading from untrusted sources poses two risks: 1) Continuity Risk: The artefact may suddenly become unavailable and 2) Security Risk: It might be possible to perform a supply chain attack by replacing the artefact.

Pipeline Ownership

Goal: The component owner has the ability to modify the build and deployment pipelines.

Environment Consistency

Goal: Dev/Test/Prod environments are structurally as similar as possible.

Note: This item is concerned only with structural similarity, not with data and configuration from production. Non-production environments must never have production data or configuration.

Links: 12 Factor App: X. Dev/prod parity

Topic 5: Interoperability

Topic Goal: Support a scalable, loosely-coupled organisation-wide architecture.

Events Availability

Goal: Publishes events of significance via a message-broker (significance is defined by your team or by the organisation’s needs).

Event Format

Goal: Events are generic and self-descriptive. Topics and fields have descriptive names, and fields use enums instead of status codes.

Hint: Where relevant, event schemas should be registered in an appropriate registry (eg Confluent Schema Registry).
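To make “enums instead of status codes” concrete, here is a sketch of an event whose status field is serialised as a readable name rather than a number. The event type `OrderStatusChanged`, its fields and the JSON encoding are all assumptions for illustration.

```python
import json
from dataclasses import dataclass
from enum import Enum


class OrderStatus(Enum):
    PLACED = "PLACED"
    SHIPPED = "SHIPPED"
    CANCELLED = "CANCELLED"


@dataclass(frozen=True)
class OrderStatusChanged:
    order_id: str
    status: OrderStatus
    occurred_at: str  # ISO 8601 timestamp


def to_message(event: OrderStatusChanged) -> str:
    """Serialise the event with descriptive field names and a readable status."""
    return json.dumps({
        "order_id": event.order_id,
        "status": event.status.value,  # "SHIPPED", not an opaque code such as 3
        "occurred_at": event.occurred_at,
    })


message = to_message(
    OrderStatusChanged("order-1", OrderStatus.SHIPPED, "2024-01-01T09:00:00Z")
)
```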

Events are Documented

Goal: The events emitted by the component are documented in the appropriate place.

Schema Evolution

Goal: Has a documented plan for schema evolution and versioning to avoid communication breaking unexpectedly because of schema changes (backwards-compatibility).

Hint: A schema registry may assist in verifying backwards compatibility (if appropriately configured).

Links: Schema Evolution and Compatibility
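One common way to keep schema changes backwards-compatible is to add only optional fields with defaults, and to have consumers ignore fields they do not know. The sketch below shows a consumer written that way; the event `CustomerCreated` and the later-added field `preferred_language` are hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerCreated:
    customer_id: str
    name: str
    # Added in a later schema version; the default keeps old messages readable.
    preferred_language: str = "en"


def parse_customer_created(raw: str) -> CustomerCreated:
    """Parse a message, tolerating both missing optional and unknown extra fields."""
    data = json.loads(raw)
    return CustomerCreated(
        customer_id=data["customer_id"],
        name=data["name"],
        preferred_language=data.get("preferred_language", "en"),
    )


old_message = '{"customer_id": "c-1", "name": "Ada"}'
new_message = '{"customer_id": "c-2", "name": "Bob", "preferred_language": "de", "segment": "b2b"}'
old_event = parse_customer_created(old_message)
new_event = parse_customer_created(new_message)
```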

Self-Contained

Goal: The component is completely self-contained and provides its service via port binding.

Prerequisite: The deployable artifact is directly executable. An artifact which is deployed into some “host” application does not fulfil this.

Links: 12 Factor App: VII. Port binding

Code Documentation

Goal: Source code is meaningfully documented.

Hint: Documentation should help a reader of the code (including your future self) understand why a particular class/method exists or why particular design or configuration choices have been made. Code comments that simply describe what the code does (“getter/setter javadoc”) are not helpful.

API Documentation

Goal: Provides automated documentation generation for APIs.

Topic 6: Feature Toggles

Topic Goal: Support trunk-based development (which in turn supports rapid software deployment).

Hint: Trunk-based development means avoiding long-lived code branches; most branches should be merged to the main branch (and thus be released to production) within 2 days. Having not-yet-ready code active only when a feature-flag is enabled allows testing of code in non-production environments without having it active in production.

Supports Feature Toggles

Goal: Code-paths within the component can be enabled and disabled via external configuration at runtime.

Toggles are Off By Default

Goal: Uses feature toggles to enable features, not disable them (i.e. a feature is off by default).
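A minimal sketch of an off-by-default toggle read from external configuration. The `FEATURE_<NAME>` naming scheme and the accepted values are assumptions; many teams would use a dedicated toggle service instead of environment variables.

```python
import os


def is_enabled(feature: str, env=os.environ) -> bool:
    """Return True only if the toggle is explicitly switched on.

    A missing or unrecognised value means the feature stays off.
    """
    value = env.get(f"FEATURE_{feature.upper()}", "")
    return value.strip().lower() in {"1", "true", "on"}


def checkout(env=os.environ) -> str:
    """Choose a code-path based on the toggle (both paths are placeholders)."""
    if is_enabled("new_checkout", env):
        return "new checkout flow"
    return "existing checkout flow"
```

Because the toggle is looked up on every call, changing the configuration source takes effect without a code change.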

Toggle Removal

Goal: A process exists to ensure feature toggles are removed from the code as soon as the feature is complete.

Hint: Feature toggles are intended only to support merge of not-yet-production-ready code into the main branch.

Topic 7: Security

Topic Goal: Provide a service which protects data and operations.

Follow Organisation Security Guidelines

Goal: Component follows any organisation-relevant guidelines - and this is documented.

Documents Security Variance

Goal: Component documentation explicitly describes any security requirements beyond the minimum requirements.

Threat Modelling

Goal: Project documents the results of at least one threat modelling session.

Initial Security Review

Goal: Project documents the results of at least one independent security review.

Hint: The review can be carried out within the organisation or by an external party.

Recent Security Review

Goal: Project has had at least one security review in the last 2 years.

Standardized Authentication

Goal: The component authenticates incoming requests with the organisation standard, eg OpenID Connect.

Note: Applies only to components which need authentication - typically webservers. Business-tier components usually require authorization only.

Standardized Authorization

Goal: The component verifies that incoming requests have appropriate authorization to perform the requested operation, and does so via the organisation standard approach (eg OAuth2). Originating IP address is never used as the sole factor for authorization. Roles used for authorization are appropriately sized (no “superuser” permissions).

Links: Zero Trust

Document Roles

Goal: The set of roles used for authorization decisions is appropriately documented.

Credentials are Configured

Goal: Uses credentials/secrets only from a designated secret storage with appropriate access restrictions (does NOT embed secrets in code).

Supply-chain Security for Dependencies

Goal: Project has a process for regular review and reporting of dependency security issues, ie ensures that third-party libraries are kept up-to-date with security patches.

Hint: The use of an automated dependency-scanning tool on a regular schedule is a sufficient process.

Modern Dependencies

Goal: All dependencies are regularly updated to new versions, even in the absence of known security vulnerabilities.

Note: Security patches are often only available for newer versions of libraries; these may be hard to apply to components relying on older dependency versions. Keeping dependencies up-to-date is therefore good preparation for applying security fixes.

External Security-relevant Dependencies are Registered

Goal: Any external sites or other resources which the component interacts with are registered with the organisation’s relevant security tools.

Environment Isolation

Goal: Deployment environments (dev/test/prod/etc) are separated and do not allow artifacts or data to move between them except under approved conditions.

Hint: This forbids the use of production data in test environments, and use of production services from non-production environments.

Supply-chain Security for Base Images

Goal: When building container images, use only approved base images.

Hint: The organisation security team may not wish to allow every image to be used in production environments.

Topic 8: Observability

Topic Goal: To ensure component state can be inspected at runtime. This supports detection and analysis of problems.

Export Process Metrics

Goal: Export metrics that are related to technical state (eg JVM metrics for a Java application).

Export Business Metrics

Goal: Export metrics that are related to business processes, eg counts of active users.

Export SLO/SLA metrics

Goal: Export metrics related to SLO/SLAs (counts of failing requests, errors/exceptions emitted, response times, etc).

Has a Monitoring Dashboard

Goal: Project has a live dashboard which shows all important metrics related to the component - particularly whether SLOs/SLAs are satisfied.

Low-cardinality metrics

Goal: All exported metrics have labels with low cardinalities to avoid performance issues with the monitoring system.

Links: Further information on cardinality spikes and their (potential) impact
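A typical source of high cardinality is using the raw request path as a metric label, which creates one time series per entity ID. The sketch below normalises paths before they are used as a label value; the regular expression and the `{id}` placeholder are assumptions and would need adapting to the component's real routes.

```python
import re

# Matches a path segment that is purely numeric or looks like a UUID.
_ID_SEGMENT = re.compile(r"/(\d+|[0-9a-fA-F-]{36})(?=/|$)")


def route_label(path: str) -> str:
    """Replace numeric and UUID path segments with a placeholder.

    Thousands of distinct paths then collapse into a handful of label values.
    """
    return _ID_SEGMENT.sub("/{id}", path)


label = route_label("/customers/12345/orders/987")
```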

Topic 9: Traceability

Topic Goal: Component records audit/trace data to enable debugging of problems in production environments.

Generates Logs

Goal: Treats logs as event streams: Each running process writes its log-messages, unbuffered, to stdout. In staging or production deploys, the stream for each process will be captured by the execution environment, collated together with all other streams from the app, and routed to one or more final destinations for viewing and secure, read-only long-term storage.

Hint: Storing logs as files on the host of each component instance does not fulfil this requirement.

Links: 12 Factor App: XI. Logs
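With Python's standard `logging` module, writing log-messages to stdout could be set up roughly as follows. The logger name `component` and the message format are assumptions; the handler flushes after each record, so messages are not held back in a buffer by the logging layer.

```python
import logging
import sys


def configure_logging(level: int = logging.INFO) -> logging.Logger:
    """Send all log records for this component to stdout, not to files."""
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
    )
    logger = logging.getLogger("component")
    logger.handlers = [handler]  # replace any previously attached handlers
    logger.propagate = False     # avoid duplicate output via the root logger
    logger.setLevel(level)
    return logger
```

Capturing, collating and storing the stream is then left entirely to the execution environment.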

Log Layout

Goal: Uses log levels and log formats consistently (as agreed by tribe).

Distributed Tracing

Goal: Supports the organisation’s distributed request tracing tool.

Private Data Access Monitoring

Goal: Logs all interactions with personal data not belonging to the originator of the request. This is mandatory for data classified as PII under the GDPR. Such logs must be kept for at least 90 days.

Security Events

Goal: Logs security-relevant events (eg logon, logoff, configuration changes).

Topic 10: Alerts and Notifications

Topic Goal: Generate alerts and notifications for early problem detection and remediation.

Alert Policy

Goal: Project has a defined Alerting Policy following organisation guidelines, and the component complies with those guidelines.

Topic 11: Tech Fitness KPIs

Topic Goal: Provide visibility of project status to the owner and management.

KPI Policy

Goal: Project has documented goals for metrics Lead Time for Change (LTC), Deployment Frequency (DF), Change Fail Rate (CFR), and Mean Time to Recover/Restore (MTTR).

Measures KPIs

Goal: Project publishes its LTC, DF, CFR, and MTTR.
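As a rough illustration of how these four KPIs might be derived from deployment records, consider the sketch below. The `Deployment` record, its fields and the simplified definitions (eg lead time measured from commit to production deployment, time to restore measured from the failed deployment) are assumptions; organisations define these metrics in slightly different ways.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class Deployment:
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did the deployment cause a failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def _mean(deltas: List[timedelta]) -> Optional[timedelta]:
    """Average a list of durations; None if the list is empty."""
    return sum(deltas, timedelta()) / len(deltas) if deltas else None


def kpis(deployments: List[Deployment], period_days: int) -> dict:
    """Compute DF, LTC, CFR and MTTR for the given period."""
    failures = [d for d in deployments if d.failed]
    restored = [d for d in failures if d.restored_at is not None]
    return {
        "deployment_frequency_per_day": len(deployments) / period_days,
        "lead_time_for_change": _mean([d.deployed_at - d.committed_at for d in deployments]),
        "change_fail_rate": len(failures) / len(deployments) if deployments else None,
        "mean_time_to_restore": _mean([d.restored_at - d.deployed_at for d in restored]),
    }
```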

Topic 12: Communication

Topic Goal: Ensure other tribes and individuals within the organisation can get in touch with the owners of this component.

Contact Channel

Goal: Have a well-documented chat channel or email group through which questions or information can be sent to the component owners.

Topic 13: Workflow

Topic Goal: Use best-practice workflows for development.

Trunk-based Development

Goal: Use trunk-based development, ie branches used to develop features should be very short-lived (in most cases less than 2 days).

Hint: Feature toggles can allow code which is not yet production-ready to be merged into the main branch.

Links: Trunk Based Development

Footnotes

¹ The original list had a somewhat different name. I have changed the name for the purposes of this article.