Added v3.0 of the Azure REST API Guidelines #191

Merged
merged 13 commits into from
Apr 9, 2020
242 changes: 242 additions & 0 deletions azure/Guidelines.md
@@ -0,0 +1,242 @@
# Microsoft Azure REST API Guidelines

## History

| Date | Version | Notes |
| --- | --- | --- |
| 2020-Mar-31 | v3.1 | 1st public release of the Azure REST API Guidelines |

## Introduction

The Azure REST API guidelines are an extension of the [Microsoft REST API guidelines][1]. Readers of this document are assumed to be reading the [Microsoft REST API guidelines][1] as well and to be familiar with them. Azure guidance is a superset of the Microsoft API guidelines, and services should follow them *except* where this document outlines specific differences or exceptions. This document also contains additional Azure-specific guidance and details.

### Additional guidance for Azure Resource Manager resource providers

Teams building ARM Resource Providers (RPs) MUST follow the additional guidance in the ARM Resource Provider Contract (RPC) and related documents. These documents can be found here.

* [Azure Resource Manager Wiki][2] (Internal only)
* [Azure Resource Provider Contract][3]

ARM RPs are an Azure Fundamentals requirement for Azure services, and the ARM RP review is another mandatory review. Some of the guidance overlaps with the general API review, but passing one review will generally make the other one go very quickly.

## API definition

All services **MUST** provide an [OpenAPI Definition][OpenAPI Specification] (with [autorest extensions](https://github.com/Azure/autorest/blob/master/docs/extensions/readme.md)) that describes their service. The OpenAPI Specification is a key element of the Azure SDK plan and essential to improving the documentation, usability and discoverability of services.

## URL structure

In addition to the [URL structure guidance](https://github.com/microsoft/api-guidelines/blob/vNext/Guidelines.md#71-url-structure) in the Microsoft REST API guidelines, Azure has specific guidance about service exposure for multi-tenant services.

### URL structure

All services **MUST** expose their service to developers via the following URL pattern:

```
https://<service>.<cloud-instance>/<unit-of-multi-tenancy>/<service-defined-root>
```

Where:

* **service** - the name of the service such as "blobstore", "servicebus", "directory", or "management"
* **cloud-instance** - the DNS domain name at the root of the cloud instance. For instance, public Azure uses `azure.net`. Sovereign clouds use different domains.
* **service-defined-root** - the root of the service-specific path, such as "blobcontainer", "myqueue", etc.
* **unit-of-multi-tenancy** - refers to a globally unique moniker that identifies a unique container in the Azure service that has the following properties:
  * This container is the boundary of isolation between different tenants of the service.
  * Quotas are set and enforced at the level of this container, but there will be different limits for different operations, and operations will be service specific.
  * Resources in the service are attached to this container and are tied to this container in terms of lifecycle. For example, when someone signs up, they get this container. If they unsubscribe (or don’t pay their bills), the container and the resources associated with it are cleaned up. Cleanup follows a state machine: the container and the resources attached to it are deactivated first (and can be easily restored if required), and if there is no response for some period, they are deleted.
  * It is the container for billing, which means the owner of this container sees one bill for the resource usage of all Azure services under this container’s identifier.

For Azure PaaS services like SQL Azure, Azure Storage, Caching, etc., the unit of multi-tenancy is the Azure subscription id, which is a GUID. This ensures consistent access using the same URL pattern, and identifier, across all these services.

When services produce URLs in response headers or bodies, they **MUST** use a consistent form – either always a GUID for tenant identifier or always a single verified domain - regardless of the URL used to reach the resource.
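As an illustration of the URL pattern above, the following minimal Python sketch composes a service URL from its parts. The function name `build_service_url` and its parameter names are hypothetical and not part of the guidelines; the sample values are the ones used elsewhere in this document.

```python
from urllib.parse import quote


def build_service_url(service: str, cloud_instance: str,
                      tenant: str, service_root: str) -> str:
    """Compose https://<service>.<cloud-instance>/<unit-of-multi-tenancy>/<service-defined-root>."""
    # The tenant identifier is a single path segment (a GUID or a verified domain),
    # so percent-encode anything that could be mistaken for a path separator.
    tenant_segment = quote(tenant, safe="")
    # The service-defined root may itself contain several path segments.
    root = service_root.lstrip("/")
    return f"https://{service}.{cloud_instance}/{tenant_segment}/{root}"


print(build_service_url("blobstore", "azure.net", "contoso.com",
                        "account1/container1/blob2"))
```

Keeping the tenant identifier as its own argument makes it harder to mix the GUID form and the verified-domain form by accident when a service emits URLs.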

### Direct endpoint URLs

In addition to the required format above, services **MAY** also choose to expose a direct endpoint for performance or routing reasons. The direct endpoint should be discoverable by clients, to ensure that developers are presented with a consistent pattern for accessing Azure services.

The format of the root of the direct endpoint **MUST** be as follows:

```
https://<tenant-id>-<service-defined-root>.<service>.azure.net
```

Clients discover the direct endpoint as follows:

1. A request is made to the default endpoint (GET or HEAD). For example:

```
GET https://blobstore.azure.net/contoso.com/account1/container1/blob2
```

2. That request is returned with the `Content-Location` header set to the direct endpoint. See [RFC2557]:

```
200 OK
Content-Location: https://contoso-dot-com-account1.blobstore.azure.net/container1/blob2
```

Or, with the GUID format:

```
200 OK
Content-Location: https://00000000-0000-0000-C000-000000000046-account1.blobstore.azure.net/container1/blob2
```
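A client could pick up the direct endpoint from the `Content-Location` header roughly as follows. This is a sketch only: the helper name `direct_endpoint_root` is hypothetical, and the headers are assumed to arrive as a plain name-to-value mapping from whatever HTTP library the client uses.

```python
from typing import Mapping, Optional
from urllib.parse import urlsplit


def direct_endpoint_root(headers: Mapping[str, str]) -> Optional[str]:
    """Return the scheme and host of the Content-Location header, or None if unusable."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    location = lowered.get("content-location")
    if not location:
        return None
    parts = urlsplit(location)
    # Only accept absolute https URLs as a direct endpoint.
    if parts.scheme != "https" or not parts.netloc:
        return None
    return f"{parts.scheme}://{parts.netloc}"


response_headers = {
    "Content-Location": "https://contoso-dot-com-account1.blobstore.azure.net/container1/blob2",
}
print(direct_endpoint_root(response_headers))
```

A client that gets `None` back simply keeps using the default endpoint.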

## Versioning

All Azure APIs **MUST** use explicit versioning. The Microsoft REST API guidelines offer different options on how to specify an API version and guidance on what constitutes a breaking change. This section of the Azure API guidelines describes updates to those guidelines to ensure consistency between Azure services across Azure Stack, public Azure, and sovereign clouds.

### Specifying the version in Azure

The Microsoft REST API guidelines give two options for how services and clients communicate the version: a URL segment and a query parameter. Azure services **MUST** use the `api-version` query parameter. For example:

```
GET https://blobstore.azure.com/foo.com/acct1/c1/blob2?api-version=1.0
PUT https://blobstore.azure.com/foo.com/acct1/c1/b2?api-version=2014-12-07
POST https://blobstore.azure.com/foo.com/acct1/c1/b2?api-version=2015-12-07
```
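On the client side, attaching the `api-version` query parameter can be sketched with the Python standard library as below. The helper name `with_api_version` is hypothetical; the sketch assumes that an existing `api-version` value should be replaced rather than duplicated.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_api_version(url: str, api_version: str) -> str:
    """Return url with exactly one api-version query parameter set to api_version."""
    parts = urlsplit(url)
    # Keep every existing parameter except a previous api-version value.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "api-version"]
    query.append(("api-version", api_version))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_api_version("https://blobstore.azure.com/foo.com/acct1/c1/b2", "2014-12-07"))
```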

### Breaking changes in Azure

A breaking change is any change in the API that may cause client or service code making the API call to fail. Obvious examples of such a change are the removal of an endpoint, adding or removing a required field or changing the format of the body (from XML to JSON for example).

Even though we recommend clients ignore new fields, there are many libraries and clients that fail when new fields are introduced. Azure services **MUST** update the version number of their API even when adding optional fields. In fact, servers should be as strict as possible. Ignoring a field can result in the API accepting content that contained a typo or an element at the wrong level of nesting. If the ignored field changes the semantics (for example, we have seen cases where security settings were misplaced and ignored, leaving the resources more exposed than intended), this can be a huge and hard-to-discover error.

At a high level, changes to the contract of an API constitute a breaking change. Changes that impact backwards compatibility of an API are also considered breaking changes. Anything that would violate the Principle of Least Astonishment is considered a breaking change in Azure. Below are some concrete examples of what constitutes a breaking change. In each of these scenarios, the API version must be changed.

#### Existing property is removed

If a property called `foo` that was present in v1 of the API needs to be removed, it must be done in a newer API version.

Member

I wish this section started with a strong statement describing why compatibility is important to gaining trust with enterprise customers and how we will be striving for 100% compat. Right now, it reads like a prescription for trivial breaking changes, like removing a "foo" field. If you want to use an example of a breaking change, I would suggest examples related to closing a security hole, privacy issue, or geo-political issue, which by the way are the only API breaking changes we should consider. These changes improve the trust we build with customers, and so they offset the loss of trust caused by breaks.

Contributor

Agreed. We have such guidance currently in another doc, and want to merge that either here or up one level in the Microsoft guidelines in a future PR.

Contributor Author

In reading this the first time, I agreed - we need a more meta "what will be considered for a breaking change" and then anything else is automatically not a breaking change. I think that wording can be a follow-on after an API review board meeting.

Contributor

Yes. I believe this document was originally written specifically for resource providers/RPC. Which both means that it was more specific than general guidance (both based on specific issues that we wanted to prevent before a team even came to the board as well as things that are extra problematic for resource providers). In other words, it was a collection of case law rather than a holistic document.


#### New property added to response

If a new property/field is added to the response of an API, the GET-PUT pipeline will be broken. Consider the case where a customer updates the value of a new property "A" from the Azure portal. Another customer does a GET of this resource using the SDK. The SDK will ignore the property since it does not understand it. From the SDK, the customer does a PUT using the model that was returned from the GET. This will overwrite the change made by the first customer from the portal.
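The lost update described above can be reproduced in a few lines. This is an illustrative sketch only: `WidgetV1` is a hypothetical SDK model generated before property "A" existed, and the server is simulated with a plain dictionary.

```python
from dataclasses import asdict, dataclass


@dataclass
class WidgetV1:
    """Hypothetical SDK model that only knows the properties of the older contract."""
    name: str
    size: int

    @classmethod
    def from_json(cls, body: dict) -> "WidgetV1":
        # Unknown properties (such as "A") are silently dropped during deserialization.
        return cls(name=body["name"], size=body["size"])


# Server-side state after the first customer set the new property "A" in the portal.
server_resource = {"name": "w1", "size": 1, "A": "set-in-portal"}

# Second customer: GET through the SDK, change a known field, PUT the model back.
model = WidgetV1.from_json(server_resource)
model.size = 2
put_body = asdict(model)

# PUT replaces the whole resource, so the value of "A" is lost.
server_resource = put_body
print(server_resource)
```

An SDK that round-trips unknown properties avoids this particular failure, but that only works if the service accepts those properties on the way back in.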

Contributor

This seems to be increasingly at-odds with standard web development practices. I'm not sure how long Azure will be able to hold the line on this.

Contributor Author

It's certainly a topic that can be visited during the update meetings that are planned.

Member

The Graph SDKs are able to avoid this problem because unknown properties retrieved from the server are round tripped. This follows the guidelines defined by the Atom publishing protocol. It is worth mentioning that Microsoft Graph does not make extensive use of PUT.

Contributor

And please note that we've had instances where a service would reject a call with a read-only property that a client didn't know, but still tried to round-trip. The requirement for client behavior has to be in lock step with how a service is to treat unknown or read-only properties.


#### New required property added to request

If a new property is made required in the request body, clients will have no way to set this and the request will fail.

#### Property name has changed

Note that this is implied by the requirement that adding and removing properties are breaking changes, but it is in some ways worse, since it leads to the possibility of reusing a property name. Even with an API version change, this change is discouraged because it creates documentation and cognitive challenges.

#### Property type has changed

Property `foo` was a boolean in v1 but is changed to a string. A client using the existing API version tries to set it as a boolean, but the service will fail since it's now expecting a string. So, the API version must be updated.

#### Property default value has changed

If a property is optional and the service provides a default value, changing that default requires an updated API version.

#### Allowed values for an enum have changed

Enum “foo” allowed the values “val1” and “val2” in v1 of the API. Now, the values accepted by the service are “val1”, “val2” and “val3”. The client will fail to deserialize if “val3” comes back in the response.
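The enum failure can be illustrated with a client model written against v1. The `Foo` class and `parse_foo` helper below are hypothetical stand-ins for generated SDK code; the behavior shown is that of Python's standard `enum` module.

```python
from enum import Enum


class Foo(Enum):
    """Client-side enum generated from v1 of the API, which only knew two values."""
    VAL1 = "val1"
    VAL2 = "val2"


def parse_foo(raw: str) -> Foo:
    # Enum lookup by value raises ValueError for values the client does not know.
    return Foo(raw)


print(parse_foo("val1"))
try:
    parse_foo("val3")
except ValueError as error:
    print(f"deserialization failed: {error}")
```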

#### API has been removed or renamed

V1 of API contract supported `PUT /resourceType1/{resourceType1_name}` but the service no longer supports this method. This scenario should follow the proper Azure API deprecation policy and must be done in an updated API version.

#### Behavior of existing API has changed

There is a functional change in what the API was doing. This is a complex issue because maintaining the old behavior is sometimes not an easy option, even on an old API version. It is also very confusing to end users, even when the version is updated and the change is documented. Behavior changes need to be well justified and discussed on a case-by-case basis.

#### Error contracts have changed

#### Property is made required (from optional)

If property “foo” was optional in the request body of v1 and now it is required, this should result in an API version change. If not changed, clients relying on the older API version will fail if this property is not passed.

#### URL format has changed

Resource parameter names change from `/resourceType1/{resourceType1_name}` to `/resourceType1/{resourceType1_id}`. This will impact code generation.

#### Resource naming rules should not change

This could result in failures which would have earlier succeeded. Even if the rules become less strict, clients relying on earlier name constraints to perform local validation will fail.

### Non-Breaking Changes

The following changes are considered backwards compatible and hence non-breaking.

#### Adding new APIs to an existing service

When a new resource type is added, the API version does not need to be updated for existing types.

Contributor

This is ARM/RPC specific (based on the use of the term "resource type")


#### Bug fixes to existing API

Bug fixes to an existing API that don’t fall into one of the categories of breaking changes described above are fine.

Contributor

@johanste, Apr 6, 2020

This implies that the list of breaking changes above is exhaustive. It is not.

Contributor Author

I think we want to completely restructure the breaking change section as a first order of business.


### Group versioning in Azure and Azure Stack

Azure Stack allows customers and hosters to deploy their own small versions of Azure and upgrade them at a different pace than Azure. To make it possible to write an application or SDK that targets both Azure and Azure Stack, additional versioning policy is necessary. Contact the Azure Stack team for further guidance.

### Version discovery

Simpler clients may be hardcoded to a single version of a service. Because Azure services offer each version for a well-known period of time, a regularly maintained client can stay operational without further complexity, as long as regular maintenance moves it forward to new versions before older ones are retired.

API version discovery is needed when either a given hosted service may expose a different API version to different clients (e.g. latest API version only available in certain regions or to certain tenants) or the service itself may exist in different instances (e.g. a service that may be run on Azure or hosted on-premises). In both of those cases clients may get ahead of services in the API version they use. It might also be possible for a client version to ship ahead of its corresponding service update, leading to the same situation. Lastly, version discovery is useful for clients that want to warn operators that an API they depend on may expire soon.

Member

would you really want teams to do this to their customers?

Contributor Author

I wouldn't :-) I'll add this to the list of things to discuss with the board.

Contributor

Anecdotally, we've shipped client libraries before the service was publicly available (it had not left private preview). It has not ended well. We are definitely shipping libraries before the API version is available in all clouds, however (Azure Stack being the prime example of where it will never be possible to be 100% in lock step with public Azure).


Azure services **SHOULD** support API version discovery. If they support it:

Contributor

As a random point, I really like this section and the features it describes. Are there libraries to help with this, or is it all an exercise for the reader? This seems like something that deserves a speclet (or even a DRAFT RFC!) on its own.

Contributor Author

I wrote a javascript library for this a couple of years ago, but never got around to putting it into practice. The reality is that I locked the code down to a specific API and then depended on the contract not changing.

Contributor

@cleemullins, are you asking about libraries to help the service implement the behavior, or for guidance/libraries that help clients dynamically discover and light up features based on version availability?


1. Services **MUST** support HTTP `OPTIONS` requests against all resources, including the root URL for a given tenant, or the global root if no tenant identity is tracked or the service is not multi-tenant.
2. Services **MUST** include the `api-supported-versions` header, containing a comma-separated list of versions conforming to the Azure versioning scheme. This list must include all group versions as well as all major-minor versions supported by the target resource. For cases where no specific version applies (e.g. sometimes the root resource), the list still must contain the group versions supported by the service.
3. If a given service supports versions of the API that are known to be planned for deprecation in a year or less, it **MUST** include those versions (group and major.minor) in the `api-deprecated-versions` header.
4. In addition to the functionality described here, services **MAY** support HTTP `OPTIONS` requests for other purposes such as further discovery, CORS, etc.
5. Services **MAY** allow unauthenticated HTTP `OPTIONS` requests. When doing so, authors need to consider whether HTTP `OPTIONS` requests against non-existing resources result in 404s and whether that is leaking sensitive information. Certain scenarios, such as support for CORS pre-flight requests, require allowing unauthenticated HTTP `OPTIONS` requests.
6. For services that do rolling updates where there is a point in time where some front-ends are ahead of others version-wise, all front-ends **MUST** report the previous version as the latest version until the rolling update covers all instances and only then switch over to reporting the new latest version. This ensures that clients will not detect a version and then get load-balanced into a front-end that does not support it yet.
7. If using OData and addressing an expanded resource, the HTTP `OPTIONS` request **SHOULD** return the group versions that are supported across the expanded set.
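One way to satisfy the rolling-update requirement in item 6 is to advertise only the versions that every front-end already supports. The sketch below is an assumption about how a service might compute that list; the function name `advertised_versions` and the way per-front-end version sets are collected are hypothetical.

```python
from typing import Iterable, List, Set


def advertised_versions(front_end_versions: Iterable[Set[str]]) -> List[str]:
    """Return the versions supported by every front-end, in sorted order."""
    version_sets = list(front_end_versions)
    if not version_sets:
        return []
    # A version is only safe to advertise once all front-ends can serve it.
    common = set.intersection(*version_sets)
    return sorted(common)


# Mid-rollout: one front-end is already upgraded, the other is not.
print(advertised_versions([{"2011-08", "2012-02"}, {"2011-08"}]))
```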

Example request to discover versions (blob storage container list API):

```
OPTIONS /?comp=list HTTP/1.1
host: accountname.blob.core.azure.net
```

Example response:

```
200 OK
api-supported-versions: 2011-08,2012-02,1.1,2.0
api-deprecated-versions: 2009-04,1.0
Content-Length: 0
```
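A client could turn the two headers from the example response into version lists along these lines. This is a minimal sketch: the function names are hypothetical, and the headers are assumed to be available as a name-to-value mapping.

```python
from typing import Dict, List, Mapping


def parse_version_list(header_value: str) -> List[str]:
    """Split a comma-separated version header into a list without empty entries."""
    return [item.strip() for item in header_value.split(",") if item.strip()]


def read_version_headers(headers: Mapping[str, str]) -> Dict[str, List[str]]:
    """Extract supported and deprecated versions from an OPTIONS response."""
    # Header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "supported": parse_version_list(lowered.get("api-supported-versions", "")),
        "deprecated": parse_version_list(lowered.get("api-deprecated-versions", "")),
    }


response_headers = {
    "api-supported-versions": "2011-08,2012-02,1.1,2.0",
    "api-deprecated-versions": "2009-04,1.0",
    "Content-Length": "0",
}
versions = read_version_headers(response_headers)
print(versions["supported"], versions["deprecated"])
```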

Clients that use version discovery are expected to cache version information. Since there’s a year of lead time after an API version shows up in the `api-deprecated-versions` header before it’s removed, checking once a week should provide sufficient lead time to client authors or operators. In the rare case where a server rolls back a version that clients are already using, the service will reject requests because they are ahead of the latest version supported. Whenever a client sees a `version-too-new` error, it should re-execute its version discovery procedure.
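The caching behavior described above might look like the following on the client. This is a sketch under assumptions: `VersionCache` is a hypothetical class, the `discover` callable stands in for the `OPTIONS` request, and the one-week refresh interval is the cadence suggested in the text.

```python
import time
from typing import Callable, List, Optional

WEEK_SECONDS = 7 * 24 * 60 * 60


class VersionCache:
    """Caches discovered API versions and refreshes them at most once a week."""

    def __init__(self, discover: Callable[[], List[str]],
                 now: Callable[[], float] = time.time) -> None:
        self._discover = discover  # performs the actual version discovery
        self._now = now            # injectable clock, useful for testing
        self._versions: Optional[List[str]] = None
        self._fetched_at = 0.0

    def versions(self) -> List[str]:
        """Return cached versions, re-running discovery when the cache is stale."""
        stale = (self._versions is None
                 or self._now() - self._fetched_at >= WEEK_SECONDS)
        if stale:
            self.refresh()
        return self._versions

    def refresh(self) -> None:
        """Force discovery, for example after a version-too-new error."""
        self._versions = self._discover()
        self._fetched_at = self._now()
```

A caller would invoke `refresh()` whenever a request fails with `version-too-new`, then retry with a version from the updated list.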

Contributor

Is it still a minimum of one year?

Member

I'd love to get clarification on this time frame - it comes up often.

Contributor Author

Right now, I don't think the guidance of one year has officially changed. @johngossman - do you have any insight here?


## Long running operations

The Microsoft REST API guidelines for Long Running Operations are an updated, clarified and simplified version of the Asynchronous Operations guidelines from the 2.1 version of the Azure API guidelines. Unfortunately, to generalize to the whole of Microsoft and not just Azure, the header used in the operation was renamed from `Azure-AsyncOperation` to `Operation-Location`. Services **SHOULD** support both the `Azure-AsyncOperation` and `Operation-Location` headers, even though they are redundant, so that existing SDKs and clients will continue to operate. Clients that call these services **SHOULD** look for both headers and prefer the `Operation-Location` version. Both headers **MUST** return the same value.
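The client-side preference between the two headers can be sketched as follows. The helper name `operation_status_url` is hypothetical, and the headers are assumed to be available as a name-to-value mapping.

```python
from typing import Mapping, Optional


def operation_status_url(headers: Mapping[str, str]) -> Optional[str]:
    """Return the long running operation URL, preferring Operation-Location."""
    # Header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    # Fall back to the older Azure-AsyncOperation header only when needed.
    return lowered.get("operation-location") or lowered.get("azure-asyncoperation")


print(operation_status_url({"Azure-AsyncOperation": "https://example.invalid/operations/1"}))
```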

## API deprecation policy

Disabling a runtime REST API that customers depend on has the potential to break their applications or services, perhaps even mission-critical services. But inevitably our APIs will become obsolete, and the cost of supporting them and operating the servers on which they run will require us to deprecate and shut them down. We have a public policy that describes how we will inform customers that deprecation is coming and help them move their applications off these services and on to their replacements.

### Policy

Azure does not have a single SLA for how long we will support all services. However, we have published expectations such as [the Azure Modern Lifecycle Policy][6]. The most relevant section of the document:

> For products governed by the Modern Lifecycle Policy, Microsoft will provide a minimum of 12 months' notification prior to ending support if no successor product or service is offered—excluding free services or preview releases.

In practice, we have found this is a bare minimum of how long service endpoints must be supported. Services with any significant usage **SHOULD** expect to run until customers are no longer using them, which can be 10 years or more.

Service teams **MUST** contact the Azure API review board before communicating the deprecation externally to customers and partners (which starts the 12-month clock).

Refer to the Azure deprecation policy for more details.

Member

link

Contributor Author

Ref: @johngossman Is there a link I can link to for internal teams here?


#### Special case for pre-release and beta APIs

Pre-release and beta APIs are not covered by the normal API deprecation policy. Each team providing a preview API **SHOULD** communicate to customers what the policy is going to be for support and deprecation, even if that policy is “we may remove this at any time”. The Azure REST API Guidelines cover pre-release API versions. To summarize that section, they should be marked with a version tag like `2013-03-21-Preview`.
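Tooling that needs to tell preview versions apart could check for the tag shown above. The sketch below assumes the date-based form `YYYY-MM-DD` with an optional `-Preview` suffix, as in `2013-03-21-Preview`; other version formats are not handled, and the function name `is_preview_version` is hypothetical.

```python
import re

# Date-based version with an optional "-Preview" suffix (assumed format).
_DATE_VERSION = re.compile(r"^\d{4}-\d{2}-\d{2}(-preview)?$", re.IGNORECASE)


def is_preview_version(api_version: str) -> bool:
    """Return True for versions tagged as preview, False for regular date versions."""
    match = _DATE_VERSION.match(api_version)
    if match is None:
        raise ValueError(f"unrecognized version format: {api_version!r}")
    return match.group(1) is not None


print(is_preview_version("2013-03-21-Preview"))
```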

Though services may set their own deprecation policy for pre-release APIs, they should monitor these endpoints closely and consider following the normal deprecation policy and process. ***Customers have suffered downtime because of deprecation of preview APIs.***

<!-- Links -->
[1]: https://github.com/microsoft/api-guidelines
[RFC2557]: https://www.ietf.org/rfc/rfc2557.txt

<!-- Azure ARM Links -->
[2]: https://aka.ms/armwiki
[3]: https://github.com/Azure/azure-resource-manager-rpc

<!-- Open API Spec -->
[OpenAPI Specification]: https://github.com/Azure/adx-documentation-pr/wiki/Getting-started-with-OpenAPI-specifications

<!-- Versioning Guidelines -->
[6]: https://support.microsoft.com/en-us/help/30881