Definition of RESTful API design
Here are a few important terms that I will use throughout the document:
Resource: A single instance of an object. For example, an animal.
Collection: A collection of objects of the same type. For example, animals.
HTTP: Network communication protocol.
Consumer: A client application that can make HTTP requests.
Third-party developers: Developers who are not members of your project team but want to use your data.
Service: An HTTP service or application that consumers can access over the network.
Endpoint: An API URL on the server, which can represent a single resource or an entire collection.
Idempotence: An operation is idempotent if repeating it any number of times leaves the server in the same state as performing it once.
URL segment: the information segment separated by a slash ("/") in the URL.
Data abstraction and design
We need to determine the database design and the functions that the web service provides. With that functional design in hand, we can design the API well. The API should abstract the business logic and data as much as possible so that users can get started easily.
Request method
You certainly know GET and POST, the two most common request methods. POST is so popular that it has made its way into everyday language: even people who do not know how the Internet works know that they can "post" content in Moments.
There are four and a half very important HTTP verbs that you need to understand. I say "half" because PATCH and PUT are very similar, and API developers often treat the two as one.
GET (SELECT): Retrieve a specific resource, or a list of resources, from the server.
POST (CREATE): Create a new resource on the server.
PUT (UPDATE): Update a resource on the server, providing the entire resource.
PATCH (UPDATE): Update a resource on the server, providing only the changed attributes.
DELETE (DELETE): Remove a resource from the server.
The following are two less common HTTP request methods:
HEAD - Retrieve the metadata of the resource, such as the hash value of the data or the time of the last update.
OPTIONS - Retrieve information about operations that consumers can perform on resources.
A good RESTful API uses these four and a half HTTP request methods to let third parties interact with the data it provides, and never includes actions in a URL segment.
Generally, GET requests can be cached, and they often are. Browsers, for example, cache GET requests (depending on the cache headers) and will go as far as prompting the user before repeating a POST request. A HEAD request is basically a GET request without a response body, and it can be cached as well.
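To make the difference between PUT and PATCH described above concrete, here is a minimal sketch on an in-memory record. It is an illustration only, not a server: the helper names `put` and `patch` and the sample zoo record are assumptions made up for this example.

```python
def put(stored: dict, payload: dict) -> dict:
    """PUT: the payload is the complete new representation of the resource."""
    # Attributes missing from the payload are dropped; only the
    # server-assigned ID is carried over from the stored copy.
    return {**payload, "id": stored["id"]}


def patch(stored: dict, payload: dict) -> dict:
    """PATCH: the payload carries only the attributes that changed."""
    # Attributes missing from the payload keep their stored values.
    return {**stored, **payload}


# Hypothetical stored resource, made up for this example.
zoo = {"id": 7, "name": "City Zoo", "city": "Springfield"}

after_put = put(zoo, {"name": "Metro Zoo"})
after_patch = patch(zoo, {"name": "Metro Zoo"})

print(after_put)    # the "city" attribute is gone
print(after_patch)  # the "city" attribute is preserved
```

Sending the same partial payload therefore has very different results: with PUT the client is expected to send every attribute it wants to keep.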
Version control
No matter what you build, and no matter how much you plan in advance, your core application will change, the relationships between your data will change, and attributes will be added to and removed from your resources. This is simply how software development works, especially when your project is alive and used by many people (which is likely if you are building an API).
Remember that an API is a contract between the server and the consumer. If you change your API in a backward-incompatible way, you break that contract, and the consumers of your API will resent you for it. Worse, they may stop using your API altogether. To keep your application evolving while keeping consumers happy, you need to keep old versions accessible while occasionally introducing new ones.
By the way, if you are simply adding new features to your API, such as new attributes on a resource (attributes that are not required, so the resource works without them) or new endpoints, you do not need to increment your API version number, because these changes do not break backward compatibility. Of course, you still need to update your API documentation (your contract).
Over time, you can deprecate old versions of the API. Deprecating a version does not mean shutting it down immediately or letting its quality slip; it means informing consumers that the old version will be removed on a specific date and that they should upgrade to the new version.
A good RESTful API tracks the version in the URL. Another common solution is to put the version number in a request header, but after working with many different third-party developers, I can tell you that putting the version in a request header is not as easy as adding it directly to a URL segment.
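As a rough sketch of what tracking the version in a URL segment can look like on the server side, the function below splits a request path into a version number and the remaining path. The name `split_version` and the rule that the version is always the first segment (as in /v1/zoos) are assumptions for this example.

```python
import re

# Assumed convention: the version is the first URL segment, e.g. /v1/zoos.
_VERSION_SEGMENT = re.compile(r"^/v(\d+)(/.*)?$")


def split_version(path: str) -> tuple[int, str]:
    """Return (version, remaining_path) for a path such as /v1/zoos."""
    match = _VERSION_SEGMENT.match(path)
    if match is None:
        raise ValueError(f"no version segment in {path!r}")
    # A bare "/v1" has no remaining path, so fall back to the root.
    return int(match.group(1)), match.group(2) or "/"


version, rest = split_version("/v1/zoos")
print(version, rest)
```

A router can then dispatch on the version first and hand the remaining path to the handlers for that version, which keeps old versions reachable while new ones are introduced.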
Analytics
API analytics means continuously tracking which versions and endpoints of your API consumers are using. This can be as simple as incrementing a counter in a database on each request. There are many reasons why tracking API usage is a good idea, such as knowing which API calls are used most so that you can make sure they stay efficient.
To build an API that third-party developers like, the most important thing is that when you decide to deprecate a version of the API, you can actually contact the developers who are still using the deprecated features. This is the perfect way to remind them to upgrade before you shut down the old version of the API.
The process of notifying third-party developers can be automated, for example by e-mailing a developer every time they make 10,000 requests to a deprecated feature.
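The counting and notification idea can be sketched in a few lines. This is a minimal illustration under several assumptions: an in-memory `Counter` stands in for the database, `UsageTracker` and its parameters are hypothetical names, and the `notify` callback is a placeholder for whatever actually sends the e-mail.

```python
from collections import Counter

NOTIFY_EVERY = 10_000  # threshold taken from the example in the text


class UsageTracker:
    """Counts requests per (developer, version, endpoint)."""

    def __init__(self, deprecated_versions, notify, notify_every=NOTIFY_EVERY):
        self.counts = Counter()  # stand-in for a database table
        self.deprecated_versions = set(deprecated_versions)
        self.notify = notify  # hypothetical callback, e.g. sends an e-mail
        self.notify_every = notify_every

    def record(self, developer, version, endpoint):
        key = (developer, version, endpoint)
        self.counts[key] += 1
        # Only deprecated versions trigger a reminder, and only on
        # every notify_every-th request for the same key.
        if (version in self.deprecated_versions
                and self.counts[key] % self.notify_every == 0):
            self.notify(developer, version, endpoint, self.counts[key])


# Usage with a small threshold so the effect is visible quickly.
sent = []
tracker = UsageTracker({"v1"}, lambda *args: sent.append(args), notify_every=3)
for _ in range(7):
    tracker.record("dev@example.com", "v1", "GET /zoos")
print(len(sent))  # number of reminders queued so far
```

In a real service the counter would live in shared storage so that all server processes see the same totals.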
API Root URL
Believe it or not, the root address of your API is very important. Imagine a developer who takes over an old project that uses your API and wants to build a new feature, but has no idea what services you provide. Fortunately, they do know the list of URLs that the client calls. It is important to keep your API root entry point as simple as possible, because a long and complex URL will look daunting and may turn developers away.
Here are two common URL root examples:
https://example.org/api/v1/*
https://api.example.com/v1/*
If your application is very large or you expect it to become very large, then placing the API under a subdomain (such as api.example.com) is usually a good choice. This approach can maintain some flexibility in scale.
But if you think your API will not become very large, or you just want to make the application easier to set up (for example, you want to use the same framework to serve both the site and the API), putting your API under the root domain (such as example.com/api) also works.
It is usually a good idea to have some content at the API root. GitHub's API root is a typical example. Personally, I am a fan of having the root URL return information that is useful to many people, such as where to find the API documentation.
Please also pay attention to the HTTPS prefix. A good RESTful API is always published based on HTTPS.
Endpoint
An endpoint is a URL that points to a specific resource or collection of resources.
Suppose you are building a fictitious API that represents several different zoos, where each zoo has many animals (each animal being of a particular type) and employees. You might have the following endpoints:
https://api.example.com/v1/zoos
https://api.example.com/v1/animals
https://api.example.com/v1/animal_types
https://api.example.com/v1/employees
For each endpoint, you need to list the valid combinations of HTTP verb and endpoint. The following is a semi-comprehensive list of the operations our fictitious API can perform. Note that I put the HTTP verb in front of each endpoint, which is the same notation used in the first line of an HTTP request.
GET /zoos: List all zoos (ID and name only, not too much detail)
POST /zoos: Create a new zoo
GET /zoos/ZID: Retrieve the details of the specified zoo
PUT /zoos/ZID: Update the specified zoo (entire object)
PATCH /zoos/ZID: Update the specified zoo (partial object)
DELETE /zoos/ZID: Delete the specified zoo
GET /zoos/ZID/animals: List the animals in the specified zoo (ID and name only)
GET /animals: List all animals (ID and name only)
POST /animals: Create a new animal
GET /animals/AID: Retrieve the details of the specified animal
PUT /animals/AID: Update the specified animal (entire object)
PATCH /animals/AID: Update the specified animal (partial object)
GET /animal_types: List all animal types (ID and name only)
GET /animal_types/ATID: Retrieve the details of the specified animal type
GET /employees: List all employees (ID and name only)
GET /employees/EID: Retrieve the details of the specified employee
GET /zoos/ZID/employees: List the employees of the specified zoo
POST /employees: Create a new employee
POST /zoos/ZID/employees: Hire an employee at the specified zoo
DELETE /zoos/ZID/employees/EID: Remove (fire) the specified employee from the specified zoo
In the list above, ZID is the ID of a zoo, AID the ID of an animal, EID the ID of an employee, and ATID the ID of an animal type. It is a good idea to include a key like this in your documentation for whatever conventions you use.
For brevity, I omitted the common API URL prefix in the examples above. That is fine for communication, but in actual API documentation you should always include the complete path (e.g., GET https://api.example.com/v1/animal_types/ATID).
Notice how the relationships between data are expressed, especially the many-to-many relationship between employees and zoos. Adding an extra URL segment makes more specific interactions possible. Of course, there is no HTTP verb for firing a person, but performing a DELETE on an employee within a zoo achieves the same effect.
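One way to picture how a server maps a verb and endpoint combination to an action is a small route table. The sketch below covers only a few of the endpoints listed above; the handler names and the `match` function are hypothetical, and a real framework would normally provide this routing for you.

```python
import re

# A few of the endpoints from the list above. The handler names are
# hypothetical labels, not functions defined anywhere.
ROUTES = [
    ("GET", "/zoos", "list_zoos"),
    ("POST", "/zoos", "create_zoo"),
    ("GET", "/zoos/ZID", "get_zoo"),
    ("GET", "/zoos/ZID/animals", "list_zoo_animals"),
    ("DELETE", "/zoos/ZID/employees/EID", "fire_employee"),
]

# Placeholders follow the key used in the text: ZID, AID, EID, ATID.
PLACEHOLDER = re.compile(r"^[A-Z]+ID$")


def match(method, path):
    """Return (handler_name, params) for the first matching route, else None."""
    segments = path.strip("/").split("/")
    for route_method, pattern, handler in ROUTES:
        parts = pattern.strip("/").split("/")
        if route_method != method or len(parts) != len(segments):
            continue
        params = {}
        for part, segment in zip(parts, segments):
            if PLACEHOLDER.match(part):
                params[part] = segment  # capture the ID from the URL
            elif part != segment:
                break  # a literal segment differs, try the next route
        else:
            return handler, params
    return None


print(match("DELETE", "/zoos/12/employees/5"))
```

Because the verb is part of the lookup key, the URL itself never needs to contain an action such as "fire".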
Filter
When a client requests a list of objects, it is important that you return every object that matches the query. The list may be large, but you should not arbitrarily limit the amount of data returned, because such limits leave third-party developers wondering what is happening. If they request a specific collection, iterate over the results, and find that they only ever get 100 items, they then have to track down where the limit comes from. Is it a bug in their ORM, or is the network truncating large packets?
Minimize unnecessary restrictions that will affect third-party developers
That said, it is important to let the client ask for specific filtering or limiting of the results. The most important reason is to minimize network traffic so that the client gets its results as quickly as possible. Second, the client may be lazy, and if the server can filter or paginate the results for it, all the better. A less important reason (from the client's perspective) is that the lighter the load on the server, the better.
Filtering is most useful for GET requests on collections of resources. Because these are GET requests, the filter information should be passed in the URL. Here are some examples of filters that you might want to add to the API:
?limit=10: Limit the number of results returned to the client (for paging)
?offset=10: Skip the first 10 results so that the client can fetch the next set (for paging)
?animal_type_id=1: Use condition matching to filter records
?sortby=name&order=asc: Sort the results by specific attributes
Some filters may duplicate the effect of an endpoint URL. For example, the GET /zoos/ZID/animals endpoint I mentioned earlier can also be expressed as GET /animals?zoo_id=ZID. Dedicated endpoints make life easier for clients, especially for requests you expect them to make often. Mention this redundancy in your documentation so that third-party developers are not left wondering whether the two differ.
In any case, when you filter or sort data, make sure you explicitly whitelist the columns that the client is allowed to filter or sort by, because we do not want to send database errors to the client!
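The whitelist idea can be sketched as a small query-string parser. This is only an illustration: the function name `parse_filters`, the sets `FILTERABLE` and `SORTABLE`, and the default sort column are assumptions, and a caller would be expected to turn the `ValueError` into a 400 response.

```python
from urllib.parse import parse_qs

# Assumed whitelists for the /animals collection.
FILTERABLE = {"animal_type_id", "zoo_id"}
SORTABLE = {"id", "name"}


def parse_filters(query_string: str) -> dict:
    """Validate the filter parameters of a collection GET request."""
    # parse_qs returns a list per key; keep the last value given.
    parsed = parse_qs(query_string.lstrip("?"))
    raw = {key: values[-1] for key, values in parsed.items()}

    limit = raw.pop("limit", None)
    limit = int(limit) if limit is not None else None  # no limit unless requested
    offset = int(raw.pop("offset", 0))
    sortby = raw.pop("sortby", "id")
    order = raw.pop("order", "asc")

    if (limit is not None and limit < 1) or offset < 0:
        raise ValueError("limit must be positive and offset must not be negative")
    if sortby not in SORTABLE:
        raise ValueError(f"cannot sort by {sortby!r}")
    if order not in ("asc", "desc"):
        raise ValueError(f"unknown sort order {order!r}")
    unknown = set(raw) - FILTERABLE
    if unknown:
        raise ValueError(f"cannot filter by {sorted(unknown)}")

    # Whatever is left in raw is a whitelisted column filter.
    return {"limit": limit, "offset": offset,
            "sortby": sortby, "order": order, "where": raw}


print(parse_filters("?limit=10&offset=10&animal_type_id=1&sortby=name&order=asc"))
```

Only values that passed the whitelist ever reach the query builder, so an unexpected column name is rejected before the database sees it.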
Status code
For a good RESTful API, it is very important to use the proper HTTP status codes; after all, they are a standard! Various pieces of network equipment can read these status codes. For example, a load balancer can be configured to stop sending requests to a web server that returns too many 50x errors. There are plenty of HTTP status codes to choose from, but the following list is a good starting point:
200 OK - [GET]
Consumers request data from the service, and the service finds the corresponding data (idempotent)
201 CREATED - [POST/PUT/PATCH]
The consumer sends data to the service, and the service creates resources
204 NO CONTENT - [DELETE]
The consumer requests the service to delete the resource, and the service successfully deletes the resource
400 BAD REQUEST - [POST/PUT/PATCH]
The consumer sent the wrong data to the service, and the service did not perform any processing (idempotent)
404 NOT FOUND
The consumer requested a non-existent resource or collection, and the service did not perform any processing (idempotent)
500 INTERNAL SERVER ERROR
An error occurred in the server, and the consumer is not sure whether the request was successful
Status code range
Status codes in the 1xx range are reserved for low-level HTTP functionality, and you will most likely go your entire career without sending one manually.
Status codes in the 2xx range are reserved for success messages. Ideally, these are the status codes your server sends to consumers most of the time.
Status codes in the 3xx range are reserved for redirection. Most APIs do not use them very often, but the newer hypermedia-style APIs make more use of them.
Status codes in the 4xx range are reserved for client errors, for example when the client provides bad data or requests something that does not exist. These requests should be idempotent and should not change the state of the server.
Status codes in the 5xx range are reserved for server-side errors. These errors are often thrown by low-level functions, usually outside the developer's control, and sending the status code ensures that the client at least receives some response. After receiving a 5xx response, the client has no way of knowing the state of the server, so these status codes should be avoided as much as possible.
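The ranges above map directly onto the first digit of the status code. The following sketch shows one way a client might classify a response; `status_class` and `describe` are hypothetical helper names, while `HTTPStatus` comes from the Python standard library.

```python
from http import HTTPStatus

_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code: int) -> str:
    """Return the range a status code belongs to, based on its first digit."""
    try:
        return _CLASSES[code // 100]
    except KeyError:
        raise ValueError(f"not an HTTP status code: {code}") from None


def describe(code: int) -> str:
    """Combine the code, its standard reason phrase, and its range."""
    return f"{code} {HTTPStatus(code).phrase} ({status_class(code)})"


for code in (200, 201, 204, 400, 404, 500):
    print(describe(code))
```

A consumer can use the class alone to decide what to do next: retrying may make sense after a server error, but not after a client error.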
Expected return document
When performing actions on endpoints with the different HTTP verbs, consumers need to get something back in the response. The following is typical of RESTful APIs:
GET /collection: Return a list (array) of resource objects
GET /collection/resource: Return a single resource object
POST /collection: Return the newly created resource object
PUT /collection/resource: Return the complete resource object
PATCH /collection/resource: Return the complete resource object
DELETE /collection/resource: Return an empty document
Note that when consumers create a resource, they usually do not know its ID (or other attributes such as the creation and modification timestamps, if applicable). These additional attributes are returned in subsequent requests and, of course, in the response to the initial POST.
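The sketch below shows these return documents on an in-memory collection. It is an illustration under assumptions, not a real service: the `Collection` class and its method names are hypothetical, each method returns a (status code, document) pair, and the `created` and `modified` attributes are filled in by the server as described above.

```python
import itertools
import time


class Collection:
    """In-memory stand-in for one collection endpoint, e.g. /animals."""

    def __init__(self):
        self._items = {}
        self._ids = itertools.count(1)  # server-assigned IDs

    def post(self, payload):
        # The consumer does not send id/created/modified; the server adds them.
        now = int(time.time())
        resource = {**payload, "id": next(self._ids), "created": now, "modified": now}
        self._items[resource["id"]] = resource
        return 201, resource  # the newly created resource object

    def get_all(self):
        return 200, list(self._items.values())  # a list of resource objects

    def get(self, resource_id):
        if resource_id not in self._items:
            return 404, {"error": "not found"}
        return 200, self._items[resource_id]  # a single resource object

    def delete(self, resource_id):
        if self._items.pop(resource_id, None) is None:
            return 404, {"error": "not found"}
        return 204, None  # an empty document


animals = Collection()
status, created = animals.post({"name": "Gir", "animal_type": 12})
print(status, created["id"])
```

Returning the full object from POST saves the consumer a second request just to learn the ID.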
Authentication
In most cases, the server needs to know exactly who is making which requests. Of course, some APIs can be accessed publicly (anonymously), but most of the time we need to determine the identity of whoever is calling the API.
OAuth 2.0 provides a very good way to do this. With each request, you know exactly which client made the request and on behalf of which user, and there is a standard mechanism for expiring access or letting users revoke access from a client, all without the third-party client ever knowing the user's login credentials.
OAuth 1.0 and xAuth are also suitable for such scenarios. Whichever method you choose, make sure it is common and well documented, with libraries available for the many different languages and platforms your consumers are likely to write clients in.
I can honestly tell you that although OAuth 1.0a is the most secure of these options, it is a huge pain to implement. A large number of third-party developers end up writing their own library because none exists for the language they use. I have spent enough time debugging mysterious "invalid signature" errors that I recommend you choose an alternative.
Content Type
Currently, most popular APIs serve JSON from their RESTful interfaces, including Facebook, Twitter, and GitHub. XML is now rarely used (except in large enterprise environments). Fortunately, almost no one uses SOAP anymore, and we rarely see an API returning HTML to the client (unless, of course, you are building a crawler).
As long as you return data in a valid format, developers using popular languages and frameworks will be able to parse it. If you build a generic response object and swap in a different serializer, you can also provide the other data formats mentioned above quite easily (SOAP excluded). Just make sure to honor the Accept header when responding with data.
Some API creators would recommend putting the extensions of files such as .json, .xml, .html in the URL to indicate the type of content to be returned, but I am not accustomed to doing this. I still like to use the Accept header to indicate the return content type (which is also part of the HTTP standard), and I think it is more appropriate to do so.
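A generic response object with swappable serializers, selected by the Accept header, might look like the following sketch. The names `SERIALIZERS`, `negotiate`, and `render` are hypothetical, the XML layout is made up for the example, and the parsing is deliberately simplified: it understands q-values and */* but ignores partial wildcards such as application/*.

```python
import json
import xml.etree.ElementTree as ET


def to_json(data: dict) -> str:
    return json.dumps(data)


def to_xml(data: dict) -> str:
    # Flat dictionaries only; the <resource> root element is an assumption.
    root = ET.Element("resource")
    for key, value in data.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


SERIALIZERS = {"application/json": to_json, "application/xml": to_xml}


def negotiate(accept_header: str, default: str = "application/json"):
    """Pick a supported media type from an Accept header, or None."""
    candidates = []
    for item in accept_header.split(","):
        media_type, _, params = item.strip().partition(";")
        quality = 1.0  # default weight when no q parameter is given
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        candidates.append((quality, media_type.strip().lower()))
    # sorted() is stable, so equal q-values keep their header order.
    for quality, media_type in sorted(candidates, key=lambda c: -c[0]):
        if quality <= 0:
            continue
        if media_type in SERIALIZERS:
            return media_type
        if media_type == "*/*":
            return default
    return None


def render(data: dict, accept_header: str = "*/*"):
    """Return (status, headers, body) for the negotiated format."""
    media_type = negotiate(accept_header)
    if media_type is None:
        return 406, {}, ""  # nothing the client accepts is available
    return 200, {"Content-Type": media_type}, SERIALIZERS[media_type](data)


print(render({"id": 12, "name": "Gir"}, "application/xml"))
```

The same resource dictionary is used for every format, so adding a format means adding one serializer rather than touching each endpoint.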
Hypermedia API
Hypermedia APIs are very likely the future of RESTful API design. Hypermedia is a great concept that goes back to the "essence" of how HTTP and HTML were meant to work.
In a non-hypermedia RESTful API, the URL endpoints are part of the contract between server and client. The client must know these endpoints in advance, and changing them means the client may no longer be able to communicate with the server. For now, think of this as a limitation.
API clients, however, are not the only user agents making HTTP requests on the Internet. Most HTTP requests are made by people using browsers, and people are not bound by predefined RESTful API endpoint URLs. What makes people so different? People can read content, click the links that interest them, browse a website, and find their way to the content they care about. If a URL changes, people are not affected (unless they bookmarked a page, in which case they go back to the home page and discover a new route to the page they wanted).
The hypermedia API concept works much the way people do. Requesting the root of the API returns a list of URLs. Each URL in the list points to a collection and comes with information, in a form the client can understand, describing that collection. Whether each resource is given an ID is unimportant (or at least not required), as long as a URL is provided.
With the client of a hypermedia API crawling links and gathering information, URLs are always up to date in the responses and do not need to be known in advance as part of a contract. If a URL has been cached and a later request returns a 404, the client can simply go back to the root URL and rediscover the content.
When retrieving the list of resources in a collection, each resource is returned with an attribute containing its complete URL. After a POST, PATCH, or PUT request, the response can be a 3xx redirect to the complete resource.
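As a rough sketch of what such responses could contain, the functions below build a root document that lists the collections and a collection document in which every resource carries its complete URL. The attribute name `url`, the function names, and the document layout are assumptions for this example; the text does not prescribe a particular format.

```python
BASE_URL = "https://api.example.com/v1"
COLLECTIONS = ("zoos", "animals", "animal_types", "employees")


def root_document() -> dict:
    """What a client might get back from the API root."""
    return {
        "collections": [
            {"name": name, "url": f"{BASE_URL}/{name}"} for name in COLLECTIONS
        ]
    }


def collection_document(name: str, resources: list) -> list:
    """Attach the complete URL of each resource to its list entry."""
    return [
        {**resource, "url": f"{BASE_URL}/{name}/{resource['id']}"}
        for resource in resources
    ]


# Hypothetical data for illustration.
zoos = [{"id": 1, "name": "City Zoo"}, {"id": 2, "name": "Metro Zoo"}]

print(root_document()["collections"][0])
print(collection_document("zoos", zoos)[1]["url"])
```

A client that only follows the `url` values it receives never has to construct a path itself, which is what lets the server change its URL layout later.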
JSON does not give us the semantics we need to specify which attributes are URLs, or how those URLs relate to the current document. As you may have guessed, HTML does provide this information. We may well see our APIs come full circle and return to serving HTML. Considering how far we have come with CSS, one day it may even be common practice for APIs and websites to use the same URLs and content.
Documentation
To be honest, even if you do not follow this guide 100%, your API will not necessarily be bad. But if you do not document your API properly, no one will know how to use it, and it will be a terrible API.
Make the documentation available to all developers.
Do not use automatic documentation generators; if you must, at least make sure you clean up the output and make it presentable.
Do not truncate sample request and response bodies; show them in their entirety. Use syntax highlighting in your documentation.
The documentation should list the expected responses and possible error messages for each endpoint, along with what causes those errors.
If you have enough time, consider building an API console (along the lines of Postman) so that developers can experiment immediately. It is not as difficult as you might think, and if you do it, developers (both internal and third-party) will love you for it!
Make sure your documentation can be printed. CSS is powerful enough for this; remember to hide the sidebar in the print view. Even if no one prints a physical copy, you will be surprised how many developers like to print to PDF for offline reading.
Errata: Raw HTTP packet
Since everything we do is based on HTTP, I will show you the anatomy of an HTTP packet. I am often surprised by how many people do not know what an HTTP request looks like. When a client sends a request to the server, it sends a request line and a set of key-value pairs (the headers), followed by two carriage return and line feed pairs, and then the request body. All of this is sent as a single message.
The server's response has the same format: a status line and a set of key-value pairs, followed by two carriage return and line feed pairs, and then the response body. HTTP is a request/response protocol; it does not support push (the server sending data to the client unprompted) unless you use another protocol, such as WebSockets.
When you design an API, you should be able to inspect raw HTTP packets with a tool. Wireshark is a good choice. Also make sure that the framework or web server you use lets you read and change as many of these fields as possible.
HTTP request example
POST /v1/animals HTTP/1.1
Host: api.example.org
Accept: application/json
Content-Type: application/json
Content-Length: 40

{
  "name": "Gir",
  "animal_type": 12
}
HTTP response example
HTTP/1.1 201 Created
Date: Wed, 18 Dec 2013 06:08:22 GMT
Content-Type: application/json
Access-Control-Max-Age: 1728000
Cache-Control: no-cache

{
  "id": 12,
  "created": 1386363036,
  "modified": 1386363036,
  "name": "Gir",
  "animal_type": 12
}
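To see the structure described in this section from the server's point of view, here is a minimal sketch that builds a raw request shaped like the example above and splits it back into its parts. The function name `parse_request` is hypothetical and the parser is deliberately naive (no chunked encoding, no repeated headers); real servers and frameworks do this for you.

```python
import json


def parse_request(raw: bytes):
    """Split a raw HTTP/1.1 request into its request line, headers, and body."""
    # The head and the body are separated by two CRLF pairs.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return method, target, version, headers, body


# Build the request; Content-Length is computed from the actual body bytes.
body = json.dumps({"name": "Gir", "animal_type": 12}).encode("utf-8")
raw = b"\r\n".join([
    b"POST /v1/animals HTTP/1.1",
    b"Host: api.example.org",
    b"Accept: application/json",
    b"Content-Type: application/json",
    b"Content-Length: " + str(len(body)).encode("ascii"),
    b"",  # empty line: end of the headers
    body,
])

method, target, version, headers, payload = parse_request(raw)
print(method, target, headers["content-length"], json.loads(payload))
```

Computing Content-Length from the encoded body, rather than typing it by hand, avoids a mismatch between the header and the bytes actually sent.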