GraphQL Schema Design Best Practices
Thoughts from 19th December 2019
Alternative Title: I read tons of GraphQL blog articles and these are my notes.
Build your schema based on existing requirements and your domain model
- It's tempting to try to define the "perfect schema" for all of your data up front, but what makes the graph valuable is the degree to which it follows user requirements - which are constantly changing. Therefore the true perfect schema makes it easy for the graph to evolve in response to changing needs, without breaking existing clients.
- All fields and mutations in your schema should be driven by the needs of your API's consumers. Fields shouldn't be added to the schema speculatively. Build your schema incrementally based on actual requirements, and evolve it over time.
- Start with the schema first, and design it purely based on the domain, not on what's behind the fields (databases, REST endpoints, etc.). To do this, you either have to be a domain expert yourself, or work closely with one.
- The schema should never leak implementation details (for example, use `friends` instead of `unfilteredFriendConnection` as a field name).
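When a field does have to change, GraphQL's built-in `@deprecated` directive lets the schema evolve without breaking existing clients; a small sketch (the `User` fields are made up for illustration):

```graphql
type User {
  # Kept alive for existing clients while they migrate
  name: String! @deprecated(reason: "Use `firstName` and `lastName` instead.")
  firstName: String!
  lastName: String!
}
```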
Don't try to build a one-size-fits-all schema
- One big benefit of GraphQL is that each client can select exactly what they need and want, instead of being forced to consume what the API designer cooked up. You should embrace the different use cases and clients.
- Prefer optimized, exact fields over "smart" fields, whenever distinct use-cases exist.
# Do!
userById(id: ID!): User!
userByName(name: String!): User!
# Don't!
user(id: ID, name: String): User!
- Use unions of specific types instead of flags on general types
# Do!
union Shipping = Pickup | Mail
# Don't!
type Shipping {
  isPickup: Boolean!
}
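Clients then pick the concrete shipping type with inline fragments; a sketch, assuming a hypothetical `order` query and illustrative fields on `Pickup` and `Mail`:

```graphql
query {
  order(id: "1") {
    shipping {
      ... on Pickup {
        storeAddress
      }
      ... on Mail {
        trackingNumber
      }
    }
  }
}
```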
Use consistent naming conventions
enum Region {
  EUROPE
  NORTH_AMERICA
}

type MarketingCampaignConnection {
  edges: [MarketingCampaignEdge!]!
}

type MarketingCampaignEdge {
  node: MarketingCampaign
}

type MarketingCampaign {
  trackingUrl: String!
  region: Region!
}

type Query {
  marketingCampaigns(filter: MarketingCampaignFilter): MarketingCampaignConnection!
}

type Mutation {
  deleteMarketingCampaign(input: DeleteMarketingCampaignInput!): DeleteMarketingCampaignPayload!
}

type Subscription {
  deleteMarketingCampaignEvent: DeleteMarketingCampaignPayload!
}
- Field names should use `camelCase`.
- Type names should use `PascalCase`.
- Enum names should use `PascalCase`.
- Enum values should use `ALL_CAPS`, because they are similar to constants.
- Apply the `(Action)(Type)(Modifier)` format to everything.
- Use names that are as specific as possible (e.g. `imageUrl` instead of `image`, or `onlineStoreUrl` instead of `url`), so that when requirements change, you can evolve the schema without having to introduce breaking changes.
Use Object types instead of simple types whenever possible
# Do!
type Location {
  city: String!
  zipCode: String!
}

type Customer {
  location: Location!
}

# Don't!
type Customer {
  locationCity: String!
  locationZipCode: String!
}
- It may seem ugly at first, but nesting is a virtue in GraphQL schema design. You will want to nest as much as reasonable.
- The rationale for this is that nesting types gives you the most flexibility to evolve your schema in a sensible way without having a breaking version in the future.
- You should heavily consider applying this rule to fields that have a prefix or suffix.
- Another common example is `image`, which should likely not be a simple `String` field but an object, to allow, for example, resizing the image, providing text for the `alt` attribute, etc.
# Do!
type Image {
  url(width: Int!, height: Int!): String!
  title: String!
}

type Product {
  image: Image!
}

# Don't!
type Product {
  image: String!
}
Nest your types, do not reference their IDs
# Do!
type Book {
  author: Author!
}

# Don't!
type Book {
  authorId: ID!
}
- Coming from a REST API development standpoint, it might be reasonable to embed IDs into the response types.
- This is a big anti-pattern in GraphQL, because you lose the ability to fetch related resources in one API call via nesting in the graph, and (as with REST) need another round-trip to fetch the resources by their IDs.
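With the nested version, a client can fetch a book together with its author in a single round-trip (assuming a `book(id:)` query and a `name` field on `Author`):

```graphql
query {
  book(id: "1") {
    title
    author {
      name
    }
  }
}
```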
Use an `input` object type for mutations
# Do!
input AuthorInput {
  id: ID!
  firstName: String!
  lastName: String!
}

input UpdateAuthorInput {
  author: AuthorInput!
}

type Mutation {
  updateAuthor(input: UpdateAuthorInput!): UpdateAuthorPayload!
}

# Don't!
type Mutation {
  updateAuthor(id: ID!, firstName: String!, lastName: String!): UpdateAuthorPayload!
}
- Having only one `input` object makes it much easier to make the mutation dynamic with variables (one variable in total vs. one variable per field).
- Nesting the input data into an additional object (e.g. `author`) allows for more flexibility later (for example, if you add a `sendTeamNotificationEmail` flag, you can nest it under `flags` instead of having to add it directly to the `Input` type).
- This is also required for Relay's `clientMutationId`, which is used to correlate mutations with their responses.
- Don't reuse the `Type` (e.g. `Author`) as the `InputType`, because it may contain circular references, properties you may not want the user to be able to set, computed properties, etc.
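With a single `input` argument, the mutation document on the client stays constant and only the variable payload changes:

```graphql
mutation UpdateAuthor($input: UpdateAuthorInput!) {
  updateAuthor(input: $input) {
    author {
      id
    }
  }
}
```

sent with variables such as `{"input": {"author": {"id": "123", "firstName": "Jane", "lastName": "Doe"}}}` (this assumes the payload nests the updated `author`, as described in the next section).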
Return affected objects as payloads for mutations
# Do!
type UpdateAuthorPayload {
  author: Author!
}

type DeleteAuthorPayload {
  id: ID!
}

type Mutation {
  updateAuthor(input: UpdateAuthorInput!): UpdateAuthorPayload!
}

# Don't!
type Mutation {
  updateAuthor(input: UpdateAuthorInput!): ID!
}
- Return the affected resources for `create` and `update` mutations. This makes it easier to directly consume the change in the client without having to send an additional request.
- For `delete` mutations, only return the deleted ID, since resolving the relations of the (now deleted) resource can cause errors that are confusing to the consumer.
- Nesting the payload data into an additional object (e.g. `author`) allows for more flexibility later (for example, if you want to return additional data like the ID of a queued backend job).
- This is also required for Relay's `clientMutationId`, which is used to correlate mutations with their responses.
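Because the payload nests the affected object, the client can read back exactly the fields it cares about in the same request:

```graphql
mutation {
  updateAuthor(input: {author: {id: "123", firstName: "Jane", lastName: "Doe"}}) {
    author {
      id
      firstName
      lastName
    }
  }
}
```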
Don't forget about computed fields
- Since clients have to specify exactly which fields they need, don't be shy about adding behaviour-driven fields that answer specific client use cases and help reduce behavioural logic in the client (for example, an `isMergeable` field that is computed server-side).
- Other good candidates for computed fields are fields based on the authentication context (for example `isMe`, `iAmFollowing`, `myLessons`, etc.).
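A sketch of what such fields could look like (the type and field names are illustrative):

```graphql
type PullRequest {
  # Computed server-side from branch state, CI status, etc.
  isMergeable: Boolean!
}

type User {
  # Both are resolved against the currently authenticated viewer
  isMe: Boolean!
  iAmFollowing: Boolean!
}
```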
Use connections for pagination, and use pagination for everything that is a list
type Product {
  recommendedProducts(first: Int, last: Int, after: String, before: String): ProductConnection!
}

type ProductConnection {
  edges: [ProductEdge!]!
  pageInfo: PageInfo!
}

type ProductEdge {
  cursor: String!
  node: Product!
  # Optional: additional data for the relationship
  boughtTogetherPercentage: Float!
}

type PageInfo {
  hasNextPage: Boolean!
  hasPreviousPage: Boolean!
  startCursor: String
  endCursor: String
}
# Usage:
product(id: 1) {
  title
  recommendedProducts(first: 5, after: "4e025") {
    edges {
      cursor
      boughtTogetherPercentage
      node {
        title
      }
    }
    pageInfo {
      hasNextPage
      hasPreviousPage
      startCursor
      endCursor
      # Additional optional fields you could implement:
      pageCount
      hasNextPages(amount: Int!)
      hasPreviousPages(amount: Int!)
    }
  }
}
- Never use `authors: [Author!]!` like you see in many tutorials. It will not scale.
- These connections are more forward-compatible, as they allow for adding metadata to the association itself (e.g. for pagination) and to the "edge" (the relation between the parent entity and the associated entity).
- You should use these connections for everything that returns a list: top-level queries, 1:N relationships, and N:M relationships.
- If you don't like how this looks (like me), take five minutes and watch the explanation here, which helped me understand the reasons behind this design. You can also read more about this here.
- For mutations, use a naming strategy like `(Action)(TypeLeft)(TypeRight)Edge(Modifier)`, e.g. `createUserFriendsEdgePayload`.
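An edge mutation following that naming strategy could look like this (all types are hypothetical):

```graphql
input CreateUserFriendsEdgeInput {
  userId: ID!
  friendId: ID!
}

type CreateUserFriendsEdgePayload {
  edge: UserFriendsEdge!
}

type Mutation {
  createUserFriendsEdge(input: CreateUserFriendsEdgeInput!): CreateUserFriendsEdgePayload!
}
```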
Consider adding filter & sort to connections
products(filter: "createdAt < 2019", sort: {field: "createdAt", direction: "ASC"}, first: 5) {
  edges {
    node {
      title
    }
  }
}
- In a lot of cases, it is very useful to be able to filter and order the resources inside of connections, similar to the `WHERE`/`ORDER BY` clauses in SQL. This generic interface allows many different features to be implemented without having to update the schema.
- How exactly the syntax for these filters should be defined is up for discussion, but most APIs take inspiration from SQL or MongoDB's `find` syntax.
- Keep in mind that the implementation of these filters can get very complex, for example if you decide to allow filtering based on the data of relations (e.g. `stores(filter: "products.title = 'Foo'")` to filter all stores that sell a product with a specific title). Limit the complexity based on your requirements.
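An alternative to free-form filter strings is a structured filter input, which trades some expressiveness for type safety and easier validation; a hypothetical sketch:

```graphql
input ProductFilter {
  titleContains: String
  createdBefore: String
  and: [ProductFilter!]
  or: [ProductFilter!]
}

type Query {
  products(filter: ProductFilter, first: Int, after: String): ProductConnection!
}
```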
Provide top level queries for "get from ID" and for "get from filters"
product(id: ID!): Product!
products(filter: FilterString): ProductConnection!
- Providing both a query to handle single resources (e.g. for detail pages), as well as a more general, filterable, paginated query for reading out multiple resources (e.g. for overview pages) handles most common use cases.
- These queries should be provided for all types that can be used as starting points into the graph.
(Optional) Global object identification
- If your application grows very large, it can help to provide only a single top-level `node()` query for handling single resources, one that can resolve any resource from a global identifier. This reduces query complexity because it only needs a single resolver instead of one per type.
- This global identifier has the type encoded into it, for example via `base64("Author:123")`, and has to be provided on all resources.
- You can read more about how to implement this here.