(See also Building a self-updating profile README for GitHub on my blog)
For my auto-updating personal README I needed to fetch the latest release for every repository I have on GitHub. Since I have 316 public repos I wanted the most efficient way possible to do this. I decided to use the GitHub GraphQL API.
Their API allows you to fetch up to 100 repositories at once, and each one can return up to 100 releases. Since I only wanted the most recent release my query ended up looking like this:
query {
  viewer {
    repositories(first: 100, privacy: PUBLIC) {
      pageInfo {
        hasNextPage
        endCursor
      }
      nodes {
        name
        releases(last:1) {
          totalCount
          nodes {
            name
            publishedAt
            url
          }
        }
      }
    }
  }
}
This gives me back my first 100 repos, and for each one returns the most recent release (if a release exists).
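To make the shape of that response concrete, here is a minimal Python sketch that picks the latest release out of a single repository node. The helper name latest_release and both sample nodes are invented for illustration; only the field names come from the query above.

```python
# Hypothetical helper: field names match the query, the data below is invented.
def latest_release(repo_node):
    """Return the single release fetched by releases(last:1), or None."""
    nodes = repo_node["releases"]["nodes"]
    return nodes[0] if nodes else None


# Invented sample nodes in the shape the query returns
with_release = {
    "name": "example-repo",
    "releases": {
        "totalCount": 3,
        "nodes": [
            {
                "name": "1.2",
                "publishedAt": "2020-07-01T12:00:00Z",
                "url": "https://example.com/releases/1.2",
            }
        ],
    },
}
without_release = {"name": "no-releases", "releases": {"totalCount": 0, "nodes": []}}

print(latest_release(with_release)["name"])
print(latest_release(without_release))
```

A repository with no releases still appears in the results, just with an empty nodes list, so any code consuming the response has to handle that case.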
Just one problem: I needed to paginate through all 316. The way you do this with the GitHub GraphQL API is using the after: argument and the endCursor returned from pageInfo. You can send after:null to get the first page, then after:TOKEN where TOKEN is the endCursor from the previous results.
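The cursor handshake can be sketched independently of any HTTP client. In this minimal example, paginate and fake_fetch_page are hypothetical names and the fake pages stand in for real API responses; it only illustrates passing each endCursor back as the next after value.

```python
def paginate(fetch_page):
    """Yield nodes from every page, following endCursor until hasNextPage is false.

    fetch_page is any callable that takes a cursor (None for the first page)
    and returns a dict with "nodes" and "pageInfo" keys.
    """
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page["nodes"]
        if not page["pageInfo"]["hasNextPage"]:
            break
        cursor = page["pageInfo"]["endCursor"]


# Invented pages, keyed by the cursor that requests them
FAKE_PAGES = {
    None: {"nodes": ["a", "b"], "pageInfo": {"hasNextPage": True, "endCursor": "c1"}},
    "c1": {"nodes": ["c", "d"], "pageInfo": {"hasNextPage": True, "endCursor": "c2"}},
    "c2": {"nodes": ["e"], "pageInfo": {"hasNextPage": False, "endCursor": None}},
}


def fake_fetch_page(cursor):
    return FAKE_PAGES[cursor]


all_nodes = list(paginate(fake_fetch_page))
print(all_nodes)
```

With 316 repositories and 100 per page, a loop like this makes four requests.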
My Python code ended up looking like this (using python-graphql-client):
import json  # used for the debug output in fetch_releases

from python_graphql_client import GraphqlClient

client = GraphqlClient(endpoint="https://api.github.com/graphql")
def make_query(after_cursor=None):
    # Substitute the cursor into the query: a quoted string for later pages,
    # or a literal null for the first page.
    return """
query {
  viewer {
    repositories(first: 100, privacy: PUBLIC, after:AFTER) {
      pageInfo {
        hasNextPage
        endCursor
      }
      nodes {
        name
        releases(last:1) {
          totalCount
          nodes {
            name
            publishedAt
            url
          }
        }
      }
    }
  }
}
""".replace(
        "AFTER", '"{}"'.format(after_cursor) if after_cursor else "null"
    )
def fetch_releases(oauth_token):
    repos = []
    releases = []
    repo_names = set()
    has_next_page = True
    after_cursor = None

    while has_next_page:
        data = client.execute(
            query=make_query(after_cursor),
            headers={"Authorization": "Bearer {}".format(oauth_token)},
        )
        print()
        print(json.dumps(data, indent=4))
        print()
        for repo in data["data"]["viewer"]["repositories"]["nodes"]:
            if repo["releases"]["totalCount"] and repo["name"] not in repo_names:
                repos.append(repo)
                repo_names.add(repo["name"])
                releases.append(
                    {
                        "repo": repo["name"],
                        "release": repo["releases"]["nodes"][0]["name"]
                        .replace(repo["name"], "")
                        .strip(),
                        "published_at": repo["releases"]["nodes"][0][
                            "publishedAt"
                        ].split("T")[0],
                        "url": repo["releases"]["nodes"][0]["url"],
                    }
                )
        has_next_page = data["data"]["viewer"]["repositories"]["pageInfo"][
            "hasNextPage"
        ]
        after_cursor = data["data"]["viewer"]["repositories"]["pageInfo"]["endCursor"]
    return releases
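An alternative to splicing the cursor into the query text is to declare it as a GraphQL variable and send it alongside the query. This is a sketch of that variation, not what the code above does: build_payload is a hypothetical helper, and it only builds the standard JSON body (query plus variables) that a GraphQL endpoint accepts.

```python
import json

# Same query as above, with the cursor declared as a nullable String variable
QUERY = """
query ($after: String) {
  viewer {
    repositories(first: 100, privacy: PUBLIC, after: $after) {
      pageInfo {
        hasNextPage
        endCursor
      }
      nodes {
        name
        releases(last: 1) {
          totalCount
          nodes {
            name
            publishedAt
            url
          }
        }
      }
    }
  }
}
"""


def build_payload(after_cursor=None):
    """Return the request body; a None cursor is serialized as JSON null."""
    return {"query": QUERY, "variables": {"after": after_cursor}}


first_page = build_payload()
next_page = build_payload("TOKEN")  # TOKEN is a placeholder cursor
print(json.dumps(first_page["variables"]))
print(json.dumps(next_page["variables"]))
```

The query string stays constant across pages and the quoting of the cursor is left to the JSON encoder. As far as I know python-graphql-client's execute() accepts a variables argument, so the same dictionary could probably be passed there instead of calling make_query, but check that against the library's documentation before relying on it.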
Full code here: https://github.com/simonw/simonw/blob/50d4188f9f067b68b2203540f1983750d51800db/build_readme.py