If you have an open source project, the more of the pull request review process you can automate the better, or at least that’s what I believe.
If you go through your checklist of expectations, some things are obvious, such as continuous integration, for which good services already exist. But some of your checks could be unique and currently manual. One check I recently automated is the example in this post.
The example in this post validates that each commit in a pull request comes from a known GitHub user. It’s quite an easy check, but also easy to miss when reviewing by hand: the pull request itself always originates from a known GitHub user, so the issue isn’t obvious just from looking at the pull request on GitHub.
Reasons to do this could be, for example, that you want to automate attribution of authors in release notes, validate CLAs and so on.
The cause usually isn’t malicious intent, but rather that an email address unknown to GitHub was used for the commit, e.g. the contributor’s corporate email address.
The easiest fix is for contributors to add all their email addresses under GitHub Settings/Emails; the other option is to amend the commit with the desired email address and force push the new commit.
FaaS (Functions as a Service) is very well suited for this kind of scenario, as you don’t need a website or server up and running 24/7. Instead, your code only runs when invoked, the idea being that you pay only for what you use, when you use it.
Roughly, the flow is: GitHub triggers a webhook call to a function; that function parses what’s needed out of the GitHub payload and places it on an Azure Storage Queue; the queue triggers a second function that does the validation and reports status back to the pull request using the GitHub REST API.
I usually use the Azure Functions CLI for scaffolding, write my functions in VS Code and then deploy via Kudu or the CI/CD of choice. But the Azure portal is quite capable too and can be a quick way to get started, and it’s fully possible to get the code into source control later (source control should always be the way in production, but when getting started, prototyping and playing around, the portal is actually quite productive).
Azure Functions has a ready-made template for quickly getting started with GitHub webhooks.
This template gives you a good starting point for dynamically parsing the data posted from GitHub, making it easy to support most scenarios.
So now we have data coming in to the service and the ability to parse it. Next we just want to cherry-pick what’s needed out of the GitHub payload, push it to a queue for later processing and tell GitHub we got it.
Go to “Integrate” to add a queue output via the portal and click “+ New Output”; in this case we want an Azure Storage Queue.
You then give the “queue output” a parameter name, which will be the name of the output parameter in your function, then the name of the queue to push messages to, and last choose the app setting where your storage queue connection string resides. Hit save and our queue parameter binding is ready.
Now we just need to add it as an output parameter to our function so we can assign content to it:
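A minimal sketch of what that could look like, assuming the queue output’s parameter name was set to outputQueueItem (a hypothetical name; use whatever you entered under “Integrate”) and relying on the namespaces the C# script host imports automatically:

```csharp
using System.Net;

// Sketch only: "outputQueueItem" is an assumed binding parameter name.
public static HttpResponseMessage Run(
    HttpRequestMessage req,
    out string outputQueueItem,
    TraceWriter log)
{
    // C# doesn't allow "out" parameters on async methods,
    // so the body is read synchronously here.
    outputQueueItem = req.Content.ReadAsStringAsync().Result;

    // Tell GitHub the webhook call was received.
    return req.CreateResponse(HttpStatusCode.OK);
}
```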
As you might see here, using an output parameter means you lose the async/await capabilities. If you don’t need multiple outputs and just want to return “200 OK” when the payload was processed and queued successfully, you can keep using async/await by going to “Integrate”, removing the HTTP result binding and setting the queue to bind to the return value of the function.
So a complete function that parses and queues while still supporting async/await could look something like this:
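The following is a sketch rather than the exact listing, assuming the queue output is bound to the function’s return value. The field names read from the payload (action, pull_request.statuses_url and pull_request.commits_url) come from GitHub’s pull request event; the property names of the queued message are illustrative:

```csharp
#r "Newtonsoft.Json"

using Newtonsoft.Json;

// Sketch only: the queue output is assumed to be bound to the return value.
public static async Task<string> Run(HttpRequestMessage req, TraceWriter log)
{
    // Parse the webhook body dynamically, as the GitHub webhook template does.
    dynamic data = await req.Content.ReadAsAsync<object>();

    // Cherry-pick only what the validation step needs.
    var message = new
    {
        Action = (string)data?.action,
        StatusesUrl = (string)data?.pull_request?.statuses_url,
        CommitsUrl = (string)data?.pull_request?.commits_url
    };

    // The returned JSON string becomes the queue message.
    return JsonConvert.SerializeObject(message);
}
```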
A neat thing is that the queue item can be changed to a custom class, and it will then automatically be serialized to and deserialized from the JSON string. So let’s create a class, and to keep it nice and tidy we’ll add it to its own file. Hit “View Files” -> “+ Add”, then type in the filename and hit enter.
The class for our payload would look something like below:
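As a sketch, such a class only needs the three values the validation step uses. The property names here are illustrative assumptions, not necessarily those of the original code:

```csharp
// GitHubPayload.csx (sketch): property names are assumptions for illustration.
public class GitHubPayload
{
    // Pull request event action, e.g. "opened" or "synchronize".
    public string Action { get; set; }

    // URL to post commit statuses to.
    public string StatusesUrl { get; set; }

    // URL to fetch the pull request's commits from.
    public string CommitsUrl { get; set; }
}
```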
In our function’s run.csx we can then use the #load directive, like #load "GitHubPayload.csx". The class is now available to us in our function, so we can change the function’s method signature to a typed return value and end up with something like below:
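A sketch of the typed variant, assuming GitHubPayload.csx sits next to run.csx and defines a GitHubPayload class with string properties Action, StatusesUrl and CommitsUrl (assumed names):

```csharp
#load "GitHubPayload.csx"

// Sketch only: the queue output is assumed to be bound to the return value,
// so the returned object is serialized to JSON and queued by the runtime.
public static async Task<GitHubPayload> Run(HttpRequestMessage req, TraceWriter log)
{
    dynamic data = await req.Content.ReadAsAsync<object>();

    return new GitHubPayload
    {
        Action = (string)data?.action,
        StatusesUrl = (string)data?.pull_request?.statuses_url,
        CommitsUrl = (string)data?.pull_request?.commits_url
    };
}
```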
So what does the above do? Not much and a lot: it parses out the GitHub pull request event’s action, the URL to post status to and the URL to fetch the pull request commits from. As we bound the return value to the queue, it’ll automatically be serialized as JSON and queued.
Now we need something to pop the queue and do the actual work.
To get up and running quickly, Azure Functions offers a scaffolding option.
If you go to “Integrate” and the queue output we previously created, there’s a “Create a new function triggered by this output” action.
This will pre-select a C# Queue Trigger, fill in the queue name and the storage account connection used.
All you need to do is hit create, but you probably also want to give your function a more meaningful name than QueueTriggerCSharp1.
The function created will look something like below.
It gets triggered when something is enqueued, the dequeued value is passed in as a parameter, and the template example just writes it to the trace log.
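From memory, the generated C# queue trigger looked roughly like this; treat the exact parameter name and log text as approximations of the template rather than a verbatim copy:

```csharp
using System;

// Approximation of the portal's C# queue trigger template:
// the dequeued message arrives as a plain string.
public static void Run(string myQueueItem, TraceWriter log)
{
    log.Info($"C# Queue trigger function processed: {myQueueItem}");
}
```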
The dequeue trigger can also be typed by reusing the class we previously created; this way Azure Functions will automatically deserialize the queued JSON into a typed object we can easily work with.
The above code validates that it’s an event/action we’re interested in and makes sure we’ve got the status API URL needed to report back status and the commits API URL needed to fetch the pull request commit details.
Now that we have the queue item parsed and ready, the next step is to fetch the commits from GitHub, but before we can do that we need to be able to authenticate against GitHub.
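A sketch of a typed trigger with basic input validation. It assumes the GitHubPayload class (with assumed string properties Action, StatusesUrl and CommitsUrl) is reachable via #load from this function, and that “opened” and “synchronize” are the actions of interest; ShouldProcess is a hypothetical helper name:

```csharp
#load "GitHubPayload.csx" // assumes the class file is also available to this function

// The runtime deserializes the queued JSON into GitHubPayload for us.
public static void Run(GitHubPayload myQueueItem, TraceWriter log)
{
    if (!ShouldProcess(myQueueItem))
    {
        log.Info($"Skipping queue item with action: {myQueueItem?.Action}");
        return;
    }

    log.Info($"Ready to validate commits from {myQueueItem.CommitsUrl}");
}

// True only for actions we care about that carry both API URLs.
public static bool ShouldProcess(GitHubPayload payload)
{
    var isRelevantAction = payload?.Action == "opened"
                           || payload?.Action == "synchronize";

    return isRelevantAction
           && !string.IsNullOrWhiteSpace(payload.StatusesUrl)
           && !string.IsNullOrWhiteSpace(payload.CommitsUrl);
}
```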
GitHub has several ways of authenticating against its APIs; one way is through personal access tokens, which is what I’ll be using for this blog post.
You’ll find your personal access tokens by going to “Settings” and then clicking “Personal access tokens” in the left menu.
Then click the “Generate Token” button, give the token a description and choose the required access. There are loads of available permissions, but for this post “repo:status” is the only access needed.
Hit “Generate token” and you’ll be presented with the newly generated token.
We don’t want to store our token in our source code. Fortunately Azure Functions, just like the other Azure app services, provides a means to store app settings.
You find it under “Function app settings” -> “Configure app settings”.
There you have a list of key-value pairs, where you can add, delete and modify both app settings and connection strings.
All app settings and connection strings can be accessed as environment variables, which makes them easy to access regardless of which of the languages supported by Azure Functions you choose to use.
Connection string environment variables are prefixed with their provider, e.g. if you choose the custom provider the variable will be prefixed with CUSTOMCONNSTR_, so if you enter a variable named GITHUB_TOKEN you’ll access it as CUSTOMCONNSTR_GITHUB_TOKEN.
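Reading the token from C# is then a single call. A small sketch, where GetGitHubToken is a hypothetical helper name and the setting is assumed to be stored as a custom connection string named GITHUB_TOKEN:

```csharp
using System;

// Reads the token stored as a "Custom" connection string named GITHUB_TOKEN.
// GetGitHubToken is a hypothetical helper name used for illustration.
public static string GetGitHubToken()
{
    var token = Environment.GetEnvironmentVariable("CUSTOMCONNSTR_GITHUB_TOKEN");

    if (string.IsNullOrEmpty(token))
    {
        throw new InvalidOperationException(
            "The GITHUB_TOKEN connection string is not configured.");
    }

    return token;
}
```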
So now that we have our token and app setting in place, we want to fetch and iterate over the pull request commits, validate each one and then post back whether it passed or failed validation. The .NET HttpClient is used to call the GitHub API. For convenience, and to keep it nice and tidy, I’ve wrapped it in two helper methods, GetObjectAsync and PostObjectAsJsonAsync, both utilizing the same method for common headers and authentication, all placed in their own HttpClientHelper.csx file.
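A sketch of how such a helper file could be put together. Only the two method names come from the text; CreateRequest, the User-Agent value and the header choices are assumptions (GitHub’s API does require a User-Agent header and accepts personal access tokens with the “token” authorization scheme):

```csharp
#r "Newtonsoft.Json"

using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;
using Newtonsoft.Json;

// One shared client, reused across invocations.
private static readonly HttpClient Client = new HttpClient();

// Shared by both helpers: common headers and authentication.
private static HttpRequestMessage CreateRequest(HttpMethod method, string url)
{
    var request = new HttpRequestMessage(method, url);
    request.Headers.UserAgent.ParseAdd("OnlyKnownCommitters"); // assumed value
    request.Headers.Accept.ParseAdd("application/vnd.github.v3+json");
    request.Headers.Authorization = new AuthenticationHeaderValue(
        "token",
        Environment.GetEnvironmentVariable("CUSTOMCONNSTR_GITHUB_TOKEN"));
    return request;
}

// GETs the URL and deserializes the JSON response into T.
public static async Task<T> GetObjectAsync<T>(string url)
{
    using (var request = CreateRequest(HttpMethod.Get, url))
    using (var response = await Client.SendAsync(request))
    {
        response.EnsureSuccessStatusCode();
        var json = await response.Content.ReadAsStringAsync();
        return JsonConvert.DeserializeObject<T>(json);
    }
}

// Serializes the value as JSON and POSTs it to the URL.
public static async Task PostObjectAsJsonAsync<T>(string url, T value)
{
    using (var request = CreateRequest(HttpMethod.Post, url))
    {
        request.Content = new StringContent(
            JsonConvert.SerializeObject(value), Encoding.UTF8, "application/json");

        using (var response = await Client.SendAsync(request))
        {
            response.EnsureSuccessStatusCode();
        }
    }
}
```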
Putting all the pieces together, we end up with a function like this:
And now we’re feature complete: ready to queue GitHub webhook requests, validate the input, validate the commits and report whether each commit has a registered GitHub user as author.
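The sketch below shows one way the pieces could fit, not the original listing. It assumes GitHubPayload.csx (string properties Action, StatusesUrl, CommitsUrl) and HttpClientHelper.csx (GetObjectAsync, PostObjectAsJsonAsync) exist as described, and it relies on GitHub returning a null top-level author for a commit whose email isn’t linked to an account. BuildStatus and the status context name are hypothetical:

```csharp
#r "Newtonsoft.Json"
#load "GitHubPayload.csx"
#load "HttpClientHelper.csx"

using System.Linq;
using Newtonsoft.Json.Linq;

public static async Task Run(GitHubPayload myQueueItem, TraceWriter log)
{
    // Ignore events this check doesn't care about (assumed actions).
    if (myQueueItem?.Action != "opened" && myQueueItem?.Action != "synchronize")
    {
        return;
    }

    // Fetch the pull request's commits and report the outcome as a commit status.
    var commits = await GetObjectAsync<JArray>(myQueueItem.CommitsUrl);
    var status = BuildStatus(commits);

    var state = (string)status["state"];
    log.Info($"Reporting {state} to {myQueueItem.StatusesUrl}");
    await PostObjectAsJsonAsync(myQueueItem.StatusesUrl, status);
}

// Builds the body for GitHub's commit status API from the list of commits.
public static JObject BuildStatus(JArray commits)
{
    // First commit without a linked GitHub user, if any.
    var unknown = commits.FirstOrDefault(c => c["author"]?.Type != JTokenType.Object);

    var status = new JObject
    {
        ["context"] = "OnlyKnownCommitters", // assumed context name
        ["state"] = unknown == null ? "success" : "failure",
        ["description"] = unknown == null
            ? $"{commits.Count} author validation(s) performed."
            : "Commit author is not a known GitHub user."
    };

    if (unknown != null)
    {
        // The "Details" link on the pull request points at the offending commit.
        status["target_url"] = unknown["html_url"];
    }

    return status;
}
```

Keeping the decision in BuildStatus, separate from the HTTP calls, makes it possible to exercise the validation with sample JSON without touching the GitHub API.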
Now we just need to configure GitHub to call our function when a pull request is created or modified. This is done under the repository (or organization) settings: “Webhooks” -> “Add webhook”.
Under your GitHub webhook function you can find the URL and secret to be used. Enter these as Payload URL and Secret, and select application/json as Content type.
Choose to select individual events and select only the pull request event, then just hit “Add webhook” and you’re done.
Now we create our first PR against the repository we’ve configured our webhook on.
If all goes to plan, you should see that 1 author validation has been performed and as I’m a known GitHub user the test passes and all’s OK.
Now we’ll add a commit from an unknown user. The easiest way to test this is by overriding the author when doing the commit. From the command line that would look something like this:
git commit -a -m "Unknown update" --author "John <John@unknown.doe.com>"
Then we push our changes, which will trigger the GitHub pull request synchronize event and our functions.
Pressing the details link will take us to the commit displaying the unknown author.
It probably took a lot longer to read this post than it would take to get the functions up and running; as with many things, it’s easy once you know how. Azure Functions really lets you focus on what you want to solve and, as I’ve shown, combining it with third-party APIs is super simple, which makes it really powerful. Working with queues is a breeze, and for incoming webhooks that’s a great approach, as you just receive and acknowledge, doing the actual work in a separate function. You can also hook up functions to the poison message queue of each queue, making it a very robust solution without making it too complex.
On a final note, extending your GitHub pull request review process with small automatic checks is a great way to not only save you time, but also speed up communication with your contributors, and the sky truly is the limit on what you can check! You can also do bot-like things, such as having certain maintainer comments trigger an integration test, create new issues, send a Microsoft Teams message, etc.
So I think for these scenarios the functions as a service approach fits like a glove! What do you think? Would love to hear your feedback!
The complete code for the functions in this blog post can be found on GitHub:
azurevoodoo/OnlyKnownCommitters (github.com)
If you want to learn more about Azure Functions, please check out some of the previous posts I’ve done on the topic!
Bring static to life using serverless⚡: Breathe life into your static website using Azure Functions (hackernoon.com)
⚡Azure Deployment -> Microsoft Teams: Using Azure functions to hook up Azure app service deployment notifications to a Microsoft Team channel (medium.com)
Going serverless with PowerShell: Why should JavaScript developers have all the fun? (medium.com)