Enginursday: GitHub API Introduction

We dig into the Octocat's API

As often discussed on Enginursday, we are really picky about what tools we like or don't like, which ones we use constantly, which are for those rare one-off instances, etc. This week's tool of choice for discussion is GitHub's own tool, the API.

The GitHub API is a wonderful option for interfacing your own apps or projects with GitHub. If you want to create a server that notifies you of pull requests on your repositories, or a display that visualizes how many issues a repository has, the API is the tool you need.

Several libraries exist to interface with the API. I have recently been playing with agithub - The Agnostic GitHub API, a Python module that works well for getting your hands dirty on the API. There are several other Python options available as well, which makes interfacing your Edison or Raspberry Pi projects with the GitHub API very simple.

What exactly can you get with this API? A lot of information, it turns out. For example, if I want general information about a user, I can run a basic script and get back the number of repositories they own, whether they are a GitHub site admin, their location, etc. The agithub module lets me fetch all the information for my GitHub user name with a single line of code: g.users.ToniCorinne.get(). That provides the following information:

[
[
    200,
    {
        "avatar_url":"https://avatars.githubusercontent.com/u/2359976?v=3",
        "bio":null,
        "blog":null,
        "collaborators":0,
        "company":null,
        "created_at":"2012-09-17T04:27:48Z",
        "disk_usage":1368,
        "email":null,
        "events_url":"https://api.github.com/users/ToniCorinne/events{/privacy}",
        "followers":23,
        "followers_url":"https://api.github.com/users/ToniCorinne/followers",
        "following":9,
        "following_url":"https://api.github.com/users/ToniCorinne/following{/other_user}",
        "gists_url":"https://api.github.com/users/ToniCorinne/gists{/gist_id}",
        "gravatar_id":"",
        "hireable":null,
        "html_url":"https://github.com/ToniCorinne",
        "id":2359976,
        "location":"Boulder, CO",
        "login":"ToniCorinne",
        "name":"Toni Klopfenstein",
        "organizations_url":"https://api.github.com/users/ToniCorinne/orgs",
        "owned_private_repos":<Number of private repos>,
        "plan":{
            "collaborators":<Number of collaborators>,
            "name":"<Billing Plan>",
            "private_repos":<Number of private repos available>,
            "space":<Space>
        },
        "private_gists":<Private Gists>,
        "public_gists":0,
        "public_repos":11,
        "received_events_url":"https://api.github.com/users/ToniCorinne/received_events",
        "repos_url":"https://api.github.com/users/ToniCorinne/repos",
        "site_admin":false,
        "starred_url":"https://api.github.com/users/ToniCorinne/starred{/owner}{/repo}",
        "subscriptions_url":"https://api.github.com/users/ToniCorinne/subscriptions",
        "total_private_repos":<Number of private repos>,
        "type":"User",
        "updated_at":"2015-08-25T16:45:49Z",
        "url":"https://api.github.com/users/ToniCorinne"
    }
]
]
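For readers who don't have agithub installed, here is a minimal sketch of the same lookup using only Python's standard library. It assumes the agithub call above maps to a GET on the "url" value shown in the output (https://api.github.com/users/ToniCorinne); the helper names user_url and fetch are made up for illustration and are not part of agithub. An unauthenticated request will likely return fewer fields than the listing above.

```python
import json
import urllib.request
from urllib.parse import quote

API_ROOT = "https://api.github.com"


def user_url(login):
    """Build the REST endpoint for a user's profile (hypothetical helper)."""
    return f"{API_ROOT}/users/{quote(login, safe='')}"


def fetch(url):
    """GET a URL and return (status_code, parsed_json), like the output above."""
    with urllib.request.urlopen(url) as response:
        return response.status, json.load(response)


# Example call (needs network access, so it is left commented out):
# status, data = fetch(user_url("ToniCorinne"))
# print(status, data["login"], data["public_repos"])
```

Returning the status code next to the parsed body mirrors the two-element result in the listing, so the rest of a script can treat either approach the same way.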

You can also do the same thing for organizations. For example, if I query 'SparkFun' as an organization using the command g.orgs.SparkFun.get(), I get the following info back:

[
[
    200,
    {
        "avatar_url":"https://avatars.githubusercontent.com/u/142880?v=3",
        "billing_email":"<Email Address>",
        "blog":"http://www.sparkfun.com",
        "collaborators":<Number of collaborators>,
        "company":null,
        "created_at":"2009-10-21T22:17:05Z",
        "description":null,
        "disk_usage":<Disk Usage>,
        "email":null,
        "events_url":"https://api.github.com/orgs/sparkfun/events",
        "followers":0,
        "following":0,
        "html_url":"https://github.com/sparkfun",
        "id":142880,
        "location":"Boulder, CO",
        "login":"sparkfun",
        "members_url":"https://api.github.com/orgs/sparkfun/members{/member}",
        "name":"SparkFun Electronics",
        "owned_private_repos":<Number of private repos being used AND owned by SparkFun>,
        "plan":{
            "name":"<Plan level>",
            "private_repos":<Total number of private repos available>,
            "space":<Spaaaaaace>
        },
        "private_gists":<Number of private gists>,
        "public_gists":1,
        "public_members_url":"https://api.github.com/orgs/sparkfun/public_members{/member}",
        "public_repos":554,
        "repos_url":"https://api.github.com/orgs/sparkfun/repos",
        "total_private_repos":<Total number of private repos being used>,
        "type":"Organization",
        "updated_at":"2015-08-15T18:38:23Z",
        "url":"https://api.github.com/orgs/sparkfun"
    }
]
]

At a glance, I can see that the SparkFun organization has been on GitHub since October 2009 and currently has 554 public repositories. Because I'm a member of the SparkFun organization, I also get to see additional data that non-members won't see when they request the organization info, such as the billing email address, billing plan info, etc.
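If you would rather have a script do the glancing, the two facts above can be pulled out of the response dictionary. This is a small sketch that assumes the payload has the "name", "created_at" and "public_repos" keys shown in the listing; summarize_org is a made-up helper name, and the sample dictionary is trimmed from the output above.

```python
from datetime import datetime


def summarize_org(org):
    """Turn an organization payload into a one-line, human-readable summary."""
    # The API reports timestamps as ISO 8601 in UTC, e.g. 2009-10-21T22:17:05Z
    created = datetime.strptime(org["created_at"], "%Y-%m-%dT%H:%M:%SZ")
    since = created.strftime("%B %Y")
    return f"{org['name']}: on GitHub since {since}, {org['public_repos']} public repositories"


# Trimmed copy of the SparkFun response shown above
sample = {
    "name": "SparkFun Electronics",
    "created_at": "2009-10-21T22:17:05Z",
    "public_repos": 554,
}
print(summarize_org(sample))
```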

In contrast, if I query Adafruit on GitHub, I get a lot less data back:

[
[
    200,
    {
        "avatar_url":"https://avatars.githubusercontent.com/u/181069?v=3",
        "blog":"www.adafruit.com",
        "company":null,
        "created_at":"2010-01-12T23:57:58Z",
        "description":null,
        "email":null,
        "events_url":"https://api.github.com/orgs/adafruit/events",
        "followers":0,
        "following":0,
        "html_url":"https://github.com/adafruit",
        "id":181069,
        "location":"New york city",
        "login":"adafruit",
        "members_url":"https://api.github.com/orgs/adafruit/members{/member}",
        "name":"Adafruit Industries",
        "public_gists":1,
        "public_members_url":"https://api.github.com/orgs/adafruit/public_members{/member}",
        "public_repos":419,
        "repos_url":"https://api.github.com/orgs/adafruit/repos",
        "type":"Organization",
        "updated_at":"2015-08-23T18:43:44Z",
        "url":"https://api.github.com/orgs/adafruit"
    }
]
]

As you can see, none of those juicy billing details or private repo information shows up for me - which is a pretty good security feature!
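To see exactly which fields are reserved for members, you can compare the keys of the two responses. The sketch below uses trimmed stand-ins for the SparkFun and Adafruit payloads, with None in place of the redacted values; member_only_fields is a hypothetical helper name.

```python
def member_only_fields(member_view, public_view):
    """Return the keys that appear only in the member's view of an organization."""
    return sorted(set(member_view) - set(public_view))


# Trimmed stand-ins for the two responses above (redacted values replaced by None)
sparkfun = {
    "login": "sparkfun",
    "public_repos": 554,
    "billing_email": None,
    "plan": None,
    "total_private_repos": None,
}
adafruit = {"login": "adafruit", "public_repos": 419}

print(member_only_fields(sparkfun, adafruit))
```

Run against the full responses, the same comparison should list every key that is missing from the Adafruit output, such as the billing and private repo fields.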

Jeff is watching

Security of your information is important - you never know who might be watching.

While you can call data from the API without authenticating or signing in, you are limited in what data you can get back and in how many requests you can make in an hour. Luckily, it's easy to authenticate with a Personal Access Token by initializing your script with the following command: g = Github(token='<BIG LONG TOKEN NUMBER HERE>'). Alternatively, you can use your user name and password - this will depend on your particular application and what security method is preferred.
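For a sense of what token authentication and the hourly limit look like outside of agithub, here is a rough standard-library sketch. It assumes the token is sent in an Authorization header and that the response carries X-RateLimit-Limit and X-RateLimit-Remaining headers, which is how the GitHub API reports your quota; build_request and rate_limit are made-up helper names.

```python
import urllib.request


def build_request(url, token=None):
    """Create a request, adding the token header only when a token is given."""
    headers = {}
    if token:
        headers["Authorization"] = f"token {token}"
    return urllib.request.Request(url, headers=headers)


def rate_limit(headers):
    """Read the hourly quota from a mapping of response headers."""
    return {
        "limit": int(headers["X-RateLimit-Limit"]),
        "remaining": int(headers["X-RateLimit-Remaining"]),
    }


# Example (needs network access and a real token, so it is left commented out):
# request = build_request("https://api.github.com/users/ToniCorinne", token="<TOKEN>")
# with urllib.request.urlopen(request) as response:
#     print(rate_limit(response.headers))
```

Checking the remaining count before a long batch of requests is an easy way to avoid running into the limit halfway through.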

One thing that is easy to forget (especially if you're being good about version control) is that you can accidentally post your token in your repo for the world to see. If you forget at the end of the day and commit your token to a repo, you will probably get a nasty-gram from GitHub warning you that they have revoked the token's access, meaning any running script that relies on it will no longer be authenticated.

GitHub Security

Thanks GitHub for keeping my repos safe and secure!
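One way to keep a token out of your commits is to never type it into the script at all and read it from an environment variable instead. This is only a sketch of that idea: the variable name GITHUB_TOKEN and the helper load_token are assumptions for illustration, not something the demo scripts are known to use.

```python
import os


def load_token(env_var="GITHUB_TOKEN"):
    """Read the access token from the environment instead of hard-coding it."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable before running this script")
    return token


# Usage, following the initialization shown earlier in this post:
# g = Github(token=load_token())
```

Because the token lives in your shell rather than in the file, the script itself is safe to commit and push.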

If you'd like to run the demo scripts I created, check out our repo here. The scripts output the information to text files, which you could parse out at a later date if you so desire. Keep in mind you will need to enter your own authentication token or your user name and password. Don't push those to GitHub if you decide to save your scripts!
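As a rough idea of the save-now, parse-later pattern, the sketch below writes a (status, data) result to a text file as JSON and reads it back. It is not taken from the demo scripts; the file name, the helper names save_response and load_response, and the trimmed sample data are all assumptions for illustration.

```python
import json
import tempfile
from pathlib import Path


def save_response(path, status, data):
    """Write a (status, data) result to a text file as indented JSON."""
    Path(path).write_text(json.dumps([status, data], indent=4, sort_keys=True))


def load_response(path):
    """Read a saved result back into a (status, data) pair."""
    status, data = json.loads(Path(path).read_text())
    return status, data


# Round trip through a temporary directory so nothing is left behind
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "sparkfun.txt"
    save_response(path, 200, {"login": "sparkfun", "public_repos": 554})
    status, data = load_response(path)

print(status, data["public_repos"])
```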

Have you used the GitHub API with any of your projects? We'd love to hear about it!