Q: If we could have done one thing to improve your experience what would it be?
End-of-year (EOY) is an excellent time for a retrospective on your app’s growth, specifically looking at where you didn’t grow. At GitHub we conduct a short annual survey, dubbed “The GitHub 365.” We use the 365 to examine what happened with people who signed up for an account but at some point stopped returning.
Let’s talk about your product:
- Who are your newest users that don’t come back?
- Why did they leave?
- What is one thing you could have done differently to help them succeed?
- How (and should) you try to bring them back?
Designing “GitHub 365”
From a technical perspective, the GitHub 365 is an annual retrospective cross-sectional study that captures and depicts a snapshot of why new users went inactive within their first year of account creation. From a personal perspective, the 365 is our opportunity to listen to people who left us; in other words, these are not our superfans.
A 365 study looks back at a year’s worth of inactive account data and can be used to help your organization hit the ground running in January when you’re most likely to have a burst in seasonal sign-ups (those new year resolutions!).
The following are some of the high-level observations from this year’s GitHub 365. My goal in sharing is to inspire you to create a similar end-of-year or start-of-year research activity, or at least to start talking about it with your teams.
Inactive, but not abandoned
When we think of inactive users we think of abandoned accounts. In this study we reached out to 100,000 inactive accounts created from December 2014 — December 2015. More than 3,000 people responded.
Q: Which best describes why you stopped using GitHub?
People shared thoughts and experiences through closed-response options and open-ended responses. More than 50% of inactive-account respondents indicated that they intend to return to GitHub someday. That figure is a quantitative insight, but by reading open-ended responses for the category of “Other” (qualitative data) we also learned that having an “inactive” account doesn’t always mean that people aren’t currently using GitHub (more about this in the quotes below).
Ultimately, this year’s study provides us with a new vantage point from which we can more clearly observe new user behavior, including perceived inactivity. Today’s newest users consider GitHub to be a content platform, but newcomers don’t often have ongoing project work (e.g. school assignments, freelance gigs, career-changes).
Before going more deeply into the data and insights, it’s a personal nit of mine to start with the instrument design and the bias we surfaced.
To understand the data, let’s look at how we designed the survey instrument and recruitment strategy.
Survey instruments are deceptively difficult to design well. The 365 needed to be short, interesting, and encouraging, because we were reaching out to people who left GitHub (essentially said, “See ya!”). With abandoned accounts in mind, here are the criteria we used for the 365 instrument design:
- Less than one minute to complete.
- Five required product/behavioral questions about activity (closed answers and an “Other” for chatty folks).
- Five optional questions about personal demographics.
The core question:
Q. Which best describes why you stopped using GitHub?
. . . solicited brilliant open-text responses like:
A. Just climbing the first steps on the ladder. Thank you!
When we think of inactive users we think of abandoned accounts. However, in this study we reached out to 100,000 inactive accounts created from December 2014 to December 2015, and we quickly learned from respondents that people who fall into GitHub’s inactive category don’t line up with our traditional definition.
Instead, most inactive accounts are temporarily dormant. It turns out that people’s projects end, life changes happen, or our humans don’t feel the need to log into the site to get value and instead browse as logged-out users. Who knew!?!
To reach inactive users, we sent out an email with a link to the survey to a random sample of 100,000 qualified accounts. Accounts qualified if they were:
- Created between 6 months and 1 year ago (between Dec. 2014 and June 2015).
- Had a verified email and were not flagged as spammy.
- Had not been “active” (not pushed code, created any repos, issues, or pull requests; or commented on issues or pull requests) in the previous 3-month period (we wanted to make sure to avoid any seasonal lulls with October — November holidays).
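The qualification rules above can be sketched as a filter followed by a random draw. This is a minimal illustration, not GitHub's actual pipeline: the `Account` fields, the `qualifies` and `draw_sample` names, and the day-count approximations (182 days for six months, 90 days for three months) are all assumptions made for the example.

```python
import random
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class Account:
    # Hypothetical record shape; real account data will differ.
    login: str
    created: date
    email_verified: bool
    flagged_spammy: bool
    last_active: Optional[date]  # None if the account was never active


def qualifies(acct: Account, today: date) -> bool:
    """Apply the three survey criteria to one account."""
    age_days = (today - acct.created).days
    # Created between roughly 6 months and 1 year ago.
    created_in_window = 182 <= age_days <= 365
    # Verified email and not flagged as spammy.
    contactable = acct.email_verified and not acct.flagged_spammy
    # No qualifying activity in roughly the previous 3 months.
    cutoff = today - timedelta(days=90)
    inactive = acct.last_active is None or acct.last_active < cutoff
    return created_in_window and contactable and inactive


def draw_sample(accounts: List[Account], today: date,
                size: int, seed: int = 0) -> List[Account]:
    """Randomly sample up to `size` qualified accounts, without replacement."""
    pool = [a for a in accounts if qualifies(a, today)]
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    return rng.sample(pool, min(size, len(pool)))
```

Seeding the generator is a design choice for auditability: the same invitation list can be regenerated later when comparing respondents with non-respondents.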
We always begin our analysis by determining who the respondents are, looking for blind spots in our recruitment, and identifying areas of bias. We start with weaknesses in the data so we can apply rigor in our analysis.
In this case, we looked at respondents’ IP addresses or account records, and incorporated data (when available) on the following attributes to surface overrepresented and underrepresented groups:
- Account creation date
- Organization membership (paid and free)
- Account age at last activity (tenure)
We additionally solicited self-reported data from respondents on personal background and demographics, including:
- Programming experience
- Formal computer science training
Our team reserved some of this data for analysis in later studies, but made heavy use of the data on current technical background (CS degree v. self-taught) in this study because it is a strong predictor of responses.
Q: Did you study computer science formally?
A: 58% self-reported CS training and 42% shared that they went their own path outside of school.
Assessment of selection bias
Before we look at some of the results, let’s address the elephant. All surveys are subject to selection bias, since users who respond to an invitation to participate in research are systematically different than those who choose not to. Of the 100,000 inactive users we sampled, 3,000+ completed the survey, for a response rate of 3% (note: these folks have left GitHub, so we were pretty pleased with the response rate and quality of feedback in open-ended text fields).
We started by examining the differences between respondents and non-respondents to measure the bias introduced by the process of self-selection into survey participation. Skipping the nerdy details, I’d like to emphasize analyzing who + bias is a critical step to take before you evaluate any data set.
Know who responded, shine a light on where you have blind spots and bias.
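One simple way to look for over- and underrepresented groups is to compare each group's share among respondents with its share in the invited sample. The sketch below is only an illustration of that idea, not the analysis our team ran; the `representation` function name and the toy attribute values are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def representation(sampled: Sequence[Hashable],
                   respondents: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Return respondent share divided by sample share for each group.

    A ratio above 1 suggests the group is overrepresented among
    respondents; a ratio below 1 suggests it is underrepresented.
    """
    if not sampled or not respondents:
        raise ValueError("both groups must be non-empty")
    sample_counts = Counter(sampled)
    resp_counts = Counter(respondents)
    ratios = {}
    for value, count in sample_counts.items():
        sample_share = count / len(sampled)
        resp_share = resp_counts.get(value, 0) / len(respondents)
        ratios[value] = resp_share / sample_share
    return ratios


# Toy, made-up data: organization membership for invited accounts
# versus the subset that answered the survey.
invited = ["org"] * 20 + ["no_org"] * 80
answered = ["org"] * 4 + ["no_org"] * 6
ratios = representation(invited, answered)
```

In this made-up example, organization members make up a larger share of respondents than of the invited sample, so any finding that correlates with organization membership would deserve extra caution.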
Qualitative data & insights
As mentioned, more than 50% of respondents indicated that they would be back, and were just busy. To better understand both the behavior and the motivations, we looked deeply into the open-text responses to the question:
Which best describes why you stopped using GitHub?
Open-ended responses are unstructured data that require hand-coding (a labor of love) into categories. In the category of “Other” the majority of open-ended responses were categorized in the following seven buckets:
1.) Project-driven. People who signed up for GitHub to work on or download a specific project (includes references to online courses like Coursera and consulting gigs).
We built our site, launched and haven’t been maintaining it. No need for GitHub right now.
I made this account ahead of time, i’m currently still studying to be accepted to a full time CS study and thought this would be useful. Not enough time to start using it yet.
Dropped class I was taking that required a GitHub account. Thanks for the notification; I will need to delete the account some time soon!
We mainly used GitHub for a temporary enterprise project. It provides us the best way to exchange our information and we were very glad of using it. This temporary project ended so, for the moment, we don’t use it. But be sure that if needed we will use this again
2.) Exploring/Curious. People who signed up for GitHub to explore the platform and projects. May still be exploring as a logged-out user, and may also have realized that GitHub is a product mainly for programmers and doesn’t meet their needs.
Just exploring. Will be back soon. Love everything Linus!
I’m just not logging in, but using it a lot
I’m the business guy — the development team uses Git
I was exploring programming, but for now my interests are leading me in a different direction.
3.) Occasional & forgetful. People who signed up and forgot about GitHub, or only occasionally use it (may be using it as a logged out user).
I don’t need to use version control very often, but occasionally I need it. Like once a year occasionally.
4.) Worker Bots. Accounts created for machine users (e.g. CI, CD, and other integrations).
5.) Life. The most surprising and heartstring-pulling. People who signed up, but experienced life changes in the realm of career, personal, and even political spheres (being called to war), and are not currently using GitHub.
LIFE — kinda sucks lately but I’ll be back
We are at war and my skills are being used in other places.
Promoted, no need for it at this time
Changed jobs. I no longer supervise a programmer that uses GitHub.
I am taking data science courses that require GitHub & am a disaster responder with the American Red Cross. I began deploying shortly after opening my account. The disaster season is ramping down so I will be back on GitHub shortly. :)
6.) Multiple Identities & 2FA. People who signed up for a secondary account to be used for work or personal (segregating worlds), or were locked out via 2FA and created a new account because they were unable to recover their password.
Personal account I use for test purposes. I use another (more active) account for business purposes.
I forgot my username and created a new account and am using that.
It is because i have not needed a version control system and i have another account for work that we use so don’t worry i’m still here
7.) Potatoes & other tales. A catch-all category affectionately named.
My code is potato. No sense potatoing the nice things on GitHub
I’m not exactly sure what to do with github. I work alone.
As a perfectionist, i like to get something useful with best quality in mind before letting the community try to change something or get to the backbone of my work.
If not GitHub, then what?
Q: Which version control system are you primarily using for code?
Graph by Frances Zlotnick, GitHub’s first data scientist
Experience. Programming experience strongly conditions the responses to this question. A large majority of both people who do not program and newer programmers report that they don’t use any other version control system for code, while slightly less than half of moderately experienced programmers and a small percentage of very experienced programmers say the same.
SVN. While SVN is the most frequently cited competitor selected over Git and GitHub, this varies quite a bit by market. The frequency of SVN in this sample is due largely to respondents in foreign markets, where SVN is much more common.
Human Age + VCS. BitBucket, GitLab, and Dropbox users are significantly younger than those who are instead using SVN, Mercurial, or TFS.
Team Foundation Server (Microsoft). TFS users are more likely to have formal CS training, while Dropbox users are particularly unlikely to have studied programming formally.
VCS + Education. More than half of those not using any VCS have no formal training in programming.
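Breakdowns like the ones above come from cross-tabulating two survey answers and reading the shares within each row. Here is a minimal sketch of that tabulation, assuming each response has been reduced to an (experience level, VCS) pair; the `row_shares` name and the sample values are hypothetical and do not reproduce our survey data.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def row_shares(responses: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, float]]:
    """Cross-tabulate (experience, vcs) pairs into per-experience shares."""
    counts: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for experience, vcs in responses:
        counts[experience][vcs] += 1

    shares = {}
    for experience, by_vcs in counts.items():
        total = sum(by_vcs.values())
        # Each row sums to 1, so rows with different sizes stay comparable.
        shares[experience] = {vcs: n / total for vcs, n in by_vcs.items()}
    return shares


# Made-up responses, for illustration only.
example = [
    ("new", "none"), ("new", "none"), ("new", "git"),
    ("experienced", "git"),
]
table = row_shares(example)
```

Normalizing within each experience level, rather than across the whole sample, is what lets you say "a large majority of newer programmers" even when the groups differ in size.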
Open text data:
Digging into the stories again, here is a selection of open-ended responses that surprised and delighted us:
I use cvs. Don’t blame me…
facebook messages :(
google drive, aka, no real version control system
Not using any and my work is a big mess
What is a control system?
Git built into IDE
I attach a version number to the files I save — much easier than using Git.
By looking back over the past 365 days, this study explores the relationship between inactive accounts and attributes including: geography, account creation range, organization membership (paid/free), activity and drop-off, programming experience and coursework, and responses to questions about motivations and support.
Project-driven. GitHub’s newest users are increasingly pragmatic and project-driven.
Audience-focus. Getting started is especially challenging for users without formal CS training.
Programming experience. Experience programming is a strong predictor of familiarity with version control. Most experienced programmers use some type of VCS, but a plurality of our inactive new users don’t.
The largest group of respondents selected “I’ll be back, I’ve been busy,” which challenges the notion that inactive accounts are abandoned accounts. We looked for differences within the group who said they would be back, and found that they are best characterized by their heterogeneity. They are a bit of everything, which indicates that the phenomenon of starting something and then not having time to follow through is both common and universal.
Our biggest takeaway is that listening to inactive account holders tells us that they might more fruitfully be thought of as dormant rather than abandoned, and could use a gentle ‘git push.’