How an interview code submission that wasn’t even submitted changed our process
In a previous role, I was an engineering manager at a well-known company, responsible for a particular tech stack. We were heavily involved in the community and allowed remote hires, so we received a constant influx of applications for open roles. One way we sorted through them all was by requiring coding tests of potential candidates. As you can imagine, the results ranged from massively impressive to making us wonder whether the candidate was messing with us. But one truly stood out, and it taught me to think about what I am really looking for in these sorts of submissions.
First, a disclaimer.
I know the practice of asking people to code for free just to get an interview is not popular right now. The pros and cons of this are a completely different discussion altogether. During the time this story occurred, it was more accepted, and we had a lot of success with it. There was a time when people didn’t have a bunch of public repos to peruse.
The submission process
We made a public repo on GitHub under our org account for the purpose of tests. The instructions and ask were simple, and laid out in a README.
Instructions:
1. Fork this repo on GitHub.
2. Create a program that can interactively play the game of Tic-Tac-Toe against a human player and never lose.
3. Commit early and often, with good messages.
4. Push your code back to GitHub and send us a pull request.
We are a [REDACTED*] shop, but it is not a requirement that you implement your program using that tech stack.
*Tech stack removed to protect the innocent.
That was all we asked; we intentionally left it open-ended. Some people tried to impress us with huge, elaborate apps that used different services and engines. We had one submission that worked through a CLI just because “I was bored and wanted to try it.” We tried to keep an open mind when reviewing the candidates, and multiple people were required to weigh in before a final decision. If there were more approvals than rejections on the PR, we brought the person in to learn more.
We were not looking as much at the tech, typos, edge case bugs they left, or even if their tic-tac-toe engine never actually lost; we had many unanimous approvals for apps that sometimes lost. We wanted to see factors that tended to align with our team and workflows: How often did they check in? How were the commit messages? Were tests added or needed? How readable and organized was the project?
There was no hard list, but we found early on that these were better factors to look at than just whether the app won all the time. We tried to be fair, but we often saw submissions that were not in good faith. Sometimes we saw apps that were straight copy-and-pastes from other sites — we began to recognize them after a while — with comments and credits from a whole other author. But more often than not, if the app made sense and the code was readable, even if in a very different style than what we were used to, we were happy to talk more and ask the applicant about it.
The one that stood out
After talking with a recruiter who introduced us to a particular candidate, it became obvious that it might be tough for this applicant to move forward with the coding test. The person was very busy at work and was worried they couldn’t do a code submission in time. I let them know there was no time limit or minimum for how many hours they had to work on it; we wanted to make a decision in about two weeks, so that would work for us if it would for them. Everyone gave a thumbs-up, and we were hopeful for a good submission.
What more am I trying to learn?
At two weeks minus a day, we got a pull request and an email. We looked at the PR first, and, sadly, there was little to go on. The structure of the app was well organized, and we could see the path they would have taken — commits were frequent and with good messages — but the meat of the app was missing. We couldn’t even run it, and we were pretty sure that time had simply run out on them. I read the email, and they were very apologetic. They explained that they didn’t have time due to work and personal issues, so their submission was incomplete. But then, over the next three paragraphs, they explained what they would have done.
- They linked to articles on Minimax that they were going to take as inspiration. They wondered if Negamax may be faster and would have tried to find out.
- They listed parts they thought would be tough to deal with based on experience and listed some things they would try if plan A failed.
- They wrote how they would add tests for certain sections but wouldn’t for others and gave a quick explanation on what they called “test bloat” and why they tried to avoid it.
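On the first point: the candidate only linked to articles, so none of this is their code, but the Minimax idea they referenced can be sketched in a few lines of Python (the board representation and scoring convention here are my own assumptions). Negamax is the same search with the score negated at each ply rather than tracked per player.

```python
# Illustrative Minimax for tic-tac-toe (not the candidate's code).
# Board: a list of 9 cells, each "X", "O", or " " (row-major order).

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def winner(board):
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

def minimax(board, player):
    """Return (score, best_move) for `player`: +1 means X wins, -1 means O wins."""
    won = winner(board)
    if won:
        return (1 if won == "X" else -1), None
    moves = [i for i, cell in enumerate(board) if cell == " "]
    if not moves:
        return 0, None  # board full: draw
    best = None
    for move in moves:
        board[move] = player                     # try the move
        score, _ = minimax(board, "O" if player == "X" else "X")
        board[move] = " "                        # undo it
        if best is None or (player == "X" and score > best[0]) \
                        or (player == "O" and score < best[0]):
            best = (score, move)
    return best

# X to move with two in a row on top: the search finds the winning cell.
score, move = minimax(list("XX OO    "), "X")
```

An engine built this way can never lose, since it always picks the move with the best guaranteed outcome.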
The points were concise but still very clear. Normally, I would have replied with well wishes on their challenges and mentioned that I would reach out if we started another round. But after thinking about it, I wondered: what more am I hoping to learn?
We had a few early pieces of their code to see a little of their style, and their write-up showed their thought process on how they would move forward and address pitfalls. Even the commit messages, as few as they were, showed clarity and consideration for the reader. I compared the factors that stood out in this submission to other, more finished examples and noticed that I got just as clear a view of the candidate from this submission with an explanation as I did from others with unanimous approvals. So I copied the candidate’s three paragraphs, added my thoughts and a link to the PR, and emailed it all to the pool of reviewers before I went to my next meeting. When I came back, I had three replies in the email chain that just said, “Ship it.”
The lesson
I thought about that submission a lot. Why did this work so well when it was such a big deviation from what we had planned? How did we get such a strong impression of them with very little code? What am I really looking for in these code submissions? The takeaway wasn’t that they got the interview (they did great) or that they got an offer (they politely declined); it was in asking “what are we really looking for in these tests?”
It is a tough question. The change for us wasn’t immediate; it was far more gradual, a widening of the range of ways submissions could be done. Do we even need code? How much code? Let’s try less. Let’s try none! Do we just skip to the phone screen and talk about how they would start?
We watched more people participate in the tests and saw the number of great candidates that came along with this more open-minded approach.
What we do now
Since then, I have moved away from coding tests before having a conversation. There are now many avenues to see how people develop that don’t require them spending a whole night coding for just a chance at an interview. Interviewing and screening will always be hard to do well; they take a lot of work and understanding. Now I try to take a moment before each step and remind myself what I am attempting to learn with the process I am working from, inside and outside of interviewing. There is always fluff that feels comfortable and obvious, but when we are able to focus on what we are really trying to learn, and the wide array of ways we can learn, it becomes a lot easier to connect with people. And in the end, whether in hiring, coaching, or giving recognition, connecting with people is the goal for a manager.
Tags: bulletin, interviews, stackoverflow
16 Comments
Your coding test was better than most in that it tested an area where even highly experienced candidates are likely to show enormous differences in ability, namely their effective use of collaboration tools. Abandoning that test says your company has the resources to train those basic skills. Not all companies have that luxury.
Not necessarily. It says, as the author noted, that people today generally “have a bunch of public repos to peruse”. In other words, these days you typically don’t have to TEST someone’s ability to code collaboratively in a DVCS-based environment: if they’re worth talking to, it’s pretty likely they’re ALREADY doing that, and all they need to submit is a link to their GitHub (or other public repo) account.
If you’ve got a potential candidate with NO public development “paper trail” at all — maybe they’re just starting out, maybe they’ve been working exclusively on for-pay or contract code managed in private repos, who knows? — then how to best gauge their abilities becomes more of an open question.
But even then, remember that this test gated the ability for someone to BECOME a candidate. You had to pass this test to get the initial interview, not the job. Eliminating that test doesn’t commit you to hiring people without those skills, or to training them once they are hired. Some companies still will, because as you say they have the resources, and are willing to devote them to grooming more entry-level candidates who may not yet possess all of the required skills.
But giving candidates a busywork homework assignment is, TODAY, pretty time-wasting and borderline insulting for the majority of candidates who can just give you a Github URL, and that alone provides a far deeper insight into their skills, knowledge, style, and ability to work with others than some contrived programming project.
Ben, thank you for presenting this well thought out inside look at a real manager’s focus on what matters to you in a quality candidate. It’s not often we get to see what the hiring manager is really thinking and is willing to let ‘outsiders’ know what it takes to succeed with a company such as yours. I found your post searching for a way to connect VSC to Chrome for debugging cross platform applications. Thanks for the insight into what it really means to code for a living (I’m an electronics tech, troubleshooting to the component level. I just thought I’d branch out and see what coding is like, so I could build something that could squash SPAM!)
As a Business Analyst / Product Owner, I would say you focussed on your requirements rather than on the solution 🙂
It sounds trite, but I’ve regularly surprised myself throughout my career with how much value I’ve added by using this approach in discussions with developers.
If you cannot finish a basic task such as this in two weeks, it shows either that you do not care about getting the job, or that you have bad time management skills and waited until the last minute to start.
This task was very open-ended, and as the designer of the program you can make design decisions to reduce the work for yourself. With tic-tac-toe there are three main subtasks with complexity.
1. The user interface
2. Enemy AI
3. Checking for the win condition
Number one can be simplified by using a simple CLI. Input validation is simple bounds checks and checking if the desired cell is empty.
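As a sketch of that input validation in Python (the 1–9 cell numbering and the function name are my own assumptions, not from the comment):

```python
# Validate a CLI move: parse, bounds-check, and confirm the cell is empty.
# Board: a list of 9 cells, each "X", "O", or " "; the player types 1-9.

def parse_move(raw, board):
    """Return a 0-based cell index, or None if the input is invalid."""
    try:
        cell = int(raw.strip()) - 1
    except ValueError:
        return None                 # not a number at all
    if not 0 <= cell < 9:
        return None                 # out of bounds
    if board[cell] != " ":
        return None                 # desired cell already taken
    return cell

board = [" "] * 9
board[4] = "X"                      # center already occupied
```

A real game loop would just keep re-prompting until `parse_move` returns a cell.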
Number two can be simplified by finding a simple strategy that will never lose. A simple strategy I came up with is having the AI always go first and having it choose the center square. For the AI’s subsequent moves you simply place them adjacent to the player’s last move. You can reuse the cell input function from number one. This means all your AI has to do is try to make a placement in one of the 4 adjacent cells until it succeeds.
Number three can be simplified by knowing that your AI will never lose. You also know that your winning move will always be on an outer edge or corner. Additionally, we know that a winning move must pass through the center due to the strategy. This means checking for the end of the game is done by checking the tile on the opposite side or corner of the board when your AI makes a move, and by ending the game in a stalemate once 9 turns have been taken in total.
It probably takes longer to customize your resume and write a cover letter than to implement this.
It depends on what you’re looking for. Good engineers should demonstrate that they have an insight into what they’re being asked. The submission in question showed by writing out what they would have done that they understood that the hirers wanted to understand more about their approach.
Just completing the task completely misses the point of the recruitment process – if you only ask for code that delivers the application, then that doesn’t tell you much about the candidate, and it also leaves the test open to plagiarism (yes, you have tools to check for that, but that’s not the point).
All you’ve done is given a potential solution which anyone could do with a bit of searching on the internet. Recruiting developers should be trying to determine engineering approach and ability to work in a team. Your strategy is pretty specific so how does a hiring manager get an insight into how you might apply this to e.g. a real world problem involving distributed systems and concurrency?
Bad at time management? OK, what if the candidate has a sickly partner and two kids on top of a full-time job? Does it still mean they are bad at time management when they have other important obligations that can’t just be put aside? Maybe, because of all that, the candidate is actually very good at time management?
Please don’t be a douchebag.
Not everyone is a programming genius like you.
Wow such smart, so skilled. Go on fill in all the jobs of the world, take all the salaries and save us from the misery of having to work for, or with someone like you.
You’ve gone out of your way not to mention the company or tech stack, but there is a fork of the repo on your GitHub account.
This is really neat! I intuitively knew that approach to coding matters more than how you write a piece of specific code. Your article made that subconscious thought more concrete. Thanks for sharing.
I’m not sure assessing someone’s GitHub account offers the necessary insights. Typically, GitHub projects are passion projects: open-ended, stop-start, motivated because what they’re building is their passion. If a project is built by a lone developer, comments and documentation will typically be absent – not to thwart any teamwork, but simply because that is the time-saving thing to do when you develop alone.
A job demands something entirely different. It requires a specific task be completed – one the developer may not necessarily enjoy. It requires that task be developed to a deadline, none of this ‘stop-start’ stuff. It requires comments, documentation for other team members. It also requires social interaction. Some of the best programmers in the world can have terrible personalities that clash in a team environment. Such cooperation, social skills, etc, cannot be gleaned simply by looking at a repo.
Your own edge-case scenario, used to justify scrapping the code tests, is arguably a failure that argues against that action. The candidate not only failed to complete an arguably simple task, but they ultimately declined the offer, which means everyone’s time was wasted. It also means you have zero basis for justifying the changes – you don’t know what they actually would have been like in a work environment. For all you know, the lack of completion could have been an ongoing trend, one which would have gotten worse with more complex projects.
Additionally, the coding test setup wasn’t inflexible, so there wasn’t a mistake to begin with. You granted exceptions in scenarios where you felt people still had a chance, were still able to filter out bad-faith candidates (who tried to copy-paste code), and ultimately hired someone else – someone who, I imagine, not only completed the code but also didn’t cause any grief or issues. This suggests, largely, that the test was still working.
I’d argue the failure was somewhere else: why did they decline? Was the company not forthright in its contract hiring terms, and the candidate only found out at the last minute that they weren’t suitable? Insufficient pay? Non-ideal work environment? Or was the candidate merely chancing their arm, with no real commitment to the job – just like they had no real commitment to finishing the code?
I think your test performed fine, and that you took the wrong lesson from it. A candidate you couldn’t even hire anyway would have been screened out, and by granting exceptions you ended up wasting time.
We used coding tests, but they ended up being very small: a single class or even a single method (merge several sorted arrays into a single sorted array, see if a collection of letters are all contained in an input string, etc.). We didn’t want something that people were going to spend a lot of time on, and we didn’t want something that would have required serious problem-solving or some kind of Eureka! moment to figure out a trick. We didn’t use these coding problems as a ticket to an interview; instead, we interviewed everybody who got to that point. A big part of the interview would be a walkthrough of the solution and a defense of the decisions made.

The submissions ranged all over the map. Some people would turn in an entire .zip with a complete solution, build scripts, unit tests, etc. Other people would turn in just a single code file with the class and method but no infrastructure to run it. One guy turned in a .txt file with just a few relevant lines of code that took a lot of effort for us to get to compile. Sometimes the code samples were clearly borrowed from online sources, and the candidate would really struggle to explain how they worked. We saw some really bad code quality in the submissions, even from people who had public GitHub accounts with very nice-looking code. Apparently people use two completely different styles when they’re on GitHub versus when they’re trying to get a job? One guy said he didn’t think he needed to “write production-quality code just for an interview,” but promised us he could write such code if we hired him.

I still feel like code samples have some benefit, especially because not all candidates have public GitHub accounts, and there are a lot of bozos who can’t write code to save their lives who were weeded out by it, but it certainly wasn’t a perfect tool.
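To illustrate how small those exercises were, here is one possible take in Python on the “merge several sorted arrays” problem (a sketch of a standard heap-based approach, not the commenter’s actual test or expected answer):

```python
import heapq

def merge_sorted(arrays):
    """Merge several already-sorted lists into one sorted list via a min-heap."""
    heap = []
    for i, arr in enumerate(arrays):
        if arr:
            # (value, array index, element index) keeps heap tuples comparable
            heapq.heappush(heap, (arr[0], i, 0))
    result = []
    while heap:
        value, i, j = heapq.heappop(heap)
        result.append(value)
        if j + 1 < len(arrays[i]):
            # replace the popped element with the next one from the same array
            heapq.heappush(heap, (arrays[i][j + 1], i, j + 1))
    return result

merge_sorted([[1, 4, 9], [2, 3], [5]])  # → [1, 2, 3, 4, 5, 9]
```

Python’s standard library also ships `heapq.merge`, which performs the same k-way merge lazily, so even this exercise leaves room for a walkthrough conversation about trade-offs.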
This fascinating article about one limitation of coding tests only scratches the surface of how counterproductive they are.
Oh, sure, I know the usual excuses for it, such as “we need to weed out all the fakers”, but the one big problem that is the source of so many problems with the idea is really very simple: people who are expert at coding are not often experts at testing for coding abilities, which is really the expertise of a good teacher.
But it gets worse: I was contacted by a recruiter for a Facebook position, she thought it was legit for her, a non-technical person, to read out technical questions, jot down my answers and pass them on to technical people for assessment! There are so many things wrong with this approach it is hard to know where to begin!
But so far, we haven’t found the best (or even a good) approach yet, have we? All mid- and big-size companies have a really hard challenge finding good candidates.
The task set was pretty simple, if not trivial. I just tried it and did it in under 45 minutes, so anyone who spends all night coding a solution is someone you don’t want anyway. And that’s the right level of complexity for a question – you don’t want to ask too much before an interview. For a task like this, there should be only one commit as it’s possible to write the code out, do a little manual testing, and be finished. Every commit should be a self-contained piece of work which results in code which the committer believes is working correctly. In a mature product, that might be something trivial such as fixing a misspelling, but when writing new code it has to be a big enough chunk that it works, so probably several hours at least. You might start with a trivial UI (e.g. command line) and simple game mechanics, or a simple UI with trivial game mechanics (e.g. for chess, just make a random legal move). But this task has a trivial UI and trivial game mechanics, so there should be no intermediate point worth recording.
On your candidate – they were terrible.
1. There should be no minimax, negamax, or suchlike as this is a very simple game and the winning strategy is easily available (e.g. on Wikipedia). Anyone who approaches this as if it were a novel and complicated game is someone who will put lots of unwanted complexity in their code, reinventing solutions already known.
2. There should be no parts which are tough, as this is such a simple task. If they find something tough, either they are no good at solving the problem or they have gone about it the wrong way and are making it complicated. Either way, you don’t want them getting near your codebase.
3. There should be minimal tests for something like this. (This may be what they said – the article is not clear.) It’s a simple, self-contained piece of code. If someone wants tests so that they can refactor it, then they should stop immediately and explain what’s so wrong that it needs refactoring. The rules of the game are very well established and won’t change, and the only part likely to need to change is the UI, but since the game playing is so simple, if you need a different UI you might as well just write the whole thing from scratch. Otherwise you’re wasting a lot of effort on testing this solution which would be better spent on something else. This is a classic YAGNI.
I have interviewed many candidates over the years and found that talk is cheap, but coding is valuable. Quite a few candidates can bluff their way through an explanation of what they would do but then prove unable to do it. Sometimes not even knowing basic features of a language that they claimed to be expert in – once to the point that someone claiming to be an expert in C could not declare a struct.