Review: Should companies conduct multiple interviews for each candidate?

In this blog post, I will review Aline Lerner’s 2016 analysis “Technical interview performance is kind of arbitrary. Here’s the data.” I will discuss the article’s methodology and analyze its conclusion.


Aline’s team collected data from 67 interviewees on the platform. On this platform, individuals can practice interviewing with engineers from top companies. At the end of each interview, the interviewer rates the interviewee on a scale of 1–4 based on both technical performance and communication skills. The article uses this data to estimate the probability that an interviewee will fail a given interview based on their mean score, as shown below:

Retrieved from “Technical interview performance is kind of arbitrary. Here’s the data.”
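To make the quantity in the graph concrete, here is a minimal Python sketch of how a per-interviewee mean score and failure rate could be computed. It is my own illustration and not necessarily how the article computed its numbers: the function name, the sample names, and the scores are all made up, and I am assuming that any score below 3 counts as a failed interview.

```python
PASSING_SCORE = 3  # assumption: on the 1-4 scale, a score below 3 is a fail


def failure_rate_by_mean(scores_by_interviewee):
    """Return {name: (mean score, fraction of interviews failed)}."""
    result = {}
    for name, scores in scores_by_interviewee.items():
        mean = sum(scores) / len(scores)
        failed = sum(1 for s in scores if s < PASSING_SCORE)
        result[name] = (mean, failed / len(scores))
    return result


# Made-up scores for illustration only; this is not data from the article.
sample = {
    "interviewee_a": [4, 3, 2, 4],
    "interviewee_b": [2, 2, 3, 1],
}

for name, (mean, fail_rate) in failure_rate_by_mean(sample).items():
    print(f"{name}: mean score {mean:.2f}, failed {fail_rate:.0%} of interviews")
```

In this toy data, an interviewee with a passing mean score still fails one interview in four, which is the kind of inconsistency the article points to.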

The article uses the graph to claim that a single technical interview does not give a repeatable measure of one’s performance as an interviewee. It concludes that companies should conduct several interviews and use aggregate performance to measure a candidate’s success.

Points for Improvement

A finer-grained scale would make the results more reliable. Currently, interviewers are restricted to a 1–4 scale, with 3 as the passing score. However, an interviewee’s performance might fall somewhere between 3 and 4. For example, consider an interviewee who solved two challenges in 50 minutes but required two hints. Should this interviewee get a 4 or a 3? The scale gives no clear answer, so the interviewer has to make a tough call. A 1–10 scale would be clearer and would suggest a score between 8 and 9 for that interview.

But even with an improved grading system, subjectivity will still affect interview scores. While the Appendix section mentions that the article compared interviewees rated by the same interviewer, ratings can still vary subjectively between individual interviews. On the platform, every interview consists of a different programming challenge, and interviewers may have different perceptions of the expected time and difficulty of each challenge. Moreover, differences in personal attributes between the interviewer and the interviewee may affect the score. It is possible that if both parties are from the same country, they will bond and the interviewee will receive a higher score. It is also possible that the opposite will happen. More research into the effect of personal attributes such as sex, age, and country may give a clearer picture of the scope of the results.

The Article’s Solution

Finally, I want to highlight that even though aggregate performance is more accurate than a single interview, the article’s solution is not practical. Conducting multiple interviews for each candidate is time-consuming for the company. Moreover, the solution does little for interviewees, since those who mess up an interview at one company can already improve and apply to another company.


While I have no doubt that aggregate performance will provide a better candidate profile, this solution is not practical. I think the article could be improved by using a wider scale and tracking interviewees’ personal characteristics. This would give a better picture of what lies behind the differences in interview performance.