Data Science Spotlight: Cracking the SQL Interview at Instacart (LLM Edition) | by Monta Shen | Jul, 2024 | tech-at-instacart


https://tech.instacart.com/data-science-spotlight-cracking-the-sql-interview-at-instacart-llm-edition-52d04bde474c · scraped

data science databases

— 595 words · 2026-02-14 17:43:29 UTC ·

By: Anahita Tafvizi, Michael Curran, Monta Shen

![](https://miro.medium.com/v2/resize:fit:700/1*u9hfTgNB7fANQlppSyloOA.png)

Data Scientists at Instacart need a distinctive combination of skills to succeed on the job: business acumen along with analytical, communication, and technical skills. The Data Science interview loop tests candidates for these skills, beginning with the Technical Screen, which includes a SQL interview where candidates show that they can translate business questions into code that retrieves the correct data from databases. These questions generally give the candidate a set of schemas and common business questions, and the candidate then uses SQL to turn the data into insights.

Here’s an example of such a question:

Using these two tables, ascertain:

1. What are average ratings by day? Include only users with 5+ orders.
2. Is there a relationship between order number and average ratings?

![](https://miro.medium.com/v2/resize:fit:700/1*MkQyYaBxZfogMswl-dLA6g.png)

LLM Cracks the Interview

Before LLMs became popular for coding, Data Scientists had to write code manually to retrieve data from databases and manipulate it to reach the desired insight. Now that LLMs are widely accessible, Data Scientists can write and edit code through natural language, saving significant time and effort.

A more efficient way to ask and answer the interview question above would be to simply ask an LLM. An example prompt would include the schemas above, the questions, and the task.

> Here are the schemas: <<INSERT SCHEMAS>>
>
> Here are the questions: <<INSERT QUESTIONS>>
>
> Here is the task: Write Snowflake SQL to answer the above questions.

Here’s a truncated depiction of what might happen when we use this prompt in Ava, Instacart’s internal AI assistant powered by OpenAI and other models. Ava is able to write all the necessary SQL to answer the interview questions.
![](https://miro.medium.com/v2/resize:fit:700/0*gxzqj82zHtvyVzNu)

In a quick test of this prompt with a few popular LLMs (e.g. GPT-4o, Snowflake Arctic, and Llama 3-70B), each completed these tasks correctly.

Rethinking the SQL Interview at Instacart

Having candidates write live SQL to test their ability to code is both ineffective and a poor representation of on-the-job workflows. It usually forces questions to be extremely simple so that they fit the time constraint, and it unfairly penalizes candidates who do not write SQL daily. Moreover, interviews that can be solved easily with a simple prompt and relevant context are not effective ways to test candidates, especially considering that Instacart Data Scientists will be expected to leverage AI in their workflows.

Given this evolution, we’re making changes to our SQL interview process to orient it around the AI-forward workflows that have become best practice on our team. Now, as part of their SQL interview, Instacart Data Science candidates may be asked to:

- Translate an insight into a prompt for a SQL query: this tests a candidate’s ability to prompt engineer and translate a business question into an actionable data pull.
- Explain and debug a sample SQL query: this tests a candidate’s ability to understand and fix LLM-generated SQL outputs.
- Identify ways to make a sample SQL query more efficient: this tests a candidate’s deep understanding of SQL, both in writing it and in how efficiently it is processed.

This reimagined SQL interview, combined with our other technical and non-technical interviews (e.g. product sense, statistics, cross-functional partnership, analytics), will give the team a better understanding of candidate skills and allow us to continue to up-level the Data Science team at Instacart.

For more information on the interview process and to see our open Data Science roles, please visit Instacart’s Careers Page.
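To make the "explain and debug a sample SQL query" task more concrete, here is a minimal sketch of the kind of bug a candidate might be asked to spot in a generated answer to the first example question (average ratings by day, only users with 5+ orders). Everything in it is an assumption for illustration: the article shows its schemas only as an image, so the `orders` and `ratings` tables, their columns, and the sample rows are hypothetical, and the SQL runs on SQLite through Python's standard library rather than on Snowflake.

```python
import sqlite3

# Hypothetical schema and data: the article's real tables appear only as an image.
SETUP_SQL = """
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user_id INTEGER, order_date TEXT);
CREATE TABLE ratings (order_id INTEGER, rating INTEGER);
INSERT INTO orders VALUES
    (1, 1, '2024-07-01'), (2, 1, '2024-07-01'), (3, 1, '2024-07-02'),
    (4, 1, '2024-07-02'), (5, 1, '2024-07-03'),
    (6, 2, '2024-07-01'), (7, 2, '2024-07-03');
INSERT INTO ratings VALUES (1, 5), (2, 4), (3, 5), (4, 3), (5, 4), (6, 1), (7, 1);
"""

# Average rating per day, restricted to users who pass an order-count filter.
# Only the comparison operator in the HAVING clause differs between versions.
QUERY_TEMPLATE = """
SELECT o.order_date, AVG(r.rating) AS avg_rating
FROM orders AS o
JOIN ratings AS r ON r.order_id = o.order_id
WHERE o.user_id IN (
    SELECT user_id FROM orders GROUP BY user_id HAVING COUNT(*) {cmp} 5
)
GROUP BY o.order_date
ORDER BY o.order_date
"""

BUGGY_SQL = QUERY_TEMPLATE.format(cmp=">")   # drops users with exactly 5 orders
FIXED_SQL = QUERY_TEMPLATE.format(cmp=">=")  # "5+ orders" means 5 or more


def run(sql):
    """Run a query against a fresh in-memory copy of the sample data."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SETUP_SQL)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


print("buggy:", run(BUGGY_SQL))
print("fixed:", run(FIXED_SQL))
```

In this sample data, user 1 has exactly five orders and user 2 has two, so the strict comparison silently returns no rows while the corrected query returns one average per day. The query is syntactically valid either way; the error only shows up when the output is checked against the business question, which is the skill this task is meant to probe.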
