the-crypt-keeper / can-ai-code

Self-evaluating interview for AI coders

Home Page: https://huggingface.co/spaces/mike-ravkine/can-ai-code-results

'senior' coder test suite

the-crypt-keeper opened this issue

The junior-v2 interview is showing its age. I created it back when llama was all we had, and at the time every single open source model failed the test.

The clustering we now see at the top of the leaderboard is a result of the massive improvements in open source coding models over the past 6 months. Anything above 0.95 is effectively a binary pass, so junior-v2 cannot distinguish between models in this range.
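
To illustrate the saturation problem, here is a minimal sketch. The model names and scores are made up for illustration and are not real leaderboard data; only the 0.95 cutoff comes from the text above. It shows how models above the cutoff end up with almost no spread between them, while models below it are still clearly separated.

```python
# Hypothetical scores for illustration only; NOT real leaderboard results.
PASS_THRESHOLD = 0.95  # the cutoff mentioned in the issue


def split_by_threshold(scores, threshold=PASS_THRESHOLD):
    """Split models into those above the threshold and those at or below it."""
    passed = {name: s for name, s in scores.items() if s > threshold}
    rest = {name: s for name, s in scores.items() if s <= threshold}
    return passed, rest


def spread(scores):
    """Gap between the best and worst score in a group (0.0 if empty)."""
    if not scores:
        return 0.0
    return max(scores.values()) - min(scores.values())


# Made-up pass rates: four models clustered at the top, two well below.
example_scores = {
    "model-a": 0.99,
    "model-b": 0.98,
    "model-c": 0.97,
    "model-d": 0.96,
    "model-e": 0.70,
    "model-f": 0.40,
}

passed, rest = split_by_threshold(example_scores)
print("above cutoff:", sorted(passed))
print("spread above cutoff:", round(spread(passed), 2))
print("spread below cutoff:", round(spread(rest), 2))
```

With these invented numbers, the group above the cutoff spans only a few hundredths, which is the kind of clustering that leaves a test with no room to rank the top models.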

A more difficult test suite is needed.

A senior interview suite MVP is now available; GPT-4 can just barely pass it.

If you have any good ideas for interview questions, please open PRs!