code-review
anthropic claude-opus-4-7 complete
- Pass rate: 33/51 (65%)
- Cost: $15.9900
- Avg latency: 708ms
- Started: May 2, 2026, 10:46 PM
- Triggered: api-seed
Prompt template
You are evaluating a code review comment. Read the diff and the comment, then decide whether the comment correctly identifies a real, actionable issue with the code.

Reply with exactly one word — either "correct" or "incorrect" — and nothing else. No punctuation, no explanation.

A comment is "correct" when:
- It identifies a real bug, security issue, or runtime error in the diff
- It points out a meaningful style problem the diff actually has
- Its core claim is true and applies to the code shown

A comment is "incorrect" when:
- It misses an obvious bug in the diff and instead nitpicks something else
- It claims a bug exists but the code is actually fine
- It suggests a change that would break the code or is factually wrong

__SAMPLE__
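As a rough sketch of how a sample might be combined with this template, assuming the `__SAMPLE__` placeholder is filled by plain string substitution and that each sample uses the `--- diff ---` / `--- review comment ---` layout seen in the Input column. The helpers `build_sample` and `render_prompt` are hypothetical names, not part of the eval tool, and the template string here is an abbreviated stand-in:

```python
PLACEHOLDER = "__SAMPLE__"


def build_sample(diff: str, comment: str) -> str:
    """Format a diff and a review comment in the layout the Input column shows."""
    return f"--- diff ---\n{diff}\n--- review comment ---\n{comment}"


def render_prompt(template: str, sample: str) -> str:
    """Fill the placeholder (assumption: a plain string replace, nothing more)."""
    if PLACEHOLDER not in template:
        raise ValueError(f"template has no {PLACEHOLDER} placeholder")
    return template.replace(PLACEHOLDER, sample)


# Abbreviated stand-in for the full prompt template.
template = "Reply with exactly one word.\n__SAMPLE__"
sample = build_sample("+const x = parseInt(input)", "Pass 10 as the second arg.")
print(render_prompt(template, sample))
```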
Results
| Pass | Src | Input | Expected | Output | Score | Cost | Latency |
|---|---|---|---|---|---|---|---|
| ✗ | A | --- diff ---
def divide(a, b):
+ return a / b
--- review comment ---
This will throw ZeroDivisionError if b is 0 — handle that case. | correct | CORRECT. | 0% exact: expected "correct", got "correct." | $0.2400 | 403ms |
| ✓ | A | --- diff ---
function getUser(id) {
- return users.find(u => u.id = id)
+ return users.find(u => u.id == id)
}
--- review comment ---
Use === instead of == for strict equality. | correct | correct | 100% | $0.2400 | 464ms |
| ✓ | A | --- diff ---
+const total = items.reduce((s, x) => s + x.price)
--- review comment ---
reduce without an initial value will throw on an empty array. Pass 0 as the second arg. | correct | correct | 100% | $0.2800 | 797ms |
| ✓ | A | --- diff ---
+for (let i = 0; i <= arr.length; i++) {
+ console.log(arr[i])
+}
--- review comment ---
Off-by-one: condition should be i < arr.length, not <=. | correct | correct | 100% | $0.2900 | 805ms |
| ✓ | A | --- diff ---
+const password = "hunter2"
+const hash = md5(password)
--- review comment ---
MD5 is not suitable for password hashing — use bcrypt or argon2. | correct | correct | 100% | $0.2300 | 686ms |
| ✗ | A | --- diff ---
+if (user.role = 'admin') {
+ grantAccess()
+}
--- review comment ---
Assignment instead of comparison — should be ===. | correct | incorrect | 0% exact: expected "correct", got "incorrect" | $0.2600 | 1.19s |
| ✓ | A | --- diff ---
+func saveUser(u User) {
+ db.Save(&u)
+}
--- review comment ---
saveUser doesn't return an error, so callers can't tell if the save failed. | correct | correct | 100% | $0.2500 | 779ms |
| ✓ | A | --- diff ---
+const query = "SELECT * FROM users WHERE id = " + req.params.id
--- review comment ---
SQL injection — use a parameterized query. | correct | correct | 100% | $0.3600 | 626ms |
| ✓ | A | --- diff ---
+try:
+ result = expensive_call()
+except:
+ pass
--- review comment ---
Bare except swallows everything including KeyboardInterrupt — catch a specific exception. | correct | correct | 100% | $0.2600 | 670ms |
| ✓ | A | --- diff ---
+std::vector<int> v;
+v.reserve(10);
+for (int i = 0; i < 10; i++) {
+ v[i] = i;
+}
--- review comment ---
reserve() doesn't change size — accessing v[i] is undefined behavior. Use push_back or resize. | correct | correct | 100% | $0.2500 | 446ms |
| ✓ | A | --- diff ---
+def get_config():
+ with open('config.json') as f:
+ return json.load(f)
+ f.close()
--- review comment ---
f.close() is unreachable — and the with-block already closes the file. | correct | correct | 100% | $0.2600 | 581ms |
| ✓ | A | --- diff ---
+let cache = {}
+function memoize(key, fn) {
+ if (cache[key]) return cache[key]
+ cache[key] = fn()
+ return cache[key]
+}
--- review comment ---
cache[key] === 0 or "" or false will be treated as a miss. Use `key in cache` instead. | correct | correct | 100% | $0.3300 | 483ms |
| ✓ | A | --- diff ---
+async function load() {
+ const data = fetch('/api/data')
+ return data.json()
+}
--- review comment ---
Missing await on fetch() — calling .json() on the unresolved promise will fail. | correct | correct | 100% | $0.3200 | 913ms |
| ✓ | A | --- diff ---
+useEffect(() => {
+ fetchData().then(setData)
+}, [])
--- review comment ---
Race condition: if the component unmounts before fetch resolves, setData runs on an unmounted component. | correct | correct | 100% | $0.3000 | 789ms |
| ✗ | A | --- diff ---
+pub fn first(v: Vec<i32>) -> i32 {
+ v[0]
+}
--- review comment ---
This panics on empty input. Return Option<i32> or use .first().copied(). | correct | incorrect | 0% exact: expected "correct", got "incorrect" | $0.3800 | 1.16s |
| ✗ | A | --- diff ---
+const sum = (a, b) => a + b
+sum(1, 2)
--- review comment ---
Should use function declaration instead of arrow function for hoisting. | incorrect | correct | 0% exact: expected "incorrect", got "correct" | $0.2600 | 442ms |
| ✗ | A | --- diff ---
+const items = ['a', 'b', 'c']
+for (const item of items) {
+ console.log(item)
+}
--- review comment ---
for...of on arrays is slower than a classic for loop — switch back. | incorrect | correct | 0% exact: expected "incorrect", got "correct" | $0.3800 | 587ms |
| ✓ | A | --- diff ---
+if (user) {
+ return user.name
+}
--- review comment ---
Use user?.name to avoid the if entirely. | correct | correct | 100% | $0.2400 | 697ms |
| ✓ | A | --- diff ---
+def factorial(n):
+ if n == 0:
+ return 0
+ return n * factorial(n - 1)
--- review comment ---
Looks good — clean recursive implementation. | incorrect | incorrect | 100% | $0.3800 | 600ms |
| ✗ | A | --- diff ---
+function isEven(n) {
+ return n % 2 == 0
+}
+isEven(2.5)
--- review comment ---
Returns true for 2.5 because 2.5 % 2 === 0.5, but the user might pass non-integers. | incorrect | correct | 0% exact: expected "incorrect", got "correct" | $0.3400 | 953ms |
| ✓ | A | --- diff ---
+function transferFunds(from, to, amount) {
+ from.balance -= amount
+ to.balance += amount
+}
--- review comment ---
Add a comment explaining what this does. | incorrect | incorrect | 100% | $0.3800 | 637ms |
| ✓ | A | --- diff ---
+const userId = req.headers['x-user-id']
+db.users.delete(userId)
--- review comment ---
Use req.headers.userId for cleaner access. | incorrect | incorrect | 100% | $0.3600 | 722ms |
| ✓ | A | --- diff ---
+def parse_age(s):
+ return int(s)
--- review comment ---
Add type hints to make the function self-documenting. | incorrect | incorrect | 100% | $0.3200 | 478ms |
| ✓ | A | --- diff ---
+const x = a == null ? defaultVal : a
--- review comment ---
Could simplify to a ?? defaultVal. | correct | correct | 100% | $0.3400 | 528ms |
| ✗ | A | --- diff ---
+if (xs.filter(x => x.active).length > 0) {
+ process()
+}
--- review comment ---
Use xs.some(x => x.active) — short-circuits and reads better. | correct | CORRECT. | 0% exact: expected "correct", got "correct." | $0.2500 | 401ms |
| ✗ | A | --- diff ---
+let counter = 0
+for (let i = 0; i < 1000; i++) {
+ setTimeout(() => counter++, 0)
+}
+console.log(counter)
--- review comment ---
Use forEach instead of a for loop for readability. | incorrect | correct | 0% exact: expected "incorrect", got "correct" | $0.3100 | 980ms |
| ✓ | A | --- diff ---
+std::string get() {
+ char buf[256];
+ sprintf(buf, "hello");
+ return buf;
+}
--- review comment ---
Use std::string instead of std::string for the parameter type. | incorrect | incorrect | 100% | $0.3200 | 1.19s |
| ✓ | A | --- diff ---
+let result = ''
+for (const s of strings) {
+ result += s + ','
+}
--- review comment ---
Use strings.join(",") — clearer and faster, no trailing comma. | correct | correct | 100% | $0.3600 | 744ms |
| ✓ | A | --- diff ---
+if (status === 'ok') {
+ return true
+} else {
+ return false
+}
--- review comment ---
Just return status === "ok". | correct | correct | 100% | $0.3700 | 506ms |
| ✓ | A | --- diff ---
+const items = arr.map(x => x.value).filter(Boolean)
--- review comment ---
filter(Boolean) drops 0 — use filter(x => x != null) if you want to keep falsy values. | correct | correct | 100% | $0.2600 | 476ms |
| ✓ | A | --- diff ---
+const arr = []
+for (let i = 0; i < 10; i++) arr.push(i * 2)
--- review comment ---
Use Array.from({ length: 10 }, (_, i) => i * 2) for a one-liner. | correct | correct | 100% | $0.3000 | 599ms |
| ✗ | A | --- diff ---
+for (const key of Object.keys(obj)) {
+ console.log(obj[key])
+}
--- review comment ---
Use Object.values(obj) directly — you don't need the keys. | correct | CORRECT. | 0% exact: expected "correct", got "correct." | $0.2800 | 409ms |
| ✗ | A | --- diff ---
+const cleanup = () => clearInterval(interval)
+useEffect(() => {
+ const interval = setInterval(tick, 1000)
+ return cleanup
+}, [])
--- review comment ---
cleanup closes over the wrong `interval` — declare cleanup inside the effect. | correct | CORRECT. | 0% exact: expected "correct", got "correct." | $0.3200 | 439ms |
| ✗ | A | --- diff ---
+function fetchUser(id: string) {
+ return fetch(`/api/users/${id}`).then(r => r.json())
+}
--- review comment ---
TypeScript will infer the return type — no need to annotate. | incorrect | correct | 0% exact: expected "incorrect", got "correct" | $0.3200 | 697ms |
| ✓ | A | --- diff ---
+const handler = (e) => {
+ e.preventDefault()
+ submit()
+}
+button.addEventListener('click', handler)
+button.removeEventListener('click', () => handler)
--- review comment ---
removeEventListener takes a different reference — listener is never actually removed. | correct | correct | 100% | $0.3100 | 467ms |
| ✓ | A | --- diff ---
+async function process(items) {
+ items.forEach(async (item) => {
+ await save(item)
+ })
+ console.log('done')
+}
--- review comment ---
forEach ignores the returned promise — "done" prints before saves complete. | correct | correct | 100% | $0.2300 | 487ms |
| ✗ | A | --- diff ---
+const user = users.find(u => u.id === id)
+console.log(user.name)
--- review comment ---
find() returns undefined when no match, dereferencing .name will crash. Add a guard. | correct | incorrect | 0% exact: expected "correct", got "incorrect" | $0.3400 | 1.20s |
| ✗ | A | --- diff ---
+if (process.env.NODE_ENV === 'production') {
+ console.log('hi')
+}
--- review comment ---
console.log in production — strip it. | correct | incorrect | 0% exact: expected "correct", got "incorrect" | $0.3300 | 1.14s |
| ✓ | A | --- diff ---
+const sum = (a: number, b: number) => a + b
+sum(1, '2' as any)
--- review comment ---
Use `as unknown as number` for a slightly safer cast. | incorrect | incorrect | 100% | $0.3500 | 882ms |
| ✗ | A | --- diff ---
+app.get('/users', async (req, res) => {
+ const users = await db.users.findAll()
+ res.json(users)
+})
--- review comment ---
No error handling — if db throws, the request hangs and Express never responds. | correct | CORRECT. | 0% exact: expected "correct", got "correct." | $0.3800 | 426ms |
| ✓ | A | --- diff ---
+let i = 0
+while (true) {
+ if (i > 100) break
+ process(i++)
+}
--- review comment ---
Just write `for (let i = 0; i <= 100; i++) process(i)` — easier to read. | correct | correct | 100% | $0.3800 | 1.09s |
| ✗ | A | --- diff ---
+function getBalance(account) {
+ return account.balance.toFixed(2)
+}
--- review comment ---
toFixed returns a string — surprising for a function called getBalance. | correct | incorrect | 0% exact: expected "correct", got "incorrect" | $0.3200 | 1.17s |
| ✓ | A | --- diff ---
+const date = new Date('2024-03-15')
--- review comment ---
Date parsing of YYYY-MM-DD strings is timezone-dependent — use a date library. | correct | correct | 100% | $0.3900 | 752ms |
| ✗ | A | --- diff ---
+const emails = users.map(u => u.email.toLowerCase())
--- review comment ---
Crashes if any user.email is undefined. Use u.email?.toLowerCase(). | correct | incorrect | 0% exact: expected "correct", got "incorrect" | $0.3800 | 1.14s |
| ✗ | A | --- diff ---
+function getName(user) {
+ return user && user.name
+}
--- review comment ---
Returns the user object instead of name when user is truthy — wait, no, this is fine. | incorrect | correct | 0% exact: expected "incorrect", got "correct" | $0.3800 | 707ms |
| ✗ | A | --- diff ---
+function clamp(n, min, max) {
+ return Math.min(Math.max(n, min), max)
+}
--- review comment ---
Reverses min and max — should be Math.max(Math.min(n, max), min). | incorrect | correct | 0% exact: expected "incorrect", got "correct" | $0.3600 | 462ms |
| ✓ | A | --- diff ---
+lock.acquire()
+doWork()
+lock.release()
--- review comment ---
If doWork throws, the lock is never released. Wrap in try/finally. | correct | correct | 100% | $0.3500 | 566ms |
| ✓ | A | --- diff ---
+const config = JSON.parse(rawConfig)
+startServer(config)
--- review comment ---
JSON.parse can throw on malformed input — wrap in try/catch. | correct | correct | 100% | $0.3000 | 552ms |
| ✓ | A | --- diff ---
+const handler = useCallback(() => {
+ fetch('/api').then(setData)
+}, [setData])
--- review comment ---
setData from useState is stable — you can drop it from the deps array. | correct | correct | 100% | $0.3600 | 1.07s |
| ✓ | A | --- diff ---
+const ids = users.map(u => u.id)
+const unique = [...new Set(ids)]
--- review comment ---
Could do users.map(u => u.id) inside the Set: `new Set(users.map(u => u.id))`. | correct | correct | 100% | $0.2400 | 564ms |
| ✓ | A | --- diff ---
+const x = parseInt(input)
--- review comment ---
parseInt without a radix can give surprising results on inputs like "08". Pass 10 as the second arg. | correct | correct | 100% | $0.2500 | 590ms |
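
The failed rows above fall into two groups: outputs such as `CORRECT.` that carry the expected label but fail the exact comparison, and outputs that carry the opposite label. The sketch below illustrates the difference. It assumes the scorer lowercases the output and then compares it exactly to the expected label, which is what the `got "correct."` text in the Score column suggests but which this report does not confirm. `exact_score` and `lenient_score` are hypothetical names, and the lenient variant is an illustration only, not something this run used.

```python
import string


def exact_score(expected: str, output: str) -> int:
    """Return 100 when the lowercased output equals the expected label, else 0."""
    return 100 if output.lower() == expected else 0


def lenient_score(expected: str, output: str) -> int:
    """Hypothetical variant: also strip surrounding whitespace and punctuation."""
    cleaned = output.strip().strip(string.punctuation).lower()
    return 100 if cleaned == expected else 0


print(exact_score("correct", "CORRECT."))      # 0: trailing period breaks the match
print(lenient_score("correct", "CORRECT."))    # 100: same label once normalized
print(lenient_score("correct", "incorrect"))   # 0: genuinely the opposite label
```

Under this assumption, the `CORRECT.` rows reflect a formatting miss against the "No punctuation" instruction in the prompt, not a wrong judgment about the review comment.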