lab / Octopus Arena

Eleven local models. Same game prompt.

Each lane built a vanilla JavaScript space shooter. Play the artifacts and compare code review with public ratings.

11 artifacts · same prompt · code + player boards
Gameplay contact sheet from Octopus Arena models

Same prompt across every lane.

Code review and player ratings stay separate.

Generated games run as sandboxed static artifacts.

0 public ratings captured

#1 · 88/100

Qwen3.6 27B preserve-thinking

Strongest code path: kill-gated progression, stateful waves, boss/unleash wiring, mostly dt-aware. Weak spots: no hit invulnerability and some hardcoded bullet damage.

code: 88 · players: no votes yet

Likely the best balanced game if players value progression that actually works.
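Kill-gated progression, named above as this lane's strength, could look roughly like the sketch below. The generated source is not shown on this page, so every name and number here (`createWaveState`, `registerKill`, the kill quota) is a hypothetical stand-in, not the model's actual code.

```javascript
// Hypothetical wave state; names and the default quota are illustrative only.
function createWaveState(killsPerWave = 10) {
  return { wave: 1, killsThisWave: 0, killsPerWave };
}

// Call once per destroyed enemy. The wave advances only when the kill quota
// is met, so progression cannot run ahead of what the player has cleared.
function registerKill(state) {
  state.killsThisWave += 1;
  if (state.killsThisWave >= state.killsPerWave) {
    state.wave += 1;
    state.killsThisWave = 0;
    return true; // wave advanced
  }
  return false; // still working on the current wave
}
```

Gating on kills rather than on elapsed time is one way to keep boss and tier triggers tied to real player progress.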

#2 · 86/100

Qwen3.5 122B-A10B no-think

Best broad systems coverage: collisions, enemy damage, ink, boss phases, powerups, unleash. Weak spots: frame-based timing and harsh contact damage.

code: 86 · players: no votes yet

May lose some player votes if the early moment feels sparse despite stronger source coverage.
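The frame-based timing flagged above differs from dt-aware timing roughly as sketched here. This is a minimal illustration under assumed names (`stepFrameBased`, `stepWithDt`, `computeDt`) and an assumed clamp value; it is not taken from either generated game.

```javascript
// Frame-based: a fixed step per call, so on-screen speed depends on frame rate.
function stepFrameBased(entity, pixelsPerFrame) {
  entity.y += pixelsPerFrame;
}

// dt-aware: speed is expressed per second and scaled by the elapsed time.
function stepWithDt(entity, pixelsPerSecond, dtSeconds) {
  entity.y += pixelsPerSecond * dtSeconds;
}

// Turn two millisecond timestamps (for example, the ones requestAnimationFrame
// passes to its callback) into seconds. The clamp is an assumption that keeps
// a long pause from producing one huge step.
function computeDt(nowMs, lastMs, maxDt = 0.05) {
  return Math.min((nowMs - lastMs) / 1000, maxDt);
}
```

With `stepWithDt`, an enemy covers about the same distance in one second at 30 or 60 frames per second; with `stepFrameBased`, the 60 fps player sees it move twice as far.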

#3 · 84/100

Qwen3.6 27B thinking

Complete active shooter loop: bullets, enemies, invulnerability, boss and unleash concepts. Progression wiring is less clean than in the preserve-thinking lane.

code: 84 · players: no votes yet

Good candidate for people who prefer classic active shooter feel over source neatness.

#4 · 78/100

Qwen3.6 35B-A3B no-think

Strong 35B gameplay surface: enemies, bullets, score, combo, level. Weak spots: mostly frame-based timing and a risk of boss/tier repeats.

code: 78 · players: no votes yet

Best 35B lane for visible combat and immediate arcade signal.

#5 · 76/100

Qwen3.6 27B no-think

Functional bounded worker lane: bullets, enemies and score are active. Weak spot: damage/invulnerability can drain health too fast.

code: 76 · players: no votes yet

Could score well with players if the fast damage bug does not dominate the session.
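A common guard against the fast health drain described above is a short invulnerability window after each hit. The sketch below assumes a hypothetical player object and a one-second window; it shows the general technique, not this lane's code or a confirmed fix for its bug.

```javascript
// Hypothetical player state; field names and defaults are assumptions.
function createPlayer(health = 100) {
  return { health, invulnerableFor: 0 };
}

// Count the invulnerability timer down each update (dt in seconds).
function tickPlayer(player, dt) {
  player.invulnerableFor = Math.max(0, player.invulnerableFor - dt);
}

// Apply damage only when the window has expired, then reopen the window.
// Without the early return, a touching enemy would deal damage every frame.
function applyHit(player, damage, invulnSeconds = 1.0) {
  if (player.invulnerableFor > 0) return false;
  player.health = Math.max(0, player.health - damage);
  player.invulnerableFor = invulnSeconds;
  return true;
}
```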

#6 · 74/100

Qwen3.5 122B-A10B thinking

Excellent play-feel pieces: ship progression, bullet growth, enemy ramp, homing-missile unleash concept. Weak spot: boss and health pressure are underwired.

code: 74 · players: no votes yet

Prime candidate to beat its code score because the progression fantasy feels strong.

#7 · 68/100

Gemma 4 31B thinking

Playable; scoring and kills register in the smoke test. Weak spots: undefined player-radius path, enemy bullets not drawn, messy constants/timing.

code: 68 · players: no votes yet

Playable enough to surprise people, but brittle under real pressure.

#8 · 64/100

Qwen3.6 35B-A3B thinking

Mixed but not dead: some high-score behavior exists, but source synchronization/progression risks show up quickly.

code: 64 · players: no votes yet

Probably middle-pack unless the high-score path appears quickly for players.

#9 · 61/100

Qwen3.6 27B shortthink-v2

Pretty and efficient at first glance, but boss path and tier updates are not wired into normal progression.

code: 61 · players: no votes yet

Screenshots will flatter it; longer play should expose the ceiling.

#10 · 35/100

Qwen3.6 35B-A3B preserve-thinking

Visual shell, broken core loop: undefined ship size creates NaN bullets; stale enemy export breaks collision/progression.

code: 35 · players: no votes yet

Looks alive but should fall hard once players try to shoot and score.
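The NaN-bullet failure noted above follows from how JavaScript arithmetic treats `undefined`: `undefined / 2` is `NaN`, and any position computed from it is `NaN` too. The sketch below uses a hypothetical ship shape, a hypothetical `spawnBullet` helper and an assumed fallback size to show one defensive pattern; it is not the lane's actual code.

```javascript
// Hypothetical fallback used when a ship has no usable size (an assumption).
const DEFAULT_SHIP_SIZE = 24;

// Spawn a bullet centered on the ship. If ship.size is undefined, size / 2
// would be NaN and the bullet could never be drawn or collide, so a finite
// fallback is substituted first.
function spawnBullet(ship) {
  const size = Number.isFinite(ship.size) ? ship.size : DEFAULT_SHIP_SIZE;
  return { x: ship.x + size / 2, y: ship.y, vy: -400 };
}

// Cheap sanity check that can filter out poisoned bullets before update/draw.
function isValidBullet(bullet) {
  return Number.isFinite(bullet.x) && Number.isFinite(bullet.y);
}
```

Comparisons against `NaN` are always false, which is why a bullet at `NaN` coordinates silently misses every collision test instead of throwing.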

#11 · 30/100

Gemma 4 31B no-think

Good-looking formations, but shooting/collision internals break: nested bullet arrays, bad split reference, unused unleash path.

code: 30 · players: no votes yet

Should rank near the bottom once players hit the broken loop.
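Nested bullet arrays like the ones mentioned above typically arise when a multi-bullet shot is pushed as a single element. The page does not show the generated source, so this is only a guess at the pattern, with hypothetical helper names (`spreadShot`, `addShotNested`, `addShotFlat`).

```javascript
// Hypothetical three-way shot; field names and speeds are assumptions.
function spreadShot(x, y) {
  return [-1, 0, 1].map((dir) => ({ x, y, vx: dir * 60, vy: -400 }));
}

// Buggy pattern: pushes the whole array as ONE element, so bullets[i].x is
// undefined and update/collision code that expects objects breaks.
function addShotNested(bullets, shot) {
  bullets.push(shot);
  return bullets;
}

// Flat pattern: spread the shot so every element is a bullet object.
function addShotFlat(bullets, shot) {
  bullets.push(...shot);
  return bullets;
}
```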
