SamuZai
Emergent Garden


Building with Vision: Gemini 2.5, Claude 3.7 thinking, Deepseek V3, GPT 4.5

This is a little contest to see how well each of these state-of-the-art models can use their vision commands to build, and then see and evaluate what they've built. They don't do very well.

Comments

It only uses vision periodically, when it calls the !lookAtPlayer command, etc. It's not constantly seeing an image. And yeah, positioning is extremely hard. This video basically shows that vision is not very helpful. Will release the full vid with commentary on it.

Max Robinson

I only started playing a bit with vision, but watching the AI view on port 3000, it didn’t seem to care too much about its actual vision. The views were pretty bad. It was still relying on memory too maybe? It’s like the bots need some prompting to understand HOW to position themselves to get good views of different things maybe.

Chris

Will do more of that in the future

Max Robinson

As much as I love cool builds, it would be so cool to see you experiment with their social interactions, too!!

SunderingAlex

