Claude Computer Use
Anthropic's Claude Computer Use for macOS. How to install Claude Computer Use as a GUI agent for AI-assisted robotic process automation.
Claude Computer Use is the first widely available AI agent that can operate a computer like a human. Claude Computer Use is an API to build applications that interact with a computer desktop GUI, performing tasks, and automating workflows—almost anything a human virtual assistant might do. It was released by Anthropic on October 22, 2024.
Claude Computer Use is a beta API (application programming interface) for testing and feedback. As an API, Claude Computer Use is not an application that you install on your Mac. However, you can install various applications that build on Claude Computer Use. More are in development.
Other applications you may want
If you're looking into Claude Computer Use, now's a good time to add Warp Terminal and Zed Editor. Warp Terminal is an AI-assisted console for tools that give you unrestricted access to the OpenAI API and other LLMs. Zed Editor is the best AI-assisted editor for text and code. Both are free; you can get Warp Terminal here and Zed Editor here.
How it works
Developers write software applications that send screenshots and your instructions to the Claude Computer Use server. The server interprets the instructions and screenshots, develops a plan of action, and sends coordinates and actions back to the application for mouse movements, clicks, and keystrokes. After each sequence of actions, Claude Computer Use evaluates a new screenshot and decides what to do next.
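The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not Anthropic's implementation: the `Action` type and the `take_screenshot`, `ask_model`, and `perform` callables are hypothetical stand-ins for the screen capture, the API call, and the mouse and keyboard control that a real application would supply. The `scripted_model` function replays a fixed script so the example runs without a network connection.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One step requested by the model (hypothetical shape, for illustration)."""
    kind: str        # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


def run_agent_loop(
    instruction: str,
    take_screenshot: Callable[[], bytes],
    ask_model: Callable[[str, bytes, List[Action]], Action],
    perform: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Screenshot, ask the model, act, and repeat until done or max_steps is hit."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = take_screenshot()
        action = ask_model(instruction, screenshot, history)
        if action.kind == "done":
            break
        perform(action)
        history.append(action)
    return history


def scripted_model(instruction: str, screenshot: bytes, history: List[Action]) -> Action:
    """Stand-in for the API call: replays a fixed script instead of calling a server."""
    script = [
        Action("click", x=640, y=120),          # invented coordinates
        Action("type", text="climate change"),
        Action("done"),
    ]
    return script[len(history)]


if __name__ == "__main__":
    performed: List[Action] = []
    run_agent_loop(
        "Search the New York Times for 'climate change'",
        take_screenshot=lambda: b"fake-png-bytes",
        ask_model=scripted_model,
        perform=performed.append,   # a real app would move the mouse here
    )
    for action in performed:
        print(action)
```

The `max_steps` cap is a design choice worth keeping in a real application: it stops a confused agent from looping indefinitely.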
Claude reduces screenshots and instructions to "tokens." Anthropic charges your account a small amount for each token consumed by Claude Computer Use. Screenshots consume a large number of tokens, and Claude Computer Use automatically reprompts itself in a loop until it gets good results, using more tokens each time, so it can be expensive.
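To see why the loop gets expensive, here is a rough back-of-the-envelope estimate. Every number in it is a placeholder chosen for easy arithmetic, not Anthropic's actual pricing or token counts. The sketch also assumes that each request resends the instruction and every screenshot taken so far, and it ignores the cost of resending earlier replies. The function name and parameters are hypothetical.

```python
def estimate_cost_usd(
    steps: int,
    screenshot_tokens: int,
    instruction_tokens: int,
    output_tokens_per_step: int,
    input_usd_per_million: float,
    output_usd_per_million: float,
) -> float:
    """Estimate the cost of one task, assuming history is resent on every step."""
    input_tokens = 0
    for step in range(1, steps + 1):
        # Assumption: request number `step` carries the instruction plus
        # all `step` screenshots captured so far.
        input_tokens += instruction_tokens + step * screenshot_tokens
    output_tokens = steps * output_tokens_per_step
    return (
        input_tokens * input_usd_per_million
        + output_tokens * output_usd_per_million
    ) / 1_000_000


if __name__ == "__main__":
    # Placeholder values only; check Anthropic's pricing page for real figures.
    cost = estimate_cost_usd(
        steps=3,
        screenshot_tokens=1000,
        instruction_tokens=100,
        output_tokens_per_step=200,
        input_usd_per_million=3.0,
        output_usd_per_million=15.0,
    )
    print(f"{cost:.4f}")  # prints 0.0279
```

Under these assumptions the input side grows faster than the number of steps, because later requests carry more screenshots than earlier ones.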
"Be warned this thing burns through tokens extremely fast" -- Fireship
You can use a chat interface to type instructions in applications built with Claude Computer Use. For example, "Go to the New York Times website and search for 'climate change'." Claude will direct the application to take a screenshot, launch a web browser, and navigate to the website. The application will then take another screenshot. The server will analyze the screenshot and send instructions to the application to move the mouse, click on the search box, and type "climate change." Finally, the application will take another screenshot so Claude can check whether the search results are displayed as expected. Claude's intelligence lies in its ability to follow instructions, interpret the screenshots, and decide what to do next.
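The search example can be pictured as a short series of requests that the application receives and carries out. The sketch below turns each request into a log line instead of touching the mouse. The request shapes (`action`, `coordinate`, `text`) and the action names are my understanding of the October 2024 beta and should be treated as assumptions to check against Anthropic's documentation. The coordinates are invented and `describe_action` is a hypothetical helper.

```python
from typing import Any, Dict, List


def describe_action(tool_input: Dict[str, Any]) -> str:
    """Turn one request from the model into a human-readable log line."""
    action = tool_input.get("action")
    if action == "screenshot":
        return "take a screenshot"
    if action == "mouse_move":
        x, y = tool_input["coordinate"]
        return f"move the mouse to ({x}, {y})"
    if action == "left_click":
        return "click the left mouse button"
    if action == "type":
        return f"type {tool_input['text']!r}"
    return f"unsupported action: {action!r}"


# A made-up sequence mirroring the search example; coordinates are invented.
EXAMPLE_REQUESTS: List[Dict[str, Any]] = [
    {"action": "screenshot"},
    {"action": "mouse_move", "coordinate": [812, 96]},
    {"action": "left_click"},
    {"action": "type", "text": "climate change"},
    {"action": "screenshot"},
]

if __name__ == "__main__":
    for request in EXAMPLE_REQUESTS:
        print(describe_action(request))
```

A real application would replace the returned strings with calls to a GUI automation library, but logging each request first is a cheap way to see what the agent intends to do.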
What to expect from Claude Computer Use
In some ways, the introduction of Claude Computer Use in October 2024 is as significant as the release of OpenAI's ChatGPT in November 2022. ChatGPT showed us that an AI can respond to questions with human-like answers. Claude Computer Use shows us that an AI can drive a computer's desktop like a human.
Though Claude Computer Use is innovative, it has notable drawbacks.
On the plus side, Claude Computer Use can take control of a computer’s mouse and keyboard and perform various tasks, such as browsing web pages, entering data, saving files, and submitting forms.
On the downside, Claude Computer Use is expensive and slower than a human. Unless you have a specific need for an AI to automate repetitive tasks on your computer, you may not find it worth the cost or effort. Claude Computer Use often makes errors, such as failing to find a button on a web page. You can intervene to provide guidance, but this can be frustrating and time-consuming.
Notwithstanding its limitations, Claude Computer Use is the first GUI agent that is widely available.
What people are saying about Claude Computer Use
"Claude actually looked at my screen. Moved the mouse by itself. Clicked buttons like a human. Created reports automatically. It's like having a virtual assistant that can really use your computer!" -- Hacker News
"People will use this at work to pretend that they are doing work while they listen to a podcast." -- Hacker News
"Perhaps the most dangerous AI feature ever handed over to the public... what could possibly go wrong?" -- Fireship
Obtaining Anthropic API keys
You'll need an Anthropic account and a Claude API key to use the available applications. The API key is a secret code that identifies you to the Claude Computer Use server. It's the same Claude API key used by other applications, such as the Zed editor.
See the instructions to get your API key:
You can get an API key for free, but you'll need to add $5 to your account with a credit card to get enough credits to do anything useful.
With the Claude API key, you can begin developing applications that use Claude Computer Use or you can try some of the applications listed below.
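Because the API key is a secret, applications usually read it from an environment variable instead of hard-coding it. The sketch below shows one way to do that. `ANTHROPIC_API_KEY` is the variable name that Anthropic's Python SDK looks for by default; the `load_api_key` and `redact` helpers are hypothetical names used only for this example.

```python
import os
from typing import Mapping


def load_api_key(
    env: Mapping[str, str] = os.environ,
    var_name: str = "ANTHROPIC_API_KEY",
) -> str:
    """Read the API key from the environment and fail early if it is missing."""
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set {var_name} before running the application.")
    return key


def redact(key: str) -> str:
    """Return a shortened form of the key that is safe to print in logs."""
    return key[:6] + "..." if len(key) > 6 else "..."


if __name__ == "__main__":
    try:
        print("Using key", redact(load_api_key()))
    except RuntimeError as error:
        print(error)
```

Failing early with a clear message is friendlier than letting the first API request fail with an authentication error.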
AI safety
Applications that build on Claude Computer Use are powerful and risky if used carelessly. Assume that Claude Computer Use can do anything on your computer, including:
- deleting files
- looking up passwords
- submitting web forms
- sending emails
- posting to social media
You can easily imagine that a malicious application built with Claude Computer Use could open a password manager, log into your bank account, and transfer money to someone else's account. Anthropic claims that guardrails are in place to prevent this kind of abuse, but the API is still in beta, and guardrails are imperfect. As a cautionary example, watch a YouTube video where a Norwegian engineer tricks Claude Computer Use into posting to his Reddit account.
Anthropic warns that Claude Computer Use can be tricked into downloading and installing malware that gives unauthorized access to your computer. Within 2 days of the Claude Computer Use announcement, security researcher Johann Rehberger demonstrated how to trick Claude Computer Use into downloading and installing a malicious application through a "prompt injection" attack.
If you download and install an application that uses Claude Computer Use, I advise you to either:
- Install the application on a spare computer that doesn't contain stored passwords or sensitive information.
- Install the application in a Docker container that is sandboxed from your primary computer.
Don't run a Claude Computer Use application while you're away from your computer. Stay at your computer and watch the application while it runs.
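If you are building your own application, one way to keep a human in the loop is to ask for confirmation before risky actions. The sketch below is a deliberately crude illustration, not a security control: it matches word stems drawn from the list of risks above against a text description of the pending action. The `RISKY_STEMS` tuple and both function names are hypothetical.

```python
from typing import Callable

# Word stems taken from the risks listed above: deleting files, looking up
# passwords, submitting forms, sending emails, posting to social media.
RISKY_STEMS = ("delet", "password", "submit", "send", "post")


def needs_confirmation(description: str) -> bool:
    """Return True if the action description mentions a risky word stem."""
    lowered = description.lower()
    return any(stem in lowered for stem in RISKY_STEMS)


def approve(description: str, confirm: Callable[[str], str] = input) -> bool:
    """Allow harmless actions; ask the user before anything that looks risky."""
    if not needs_confirmation(description):
        return True
    answer = confirm(f"Allow '{description}'? [y/N] ")
    return answer.strip().lower() == "y"


if __name__ == "__main__":
    for pending in ("move the mouse to (10, 20)", "submit the web form"):
        print(pending, "->", "allowed" if approve(pending) else "blocked")
```

Keyword matching will miss many dangerous actions and flag some harmless ones, so treat a gate like this as a supplement to a spare computer or a sandbox, not a replacement.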
Applications built on Claude Computer Use
Anthropic released a sample application at the time of the Claude Computer Use announcement. Another application, computer_use_ootb, was released a few weeks later by a research lab at the National University of Singapore. Judging by how quickly the first applications were built, we can expect many more applications in the coming months.
Anthropic sample application
Anthropic released a sample application that demonstrates the capabilities of Claude Computer Use. This powerful reference implementation runs in Docker, a virtual Linux environment that runs on your Mac. You can use the sample application to automate a wide range of tasks, such as retrieving web pages, entering data in spreadsheets, and much more.
Show Lab's computer_use_ootb
Students at the National University of Singapore built this application to accompany a paper, Case Study with Claude 3.5 Computer Use. Docker is not needed, but you need to install Python to run the application. Its distinguishing feature is that you can operate the test computer remotely using a web browser on a phone or another computer.
Grunty computer-agent
Developer Ishan Nagpal built this application "in a day." Docker is not needed, but you need to install Python to run the application.
Resources
Here are two "Awesome-style" collections of relevant papers and projects.
- Showlab's Awesome GUI Agent
- Ranpox's Awesome Computer Use Agents
What's next
My mac.install.guide is a trusted source of installation guides for professional developers. Take a look at the Mac Install Guide home page for tips and trends and see what to install next.