1. Library
  2. Computer Use: AI’s Most Enterprise-Ready Use Case?

Computer Use: AI’s Most Enterprise-Ready Use Case?

12 mins
Light Mode
  • Heavybit Photo
    HeavybitHeavybit
How Do You Get Change-Resistant Enterprises to Adopt AI?
Addressing Enterprise Inertia and Resistance to Change
AI Features Enterprises Care About: Speed vs. Accuracy
Enterprise Challenges: Speed and Decision Tree Errors
Hard Lessons: Enterprise Users Expect Their Own Flow
What’s Missing in Computer Use: Security and Safety
Computer Use as an Enterprise Modernization Opportunity

How Do You Get Change-Resistant Enterprises to Adopt AI?

For some time, enterprises have resisted change, with internal departments using a specific software platform as their center of gravity: The ERP for finance, the CRM for sales, and so on. Each with painstakingly built custom configurations (with new team members not allowed to touch anything!). So what happens in the age of AI when everything speeds up and the pressure is on to modernize, even at risk-averse enterprise orgs?

Founder Prateek Jannu’s project Coasty focuses on what he considers to be AI’s most immediately valuable use case for enterprise: Computer use, which defers actual point-and-click functionality on desktops and browsers to autonomous AI. Says the founder, computer use lets enterprises gain immediate value from the speed and versatility of AI, while not disrupting existing enterprise tech stacks (especially the customized Netsuite instance that took the finance team years to perfect).

Addressing Enterprise Inertia and Resistance to Change

Jannu notes that the Coasty project came from observing how, while many academics have been excited about the potential of AI, enterprise orgs seemed to recoil in horror. “A lot of companies, even among the Fortune 500, did not want to build MCP servers, or anything they thought could destroy their systems or accidentally leak sensitive information.”

From there, the Coasty project launched as a computer use agent with high performance scores 82% on the agentic OS World benchmark when tasked with looking at users’ screens and figuring out how to interact as a keyboard and mouse would. “A score is just a score and doesn’t necessarily imply anything with regard to long-term tasks,” the founder admits, but it may be an indicator of potential value for a variety of use cases for form fills, accounting, hotel management, and others.

The founder notes that building AI computer use has taken a different path than previous-generation workflow automation and screen-scraping tools due to changing interfaces and other external factors. “We use five different systems with a variety of models, which is important for enterprise systems because UIs also change.”

“If an accounting team is using a legacy system that gets some kind of software update, or alternatively, if they’re filing state taxes but next year, there’s a new option, some new provision for an updated state tax rate which a simplistic workflow automation tool might not pick up. We’ve built our models to look for and pick up such changes and understand variances between steps that previous-gen solutions might get stuck on.”

AI Features Enterprises Care About: Speed vs. Accuracy

“When we started out, we noticed other teams in the computer use space seemed to be focusing on speed, perhaps because teams have looked at speed as a factor when they think about process automation. ‘If it’s faster, it must be a better option,’ was the thinking, and why would anyone pay for something that’s slightly slower?”

“This was a hard decision for us: Can we compromise on speed right now but guarantee that we can be more accurate than any system out there? That's a major trade off.” Jannu notes that while having low latency definitely demos well, but a computer use tool that’s fast but inaccurate is virtually pointless.

As his team reviewed their options, they realized that in the AI age, as code becomes faster and cheaper to produce, software will eventually change daily, which will mean that UIs and APIs will too. “If you focus on speed, you’re going to break at some point.” Meanwhile, enterprises run on decades-old software that can have fragile configurations or integrations, causing them to be extremely hesitant to change vendors with any frequency.

“Once your AI is logged in to their software, [enterprise teams] don’t want to make significant changes. If you did go in and make changes to their tools, they wouldn’t renew. They wouldn’t even adopt. So we decided to focus on accuracy instead of speed for enterprise personas.”

Enterprise Challenges: Speed and Decision Tree Errors

Moreso than building for any specific interface (desktop, browser, or terminal, all of which Coasty supports), Jannu suggests that significant challenges include decision tree errors and time thresholds. “Anytime you give a task, you expect it to take a certain amount of time. Small tasks are relatively fast, but as users build confidence, they start testing boundaries and make bigger asks.”

“They start requesting more long-horizon tasks trying to test how well Coasty can do, tasks that may take closer to 10-20 minutes. Sometimes, tasks can go off trajectory. If I request the creation of a certain kind of database, and it chooses the wrong table in step two, it's going to be completely wrong for the entire trajectory. So it’s really about how to get those first five to 10 steps right. And even if the system gets something wrong, how can we get back on track?”

“One thing we focus on with this project is self-correction: If there’s a mistake, Coasty can go back and rectify that. One thing that was really hard to pin down was controlling and auditing which step the system gets wrong.” The founder explains how the process involved building dozens of testing layers to detect errors and revert to previous steps, a process that remains complicated.”

Hard Lessons: Enterprise Users Expect Their Own Flow

“We also learned that enterprise users want an easy way to provide logins for their agents, but the challenge is doing that safely. We added what we call agent credentials which we store in an encrypted format, but the agent never sees the credentials, and will prompt users to enter them as needed.” The founder suggests he expected this behavior: Of users to instruct agents to take action until they ran into a roadblock and needed something, like credentials.

“We realized that people were prompting their agents more like they would a human, because that's how these users were expecting it to work. For anyone trying to implement computer-use agents as well, I'd highly recommend trying to see what users expect from the system: Sometimes they may expect a full, end-to-end prompt, but sometimes they’ll say: ‘Hey, just come back to me and ask if you need credentials.’”

“This is something that we did not expect. And a lot of users got turned off because of that. We had to learn the hard way. And we had to talk to a lot of users to find out about these issues, whether their issues were related to costs or functionality, before we realized it was something as simple as users expecting agents to come back and ask them for credentials.”

What’s Missing in Computer Use: Security and Safety

The founder admits that the ultra-viral OpenClaw project kickstarted adoption of agents for personal use, but suggests that agentic has seen a noticeable delay on security to accompany all the exciting advancements. “If you look at how software had a boom in the early 2000s: A lot of compliance companies came up with their own standards.”

“But at the end of the day, the larger companies, including the Fortune 100, came together with the goal of figuring out how to identify whether software is ‘safe’ enough for use by big companies or clients. And they came up with SOC 2 and similar methodologies.”

“I definitely think there's going to be at least some kind of standard in the future, a security framework specifically for agents that could also cover computer use or general agentic frameworks. It might involve stress testing agents with jailbreaking problems, putting them through some kind of verification system to ensure it isn’t vulnerable to prompt injections or other kinds of attack.”

The founder suggests that an industry standard might emerge once enterprise adoption of agents hits critical mass, which doesn’t seem to have happened yet. “A lot of these companies are adopting [agents] to a very small degree. Maybe it's one small team using it for a certain reason. But if you want the whole organization to use a certain kind of AI, the company needs to make sure that if something does go wrong, someone is responsible.”

“There's going to be a sort of framework or a standard set for it, ideally an open standard so people can go audit it. For our project, we maintain a really slim internal security standard so if something does happen on our end, we’re ready for it.” The founder suggests that every agentic software project should stand up their own security standard along with a battery of internal tests.

“If you’re building agentic, try to test it. Break it. There are a lot of good open-source repositories with guides on how to do prompt injections or jailbreak certain system prompts.Try them on your own system. Check how it works. Is it failing? How? If you're trying to take your software to a production-grade level, try to break it as much as you can beforehand.”

Computer Use as an Enterprise Modernization Opportunity

“I think there are a lot of enterprises that depend on legacy systems that do not have APIs or MCPs built. Maybe 95% of them do not. I believe [computer use] is the best way to get started with AI [for enterprise customers] because a lot of companies are under pressure right now to ‘implement’ AI.”

The founder notes that while some early adopters are diving into building MCP servers, most companies are still pondering implementation and potential security issues. “[Enterprises] have a lot of these questions. What's the best way to go about [implementing AI]? Try to build computer use. At the end of the day, your software does have a login screen of some kind. People use it with a keyboard and mouse. And if humans can use these things, computer use AIs do the same.”

The founder suggests that as models only continue to improve, computer use will as well. “If today, we can perform a one-hour task reliably, maybe within a year, we can perform a three- to four-hour task reliably. Lots of companies have processes that take hours. With successful computer use, you can see efficiency improvements of 2x or 3x, even as you run agents as background tasks.”

“A lot of business units rely on tools that don’t have MCPs or AI-capable APIs. For example, I know of several accounting tools that don’t have MCP support, and won’t for some time? Maybe upwards of five to ten years? But whatever you do with a computer, with a keyboard and mouse, can be done right now with computer use.”