
MLOps vs. Eng: Misaligned Incentives and Failure to Launch?

Andrew Park, Editorial Lead, Heavybit
37 min read

Failure to Launch: The Challenges of Getting ML Models into Prod

Machine learning is a subset of AI–the practice of using complex algorithms to model human learning and cognition. The ML we typically speak of today focuses on models trained via supervised learning (on labeled data sets with known inputs and outputs) and unsupervised learning (finding hidden patterns or insights in unlabeled data sets). After 20-30 years, you might expect everyone to have figured out how to get ML projects into production. Unfortunately, that isn’t the case. This article will cover:

  • The ongoing “failure to launch” problem with ML models in production
  • Misaligned incentives, skill sets, and cultural expectations between data science, engineering, and management teams
  • Perspectives from veterans of both the ML and software development disciplines
  • Specific recommendations for teams to build alignment around ML projects, not only to get them into production, but also to drive results

Hear more diverse perspectives on getting AI into production at the DevGuild AI Summit II event.

Why Do So Few ML Projects Make it to Production?

Studies suggest that as few as 10% to 20% of all machine learning projects ever make it to production. More disturbingly, additional research suggests that 91% of models degrade in performance over time–so the few models that do go live aren’t “set it and forget it” projects. Ongoing performance issues for ML in production mean ML operations teams need to perform triage regularly.
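
To make that triage concrete, here is a minimal sketch of the kind of recurring degradation check an ML operations team might schedule: it compares a classifier’s current AUC on fresh labeled data against the AUC recorded at launch. It assumes scikit-learn; the baseline value, threshold, and data-loading function are illustrative placeholders, not any particular product’s API.

```python
# A minimal, illustrative degradation check: compare the model's current AUC on
# fresh labeled data against the AUC recorded at launch. BASELINE_AUC,
# ALERT_THRESHOLD, and load_recent_labeled_data are placeholders.
from sklearn.metrics import roc_auc_score

BASELINE_AUC = 0.84      # metric recorded when the model first shipped (example value)
ALERT_THRESHOLD = 0.05   # how much decay to tolerate before triaging

def check_for_degradation(model, load_recent_labeled_data):
    X, y = load_recent_labeled_data()        # fresh, labeled production data
    current_auc = roc_auc_score(y, model.predict_proba(X)[:, 1])
    degraded = (BASELINE_AUC - current_auc) > ALERT_THRESHOLD
    if degraded:
        print(f"AUC fell from {BASELINE_AUC:.2f} to {current_auc:.2f}: time to triage")
    return current_auc, degraded
```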

There’s a potential challenge happening at the organizational level: data science teams and engineering teams may simply be too far out of sync with regard to their skill sets, incentives, day-to-day priorities, and culture. Data scientists, in the interest of training models to be as robust as possible, anticipate occasional failures as learning opportunities to better tune future performance. Engineering teams build for consistent performance, and may be gun-shy about investing many cycles into ML systems that fundamentally aren’t deterministic and can fail for completely unanticipated reasons, sometimes catastrophically. Meanwhile, management teams that have already invested heavily in expensive AI projects are feeling increasing pressure to deliver ROI.

If you find the state of affairs surprising, it’s not just you. We at Heavybit fully expected machine learning to be the next great driver of change in operations. But MLOps seems to have fallen short of expectations. Why? And what can we–as stakeholders, investors or founders–do to help data science and engineering teams actually get AI projects into production and reach their full potential? Below is a set of expert interviews on how to bridge the gap between data science and engineering–and how to attack MLOps’ failure-to-launch problem.

Approaching MLOps as a Learning Journey with Alexandra Johnson

Alexandra Johnson is founder and CEO of Rubber Ducky Labs, building operational analytics for recommender systems. She has held product leadership and engineering positions at startups for more than a decade, including four years at early MLOps startup SigOpt (acquired by Intel).

  • Resolving Misalignments: Data science and engineering teams need better infrastructure, better processes, and better mutual understanding of each other’s daily priorities–and they need to avoid the pitfalls of being shortsighted.
  • Starting from Business Goals and Value: Rather than focusing entirely on arcane data science or engineering KPIs, organizations may be better served by also including business value and successful implementation of business-focused use cases to benchmark the effectiveness of their ML program.
  • Managing Executive Expectations: Teams can collaborate better with C-suite executives by using universally recognized business metrics such as specific timelines and budgeting to frame the need for ongoing learning, adaptation, and even setting expectations for model decay (and the occasional failure).

Discussion: How MLOps, Dev, and Execs Map Models to Value

In addition to offering the above suggestions, Johnson reflects on how, with recent developments in the still-growing MLOps space, working with machine learning models may be going through its own DevOps-like growing pains–a situation where not every party is keyed into the best possible technology or why the tech would be worth the investment. “When we talk about the fundamental misalignment between what MLOps is trying to do versus what the business is trying to do–you can compare that to investments in infrastructure, which by now are basically second nature. Obviously, if there were a company right now building a static-hosted website, and it wanted to run it on its own servers, that’d be a very different conversation than it would’ve been 20 years ago. Today you would say, ‘Why aren't you using a cloud hosting service?’ We’ve already built up the muscle for making those investments in DevOps and infrastructure when it comes to general software engineering. But we’re just starting to learn how to build that muscle on the machine learning side.”

On the topic of reconciling different expectations with management teams, Johnson identifies a fundamental rift between business, dev, and ML teams, who, in many of today’s organizations, may not be speaking the same language. In a worst-case scenario, teams stop trusting each other and develop a shortsighted focus on immediately measurable results–when they should potentially be seeking a neutral third party that understands each side’s objectives. “I was just talking to someone who works for a very large company, at which their business stakeholders had basically lost trust in their technical team.”

“When that trust is gone, executives tend to start making very pointed requests to the technical team. They start asking for very short-term value. And then the technical team is in a panic trying to deliver exactly what they were asked for. And there isn't anyone in the organization who can unpack the situation and point out the consequences of optimizing for short-term value. Ideally, you’d be able to bring people into the organization who can understand both sides and who can explain to executives that while machine learning is very powerful, if you optimize for short-term value too much, you can get into trouble–and here are the ways that can happen. Avoid deep algorithmic discussion and put it in plain-and-simple business terms.”

ML Experts With No Business Background vs. Business Leads With No ML Background

Johnson suggests that the relative newness of AI/ML to business has led to a familiar disparity in standing and influence in the organization: relatively junior technical people trying to explain arcane technical complexities to tenured business executives who don’t share their domain expertise. “I think the industry is so new that you have people at the business layer who don't necessarily have an intuitive or deep enough understanding about how ML projects are executed to understand exactly what they're asking for. And you've got quite a few people in ML who haven't been in the industry as long, or who don't have the seniority or comparable experience of their counterparts in the business layer. I think we need to see more investment in bringing up some of the technical folks to the executive level; and it wouldn’t hurt to also take the ‘business people’ and give them an ML bootcamp. Not even about how the algorithms work, but about how ML projects are executed.”

Johnson also suggests that one aspect of the chasm between business goals and data science might be the ‘science’ part. “I was having a conversation with someone on this topic who kept interjecting the phrase, ‘as a scientist.’ This person would say, ‘Well, as a scientist, I see this.’ But science is not a discipline in which we always expect concrete results. Experimentation is not a discipline in which we always expect to succeed. So you have some periods of things like data collection, development, experimentation, and some periods of failure. And that can be really scary for a business to bring in something that seems ‘risky’ in this way, even in recommender systems. You're taking this area–where you used to have full control over the products that you're showing your users–and turning it over to a system that you don't understand. That system could potentially give you much better results, but it could also potentially give you much worse results.”

Contextualizing ML Projects in Terms of Business Timeframes and Metrics

“So, I think that one thing that ties the experimentation of ML back to business goals is timelines–which can be long. It’s important to understand, from a project management perspective, what's going on with the timeline of the project, and whether it’s actually on track, because these projects do also get off track.” Johnson points out a particularly noticeable similarity to the world of software development. “As a software developer, it's very hard to estimate how long it's going to take to do something. In ML, it's even harder. So from a project management perspective, folks need to get started by being able to ask, ‘Hey, is this project even on track for what we originally wanted to learn?’”

“And once we can verify that, yes, it is on track, and yes, we are launching things to production, it’s important to understand how to ask the right questions: Are we seeing the performance that we want to see? Are we learning what we wanted to learn? What are the risks of poor performance of our model? And then, once you can assess some of those things, you can, from an executive perspective, make the decisions you need to make. For example, when working on a recommendation engine project, if someone can very clearly say, ‘We are behind on this project because we have a risk of showing poor recommendations to users and this could damage our brand,’ then executives can make a better decision. That’s a much clearer insight than arcane metrics such as AUC being 80% or NDCG being something else.”
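
To illustrate the kind of “arcane metric” Johnson is describing, here is a small, hypothetical sketch that computes NDCG@5 for a batch of recommendation lists with scikit-learn and then restates the number in plainer terms; the relevance labels and model scores are made up purely for illustration.

```python
# A hypothetical example: compute NDCG@5 for two users' recommendation lists,
# then restate the score in plainer terms.
import numpy as np
from sklearn.metrics import ndcg_score

# One row per user: the true relevance of each candidate item, and the
# model's predicted scores for those same items.
true_relevance = np.array([[3, 2, 0, 1, 0],
                           [0, 1, 3, 0, 2]])
model_scores = np.array([[0.9, 0.7, 0.5, 0.4, 0.1],
                         [0.2, 0.3, 0.8, 0.1, 0.6]])

ndcg_at_5 = ndcg_score(true_relevance, model_scores, k=5)
print(f"NDCG@5 = {ndcg_at_5:.2f}")
# Plainer framing of the same number, per Johnson's point:
print(f"Our ranking captures about {ndcg_at_5:.0%} of the value of an ideal "
      "ordering of the items each user cares about most.")
```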


Improve Communication (But Ship Something!) with Andrew Fong

Andrew Fong is CEO and cofounder of Prodvana, a platform that streamlines and accelerates software delivery. As a developer and engineering lead with 25 years of experience shipping products and leading software teams, he has served tours of duty at Vise, Dropbox, and YouTube. His takeaways include:

  • The Ideal State is Better Communication Driven By Mutual Interest: In a perfect world, devs, data scientists, and execs all make the effort to improve channels of communication because they’re genuinely curious about each other’s goals.
  • The Search for Common Ground Starts From the Top Down: It’s incumbent on leadership to improve understanding between themselves, devs, and data scientists.
  • The Challenge Is Balancing Non-Deterministic AI and Business Need for Predictable Outcomes: Ultimately, dev teams need to deliver a product or service people consistently want to buy–which means maximizing impact and minimizing risk.

Discussion: Balancing the Black Box vs. Shipped Products

Beyond the above suggestions, Fong suggests that in order to resolve misaligned incentives, different groups are going to have to get past the “three different people telling three different stories” phase. “For example, historically, the DevOps side of the world will say ‘No one understands reliability.’ Well, that's still going to be the problem if you're now introducing something a little bit more non-deterministic, like an AI model, on top of that.”

Fong draws the analogy to the initial resistance to the DevOps movement–which was arguably as much a cultural conflict as a technical one. “I don't think this really has anything to do with ‘AI versus not AI.’ This is just about being able to communicate what outcomes you’re looking for, from all three parties. I don't think it's all that different from what we saw (or still see) from the site reliability community with regard to software engineering versus executives. Maybe the tools need to be slightly different for the personas, but the actual organizational incentives, I think, are exactly the same.”

“I would hope that no matter how much AI research goes into your product, you end up with a predictable business outcome. Whatever AI model or thing you’re building still needs to generate something that somebody wants to buy on a continuous basis, so that had better be predictable on some level. We can absolutely talk about the efficacy of a model or how you test it, and all of those things. But to me, all of that goes back to saying, ‘OK, if we do this correctly, it's still going to produce something predictable.’ Predictably what someone wants to buy. And I think in a conversation about incentives not being aligned, that might be the point that’s getting lost.”

Is Change Management The Key Leadership Skill To Adopt AI Successfully?

Fong suggests that change management might be the biggest challenge standing between misaligned leadership, engineers, and data scientists and that ideal future where everyone is finally on the same page. “Executives that can rise to the challenge of change management–and people within an organization that can handle change management and help teams work through it? They're probably the ones that are ‘at a premium’ right now. I think the ability to build software is not that uncommon. I think the ability to fully understand the AI side may be more uncommon, but a researcher who can't help manage change within an organization over the course of a year is probably way less valuable than one who can.”

“For my team, this stuff has admittedly been fascinating. I think it was shortly after the release of GPT-4 in Spring of 2023, and our startup was relatively young at the time. We had to think carefully about where we would place our attention and resources. We were gaining traction in the market and onboarding customers, but we also had some serious discussions about whether, and to what extent, we would consider working these new AI products into what we do. As a startup that focuses on deployments, we had to ask ourselves from a first principles perspective: Will people accept non-deterministic deployments? Probably not.”

Implementing AI at the Pace of Your Product (and Buyers)

“Some months later, my co-founder and I started spending some time looking more seriously at what was available in the world of generative AI and what might be possible. We were doing a lot of prototyping–really just to see what was doable at the chat prompt level. We realized that what was possible could be way bigger than what we thought, so we decided to spend some time thinking through what we could do with these types of systems.”

“And we found ourselves asking the same questions many other technical teams have probably asked: Where does this fit in the product? What’s the right use case for it? We attempted to work on something really small initially–generating release labels correctly from commit messages. As we worked through it, we realized that we were going to have to do this–it became a must-do project. But we also knew we weren’t necessarily AI experts. So we decided to work at a pace that would make sense for our product space and for our buyer.”
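
As a rough illustration of how small such a first use case can be, here is a hedged sketch of asking an LLM to propose a release label from commit messages. It is not Prodvana’s implementation; it assumes the OpenAI Python SDK (v1+) with an API key in the environment, and the model name is just an example.

```python
# An illustrative sketch only (not Prodvana's implementation): ask an LLM for a
# release label given recent commit messages. Assumes openai>=1.0 and an
# OPENAI_API_KEY in the environment; the model name is an example.
from openai import OpenAI

client = OpenAI()

def suggest_release_label(commit_messages: list[str]) -> str:
    prompt = (
        "Suggest one short, human-readable release label (for example, "
        "'payment-retry-fixes') for a release containing these commits:\n- "
        + "\n- ".join(commit_messages)
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",   # any chat-capable model would work here
        messages=[{"role": "user", "content": prompt}],
        temperature=0.2,       # keep suggestions relatively stable run to run
    )
    return response.choices[0].message.content.strip()
```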

Fong reflects on his own professional history and empathizes with developers’ concerns about risk. “I have spent 25+ years in infrastructure. And I think that I know infrastructure people fairly well, and they're usually pretty risk-averse to things that are not deterministic. We knew we couldn’t just drop [generative AI] into the product and ‘go with it.’ We had to build an actual thoughtful strategy around what buyers might actually want from such a thing. We had to ask ourselves: Where will buyers accept [generative AI]? Where can it create leverage for them? We had to think beyond AI being cool tech and interrogate where we could create real value for our customers.”


Make the Metrics Make Sense with Stefan Krawczyk

Stefan Krawczyk is the CEO and founder of DAGWorks, an open-core platform that provides observability, lineage, and catalog for ML and LLM workflows from development to production. A machine learning & data science veteran with 15+ years of industry experience, he has held engineering and research positions at Stitch Fix, Nextdoor, LinkedIn, and Stanford University. Stefan suggests:

  • Designing for Change: The LLMOps space, and AI in general, is evolving so quickly that it doesn’t make sense to build entire businesses around industry conditions that might not be around in six months.
  • Putting AI into Production Means an Evolving SDLC: It will definitely behoove teams to take a systems thinking approach toward implementing AI–and to understand the downstream effects of any tweaks or changes they implement.
  • Drawing Up Metrics That Acknowledge the Need for Experimentation: Businesses need metrics to understand what’s working and what isn’t, and will be best served by also taking into account the need for iterative experimentation.

Discussion: Put Change in the Roadmap (and Business Plan)

Krawczyk expands on the above points by noting that the AI and ML spaces are evolving so quickly that teams should consider change as a constant part of their plans. “It’s exciting and terrifying. There's a lot of momentum and speed when you look at how different the space is from just a few months ago–so if you're going to get into the MLOps space, you need to design for change. But to manage change well, you also need the ability to reliably evaluate your outputs.”

“If you're really serious about putting something into production, you should be able to ensure that you can actually change those pieces out when the need arises–and that you haven't made too many assumptions about them, which may require the right tools, among other things. I’m admittedly a bit biased, as I’ve been working on the open-source project Hamilton, which I think is a great way to model dataflows that are modular and easy to modify. For example, when what you’re working on has changed or has been simplified–larger context windows, better LLMs, and so on–how quickly can you adopt it, and how quickly can you evaluate what changes with it so you can move with confidence?”
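
Hamilton aside, the “design for change” principle Krawczyk describes can be sketched in plain Python: hide the model call behind a narrow interface so that swapping a hosted model for a smaller local one does not ripple through downstream code. The class and function names below are illustrative only.

```python
# A generic sketch of "designing for change" (not Hamilton itself): downstream
# code depends only on a narrow TextModel interface, so providers or model
# sizes can be swapped without touching it. All names are illustrative.
from typing import Protocol

class TextModel(Protocol):
    def complete(self, prompt: str) -> str: ...

class HostedModel:
    """Stand-in for a provider-hosted model; a real version would call its API."""
    def complete(self, prompt: str) -> str:
        return f"[hosted-model response to: {prompt[:40]}...]"

class LocalModel:
    """Stand-in for a smaller self-hosted model."""
    def complete(self, prompt: str) -> str:
        return f"[local-model response to: {prompt[:40]}...]"

def summarize(document: str, model: TextModel) -> str:
    # Swapping the hosted model for the local one is a one-line change at the
    # call site; nothing downstream needs to know.
    return model.complete(f"Summarize this document:\n{document}")

print(summarize("Quarterly deployment report...", HostedModel()))
print(summarize("Quarterly deployment report...", LocalModel()))
```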

“Understanding what's going on in the system and how things have changed post-launch is going to be pretty critical to your software development life cycle. In other words, let’s say you change a prompt and it seems to work. The space of inputs for an LLM tends to be much larger than a standard unit test, so you can essentially never have 100% evaluation coverage unless you build out a giant suite of tests to confirm that you’ll get your expected output from whichever type of input you’re using.”
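
One lightweight way to approach that evaluation problem, sketched below under the assumption that you supply your own call_llm function, is a fixed suite of representative inputs with simple checks that gets rerun whenever a prompt changes. It will never be exhaustive, which is Krawczyk’s point, but it catches obvious regressions.

```python
# A small, illustrative eval suite to rerun whenever a prompt changes. The
# cases and the call_llm function are placeholders for your own pipeline; the
# point is breadth of representative inputs, not any single assertion.
EVAL_CASES = [
    {"input": "Refund request for order 1234", "must_contain": "refund"},
    {"input": "Where is my package?", "must_contain": "tracking"},
]

def run_prompt_evals(call_llm) -> float:
    passed = 0
    for case in EVAL_CASES:
        output = call_llm(case["input"]).lower()
        if case["must_contain"] in output:
            passed += 1
        else:
            print(f"FAILED: {case['input']!r} -> {output[:80]!r}")
    coverage = passed / len(EVAL_CASES)
    print(f"{passed}/{len(EVAL_CASES)} eval cases passed ({coverage:.0%})")
    return coverage
```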

Getting a Valuable MVP Running and Iterating

Reflecting on the question of whether the future will belong to large-scale foundation models, open models, or smaller, local models, Krawczyk suggests that teams will be better served by starting from their intended outcomes and understanding how upstream changes will affect their AI program going forward. “Some people have suggested that the future will belong to complementary tools that close the gaps in popular foundation models, but if the likes of OpenAI, Anthropic, or Google update their models and fix those issues, your business is gone. Another reason to design for change.”

“For machine learning programs, you’d usually start with heuristics and rules, and you’d build your way up. Now, these major foundation models, in contrast, seem very ‘general purpose.’ I think you will potentially see reasons to move away or try something else–reasons like hallucinations and controllability, but also cost curves and other business considerations.”

“For example, maybe you’ve built an MVP on ChatGPT. But maybe you realize that you can get away with using a much smaller model. Then that means the hardware to run it gets cheaper as the model requires less memory, and so on. But before investing in that, you first need to prove the business value. I think the life cycle will look something like this: Get something running, prove that it works, prove that it's valuable, and then refine it with something like fine tuning. So I think if anything, there's going to be more tooling around the software development life cycle. How do you develop? How do you then change something that's running? You'll have different APIs or foundational models for different parts or different calls.”

“From there, how do you evaluate that? How do you change it with confidence? Monitoring could be a potentially huge challenge, for instance. If you have a lot of these LLM API calls back to back, some small perturbation up the top might really impact things down the line. How do you easily debug, trace, and understand what went wrong? I think these are problems that have existed before, but just at a different kind of scale and rate of change. That’s why I’m also building another open source framework called Burr.”
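
Setting specific tools like Burr aside, the underlying idea of tracing a chain of LLM calls can be sketched generically: record each step’s output and timing so that a small perturbation at the top of the chain can be followed through the steps below. The step names and callables here are placeholders supplied by the caller.

```python
# A generic step-level trace for a chain of LLM calls (not Burr's API): record
# each step's output preview and timing so a small change at the top of the
# chain can be followed through the steps below.
import time

def run_traced_chain(steps, initial_input: str):
    trace, value = [], initial_input
    for name, fn in steps:                 # steps: [(step_name, callable), ...]
        start = time.time()
        value = fn(value)                  # each step consumes the prior step's output
        trace.append({
            "step": name,
            "output_preview": str(value)[:120],
            "seconds": round(time.time() - start, 3),
        })
    return value, trace                    # log or diff the trace across runs to debug
```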

Closing the Gap Between Executive Expectations and Ops Realities

Krawczyk points out the inherent conflict in trying to fit inherently non-deterministic systems into traditionally rigid units of measurement. “Admittedly, you’re dealing with something that’s part of a hype cycle. But you want to manage expectations so that execs don’t think you've over-promised and under-delivered, right? We want our project to be up and running, and stable. We’d naturally try to define some sort of metrics or boundary, something to prove some sort of baseline. But given my time in research work with ML and related disciplines, I think that some of this stuff just doesn't always fall nicely into a sprint.”

“In the very short term, you might look at metrics like iteration speed–define some boundaries of what success is. If your immediate goal is making execs happy, you can start with things that are inherently measurable, and narrow the scope, such as down to a specific use case. But with regards to how incentives are aligned between data scientists, developers, and executives, it’s ultimately a tricky question, because you have to ask: Who's developing and who's productionizing? How do you actually get things out? Who is ultimately responsible?”

“If speed is critical, you’d want the same person or team doing both. How would that team be measured? There’s organizational stuff to think about–something that more people are going to encounter over time because back-end and front-end devs can now spin up these applications themselves. If eng teams promise the world, or are expected to deliver the world, and they can’t, they’ll need someone with an ML or data science background to set everything up. This is why it’s so important to be able to properly measure and evaluate, something that those with an ML and/or data science background should know how to do.”


Will MLOps Go the Way of DevOps? with Adam Zimman

Adam Zimman is an angel investor and strategic advisor to early stage VC firms and software startups. He has served as a professional developer, an engineering executive, and a go-to-market executive at organizations such as Dell (EMC), VMWare, GitHub, and LaunchDarkly. Adam offers the following observations:

  • Risk-Taking and Experimentation Need to Become Part of Your Culture: Google’s SRE team adopted the concept of error budgets to stress-test systems (a rough error-budget calculation appears after this list). Organizations that adopt non-deterministic models should do the same.
  • Incentives Need to Align With Experimentation: The orgs that make important discoveries faster will be those that don’t penalize taking risks.
  • Operational Excellence Will Come from Integrating MLOps into the Dev Life Cycle: Orgs will get past the bottlenecks of requests getting thrown over the fence by rethinking the development life cycle to include what MLOps teams need.
  • The Lines Between Software Devs, Data Engineers, and Data Scientists Must Blur: Orgs will see the greatest efficiencies as the lines between data science and dev blur. And given how few ML PhDs there are, the data engineers of the future may increasingly come from traditional software dev backgrounds.
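
For readers unfamiliar with error budgets, the arithmetic behind them is simple, as the rough sketch below shows with illustrative numbers: the availability objective implies how much failure a team is allowed to “spend” on risky changes and experiments each month.

```python
# Illustrative numbers only: a 99.9% availability objective over a 30-day month
# implies roughly 43 minutes of unavailability that can be "spent" on risky
# changes and experiments before the SLO is breached.
SLO_TARGET = 0.999
MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200 minutes in a 30-day month

error_budget_minutes = (1 - SLO_TARGET) * MINUTES_PER_MONTH
print(f"Monthly error budget: {error_budget_minutes:.0f} minutes of unavailability")
```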

Discussion: Applying Software Dev Lessons to the Future

To contextualize his observations, Zimman refers to some of his recent conversations with startups he advises–and how they rhyme with the rise of DevOps. “I've talked to a number of startups that are looking to establish this notion of MLOps pipelines and set up some standards around it. It seems like there aren’t too many who have quite nailed it, at this point. There's still a huge bifurcation between the academic community, which has been very participatory in the building of ML models, and the engineering-plus-operations community, which has been looking to support them operationally. There's still a sense that data scientists are very much viewed as an ‘other,’ as opposed to just a regular member of the engineering team. Right now, one of the biggest challenges is a cultural one, not a tooling one.”

“It’s kind of like when developers didn't consider themselves to be part of operations. This was part of the motivation behind the DevOps movement. It was getting to that realization: ‘For the work that the two of you [Dev and Ops] are doing, neither one of you exists without the other.’ Data scientists still seem to sit outside of that, until the situation changes culturally–until some companies start looking at how to incorporate data scientists as, for instance, a different developer persona – similar to front end versus back end developers. From there, you have to figure out–how do you get them to use the same tools? How do you get them to use the same processes that are going to drive that unification, not just functionally, but culturally? You need to get them to realize: ‘Oh yeah. I guess we are all working together.’”

Why Experimentation Is Important Enough to Bake Into Business Incentives

Zimman reflects on how the current risk-averse environment is eerily similar to the software startup scene 10 years ago. “This whole conversation reminds me of the early days at a startup where I worked. Basically, the idea of ML models being this ‘adjustable algorithm you still need to refine’ requires a culture and mindset that encourages experimentation, like the Google SRE error budgets. Really thinking about how you can actually encourage people to break things because you know that will ultimately make you better, faster, and more stable.”

“For the vast majority of organizations, that is still a huge cultural change. In those earlier days, we had people agree that experimentation sounded like a great idea in theory, analysts claimed it was valuable, and there was data to back it up. But in reality, very few organizations have been able to adopt a risk-taking culture with any kind of success. And it was because the incentives for advancement, for compensation–for avoiding getting calls in the middle of the night–were all completely misaligned with this idea of failing on purpose for the sake of learning.”

Why ML in Production May Be Having Its “DevOps” Moment

While the “before” picture for how MLOps, engineers, and executives get on the same page remains uncertain, Zimman is confident the “after” picture will look a lot like modern DevOps, both culturally and organizationally. “I’ve heard both James Governor and Charity Majors use the term operational excellence, which leads me to think of things in terms of maturity models. Teams are going to get to a point where operations are part of your development backlog and lifecycle. You’ll need to start thinking about how your operations team can actually make requests back to engineering or changes to systems and services and applications that will help them unblock bottlenecks and just make themselves more efficient.”

“Frankly, that’s something of an anomaly, even among Fortune 2000 companies–I’d argue that less than 10% of them have reached that point of maturity from an operations perspective.” But Zimman cautions against siloing across different teams, which can jeopardize alignment and slow things down. “My big concern is that the picture starts looking like DevSecOps–where all of a sudden, you have this notion of a ‘third team.’ To me, there really shouldn’t be.”

“If you want to think about the context of DevOps, then at its core you’re doing one of two things: You’re building a thing, or you're running a thing. If you're on the build side, you're on the build side and that's it. You are part of being able to write code, deploy new code, and continue to iterate and develop. If you're on the run side, then you're on the run side. Your responsibilities are only to keep things up and running and to look for opportunities for improvement. But the reality is that those two sides should be working closely together to make sure that you're iterating quickly and intelligently to tackle the things that matter most, first.”

How Operationalizing Could Lead to Common Ground

Zimman suggests that while data, engineering, and operations will converge, differences will necessarily remain. “Personally, I don't believe that large-scale DevOps teams are actually ‘a thing.’ I think at scale, DevOps is a change to process, technology, alignment, and culture. It's not a single team structure. While it can roll up to a chief development officer or CIO, the reality is that you still have individuals whose roles are either developer or operator. When orgs try to claim that their developers are operators, that just doesn't work at scale in my experience.”

“So, operations teams are still going to be operations teams. And I think that they're going to need the ability to own and control model deployment and delivery. Similarly, I think that on the developer side, the role of data engineer may end up going to developers. And so, they're going to need to make sure they're working within the kind of guidelines you’d expect within the development community. Now some of those guidelines may need to shift and expand so that they incorporate different tools or augmented processes that are needed by a data scientist. But ultimately, I think the healthy teams are going to find ways to be able to look at consolidating that and collapsing that down so that there's less of a distinction.”

“I've been in organizations where, initially, there's a need from the marketing team to have developers build the website. And ultimately, the most successful ones I've seen have been the ones where that function inevitably gets moved out of marketing and back into R&D, because putting developers in an org with other developers gives them a clearer career track. This isn’t strictly an org structure thing–you can still have reporting back into marketing, maybe with a dotted line back to engineering, but there's got to be a greater relationship and connection between those types of functions.”

“It's going to happen with data science. What I think will really change is that, as models in production and the need for data science expertise become more mainstream, more developers will need to take on the role of data engineer, because there just aren't enough data scientists to do it. You're just going to see fewer folks coming from academia as there are only so many PhDs. And while some PhDs may continue to be enamored with ‘the perfect,’ an increasing population of software-devs-turned-data-engineers will recognize that perfect is the enemy of good, and that ultimately, we have to ship something. There needs to be a Progressive Delivery approach that prioritizes shipping, testing in production, operational excellence, and consistent incremental improvement.”


Conclusion

Over time, as more teams come to understand the complexities of machine learning models in production, more organizations will hopefully be able to set sensible goals that allow for experimentation (and the occasional unexpected model hiccup). That outcome is far more likely if software developers, data scientists, and executives can close the gap in specialized knowledge through more education, better tooling, and a culture that encourages transparent knowledge sharing and learning through improvement.

For more discussion on the challenges of getting AI into production, join the DevGuild AI Summit II event.