Building AI Factories: How Red Hat and NVIDIA Turn Enterprise Data Into Intelligence - Ep. 293

March 12, 2026 · 38 min · 6,940 words


Show notes

Enterprises are moving from AI pilots to full‑scale AI factories that turn data into trusted digital intelligence. Red Hat CTO Chris Wright and NVIDIA’s Justin Boitano unpack the "five‑layer cake" AI factory stack, from accelerated hardware and hybrid cloud infrastructure to models, agents, and production‑grade governance.

Highlighted moments

“Perfection is the enemy of good enough. So if you have this like perfect view of your future world where you've normalized all your data and everything is well-defined, you'll spend all of your time doing that. And you'll never be able to get to showing some business value.”

Jump to 23:28 in the transcript

Transcript

Introduction to AI Factories

0:00Welcome to the NVIDIA AI Podcast. I'm your host, Noah Kravitz. My guests today are Red Hat's Chris Wright and NVIDIA's Justin Boitano, and we're talking AI factories. Why should enterprises build AI factories, and how can they do so with confidence in building AI factories that they can trust? By way of introductions, and I'll keep it brief because

0:31both of these guys' work speaks for itself, really. Chris Wright is Chief Technology Officer and Senior Vice President of Global Engineering at Red Hat, and Justin Boitano is Vice President and General Manager of Enterprise Computing at NVIDIA. Gentlemen, welcome to the NVIDIA AI Podcast. Thank you so much for taking the time to join us. Thanks for having us. Thanks for having me, Noah. Let's get right into it. And Justin, I'll start with you, but always both of you guys feel free to jump in, you know, as the spirit moves you, so to speak, as we go. But Justin, why don't we

Defining AI Factories

1:00start with you? Can you talk a little bit about, well, maybe first give kind of a working definition of what we mean, what you mean when we talk about an AI factory, and then get into kind of at a high level, why would an enterprise be interested? Why are enterprises building AI factories, and what are some of the tangible benefits that an enterprise can expect to see from an AI factory? Sure, Noah. Yeah, you know, and I think it's important to understand kind of the context of where we are as an industry. And, you know, building digital intelligence to power the productivity of organizations is going

1:35to be as critical in this decade as, you know, energy in running our companies. This is the next industrial revolution, and companies are always asking us, you know, how do we build these factories that basically take data in and then produce the intelligence that, you know, helps them run their businesses more efficiently. And so as we talk about, like, what is an AI factory, you know, we think of them as really kind of five layers of technology that need to come together. At the base layer, you know, you got to make sure that you have got the data centers with power to bring

2:08into these factories. You've got to have, you know, chips is the easy way to talk about it. But we're at this point of building rack scale infrastructure that's, you know, six chips with extreme co-design to build the best token efficiency from the power available to you. The next layer, you typically want to have the software infrastructure to orchestrate everything. And then you want to have models that run that intelligence, and then ultimately, the apps and the agents on top. And so what every business needs to do, though, is take this intelligence and build, you know,

2:42use case specific business outcomes that help them drive innovation, build products faster, and ultimately, you know, grow revenue, top line through deploying this intelligence at scale. Right. And so these five layers you're referring to, this is the cake, right? The five-layer cake? That's right. This is the five-layer cake.

AI Factory Layers

3:00Excellent. Chris, the world is, I feel like we can say this so often, but things are changing so quickly. Right now, as we record this, there's a lot of talk about OpenClaw and autonomous agents and kind of long-running agents. Can you speak kind of a little bit sort of to that and how NVIDIA and Red Hat are working together to help enterprise and enterprise IT departments kind of step into this new world? Yeah, actually, OpenClaw is a great example because there's so much enthusiasm about, I guess,

3:32what's possible, what you could do. It's captured the kind of the builder's imagination, but also built quite quickly, certainly leveraging AI to help produce code quickly, but not with the enterprise in mind. So when we think about what Justin was describing, that kind of data in to a factory context that produces business value as an output, we're talking about enterprises. That's their data. Those business outcomes are really either

4:03driving net new growth or focused on the productivity and efficiency. All of that needs to be done responsibly, safely, respecting access controls, delivering audit trails, things that are maybe not as fun in the builder world, but fundamental to the enterprise world. And so a lot of what we're doing is taking these building blocks, the layers of that five-layer cake, and making them accessible to the enterprise together. So obviously, NVIDIA has got world-class hardware. We're bringing a

4:38software layer that enables the higher levels of that cake. And then we're building the right guardrails and security considerations into this combined solution so that our customers can then feel confident about bringing this into their enterprise as they're all trying to figure out how to do AI transformation, go from a traditional company to really an AI-native company, and in that context, not introduce undue risk or essentially undermine the core of their business.

AI Adoption Statistics

5:11Right. So there's research that shows that only 1% of organizations right now have reached the stage of an optimized, AI-fueled, AI-native, as you were talking about, Chris, enterprise, while over half of organizations still remain in the early stages of transformation. But at the same time, projections have global AI investment exceeding a trillion dollars total by 2029, just a few years out. And of that trillion dollars, these projections are saying agentic systems are going to account for roughly half of that spending. That's a big shift from, you know,

5:46a year ago, two years ago. You guys know the timeframe better than I would. But, you know, when agents were kind of this buzzword that, you know, nobody necessarily knew, there are all these different definitions, et cetera, et cetera. And now we're talking about all of this, you know, resource and spending going in specifically to agentic systems. Chris, what can we glean from this? And I know you spoke to it a little bit just now, but, you know, what are the kinds of things that the AI factory can do for an enterprise infrastructure-wise, but confidence-wise,

6:17as you were talking about, when it comes specifically to figuring out how to deploy and integrate these agentic systems? Well, if you think about that notion of transforming the enterprise and leveraging internal data and focusing on your core business about how do you improve it or grow it, there's a whole set of things that are underneath that. Obviously, the data piece that we talked about, but also it is the existing tools that operate your business that are not going to just go away. They're fundamental. They're the baseline, the business as usual components,

6:50pretty critical and fundamental. So part of this is how do you carry that forward and really modernize your entire infrastructure to bring these two worlds together, this highly modern AI native world and the traditional set of applications that literally run the business? Because you need to bring AI capabilities, not just in the net new, but also in the existing content that runs all the enterprise. And to me, that's exactly what the AI factory does. It helps bridge these two worlds. I

7:23mean, in the end, we've got models, but we also have, as Justin described at the beginning, agentic content or AI-enabled applications and then also the traditional applications. So bringing all of that together and then doing it in a context with consistency across the enterprise so that you're not asking every team to go figure out their own, choose your own adventure path forward. And that consistency, you build best practices across your organization. And then you're ultimately improving your chances for success and reducing the failure rates. There's so many studies

7:55that suggest a lot of AI projects can fail. There's a number of reasons for that. One of those is having the right tools and having the best practices and access to the data and combining, essentially combining forces as a company to produce an output rather than devolving into sort of the next generation of shadow IT and, you know, everybody building their own thing and creating this highly fragmented internal environment, which is then, you know, kind of difficult to get your arms around, if not just

8:29produce zero, very little success.

Scaling AI Factories

8:32Yeah. Justin, are you seeing similar things? Well, I got to say, you know, what's interesting is in the last three months, it feels like the market has really started moving even faster. I'll just say this. You look at coding companies. Every one of you guests who comes on this podcast says that same thing, moving faster, moving faster. Well, well, but you can actually really feel it now. And I say that because like these, the, you know, the first area of agents, like product market fit, you know, really was in software development. And we see it really as a, as a software company ourselves. You know, we, we can feel

9:08these agents doing so much more work for our developers and running longer, more complex software tasks. So you give them design goals and they can, you know, work towards those goals. And at the same time, you know, like you said, this, this moment of, uh, claws came out and claws basically take it sort of to a new frontier of, you know, full autonomy. And so, you know, we're, we're getting to this point where, you know, agents are going to have a lot more agency within our enterprise. A lot of those studies that you mentioned where people were having a hard

9:40time getting AI to work, I think was at a previous era of the world where people were trying to do like chatbots, like just very basic chatbots. And that was before reasoning. And it was before, you know, this level of autonomy that I'm talking about. And so I feel like, uh, a lot of what enterprises might've been experimenting with might be a couple generations behind where state of the art is right now. Uh, and so as we deploy agents internally now that can use, you know, a very, uh, I'll call it deep agent like reasoning framework, they can, you know, plan and reason and act across

10:15many different business systems to do a deep research as an example, to understand kind of the intent of what a user might be asking and help them get to, uh, the, the information across the enterprise in a way that's faster and more efficient than ever, you know, previously thought imaginable. And the nice thing about running this on a factory, an AI factory, uh, within the context of an enterprise is as Chris mentioned, it delivers data privacy and security, uh, by running that all across open models in this on-prem world. Um, and then you can do things where you, you still,

10:47you know, potentially use the frontier models, but you can use the frontier models in a way where you might only use it for the planning stage of the agent and all the search and summarization is using open models. And so that drives a lot of cost efficiency. In some of our, uh, newer blueprints, we see a 30 X cost reduction, uh, by doing a hybrid model architecture. Uh, across your, you know, private, um, unstructured information. And so that is a use case. The enterprise search, I think is, you know, very, uh, uh, a broadly generalized use case, uh, that

11:17gets us from, you know, I'll say these early adopters that were getting, seeing the benefits of agents for coding into really how knowledge workers are going to start to use agents to help them, uh, do their jobs in a much more productive and efficient way. Uh, as somebody who sits more on the knowledge worker than a software developer side of the fence myself, um, getting me more towards that and away from vibe coding is probably a good idea, but that's just my own sort of personal use case there. But that does make me want to double click a little bit on, you know, on, on security and, and governance and things like this. Um, which,

11:51you know, I think Chris, you mentioned at the top with the advent of, I mean, joking aside with the advent of, you know, coding tools, vibe coding tools, and, and these more advanced agentic coding tools in the hands of anybody, including folks like me, it's easy to spin something up. I don't know that it has, you know, a hole in it waiting for a prompt injection attack or, or whatever the case may be. Right. And, and get into that shadow IT world, Chris, you were talking about. So I want to ask you both and Justin, I'll start with you because you were talking about a little bit just

Non-Negotiable Capabilities

12:22now. When you talk about planning and building an AI factory, what are the non-negotiable capabilities that have to be built in that the enterprise must have to move from, you know, kind of first experiments and prototypes with AI to production, getting into industrial scale production, AI use cases with confidence. And, you know, Justin, you mentioned some of these, but there's security, there's governance, reliability, obviously moving to scale. Um, you talk a little bit about some of these factors. Yeah. And I think, um, I'll say in the software development world, we're really

12:55good at separating the notion of development versus production. And I think that's obviously the best, best practice, uh, as enterprises get going is to separate the two. And on the one hand, you want to help your internal, I'll call it AI development teams, uh, do discovery in a development environment, but separate that access control from, you know, production data until you've basically, you know, proven the verification or done the functional verification of the, the outcome that you're trying to get to, you've QA'd it, you pen test it. It's got things like role-based access

13:29control so that if a user is using that agent, it inherits their permissions to access business systems. And you're going to move, you're going to promote, you know, the agent from this development environment into a prod in that way. Um, and so I, I think, um, you know, I think, uh, the, the worst thing that enterprises can do is overanalyze this though, and try and get to, uh, like, how do I get to how do I prove the TCO upfront before I start to make the investment? You've got to, you know, believe that, you know, AI is this new frontier and the companies that are able to harness it and

14:01put it to work for them are going to have a massive competitive advantage. And so the sooner you get going, the better. And you can start in this, this dev environment with, I'll call it narrow use cases that are aligned to your core business goals. And then, you know, scale as you, as you start to see success, but to your point, so you want to make sure these, these agents, you know, have, you know, there, there's a clear set of governance, uh, there's, you know, clear ability to trace like the data systems that they access, um, and that you can continuously evaluate them against known

14:32business outcomes that you're, you're trying to achieve. Um, and then that accuracy, uh, against certain use cases is what allows you to promote it then in, into production. Chris, how can I ask you how things like, well, inference, obviously we, uh, did an episode recently about it was energy focused, but talking about the coming wave of inference and, you know, the, the shift of the load moving to some extent from training to inference, you know, maybe in this calendar year or whatever, kind of the next wave is. Um, but talking about things like high performance inference, um, and also hybrid cloud agility, how does the AI factory sort of figure in

15:08it and support these two things in particular? Simply put inference is your production environment. Uh, so training, whether it's pre-training or post-training, those are things that are happening pre-production and inference is where you're bringing this intelligence to life. So scale, efficiency, security, uh, you know, robustness, reliability, compliance with policy, compliance with SLAs or SLOs. These, these are like the table stakes and an AI factory is, it's a significant investment for an enterprise. The expectations are it produces significant

15:42business outcomes. And so we're focused on optimizing that production of outcomes, which you could back up and say, those are business intelligence, or you can back up a little bit more and say, it's simply tokens, optimize that, that throughput of tokens in the context of costs, in the context of power consumption, because we're, we're also power constrained. And so how do we do that? That's through this scaled out inferencing, um, which is part of the AI

16:13factory. It's, it's really the, the, the core underlying platform that you're running all of your, your models and, and above that, the agents and AI applications on top of. So to me, it's the, it's the critical substrate and the agility that comes with flexibility and choice of where and how you deploy your models or, or your, uh, your workloads, that notion of pre-production environments and production environments and where production data versus non-production data is used. You, you, you get

16:45some choice in where you deploy. And that to me is really the hybrid cloud, your, your optionality. There's cloud environments, there's enterprise environments, there's even edge environments where you may want to deploy, uh, your workloads and taking advantage of all of that with, with a consistent footprint. Um, like we're building with this AI factory, it gives you the best of all, uh, of, of your alternatives. Uh, and so, you know, I think we're, we're bringing the efficiency, we're bringing the flexibility, uh, we're, we're ensuring that we have those confinements, whether

17:21it's confidential computing or guardrails or, uh, any kind of sandbox technology that I think becomes really critical as we're building and delivering these, these new capabilities. And if you, if you go back in time before the focus on AI, we developed through decades of experience, uh, what Justin highlighted that pre-production, you know, dev, test, prod kind of best practices. There's a whole set of learnings and rigor and discipline that we've built in building and

17:54delivering applications into production that we're bringing as part of an AI factory for building and delivering AI applications into production. I'm speaking with Chris Wright of Red Hat and NVIDIA's Justin Boitano, and we're talking about the AI factory and how enterprises can build AI factories that they can go to production with, with confidence and can scale up to the future and really help transform companies into AI natives as we've been talking about. I want to

Infrastructure and Software

18:23get into a little bit about specifics, infrastructure and software and platform components. And Justin, I'll start with you for customers who are thinking about an initial AI factory footprint and might want to start small, but have that ability to scale as, as they, as they scale. How should those customers think about sizing and selecting NVIDIA infrastructure and software? Yeah, I think, you know, as we, as, as the customer starts to try and build the AI factory, they, they've got to think through the five-layer cake, uh, that I mentioned previously. So it's, uh, where do I have data center power? Uh, what is the power

18:57density of the data center? Uh, do I want to run air cooling or liquid cooling? That seems to be a decision point right now. Um, a lot of enterprises still run air cooled data centers. And so, you know, platforms like our RTX 6000s give you, you know, very good, uh, price performance. That's kind of a general purpose GPU to do experimentation with. So if you don't know where to start like that, that kind of gives you a great platform for many different use cases. And then from there you, you start to ask yourself, well, what's the orchestration management platform that I want to run my business on? Um, and

19:28that's why we work very closely with the Red Hat team. Red Hat's AI factory takes care of, uh, really the next few layers of the technology stack from software orchestration and management, uh, model delivery, all the, I'll say commercial security patching, lifecycle management of all of that open source software so that you can run it with confidence and, uh, and kind of get the factory up and running. Uh, and then you get up into the application layers and the application layers, the way we try and make it easy for customers to start as we provide, uh, reference blueprints,

20:00uh, which are examples of proven use cases, uh, that even we run on our AI factories at NVIDIA for things like enterprise search that make it easy to then connect into your enterprise documents and do document ingestion and then start to provide benefits to your users. And then from there you can start to expand into, you know, your own developed use cases and such, but, uh, that thinking through that full stack is really the easiest way to get going. And then I think taking some of these proven examples is like kind of the, the quick way to get an early win with kind of your executive,

20:32uh, leadership team, showing them the benefits. And then from there, usually you pivot into, you know, what's the most important business outcome for the company to be competitive? That you got to ask yourself. For NVIDIA, we're a chip company, uh, we're a software company and we're a supply chain company when you really boil it down. And so we then go super deep into those use cases to make sure that we're enabling, you know, tens of thousands of chip designers or software engineers, or all the people dealing with all the components that allow us to have supply and availability of, uh, building this rack scale infrastructure. And if we're world-class

21:04at those, then we can be world-class in market. And I think that's generally how companies should think about it. Chris, on the Red Hat side, what are the key platform components that you see as foundational for this first AI factory deployment? Thinking about things like OpenShift, Red Hat AI Enterprise, AI Factory with NVIDIA, you know, when thinking about this first enterprise AI deployment, what are the key platform elements to start with? And also how should customers think about sequencing them? Yeah, I think for us, the stack starts with hardware, hardware enablement,

21:37and then the distributed nature of rack scale architecture. How do you make, get access to that whole distributed system? Uh, and then, you know, going up from there, we start getting into specifics of models and agentic applications and AI-enabled applications. So the bottom of the stack, very clearly, that's, that's the world of Linux, right? Hardware enablement, device drivers, low-level system software, and, you know, near and dear to our hearts, we spend a lot of time in that space and making sure that we work closely together with NVIDIA to do that first phase,

22:09you know, right against the metal enablement. The next layer above that is the distributed layer. Bringing that rack scale architecture to life includes a distributed system like Kubernetes. Kubernetes is tried and true in the application space, and it's supporting well delivery of agents or models or other content as containers on this distributed system with access to all the accelerators, uh, down at the bottom of the stack. And then that Red Hat AI Enterprise layers on top.

22:40Uh, and this is where we start to integrate directly with some of the key capabilities, uh, that NVIDIA brings, like optimized models like Nemotron or some of the NIMs. And that's where we bring that, that distributed inferencing stack that is the, the foundation for intelligence for the business. Uh, so, you know, we sometimes call this the metal to agent stack and starting on, you know, with that layer right above the hardware, building up through inferencing and then supporting the models that is, is what we're building together to enable those key reference architectures that, that Justin

23:16highlighted or, or the validated blueprints or the reproducible plays that you want to bring into the enterprise. Cause I think it's important to have those early wins. Justin highlighted that it's important to have those early wins. It's an interesting tension. Perfection is the enemy of good enough. So if you have this like perfect view of your future world where you've normalized all your data and everything is well-defined, you'll spend all of your time doing that. And you'll never be able to get to showing some business value. But if you over-rotate to the easiest thing to do,

23:49the flashiest thing I can show, it might not have much business value. So picking those right first key use cases and also having in parallel this long-term mindset of it's a pretty fundamental shift in how we operate, you know, living in that duality, that's, that's the future on building the right stack to support rapid movement, consistent reproducible or, or replayable plays. And, you know, building from infrastructure that IT operations teams already understand. They know Linux,

24:21they know Kubernetes, they're, you know, they're learning a lot of new things in this context. So we'll give them as much stability as we can along the way. Yeah, no, it makes sense. Going along those lines of the, uh, getting those first wins, right? Which is a great strategy for, for lots of workplace projects to take on. But I think talking about such a big shift, you know, to the AI way of working, if you will, let's look at those first 90 days that, and can you lay out some kind of practical first steps? We've got the elements of the joint stack laid out, the hardware, the software that, you know, metal to agents, as you called it, Chris,

24:54what are some practical things that folks listening to the podcast, enterprise leaders can do and kind of structure their first 90 days to get some wins and really start building that AI factory that can grow? Yeah. Uh, I'll assume the data center infrastructure is built out. Okay. Fair enough. Let's assume the data center is built in, infrastructure is built out. So, you know, what we, what we publish is what we call validated designs that sort of walk you through a lot of the design decision points of the software. And, uh, you know, you, you have to think of like,

25:25how do I want to, you know, how do I bring all of my software into this factory? Uh, how do I make sure I do, you know, security scanning? You know, if I'm going to want to rescan everything and, and operate it, how do I have automation to stand it up? Uh, and then, you know, quickly, how do I get these first, we call them blueprints, but think of them as like Kubernetes services that you deploy on the clusters, uh, to then get users on the system. Uh, and then ultimately what we do is we have, we call them like user acceptance test teams that we will roll an application out

25:56to, to have them use the application. Um, so they can start to, you can start to survey them and understand how are they doing work now versus how did they do it before? How much time are they saving versus how they did it before? And really that time savings is the productivity gain that you're going after. And you can really quickly get to, you know, from time savings across a user group to productivity, you know, gains. And so if you can get a 2X, uh, productivity gain, you know, across a big population of users, then, you know, you're, you're onto something really big.
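The step from surveyed time savings to a productivity figure, as described above, can be sketched as a short calculation. This is only an illustration under stated assumptions: the `TaskTiming` record, the function names, and the pilot numbers are all hypothetical and do not come from the episode. It treats "productivity gain" as the ratio of total time spent before the rollout to total time spent after, so a 2X gain means the same work takes half the time.

```python
from dataclasses import dataclass


@dataclass
class TaskTiming:
    """Hypothetical survey record: minutes one user spends on a task."""
    user: str
    minutes_before: float  # how long the task took before the agent rollout
    minutes_after: float   # how long the same task takes with the agent


def productivity_gain(timings: list[TaskTiming]) -> float:
    """Return total time before divided by total time after (2.0 means 2X)."""
    total_before = sum(t.minutes_before for t in timings)
    total_after = sum(t.minutes_after for t in timings)
    if total_after <= 0:
        raise ValueError("total time after rollout must be positive")
    return total_before / total_after


def fraction_of_time_saved(timings: list[TaskTiming]) -> float:
    """Return the share of the original time that was saved (0.5 means half)."""
    total_before = sum(t.minutes_before for t in timings)
    total_after = sum(t.minutes_after for t in timings)
    if total_before <= 0:
        raise ValueError("total time before rollout must be positive")
    return (total_before - total_after) / total_before


# Made-up numbers for a small user acceptance test group.
pilot = [
    TaskTiming("user_a", minutes_before=60, minutes_after=30),
    TaskTiming("user_b", minutes_before=90, minutes_after=45),
    TaskTiming("user_c", minutes_before=30, minutes_after=15),
]
print("gain:", productivity_gain(pilot))
print("time saved:", fraction_of_time_saved(pilot))
```

Summing times across the group before dividing weights longer tasks more heavily than averaging per-user ratios would; which of the two is the better measure depends on how the survey is designed.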

26:27Absolutely. Yeah. Chris, anything to add? The learning that you'll gather along the way, I think is really important. And so the notion of starting with a focused, you know, to have a hypothesis, uh, and a focused outcome and also iterating as you go. So it's about how quickly can you move forward? Um, I, I think that's really important. Our, our experience internally is reinforcing that. And we started with some really focused examples of data that we want to bring together, uh, within, within Red Hat, the research

27:03we wanted to do across that, that data and having evals. I can't understate the importance of evals. I think that it's often, often overlooked, um, part of the, part of the stack is they help you ensure the quality of what you're trying to produce. Uh, and so, you know, building iteratively towards improving your evals, we see this in the public with frontier labs focused, uh, on benchmarks and evals. Um, but they're just as important within, within the enterprise and that iterative process

27:34of refining any portion of the stack. It could be your prompting. Uh, it could be how you're managing the data sourcing. It could be even the scoping of the problem that you're trying to solve. I, I think that's, that's really important. And that notion of picking something that's real. So it's not so artificial that you can just show it. It's flashy. You get, you get high fives all around, but it doesn't really change anything internally. Right. I don't think that's particularly useful. So focusing on, on those things that are, are real, but again, not making it too big. So

28:08right. So the right sizing and the iterative process of learning as you go is how you start building the thing that ultimately is quite big. Uh, but yeah, I think it's starting small and iterating, which we do a lot in open source, uh, we do a lot of software development and having a little bit of diversity. We have touch points across every different function in our organization. There's different personas, um, but there's also different use cases. You know, it's more software development oriented. It's more finance oriented. It's more sort of sales and pipeline

28:41oriented. Each of these brings a little different dimension that is, again, it's helping you flesh out your end to end view of, of what's needed to go through whole scale AI transformation and be, uh, operating with, uh, uh, a full tilt AI factory powering your business. Yeah. And as you get these first projects going and not, not to, you know, sort of skip all the hard work in between, but as you mentioned, thinking about, you know, getting something going with an eye towards building out to scale and transforming the whole org, looking at it from the other

29:14perspective, what kinds of guardrails would be not just could people put in place, but what kinds of guardrails would you recommend would be appropriate kind of from the get go to make sure that as things scale, as things expand, as you know, wins are won and people get excited and want to use this stuff more and go faster. What are the things you can lay down kind of from the beginning to make sure that, you know, technical and process and governance and, you know, the guardrails are in place for these kinds of things? You know, I think, um, so what, one thing that, that we did, uh, up front

29:47was we made sure obviously our security teams were, were deeply involved as we did this just to make sure. I think you, you learn a lot about your organization as you start to put AI to work and you'll find AI is really good at doing discovery in business systems that it has access to. And you might realize, you know, you've got user permissions over scoped in areas. Um, and so having the security teams understand, you know, are we allowing too broad of access to what we want to keep confidential within the organization you'll discover as you start to connect agents into your business systems.

30:20But you know, there's, there's all kinds of techniques for, you know, guardrailing data access. Once you do find systems that it might have access to, you're going to realize that you've got to, you know, change permissions of, uh, many different business systems. You know, ultimately what you're going to want to do is scope the agents potentially as users. You think of them as digital employees. So where a lot of people start is they scope them to the user that's using the app, their permissions. Uh, so they see access to the information that they've been granted as a, as an employee in the organization. But as we go forward, you know,

30:53these agents are going to start to work more and more autonomously and we're going to have to treat them almost like contractors. We bring in, you give them least privilege access into your business systems and then they got to come back to you and check in with you and ask you, uh, for access to more business systems. And you're going to have to have a process in place where you can, you know, slowly get, grant them more access to do the job and fully onboard them into the job that, that we're asking them to do. Chris, I'm going to, I'm going to turn this one to you first, but, um, you know, Justin, you can, you can be thinking in the background about your answer. Uh, we like to end these.
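The contractor-style model Justin describes here (start an agent with least-privilege access, then have it check in and ask before touching another business system) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not anything Red Hat or NVIDIA ships: `AccessBroker`, `onboard`, `can_access`, `request_access`, and the system names are all hypothetical, and the approver callback stands in for whatever human or policy review an organization actually uses.

```python
from typing import Callable, Dict, Iterable, Set

# Hypothetical approver signature: (agent_id, system, reason) -> approved?
Approver = Callable[[str, str, str], bool]


class AccessBroker:
    """Tracks which business systems each agent may use (illustrative only)."""

    def __init__(self, approver: Approver) -> None:
        self._approver = approver
        self._grants: Dict[str, Set[str]] = {}

    def onboard(self, agent_id: str, initial_systems: Iterable[str]) -> None:
        # Least privilege: the agent starts with only what it is given here.
        self._grants[agent_id] = set(initial_systems)

    def can_access(self, agent_id: str, system: str) -> bool:
        # Unknown agents and ungranted systems are denied by default.
        return system in self._grants.get(agent_id, set())

    def request_access(self, agent_id: str, system: str, reason: str) -> bool:
        # The agent "checks in": access widens only if the approver agrees.
        if agent_id not in self._grants:
            return False
        if system in self._grants[agent_id]:
            return True
        approved = self._approver(agent_id, system, reason)
        if approved:
            self._grants[agent_id].add(system)
        return approved


if __name__ == "__main__":
    # Assumed policy for the demo: only the CRM may be added on request.
    broker = AccessBroker(approver=lambda agent, system, reason: system == "crm")
    broker.onboard("agent-1", {"wiki"})
    print(broker.can_access("agent-1", "crm"))
    print(broker.request_access("agent-1", "crm", "update pipeline notes"))
    print(broker.can_access("agent-1", "crm"))
```

The same shape also covers the simpler starting point mentioned above, where the agent is scoped to the permissions of the user running the app: `onboard` would just be called with that user's existing grants.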

Future of AI Factories

31:25It's so the, the more, the more time passes, the more I feel like it's an unfair thing to ask at the end of the AI podcast, what's the future going to look like, right? For, for obvious reasons. But if we look ahead, Chris, a year or two, maybe even three years down the line, if you're feeling really bold, what does the AI factory look like as, you know, agentic AI develops and models keep developing and the infrastructure keeps developing, but especially as, you know, more enterprises put these systems, build these factories and use them and put them to use solving real problems and, and driving

32:00new ways of working. What do you think the AI factory looks like a couple of years hence? I think you take it from a few different points of view that the one angle would be the layer cake picture. And that one, we have a pretty good understanding of, of the layer cake. Uh, so while there might be some subtleties, certainly in terms of specific tools that will come and go over time, that layering of, of what we're describing from hardware up through AI enabled, um, applications, I don't think is something that will fundamentally change.

32:35So you look a few years ahead, we'll see something that looks quite similar. Uh, how it's used by the enterprise, I think is what's going to shift completely in that timeframe. Uh, today, a more sophisticated enterprise has some agents in, in production, but they're not entirely agentic, uh, and it's not translated into the core of their, of their operations essentially. And so that to me is the shift that we should anticipate the autonomous nature of agents and,

and the scoping of tasks will continue to grow. So initially it was, you know, the, the simple chatbot, which is just essentially fetching information. Right. Then you, you got a little more sophisticated with stronger and stronger recommendations. You could call that some kind of an assistant, the, the doing phase of agents and total autonomy. We're, we're seeing that time horizon just stretch. It feels like almost daily stretch out to be longer and longer. So you can, you know, in the coding context, you can give coding agents very sophisticated tasks

33:45and they will spend hours and hours producing very sophisticated code as a result. Right. That's just the coding example. It's, you know, it's a language that's well-structured. It's a good template for how we should think about the, the breadth of the enterprise. And so in the end it, the AI factory, um, you know, the layers look similar, the sophistication of the tasks grows and it becomes the core of the business. Um, it becomes the place where we do our, we build our, our operational practices around. And so

34:17in the end, it's, it's not that we're going to go through and kind of augment each of today's processes. Cause if you just think of it like that, you, you take a bunch of questionable in some places, even stupid processes and automate them. And then you get an automated, stupid process. It's really redefining how we work together completely end to end and where agents take on critical tasks in the business. Uh, that I think is that, that future view, which again, you, you, you put some timeframes on it. We're not talking decades. We're talking quarters away,

34:52um, which is itself kind of phenomenal, but, um, yeah, I think that's to me that that's the, that, that future outlook. Well said, um, Justin, your thoughts? Yeah, I think the way Chris framed it is right is, uh, you know, software development, even, I'd say, in the last six months has evolved where you can give AI, I'll say almost like a design document and let it go off and think and produce the code and then do, I'll call it functional verification of that code to make sure that it's, it's accomplished its task, right? Before it comes

35:26back to you. And so it's doing very long running thinking and work, you know, uh, that is the work of, you know, many, many, many, many software engineers, I'll just say. Um, and I think in the software engineering world, like I said, we've seen this product market fit where, um, you know, we're seeing a two to three X productivity gain with software engineers that can use these long running agents. And if you extrapolate that out, you know, the, the productivity gains, you know, for the whole software industry is massive. Um, but we're now seeing that move into this, um, knowledge worker world and, you know, CAD designers, uh, you know, engineers across every

35:59industry, um, where they can do the same thing, uh, where they can start to give design document goals to long running agents, where they can basically explain the exit criteria and give the agent the tools to do the functional verification and say, come back when you're done. Uh, and so I think that's what the future of work is going to look like in two to three years. You're going to have different agents working for you that you, you give these more structured long running tasks to, they go off and think and do the work and then they come back to check in, uh, in a period of time.

36:29And, you know, that will make us all, you know, infinitely more productive than, than we are today. And, you know, searching through UIs, trying to find information on our own, uh, and so I think, you know, we're, we're going to live through a big, you know, change in how we work in the next couple of years, but every company across every industry and every job function, uh, will, will really be transformed, uh, with the use of an AI factory. Perfect place to leave it. Chris, for listeners who would like to learn more about your work, the work that Red Hat is doing, places online, they can go, obviously website, um, social media,

37:04technical blog, other places, where, where would you direct a listener to learn more about what Red Hat is doing with AI factories? The easiest one would be learn more about the Red Hat AI factory with NVIDIA. So that's sort of an easy thing to search and you, you, you'll find information from redhat.com. You'll find more information together with NVIDIA on the NVIDIA website. And that's a really easy place to start digging into, uh, the Red Hat view on, on all this content. Fantastic. Chris Wright, Red Hat, Justin Boitano of NVIDIA. Again, thank you both so much for taking the

time to come on the pod and talk about AI factories and, uh, really the, the future of work as we landed on, Justin. It's, um, exciting time to be alive. Thank you guys. Thanks, Noah. Thank you.


