Jen-Hsun Huang: Nvidia's AI models are being sold at a discount
July 5th, 2023

Huang Renxun, wearing a leather jacket, stood on a blue surfboard and struck a few surfing poses.

This is not VidCon, the American online-video convention, but a scene from the developer conference of Snowflake, the well-known American data platform.

On June 26, local time, Nvidia founder Jen-Hsun Huang and Snowflake CEO Frank Slootman discussed "how to bring generative AI to enterprise users". The moderator was Sarah Guo, formerly a general partner at Greylock and now the founder of the investment firm Conviction.

At the event, the "Leather-Jacket Godfather" was as unrestrained as ever next to the professional managerial polish of his "host" Frank: he not only described the collaboration as "we are lovers, not fighters," but also joked that the pre-trained models provided to Snowflake amounted to a "95% discount" for customers.

Snowflake users can directly use NVIDIA's pre-trained AI models to analyze their own company's data on the cloud platform and develop "AI applications" for their own data, without leaving the platform.

"The major change now comes from data + AI algorithms + compute engines. Through our partnership, we're able to bring all three together," said Jen-Hsun Huang.

Talking Points:

- Large language models + enterprise-specific databases = problem-specific AI applications.
- What used to be "data going to the work" is now "work going to the data": computation moves to where the data lives, avoiding data silos.
- NVIDIA provides pre-trained models, trained in NVIDIA's AI factories at a cost of tens of millions of dollars, so calling the compute engine on Snowflake is already like a "95% discount".
- In the Software 3.0 era, enterprises can build their own proprietary applications on top of models and databases in a matter of days.
- In the future, enterprises will be able to produce and run many intelligent agents.
- The real challenge for enterprises is how to mobilize a mix of structured and unstructured data; doing so may bring about a renewal of business models.

The following is the main content of the conversation between the two parties, as compiled and edited by Geek Park:

01 Talking about cooperation: Bringing the best computing engine to the most valuable data

Frank:

NVIDIA is playing an important role in history right now. For us, what matters is the ability to bring data together with our large enterprise relationships. We need to enable this technology and get the entire service stack to use it effectively. I don't want to use the term "match made in heaven," but it's a great opportunity to bring newcomers in the door.

Jen-Hsun Huang:

We are lovers, not fighters. We want to bring the best computing engine in the world to the most valuable data in the world. Looking back, I've been working for a long time, but I'm not that old. Frank, you're a little older (laughs).

These days, for reasons that are well known, data is huge, data is valuable. It has to be secure. Moving data is difficult, the gravitational pull of data is real. So it's much easier for us to bring our compute engine to Snowflake. Our partnership is about accelerating Snowflake, but it's also about bringing artificial intelligence to Snowflake.

At its core, it's a combination of data + AI algorithms + compute engine, and our partnership brings all three of those things together. Incredibly valuable data, incredibly great AI, incredibly great compute engines.

What we can do together is help our customers use their proprietary data to write AI applications. You know, the big breakthrough here is that, for the first time, you can develop a large language model, put it in front of your data, and talk to your data as if you were talking to a person; the model is augmented with that data.

The combination of a large language model plus a knowledge base equals an artificial intelligence application. It's that simple: a large language model turns any data knowledge base into an application.

Think of all the amazing applications that people have written. At the core there is always some valuable data. Now you have a generic query engine up front that's super smart: you can make it respond to you, but you can also connect it to an agent, which is the breakthrough that LangChain and vector databases have brought. This breakthrough of overlaying data and large language models is happening everywhere, and everybody wants to do it. Frank and I are going to help everyone do that.
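The "large language model + knowledge base" formula above is essentially retrieval-augmented generation: fetch the most relevant pieces of private data, then put them in front of the model. A minimal, self-contained sketch of the wiring (the word-overlap scoring and the prompt-only `answer()` are toy stand-ins, not any real Snowflake or NVIDIA API):

```python
# Toy sketch of "large language model + knowledge base = AI application".
# score() and retrieve() stand in for an embedding model over a vector
# database; only the wiring pattern is the point.

def score(query: str, doc: str) -> float:
    """Crude relevance: fraction of query words that appear in the document."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0

def retrieve(query: str, knowledge_base: list[str], k: int = 1) -> list[str]:
    """Return the k most relevant documents from the private knowledge base."""
    return sorted(knowledge_base, key=lambda doc: score(query, doc), reverse=True)[:k]

def answer(query: str, knowledge_base: list[str]) -> str:
    """Augment the prompt with retrieved company data before calling the model."""
    context = "\n".join(retrieve(query, knowledge_base))
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    return prompt  # a real system would now send this prompt to an LLM

kb = [
    "Q2 churn rose 4% among customers acquired via paid ads.",
    "The Denver warehouse ships within 24 hours.",
]
print(answer("why did churn rise last quarter", kb))
```

In a real deployment the relevance scores would come from embeddings stored in a vector database, and the assembled prompt would be sent to a hosted model rather than returned.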

02 Software 3.0: Building AI applications that solve a specific problem

Moderator:

Looking at this change as an investor, Software 1.0 was very deterministic code, written by engineers according to function; Software 2.0 is optimizing a neural network with carefully collected labeled training data.

You're helping people pivot to Software 3.0, which is a set of underlying models that have incredible capabilities on their own, but they still need to work with enterprise data and custom data sets. It's just much cheaper to develop those applications against them.

One question for those who are deeply involved in this space is, is the base model very generalized and can it do everything? Why do we need custom models and enterprise data?

Frank:

So we have very generalized models that can write poetry, summarize The Great Gatsby, and do math problems.

But in business, we don't need that, we need a Copilot to get extraordinary insights on a very narrow, but very complex data set.

We need to understand business models and business dynamics. This doesn't need to be computationally expensive, because a model doesn't need to be trained on a million things, it only needs to know very few, but very deep, topics.

As an example, I'm on the board of Instacart, one of our big customers. Businesses like it and DoorDash often face the same problem: they keep increasing their marketing spend, a customer comes in and places an order, and then either never comes back or comes back 90 days later. It's very volatile. They call that customer churn.

This is the analysis of a complex problem because there can be many reasons why customers don't come back. People want to find the answers to these problems, and it's in the data, not in the general Internet, and it can be found through artificial intelligence. This is an example of where there could be tremendous value.

Moderator:

How should these models interact with enterprise data?

Jen-Hsun Huang:

Our strategy and products span state-of-the-art pre-trained models of various sizes. Sometimes you need to create a very large pre-trained model so that it can generate prompts to teach smaller models.

And the smaller model can run on almost any device, perhaps with very low latency. However, it does not generalize as well, and its zero-shot capability may be more limited.

So you might have several different types of models of different sizes, but in each case you have to do supervised fine-tuning and RLHF (reinforcement learning from human feedback) so that the model is consistent with your goals and principles, and you need to augment it with things like vector databases. All of that comes together on one platform. We have the skills, the knowledge, and the basic platform to help customers create their own AI and then connect it to the data in Snowflake.
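The customization steps listed above (choose a base model size, supervised fine-tuning, RLHF alignment, vector-database augmentation) can be pictured as a pipeline. This is an illustrative sketch only; the function names and the dict-based model record are invented for the example and are not NVIDIA's NeMo API:

```python
# Illustrative pipeline for the customization steps Huang describes.
# Each stage just records itself on a dict "model"; real stages would
# update weights or attach retrieval infrastructure.

def pretrained(size: str) -> dict:
    """Start from a foundation model of a chosen size."""
    return {"size": size, "stages": ["pretrained"]}

def supervised_finetune(model: dict, examples: list[tuple[str, str]]) -> dict:
    """Adapt the model to the enterprise task with labeled (prompt, answer) pairs."""
    model["stages"].append(f"sft({len(examples)} examples)")
    return model

def rlhf(model: dict, principles: list[str]) -> dict:
    """Align outputs with the company's goals and guardrail principles."""
    model["stages"].append(f"rlhf({len(principles)} principles)")
    return model

def augment_with_vector_db(model: dict, db_name: str) -> dict:
    """Attach retrieval over the enterprise's own data."""
    model["stages"].append(f"retrieval({db_name})")
    return model

m = augment_with_vector_db(
    rlhf(
        supervised_finetune(pretrained("small"), [("q", "a")]),
        ["be factual", "stay on topic"],
    ),
    "snowflake_sales_db",
)
print(m["stages"])
```

The point of the ordering is the one Huang makes: pre-training is the expensive step done once by NVIDIA, while the later stages are cheap, enterprise-specific, and run against data that never leaves Snowflake.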

Now, the goal of every enterprise customer shouldn't be to think about how to build a large language model; it should be: how do I build an AI application that solves a specific problem? That application might take 17 prompts to finally arrive at the right answer. And then you might say: I want to write a program, maybe in SQL, maybe in Python, so that I can do this automatically in the future.

You still have to guide this AI so that it can eventually give you the right answer. But after that, you can create an application that runs 24/7 as an agent, looking for relevant situations and reporting back to you ahead of time. So our job is to help our customers build these AI applications: guardrailed, specific, and customized.
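The always-on agent described above amounts to a monitoring loop over enterprise data: scan for "relevant situations" and report them ahead of time. A minimal sketch, with an invented order-volume check and threshold standing in for whatever situation the business actually cares about:

```python
# Minimal sketch of an agent that scans data for "relevant situations"
# and reports them. The data source and the 30% drop threshold are
# invented for illustration, not part of any real product.

def find_alerts(daily_orders: dict[str, int], drop_threshold: float = 0.3) -> list[str]:
    """Flag days where order volume fell more than drop_threshold vs the prior day."""
    alerts = []
    days = sorted(daily_orders)
    for prev, cur in zip(days, days[1:]):
        before, after = daily_orders[prev], daily_orders[cur]
        if before and (before - after) / before > drop_threshold:
            alerts.append(f"{cur}: orders fell {before} -> {after}")
    return alerts

orders = {"2023-06-24": 1000, "2023-06-25": 980, "2023-06-26": 600}
for alert in find_alerts(orders):
    print(alert)  # a production agent would run this check on a schedule, 24/7
```

A production version would query live tables instead of a dict, ask a model to explain the anomaly, and push the report to wherever the business reads it.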

Ultimately, we're all going to be intelligence manufacturers in the future. We'll still employ people, of course, but we'll also create a whole bunch of agents, built with something like LangChain, connecting models, knowledge bases, and other APIs, deployed in the cloud and connected to all of the Snowflake data.

You can operate these AIs at scale and keep refining them. So each of us will be making AI, running AI factories. We will put the infrastructure in Snowflake's database, where customers can use their data, train and develop their models, and operate their AI. Snowflake will be your data repository and bank.

With your own data goldmine, everyone will be running AI factories on Snowflake. That's the goal.

03 The "nukes" are expensive, but using the models directly is like a "95% discount"

Jen-Hsun Huang:

We have built five AI factories at NVIDIA, four of which rank among the world's top 500 supercomputers, and another is coming online. We use these supercomputers to build pre-trained models. So when you use our NeMo AI Foundation Service in Snowflake, you're getting a state-of-the-art pre-trained model with tens of millions of dollars already invested in it, not counting the R&D investment. So it's pre-trained.

And then there's a whole set of other models around it, used for fine-tuning and RLHF. All of those models are much more expensive to train.

So now you've got the pre-trained model adapted to your function, adapted to your guardrails, optimized for the type of skill or function you want it to have, augmented with your data. So this would be a much more cost effective approach.

More importantly, it takes days, not months: you can develop AI applications in Snowflake that connect with your data.

You should be able to build AI applications quickly in the future.

Because we're seeing it happen in real time right now. There are already apps that let you chat with your data, like ChatPDF.

Moderator:

Yes, in the Software 3.0 era, 95% of the training cost has already been covered by someone else.

Jen-Hsun Huang:

(laughs) Yes, at a 95 percent discount, I can't imagine a better deal.

Moderator:

That's the real motivation, as an investor, I see very young companies in analytics, automation, legal, and so on, whose applications have realized real business value in six months or less. Part of the reason for that is they're starting with these pre-trained models, and that's a huge opportunity for companies.

Jen-Hsun Huang:

Every company will have hundreds, if not a thousand, AI applications, each simply connected to the company's various data. So all of us have to get good at building these things.

04 It used to be data going to the work; now it's the work going to the data

Moderator:

One of the questions I keep hearing from large enterprises is: do we have to invest in AI, and do we need a new stack? How should we think about connecting it to our existing data stacks?

Frank:

I think it's evolving. Models are getting cleaner, safer, and better managed. So we don't yet have a clear view of which reference architecture everybody will use. Some people will set up some central service. Microsoft has a version of AI in Azure, and a lot of its customers are interacting with Azure.

But we're not sure which models will dominate, and we think the market will sort itself out on things like ease of use and cost. This is just the beginning, not the end state.

The security sector will also get involved, and copyright issues will have to be worked through. Right now we're fascinated by the technology, but these practical problems will be dealt with in parallel.

Jen-Hsun Huang:

We are now experiencing the first fundamental change in computing platforms in 60 years. If you read the IBM System/360 press release, you'll see central processing units, IO subsystems, DMA controllers, virtual memory, multitasking, scalable computing, and forward and backward compatibility. All of these concepts actually date from 1964, and they have helped us scale CPUs over the last six decades.

Such scaling has gone on for 60 years, but it has come to an end. Now we all understand that we can't scale CPUs anymore, and all of a sudden, software changes. The way software is written, the way software operates, and what software can do are very different from before. We called the previous software Software 2.0; now it's Software 3.0.

The truth is that computing has fundamentally changed. We see two fundamental dynamics happening at the same time, and that's why things are shaking up dramatically right now.

On the one hand, you can't keep buying CPUs. If you buy a bunch more CPUs next year, your compute throughput won't increase, because CPU scaling has come to an end. You'll spend a bunch more money and get no more throughput. So the answer is that you have to move to acceleration (NVIDIA's accelerated computing platform). Turing Award winners have talked about acceleration, NVIDIA pioneered it, and accelerated computing has now arrived.

The other side is that the computer's whole operating system has changed profoundly. We have a layer called NVIDIA AI Enterprise, and its data processing, training, and inference deployment are now integrated, or being integrated, into Snowflake. So the entire computational engine behind it, from the initial data processing all the way to the final deployment of the large model, is accelerated. We're going to empower Snowflake so you'll be able to do more, and do it with fewer resources.

If you go to any of the clouds, you'll see NVI
