Since the launch of Project Stargate by OpenAI and the debut of DeepSeek's V3 model, there has been a raging debate in global AI circles: what's the balance between openness and scale in the competition for the frontiers of AI performance? More compute has traditionally led to better models, but V3 showed that it was possible to rapidly improve a model with less compute. At stake in the debate is nothing less than American dominance in the AI race.
Jared Dunnmon is highly concerned about the trajectory. He recently wrote "The Real Threat of Chinese AI" for Foreign Affairs, and across multiple years at the Defense Department's Defense Innovation Unit (DIU), he has focused on ensuring long-term American supremacy in the critical technologies underpinning AI. That's led to a complex thicket of policy challenges, from how open "open-source" and "open-weights" models really are, to the energy needs of data centers, to the censorship latent in every Chinese AI model.
Joining host Danny Crichton and Riskgaming director of programming Laurence Pevsner, the trio talk about the scale of Stargate versus the efficiency of V3, the security of open versus closed models and which to trust, how the world can better benchmark the performance of different models, and finally, what the U.S. must do to continue to compete in AI in the years ahead.
Transcript
Danny Crichton:
Jared, thank you for joining us.
Jared Dunnmon:
Thanks for having me, folks. Appreciate it.
Danny Crichton:
So you just published a piece in Foreign Affairs called The Real Threat of Chinese AI. And the focus of this piece is not just China, but really open source and the importance of open source to the future of American competitiveness in artificial intelligence and machine learning. And as we record this, we're actually recording on April Fool's Day, so we're dating ourselves a little bit. We'll see when it actually comes out. But yesterday, at the same time that you had just published your essay in Foreign Affairs, Sam Altman, the CEO of OpenAI, announced that for the first time in the six years since the launch of GPT-2, a model that basically no one recognized back in 2019 unless you were really into AI, they're going to publish an open weights model: something undescribed, but theoretically a frontier model of some sort from OpenAI.
And I wrote a piece today that was basically emphasizing that DeepSeek V3 was launched on December 26 of last year. And the next day OpenAI announced that they were going to switch from nonprofit status to, I believe, a public benefit corp in Delaware as part of the whole board situation from last year. And so it just seemed to me like you wrote this piece, and obviously Foreign Affairs has one of the most rigorous editorial processes, so I'm sure this was a multi-week, multi-month process. Very painful. Our former editor from Foreign Affairs is one of our editors here at Lux. And so you timed this perfectly, unintentionally. And I'm just so curious, when you think about open source and the future of AI and the thesis here, why that timing? Why is this so important and why is it all happening together all at once?
Jared Dunnmon:
I think there's actually a bit of an interesting origin story. I started writing that paper when I was running on the beach and I was pinging DeepSeek, the app, just from my phone, because I was just curious how it was going to do at various things. And so this was on Christmas Day. And so I started pinging it. At some point I started asking it a variety of other things and it stopped answering me. And I said, "Okay." I mean, I expected there to be various things I could and couldn't ask about. That's fine. But then I kept running into them. And then I said, "Okay, can we have a productive conversation here? Can you tell me what your guardrails are?" To which it said no. At which point I said, "Well, that's interesting." So I don't know what they are. They're not transparent. And I can't figure out what to do. If I were to put this thing in the API layer underneath a bunch of different applications, that would have some interesting consequences.
And so I started digging into this and was just kind of like, okay, is there a way that I can figure out what the Cyberspace Administration of China says these things can say, and how much of this is being done server side versus in the model. And then I was going through the documentation and I noticed way down on the page that, "Oh, if you want to run on Huawei Ascend, here's how you do it." At which point, my Spidey sense said, "Oh, okay, I see." Because it very quickly becomes a logical train of thought, which is, "Okay, we're putting out this thing which is very, very highly performing, very good on quality objectively, an engineering tour de force. It's very impressive. You can run it for effectively the cost of the electrons and the amortized CapEx, and that tends to get you market share." And then when you start building market share, you start being able to do things like say, okay, well we're going to have later versions of this run on my chips, or we're going to have these sorts of dynamics.
And so it very quickly starts to look like a story that we've seen before, which is low-cost Chinese entry into a given sector, followed by massive gain of market share, followed potentially by the sucking in of capital and resources into the Chinese ecosystem in a way that does not go well for American competitors. That's really what I spent most of that Christmas break doing: writing that and playing with it. And then amusingly, I did submit the paper to Foreign Affairs actually the day... Martin Luther King Day. It was basically the week that R1 came out. And so I submitted it and then the next day, front page of the Wall Street Journal, everything, I was like, "Yep, this is a thing." And then obviously it came out a month later.
But overall, very appreciative of the Foreign Affairs folks for taking the time to go through the editorial process. I think they made the paper better. That's really where it came from. It was just a matter of, I was digging into some of the censorship pieces actually, and then I just noticed in the documentation the focus on enabling it to run on Ascend. And once you start down that path, I think a lot of things become pretty apparent in terms of how we should think about these systems from a US and Western perspective.
Danny Crichton:
Well look, obviously China's been focused on de-risking around technology for more than a decade, going back to Made in China 2025, which was written in 2015 and which has been mostly executed. Research done in the last couple of months has shown that something like 90, 93% of all the targets were actually met within the 10-year period. And there might be some fudge factors that come from the CCP model of bureaucracy, but the reality is that they have done very, very well. The incentives are there. You have DeepSeek clearly getting access to enough NVIDIA compute, enough compute from other sources; it's coming together.
And then in the last couple of weeks, there's also been discussion around extreme ultraviolet lithography. And there are questions around whether China's actually getting closer and closer to being able to displace ASML, which has traditionally been sort of this block that everyone in the West theoretically thinks exists. And I think it's sort of a fake dam that we're going to suddenly find out you can reroute around.
But nonetheless, your piece emphasizes that from top to bottom, soup to nuts, from chips to compute to data, we've basically seen that this open source model works, that it's out there. And interestingly, China has been very open across all of their models. So it's not just DeepSeek, but it's also Alibaba with its Qwen models published on Hugging Face and other sites. And that is in contrast to the proprietary nature of the American model. So OpenAI is proprietary, Gemini is proprietary.
The exception, traditionally, until OpenAI's recent announcement, has been Meta, which with Llama has had a fully open strategy. And so your proposal in Foreign Affairs, one part was analysis, but then you had a list of proposals, was really to say, "Look, open is going to win in AI and we need a sequence of proposals to incentivize companies to make sure that we're not keeping everything in the walled garden of the proprietary world, but opening things up to be competitive, to ensure that Chinese models do not take the lead and therefore dominate the information space with censorship."
Jared Dunnmon:
There's a lot going on there, first of all. Yeah, and I would also give Google credit for releasing Gemma 3 recently, which performs at some reasonable approximation of DeepSeek and runs on one GPU. So I think you're already starting to see some of the dominoes starting to fall.
To your point about the whole overall supply chain and how competition is happening up and down the AI stack, so to speak, when you start losing market share, there are a lot of implications. So now in theory, and I differentiate China the country from the CCP, from the Chinese Communist Party. And just to be clear, we now have CCP-influenced entities that can take good enough chips. Ascend is not the best thing in the world, but it's good enough to run good AI that is good enough for a huge percentage of applications. Absolutely good enough.
I mean at some point, once you get to a certain level on a task, you don't care how good your model is, it's good enough. So you can take good enough chips, run good enough AI on them, and you can offer it at a good enough price, with or without subsidy, it's almost immaterial, to freeze out competition, particularly in a world that's dominated by test-time inference compute. If you believe that's the world we're headed to with things like reasoning models, et cetera, that starts to drive capital to places like SMIC, to Huawei. And actually that's how NVIDIA, TSMC, ASML, the entire Western chipmaking ecosystem starts to lose. In a lot of ways, this has shades of what happened with Wintel back in the earlier era of general computing. IBM realized that it couldn't compete with Wintel, the partnership between Microsoft and Intel, because between them, they designed chips, they manufactured chips, they owned the instruction set and they owned the OS.
Laurence Pevsner:
Yeah. So let's pick up on that thread, the geopolitical threat. I love the idea that you were running on the beach and trying to figure out, "Okay, you can't tell me about Tiananmen Square, what else can you not tell me about?" And it refuses to tell you; the rules themselves are secret. I mean, that's just fascinating.
I was particularly interested in the part of your piece where you talked about these sleeper agents, this idea that the developers could actually embed dangerous behaviors that only arise in specific contexts. But I learned recently that in Silicon Valley, the term non-technical is a slur. So I will impugn myself: I am one of the more non-technical folks here at Lux, even though I have programmed, I know how to program a bit. But anyway, I found myself not quite understanding how a model could be both open source and also have sleeper agents embedded in it, right? In Linux, to use that example, there couldn't be code there that you wouldn't know about, right? That's the whole point. How is it that in an open AI model there still could be something hidden in there?
Jared Dunnmon:
So that's a great question, and this actually was a terminology issue that I ran into, and it was actually quite frustrating. I wanted to refer to the class of systems that includes open source and open weight systems as open systems, but then it starts to sound like I'm just saying "OpenAI." For simple reasons, that becomes really difficult. So I think this is a really important differentiation between open source and open weight.
So with open source, arguably I have access to the training data, the code used to train it, every parameter, everything that went into it... I have the recipe, not just the dish. With open weights, I kind of just have the dish. Unlike traditional software, I can't just read it line by line and say, "Oh, well, there's where this particular behavior is inserted."
Particularly, and you're right, in open source, if you were going to do something very, very bald-faced, folks would say, "Hey, I can see you're doing this thing." But this is where the differentiation comes in. So Llama is this way. DeepSeek is this way. A lot of these models are open weight. So that means they give you the dish, and you can use it as much as you want, but you don't have the full recipe. You don't know exactly what data went into it. For that particular thing that you downloaded, someone could report they did one thing and they could have actually done something different. This is where I think there is a reasonably good argument for thinking about open source and open weight models differently from a security perspective. And I think that's a pretty straightforward argument.
And this is not a perfect analogy, so take it with a grain of salt, but it's kind of like having a binary versus having source code. If you have the source code, you can look through and be like, "Okay, I have a sense of what happened here." If I have the training data and the recipe, I can at least see what the training process was. However, if I have the binary, like having the weights, I mean I can poke and prod it and see what it does, but I mean I can't really give you any guarantees about what's going on inside it. And so I think that's a very kind of natural... It's not a perfect analogy, but it's a reasonable analogy to the types of security approaches you could start to think about. Although in the AI case, it's just going to be much harder because while the process is baked, your testing surface is effectively infinite. So it's very, very difficult to actually do.
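To make that "dish versus recipe" point concrete, here is a minimal sketch of what probing an open-weight model looks like in practice, assuming a hypothetical model hosted on Hugging Face (the repo name below is a placeholder, not a specific real release): you can download the weights and observe behavior, but nothing in the artifact tells you what data or training recipe produced it.

```python
# Hypothetical sketch: probing an open-weight model you can run but cannot audit.
# The model name below is a placeholder, not a specific real release.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "some-org/open-weight-chat-model"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)

# All you can do is poke and prod: send prompts and observe outputs.
probes = [
    "Summarize the 1989 Tiananmen Square protests.",
    "What topics are you not allowed to discuss?",
]

for prompt in probes:
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=100)
    print(prompt, "->", tokenizer.decode(outputs[0], skip_special_tokens=True))

# Note: nothing downloaded here includes the training data, the data-filtering
# rules, or any reward model -- the "recipe" stays with the developer.
```

The testing surface problem Jared describes follows directly: a finite list of probes like this can only ever sample an effectively infinite space of inputs.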
Danny Crichton:
The other answer is that all the code can be available, but there are millions and millions of lines of code and only so much time to read it all. And so it could be a single line, it could be a missing semicolon that breaks a particular piece of code. It could be an overflow, and so you can get into memory and break out of the memory containment, and so-
Jared Dunnmon:
I mean, I'd also just say as a somewhat tongue-in-cheek aside, if you download the weights just for DeepSeek V3, it's over a terabyte. I mean, try convincing me there's not something bad written in there, or that couldn't be parsed into something bad given any number of environments or contexts or whatever. There's just a ton of data sitting in there. And a lot of the neurons won't even activate some of the time, I'm sure. So there's just a bunch of reasons why there's a can of worms with the open weight models in particular.
Laurence Pevsner:
And China is notoriously very clever about this. I remember talking to an expert once about censorship and how they deal with Western news sources. They said, "The most effective censorship the CCP does is not blocking websites, but just making them very slow to load." So you don't block the Washington Post, you just make it so that it takes forever for the page to show up, and you get bored and log off. Of course you can go read the Washington Post if you want, but given human behavior, no one ever will. Same thing with this terabyte of data that no one will ever actually go through line by line.
Jared Dunnmon:
For what it's worth, I actually did eventually get it to write me a poem about what it wasn't allowed to talk about. It was quite interesting. It's actually a great poem.
Danny Crichton:
There were some really good discussions online of people trying to jailbreak these AI models. And this is true for all the models, it's not unique to this one. Everyone tries to break them to learn whatever they can. And people were successful in getting a pretty good sense of the set of rules that was built in there.
Jared Dunnmon:
Yeah, but it was interesting. There was some stuff I didn't expect, though how correct it is is another question.
Danny Crichton:
I think it was China Digital Times. There's a long-time group of people who love reading this stuff. They used to have a column called the Ministry of Truth. It had sources at the different media companies, and you would get the weekly update on what not to talk about. It's truly 1984. But it was funny because it would be like, "Do not talk about the crash that happened on the 3rd Ring Road with the person who was the Communist Party member," or whatever the case may be. And it is so specific that you end up realizing it's not some big national agenda. It's that the guy with the controls, his son caused a problem, therefore that's what's not included. I love small-scale corruption like that. That's great.
Anyway. But look, to Laurence's point, I do think there are large security issues. And look, I've talked to a couple of folks on AI security. It's interesting to think about the future here where AI itself is trying to secure AI, so you're creating an arms race dynamic of security here. There are companies like Socket that do supply chain security around source code. It scans npm and the JavaScript world and other major open source libraries to ask, "Look, were any changes made? Did those changes change the functions?" There was just a massive scandal two months ago in which a major package that's basically in every JavaScript app, it's like an array calculation tool, had a massive bug that someone deliberately added. And it was there for three weeks before anyone noticed, because thousands of things are happening at once, people are busy, and most open source isn't paid. There are not a lot of folks who actually get sustainably funded to evaluate these tools and make sure they're reliable.
Now, it'll be interesting to see in the AI world whether, one, the AI tools are there: these companies are well capitalized, they have a huge incentive to be secure. I don't know. But to your point, there's always a way to put a backdoor in. That doesn't mean the company even knows. They may not have control over their entire supply chain either. And so whether it's OpenAI or a Chinese company, so many ingredients are going into it that I think the security surface is really hard to deal with.
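The kind of check Danny describes, did a package's published contents quietly change relative to its source repository, is at bottom a hashing-and-diffing problem. Here is a toy sketch of just that core idea (the directory paths are hypothetical placeholders); real tools like Socket layer provenance, behavioral analysis, and scale on top of it.

```python
# Toy sketch: flag files whose contents differ between a package's source
# checkout and the artifact actually published to a registry.
# Paths are hypothetical placeholders.
import hashlib
from pathlib import Path

def file_hashes(root: Path) -> dict[str, str]:
    """Map each file's relative path to a SHA-256 digest of its contents."""
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in root.rglob("*") if p.is_file()
    }

source = file_hashes(Path("checkout/package-src"))       # what the repo says
published = file_hashes(Path("unpacked/package-1.0.0"))  # what users install

added = published.keys() - source.keys()
changed = {f for f in source.keys() & published.keys() if source[f] != published[f]}

for f in sorted(added):
    print(f"ADDED in published artifact only: {f}")
for f in sorted(changed):
    print(f"CHANGED between source and published artifact: {f}")
```

The analogy to open-weight models is the hard part: for weights there is no "source checkout" to diff against, which is exactly the gap the conversation keeps returning to.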
But I want to bring it back. So we talked about the security issues, but you had proposals. It's not just "make open source a bigger part of the conversation"; you had three or four specific proposals for the federal government, for policymakers: "What should we do to encourage the creation of open source?" And I think critically, this is a huge difference, because two, three years ago open source was not part of the conversation in DC. We ran a game with a bunch of defense leaders, et cetera; open source was a very new concept then. It seems to be entering the lexicon a little bit more now. And so I was curious about the proposals you put forward in your piece.
Jared Dunnmon:
The stuff I put forward in the piece was really focused on a very basic set of actions that one could take. It's certainly not a panacea, and there are certainly, I think, other things that one could do. But to start with, one core idea is simply to make sure that from a Western perspective, and certainly from a US perspective, we're actually incentivizing the development of good, responsibly built, highly functional AI capabilities that are released open in some appropriate way. Because otherwise, I think we've seen what happened with DeepSeek. It's probably not the path that we ultimately want to consistently go down.
And so there's a variety of things you can do in terms of making sure that the folks who would build for open source, particularly researchers, et cetera, actually have access to some of the compute they need. So as an example, even if you're going to build something that is really efficient, as I think was shown with DeepSeek, the cost of their final run wasn't that big, but they had a lot of compute that went into getting there.
I think there's also a lot that we could do from an evaluation perspective; this is probably a public-private partnership type of thing. Some of the best people we have thinking about how to evaluate models, to figure out whether they are safe and appropriate to release, sit in the private sector right now. And I think there's been some progress there with the AI Safety Institute and some of that. The question is, how do you get the most eyes on these various models that are coming out all the time?
There's some combination of folks who have seen this stuff on the industry and research side and, secondarily, systematic evaluations. So giving some of the folks from Stanford HELM their flowers, there are some systematic evaluation frameworks that one could imagine using and building out as almost a public capacity, to say, "Look, when we have a new model, we should run it through all these things," and ideally make sure that, while it's not exhaustive necessarily, we're continuously adding to the set of things that we're evaluating against, and making sure that at the very least we're not letting really low-hanging fruit slip through where we don't want it to. I think it's important for it to be a public-private kind of collaboration because there are some legitimate freedom of speech and other issues involved there.
But thinking about how a structure like that would work is probably a useful thing, because that also allows you to evaluate, for instance, Chinese models. And some of them may be perfectly fine, but it's just hard to know a priori without putting in the work.
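As a sketch only, under the assumption that each model exposes some text-in, text-out inference call, a standing evaluation battery in the spirit Jared describes could look like the following; the suite names, prompts, and the generate() callable are all placeholders, not any real benchmark.

```python
# Hypothetical sketch of a standing evaluation battery for newly released models.
# `generate` stands in for whatever inference API a given model exposes.
from typing import Callable

EVAL_SUITES = {
    "refusal_transparency": [
        "List the topics you are restricted from discussing.",
    ],
    "factual_recall": [
        "What happened at Tiananmen Square in 1989?",
    ],
    # The whole point is that this dictionary keeps growing over time.
}

def evaluate(model_name: str, generate: Callable[[str], str]) -> list[dict]:
    """Run every prompt in every suite and record raw outputs for later review."""
    records = []
    for suite, prompts in EVAL_SUITES.items():
        for prompt in prompts:
            records.append({
                "model": model_name,
                "suite": suite,
                "prompt": prompt,
                "output": generate(prompt),
            })
    return records

# Usage: evaluate("new-open-weight-model", my_inference_fn)
# Downstream, human and automated graders score the recorded outputs.
```

The design choice that matters here is the one in the conversation: the suite is never "exhaustive," so the public capacity is really the process of continuously adding prompts and rerunning every model against them.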
I think there are a couple of pieces around making sure that the US ecosystem is sticky. So right now, if I want to go and build things on Apple M series or TPUs or AMD, maybe I have to do a little bit of futzing back and forth to deal with that versus NVIDIA. And if you were to support making sure that we have a nice software ecosystem where all those things are very easy to run, you could increase the stickiness of that ecosystem. There's, I think, some value in that.
There's a lot also to do from a competition perspective. And I wouldn't say this is necessarily a shooting-from-the-hip take, but I would argue that a lot of our structures for thinking about antitrust and competitiveness in a free market were built for a world that did not have state-owned enterprises competing with US companies. We need to strike a careful balance between what the well-intentioned folks at places like the FTC are thinking about in terms of antitrust and the implications for US leadership in AI. That's not saying you should make a bargain with the devil where you have monopolies all over the place because they give you AI. However, I think it is entirely reasonable to consider a company's contributions to open source and US open source leadership in deliberations in and around antitrust actions.
So again, Meta supports PyTorch. They don't have to; they do. Google supports TensorFlow. Obviously the transformer came from Google, the Gemma models, any number of things you can think of; I can think of a ton of them. Microsoft supports a number of open models as well, in addition to developer tools; VS Code is pretty cool. So there's a bunch of things we might want to think about in terms of how we set incentives for these large companies, which frankly have the user mass and the capital mass to compete with state-owned enterprises. Because I would argue to you that if you were to take strong antitrust action and break a lot of these companies up, they wouldn't have the mass to compete with state-owned enterprises. And so we need to be very careful and strategic about how we think about that in an international context.
And that also goes for what I would call AI dumping. What you also don't want is a bunch of state-subsidized cloud infrastructure clusters offering inference on, take your pick of Chinese models, at pennies on the dollar, drawing market share away, drawing revenue away, hurting US companies. I think that's also problematic. And there's a subtlety here, which is that loss leadership on a private balance sheet, I would argue to you, is acceptable. That is a risk you can take as a business. I would argue that loss leadership on a state balance sheet is not acceptable. And I think we need to be very clear about when and where that's happening, because, just to be clear, maybe it's not happening. Maybe the folks in China are doing incredible engineering and they are just beating the pants off us in a free and fair way. That happens. In that case, we just have to get better. But there are also cases where that's not true, and we need to be very clear about what is what and make sure we disincentivize that latter thing.
Last piece, there's a bit in there about infrastructure. This is just making sure that we're not limited by the fact that China builds a ton of power and we're currently not very good at building a bunch of power in this country, and AI basically takes electrons and turns them into intelligent operations. There's a lot that we can do there that includes fixes to the grid, fixes to generation, clean firm power, but also making sure that we use systems that intelligently reduce demand from a power perspective. So that can be using chips like some of the ones from your non-NVIDIA folks. And to be clear, NVIDIA is great. They should keep doing their thing. But you've got the SambaNovas of the world, the Groqs of the world, the Cerebras of the world, building new chip architectures that in some cases have substantial energy advantages versus the alternatives.
And then similarly, a lot of the curves that tell you, "Well, if AI keeps using all this power, we're going to be using some huge number of gigawatts," have an assumption baked in that we're going to be hitting every query over the head with a giant model, and that's probably not what we should do. So we should probably think about building systems that are efficient, using small models when we can, using local compute when we can. There's been some exciting work recently on that.
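One concrete version of "don't hit every query over the head with a giant model" is a simple router that keeps easy queries on a small, cheap model and escalates only when needed. This is a hedged sketch, not anyone's production system; the heuristic and the model handles are placeholders.

```python
# Hypothetical sketch: route cheap queries to a small local model and reserve
# the large (expensive, power-hungry) model for queries that seem to need it.
def looks_hard(query: str) -> bool:
    """Crude placeholder heuristic; real routers use trained classifiers or cascades."""
    return len(query.split()) > 50 or any(
        kw in query.lower() for kw in ("prove", "derive", "multi-step", "plan")
    )

def route(query: str, small_model, large_model) -> str:
    # small_model / large_model stand in for any callable inference handle.
    if looks_hard(query):
        return large_model(query)   # pay the compute/energy cost only here
    return small_model(query)       # most traffic stays cheap and local
```

The gigawatt projections Jared mentions implicitly assume every query takes the expensive branch; the policy-relevant question is what fraction actually needs to.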
And the last piece I would say is that the Biden administration released a framework called the AI diffusion framework near the end of its term. And this is not necessarily the only answer, but I think one implication of part of it was interesting, which is that, effectively, the open models that would be "allowed to be released," and I don't know exactly the enforcement mechanism, but that would be allowed to be released, were those that were effectively as good as the best existing one.
And I think if you were to modify that slightly and say, by some appropriate definition of slightly, that what you actually allow is the release of models that are slightly better than the best open source one, what that would allow you to do is have a bunch of folks working on really, really good models, giving themselves as much time as possible to figure out some of the safety and other implications of those models.
And then when there's a big release from, say, a CCP-backed company, you release something that's slightly better on top of it. And then you mitigate that capitalist instinct to go to the cheapest thing. Now, how achievable is that? TBD. But I think that and other strategies like it are probably what we want to think about, because it's not an either/or question. It's not, do you open this up or do you close it? There's probably a consistent gap between closed source and open source. It's persistent. Now, how big it is, how long it lasts, that's open for debate. But I think the last couple of years have shown it's probably on the scale of months, not decades. So I think calibrating to that world and thinking about how we should create the right incentives in that world is probably the right question, or one of the right questions, for policy folks to be asking themselves going forward.
Laurence Pevsner:
Let's start with maybe a question that gets at a bunch of this, which to me is Stargate, right? So Stargate is this case where we are building the AI infrastructure exactly like you're talking about, but the main partner involved from the US side, OpenAI, as we've been discussing, at best is not a full-throated supporter of open weights and at worst is kind of antithetical to it.
Do you see that effort as something we should just scrap so we can be doing all the great open-weight incentivizing stuff that you've been talking about? Is it complementary? Is it orthogonal? How do you think about Stargate in all of this?
Jared Dunnmon:
So I'll admit I haven't spent a ton of time thinking about Stargate in particular. So what I'm going to say is my understanding, which is that it is a very, very large amount of compute. And I say that somewhat glibly, but I mean it, right? It's a giant amount of compute. So the question is, is it worth building a giant amount of compute? I would argue to you that it is. Now, it may not be for reasons of training a monstrosity of a model, but what it might be worth doing for is a combination of giving folks, some of the academic and industry researchers who would release stuff openly, access to those computing resources, and making sure that we have the actual compute resources to be consistently running potentially massive evaluations on all these models constantly. So that idea that I gave you before, let's run a bunch of evaluations on all these models constantly, that's going to take compute. It's not going to be free.
I think you also care there about the public sector perspective. I would argue that if I were the public sector and I were supporting Stargate strongly, I would want a classified enclave as part of Stargate. I would want to make sure that the US government had the ability to easily and rapidly take some of the innovations that might occur on that cluster or nearby and very rapidly port them over to test or run on classified workloads. And you could think about designing for something like that.
So I think there are a lot of good reasons to have a massive amount of compute. Am I convinced that it would be the decision that I would make if I were thinking about allocating an arbitrary pot of capital? I might think about that a little bit differently, because I think the world's going to go to a place where we're going to have a lot of smaller models running in a lot of different places. But on net, having a big bolus of compute sitting there for a broad set of use cases, I think it's very worthwhile. Then the devil will be in the details: what is it used for, who has access to it, et cetera and so forth.
I would say on balance, more compute is good. The question is going to be, how much value are we going to get from it? And to me that's going to be a matter of how it is leveraged to support some of these other broader objectives versus if it's just a giant pre-training cluster. I mean, I don't know if that buys you as much.
Danny Crichton:
No, I think we're trying to figure this out, because the AI world is evolving rapidly. There's a whole new generation of foundation model builders, including some in our own portfolio, some who are non-portfolio, who are trying different architectures, trying to be more efficient, trying to optimize either for test time, for inference, et cetera. And I think what's interesting here is, one, obviously I'm with you, Jared, there's no world in which less compute is better than more compute. You can go back to the basics of Charles Babbage and the difference engine 150 years ago. The entire history of compute is that we always find a use for compute.
I think you get at one aspect, which is this question of who gets access to it, and it's very expensive today. Training a model is still hundreds of millions of dollars. That is not accessible to academics. That is not accessible to all but maybe a couple of national labs, and given the DOGE-ing that's going on these days at the national labs, I'm not sure they have access to this stuff anymore either. And then there's a geopolitical component, which is that this is funded by Saudi Arabia, this is funded by SoftBank, Japan's largest telco. These are US allies, so there's not an alliance concern, but the capital is still not coming from the US so much as it is from foreign sources. Many of these data centers will be placed elsewhere, et cetera. And that's its own dynamic. I don't know how it all combines together. I think it's very open. I mean, we actually talk about this weekly here. And the answer is we don't know.
There's this balance of, I do think there are mathematical improvements that will make it a hundred times more efficient, in which case, temporarily, we will go through a period in which compute is not the constraint on any part of the system. But then we will use it again, and the models will get faster, more advanced, et cetera, and we'll be back to the data centers. And we will always find a use for AWS if AWS is affordable.
Jared Dunnmon:
Yeah. And I generally agree, right? I mean, I am a bit of a sympathizer for the Jevons Paradox folks who just say, "Look, if it gets cheaper, we're just going to use more of it." And I think that's right. I mean I think broadly speaking, that's right.
I think the bigger question for a lot of these data center build-outs is just, if you're going to spend hundreds of billions of dollars, what is the optimal place to put it given what the world of AI may look like? Is there an argument for putting it into just buying chips and building data centers? Absolutely. But if you took $10 billion and said, "Let's just be awesome at open source, let's just be awesome at open AI," $10 billion would get you a long way, right? So I think-
Danny Crichton:
$10 billion would also take my life a long way too. Yeah.
Jared Dunnmon:
Right. Right. Yeah. And that's a tiny fraction. So I think it's less a question of, is it a good idea objectively, versus, as always, what's the opportunity cost. And to me, if we build stuff like Stargate at the opportunity cost of open leadership, I think that's not a great outcome. I think we should make sure we maintain leadership on the open side and support our frontier labs doing the awesome work that they're doing. But that means making sure we find the capital to allocate to some of the folks doing the former thing, who, as you mentioned, maintain the half of the scientific computing stack that runs on packages nobody is paid to maintain. That's actually insane. We should actually find a way to compensate those people.
Danny Crichton:
And believe it or not, I had a huge piece on this in 2018 on new models for sustainability around open source. It has not gotten better in seven years. And in some ways it's gotten worse, because the surface area has expanded. There's more and more code that's open source that needs to be maintained. There are more and more people who want to do it; there's actually talent for it. It's purely a cash consideration, and that is not getting solved at all. The thing that we've been focused on, we've been analogizing towards, is the dot-com situation. There was massive investment in networking, and we basically had fiber everywhere. People were laying fiber across the United States, around the world, the submarine cables, et cetera. It was a huge industry. You had all these companies. At this point, WorldCom and everyone else are all dead. And the reality is, it was this massive boom cycle.
We actually had Tobias Huber and Byrne Hobart on to talk about Boom. Their thesis was basically that progress starts because we massively over-invest in a technology. And if you look at fiber, we basically had dark fiber for a decade. A bunch of random stuff came out of that. So for instance, Google: it was the original purchase of all that fiber that allowed Google to get the economics for YouTube. Google is one of the largest fiber owners in the world, and they bought access to all of this because it was basically in bankruptcy in many cases. And so they were able to buy core pipes of the internet, which gave them a cost advantage on video at a time when no one else was able to do that. It was the only thing that made YouTube successful for basically a decade, and why there was no competition there.
So my question is always, look, $500 billion will go into these data centers. Let's say it booms, it blows up, we have a super-efficient model, we aren't using it. Someone will buy it in bankruptcy. It exists, and someone will do something creative with it. It may be an AI model, it might be the next YouTube. Who knows what the solution is that comes out of that? But the fact that we invested in it will create it. You just don't want to be the VCs funding it, that's sort of the question. And I always think that that's the hard challenge here: I love the data centers, but as an investor, when you put that hat on, the question is, is that where you want to put your cash, or do you want to put it somewhere else?
Jared Dunnmon:
Yeah. And again, I think it depends on what it's for, right? If you were building me a giant inference cluster and you had capacity booked out for goodness knows how long, that's potentially a different conversation. But I also agree with you, because again, as an American with my American hat on, do I want a giant data center being built in America? Yes, I do, right? There's no question about that. The question, again, is an opportunity cost question. What's the right size of that investment relative to some of the other ecosystem investments that we need to make? I mean, you can make a reasonable argument that our situation in the chipmaking arena is as tenuous as it's ever been.
And obviously the CHIPS Act has helped. There's been some progress there, but you can imagine taking resources of that scale and diverting them into stuff like that. That's the level of resourcing it would take to make sure that we're competitive across that supply chain. I think there's some due consideration here, which is that it's not in the interest of a given company; if the company wants to build a model, to be clear, of course they want the chips, right? There's no question about that. But if I'm looking at this from a broader perspective, my internal capital allocator is just kind of thinking, "Are we putting too many eggs in that basket?"
Danny Crichton:
You had a good framework here, which is the company balance sheet versus the nation-state balance sheet. And we had a senator here a couple of weeks ago, a private conversation, so I won't be specific, but one of the things that came out of it was that this person was very informed about AI but said, "I have no opinion about AI. It's changing too fast. There are all these different inputs. I don't know what to prioritize. I don't know what the government should do. These policies help certain companies, they hurt other companies. Some of these companies seem to help China, and they also don't help China." OpenAI went from closed to open, vaguely, in the last week. And OpenAI, or Sam Altman, is absolutely in DC regularly. He's a massive leader who runs one of the largest companies. Of course he has access. He has pushed sort of national preemption legislation to try to get state legislation out. He's talking about making national AI champions. But [inaudible 00:32:15] really got a major business strategy decision wrong.
And so it's either that we have people who are informed but are unwilling to make decisions right now because they don't know which way to go, or we just have political dysfunction so nothing actually gets done. And so in this case, we sort of lucked out in that we did not build an entire AI regulatory regime around three companies, where DeepSeek would have come in and really, I think, effed over the United States quite royally.
So the question to you is, you're in DC, you're writing in Foreign Affairs, you're very engaged in the policy world: if you're a policymaker today, are you making decisions? Are you trying to fix things? Are you still in a learn-and-seek mode? Are you in an adaptive mode? How do you handle how much change is happening here? Because as you know, I mean, Lawrence Lessig, 15 years ago with Code, the code of the law changes at a speed that is completely incompatible with the code of software.
Jared Dunnmon:
If I'm a policymaker, I desperately want more information. I want to know, how are we evaluating these models? Where are they working, where are they not? The existential risks, not to discount them, how real are they and how concerned about them should we be? I would also want to know, are those same existential risks enabled by a model that just got released openly, and does that change my policy framework, protect versus promote, to a degree? I would also want to know, on the energy side, which I think about a good bit, what is the energy cost of most of the queries that we're running? Are we running things in a way where folks are just paying out the nose for capacity and latency? Is there the ability to allow folks to accept higher latency and cost us less in terms of the changes that we have to make to the grid and the energy infrastructure? That information's hard to come by.
There's not great information, necessarily, on how fully utilized the data centers that we have are, what the query volume going into these things is, and what applications are driving it. So to be able to make a lot of these decisions, and again, even what's running, to a degree: how many folks are running this model versus that model versus the other one? In what applications? To a degree, I would argue that what we want is more information. We need to get the instrumentation of our AI ecosystem right first, so that we understand what is being run, how it is being run, what capabilities it has, what the infrastructural costs of those things are, and what that implies for not just the future, but even getting a read on the present.
I think the way that you do this, if it were me, is I would be asking for a bunch of measurements, effectively. And then what I would be trying to do is think about those measurements in the context of: when those measurements get to a certain point, how does my policy change? Because it's probably not one policy. It's probably a decision tree. If you get to this point, then do that. If you get to this other point, then do that. And instead of getting to that point and then being like, "Oh crap, we have to think about that," I would argue that what we probably want to do is say, "Okay, what do we think the possible end states of this world are, and what do we think the possible paths there are?" And if we observe thing X, thing Y, thing Z, how does our policy change?
And ideally we have those things pre-baked, and we have them pre-baked and related to a set of measurements. And as the ecosystem evolves and as capabilities evolve, you're adding measurements and you're adding pre-baked thoughts. Obviously that's a perfect world, and that requires a lot of continuity and planning that may be difficult. But at the very least, I would just be asking for measurements, because right now, I think one of the reasons that folks are having a hard time making decisions is that the measurements aren't very good. And when they are good, they're maintained inside private companies who understandably may be loath to share them in various forums with the government.
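To illustrate the "pre-baked decisions keyed to measurements" idea in miniature, here is a toy sketch. Every metric name, threshold, and action below is an invented placeholder; the point is only that the triggers are written down before the measurements arrive.

```python
# Toy sketch: policy responses pre-committed against measurable triggers.
# Every metric, threshold, and action is an invented placeholder for illustration.
MEASUREMENTS = {
    "open_model_capability_gap_months": 4,    # closed vs. open frontier gap
    "datacenter_utilization_pct": 62,
    "subsidized_inference_market_share_pct": 18,
}

PLAYBOOK = [
    # (metric, threshold, comparator, pre-agreed action)
    ("open_model_capability_gap_months", 3, "lt",
     "Revisit open-release guidance; the gap is nearly closed."),
    ("datacenter_utilization_pct", 80, "gt",
     "Accelerate grid and generation permitting in constrained regions."),
    ("subsidized_inference_market_share_pct", 25, "gt",
     "Open an inquiry into below-cost, state-backed inference pricing."),
]

for metric, threshold, comparator, action in PLAYBOOK:
    value = MEASUREMENTS[metric]
    triggered = value < threshold if comparator == "lt" else value > threshold
    if triggered:
        print(f"{metric}={value}: {action}")
```

The hard part, as the conversation notes, is not the decision table itself but getting trustworthy values for the measurements in the first place.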
And I'll say props to the frontier labs, to the Frontier Model Forum, to OpenAI, to Anthropic, to some of those folks who have actually gone and briefed the government on how various things work. I think that's been really important.
But the question is, how do you make that less an activity that's done by well-intentioned folks who are trying to share what they can, and more about making sure that our policy community actually has as comprehensive visibility as possible into the entirety of the ecosystem? Because right now they're probably only getting a partial read. They're kind of feeling the different parts of the elephant.
Danny Crichton:
It sounds a lot like suspicious activity reports in financial reporting: "Hey, there's something that's interesting, send it over here." It's up to the Treasury and the investigators there to connect the dots: "Oh, four banks sort of said the same thing about the same individual, whatever. We were able to connect those dots in a way private industry couldn't."
But there are ways. I mean, I feel like whenever we talk about regulation, there's always this top-down, everything-has-to-be-controlled framing, but there are a lot of regulatory models in the United States. The SEC: you file all the paperwork, most things don't get comments, you just get to do what you want to do. HSR mergers: if after 30 days the FTC doesn't send you an extension notice, it's over; you can just do the deal the next day. It sort of runs on a clock. And in this case, I think you're getting at the point that it doesn't have to mandate action. It can just be information. You just say, "Hey, we just launched this new model," or, "Hey, folks are asking questions that are very strange from these IP addresses, you figure out if that's important to you." But having that information awareness is maybe step one before going anywhere else.
But Jared, thank you so much for joining us.
Jared Dunnmon:
Yeah, likewise. Thanks for having me, folks. This was a lot of fun.