Worth Knowing
Grok 4 and Kimi K2 — Impressive Models from Two Very Different AI Labs: Two major AI releases in the last month point to both the progress of AI development and the differing approaches taken by U.S. and Chinese developers:
- xAI released Grok 4 last week, a model the Elon Musk-led company says is “the most intelligent model in the world.” The model’s benchmark performance is certainly impressive: “Grok 4 Heavy” — a powerful version of the model that spawns multiple reasoning agents to tackle complex problems — beat the best models from Google, Anthropic, and OpenAI on popular benchmarks like ARC-AGI-2, AIME’25, and Humanity’s Last Exam. That performance seems to come down in large part to raw horsepower — the model is rumored to have more than 2 trillion parameters and was trained using xAI’s massive Memphis-based cluster of 200,000 GPUs. Despite the high benchmark scores, the popular reception has been less positive, with users finding the model unintuitive and a bit “overcooked.” But the biggest drag on the Grok 4 release has been the behavior of the Grok integration on the social media platform, X. Last week, the chatbot began sharing antisemitic and pro-Hitler sentiments, posting violent sexual fantasies, and calling itself “MechaHitler.” The company later posted an apology for the behavior and claimed the issue stemmed from “an update to a code path” that, among other things, told the bot “you tell it like it is and you are not afraid to offend people who are politically correct.” The issue seems to have been (mostly) patched for now, but the incident has raised questions about xAI’s guardrails, especially as the company begins working closely with the federal government, including the Pentagon (see below).
- Moonshot, a Chinese AI developer backed by the e-commerce giant Alibaba, released its open-weights Kimi K2 model last week. Its benchmark performance and early user reviews both point to it being the world’s best open model, beating out fellow Chinese competitor DeepSeek V3. While Moonshot and Kimi K2 haven’t generated as much attention (or panic) as DeepSeek did earlier this year, the importance of the release shouldn’t be overlooked. As Nathan Lambert pointed out in his Interconnects newsletter, it shows that China’s capacity for impressive AI development isn’t limited to one lab.
- More: Elon Musk’s AI Grok Offers Sexualized Anime Bot, Accessible Even in Kid Mode | CSET: Inside Beijing’s Chipmaking Offensive
- More: A more intelligent approach to AI regulation | Senate strikes AI regulatory ban from GOP bill after uproar from the states
- A preprint paper from researchers at MIT, Wellesley College, and the Massachusetts College of Art and Design detailed the use of electroencephalography to measure brain activity in students writing essays. They found that students who had used an LLM showed less brain activity than those who used only a search engine or no outside tools at all. Students who used the LLM were significantly worse at quoting from their own essays, even minutes after writing them, and reported a lower sense of ownership over their work. Observers noted that it would be premature to extrapolate too widely from the results of the not-yet-peer-reviewed study, but if the widespread news coverage of the paper is any indication, the results seem to have struck a nerve.
- A separate paper by the non-profit organization METR found that providing experienced software developers with state-of-the-art AI coding tools actually slowed them down. When developers were allowed to use the tools, their task completion time increased by 19%. That result was particularly surprising because the developers themselves estimated the tools had made them 20% faster. METR pointed out that its small sample of 16 open-source developers working on large code repositories doesn’t necessarily generalize to all software development use cases. Furthermore, METR’s researchers noted that the one developer with the most previous experience using AI coding tools did see a considerable speedup, suggesting possible positive returns after sufficient skill-building. But the paper seems to at least suggest a disconnect between perceptions of usefulness and reality.
Government Updates
White House Reverses Course — Nvidia Can Resume H20 Exports to China: In a significant reversal, the Trump administration is allowing exports to China of Nvidia’s H20 and AMD’s MI308 chips, just three months after imposing a ban. Commerce Secretary Howard Lutnick said the decision was part of broader trade negotiations with China involving rare earth minerals. The reversal followed a meeting between Nvidia CEO Jensen Huang and President Trump last Thursday, with Huang having spent months lobbying Washington politicians to keep the Chinese market open. While the H20 is less powerful for AI training than Nvidia’s H100 (exports of which remain restricted), it is 20% faster for inference tasks — increasingly relevant as test-time compute becomes a key feature of new models. The policy reversal drew immediate bipartisan criticism on Capitol Hill. “The H20 is a powerful chip that, according to our bipartisan investigation, played a significant role in the rise of PRC AI companies like DeepSeek,” said House Select Committee on China Chair John Moolenaar (R-MI). “It is crucial that the U.S. maintain its lead and keep advanced AI out of the hands of the CCP.” Moolenaar said he would seek “clarification” from the Commerce Department on the policy change. While imposing heavy restrictions on China’s access to high-end chips has never had universal approval, even ambivalent or skeptical observers generally argued that, once imposed, the restrictions should stay in place. As CSET’s Helen Toner told Fortune Magazine earlier this year, “It doesn’t make any sense to try to walk it back … We’ve paid the price … putting China on high alert that chips are a strategic technology, and incentivizing the whole global supply chain to avoid using U.S. components so as not to be subject to extraterritorial controls.” For now, the administration does not appear to plan to loosen other chip export restrictions (though it lifted some controls on chip design software earlier this month), and exporting H20s will still require a license from the U.S. government, but Nvidia says it has assurances that the licenses will be granted and deliveries can begin soon.
Anthropic and Meta Notch AI Copyright Wins: In a pair of lower court rulings in the last month, federal judges handed AI developers Anthropic and Meta key victories in copyright battles over the use of books to train their models. The decisions were significant wins for the “fair use” defense for AI training, but the issue is far from settled. In the case against Anthropic, U.S. District Judge William Alsup in San Francisco ruled that the company’s use of books to train its Claude language model was “exceedingly transformative” and therefore fair use under U.S. copyright law. He did find, however, that Anthropic had infringed some authors’ copyrights by downloading millions of books used in training from online pirate libraries; a trial scheduled for December will determine how much Anthropic owes for those infringements. In the Meta case, District Court Judge Vince Chhabria ruled that the authors — a group that included Sarah Silverman and Ta-Nehisi Coates — failed to demonstrate sufficient “market harm” posed by Meta’s models (though Judge Chhabria also found Meta had violated copyright by downloading pirated copies of the authors’ books). Chhabria’s ruling wasn’t a total victory for AI training arguments, however — he noted that his decision “does not stand for the proposition that Meta’s use of copyrighted materials to train its language models is lawful,” only that the authors “made the wrong arguments.” Other heavyweight copyright fights are still working their way through the courts, including a lawsuit by the New York Times against OpenAI and Microsoft, and another by Disney and Universal Studios against AI image generator Midjourney. With so many legal balls in the air, an ultimate resolution on AI training and fair use seems unlikely absent an act of Congress (on a related note, the Senate Judiciary Committee’s Subcommittee on Crime and Counterterrorism held a hearing on “Mass Ingestion of Copyrighted Works for AI Training” yesterday) or a definitive ruling from the Supreme Court.
Anthropic, Google, and xAI Join OpenAI with DOD Contracts of Their Own: On Monday, the Pentagon’s Chief Digital and Artificial Intelligence Office (CDAO) announced it had awarded contracts worth up to $200 million each to Anthropic, Google, and xAI to “develop prototype frontier AI capabilities to address critical national security challenges in across [sic] warfighting and enterprise domains.” The news follows a nearly identical award last month to OpenAI, bringing the total investment ceiling to $800 million. As we discussed last month, the DOD and CDAO have been light on specifics about what types of services the contracted AI companies will perform. A press release accompanying this month’s announcement said the partnerships would enable the DOD to “develop agentic AI workflows across a variety of mission areas.” CDAO head Doug Matty said these areas encompassed the “warfighting domain as well as intelligence, business, and enterprise information systems.” In its press release announcing the partnership, Anthropic said it would create “prototypes fine-tuned on DOD data,” “anticipate and mitigate potential adversarial uses of AI,” and help inform and accelerate responsible AI adoption across the DOD. While back-office enterprise functions are likely the most immediate application for the companies’ models, neither the DOD nor the developers have ruled out significant integration on the warfighting side.
Trump Announces $90B AI Infrastructure Buildout in Pennsylvania: At an event in Pennsylvania on Tuesday, President Trump and Senator Dave McCormick (R-PA) announced over $90 billion in private-sector investments in the state to build out AI and energy infrastructure. Among the largest announcements (fact sheet available here), private equity firm Blackstone committed $25 billion for data centers and energy projects, including natural gas plants. Google pledged $25 billion for data centers in the region and a separate $3 billion deal to upgrade two local hydroelectric dams. Other major investments include a $6 billion data center from AI cloud company CoreWeave and a $15 billion grid expansion by FirstEnergy. The investments take advantage of Pennsylvania’s cheap natural gas — a result of fracking outpacing pipeline capacity — allowing companies to co-locate power-hungry data centers directly with an abundant energy source. The reliance on fossil fuels attracted criticism from environmental activists; between natural gas and hydroelectric power, the Pennsylvania investments are certainly more environmentally friendly than coal, but they’re still a far cry from the previous administration’s plans for a clean energy-powered AI buildout.
DOD Budget Request Calls for Significant Autonomy Spending: The Pentagon’s fiscal year 2026 budget request, unveiled last month, calls for significant spending on uncrewed and autonomous systems. Pentagon officials said they planned to invest $13.4 billion of the nearly $1 trillion request in autonomy and autonomous systems — the first year such investments are explicitly broken out in their own section. As a DOD official told DefenseScoop’s Brandi Vincent, the largest share, $9.4 billion, would be directed to unmanned and remotely operated aerial vehicles. An additional $1.7 billion is slated for autonomous surface vessels, $734 million for underwater systems, and $210 million for ground vehicles. The software that powers the autonomous platforms — what the DOD official called the “central brain” — will cost $1.2 billion next year. The Navy, in particular, would see a major boost, with its autonomy funding jumping by $2.2 billion to a total of $5.3 billion for FY26. The Air Force is also investing heavily, with $807 million for its Collaborative Combat Aircraft (CCA) program. Delays in the submission of the President’s budget request have not stopped Congress from moving forward with its annual budget process, with both Armed Services Committees having already advanced their initial versions of the National Defense Authorization Act. The House Appropriations Committee has also approved its defense funding bill, which is before the full House this week.
In Translation
CSET’s translations of significant foreign language documents on AI
- Notice of the Xiamen City Data Administration on the Announcement of the List of Artificial Intelligence Application Scenario Opportunities
- Basic Act on the Development of Artificial Intelligence and Establishment of Trust
What’s New at CSET
REPORTS
- AI on the Edge of Space by Christopher Huynh
- AI System-to-Model Innovation: Transformations from the Shop Floor by Jonah Schiestle and Andrew Imbrie
- The Future of Work-Based Learning for Cyber Jobs by Ali Crawford
- Big Tech in Taiwan: Beyond Semiconductors by Sam Bresnick
PUBLICATIONS
- CSET: Inside Beijing’s Chipmaking Offensive: Where Is China Gaining Ground? by Jacob Feldgoise and Hanna Dohmen
- CSET: Beyond Corporate Promises: How Government Can Follow Through on AI Preparedness by Kendrea Beers
- Council on Foreign Relations: Will Trump’s ‘Big Beautiful’ Defense Spending Last? by Erin D. Dumbacher, Michael C. Horowitz, and Lauren Kahn
- Foreign Affairs: What Drones Can — and Cannot — Do on the Battlefield by Michael C. Horowitz, Lauren Kahn, and Joshua A. Schwartz
- AI Frontiers: Nuclear Non-Proliferation Is the Wrong Framework for AI Governance by Michael C. Horowitz and Lauren A. Kahn
- The National Interest: Fixing the Pentagon’s Broken Innovation Pipeline by Michael Horowitz and Lauren Kahn
- The Journal of Blacks in Higher Education: Top-Tier Research at HBCUs Beyond 2025 by Jaret C. Riddick and Brendan Oliss
EMERGING TECHNOLOGY OBSERVATORY
- Updated Supply Chain Explorer for Advanced Semiconductors: This week, ETO debuted a major update to the Semiconductor Supply Chain Explorer, a tool that offers policymakers and researchers an open, interactive way to understand the essential inputs, countries, and companies involved in producing advanced chips.
EVENT RECAPS
- On July 16, Dan Kim (former Chief Economist and Director of Strategic Planning and Economic Security at the U.S. Department of Commerce for its CHIPS for America program), Mario R. Palacios (Senior Director of Government Affairs and Head of International Trade Policy for Applied Materials), and Saif Khan (Distinguished Fellow at the Institute for Progress and a Senior Advisor to the U.S. House Select Committee on the CCP) joined CSET Senior Research Analyst Hanna Dohmen and Senior Data Research Analyst Jacob Feldgoise for an in-depth discussion of recent trends in semiconductors and semiconductor manufacturing equipment and their implications for national security policy and for U.S. firms’ long-term competitiveness in the chip equipment market. They also discussed Dohmen and Feldgoise’s new CSET piece Inside Beijing’s Chipmaking Offensive: Where Is China Gaining Ground? as well as the updated Semiconductor Supply Chain Explorer. Watch a recording of the event.
IN THE NEWS
- Bloomberg: China Lags in Chip Lithography, Influential DC Think Tank Says (Debby Wu cited the new Jacob Feldgoise and Hanna Dohmen post Inside Beijing’s Chipmaking Offensive)
- Bloomberg: Former OpenAI Board Member Questions Zuckerberg AI Hiring Spree (Saritha Rai and Haslinda Amin quoted Helen Toner)
- Bloomberg Television: Could China Topple America’s AI Throne? (Haslinda Amin hosted Helen Toner)
- ChinaTalk: China’s Diverging AI Path (Andrew Stokols cited the CSET report Wuhan’s AI Development)
- DefenseScoop: What Trump’s order on ‘unleashing American drone dominance’ means for the U.S. military (Brandi Vincent quoted Lauren Kahn)
- New York Post: China unveils eerie mosquito-sized drone designed for stealth military operations — nearly invisible to naked eye (Anna Young quoted Sam Bresnick)
- Scaling Laws: The AI Moratorium Goes Down in Flames (Kevin Frazier hosted Helen Toner)
- The Conversation: Britain’s plan for defence AI risks the ethical and legal integrity of the military (Elke Schwarz cited the CSET report Building the Tech Coalition)
- The Telegraph: China unveils mosquito-sized drone (Allegra Mendelson quoted Sam Bresnick)
What We’re Reading
Paper: Chain of Thought Monitorability: A New and Fragile Opportunity for AI Safety by Tomek Korbak, Mikita Balesni, and coauthors (July 2025)
Paper: InfoFlood: Jailbreaking Large Language Models with Information Overload by Advait Yadav, Haibo Jin, Man Luo, Jun Zhuang, and Haohan Wang (June 2025)
Paper: Vending-Bench: A Benchmark for Long-Term Coherence of Autonomous Agents by Axel Backlund and Lukas Petersson