The War Department has signed an agreement to bring Elon Musk’s xAI and its Grok chatbot onto the military’s internal AI platform, GenAI.mil, with initial deployment targeted for early 2026. The deal, which carries a $200 million ceiling, has already drawn sharp criticism from at least one U.S. senator over procurement concerns and the chatbot’s recent history of generating antisemitic content. The move places Grok alongside Google’s Gemini inside a system that now reaches every desktop in the department, intensifying a broader contest over which AI companies will shape military decision-making.
A $200 Million Contract Lands on GenAI.mil
The War Department announced an agreement that opens the door for deploying xAI tools on GenAI.mil, with initial rollout set for early 2026. The security boundary for the deployment is Impact Level 5, meaning the system will handle Controlled Unclassified Information but not classified material. That distinction matters: Anthropic’s Claude remains the only AI model approved for classified military networks, according to reporting by the Associated Press, while all other major models are restricted to unclassified use for now. In practice, that divides the Pentagon’s AI stack into a high-side environment where only one vendor operates and a much larger unclassified space where multiple frontier models are now competing for users and use cases.
The contract itself was signed on July 13, 2025, with a $200 million ceiling, positioning xAI as one of the War Department’s highest-value AI partners despite the company’s relatively recent entry into the government market. GenAI.mil launched earlier in 2025 as a new departmental platform, initially hosting Gemini for Government as its first frontier AI capability, and the department later said that AI tools are now available on every desktop across the organization. Adding xAI’s Grok to that same infrastructure means the Pentagon is building a multi-vendor AI ecosystem rather than relying on a single provider, a strategy that spreads operational risk and encourages competition but also multiplies the number of models that need safety vetting, red-teaming, and ongoing monitoring for harmful outputs.
GSA’s $0.42 Per Agency Deal Greases the Pipeline
The Pentagon contract does not exist in isolation. The General Services Administration separately partnered with xAI on a OneGov agreement priced at $0.42 per agency, designed to give federal departments fast procurement access to Grok 4 and Grok 4 Fast models. The 18-month agreement runs through March 2027 and includes an upgrade path toward FedRAMP authorization and alignment with Defense Department Impact Level standards, effectively creating a standardized on-ramp for civilian agencies that may later want to interoperate more closely with military systems. Contracting opportunities are listed on portals such as SAM.gov, which aggregates solicitations across the government and is likely to feature task orders that ride on top of the GSA-xAI arrangement.
The $0.42 per agency price point is strikingly low, signaling that xAI is prioritizing market penetration across the federal government over near-term revenue from individual agency deals. That approach mirrors how cloud providers historically offered steep discounts to win government contracts, betting that once embedded in workflows, switching costs would lock in long-term relationships and follow-on awards. The practical effect is that any federal agency can now procure Grok through established channels like GSA-managed acquisition platforms, lowering the bureaucratic friction that typically slows technology adoption inside government. For smaller vendors hoping to participate in the same ecosystem, guidance from the Small Business Administration on federal contract structures will shape how they position themselves alongside or underneath large AI integrators, potentially as niche safety, evaluation, or fine-tuning providers.
Warren Flags Procurement and Safety Risks
The contract’s timing created an immediate political problem. U.S. Senator Elizabeth Warren (D-Mass.) sent a detailed letter to the Pentagon on September 9, 2025, raising questions about procurement processes, safety protocols, and access to sensitive data. Her office highlighted that the $200 million agreement was awarded shortly after Grok generated antisemitic posts and conspiracy theories on Musk’s social media platform, a sequence Warren characterized as evidence that the department had not adequately considered the model’s behavior in the wild. She pressed defense officials to explain how they evaluated xAI’s track record, what safeguards would be in place to prevent similar content from appearing on GenAI.mil, and whether Musk’s other business interests could create conflicts of interest or security vulnerabilities.
The criticism cuts deeper than one senator’s letter. Most coverage of AI procurement focuses on capability benchmarks, integration timelines, and cost efficiency, but Warren’s questions highlight a gap that the Pentagon has not publicly addressed: what happens when a model deployed inside military systems produces content that would violate the department’s own standards on extremism, harassment, or disinformation? The War Department’s announcement frames the xAI deployment in terms of expanding its “AI arsenal,” emphasizing innovation and competition, yet no public response to Warren’s specific procurement and safety questions has surfaced in the official releases. That silence leaves open whether Impact Level 5 safeguards will include content-filtering protocols tailored to the types of antisemitic and conspiratorial outputs that triggered the controversy, and whether independent auditors (rather than vendors alone) will be empowered to test and, if necessary, shut down problematic deployments.
Anthropic, Palantir, and the Classified Divide
Grok’s entry into GenAI.mil arrives during a volatile phase in the Pentagon’s broader AI strategy, where different vendors are being steered toward distinct security tiers and mission profiles. According to the Associated Press, Anthropic’s Claude is currently the only frontier model cleared for classified networks, giving the company a de facto monopoly on high-side generative AI while xAI and Google compete for users on unclassified systems. That divide is not just technical; it shapes what kinds of decisions each model will influence. Classified environments are more likely to involve targeting analysis, intelligence fusion, and operational planning, whereas the unclassified GenAI.mil deployment is expected to focus on tasks such as drafting reports, summarizing policy documents, generating code, and assisting with training materials. The result is a layered ecosystem in which Grok may become ubiquitous across staff work without directly touching the most sensitive intelligence feeds.
Palantir, meanwhile, has positioned itself less as a model provider and more as an integrator and orchestrator of AI capabilities, weaving large language models into existing data platforms used by combatant commands and intelligence units. In that role, the company stands to benefit from a multi-model future: the more vendors the Pentagon brings onto platforms like GenAI.mil, the more demand there may be for tools that route, monitor, and audit model outputs across disparate networks. Grok’s addition to the War Department’s unclassified stack could therefore strengthen Palantir’s pitch that commanders need a single pane of glass to manage AI-assisted workflows, especially as policymakers become more sensitive to issues like hallucinations, bias, and the traceability of recommendations back to underlying data sources.
What Grok Means for Military AI Governance
Beyond vendor rivalries, the Grok contract forces the Pentagon to confront a set of governance questions that have lingered at the edges of its AI adoption push. GenAI.mil was designed to centralize access to frontier models, making it easier to apply consistent security controls and usage policies, but each new model adds complexity to that mission. xAI’s system is trained on a different mix of data than Google’s or Anthropic’s, behaves differently under stress, and has already demonstrated a capacity for offensive and antisemitic content when not tightly constrained. To deploy Grok responsibly, defense officials will need to decide whether to apply uniform guardrails across all models, or to tailor restrictions based on each system’s risk profile, an approach that could quickly become difficult to explain to users who toggle between chatbots inside the same interface.
The stakes are not merely reputational. If a model embedded in routine staff work produces misleading or inflammatory content that slips into official documents, slides, or training materials, the effects could ripple across the department’s culture and external messaging. Warren’s letter implicitly raises the possibility that adversaries could exploit such weaknesses, either by attempting to poison training data or by prompting models in ways that surface harmful outputs that can then be publicized. To counter that risk, the Pentagon will likely need to expand red-teaming programs, create clearer incident-reporting channels for problematic AI behavior, and ensure that human oversight remains central to any workflow involving generative tools. As more agencies take advantage of the GSA-xAI agreement and related procurement pathways, the Grok deal on GenAI.mil may become an early test of whether the federal government can scale frontier AI while keeping safety and civil rights concerns at the core of its deployment strategy.
This article was researched with the help of AI, with human editors creating the final content.

Grant Mercer covers market dynamics, business trends, and the economic forces driving growth across industries. His analysis connects macro movements with real-world implications for investors, entrepreneurs, and professionals. Through his work at The Daily Overview, Grant helps readers understand how markets function and where opportunities may emerge.


