WWDC 2026: Apple just built the on-device AI framework the military needs

Core ML was never designed for LLMs

Core ML was built for classification and detection models: fixed-size inputs, single-shot inference, and .mlmodel bundles optimized for the Neural Engine. Running a large language model through it required workarounds: tokenization was awkward, streaming token generation was not first-class, memory management for multi-billion-parameter models was manual, and agent tool calling did not exist.

At WWDC 2026, Apple replaced Core ML with Core AI, a framework built from scratch for LLMs: async inference, streaming token output, large model memory footprints, and third-party model integration without .mlmodel lock-in. iOS 27 and macOS 27 'Golden Gate' ship with the new framework. For defense AI on Apple Silicon, the hacks needed to run Gemma or Llama give way to a supported, optimized runtime with ahead-of-time compilation and Python tools for PyTorch conversion.
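To make "async inference" and "streaming token output" concrete, here is a minimal Python sketch of the pattern. It is not Core AI's API: `stream_tokens`, `collect`, and `demo` are hypothetical names, and the "model" simply echoes the prompt word by word in upper case. The point is only that the caller receives each token as it is produced instead of waiting for the full completion.

```python
import asyncio
from typing import AsyncIterator


async def stream_tokens(prompt: str) -> AsyncIterator[str]:
    """Hypothetical stand-in for a model: yields one token at a time."""
    for word in prompt.split():
        # Hand control back to the event loop between tokens, the way a
        # real runtime would between decode steps.
        await asyncio.sleep(0)
        yield word.upper()


async def collect(prompt: str) -> list[str]:
    """Consume the stream token by token."""
    received: list[str] = []
    async for token in stream_tokens(prompt):
        received.append(token)  # a UI would render each token here
    return received


def demo() -> list[str]:
    return asyncio.run(collect("route stays clear"))


if __name__ == "__main__":
    print(demo())
```

With single-shot inference, the same caller would block until the whole response existed, which is the gap the paragraph above describes.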

Third-party models plug in natively

Core AI supports third-party models without format conversion. Llama, Mistral, Gemma, or any compatible model plugs in directly. The LanguageModel protocol lets applications swap between Apple's on-device model, Claude, Gemini, or a custom model with a single-line change.
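The single-line swap can be illustrated with a small sketch of protocol-based dispatch. This is Python, not Swift, and it does not reproduce Apple's LanguageModel protocol: the `generate` method and the `OnDeviceModel` and `RemoteModel` classes are assumptions made up for the example. It shows only the general idea that application code depends on an interface, not on a specific backend.

```python
from typing import Protocol


class LanguageModel(Protocol):
    """Hypothetical interface; the real protocol's methods may differ."""

    def generate(self, prompt: str) -> str: ...


class OnDeviceModel:
    """Placeholder for a model running locally."""

    def generate(self, prompt: str) -> str:
        return f"[on-device] {prompt}"


class RemoteModel:
    """Placeholder for a model reached through an external provider."""

    def __init__(self, provider: str) -> None:
        self.provider = provider

    def generate(self, prompt: str) -> str:
        return f"[{self.provider}] {prompt}"


def answer(model: LanguageModel, prompt: str) -> str:
    # Application code only sees the protocol, never the concrete class.
    return model.generate(prompt)


# Swapping backends is a change to this one line.
model: LanguageModel = OnDeviceModel()

if __name__ == "__main__":
    print(answer(model, "summarize the patrol log"))
```

Everything downstream of the `model` assignment stays the same whichever backend is chosen.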

This aligns with EdgeLance's compute routing architecture. The local tier runs on the Neural Engine via Core AI. The base tier routes to a nearby GPU server. The cloud tier reaches external providers when policy allows. EdgeLance can use Apple's optimized runtime for local inference without being locked to Apple's models. The same mission pack that loads Gemma 2B on a MacBook can load it through Core AI, gaining native memory management and hardware acceleration.

EdgeLance compute routing: local Neural Engine first, base GPU second, cloud third if policy allows. Core AI makes the local tier a first-class runtime.
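The routing order described above (local first, base second, cloud only if policy allows) can be sketched as a simple fallback function. This is an illustration, not EdgeLance's implementation: `Tier`, `NodeState`, `choose_tier`, and every field name are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Tier(Enum):
    LOCAL = "local"  # Neural Engine on the device
    BASE = "base"    # nearby GPU server
    CLOUD = "cloud"  # external provider


@dataclass(frozen=True)
class NodeState:
    """Hypothetical snapshot of what a node can currently reach."""

    local_available: bool
    base_reachable: bool
    cloud_reachable: bool
    policy_allows_cloud: bool


def choose_tier(state: NodeState) -> Optional[Tier]:
    """Pick the first usable tier in priority order, or None if none is."""
    if state.local_available:
        return Tier.LOCAL
    if state.base_reachable:
        return Tier.BASE
    # Cloud needs both connectivity and explicit policy approval.
    if state.cloud_reachable and state.policy_allows_cloud:
        return Tier.CLOUD
    return None


if __name__ == "__main__":
    disconnected = NodeState(True, False, False, False)
    print(choose_tier(disconnected))
```

Keeping the policy check inside the cloud branch means a reachable cloud endpoint is never used by accident when policy forbids it.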

Foundation Models 3 is natively multimodal

Apple Foundation Models 3 ships with two tiers: a 3B Core model that runs entirely on-device, and a 20B Advanced model using mixture-of-experts with 1-4B parameters active per request. Both accept text and image input and integrate with Apple's Vision framework for OCR, barcode scanning, and object recognition.
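As a quick piece of arithmetic on the figures above: in a mixture-of-experts model, only the active parameters participate in a given request, so 1B to 4B active out of 20B means roughly 5% to 20% of the weights do the work each time. The helper below is a hypothetical illustration of that calculation, using only the parameter counts quoted in this post.

```python
def active_fraction(active_params: float, total_params: float) -> float:
    """Share of a model's parameters that take part in one request."""
    if total_params <= 0:
        raise ValueError("total_params must be positive")
    return active_params / total_params


TOTAL = 20e9                       # 20B Advanced model
LOW_ACTIVE, HIGH_ACTIVE = 1e9, 4e9  # 1B to 4B active per request

if __name__ == "__main__":
    low = active_fraction(LOW_ACTIVE, TOTAL)
    high = active_fraction(HIGH_ACTIVE, TOTAL)
    print(f"active share per request: {low:.0%} to {high:.0%}")
```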

Consider a patrol that photographs a document, scans a vehicle plate, and asks a question about the scene. All three inputs are processed by one model running locally on a MacBook or iPhone, with no separate pipeline per modality. EdgeLance already runs multi-model stacks on Apple Silicon. AFM 3 adds a baseline multimodal capability that ships with every Apple device, reducing the minimum viable AI loadout for a tactical node.

MCP goes platform-wide

Model Context Protocol extends across iOS 27 and macOS 27. MCP is the open standard for connecting AI models to external tools and data sources. Any application can expose capabilities that system AI invokes through structured tool calls.

EdgeLance services (mesh status, threat analysis, mission context, evidence queries, fleet state) can be exposed as MCP tools that Apple's system AI calls natively. An operator asking Siri 'what is the current threat picture' could invoke EdgeLance's threat analyzer through MCP without opening the app. MCP is a published protocol. EdgeLance already implements structured tool interfaces.

EdgeLance system layers from hardware to operator view. MCP platform support means Apple's AI can invoke EdgeLance services as native tool calls.
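To show what a structured tool call looks like on the wire, here is a minimal sketch of an MCP-style tool server in plain Python. The message shapes (JSON-RPC 2.0, the `tools/list` and `tools/call` methods, a tool descriptor with `name`, `description`, and `inputSchema`) follow the published MCP specification. Everything else is an assumption: the `threat_picture` tool, its `sector` argument, and its placeholder reply are invented for illustration, and nothing here reflects how iOS 27 or EdgeLance actually wire this up.

```python
import json

# Tool descriptor in the shape MCP uses for tools/list results.
# The tool itself is hypothetical.
THREAT_TOOL = {
    "name": "threat_picture",
    "description": "Summarize the current threat picture for a sector.",
    "inputSchema": {
        "type": "object",
        "properties": {"sector": {"type": "string"}},
        "required": ["sector"],
    },
}


def threat_picture(sector: str) -> str:
    # Placeholder: a real service would query its own analysis pipeline.
    return f"No threat data loaded for sector {sector}."


HANDLERS = {"threat_picture": threat_picture}


def _error(request_id, code: int, message: str) -> str:
    body = {"jsonrpc": "2.0", "id": request_id,
            "error": {"code": code, "message": message}}
    return json.dumps(body)


def handle_request(raw: str) -> str:
    """Answer one JSON-RPC request with a JSON-RPC response."""
    request = json.loads(raw)
    request_id = request.get("id")
    method = request.get("method")

    if method == "tools/list":
        result = {"tools": [THREAT_TOOL]}
    elif method == "tools/call":
        params = request.get("params", {})
        handler = HANDLERS.get(params.get("name"))
        if handler is None:
            return _error(request_id, -32602, "Unknown tool")
        text = handler(**params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": text}],
                  "isError": False}
    else:
        return _error(request_id, -32601, "Method not found")

    return json.dumps({"jsonrpc": "2.0", "id": request_id, "result": result})


if __name__ == "__main__":
    call = {"jsonrpc": "2.0", "id": 1, "method": "tools/call",
            "params": {"name": "threat_picture",
                       "arguments": {"sector": "north"}}}
    print(handle_request(json.dumps(call)))
```

A production server would also handle the protocol's initialization handshake and validate arguments against `inputSchema`; both are omitted here to keep the sketch short.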

What this changes for EdgeLance and defense AI

Apple did not build Core AI for the military. They built it for consumer apps and Siri. But the result (local LLM inference, third-party model support, multimodal processing, MCP tool calling, and ahead-of-time compilation) is exactly what defense developers have been hacking together with custom MLX code for two years. Now there is a supported framework.

Core AI replaces custom MLX integration with a supported framework. The LanguageModel protocol aligns with existing compute routing. MCP turns EdgeLance services into system-level AI tools. AFM 3 provides baseline multimodal capability on every device. watchOS 27 ships with improved health tracking that feeds EdgeLance's biometric readiness pipeline.

Xcode 27 runs code completion on the local Neural Engine first, routing to cloud only when needed. For defense dev teams in SCIFs, that means AI-assisted development without a cloud connection. The five-thousand-dollar ISR stack just got a better runtime without any hardware changes.
