We’re just weeks away from WWDC 2024. The big, not-so-hidden theme for Apple this year is Artificial Intelligence, and there’s been a raft of articles tracking every deal the company makes in the space: from a partnership with OpenAI to hosting its own AI-processing chips in data centres.
The Apple that I know and love is not usually first to market with a new technology; they follow later with their own more thoughtful, considered take that puts users first. Not technologies for their own sake, but fleshed-out products built on those technologies.
So I find myself wondering what an Apple-built, large-language-model generative AI would look like. If it’s just a new Siri that hallucinates like any other chatbot, then I’m going to eat this MacBook Pro, sell my AAPL, and start living life as a Linux programmer.
Apple needs to deliver AI features, but without the well-documented risks. How the hell are they going to do that?
All About Constraints
The only chance they have is, I think, an LLM that is tightly constrained in what it talks about. Like Siri today, it’s limited to particular domains of knowledge: basic features like timers, weather, sports, media, HomeKit stuff, App Intents… and not much else?
Siri is terrible not just because of what it can (or can’t) do, but because it screws up all the time. It mis-hears, it mis-interprets, and it has no memory. On those fronts alone, modern LLM technology could improve Siri by interpreting a far wider range of inputs for the actions it already supports. So while I don’t expect a Siri that replaces all-in LLM chatbots like Kindroid, I think it’s worth hoping for a Siri with dramatically improved responsiveness, the ability to follow up seamlessly, and greater reliability.
I think a “Siri 2.0” would probably offer a feature set similar to what we have now, but with the ability to reach into apps (via App Intents) to perform tasks, and maybe even create chains of tasks like we have with Shortcuts today.
Me: “Hey Siri, create an email list for Charles, Mahesh, Siobhan and Lance. Call it ‘Work Homies’.”
Siri: “Done.”
Me: “Let’s write an email to them. Say ‘Hey y’all, let’s figure out the projections for Q3. Here’s the Keynote deck. Let me know what you think!’”
Siri: “Here’s the email.” You can see the email is open with the message in place, formatted correctly and including your email signature.
Me: “Attach the Keynote file I was working on.” And boom, your most-recent Keynote deck is attached. “Send.” Off she goes.
Something like this could offer natural language interactions with Apple’s built-in apps, but could be extended to other apps too. I’d love to see this.
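For what it’s worth, a lot of the plumbing already exists: App Intents is how apps expose actions to Siri and Shortcuts today. Here’s a very rough sketch of what a hypothetical “create email list” intent might look like — the type, names, and behaviour are my own invention for illustration, not anything Apple has announced:

```swift
import AppIntents

// Hypothetical intent for a mail-style app. Everything here is illustrative;
// the point is that interactions like the dialogue above could sit on top of App Intents.
struct CreateEmailListIntent: AppIntent {
    static var title: LocalizedStringResource = "Create Email List"
    static var description = IntentDescription("Creates a named group of email recipients.")

    @Parameter(title: "List Name")
    var name: String

    @Parameter(title: "Recipients")
    var recipients: [String]

    func perform() async throws -> some IntentResult & ProvidesDialog {
        // A real app would save the list to its data store here.
        return .result(dialog: "Created “\(name)” with \(recipients.count) recipients.")
    }
}
```

A smarter Siri wouldn’t need new machinery so much as the ability to chain intents like this one together from natural language, the way Shortcuts lets you chain actions by hand today.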
Xcode
I’m personally really excited about the possibilities of LLMs in Xcode. I already use ChatGPT to assist with coding, and having that built into Xcode presents a ton of opportunity to maximize my productivity.
Today, GitHub Copilot works by looking for instructions in the form of code comments, which it turns into code. That’s a great starting point for an Xcode integration, but I hope Apple takes it up a notch.
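To make that concrete, the comment-driven pattern looks something like this: you write the comment, and the tool proposes the implementation underneath. The suggested body here is just an illustrative example I wrote, not actual Copilot output:

```swift
import Foundation

// You type a comment describing what you want…
// Return the initials of a full name, e.g. "Ada Lovelace" -> "AL"
func initials(of fullName: String) -> String {
    // …and the assistant suggests an implementation like this one.
    fullName
        .split(separator: " ")
        .compactMap { $0.first }
        .map(String.init)
        .joined()
}
```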
While an Xcode-based LLM could be trained on all the Swift code out there and act as a top-level expert Swift developer, I would also love for it to be trained on all YOUR work. In the corporate environments where I work, projects get very large and complex. One of the greatest challenges I face when joining a new team (I’m a contractor, so this happens pretty frequently!) is becoming familiar with large codebases.
How great would it be to open my Xcode project and ask the AI, “Can you show me the code that handles user authentication?” and have it just open AuthenticationHandler.swift, highlighting the relevant method. Or you could interrogate the model about the approach to take for a given technique:
Me: “I have a UI that can have several different states. Can you show me how to build a state machine?”
Xcode: “Sure! State machines are a great choice for performing different actions based on the value of one or more properties. Let’s step through creating one.” Xcode creates a new file, StateMachine.swift, and adds an enum with some suggested states. “Think about your use case and define the different states that you’d like here.”
And you could go on like this, stepping through the process of creating a state machine.
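To sketch what that first step might produce — the states and transitions here are placeholders I’ve invented for illustration, not something Xcode generates today:

```swift
import Foundation

// The kind of file an Xcode assistant might scaffold: an enum of UI states,
// plus a tiny state machine that only permits sensible transitions.
enum ViewState {
    case idle
    case loading
    case loaded
    case failed
}

final class StateMachine {
    private(set) var state: ViewState = .idle

    // Returns true if the transition was allowed and performed.
    @discardableResult
    func transition(to newState: ViewState) -> Bool {
        switch (state, newState) {
        case (.idle, .loading),
             (.loading, .loaded),
             (.loading, .failed),
             (.failed, .loading):
            state = newState
            return true
        default:
            return false
        }
    }
}

// Usage:
// let machine = StateMachine()
// machine.transition(to: .loading)   // true
// machine.transition(to: .idle)      // false – not a valid transition
```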
If Apple can pull this off, it could be a revolutionary change in how developers work on Apple platforms. I can’t help but think of the hours that I’ve spent banging my head against poor documentation and my own stupidity; a large language model that knew my code, and was trained on ALL THE THINGS, could quickly short-circuit the times where I’m stuck or feeling too sad to code effectively (it happens).
I feel like if I’d read the above five years ago, it would sound like an impossible dream, a farcical fantasy along the lines of the Knowledge Navigator video from 1987. But like the features in that video, all of this could soon be a reality. And I couldn’t be more excited.