By Joe Rosensteel
March 25, 2026 3:18 PM PT
How can Siri automate Shortcuts when it’s so opaque?

I’m pretty skeptical that Apple’s new Siri-wrapped Gemini will be able to accurately and reliably assist with automation. Gemini will be the foundation for Apple’s foundation models, but there’s no there there. Apple has no well-documented, debuggable, inspectable system to execute automation with, unless you count ancient and inscrutable AppleScript, and you shouldn’t.
Sure, LLM chatbots will spit out code (even AppleScript!) if you ask them to, but it might not work. It gets substantially worse when you’re asking LLMs questions about Shortcuts.
Go ahead and ask any chatbot to describe how to make a Shortcut to perform some automation that you’ve been wanting to do and then try to assemble what it suggests. It’s extremely tedious, prone to user error, and isn’t in any way guaranteed to work even when it’s all put together.
Agents that hook into development environments are much better than a bare chatbot because they can inspect, run, and debug the code they are generating. They aren’t perfect, but if you have an agent like Claude Code hooked up to a development tool like VS Code and start describing some Python script you want, it’ll execute and iterate until the output is what you asked for.
If humans don’t have access to documentation, actionable debug output, logging, the ability to bypass or ignore actions as part of testing, and the ability to copy and paste snippets of code, then how can the new Siri do it?
Right now, Shortcuts works with AI models by passing some input and then receiving the output. When something goes to the model, the model transforms the data and delivers a result back to Shortcuts. That’s a non-deterministic workflow: any change to the model, or even just randomness in general, can produce different output. This means you can’t reliably troubleshoot or adjust it, because any tweak introduces uncertainty about what new outputs you’ll get.
When working with an agent to assemble automation in an IDE, the code it builds is deterministic, so it will keep working even if the model changes. Not everything you want to automate requires LLM functionality when it runs, but building the deterministic version of a workflow shouldn’t require hours of labor either.
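The distinction is easy to see in miniature. Here’s a toy sketch (not Apple’s actual pipeline, and the function names are made up): a deterministic step produces the same output for the same input on every run, while a model-ish step, stood in for here by a random shuffle, may not, which is exactly what makes it hard to troubleshoot.

```python
import random

def deterministic_step(items):
    # Same input, same output, every run: safe to debug and rely on.
    return sorted(items)

def modelish_step(items):
    # Stand-in for a model call: output can vary between runs,
    # so a workflow built on top of it can't be reliably reproduced.
    shuffled = items[:]
    random.shuffle(shuffled)
    return shuffled

data = ["b", "c", "a"]
# This holds on every run, no matter what changes around it:
assert deterministic_step(data) == deterministic_step(data)
```

The deterministic step can be tested once and trusted; the model-ish step has to be re-checked every time anything upstream changes.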
I really hope that the magic of new Siri isn’t going to be that it will just do things with bare actions and App Intents, magically, without any user-accessible process, or as a blob inside of a Shortcut you need to make. If I ask Siri to reorder a list, and it doesn’t do it correctly, I want to be able to access the scaffolding it created to see what went wrong, not just keep asking Siri to do it again in slightly different ways until I get output I like.
If Siri doesn’t produce anything inspectable, or it produces a Shortcut, then there’s not much work humans or AI can do to fix things.
AI cut below the rest
The problem the Shortcuts app is supposed to solve has never been solved, because no one really knows how to use Shortcuts unless they become a Shortcuts expert. Shortcuts is user-friendly in appearance, but not in practice. It’s meant to welcome people who don’t know anything about programming with its friendly drag-and-drop interface, and searchable actions panel.
Unfortunately, the names for actions don’t always say what they do, and the documentation is often a vague piece of filler that’s frequently reused for more than one Shortcut action. Even experienced programmers can get flummoxed when they try to search the available actions for seemingly standard functions, like reversing a list.
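For contrast, in an ordinary scripting language like Python, reversing a list is a built-in you can find in seconds by searching the documentation for “reverse”:

```python
items = ["wake up", "coffee", "commute"]

# A new reversed copy:
print(list(reversed(items)))  # ['commute', 'coffee', 'wake up']

# Or reverse in place:
items.reverse()
```

No hunting through an actions panel, and the name of the function says exactly what it does.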
Magic connections are magic, until your script grows longer than your screen and you need to start dragging actions around, inevitably breaking connections and making unintended ones. With a text-based script you’d have to keep track of the names and spelling of your variables, but they don’t change out from under you when you add more lines of code above or below them.
You can’t do one of the simplest, most useful things in scripting, which is commenting out (ignoring/bypassing) a step to test it or evaluate alternatives.
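In a text-based script, bypassing a step is one keystroke. Here’s a hypothetical three-step workflow (the step names are invented for illustration) where the risky network step is commented out so the earlier steps can be tested in isolation:

```python
# A hypothetical backup workflow; every function name here is made up.
def gather_files():
    return ["a.txt", "b.txt"]

def compress(files):
    return f"archive of {len(files)} files"

def upload(archive):
    # The step we don't want firing while we test locally.
    raise RuntimeError("network step bypassed during testing")

files = gather_files()
archive = compress(files)
# upload(archive)  # commented out: test everything above without side effects
print(archive)
```

Shortcuts offers no equivalent: to skip an action you have to delete it and rebuild it later, connections and all.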
A lot of the time, when people are using Shortcuts, they’re relying heavily on the Run Shell Script action so they can write normal, vanilla code, or SSHing into a server from iOS to do the same thing. It’s nice that Shortcuts can do that, but shell scripts aren’t cross-platform, and SSHing into a server is in no way accomplishing Shortcuts’ mission.
Without logging, you can’t ask Siri why your automation that was supposed to run in the middle of the night didn’t run. Maybe it was a permissions issue that was never raised when the shortcut was created. You, and Siri, just don’t know.
AI rising tide lifts all boats
Again, Apple doesn’t have to do these things just for humans, or just for Siri. They are in no way mutually exclusive.
If the concern is that Shortcuts shouldn’t be like a programming language, with tracebacks and logs that would put off “normal people,” then just remember that “normal people” don’t really use Shortcuts. They ask a chatbot to just do it, and Siri, as Apple’s chatbot, could take advantage of those fiddly programming bits and perform its role better, in a way that’s auditable.
I have seen people make frantic posts on Mastodon about how AI is deskilling programmers, but the beauty of Shortcuts is that Apple already applies the deskilling at the factory.
[Joe Rosensteel is a VFX artist and writer based in Los Angeles.]
If you appreciate articles like this one, support us by becoming a Six Colors subscriber. Subscribers get access to an exclusive podcast, members-only stories, and a special community.