“AR”, as in Augmented Reality. “Vision”, as in not what Apple’s selling. It’s been a decade now that I’ve thought that AR will be a next Really Big Thing, once the technology to build it is here (which it isn’t). Since Meta and Apple and plenty of startups are muddying the waters around what AR might mean, here’s a vision of the AR mid-game. Not the end-game, since that’s unimaginable, but still a long way out from here.
Side-trip: Why not AppleVision? · The Vision Pro looks like a cool piece of gear. There are two apps where I can see coughing up the $3500 right now today. First, giving my Mac an infinitely big screen. Right now I’m working on a project where I’m constructing this huge and complex document which sources its content from my brain, supported by multiple other big complex documents and a whole lot of browser tabs. The amount of time I spend doing ⌘-Tab and ⌘-` to get at the window I want is highly annoying. I gather that the Vision Pro would let me have a dozen full-page windows scattered in fixed spatial positions where I could just look/finger-tap for access. Tasty!
Second is the sports experience. I’d totally buy one for third-base-dugout or soccer-pitch-penalty-box immersion. I’m pessimistic though. The world’s richest sports organizations — Britain’s EPL, India’s IPL, America’s NFL — typically can’t manage to deliver even high-quality 1080p, let alone 4K, so I see no grounds for optimism that they’re gonna build the expensive ultra-high-bandwidth infrastructure to boost sales of multi-kilobuck headsets.
I notice Apple hasn’t said how much bandwidth one of these things needs to drive it. Dual 4K monitors running at 90Hz, that’s a whole lot of bits.
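For a rough sense of scale, here’s a back-of-envelope sketch of the raw, uncompressed bit rate. The numbers are my assumptions, not anything Apple has published: 3840×2160 pixels per display, 24 bits per pixel, two displays, 90Hz.

```python
# Back-of-envelope: raw (uncompressed) bit rate for two 4K panels at 90Hz.
# Assumptions, not Apple specs: 3840x2160 per display, 24 bits per pixel.
WIDTH, HEIGHT = 3840, 2160
BITS_PER_PIXEL = 24
REFRESH_HZ = 90
DISPLAYS = 2


def raw_bits_per_second(width, height, bits_per_pixel, refresh_hz, displays):
    """Uncompressed video bit rate, in bits per second."""
    return width * height * bits_per_pixel * refresh_hz * displays


total = raw_bits_per_second(WIDTH, HEIGHT, BITS_PER_PIXEL, REFRESH_HZ, DISPLAYS)
print(f"{total / 1e9:.1f} Gbit/s uncompressed")  # prints "35.8 Gbit/s uncompressed"
```

Any real stream would be compressed, so the wire rate would be far lower than this; the point is only that the starting number is enormous.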
Let’s turn our attention back to AR.
The “R” in AR · It stands for Reality. A certain amount of your reality (and mine) happens at home where an I/O helmet might be appropriate, but the rest happens out in the real world, and I’m totally not going there faceplated. Consider Dan Morrill quoting Neal Stephenson’s Snow Crash:
Gargoyles represent the embarrassing side of the Central Intelligence Corporation. Instead of using laptops, they wear their computers on their bodies, broken up into separate modules that hang on the waist, on the back, on the headset. They serve as human surveillance devices, recording everything that happens around them. Nothing looks stupider; these getups are the modern-day equivalent of the slide-rule scabbard or the calculator pouch on the belt, marking the user as belonging to a class that is at once above and far below human society. They are a boon to Hiro because they embody the worst stereotype of the CIC stringer. They draw all of the attention. The payoff for this self-imposed ostracism is that you can be in the Metaverse all the time, and gather intelligence all the time.
Wanna be one of those? Me neither.
So anything that claims to be meaningfully “AR” needs to be usable out there in public when interfacing with, you know, reality.
AR? · So here’s the pure vision, the one that’ll be the big winner eventually. You’re doing something out in the world; walking in a park, shopping in a store, shooting baskets, painting a room, going to lunch. Not wearing a helmet. Reality is what you see when you look around you. At some point, you’ll be able to choose to see it plain or augmented. Let’s totally ignore whether you look through glasses or a tablet or a magic magnifying glass or a phone or cyborg eye-socket implants.
Augmented how? Here are ideas.
But wait, “augmentation” is a long klunky word. Let’s say “aug” for short (plural: augs, verb: to aug, participle: augging).
Art augs · At dusk, you’ve gone for a walk in a park that has lots of big trees because you’ve heard that it has good augging. A few minutes in, you get an aug notification: Natty Trunks. You give the go-ahead and a slow spacey dub of Bob Marley’s Natty Dread fades in; all the trees exhibit a glowing ring around their bases, in a variety of blues and blue-adjacent turquoises and violets. Then they dance, the color pulsing up the trunk, most following the bass (it’s Dub!), others the drums and horns. Snatches of vocals are highlighted by funky multicolored snakes flowing through the dark leafy canopy.
You look across the water at Downtown’s high-rises and get a hot-trending-aug signal. Accept it, and there’s a huge shabby grey-haired woman pushing a shopping cart across that skyline, stooping to look in windows and offer foul, profane, hilarious comments on what she says she sees.
The next Banksy could be an augger. Hey, he’s not that old, the current Banksy might give it a try.
Shopping aug · You visit Costco on your way home from work, in a hurry. Your shopping list says “Peanut butter, toilet paper, gin.” You’re smart enough to avoid Costco’s own aug, which will route you inefficiently past attractive distractions.
So you fire up the FastShop aug you’ve subscribed to for years and suddenly there are garish neon signs floating in the air for your three targets. It’s your call how you get there, if you want to head by the cake counter go ahead. But Costco is definitely less painful this way than it is now.
Graffiti augs · What people seem to like best on the Net is conversation; mostly now breathless little swirls of prose not designed to hold meaning for more than minutes. So you can chat-paint on fences or rocks or dogs. And have them dissolve in seconds or, for a particular message that matters, be there permanently but only for the person you sent it to.
Map augs · This is just what your current map app does, only the turn-here arrows and destination ETA and so on are in front of you. In fact, there are cars with heads-up floating map displays; I’d call that (very basic) real AR.
And, like in today’s online maps, there is plenty of space for ads and reviews that could float up from anything you look at.
Which brings me to…
Advertising augs · Offer the advertising profession an opportunity to paste their stuff on every freaking surface and they’ll go nuts; it’s the Holy Grail. They have a problem in that nobody wants to see ads. I don’t know how they solve that problem; that’s why they earn the big bucks.
Game augs · This is easy to predict because it already exists. Niantic has led the charge with Ingress and Pokemon Go and I just got email advertising a new offering, “Peridot”. Niantic offers very, very weak AR, but it is AR. And the games attract millions and apparently made Niantic a lot of money, even in their current state.
One of the reasons that worked is that you didn’t have to wear any headgear, you could just stroll around and play on your phone, although I used to play on my Nexus Seven, which looked maybe a little weird but gave me quite an advantage.
War augs · Well, obviously. But I’ve never been in a war so that’s all I’ll say.
A weaker form of war is cops vs protesters. I will be highly unsurprised if the social conflict between climate-justice activists and uniformed petro-state defenders leads to some pretty gruesome street confrontations before our society sobers up and starts attempting to save us, not burn us.
In that kind of situation I’d sure like annotations about where surveillance cams are and which way the cops’ tactical vehicle is heading.
Aug hardware · For some of these scenarios, a phone-sized device would work just fine. I used to own one! It was a pre-release beta of what became the Tango platform. It was an Android, only two or three times as thick, with LIDAR and IR and an ultra-wide-angle camera. You could run Java code on it that did primitive AR; filling your space with spheres, I seem to recall, or piloting a little virtual drone around the room.
It was a standard Android-style Java API. I got some basic code running and was excited as hell. But then Google dumped me because I wouldn’t move to Mountain View.
I mean, the hardware sucked. It ran slow and hot and the software was flaky as hell. Still felt like the future.
I have a feeling that for serious augging — like going to an art show or a nature tour — the ideal form factor would be, essentially, an iPad with a handle. So it’d be easy to hold up and look through.
Software problems · Let’s agree that the hardware you’d need to support a real AR experience is starting to flicker into existence, because the Vision Pro has most of it. Let’s assume that that hardware can eventually be fit into an iPad-with-handle form factor. (Not soon I think, but eventually.) So, what should software dweebs like me be thinking about?
I think discoverability is an interesting problem. Let’s assume that at any reasonably popular location, there are going to be lots of augs around, more than anyone could look at, and they’d get in each other’s way. So you’d need augging filters that are location- and tag-sensitive. Would that be enough?
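To make the filter idea concrete, here’s a minimal sketch of what a location- and tag-sensitive filter could look like. Everything in it is hypothetical: the `Aug` record, the `filter_augs` function, the tag vocabulary, and the coordinates are made up for illustration. It doesn’t settle whether filtering alone is enough; ranking whatever survives the filter is a separate problem.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Aug:
    # Hypothetical record: a name, where the aug lives, and free-form tags.
    name: str
    lat: float
    lon: float
    tags: set = field(default_factory=set)


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres (haversine formula)."""
    r = 6_371_000  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def filter_augs(augs, lat, lon, radius_m, wanted=frozenset(), blocked=frozenset()):
    """Keep augs within radius_m that have no blocked tag and, if any
    wanted tags are given, at least one of them. Nearest first."""
    hits = []
    for aug in augs:
        d = distance_m(lat, lon, aug.lat, aug.lon)
        if d > radius_m:
            continue
        if aug.tags & blocked:
            continue
        if wanted and not (aug.tags & wanted):
            continue
        hits.append((d, aug))
    hits.sort(key=lambda pair: pair[0])
    return [aug for _, aug in hits]


# Made-up augs at made-up coordinates.
augs = [
    Aug("Natty Trunks", 49.3000, -123.1400, {"art", "music"}),
    Aug("Skyline Cart Lady", 49.3005, -123.1400, {"art", "trending"}),
    Aug("Billboard Blast", 49.3001, -123.1401, {"ads"}),
    Aug("Faraway Mural", 49.4000, -123.1400, {"art"}),
]
nearby = filter_augs(augs, 49.3000, -123.1400, radius_m=200,
                     wanted={"art"}, blocked={"ads"})
print([a.name for a in nearby])
```

A linear scan like this is fine for a handful of augs; a popular location with thousands of them would presumably want a spatial index on the server side.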
Then there are the actual payloads. Obviously you’d need some real horsepower in your device, and so it seems like the logical thing would be to ship augs as scene graphs? But how do you express the relation to the local geometry? I’m totally out of touch as to whether there are useful standards in that space. I guess looking at the Apple visionOS APIs would be useful.
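One possible shape for an answer, offered purely as a sketch: pin the root of the scene graph to a real-world anchor point, and express every node as an offset in metres from its parent. The `Node` and `AnchoredAug` types below are hypothetical, not drawn from any existing standard or API, and the transforms are translation-only to keep it short; a real scene graph would carry rotation and scale as well.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    offset: tuple = (0.0, 0.0, 0.0)  # metres east, north, up from the parent
    children: list = field(default_factory=list)


@dataclass
class AnchoredAug:
    # Hypothetical anchor: a real-world point the device can localize against.
    lat: float
    lon: float
    alt_m: float
    root: Node


def flatten(node, parent_pos=(0.0, 0.0, 0.0)):
    """Yield (name, position) pairs, positions in metres relative to the anchor."""
    pos = tuple(p + o for p, o in zip(parent_pos, node.offset))
    yield node.name, pos
    for child in node.children:
        yield from flatten(child, pos)


# A made-up tree aug: a glow ring near the base, a snake up in the canopy.
tree = Node("trunk", (0.0, 0.0, 0.0), [
    Node("glow_ring", (0.0, 0.0, 0.2)),
    Node("canopy", (0.0, 0.0, 8.0), [Node("snake", (1.5, -0.5, 0.5))]),
])
aug = AnchoredAug(49.3, -123.14, 12.0, tree)
for name, pos in flatten(aug.root):
    print(name, pos)
```

The hard part this skips is the anchor itself: the device has to work out where that point sits in its own view of the world, accurately enough that the glow ring lands on the tree and not three metres to the left.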
AR is gonna be huge, I’m sure of it. But you won’t need to wear anything that covers your face.