Apple’s Vision: Impossible?
022. On mixed reality, AI & ML
Hey, friends—welcome back to online-offline, a newsletter about technology, culture, and the future. Special welcome to new subscribers! I’m glad you’re here. This week, thoughts in progress about the role AI/ML will play in a mixed reality. With that said, online-offline is meant to be a conversation, so do also reply with your thoughts/questions. And if you haven’t already, consider supporting my work by subscribing below.
This week Apple announced the Apple Vision Pro, a VR headset with AR elements that’s touted as the future of spatial computing. The device looks like a pair of goggles strapped on your head with a cord attached for power; the battery lasts about two hours.
The Vision Pro is a technological feat and probably a paradigm shift. 23 million pixels across two displays may accurately digitize the intricate details in real life. But I’m more impressed by visionOS, the new spatial OS that tracks your eyes and hands, and the new Apple R1 chip that enables perhaps the lowest latency we’ve seen in this kind of device. And, building on the spatial audio experience that Apple introduced with AirPods, the Vision is personalized to the dimensions of your ears and uses ray tracing to adapt sound to your room’s physical materials and acoustics.
It’s all, to put it mildly … really cool shit. These specs are the most interesting thing about this launch and may be the catalyst for more useful products in the future. But when it comes down to it, Vision Pro is a solitary device that costs $3,499 (minimum!) and isn’t that useful to the average consumer.
The anti-social social club
iPhones are a vessel for connectivity. Apps like Google Maps, WhatsApp, and Snap shattered time and space on a global scale in the first five years after the iPhone launch. A decade later, you can use your iPhone to explore these apps while on a walk, when streaming Netflix, or quickly in between meetings. Even better, you don’t have to look at the screen if you have AirPods or an Apple Watch. These products are certainly distractions, but they integrate seamlessly into your life and leave room for serendipitous eye contact with strangers and shared IRL experiences.
In contrast to the iPhone, using a Vision seems like living in a large computer, occasionally opting into real life. Imagine missing your child’s first steps because you were on a 3D FaceTime call. Or being stuck in your 500 square foot apartment moving floating objects on a blank wall for 8 hours daily. Or lying on the couch with your partner who’s upgraded from the PS5 to Vision and can’t see your face or read your body language. The device seems incredibly isolating and only appropriate for niche use cases. But these niche uses may enable the tide to turn; historically, virtual and mixed reality headsets have had high churn and low utility, and have caused motion sickness. The optimistic take is that today Vision Pro is a replacement for the television and computer, but in a decade it may replace many of our gadgets, including the iPhone, just as smartphones displaced the computer over time, partly due to their capabilities but mostly because they were mobile.
I think it’s a mistake to discuss mainstream adoption of the Vision Pro. It’s called a PRO device for a reason: it’s a developer’s playground (for Unity at least, TBD on Unreal) and an extension of Apple R&D to explore the horizons of a mixed reality. Perhaps Apple will develop a portable Vision akin to the MacBook Air. This will only matter if it’s 1) cheaper, 2) more mobile in form factor, and 3) multi-player. Without #1 and #3, it only makes loneliness in the US worse; without #1, it has low international appeal; and, without #2, it has fewer everyday and enterprise applications.
AI & ML, and the next era of computing
I’m more interested in the potential for augmented (or even mixed) reality & AI/ML. The use of computer vision and “advanced machine intelligence” in the Apple Vision is impressive for their first VR device—you can select things with your eyes and then flick and pinch to open and interact with spatial objects. Best of all, the super advanced ML allows you to create an avatar in your likeness that reflects face and hand movements in real time.
It makes you wonder how we’ll advance in the next decade, especially when everyone seems to be playing the same game differently. Facebook leaked Zuckerberg’s thoughts on Apple’s launch yesterday. In summary, Meta wants to create social and accessible mixed reality products.
When people visit the VR graveyard, they often remark on Google Glass. Personally, I always said I wanted to live in Snap Maps. Hear me out! What if my Spectacles not only captured special moments in my life, but also improved my everyday life? If I’m walking around the city, I can get personalized ads when I near my favorite store, get alerts when a close friend is nearby, identify objects, set reminders … There was something really special about Snap’s app and how it could interplay with future versions of the Spectacles. The social and commercial use cases were endless: in an era of App Tracking Transparency (Apple strikes again), a localized and immersive experience leads to first-party data for SMBs. Obviously that dream is dead now. Thanks, Spiegel.
But I’d be remiss if I didn’t mention Humane, a company I admired long before I joined Kindred Ventures (disclaimer: Kindred is an investor in Humane). I can now point to their TED video, which showcases their device and software. It’s contextual computing, natural in the way it accesses and adds to a universe of data, media, and applications. It’s also spatial computing. And it’s much more that the team will share in due time.
Each company (Apple, Meta, Humane, and others I didn’t mention) has a different vision and approach for the next era of computing. We should be excited about that! Who knows what kind of innovation the competition will beget? I fully expect to have a gallery wall of spatial generative video art that I bought as NFTs by the end of the decade, and you should too.