Apple’s Persona technology uses Gaussian splatting to create 3D facial scans
by dmarcos
CorridorDigital recently used the tech to assist in remaking the rooftop bullet-time scene from The Matrix. It's used for making the environment instead of modeling it from scratch.
I gotta say, these new Personas are good.
The previous beta ones were terrifying Frankenstein monsters. The new ones fooled my boss for 30 minutes.
There's a bit of uncanny valley left, nevertheless. My persona's smile reminds me of the horrible expressions people like to make in Source Filmmaker.
For those who have had Persona conversations, how does varying audio latency affect immersion? Is there a recommended chat service?
I don't use it very frequently, but from the few times I did, I can't recall any perceptible lag via Apple iMessage.
Tested said something similar about Personas. https://youtu.be/LzZ2j9CAcww?si=IRvxNaNZeBQp7WLV
I'm usually a fan of Norm's videos, but this might be the first time I've seen a Tested video that felt more like paid-promotion than an actual unbiased review. I don't keep up with it though.
How's the latency? Latency is what makes Zoom et al. painful for me now - it ruins the ability to politely interject, give confirmation, etc. Does Apple do a better job of this than Google/Zoom? In theory you could get 20-30ms (just spitballing numbers I used to get playing shooters!) but I've never got anywhere near that with video conferencing.
Even so, latency-in-zoom kind of becomes an attribute of the medium and you learn to adapt. How does it feel with the Vision Pro though? The article talks about a really convincing sense of being in the same place with someone - how does latency affect that? (And does it differ based on if you're all physically in Silicon Valley or not?)
> latency-in-zoom kind of becomes an attribute of the medium and you learn to adapt.
To some degree but not fully. When you adapt your brain is still doing extra work to compensate, similarly to how you don’t «hear» jet engine noise after acclimating to an airplane but it will still tire you to some degree.
I had Zoom and Teams meetings daily during Covid, and personal FaceTime calls almost daily for a while. I still get «Zoom fatigue» if a call goes on for over an hour and I need to talk face to face during the call (i.e. no screen sharing, can’t disable video and look at something else, etc.). I’m fine if I look at people’s screen sharing rather than their faces.
It’s amazing tech, it’s just a solution looking for a problem.
It feels a bit like the original Segway’s over-engineered solution versus cheap Chinese hoverboards, then the scooters and e-bikes that took over afterwards.
Why would I be paying all this money for this realistic telepresence when my shitbox HP laptop from Walmart has a perfectly serviceable webcam?
I used my VP extensively recently when working remotely. It's not glamorous, but I used Screen Sharing with a MacBook, which gives you a virtual ultrawide monitor.
Once you're already in VR, it's nice to not have to break out for a meeting, and that's where Personas fit in.
It's not a killer app carrying the product, it's a necessary feature making sure there's not a gap in workflow.
Ah, right! Because you can’t videoconference with the headset on.
Thank you! Now I get it!
So it’s sort of a stopgap solution before AR glasses are small enough to do actual video calls without looking silly?
Why do we have video call meetings when people mostly just listen and the information is carried via audio?
Why do we have 4K monitors when 1920x1080 is perfectly fine for 99.999% of use cases?
If you look at the world through this lens called "serviceability" you'll think everything is a solution looking for a problem.
> when 1920x1080 is perfectly fine for 99.999% of use cases
A lot of people here work with text all day every day and we would rather work with text that looks like it came out of a laser printer than out of a fax machine.
Unless you are using a tiny 4K monitor (<9"), it's not going to be laser-print quality.
The comment you're replying to made use of a simile, which is a figure of speech using "like" or "as" that constructs a non-literal comparison for rhetorical effect.
Of all places, HN should not be the one to casually conflate resolution and DPI!
The comments silently imply that they are talking about the same screen size, so 1920x1200 vs 4K is indeed a conversation about DPI.
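To put rough numbers on the resolution vs. DPI point, here is a minimal sketch that computes pixel density from resolution and diagonal size. The function name `pixels_per_inch` and the specific size pairings are illustrative assumptions, not figures anyone in the thread measured; the formula itself is just the diagonal in pixels divided by the diagonal in inches.

```python
import math

def pixels_per_inch(width_px: int, height_px: int, diagonal_in: float) -> float:
    """Pixel density: diagonal length in pixels divided by diagonal in inches."""
    return math.hypot(width_px, height_px) / diagonal_in

# Illustrative pairings (assumed for the example, loosely based on sizes in the thread).
examples = [
    ("24in 1080p", 1920, 1080, 24.0),
    ("24in 4K", 3840, 2160, 24.0),
    ("28in 1440p", 2560, 1440, 28.0),
    ("9in 4K", 3840, 2160, 9.0),
]

for label, w, h, d in examples:
    print(f"{label}: {pixels_per_inch(w, h, d):.0f} PPI")
```

At a fixed screen size, going from 1080p to 4K doubles the pixel density, which is why the two terms get used interchangeably when everyone is picturing the same monitor.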
I read their comment in the exact opposite way, and that your comment is exactly their point.
But who's going to use such a tiny display that would make 1080p look good?
A 24" 1080p monitor is perfectly fine for working with text of any kind. I still use mine at home, even after a decade.
As others said, resolution is not everything. DPI and panel quality matter a lot.
A good lower-resolution panel is better than a lower-quality larger panel. Uniformity, backlight color, color rendering quality, DPI... all of them matter.
--
This comment has been written on a 28" 1440p monitor.
My theory is that people complaining about text on low-resolution displays are using Macs. Apple has seriously gimped the text rendering on low-DPI displays, essentially just downscaling a high-resolution render of the screen rather than doing proper resolution-aware text hinting.
For some reason people then blame their old displays rather than Apple for this.
Makes sense.
I sometimes connect the same 24" monitor (an ASUS VZ249Q) to my M1 MacBook via USB to DP (so no intermediate electronics), and the display quality feels inferior to KDE, for example.
Same monitor allows for unlimited working for hours without eye fatigue when driven from my Linux machine. I have written countless lines of code and LaTeX documents on that panel. It rivals the comfort of my HP EliteDisplay.
"A lot of people are in meetings all day, and we would rather look at something that looks like we're there in person than at a limited webcam."
This depends a lot on whether you really want to be in these meetings, and what you're supposed to do in them.
The first part is obvious, for the second part if you're looking at slides and docs during the whole meeting, getting a super high fidelity view of all the other participants also looking (probably) at the slides doesn't help in any way.
I mean, Google Meet has a spotlight view exactly for this reason.
"We have this amazing revolutionary tech, and the only thing we can think of is sitting in meetings all day, working with Excel sheets, and answering emails"
I actually think about this a lot, and I could argue both sides of this. On the one hand, you could look at your list of examples as obvious examples of modern innovation/improvement that enrich our lives. On the other, you could take it as a facetious list that proves the point of GP, as one other commenter apparently already has.
I often think how stupid video call meetings are. Teams video calls are one of the few things that make every computer I own, including my M1 MBP, run the fans at full tilt. I've had my phone give me overheat warnings from showing the tile board of bored faces staring blankly at me. And yeah, honestly, it feels like a solution looking for a problem. I understand that it's not, and that some people are obsessed for various reasons (some more legitimate than others) with recreating the conference room vibe, but still.
And with monitors? This is a far more "spicy" take, but I think 1280x1024 is actually fine. Even 1024x768. Now, I have a 4K monitor at home, so don't get me wrong: I like my high DPI monitor.
But I think past 1024x768, the actual productivity gains from higher resolutions begin to rapidly dwindle. 1920x1080, especially in "small" displays (under 20 inches) can look pretty visually stunning. 4K is definitely nicer, but do we really need it?
I'm not trying to get existential with this, because what do we really "need"? But I think that, objectively, computing is divided into two very broad eras. The first era, ending around the mid 2000s, was marked by year-after-year innovation where 2-4 years brought new features that solved _real problems_, as in, features that gave users new qualitative capabilities. Think 24-bit color vs 8-bit color, or 64-bit vs 32-bit (or even 32-bit vs 16-bit). Having a webcam. Having 5+ hours of battery life on a laptop, with a real backlit AMLCD display. Having more than a few gigabytes of internal storage. Having a generic peripheral bus (USB/firewire). Having PCM audio. Having 3D hardware acceleration...
I'm not prepared to vigorously defend this thesis ;-) but it seems at about 2005-ish, the PC space had reached most of these "core qualitative features". After that, everything became better and faster, quantitatively superior versions of the same thing.
And sometimes yeah, it can feel both like it's all gone to waste on ludicrously inefficient software (Teams...), and sometimes, like modern computing did become a solution in search of a problem, in order to keep selling new hardware and software.
> But I think past 1024x768, the actual productivity gains from higher resolutions begin to rapidly dwindle.
Idk man, I do like seeing multiple windows at once. Browser, terminal, ...
My only counter point to your resolution argument is that 1440p is where I’m happy because of 2 words: real estate. Also 120hz for sure. Above that meh.
I edit video for a tech startup. High high high volume. I need 2-3 27+” 1440p screens to really feel like I’ve got the desktop layout I need. I’m running an NLE (which ideally has 2 monitors on its own but I can live on 1), Slack, several browser windows with HubSpot and Trello et al., system monitoring, maybe a DAW or Audacity, several drives/file windows open, a text editor for note taking, a PDF/email window with notes for an edit, terminal, the list goes on.
At home I can’t live without my 3440x1440p WS + 1440p second monitor for gaming and discord + whatever else I’ve got going. It’s ridiculous but one monitor, especially 1080p, is so confining. I had this wonderful 900p gateway I held on to until about 2 years ago. It was basically a tv screen, which was nice but just became unnecessary once I got yet another free 1080p IPS monitor from someone doing spring cleaning. I couldn’t go back. It was so cramped!
This is a bit extreme: but our livestream computer is 3 monitors plus a 4th technically: a 70” TV we use for multiview out of OBS.
I need space lol
yes and to a degree which i find particularly interesting. its never going to happen because of your example
i prefer working in my vp and see a possible world where vp makes my remote team collaborate as if we were in the office, from the comfort of the most ergonomic location in my house
it solves this problem and 0.0001% of people are dorks like me who try and say, "they did it" while the rest of the world keeps going to work as before
all of the tech problems were solvable. people simply dont want to put a thing on their face and i think thats unsolvable
I live half way across the world from my folks so I don’t see them often. I’d love something that gives me a greater sense of presence than a video call can give.
I would not describe creating an experience that feels like you are in the room with a group of people, even allowing cross talk, as a solution looking for a problem. I think it's the thing everyone slowly dying on Zoom calls wishes they could have.
Oh no, they wish to have fewer useless meetings.
I disagree. Many of us don't use a headset regularly or carry it with us like a phone or laptop; it is an express inconvenience to use, with only marginal benefits. Businesses won't want one if webcams still do the trick, and users might respond positively but are always priced-out of owning one.
If I'm doing work at my desk and I get a Zoom call, there is a 0.00% chance I will go plug in my Vision Pro to answer it. I'm just going to open the app and turn on my webcam, spatial audio be damned.
"Now out of beta"??
Just in time for Vision Pro to go big. Right?
This video might help explain 3D Gaussian splatting. https://www.youtube.com/watch?v=wKgMxrWcW1s Essentially, an entirely new graphics pipeline with different fundamental techniques which allow for high performance and fidelity compared to... what we did before(?) Cool.
Not quite, it’s just a way to assign a color value to a point in space (think point clouds) based on photogrammetry. It’s voxels on steroids but still is drawn using the same techniques. It’s the magic of creating the splats that’s interesting.
A color value for each point is a good starting place to gain an intuition. Some readers might be interested to know that the color is not constant for each point, but instead dependent on viewing angle. That is part of what allows splats to look realistic. Real objects have some degree of specularity which makes them take on slightly different shades as you move your head.
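For anyone who wants to see what "color depends on viewing angle" can look like in code, here is a minimal sketch. It assumes the color is stored as degree-1 spherical-harmonic coefficients, which is one common way splat implementations encode view dependence; nothing in this thread says that is what Apple does. The function name `view_dependent_color`, the coefficient layout, and the sign convention are all assumptions made for illustration.

```python
import math

# Real spherical-harmonic constants for band 0 and band 1.
SH_C0 = 0.28209479177387814  # 1 / (2 * sqrt(pi))
SH_C1 = 0.4886025119029199   # sqrt(3) / (2 * sqrt(pi))

def view_dependent_color(sh, view_dir):
    """Evaluate a degree-1 SH color for one splat (illustrative layout).

    sh: four (r, g, b) triples: one constant term, then three linear terms.
    view_dir: (x, y, z) direction from camera to splat; normalized here.
    """
    x, y, z = view_dir
    norm = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / norm, y / norm, z / norm
    color = []
    for ch in range(3):
        value = SH_C0 * sh[0][ch]  # view-independent base color
        # Linear terms shift the color as the viewing direction changes.
        value += SH_C1 * (-y * sh[1][ch] + z * sh[2][ch] - x * sh[3][ch])
        color.append(min(1.0, max(0.0, value + 0.5)))  # offset and clamp to [0, 1]
    return tuple(color)

# Hypothetical splat: a base color plus one linear term along z.
splat_sh = [(0.5, 0.2, 0.1), (0.0, 0.0, 0.0), (0.3, 0.3, 0.3), (0.0, 0.0, 0.0)]
front = view_dependent_color(splat_sh, (0.0, 0.0, 1.0))
side = view_dependent_color(splat_sh, (1.0, 0.0, 0.0))
print(front, side)
```

The same splat comes out brighter when viewed along z than from the side, which is the "slightly different shades as you move your head" effect in miniature.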
And since we normally see with binocular vision, a stereoscopic view adds another layer of realism you wouldn't otherwise perceive. Each eye sees subsurface scattering slightly differently, and your brain integrates the two views.
That video didn’t explain what Gaussian splatting is at all, but I did get a minute ad read for some cloud GPU service.
https://packet39.com/blog/a-primer-on-gaussian-splats/ is much better (don't load on mobile though, lots of data).
The same graphics pipeline is used: rasterization.
Rasterization is a very general term. There is a big difference in practice between the traditional rasterization pipeline and splat rasterizers
it's kinda like saying "we still show pixels". true but almost totally useless for understanding anything.
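To make the difference from triangle rasterization a bit more concrete, here is a rough sketch of the per-pixel step a splat rasterizer performs: sort the splats by depth and alpha-blend their Gaussian footprints front to back. This is a simplification under stated assumptions: the footprints here are isotropic (real splats are generally anisotropic, with a full covariance), and the names `gaussian_weight`, `shade_pixel`, and the dict keys are made up for the example.

```python
import math

def gaussian_weight(px, py, cx, cy, sigma):
    """Falloff of an isotropic 2D Gaussian footprint centered at (cx, cy)."""
    d2 = (px - cx) ** 2 + (py - cy) ** 2
    return math.exp(-0.5 * d2 / (sigma * sigma))

def shade_pixel(px, py, splats):
    """Blend splats front to back at a single pixel.

    splats: list of dicts with keys depth, center, sigma, opacity, color.
    """
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light not yet absorbed by nearer splats
    for s in sorted(splats, key=lambda s: s["depth"]):  # nearest first
        cx, cy = s["center"]
        alpha = s["opacity"] * gaussian_weight(px, py, cx, cy, s["sigma"])
        for ch in range(3):
            color[ch] += transmittance * alpha * s["color"][ch]
        transmittance *= 1.0 - alpha
        if transmittance < 1e-4:  # pixel is effectively opaque, stop early
            break
    return tuple(color)

# Two hypothetical splats overlapping at the origin.
red = {"depth": 1.0, "center": (0.0, 0.0), "sigma": 1.0,
       "opacity": 0.5, "color": (1.0, 0.0, 0.0)}
blue = {"depth": 2.0, "center": (0.0, 0.0), "sigma": 1.0,
        "opacity": 1.0, "color": (0.0, 0.0, 1.0)}
print(shade_pixel(0.0, 0.0, [blue, red]))
```

There are no triangles, depth buffer, or hard edges here; every pixel is a weighted blend of soft blobs, which is where the practical difference from the traditional pipeline shows up.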
Sorry but this is a horrible video. The guy just spews superlatives in an annoying voice until 4:30 (of a 6 minute video mind you), when he finally gives a 10 second "explanation" of Gaussian splatting, which doesn't really explain anything, then jumps to a sponsored ad.
yeah... their older videos are a bit more useful from what I remember (more time spent on the research paper content, etc), but they've become so content-free that I just block the channel outright nowadays. it's the "this changes everything (every time, every day)" hype-channel for graphics.