
PS5 Pro/PSSR Appears to Provide Better Image Reconstruction than DLSS Running on a 4090 GPU

Zathalus

Member
But that 500 TOPS figure is theoretical in terms of game use, AFAIK. The minute the card also needs to do game rendering and RT, that TOPS number becomes far, far smaller and less efficient - that was how I understood the split of SM resources and the slower bus setup for RT - whereas the Ragnarok AI/ML solution looks completely asynchronously integrated with gaming workloads on RDNA.
Yes, the TOPS figures for all Nvidia cards are theoretical best cases, as regular rendering and RT could impact how much shared resources the card has available to dedicate to the Tensor cores. The regular SM, RT, and Tensor cores are all still separate of course. How efficient the end result is depends on the managing of shared resources, and how well general workloads (which can be done asynchronously) are issued to the Tensor and general CUDA cores.

In the case of RDNA4 the shared resource model still applies and WMMA commands for TOPS are issued on the SIMD in the compute units of the GPU. Hence the TOPS figure is theoretical best case as well.

That being said, the amount of TOPS that both Nvidia and AMD are delivering with Ampere, Lovelace, and RDNA4 is a bit overkill for ML upscaling. So 300 vs 500 TOPS is a bit academic.
It would be exactly double, so 476 TOPS using sparsity, which was only introduced with Ampere. Does the PS5 Pro also use sparsity?
Unknown, but the LLVM changes for RDNA4 indicate sparsity support:


Including sparsity allows the Pro to hit 289 TOPS at the max leaked clock speed of 2350MHz. So it fits.
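For reference, here's the back-of-the-envelope math behind that figure (a rough sketch assuming 60 active CUs, the leaked 2350MHz clock, 1024 dense INT8 ops per CU per clock, and a 2x sparsity multiplier - none of which is officially confirmed):

```python
# Rough sketch of where a ~289 TOPS figure could come from.
# Every input here is a leaked or assumed value, not a confirmed spec.
compute_units = 60            # assumed active CUs on the Pro
clock_hz = 2.35e9             # leaked max clock, 2350 MHz
int8_ops_per_cu_clock = 1024  # assumed dense INT8 matrix ops per CU per clock

dense_tops = compute_units * clock_hz * int8_ops_per_cu_clock / 1e12
sparse_tops = dense_tops * 2  # 2:4 structured sparsity doubles the paper peak

print(f"dense:  {dense_tops:.0f} TOPS")   # ~144 TOPS
print(f"sparse: {sparse_tops:.0f} TOPS")  # ~289 TOPS
```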
 

Bojji

Member
Leave PCMR alone, they're busy believing the 5XXX series will bring great performance for the money to show up Sony.

I bet Blackwell will be overpriced and underspecced (outside the 5090); that's what all logically thinking people (not NV fanboys) expect after the Ada launch. Lower prices would be a nice surprise.

That still leaves AMD; maybe their GPUs will have a good price/performance ratio? The used market is also very healthy - there are lots of GPUs better than the Pro, some released as far back as 2020.

VRR does not help alleviate frame time spikes. It only eliminates screen tearing which reduces the perceptual window where inconsistent frametimes become bothersome. A steady 60fps is more pleasant than constantly bouncing between 80-120 fps, and a steady 120fps is better than a steady 60fps.

iB9VSAL.jpeg
K6Gp8Ve.jpeg




Fucked up frametimes usually come from CPU-limited scenarios; if a game is GPU-limited it will be smooth. Most Sony games have a mode like that and it's the best way to play them.

Taking a closer look at this comparison shot, the PS5 Pro version has an additional ray-traced shadow to the right of the frame (green arrow), as well as additional foliage in the foreground (yellow arrow). The anti-aliasing on the 4090 (purple arrows) is also of lower quality compared to the Pro.

1oU9CFJ.jpeg

Shadows are dynamic in this scene; the shadow appears later in the PC version, but it's very soft thanks to ray tracing (PS5 Pro doesn't have RT shadows):

apNZfQc.jpeg
WTtxOos.jpeg


You also have differences like here, where there are a lot of moving objects between the player and the sun in the sky:

TeVnss2.jpeg
 
Last edited:

ap_puff

Member
Yeah, so exactly like I said? However, if you try that without VRR, you will get screen tearing and stutters.

No, this is completely false. PlayStation has many games with unlocked fps that often bounce around between 70-90, and they are often described as the best-performing and best-feeling modes. What year is this, 2008? You have people on this very site playing at high frame rates in tons of games and almost none of them lock the fps to 60. They let it go above without necessarily hitting 120. Not hitting 120fps does NOT mean you will get massive frame time spikes that will result in terrible stutters and a bad experience. As I said, bad frame times will happen regardless of whether or not a given cap is consistently hit. Jedi Survivor and Bloodborne are prime examples of this.
Completely incorrect. VRR does not stop stuttering. Have you ever played a game in your life? It is completely obvious when the framerate tanks even with VRR enabled. Let's take Helldivers for example: it can often drop the framerate by dozens of fps from frame to frame when things go from calm to having dozens of effects on screen, especially smoke and explosions. You can perceive stuttering quite easily with VRR enabled even when the framerate is within "smooth" territory. Also, stutters such as shader compilation stutter are never fixed by VRR.

Second, those unlocked games do not have rapid frametime spikes; congratulations on defeating a strawman.
 

winjer

Member
Are we still pretending that VRR fixes stuttering and frame pacing issues?
VRR is only supposed to maintain a sync between the delivered frames and the frames the monitor shows.
It avoids screen tearing and has lower latency than traditional v-sync. But that's it.
 

ap_puff

Member
Are we still pretending that VRR fixes stuttering and frame pacing issues?
VRR is only supposed to maintain a sync between the delivered frames and the frames the monitor shows.
It avoids screen tearing and has lower latency than traditional v-sync. But that's it.
yeah idk it feels like these dudes are in here trying to reenact digital foundry circa 2021 and making VRR out to be some sort of black magic. It's nice, not perfect.
 

winjer

Member
yeah idk it feels like these dudes are in here trying to reenact digital foundry circa 2021 and making VRR out to be some sort of black magic. It's nice, not perfect.

VRR is great to avoid screen tearing and that alone makes games feel smoother.
And because the GPU is not waiting on the monitor, it feels as snappy as un-synced delivery.
So it is an amazing tech, one of the most important in the last decade.
But like you say, it's not magic. And it won't fix performance issues with a game, such as stuttering or inconsistent frame pacing.
I'm not sure where people got this idea that VRR fixes everything.
 

Gaiff

SBI’s Resident Gaslighter
completely incorrect. VRR does not stop stuttering.
It does not help with stuttering brought about by massive frame-time spikes. It does, however, help with judder brought about by persisting or late frames, as John duly highlights here:

However, there's a catch - while you can enable this mode on any 120hz capable display, if you cannot utilize VRR, I would strongly suggest sticking with 60fps instead due to judder
And Alex also says the same in his video.
Have you ever played a game in your life? It is completely obvious when the framerate tanks even with VRR enabled. Let's take Helldivers for example: it can often drop the framerate by dozens of fps from frame to frame when things go from calm to having dozens of effects on screen, especially smoke and explosions. You can perceive stuttering quite easily with VRR enabled even when the framerate is within "smooth" territory. Also, stutters such as shader compilation stutter are never fixed by VRR.
It's much less obvious when the fps drops from 120 to 80 than from 80 to 40 or 60 to 40. Shader compilation stutters have nothing to do with this discussion, so not sure why you're bringing them up. Point is, you're completely incorrect that it's preferable to play at a locked 60 rather than unlock your frame rate and let VRR do its thing. Almost no one on PC locks their monitor or games to 60 or 120. Your counter-argument only applies when there are huge shifts in frame times/rates such as instantly going from 120 to 70fps, but how often does that even happen? Most of the time when gaming, your frame rates will remain within a certain window and gradually increase or decrease as you enter heavier scenes or the action ramps up.
2nd, those unlocked games do not have rapid frametime spikes, congratulations on defeating a strawman.
Huh, maybe you should have read the argument prior to butting in and saying a bunch of incorrect information? The discussion started because a poster claimed that the PS5 Pro could potentially "run" games better than a $2500 PC equipped with a 4090. I said that this wasn't going to happen. Another poster then said that it might very well be the case with Rift Apart and I answered that it wasn't because it's already much better on PC and runs at a higher frame rate. The poster then replied that going above 60 in Rift Apart isn't good because it increases frame-pacing issues, so it would be tied with 60fps in most cases, and that's blatantly wrong. Rift Apart doesn't have frame pacing issues going above 60fps at all and you don't get massive fluctuations that push your fps to 120 one second and then down to 65 the next.

The preferred way on PC is to set an fps cap a few frames below the monitor's max refresh rate so your frame rate never goes beyond the VRR window. After that, it's perfectly fine to let your game go above 60 without hitting 120 consistently. Everyone with a high refresh-rate monitor does that. Almost no one who cannot hit 120 most of the time will cap themselves to 60. This is nonsense.
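Just as an illustration of that rule of thumb (the usual guidance is to cap a handful of fps below the panel's maximum refresh so frame delivery never leaves the VRR window; the 3fps margin below is the commonly cited guideline, not a hard spec):

```python
# Common rule of thumb: cap a few fps below max refresh so the frame rate
# never exceeds the VRR window and v-sync/tearing never kicks in.
def suggested_cap(max_refresh_hz: int, margin_fps: int = 3) -> int:
    return max_refresh_hz - margin_fps

for hz in (120, 144, 165):
    print(f"{hz} Hz panel -> cap around {suggested_cap(hz)} fps")
```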

Are we still pretending that VRR fixes stuttering and frame pacing issues?
VRR is only supposed to maintain a sync between the delivered frames and the frames the monitor shows.
It avoids screen tearing and has lower latency than traditional v-sync. But that's it.
Yes, and that makes it perfectly fine to play above 60fps without hitting 120 consistently, which is what was being discussed. Before VRR, you'd get screen tearing and judder.
yeah idk it feels like these dudes are in here trying to reenact digital foundry circa 2021 and making VRR out to be some sort of black magic. It's nice, not perfect.
Nobody said it was and I clearly highlighted its limitations. You, however, agreed with the poster who said that it was better to lock your fps to 60 if you cannot hit 120 consistently, which is false.
 
Last edited:

PaintTinJr

Member
Yes, the TOPS figures for all Nvidia cards are theoretical best cases, as regular rendering and RT could impact how much shared resources the card has available to dedicate to the Tensor cores. The regular SM, RT, and Tensor cores are all still separate of course. How efficient the end result is depends on the managing of shared resources, and how well general workloads (which can be done asynchronously) are issued to the Tensor and general CUDA cores.

In the case of RDNA4 the shared resource model still applies and WMMA commands for TOPS are issued on the SIMD in the compute units of the GPU. Hence the TOPS figure is theoretical best case as well.

That being said, the amount of TOPS that both Nvidia and AMD are delivering with Ampere, Lovelace, and RDNA4 is a bit overkill for ML upscaling. So 300 vs 500 TOPS is a bit academic.

Unknown, but the LLVM changes for RDNA4 indicate sparsity support:


Including sparsity allows the Pro to hit 289 TOPS at the max leaked clock speed of 2350MHz. So it fits.
When I said theoretical, I didn't mean it could only do +80% bursts in real scenarios like RDNA; I meant it is nowhere close to its headline number for gaming. From what I read some time back, async on Nvidia is not even half as efficient in real game use cases as on AMD, so to get similar utilisation the algorithms on Nvidia need to be deep and tolerant of latency at high-bandwidth processing, which isn't gaming - unlike, say, shaving 37.5 TOPS off RDNA's full theoretical output and getting a real 20-25 TOPS.
 

winjer

Member
It does not help with stuttering brought about by massive frame-time spikes. It does, however, help with judder brought about by persisting or late frames, as John duly highlights here:

It helps only in the sense that without VRR or v-sync, with uneven frame pacing, the monitor would be showing continuous screen tearing, as the GPU delivers frames at random intervals.
But the game is still sending frames at uneven intervals, and gamers can notice this.
So VRR is not fixing the issue, it's just hiding the additional screen tearing. And screen tearing is often perceived as judder.

It's much less obvious when the fps drops from 120 to 80 than from 80 to 40 or 60 to 40. Shader compilation stutters have nothing to do with this discussion, so not sure why you're bringing them up. Point is, you're completely incorrect that it's preferable to play at a locked 60 rather than unlock your frame rate and let VRR do its thing. Almost no one on PC locks their monitor or games to 60 or 120. Your counter-argument only applies when there are huge shifts in frame times/rates such as instantly going from 120 to 70fps, but how often does that even happen? Most of the time when gaming, your frame rates will remain within a certain window and gradually increase or decrease as you enter heavier scenes or the action ramps up.

Huh, maybe you should have read the argument prior to butting in and saying a bunch of incorrect information? The discussion started because a poster claimed that the PS5 Pro could potentially "run" games better than a $2500 PC equipped with a 4090. I said that this wasn't going to happen. Another poster then said that it might very well be the case with Rift Apart and I answered that it wasn't because it's already much better on PC and runs at a higher frame rate. The poster then replied that going above 60 in Rift Apart isn't good because it increases frame-pacing issues, so it would be tied with 60fps in most cases, and that's blatantly wrong. Rift Apart doesn't have frame pacing issues going above 60fps at all and you don't get massive fluctuations that push your fps to 120 one second and then down to 65 the next.

The preferred way on PC is to set an fps cap a few frames below the monitor's max refresh rate so your frame rate never goes beyond the VRR window. After that, it's perfectly fine to let your game go above 60 without hitting 120 consistently. Everyone with a high refresh-rate monitor does that. Almost no one who cannot hit 120 most of the time will cap themselves to 60. This is nonsense.

Even with VRR, I can notice a drop from 80 to 60 fps. VRR helps by not having the screen tearing and by maintaining reasonably low latency. But it's still noticeable.
VRR will not fix performance optimization issues. And it does not fix stutters from asset streaming or shader compilation.
But it's a lot better than the previous solutions.
Without VRR, devs on consoles had two options. One is to continue to enforce v-sync. And that means that a drop below 30 or 60 fps will result in v-sync adjusting the frame rate to a divisor of the TV refresh rate.
So if a game is running at 60 fps, gets a drop to 59 fps, and the game enforces v-sync, it will drop to 30 fps to maintain sync with the TV at half the refresh rate. And this is very noticeable for the player.
The other option is to drop v-sync when the frame rate drops. And this means screen tearing. Which is also noticeable, but less jarring than enforcing v-sync.
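To put a number on that half-refresh fallback (using the standard 60Hz case purely as an illustration):

```python
import math

# With v-sync on a fixed 60 Hz display, a frame that misses the 16.67 ms
# deadline is held until the next refresh, so one slow frame occupies two
# refresh intervals - an effective drop to 30 fps.
refresh_hz = 60
refresh_interval_ms = 1000 / refresh_hz      # 16.67 ms
frame_render_ms = 17.0                       # just slightly too slow (~59 fps pace)

refreshes_used = math.ceil(frame_render_ms / refresh_interval_ms)  # 2
effective_fps = refresh_hz / refreshes_used
print(effective_fps)                         # 30.0
```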

VRR is by far the best solution. But the frame rate is still dropping. With small drops in frame rate, the player might not even notice it. Especially if the frame pacing is good.
But if it's a drop of something like 20 fps, the player will still notice it. And if it's a stutter from asset streaming or shader compilation, the player will still notice it.

So let me reiterate, VRR only fixes problems with syncing frame delivery between the GPU and the monitor.
It does not solve performance issues.
 
Last edited:

Gaiff

SBI’s Resident Gaslighter
It helps only in the sense that without VRR or v-sync, with uneven frame pacing, the monitor would be showing continuous screen tearing, as the GPU delivers frames at random intervals.
But the game is still sending frames at uneven intervals, and gamers can notice this.
So VRR is not fixing the issue, it's just hiding the additional screen tearing. And screen tearing is often perceived as judder.
No disagreement there. That's exactly what I meant earlier.
Even with VRR, I can notice a drop from 80 to 60 fps. VRR helps by not having the screen tearing and by maintaining reasonably low latency. But it's still noticeable.
Yes, you can still perceive frame rate drops, but the higher you go, the harder it is to perceive. Going from 80 to 60 is much less egregious than going from 60 to 40. I said as much months ago regarding VRR:

Yeah, am I insane or are they bullshitting? VRR is supposed to eliminate tearing caused by uneven frame delivery in relation to the monitor. It was never meant to fix frame rate drops. Every time I hear the guys at DF say, "But with VRR, it's not much of an issue," I go WTF. I feel and see the fps dropping from 60 to 50, how the fuck could I not? VRR won't stop that.
Going from 120 to 100 is even harder to notice.
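The frame-time arithmetic makes it obvious (nothing assumed here, just converting fps to ms):

```python
# The same-looking fps drop costs far less frame time at high refresh rates,
# which is why fluctuations above 60 fps are much harder to perceive.
def frametime_ms(fps: float) -> float:
    return 1000 / fps

for hi, lo in [(120, 100), (120, 80), (80, 60), (60, 40)]:
    delta = frametime_ms(lo) - frametime_ms(hi)
    print(f"{hi} -> {lo} fps: +{delta:.1f} ms per frame")

# 120 -> 100 fps: +1.7 ms per frame
# 120 -> 80 fps: +4.2 ms per frame
# 80 -> 60 fps: +4.2 ms per frame
# 60 -> 40 fps: +8.3 ms per frame
```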
VRR will not fix performance optimization issues. And it does not fix stutters from asset streaming or shader compilation.
Yes, and never claimed as such, but the context of the discussion is Rift Apart on PC and it has none of these issues.
But it's a lot better than the previous solutions.
Without VRR, devs on consoles had two options. One is to continue to enforce v-sync. And that means that a drop below 30 or 60 fps will result in v-sync adjusting the frame rate to a divisor of the TV refresh rate.
So if a game is running at 60 fps, gets a drop to 59 fps, and the game enforces v-sync, it will drop to 30 fps to maintain sync with the TV at half the refresh rate. And this is very noticeable for the player.
The other option is to drop v-sync when the frame rate drops. And this means screen tearing. Which is also noticeable, but less jarring than enforcing v-sync.

VRR is by far the best solution. But the frame rate is still dropping. With small drops in frame rate, the player might not even notice it. Especially if the frame pacing is good.
But if it's a drop of something like 20 fps, the player will still notice it. And if it's a stutter from asset streaming or shader compilation, the player will still notice it.

So let me reiterate, VRR only fixes problems with syncing frame delivery between the GPU and the monitor.
It does not solve performance issues.
Exactly, but let me give you the full context of this talk, so you can understand exactly what's being argued. I initially replied to a poster who said that we might see the Pro run games better than a 4090 by saying it wouldn't happen. PaintTinJr then referred to Rift Apart, saying it could become the definitive version on the Pro, to which I replied that Rift Apart on PC has better AF, RT effects, and much higher performance. He then proceeded to dismiss all these advantages as he always does when it comes to PC and then went the extra mile and said that 120fps in this game isn't any good because it introduces frame pacing issues, so it's effectively tied with a locked 60fps.

Another poster then joined in and said that if you cannot hit 120 consistently, then you're better off leaving your frame rate capped at 60. I disagreed with all of this and retorted that Rift Apart has a high frame rate mode on consoles that goes way above 60 (around 80-100fps), same for GOWR, and they're considered the smoothest and best-performing modes despite not hitting 120fps consistently. One of the reasons being VRR (or the main reason, really).

Another poster then said that VRR doesn't help with bad frame pacing, which I agreed with and pointed to Jedi Survivor, but bad frame pacing is an issue with the game itself, so if the game has bad frame pacing, it doesn't matter whether you're playing at 60 or 200fps, it will persist. I gave Bloodborne as an example of a game with 30fps and bad frame pacing.

tl;dr It's perfectly fine to play your games at above 60fps with a VRR display (and often better) without being able to hit 120fps consistently. If your game has bad frame pacing, shader compilation stutters, or traversal stutters, VRR won't help with those, but I never argued otherwise and the game we were talking about has none of those issues anyway. Likewise, unless the game has some quirks, unlocking your fps above 60 won't suddenly introduce massive frame time/frame pacing issues. I maintain that playing Rift Apart on a 4090 with an unlocked fps and max settings is better than letting it sit at 60fps.
 
Last edited:

Zathalus

Member
VRR is great to avoid screen tearing and that alone makes games feel smoother.
And because the GPU is not waiting on the monitor, it feels as snappy as un-synced delivery.
So it is an amazing tech, one of the most important in the last decade.
But like you say, it's not magic. And it won't fix performance issues with a game, such as stuttering or inconsistent frame pacing.
I'm not sure where people got this idea that VRR fixes everything.
As you said, VRR won’t fix bad frame pacing or engine stuttering, but it does allow for uncapped fps to feel smoother due to a lack of screen tearing. So a game that runs between 90-100 fps and has no frame pacing or stuttering issues, as well as no massive fps fluctuations, would feel better than simply locking the fps down to 60.

When I said theoretical, I wasn't meaning like it could only do +80% bursts in real scenarios like RDNA, I was meaning it is nowhere close to its number for gaming, given that async on Nvidia is not even half as efficient in real game use cases as AMD from what I read some time back, so to get similar utilisation the algorithms on Nvidia need to be deep to be tolerant of latency at high bandwidth processing which isn't gaming, unlike say shaving off 37.5 TOPs from RDNA's full theoretical output and getting a real 20-25 TOPs.
Async issues were only a problem pre-Turing; Maxwell and Pascal were terrible with async compute. The changes in Turing and especially Ampere eliminated the performance regressions with async compute, and Lovelace even shows a better performance uplift from async compute than RDNA3 in benchmarks these days. The only performance numbers for TOPS utilization in gaming that I could find were that DLSS has an execution time of roughly 1.52ms to upscale from 1080p to 4K on a 3060 Ti (https://github.com/NVIDIA/DLSS/blob/main/doc/DLSS_Programming_Guide_Release.pdf), while the leaked developer docs for the Pro state PSSR requires 2ms to upscale to 4K.
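As a rough frame-budget comparison (the 1.52ms figure is the programming guide's 3060 Ti number and the 2ms figure is from the leaked docs; this ignores that the costs scale differently across GPUs and resolutions):

```python
# Share of the frame budget an ML upscaling pass would eat at 60 and 120 fps.
upscale_cost_ms = {
    "DLSS, 3060 Ti, 1080p->4K": 1.52,  # per the DLSS programming guide
    "PSSR, ->4K (leaked docs)": 2.0,
}

for target_fps in (60, 120):
    budget_ms = 1000 / target_fps
    for name, cost in upscale_cost_ms.items():
        share = cost / budget_ms
        print(f"{name}: {share:.0%} of a {budget_ms:.2f} ms frame at {target_fps} fps")
```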
 
OK, I just looked and, as expected, the 4090 is running DLAA at native 4K resolution in Ratchet (which runs at about 1440p native on the Pro). And this is how they want to compare DLSS vs PSSR and say DLSS is still "king"?

I called it weeks ago:

After bragging for years about the miraculous DLSS, now it's gonna be 4K DLAA or bust...

Too bad it's only (sometimes) achievable on a $1800 card that draws 450W by itself....

Really a fair comparison to a console...

But don't worry, 5090 is coming for $2200 and 600W....

You can heat your whole house with that one this winter.....

LOL
 
Last edited:

Gaiff

SBI’s Resident Gaslighter
I called it weeks ago:

After bragging for years about the miraculous DLSS, now it's gonna be 4K DLAA or bust...

Too bad it's only (sometimes) achievable on a $1800 card that draws 450W by itself....

Really a fair comparison to a console...

But don't worry, 5090 is coming for $2200 and 600W....

You can heat your whole house with that one this winter.....

LOL
We still need confirmation though. Not sure if there's a flaw in the way Physiognomonics counted the pixels, but if not, that would be a first for DF and a horrible look. DLSS and DLAA aren't the same thing, and labeling DLAA as DLSS isn't just misleading, it's an outright lie. They've never done that in the past and I doubt they would start now, but we need confirmation on the pixel count. Maybe someone can ask them directly on Twitter?

Async issues were only a problem pre-Turing; Maxwell and Pascal were terrible with async compute. The changes in Turing and especially Ampere eliminated the performance regressions with async compute, and Lovelace even shows a better performance uplift from async compute than RDNA3 in benchmarks these days. The only performance numbers for TOPS utilization in gaming that I could find were that DLSS has an execution time of roughly 1.52ms to upscale from 1080p to 4K on a 3060 Ti (https://github.com/NVIDIA/DLSS/blob/main/doc/DLSS_Programming_Guide_Release.pdf), while the leaked developer docs for the Pro state PSSR requires 2ms to upscale to 4K.
NVIDIA was really late to the async train whereas AMD has had it since GCN with the Radeon HD 7000 series, I believe. It's one of the major reasons DOOM 2016 ran so poorly on Kepler cards, which ended up aging horribly.

Back to the TOPS talk, this would at least align with how they present their compute numbers using dual issue, so there’s at least a precedent for that.
 
Last edited:

Fafalada

Fafracer forever
The patent I read explicitly mentioned VR
Yes - VR in wireless context (remote rendering doesn't mean the server is 150ms away, it could be sitting next to the headset - it's still remote when not wired).

and the visualization @ 33:30 in the State of Play is intentionally brief and rolling off-axis, IMHO, to visualize accurately while still obscuring the exact inner workings of the algorithm from non-PSSR engineers. The visualization also appears to show an in-place technique, rather than a scaler like the hole-filling patent describes, and IIRC no one from Sony or the leaks has described it as an ML upscaler.
I would not rely on marketing visualization as a metric of 'what does this algorithm do' - that's just asking to be misled to weird/random places that bear no connection to reality.
As for the 'hole filling', principally that's what all upscalers do - you have a grid of pixels at a lower res (e.g. 1080p) and a grid of pixels at the target (2160p), and you need to fill the spaces 'between' with meaningful detail. The key difference is the patent described irregular grids and holes of different shapes and sizes - something that just doesn't apply to the context of reconstruction as we see it in the likes of FSR / DLSS (or what has been shown of PSSR so far).
VR context would be more appropriate because resolution there is supposed to be variable within render targets, so I'll give you that part could be a match even with local-use cases, but this is where patent language works against us: it's always broader than the actual application it's protecting, so it gets harder to tell.
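To illustrate what I mean by filling the spaces 'between' (a minimal spatial-only sketch using plain bilinear interpolation - real reconstructors like DLSS/FSR/PSSR add temporal samples, motion vectors and learned weighting on top of this; nothing here is any vendor's actual code):

```python
import numpy as np

# Minimal 'hole filling' illustration: distribute the low-res pixels over the
# target grid and fill the gaps between them by plain bilinear interpolation.
def naive_upscale(lowres: np.ndarray, scale: int = 2) -> np.ndarray:
    h, w = lowres.shape
    ys = np.linspace(0, h - 1, h * scale)   # target-row positions in source space
    xs = np.linspace(0, w - 1, w * scale)
    y0, x0 = np.floor(ys).astype(int), np.floor(xs).astype(int)
    y1, x1 = np.minimum(y0 + 1, h - 1), np.minimum(x0 + 1, w - 1)
    wy, wx = (ys - y0)[:, None], (xs - x0)[None, :]
    top = lowres[y0][:, x0] * (1 - wx) + lowres[y0][:, x1] * wx
    bottom = lowres[y1][:, x0] * (1 - wx) + lowres[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

lowres_buffer = np.random.rand(1080 // 4, 1920 // 4)   # stand-in for a low-res frame
print(naive_upscale(lowres_buffer).shape)              # (540, 960)
```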

I understand your reasoning with the TOPS comparison, but I believe it only works if the Pro has that in addition to the 67 half-TF FP16 dual-issue figure, because if it's all shared between rendering and PSSR, the 2ms per frame you can redirect to PSSR (1/8, or about 37.5 TOPS of the 300) isn't much when divided across 60fps IMO, and would need a cheap hole-filling optimisation.
1-2ms budget is all these things ever get - it's no different for GeForce GPUs, if the algorithm has to run significantly longer, it'd defeat the purpose. The point of high-compute throughput is that you get more done in that limited budget.


3080 has around 500 INT8 TOPS. The Pro, based on RDNA4, is utilising sparsity for that TOPS figure.
The developer docs did not state it this way (nor should they - the sparsity multiplier is, well, PR when stated as a fixed figure). I have no visibility into RDNA4 (did AMD release the whitepapers yet?), so I can't speak for that, nor for whether the PS5 Pro utilises AMD's AI extensions or not.

As far as I know DLAA is basically regular TAA with an ML model on top of it to enhance image quality.
The only difference between TAA and 'reconstruction' is whether the generated samples are used as 'subsamples' of the final image, or some of them are put on the actual pixel grid. I.e. your understanding is the same as saying 'it's DLSS where source and target resolution are the same', which AFAIK is what it is.
 

AFBT88

Neo Member
VRR does not help alleviate frame time spikes. It only eliminates screen tearing which reduces the perceptual window where inconsistent frametimes become bothersome. A steady 60fps is more pleasant than constantly bouncing between 80-120 fps, and a steady 120fps is better than a steady 60fps.
80-120fps with good frame pacing is much more pleasant than fixed 60fps. And if you are very sensitive even to 80-120fps fluctuations, you can easily cap your fps to something like 90-95, and the 80-95 range would be a great alternative to 80-120. That's the beauty of VRR. It doesn't magically make 70-120 feel like a fixed 120, but it gives great headroom so you don't notice frame fluctuations of up to 10-20% of your target fps.
 

Gaiff

SBI’s Resident Gaslighter
80-120fps with good frame pacing is much more pleasant than fixed 60fps. And if you are very sensitive even to 80-120fps fluctuations, you can easily cap your fps to something like 90-95, and the 80-95 range would be a great alternative to 80-120. That's the beauty of VRR. It doesn't magically make 70-120 feel like a fixed 120, but it gives great headroom so you don't notice frame fluctuations of up to 10-20% of your target fps.
This. I thought I was going insane arguing with those guys.

And just to close out this whole ridiculous debate, playing Rift Apart on an RTX 4090 with an unlocked fps is much better than capping it at 60. The poster claiming that 120fps introduces frame time problems and thus isn't preferable to 60fps is biased as hell and was just trying to clumsily dismiss an obvious advantage that PC has. Not that getting dragged into arguing about the merits of a $2500 4090-equipped PC over a console was any smarter on my part, but meh, you sometimes inadvertently get pulled into those.
 
Last edited:

Zathalus

Member
The developer docs did not state it this way (nor should they - the sparsity multiplier is, well, PR when stated as a fixed figure). I have no visibility into RDNA4 (did AMD release the whitepapers yet?), so I can't speak for that, nor for whether the PS5 Pro utilises AMD's AI extensions or not.


The only difference between TAA and 'reconstruction' is whether the generated samples are used as 'subsamples' of the final image, or some of them are put on the actual pixel grid. I.e. your understanding is the same as saying 'it's DLSS where source and target resolution are the same', which AFAIK is what it is.
For sparsity, I was just going off the Ampere whitepaper and then the LLVM changes regarding sparsity support for RDNA4.

And yes, DLAA is basically just DLSS with the upscaling part removed, so TAA vs TAAU in the end. Source and target resolution being the same is a good way to describe it.
 

PaintTinJr

Member
Yes - VR in wireless context (remote rendering doesn't mean the server is 150ms away, it could be sitting next to the headset - it's still remote when not wired).
For 90-120fps, the added latencies of switching to wireless encode and transmit, and then wireless receive, error-correct, and decode before processing can start, make me dubious about that, although the bandwidth reduction would line up with dropping from HDMI 2.0/2.1's 15-20GBit/s down to 3-5GBit/s.

I would not rely on marketing visualization as a metric of 'what does this algorithm do' - that's just asking to be misled to weird/random places that bear no connection to reality.
As for the 'hole filling' principally that's what all upscalers do - you have a grid of pixels at lower res (eg. 1080p) and grid of pixels at target (2160p) and you need to fill the spaces 'between' with meaningful detail. The key difference is the patent described irregular grids and holes of different shapes and sizes - something that just doesn't apply to context of reconstruction as we see it in the likes of FSR / DLSS (or what has been shown of PSSR so far).
VR context would be more appropriate because resolution there is supposed to be variable within render-targets, so I'll give you that part could be a match even with local-use cases, but this is where patent language works against us, it's always more broad than the actual application it's protecting, so it gets harder to tell.
Sony aren't the type of company to spend crazy man-hours (money) on a visualization that is false and that they couldn't show to management or shareholders, whether that's for the backlight dimming on their mini-LED TVs, how hi-res audio works, how their continuous frame-rate CMOS sensors work, or anything else for that matter in their long array of technological innovations.

This Motionflow one with AC Milan/Brazil legend Kaká in the life-size praxinoscope looks like pure marketing, but is still representative of how their frame insertion makes picture playback smoother.



So it would be an outlier for this to be just 'marketing' if that visualization isn't consistent with the algorithm.


As for the patent, IMO the key difference wasn't irregular grids, from my understanding. AFAIK DLSS at, e.g., 1080p native to X resolution still renders the native scene completely whole, then Lanczos-scales the image like FSR, and then analyses the upscaled image using sampling kernels surrounding the real pixels (and their motion vectors) to feed the inference equations that increase effective fidelity in those processed kernels.

By contrast, the patent shows a stencil grid with - for argument's sake, on average - one in four stencilled pixels unmasked, positioned disproportionately around discontinuities (going by the patent images), so that when rendering at 4K the actual effective rendering resolution is just 1080p. Then, going by the patent, the algorithm recursively classifies the size and surroundings (spectral detail) of the masked holes, Lanczos(-type) fills the small, low-frequency holes a kernel at a time, has ML fill the big or high-frequency-detail holes a kernel at a time, and then repeats. Which sounds nothing like DLSS, because the key detail in the patented algorithm actually gets rendered as a chassis at the final output image level, as real 4K pixels, meaning that even the Lanczos-type filled holes are interpolating from high-quality sources, and the inferenced pixels come from high-quality sources rather than low-quality ones.

1-2ms budget is all these things ever get - it's no different for GeForce GPUs, if the algorithm has to run significantly longer, it'd defeat the purpose. The point of high-compute throughput is that you get more done in that limited budget.
Fair point, I was mistakenly thinking you were implying both DLSS and PSSR had north of 30-40 TOPS they could actually dedicate to the algorithms out of their theoretical TOPS specs.
 
Last edited:

SweetTooth

Gold Member
I see PSSR even being mentioned alongside DLSS as a huge victory, since Nvidia needed multiple iterations and years of work to reach this maturity. Being better or worse than DLSS is not the point and it doesn't really matter; what matters is having a good solution to implement in the Pro and future PlayStation hardware.

Notice how Sony didn't rest on their laurels when they saw how lackluster AMD's solution was; they worked on their own in-house solution.

This is exactly why I prefer Sony to Nintendo and MS. Devs asked for an SSD?! Let's make a blazing fast one and design a whole decompression block to make their lives easier.

AMD is not keeping up with Nvidia? Let's fix this and invest our resources in developing new technologies.
 

Fafalada

Fafracer forever
For sparsity, I was just going off the Ampere whitepaper and then the LLVM changes regarding sparsity support for RDNA4.
Yeah, I'm aware of how NVidia whitepapers quote the numbers - I have not seen anyone else do it to date.
Unlike NVidia, Sony refers to sparsity support in the (leaked) dev docs as an optimization (without a hard multiplier) but specifically quotes the TOPS throughput separately.

For 90 -120fps the memory latencies of the switches to wireless encode to transmit and then wireless receive, error correct and decode to start processing make me dubious about that, although the bandwidth reduction would line up with dropping down from hdmi 2.0/2.1 20-15GBit/s down to 3-5GBit/s.
Yes, it's not the simplest of problem spaces - but it's also already been solved (i.e. in Oculus devices, both over wireless and over low-bitrate USB 2.0). The question now is just about increasing fidelity.

So it would be an outlier for this to be just 'marketing' if that visualization isn't consistent with the algorithm.
It's a bit of a stretch to compare the giant physical installation to a CG video 3 seconds long - but ok. Also I don't think it's inconsistent, I think you just have some fundamental disagreements about upscalers - but let me get to that below.

As for the patent, IMO the key difference wasn't irregular grids
They specifically call out that 'holes' can vary in size, multiple times throughout the patent. It's very clearly an important point of what they're trying to highlight in filing, and it's the main thing that differentiates it from... basically every other upscaler ever (that could be described by the same patent otherwise).

from my understanding, and AFAIK DLSS at, e.g., 1080p native to X resolution still renders the native scene completely whole
As it's done in PSSR. Technically even CBR doesn't actually render the CB pattern the way people think it does - but that's getting into semantics. The main drawback of CB algorithms has been the requirement to modify the rendering pipeline to accommodate irregular dx/dy because it's not set up on a normal rectangular grid; it's why adoption was so low (and often painful) and why the normal upscalers (starting with the likes of 'temporal injection') took over, as they don't suffer from those drawbacks.

then Lanczos-scales the image like FSR and then analyses the upscaled image using sampling kernels surrounding the real pixels (and their motion vectors) to feed the inference equations that increase effective fidelity in those processed kernels.
Lanczos (or any other box filter) is just filling in the 'gaps' between real data. You're arguing semantics here. And yes - we perform some kind of 'averaging out' of contributing values from motion-reprojected pixels, but that's a given - else we'd end up with a reconstructed image with 0 AA. PSSR analysis probably differs here (the implication of the spectral component giving different weights to pixel contributions) but it's fundamentally the same thing.

By contrast, the patent shows a stencil grid with
This would be repeating all the problems CBR had and drastically complicate adoption again. It's not the kind of pipeline change anyone wants to make. It would be a different story if this were a high-res image already transmitted as such (i.e. the remote scenario), as then there's no need to touch the render pipeline and reconstruction is only performed in destination space.

Which sounds nothing like DLSS, because the key detail in the patented algorithm actually gets rendered as a chassis at the final output image level, as real 4K pixels, meaning that even the Lanczos-type filled holes are interpolating from high-quality sources
There's no such thing as 'real 4K pixels'. You either have the complete data or you don't. If you know your exact sampling positions (and adjust for the texture sampling frequencies of the target resolution), you can literally reconstruct the ground-truth 4K 'stenciled' grid by point-sampling from the 1080p buffer. It's a mathematical identity if you do it correctly.
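A toy version of that identity (1D and scaled down, purely illustrative):

```python
import numpy as np

# If the low-res buffer was rendered at sample positions that coincide with
# every other target pixel centre, scattering those samples back onto the
# target grid reproduces them exactly - no 'realer' pixels are gained by
# shading the stenciled grid at full resolution directly.
ground_truth = np.random.rand(16)        # pretend full-res row
low_res = ground_truth[::2]              # 'half-res' buffer sampled at even positions

stenciled = np.full(16, np.nan)
stenciled[::2] = low_res                 # point-sample the low-res buffer into the target grid

assert np.array_equal(stenciled[::2], ground_truth[::2])  # identical where samples exist
# the NaN positions are the 'holes' the reconstruction pass still has to fill
```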
 
Last edited:

PaintTinJr

Member
Yeah, I'm aware of how NVidia whitepapers quote the numbers - I have not seen anyone else do it to date.
Unlike NVidia, Sony refers to sparsity support in the (leaked) dev docs as an optimization (without a hard multiplier) but specifically quotes the TOPS throughput separately.


Yes, it's not the simplest of problem spaces - but it's also already been solved (i.e. in Oculus devices, both over wireless and over low-bitrate USB 2.0). The question now is just about increasing fidelity.


It's a bit of a stretch to compare the giant physical installation to a CG video 3 seconds long - but ok. Also I don't think it's inconsistent, I think you just have some fundamental disagreements about upscalers - but let me get to that below.


They specifically call out that 'holes' can vary in size, multiple times throughout the patent. It's very clearly an important point of what they're trying to highlight in filing, and it's the main thing that differentiates it from... basically every other upscaler ever (that could be described by the same patent otherwise).


As it's done in PSSR. Technically even CBR doesn't actually render the CB pattern the way people think it does - but that's getting into semantics. The main drawback of CB algorithms has been the requirement to modify the rendering pipeline to accommodate irregular dx/dy because it's not set up on a normal rectangular grid; it's why adoption was so low (and often painful) and why the normal upscalers (starting with the likes of 'temporal injection') took over, as they don't suffer from those drawbacks.


Lanczos (or any other box filter) is just filling in the 'gaps' between real data. You're arguing semantics here. And yes - we perform some kind of 'averaging out' of contributing values from motion-reprojected pixels, but that's a given - else we'd end up with a reconstructed image with 0 AA. PSSR analysis probably differs here (the implication of the spectral component giving different weights to pixel contributions) but it's fundamentally the same thing.


This would be repeating all the problems CBR had and drastically complicate adoption again. It's not the kind of pipeline change anyone wants to make. It would be a different story if this were a high-res image already transmitted as such (i.e. the remote scenario), as then there's no need to touch the render pipeline and reconstruction is only performed in destination space.


There's no such thing as 'real 4K pixels'. You either have the complete data or you don't. If you know your exact sampling positions (and adjust for the texture sampling frequencies of the target resolution), you can literally reconstruct the ground-truth 4K 'stenciled' grid by point-sampling from the 1080p buffer. It's a mathematical identity if you do it correctly.
I think you are correct; I don't think the patent describes an upscaler at all, because AFAIK the pixels that get rendered - and more importantly the fragments that make up those pixels, with all their shading FX too - are native pixels: they occupy the correct placement and go through the same render passes as normal, it's just that the GPU didn't have to waste time on any of the masked pixel fragments that aren't indirectly sampled by unmasked pixel fragments, which I believe the patent also addressed.


But at this point I'm not sure we're considering the same situation for devs integrating the patent - if it is PSSR. IMO the patented solution begins a game by either rendering a full native image first, or just starting with a uniform 1/4-resolution unmasked grid before doing a poor PSSR pass. It then analyses frequencies and motion vectors to determine where to position the 1/4-resolution unmasked grid points for the next frame, renders the next masked frame, uses PSSR to complete the frame, analyses frequencies and motion vectors again to reposition the unmasked grid points, and continues indefinitely. That would make it very simple to implement at the beginning and end of a renderer - other than big changes if the offscreen GI rendering swamped performance and needed to be lowered/altered to stay within the native render budget.
 
Last edited:

Vick

Member
Still using that IGN Ratchet & Clank comparison with raw PC footage vs compressed video capture of the PS5 Pro..

New Girl Facepalm GIF by HULU


This is so damn bizarre.. especially since we already had a much longer and more detailed comparison from DF, with ProRes Pro footage, where even Alex said PSSR is better than DLSS Performance.

e3Nswtx.png


uUItIrw.png


nTe9NYo.png




I don't really get it, why not discuss using proper sources?
 
Last edited:

ap_puff

Member
Still using that IGN Ratchet & Clank comparison with raw PC footage vs compressed video capture of the PS5 Pro..

New Girl Facepalm GIF by HULU


This is so damn bizarre.. especially since we already had a much longer and more detailed comparison from DF, with ProRes Pro footage, where even Alex said PSSR is better than DLSS Performance.

e3Nswtx.png


uUItIrw.png


nTe9NYo.png




I don't really get it, why not discuss using proper sources?

Hmmmm, looking at those pictures closer, PSSR does seem to have some advantages, if it's not just compression artifacts in those stills: if you look at the crowd, DLSS seems to have some sharpening artifacts around it that don't appear in the PSSR version. It does seem to have better anti-aliasing and fewer jaggies though.
 

ap_puff

Member
80-120fps with good frame pacing is much more pleasant than fixed 60fps. And if you are very sensitive to even 80-120fps fluctuations, you can easily cap your fps to something like 90-95 and the 80-95 range would a great alternative to 80-120. That's the beauty of VRR. It doesn't magically make 70-120 feel like 120 fixed, but it gives a great headroom to not notice frame fluctuations up to 10-20% of your target fps.
The key is good frame pacing. No one is arguing against the idea that, with small frame-to-frame deviations in frametime, higher fps = better; I was simply responding to "unlocked fps > capped, always", when it's not. You even acknowledge this when you say capping to 90 is better than allowing wild swings from 80-120.
 

Vick

Member
Hmmmm, looking at those pictures closer, PSSR does seem to have some advantages, if it's not just compression artifacts in those stills: if you look at the crowd, DLSS seems to have some sharpening artifacts around it that don't appear in the PSSR version. It does seem to have better anti-aliasing and fewer jaggies though.
Not really.

In this comparison both versions use sharpening, and both sport oversharpening artifacts.
 

Gaiff

SBI’s Resident Gaslighter
The key is good frame pacing. No one is arguing against the idea that, with small frame-to-frame deviations in frametime, higher fps = better; I was simply responding to "unlocked fps > capped, always", when it's not. You even acknowledge this when you say capping to 90 is better than allowing wild swings from 80-120.
You were responding to something no one said?

And for the last part, no, he said that if you're that anal about frame rate fluctuations, you can always cap it so you have less variation, but if they bother you that much, why would you want to play at a paltry 60fps anyway? Going from 120 to 80 is merely going from extremely smooth to still very smooth, just less so.
 
Last edited:

PaintTinJr

Member
Not really.

In this comparison both versions use sharpening, and both sport oversharpening artifacts.
Given that these are all differently cropped images on each system to almost align them, at the Pro's expense of losing quality foreground pixels in its cropping: what do you make of the full-scale images having different FOVs and different draw distances in the context of comparing PSSR to DLSS?
 

Vick

Member
Given that these are all differently cropped images on each system to almost align them, at the Pro's expense of losing quality foreground pixels in its cropping: what do you make of the full-scale images having different FOVs and different draw distances in the context of comparing PSSR to DLSS?
BDadpd6.gif


Sorry man, got exactly none of that.
 

PaintTinJr

Member
BDadpd6.gif


Sorry man, got exactly none of that.
In some of the other comparison shots from DF (Oliver), it was clear as day that the PC picture was cropped at the top of the image - the background in the distance with the least detail - and the Pro image was cropped at the bottom - its foreground with the maximum detail - to roughly align the images like they are here.

From that it was obvious that the draw distance on the Pro was improved over the PS5 port on PC, and that the Pro version also had an improved field of view. Which then raises the question: is it fair to compare PSSR and DLSS between differently cropped pictures?
 

Bitstream

Member
Shadows are dynamic in this scene; the shadow appears later in the PC version, but it's very soft thanks to ray tracing (PS5 Pro doesn't have RT shadows):

apNZfQc.jpeg
Ok. fair enough, the shadow is dynamic and appears later.

Are you going to comment on how the 4K 4090 image on the left has significantly worse anti-aliasing (purple arrows) than the PS5 Pro image?

Any thoughts on the notion that it's disingenuous to try to compare DLSS vs PSSR Upscaling if your 4090 frame of reference is based on a native 4K DLAA image instead?


1oU9CFJ.jpeg
 

Gaiff

SBI’s Resident Gaslighter
Any thoughts on the notion that it's disingenuous to try to compare DLSS vs PSSR Upscaling if your 4090 frame of reference is based on a native 4K DLAA image instead?
Do we know this for a fact? Because this conversation is turning very bizarre. On the one hand, you have someone claiming it's 4K DLAA when the image is clearly labeled DLSS. Said person said they counted the pixels. Should we take their word for it and believe there was no flaw in their methodology? On the other hand, you have people finding advantages in a supposedly blurry and bad picture from non-ProRes footage. Now it seems that you're just running with the assumption that it has to be 4K DLAA...

If it's 4K DLAA vs PSSR and a bad photo at that, I have no idea how you guys are finding better aspects with PSSR, unless you legitimately believe that PSSR upscaling from 1440p is better than 4K+DLAA, which I find extremely unlikely.
 
Last edited:

Bojji

Member
Ok. fair enough, the shadow is dynamic and appears later.

Are you going to comment on how the 4K 4090 image on the left has significantly worse anti-aliasing (purple arrows) than the PS5 Pro image?

Any thoughts on the notion that it's disingenuous to try to compare DLSS vs PSSR Upscaling if your 4090 frame of reference is based on a native 4K DLAA image instead?


1oU9CFJ.jpeg

I used this image in the first place to compare shadows and crowds - and it's useful for that. I think the capture was too low quality to compare IQ objectively, plus we have no confirmation of what kind of DLSS settings are used here.
 

PaintTinJr

Member
Do we know this for a fact? Because this conversation is turning very bizarre. On the one hand, you have someone claiming it's 4K DLAA when the image is clearly labeled DLSS. Said person said they counted the pixels; should we take their word for it? Was there no flaw in their methodology? On the other hand, you have people finding advantages in a supposedly blurry and bad picture from non-ProRes footage. Now it seems that you're just running with the assumption that it has to be 4K DLAA...
Would you agree that, looking at the image background, you can see the two images diverge massively and were therefore cropped differently to make them align?
 

Zathalus

Member
Not really.

In this comparison both versions use sharpening, and both sport oversharpening artifacts.
The comparison is already useless for that; when Oliver was speaking to the Core Technology Director at Insomniac, he agreed the game had too much sharpening, so they removed it from PSSR for the game.
 

Vick

Member
In some of the other comparison shots from DF (Oliver), it was clear as day that the PC picture was cropped at the top of the image - the background in the distance with the least detail - and the Pro image was cropped at the bottom - its foreground with the maximum detail - to roughly align the images like they are here.

From that it was obvious that the draw distance on the Pro was improved over the PS5 port on PC, and that the Pro version also had an improved field of view. Which then raises the question: is it fair to compare PSSR and DLSS between differently cropped pictures?
No idea how impactful that is, ultimately.

The comparison is already useless for that; when Oliver was speaking to the Core Technology Director at Insomniac, he agreed the game had too much sharpening, so they removed it from PSSR for the game.
Actually, it's any comparison made after that one that is useless, because those (like IGN's) will be made using a sharpened PC image vs no sharpening on the Pro.. it's going to be an unthinkable mess.

The first comparison DF made (ProRes footage) really is the only one that counts.
Both because the source is pristine and because both versions have equal IQ settings, making it possible to analyze and compare only the AI upscaling solutions.
 

Gaiff

SBI’s Resident Gaslighter
Oliver said they didn't even have enough footage to come to a conclusion, so it's ultimately pointless for us to try and determine which is better. There's just a month left until the Pro's release and it's likely DF will have live games to analyze a few days/weeks before the public does. Only a few more weeks, guys, and the war can continue.
 

Kangx

Member from Brazile
Hmmmm, looking at those pictures closer, PSSR does seem to have some advantages, if it's not just compression artifacts in those stills: if you look at the crowd, DLSS seems to have some sharpening artifacts around it that don't appear in the PSSR version. It does seem to have better anti-aliasing and fewer jaggies though.
It's clearer in the IGN interview clips. DLSS exhibits quite a bit more artifacting overall in the crowds.
 

Bojji

Member
Oliver said they didn't even have enough footage to come to a conclusion, so it's ultimately pointless for us to try and determine which is better. There's just a month left until the Pro's release and it's likely DF will have live games to analyze a few days/weeks before the public does. Only a few more weeks, guys, and the war can continue.

Can't wait!

05aa1892bd354341918dac9f8b482369.gif
 
Last edited:

PaintTinJr

Member
No idea how impactful that is, ultimately.

..
Well, for a start, aliasing gets worse as an object moves from foreground to background, so a foreground-cropped image aligned to a background-cropped image might present more aliasing.

Objects also start to look softer moving towards the background because depth-cueing fog equations increase density from near to far... and IMO - and this was the original reason I quoted your 'oversharpening on both' comment - it might easily trick someone into thinking something is oversharpened, because the same item in a comparison shot cropped differently, and with an inferior field of view and draw distance, might look softer from depth-cueing fog when it should have been equally sharp in a true like-for-like comparison.
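(For clarity, the kind of depth-cueing fog I mean is the classic exponential falloff - illustrative constants only, not any game's actual fog parameters:)

```python
import math

# Classic exponential depth-cue fog: the further away a surface is, the more of
# the fog colour it picks up, so distant detail naturally renders softer.
def surface_visibility(distance: float, density: float = 0.05) -> float:
    return math.exp(-density * distance)   # 1.0 = fully crisp, 0.0 = fully fogged

for d in (5, 50, 200):
    print(f"distance {d}: {surface_visibility(d):.2f} of the surface colour survives the fog")
```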
 
Last edited:

ap_puff

Member
Not really.

In this comparison both versions use sharpening, and both sport oversharpening artifacts.
Yeah, but if you look at the crowd in the top-left corner, it's much more noticeable - maybe because the crowd density is higher on the PC version? Also, in the top middle, DLAA has some sort of ghosting artifacts on the balloons which are absent with PSSR. I find that interesting; overall the PC version still looks better, mainly due to better base settings (you can tell some of the geometric detail is quite a bit higher on PC). I just really think it's cool that PSSR is putting up some sort of fight.

*edit* To be clear, I'm comparing the first image with the DLAA comparison; also, is the PS version missing depth of field?
 
Last edited: