SakkanartSilentJealousy
Member
One drawback in designing the Pro is BC; I hope PS6 gets a huge jump.
Hardware behind PSSR is so impressive.
So what you're saying is that nvidia TOPS performance doesn't really matter at all "because you can just run those things on a normal GPU without any ML"?
How is it not important? Even a simple thing like framegen would benefit greatly from better performance, especially at higher base fps.
I'm not sure what you're getting at here anyway. I just said your claim that there is "no other use for ai in games other than Super resolution" is blatantly false. Your subjective opinion about the "importance" of these other uses isn't something I particularly care about.
THIS!!!
Cerny literally stated the secondary motive. The primary is to improve machine graphics (which implicitly means retaining hardcore fans), but the secondary is about technology development and iteration towards the next generation. Which is both internal/partner and dev driven. I.e. not primarily about the consumer in the case of the Pro.
I mean you could almost swap them around, because looking forward it sounds like they didn't have confidence executing their PS6 vision without the PS5 Pro. The mid gen improvements being a nice bonus. Perhaps it started one way and has ended up the other. Regardless, I don't think you can overstate the importance of having physical hardware at this point in the cycle already integrating conceptual elements of the next generation.
+ PS6 will likely run PS5 games in BC in Pro mode. So, they would have an advantage vs. Xbox if they also launch in the same timeframe.
THIS!!! That is a huge reason for the PS5 Pro's existence, assuming PS6 comes out holiday 2028. That gives them 4 years of experience with ML upscaling, instead of taking a first crack at it at PS6 launch. Cerny literally said that.
Raw performance? Most likely not. But the compounding effect of the hardware and ML/algorithm improvements over the next 4/5 years might. That is what Amethyst is about, I think.
Does that mean PS6 will also be 9x more processing power than PS5?
I think he said that it would be labeled differently and not be PSSR. Almost like PSSR is this version and the next will have a different name rather than PSSR 2.0.
He was also kinda vague on the question about how much PSSR will keep on improving. I would have loved to get a clear answer there; he started talking about frame generation at that point.
Which was my point, compared to PS5 RDNA2 it was upgraded (it was not in PS5 as Cerny’s team is very strict / pragmatic with their budget and they are not shy about asking AMD to redesign the FPU of Ryzen 2 if it helps them in their goals). Then again other stuff aside from RT and ML/AI was also updated beyond RDNA2 as Cerny briefly touched upon (geometry engine, vertex/triangle setup and rasterisation, etc…). They just did not bring in anything that would force devs to recompile all their shaders.
Both of these were in original RDNA2 (even Xbox has them), for some reason omitted for PS5 GPU.
It kind of is if you want the rest of the GPU free for important graphical processing, which still has a long way to go.
It's minor and not very important when they run it on a normal GPU without any ML acceleration (PS5 GPU is even lacking stuff that would allow XeSS to run on it).
It is not secret nor sauce, they explained what it is and that beyond basic binary compatibility and having more CUs you need to opt-in to these new features and change your code (you have seen plenty of partnerships NVIDIA does with devs to get them to run things “better”, no? Nothing new).
I really doubt that there is a 'secret sauce' that makes RT performance increase exponentially; what we have already seen is probably what it is. After all, if RDNA4 has significant improvements, they should be reflected in all the games already released. Neither Intel nor Nvidia have had to adapt anything in already released games to perform better than AMD in RT, so why should it be any different?
We made a mid ass unbalanced upgrade for 920 euros.
Summary on some key points please?
Yes, but that applies to everything in AI today - including NVidia GPUs.
E.g. for a 4090, the bandwidth situation is even worse - as it does a bit more than double the TOPS (660) and has less than double the bandwidth (about 1TB/s) of a PS5 Pro.
Ultimately, the memory bandwidth is over 3 orders of magnitude removed from compute throughput. So for every memory access, you need to perform at least 1000 operations, otherwise you're wasting compute waiting on memory.
This is also why slower caches (like Infinity Cache) are not even making a dent in this problem - with 2TB/s - we're still in the 3 orders of magnitude too slow range.
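A rough sketch of the ratio being described, using the TOPS figures quoted in this thread. The PS5 Pro bandwidth of 0.576 TB/s and the 4-byte access size are my assumptions, not numbers from the posts above.

```python
# Compute-to-bandwidth ratio: how many operations the chip can perform
# in the time it takes to fetch one byte from main memory.

def ops_per_byte(tops: float, bandwidth_tb_s: float) -> float:
    """TOPS divided by TB/s; the 1e12 factors cancel out."""
    return tops / bandwidth_tb_s

# Figures quoted in the thread: 660 TOPS and about 1 TB/s for a 4090,
# 300 TOPS (8-bit) for the Pro. 0.576 TB/s for the Pro is an assumption.
RTX_4090 = ops_per_byte(660, 1.0)
PS5_PRO = ops_per_byte(300, 0.576)

# Assumption: one memory access fetches a 4-byte word.
ACCESS_BYTES = 4
RTX_4090_PER_ACCESS = RTX_4090 * ACCESS_BYTES
PS5_PRO_PER_ACCESS = PS5_PRO * ACCESS_BYTES

if __name__ == "__main__":
    print(f"4090:    {RTX_4090:.0f} ops/byte, {RTX_4090_PER_ACCESS:.0f} ops/access")
    print(f"PS5 Pro: {PS5_PRO:.0f} ops/byte, {PS5_PRO_PER_ACCESS:.0f} ops/access")
```

Under those assumptions both chips land above 1000 operations per access, which is the "3 orders of magnitude" gap, and the 4090 comes out somewhat worse than the Pro.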
I like the honesty here. It's logical but they are essentially saying that people who buy a PS5 Pro are beta testers for the PS6 which can be taken either way.
For one, the Pro is using something similar to WMMA. So these are instructions integrated into the shader pipeline, while the Tensor units in Ada are dedicated units.
Taken at face value (i.e. suggesting that the units operate completely independently from shader compute) that would only make the bandwidth situation worse, not better.
That said - my understanding is that execution resources are still shared anyway even in Ada - so we're not looking at parallel execution of both types of compute. But that just brings us back to the same 1000:1 ratio we started with.
Another thing to consider is that VRAM bandwidth in Ada is much more efficient than in RDNA2. So in practical terms, a 4090 will have much more memory bandwidth than the Pro.
This I explained in the post above (and Mark did in his talk even more) - existing caches are still several orders of magnitude too slow - even if you get 100% utilisation. E.g. Ada best case nets you 5TB/s with a 4090 - which sure - it's a 5x improvement over its raw memory access, but that's still over 200x slower than where we want to be.
Or to be specific - Mark's example of 3% utilisation would become 15% - but that still leaves 85% of performance on the table. The whole point why they went to registers as cache is that nothing else in the system runs at the speeds needed, not even L0.
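The 3% to 15% step can be written out as a small sketch. It assumes a purely bandwidth-bound workload where utilisation scales linearly with available bandwidth; that linear model and the function name are mine, only the 3% and 5x figures come from the posts above.

```python
def utilisation_after_speedup(base_utilisation: float, bandwidth_speedup: float) -> float:
    """Bandwidth-bound model: compute utilisation grows linearly with
    bandwidth until it hits 100%."""
    return min(1.0, base_utilisation * bandwidth_speedup)

# 3% utilisation from the talk, 5x from the cache best case mentioned above.
with_cache = utilisation_after_speedup(0.03, 5)
left_on_table = 1.0 - with_cache

if __name__ == "__main__":
    print(f"utilisation: {with_cache:.0%}, idle: {left_on_table:.0%}")
```

In the same model, full utilisation from a 3% starting point would need roughly a 33x bandwidth increase, which is why a 5x cache speedup does not change the picture much.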
The one positive with all of the above is that for a workload like upscaling, managing memory in/out is relatively trivial, as tiling can predictably access each block of memory in succession and give us that 1000x speedup on a per-tile basis, as long as the model doesn't need to jump around memory beyond that. But it shows the need for more fast on-chip memory.
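A minimal sketch of the tiling idea, in plain Python. It is not how PSSR works: the nearest-neighbour kernel is a stand-in for the network, tile overlap (which a real CNN would need at the borders) is ignored, and all names are hypothetical. The point is only the access pattern: each tile is copied out of the frame once, and all work then happens on the local copy.

```python
def iter_tiles(width, height, tile):
    """Yield (x, y, w, h) for each tile, walking memory in a predictable order."""
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            yield x, y, min(tile, width - x), min(tile, height - y)

def nearest_neighbour(block, scale):
    """Stand-in kernel: repeat every pixel `scale` times in both directions."""
    return [[v for v in row for _ in range(scale)]
            for row in block for _ in range(scale)]

def upscale_tiled(frame, tile, scale, kernel):
    """Upscale `frame` tile by tile; each tile is read from `frame` once."""
    height, width = len(frame), len(frame[0])
    out = [[0] * (width * scale) for _ in range(height * scale)]
    for x, y, w, h in iter_tiles(width, height, tile):
        # Single read of "main memory" for this tile (stands in for fast on-chip storage).
        block = [row[x:x + w] for row in frame[y:y + h]]
        result = kernel(block, scale)  # all compute touches only the local copy
        for dy, row in enumerate(result):
            out[y * scale + dy][x * scale:x * scale + len(row)] = row
    return out

if __name__ == "__main__":
    frame = [[1, 2, 3], [4, 5, 6]]
    for row in upscale_tiled(frame, tile=2, scale=2, kernel=nearest_neighbour):
        print(row)
```

Because the kernel only needs data inside its own tile, the tiled result matches running the kernel on the whole frame, which is the "doesn't need to jump around memory" condition from the post.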
It's analogous to the use cases Cell was originally created for (or Larrabee/EE VUs) - i.e. stream processors that have register-speed memory coupled to them - the main change is that the bandwidth gap is wider than ever now.
Not really. It is after all a 1.0 release for the hardware and PSSR etc. But it's conceptually in line with what's planned for PS6. It's not Sony's fault that people, despite all the information available, have unrealistic expectations as to what the Pro can do. It has the capability to do everything they said it could. We are in a bit of a strange place right now because almost everything we have seen is a retrofit, which is muddying the water, so I'm not entirely unsympathetic to the negative narrative, but to believe it you either need tunnel vision or need to read wider, i.e. not just warrior comments.
I like the honesty here. It's logical but they are essentially saying that people who buy a PS5 Pro are beta testers for the PS6 which can be taken either way.
Ok we're done here.
A 4090, even with a heavy use case of ChatGPT3 running on it, has a Tensor usage rate of 50-60%. It's mostly bandwidth starved, but it's not as bad as the Pro.
Yes - so it's useful for async-execution (basically to increase occupancy), but we're talking single digit % efficiencies here - again, operating several orders of magnitude away from the problem described.
Having dedicated Tensor units means these units have their own L1 and registers. And don't have to contend with shader units.
@Kaiserstark summary
- Introduction to PS5 Pro
  - Mark Cerny explains the PS5 Pro's focus on improving GPU performance and addressing mid-generation technological advances.
  - Unlike generational leaps (e.g., PS3 → PS4), the Pro model optimizes existing hardware for better game performance without exclusive games.
- Key Goals for PS5 Pro
  - Minimize workload for game developers while achieving significant improvements and make games like Dustborn and Concord quicker to complete.
  - Focus on enhancing GPU capabilities for noticeable gameplay improvements.
- The "Big Three" Improvements
  - Larger GPU with 67% more workgroup processors for faster rendering.
  - Advanced ray tracing features using future AMD RDNA technologies.
  - AI-driven upscaling via PlayStation Spectral Super Resolution (PSSR).
- Hybrid GPU Architecture
  - Combines AMD RDNA 2 and RDNA 3 technologies for easier developer adoption.
  - Focused improvements include faster vertex processing and new ray-tracing structures.
- Improved Memory System
  - Increased memory bandwidth (28% higher than PS5).
  - Extra memory added using DDR5 to support ray tracing, upscaling, and higher resolutions, including potential 8K.
- Ray Tracing Enhancements
  - Doubling ray intersection speeds using a new BVH8 acceleration structure.
  - Improved performance consistency with hardware-based stack management to reduce divergence issues.
- Machine Learning Integration
  - PSSR uses a custom lightweight convolutional neural network (CNN) to enhance image quality and resolution.
  - Focus on efficient memory use to process frames quickly, enabling 4K upscaling with minimal system bottlenecks.
- Developer-Friendly Design
  - PSSR supports variable upscaling ratios and integrates seamlessly with existing game engines.
  - Maintains compatibility with PS5 for ease of game development.
- Future of Machine Learning in Gaming
  - SIE aims to develop generalized machine learning architectures for broader applications in gaming.
  - Goals include fully fused networks, improved ray tracing, and richer game graphics.
- PS5 Pro's Legacy and Vision
  - The advancements in PS5 Pro lay the groundwork for future technologies, including enhanced machine learning and graphics processing innovations.
  - Collaboration with AMD and internal R&D aims to revolutionize game design and player experiences.
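The two percentages in the summary can be sanity-checked with a few lines. The 30 WGP figure appears later in this thread; the base PS5 numbers (18 WGPs, 448 GB/s) and the Pro's 576 GB/s are assumptions I'm bringing in from the public specs, so treat this as a sketch.

```python
def percent_increase(old: float, new: float) -> float:
    """Relative increase of `new` over `old`, in percent."""
    return (new - old) / old * 100

# Assumed base PS5 figures: 18 WGPs, 448 GB/s. Assumed Pro bandwidth: 576 GB/s.
wgp_gain = percent_increase(18, 30)
bandwidth_gain = percent_increase(448, 576)

if __name__ == "__main__":
    print(f"WGPs: +{wgp_gain:.1f}%  bandwidth: +{bandwidth_gain:.1f}%")
```

With those inputs the WGP gain rounds to the 67% in the summary, and the bandwidth gain comes out at a little over 28%.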
Mark's explanation is pretty explicit here. The 8bit and 16bit extensions are AI specific instructions they added (so not flopflation) - note he makes a specific point they could have gone higher for 16bit but didn't see the practical use for it (or 32bit).
The TOPS for 8bit was 300 TOPS, but the 16bit integer TOPS was 66 TOPS, which is a 16 x 2 (RPM) x 2 (flopflation) calculation.
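For reference, the multiplication in the quoted estimate can be spelled out. This only reproduces that poster's arithmetic (FP32 TFLOPS times a packed-math factor times a dual-issue factor); whether that is actually where the 16-bit number comes from is exactly what the reply disputes. The 16.7 TFLOPS figure is the one listed elsewhere in this thread, and the function name is hypothetical.

```python
FP32_TFLOPS = 16.7      # figure listed in the thread
INT8_TOPS = 300         # 8-bit figure quoted above

def quoted_16bit_estimate(fp32_tflops: float,
                          rpm_factor: int = 2,
                          dual_issue_factor: int = 2) -> float:
    """The quoted poster's derivation: FP32 rate x RPM x dual issue."""
    return fp32_tflops * rpm_factor * dual_issue_factor

estimate = quoted_16bit_estimate(FP32_TFLOPS)
ratio_8_to_16 = INT8_TOPS / estimate

if __name__ == "__main__":
    print(f"16-bit estimate: {estimate:.1f} TOPS, 8-bit/16-bit ratio: {ratio_8_to_16:.2f}")
```

The estimate lands close to the 66 TOPS figure, but note the 8-bit number is about 4.5x that rather than 2x, which fits the reply's point that the 8-bit and 16-bit paths are separate additions rather than simple multiples of each other.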
That was one of those 'blink and you miss it' moments - but the distinction he makes is important. The difference from simply 'rendering lower resolution' is that screenspace math differs (especially the interpolants). I.e. if you remember some games that added DLSS and ended up with comically low-resolution textures (until patches fixed it) - that's an example of the difference between lowres+upscale and 'sparse sampling+hole reconstruction'.
kicking off with saying about holes in image being filled, the PSSR being slightly re-entrant
We made a mid ass unbalanced upgrade for 920 euros.
But hey, ps6 is probably gonna be THE shit, thanks for betatesting our upscaling solution.
Please and thank you.
(People can add to that if i missed something)
I was only pre-empting with his own term, while demonstrating with the multipliers that it was derived from the RDNA3 dual issue.
Mark's explanation is pretty explicit here. The 8bit and 16bit extensions are AI specific instructions they added (so not flopflation) - note he makes a specific point they could have gone higher for 16bit but didn't see the practical use for it (or 32bit).
Yeah, definitely such a fleeting description; he was focusing on the 1% with that comment and the 99% with what followed. You and I briefly discussed this in a PS2/Dreamcast thread IIRC as I hadn't realised the importance of AI/ML still needing quality base textures.
That was one of those 'blink and you miss it' moments - but the distinction he makes is important. The difference from simply 'rendering lower resolution' is that screenspace math differs (especially the interpolants). I.e. if you remember some games that added DLSS and ended up with comically low-resolution textures (until patches fixed it) - that's an example of the difference between lowres+upscale and 'sparse sampling+hole reconstruction'.
One single stack of HBM4 can give 1.5 TB/s of bandwidth.
But did the speed per pin increase? Last time I saw, HBM3E was still slower per pin than GDDR6. And I know GDDR7 is 30% faster and 6X.
If anything I see PS6 going with GDDR7X and the 40-48 Gbps per pin.
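The per-pin vs total bandwidth question comes down to bus width times pin speed. A small sketch, assuming a 256-bit GDDR bus (as on PS5) and a 2048-bit interface per HBM4 stack; both widths are my assumptions and are not stated in the posts above.

```python
def bandwidth_gb_s(bus_width_bits: int, gbps_per_pin: float) -> float:
    """Total bandwidth in GB/s = pins x Gbit/s per pin / 8 bits per byte."""
    return bus_width_bits * gbps_per_pin / 8

def per_pin_gbps(total_gb_s: float, bus_width_bits: int) -> float:
    """Inverse: the per-pin rate needed to reach a total bandwidth."""
    return total_gb_s * 8 / bus_width_bits

GDDR_BUS = 256    # assumed console-style GDDR bus width, in bits
HBM4_BUS = 2048   # assumed interface width of one HBM4 stack, in bits

gddr_40 = bandwidth_gb_s(GDDR_BUS, 40)    # low end of the 40-48 Gbps range
gddr_48 = bandwidth_gb_s(GDDR_BUS, 48)    # high end
hbm4_pin = per_pin_gbps(1500, HBM4_BUS)   # 1.5 TB/s from one stack

if __name__ == "__main__":
    print(f"256-bit GDDR at 40-48 Gbps: {gddr_40:.0f}-{gddr_48:.0f} GB/s")
    print(f"HBM4 stack at 1.5 TB/s needs about {hbm4_pin:.2f} Gbps per pin")
```

Under these assumptions a single HBM4 stack reaches 1.5 TB/s with pins running far slower than GDDR, so "slower per pin" and "more total bandwidth" are both true; the difference is the very wide interface.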
If you think it's not needed, why are you complaining?
But there is no downside though. The PS5 Pro is still the most powerful console plus ML upscaling. Thus the best console to play multiplats and exclusive games to boot.
I like the honesty here. It's logical but they are essentially saying that people who buy a PS5 Pro are beta testers for the PS6 which can be taken either way.
I do admire their asshole, i mean their hassle.
- 30 WGPs
- architecture is "between RDNA2 and RDNA3", called RDNA2.x
- e.g. geometry pipeline is RDNA3 based
- BVH8 acceleration structure for RT
- no dual-issue FP like RDNA3
- 16.7 TFlops
What is the reasoning behind not implementing full RDNA3 dual issue?
Selling overpriced hardware as a testbed for future developments is a solid decision from a business standpoint.
He explained it. This is not a new gen device. This is doing far more advanced things with ML and their RT customizations than RDNA3 is.
What is the reasoning behind not implementing full RDNA3 dual issue?
For the GPU they didn't adopt full RDNA3 feature set because it would force developers to have different compilers, patches and executables. They don't want that burden for developers for a mid gen upgrade.
I know some of the Xbox people that care about public opinion used to come here, but not usually the tech minded types.
You all think Mark Cerny visits NeoGAF?
Developer issue. If every game looked like shit you'd have a valid argument. But the majority look much better.
For a 700 “pro” console with all the technical bullshit that cerny spouts, higher frames should be the fucking minimum.
MH Wilds struggles on PS5, and PS5 Pro greatly improves the frame rate.
It's not overpriced: it's got a custom RX 6800 GPU, more memory, and a 2 TB SSD, and it's a mid gen refresh. The reasoning for not including an RDNA 3 GPU was simply that it would need a separate build of each game compiled for the Pro rather than just building once for PS5/Pro.
Not even an Nvidia 5090 can control shit decisions by shitty developers. Just as Cerny said in the video, it requires a rethinking of how developers approach this. Some are excelling right away, some are just tossing in a shit ton of raytracing and hoping for the best (see Alan Wake 2).
For a 700 “pro” console with all the technical bullshit that cerny spouts, higher frames should be the fucking minimum.
No, the PS5 Pro's main expectation was to put out 30fps quality mode type of graphics at 60fps. The first party games generally have done this. The third party ones have been hit and miss.
PS5 Pro main expectation was to get borderline unplayable games (image quality/frame rate) over the line. Fidelity/frame rate improvements to already technically excellent games (eg. Sony 1st party) are more on the enthusiast side. With the mixed results so far on the former.
Dude he literally answered this question, in like the first 10 minutes.
What is the reasoning behind not implementing full RDNA3 dual issue?
I also have 1% of your shilling power but I'm gonna manage to survive somehow.
I love these takes from posters with 1% of Cerny's intelligence
Keep them coming lol
You didn't bother to watch the video?
Also I like how the "PS5 is RDNA1.x" talk disappeared
Cerny talk didn't change the fact that PS5 is missing RDNA2 features.
Clearly nothing is missing. These consoles have so many custom parts that saying RDNA2 has missing features means very little.
Cerny talk didn't change the fact that PS5 is missing RDNA2 features.
NVIDIA doesn't do that.
Having dedicated Tensor units means these units have their own L1 and registers.
Shader binary compatibility (and not wanting to add a third HW instruction encoding scheme).
What is the reasoning behind not implementing full RDNA3 dual issue?
Cerny talk didn't change the fact that PS5 is missing RDNA2 features.
Whatever Cerny didn't put there is NOT NEEDED!
Want proof? Look at PS5 vs Series X
Clearly nothing is missing. These consoles have so many custom parts that saying RDNA2 has missing features means very little.
NVIDIA doesn't do that.
And yet those features are in Pro.