Valve Hardware Day 2006 - Multithreaded Edition
by Jarred Walton on November 7, 2006 6:00 AM EST - Posted in Trade Shows
The Future of Gaming?
The first thing that Valve had to do was look at the threading models available: how can they distribute work to the cores? Having decided on hybrid threading, the next step was to decide what sort of software tools were needed, and to create those tools as necessary. Most of that work should now be done or nearing completion, so the big question is: with the threading in place, how will they apply the available power? Up to this point we have primarily been concerned with what multithreading is and how Valve has attempted to implement the technology. All this talk about multi-core support sounds good, but without a compelling need for the technology it really isn't going to mean much to most people.
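To make the hybrid approach concrete, here is a minimal sketch of the kind of structure it implies: a pool of worker threads pulling fine-grained jobs from a shared queue, running alongside whatever dedicated per-subsystem threads the engine keeps. This is our own illustration in modern C++, not Valve's code; the WorkQueue class and all of its names are invented for the example.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// A minimal job queue: worker threads block until jobs arrive, then
// execute them. Dedicated subsystem threads (sound, networking) would
// run beside this pool; the mix of both is the "hybrid" part.
class WorkQueue {
public:
    explicit WorkQueue(unsigned workers) {
        for (unsigned i = 0; i < workers; ++i)
            pool_.emplace_back([this] { Run(); });
    }
    ~WorkQueue() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        wake_.notify_all();
        for (std::thread& t : pool_) t.join();
    }
    void Submit(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push(std::move(job));
        }
        wake_.notify_one();
    }

private:
    void Run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return done_ || !jobs_.empty(); });
                if (done_ && jobs_.empty()) return;
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // run the work item on whichever core picked it up
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::queue<std::function<void()>> jobs_;
    std::vector<std::thread> pool_;
    bool done_ = false;
};
```

Submitting many small, independent jobs to a queue like this is what lets the same binary scale from one core to many without separate code paths.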
We've already seen quite a few games that are almost entirely GPU bound, so some might argue that the GPU needs to get faster before we even worry about the CPU. As Valve sees things, however, the era of pretty visuals is coming to an end. We have now reached the point where, in terms of graphics, most people are more than satisfied with what they see. Games like Oblivion look great, but it's still very easy to tell that you're not in a real world. This does not mean that better graphics are not important, but Valve is now interested in taking a look at the rest of the story and what can be done beyond graphics. Valve also feels that their Source engine has traditionally been more CPU limited anyway, so they were very interested in techniques that would allow them to improve CPU performance.
Before we get to the stuff beyond graphics, though, let's take a quick look at what's being done to improve graphics. The hybrid threading approach to the rendering process can be thought of as follows:
At present, everything up to drawing in the rendering pipeline is essentially CPU bound, and even the drawing process can be at least partially CPU bound with tasks such as generating dynamic vertex buffers. The revised pipeline shows one way to approach it: spending CPU time building the vertex buffer makes the GPU more efficient, so even in cases where you are mostly GPU limited you can still improve performance by applying additional CPU power. Another graphics-related item is the animation and bone transformations that must be done prior to rendering a scene. Bone transforms can be very time consuming, and as you add more creatures the CPU limitations become more and more prevalent. One solution is to simply cheat, either by reducing the complexity of the animations, by cloning one model repeatedly, or by other methods; but with more processor power it becomes possible to do the full calculations for more units.
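As a rough illustration of the vertex buffer point, the CPU-side work amounts to filling a GPU-visible buffer. The sketch below is ours, with a stand-in Vertex layout and mapped pointer rather than any real graphics API; the cycles spent here overlap with the GPU's drawing.

```cpp
#include <cstring>
#include <vector>

// Illustrative only: the Vertex layout and the mapped pointer are
// stand-ins, not a real graphics API.
struct Vertex {
    float x, y, z;     // position
    float nx, ny, nz;  // normal
    float u, v;        // texture coordinates
};

// Copy CPU-generated vertices into GPU-visible memory. Any per-vertex
// work (skinning, transforms) happens before this, so a mostly
// GPU-limited game still gains from extra CPU power spent here.
void FillDynamicVertexBuffer(void* gpuMappedMemory,
                             const std::vector<Vertex>& vertices) {
    std::memcpy(gpuMappedMemory, vertices.data(),
                vertices.size() * sizeof(Vertex));
}
```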
A specific Half-Life example that was given: you might have a scene with 200 Combine soldiers standing in rank, and the animations for that many units require a huge chunk of CPU time. All of the bone transformations can be done in parallel, however, so more available CPU power can directly equate to having more entities on screen at the same time.
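Since each soldier's skeleton depends only on its own animation state, the work maps naturally onto a chunked parallel loop. A minimal sketch follows (ours, with invented types; the real skinning math is reduced to a stand-in):

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

struct Soldier {
    std::vector<float> boneMatrices;  // flattened bone transforms
};

// Stand-in for the expensive per-entity animation work.
void ComputeBoneTransforms(Soldier& s) {
    for (float& m : s.boneMatrices) m = m * 1.0f;
}

// Each soldier is independent, so the list can be split across all
// available cores with no locking between chunks.
void AnimateAll(std::vector<Soldier>& soldiers) {
    const unsigned cores = std::max(1u, std::thread::hardware_concurrency());
    const std::size_t chunk = (soldiers.size() + cores - 1) / cores;
    std::vector<std::thread> workers;
    for (unsigned c = 0; c < cores; ++c) {
        const std::size_t begin = c * chunk;
        const std::size_t end = std::min(soldiers.size(), begin + chunk);
        if (begin >= end) break;
        workers.emplace_back([&soldiers, begin, end] {
            for (std::size_t i = begin; i < end; ++i)
                ComputeBoneTransforms(soldiers[i]);
        });
    }
    for (std::thread& t : workers) t.join();  // sync before rendering
}
```

Doubling the core count roughly halves the time spent in AnimateAll, which is why CPU power translates so directly into on-screen entity counts here.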
With more computational power available to do animations, the immersiveness of the game world can also be improved. Right now, the amount of interaction that most creatures have with the environment is relatively limited. If you think about the way people move through the real world, they are constantly bumping into other objects or touching other objects and people. While this would be largely a visual effect, the animations could be improved to show people actually interacting with the environment, so to an onlooker someone running around the side of a house might actually reach out their hand and grab the wall as they turn the corner. Similarly, a sniper lying prone on the top of a hill could actually show their body adjusting to the curvature of the ground rather than simply being a standard "flat" prone position. Two characters running past each other could even bump and react realistically, with arms and bodies being nudged to the side instead of mysteriously gliding past each other.
One final visual aspect that CPU power can influence is the rendering of particle systems. Valve has given us a benchmark that runs through several particle system environments, and we will provide results for that benchmark later. How representative of real-world gameplay the benchmark is remains to be seen, as it is more of a technology demonstration at present, but the performance increases are dramatic. Not being expert programmers, we also can't say for sure how much of the work being done on the CPU could be done elsewhere more efficiently.
Besides the benchmark, though, particle systems can be used to more realistically simulate things like flames and water. Imagine a scene where a campfire gets extinguished by real dynamically generated rain rather than as a canned animation: you could actually see small puffs of smoke and water as individual drops hit the fire, and you might even be able to kick the smoldering embers and watch individual sparks scatter around on the ground. The goal is to create a more immersive world, and the more realistic things look and behave, the more believable the environment. Is all of this necessary? Perhaps not with conventional games that we're used to, but certainly it should open up new gameplay mechanics and that's rarely a bad thing.
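For a sense of why particles benefit so much from extra cores, here is a toy update step (ours, not Valve's; the fields and constants are assumptions). Every particle integrates independently, so this loop could be chunked across cores exactly like the bone transforms above.

```cpp
#include <vector>

// Toy particle state; fields and units are assumptions.
struct Particle {
    float x, y, z;     // position
    float vx, vy, vz;  // velocity
    float life;        // seconds remaining before recycling
};

void UpdateParticles(std::vector<Particle>& particles, float dt) {
    const float gravity = -9.8f;  // assumed m/s^2, e.g. falling raindrops
    for (Particle& p : particles) {
        p.vy += gravity * dt;  // integrate velocity, then position
        p.x += p.vx * dt;
        p.y += p.vy * dt;
        p.z += p.vz * dt;
        p.life -= dt;  // collision/extinguish effects would hook in here
    }
}
```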
55 Comments
Nighteye2 - Wednesday, November 8, 2006 - link
Ok, so that's how Valve will implement multi-threading. But what about other companies, like Epic? How does the latest Unreal Engine multi-thread?

Justin Case - Wednesday, November 8, 2006 - link
Why aren't any high-end AMD CPUs tested? You're testing 2GHz AMD CPUs against 2.6+ GHz Intel CPUs. Doesn't Anandtech have access to faster AMD chips? I know the point of the article is to compare single- and multi-core CPUs, but it seems a bit odd that all the Intel CPUs are top-of-the-line while all AMD CPUs are low end.

JarredWalton - Wednesday, November 8, 2006 - link
AnandTech? Yes. Jarred? Not right now. I have a 5000+ AM2, but you can see that performance scaling doesn't change the situation. 1MB AMD chips do perform better than 512K versions, almost equaling a full CPU bin - 2.2GHz Opteron on 939 was nearly equal to the 2.4GHz 3800+ (both OC'ed). A 2.8 GHz FX-62 still isn't going to equal any of the upper Core 2 Duo chips.

archcommus - Tuesday, November 7, 2006 - link
It must be a really great feeling for Valve knowing they have the capacity and capability to deliver this new engine to EVERY customer and player of their games as soon as it's ready. What a massive and ugly patch that would be for virtually any other developer.

Don't really see how you could hate on Steam nowadays considering things like that. It's really powerful and works really well.
Zanfib - Tuesday, November 7, 2006 - link
While I design software (so not so much programming as GUI design and whatnot), I can remember my University courses dealing with threading, and all the pain threading can bring.

I predicted (though I'm sure many could say this and I have no public proof) that Valve would be one of the first to do such work. They are a very forward thinking company with large resources (like Google--if they want to work on ANYthing, they can...), a great deal of experience and, (as noted in the article) the content delivery system to support it all.
Great article about a great subject, goes a long way to putting to rest some of the fears myself and others have about just how well multi-core chips will be used (with the exception of Cell, but after reading a lot about Cell's hardware I think it will always be an insanely difficult chip to code for).
Bonesdad - Tuesday, November 7, 2006 - link
mmmmmmmmm, chicken and mashed potatoes....

Aquila76 - Tuesday, November 7, 2006 - link
Jarred, I wanted to thank you for explaining in terms simple enough for my extremely non-technical wife to understand why I just bought a dual-core CPU! That was a great progression on it as well, going through the various multi-threading techniques. I am saving that for future reference.

archcommus - Tuesday, November 7, 2006 - link
Another excellent article, I am extremely pleased with the depth your articles provide, and somehow, every time I come up with questions while reading, you always seem to answer exactly what I was thinking! It's great to see you can write on a technical level but still think like a common reader so you know how to appeal to them.

With regards to Valve, well, I knew they were the best since Half-Life 1 and it still appears to be so. I remember back in the days when we weren't even sure if Half-Life 2 was being developed. Fast forward a few years and Valve is once again revolutionizing the industry. I'm glad HL2 was so popular as to give them the monetary resources to do this kind of development.
Right now I'm still sitting on a single core system with XP Pro and have lots of questions bustling in my head. What will be the sweet spot for Episode 2? Will a quad core really offer substantially better features than a dual core, or a dual core over a single core? Will Episode 2 be fully DX10, and will we need DX10 compliant hardware and Vista by its release? Will the rollout of the multithreaded Source engine affect the performance I already see in HL2 and Episode 1? Will Valve actually end up distributing different versions of the game based on your hardware? I thought that would not be necessary due to the fact that their engine is specifically designed to work for ANY number of cores, so that takes care of that automatically. Will having one core versus four make big graphical differences or only differences in AI and physics?
Like you said yourself, more questions than answers at this point!
archcommus - Tuesday, November 7, 2006 - link
One last question I forgot to put in. Say it was somehow possible to build a 10 or 15 GHz single core CPU with reasonable heat output. Would this be better than the multi-core direction we are moving towards today? In other words, are we only moving to multi-core because we CAN'T increase clock speeds further, or is this the preferred direction even if we could?
You got it.

A higher clock speed processor would be better, assuming performance scaled well enough anyway. Parallel hardware is less general than serial hardware at increasing performance because it requires parallelism to be present in the workload. If the work is highly serial, then adding parallelism to the hardware does nothing at all. Conversely, even if the workload is highly parallel, doubling serial performance still doubles performance. Doubling the width of a unit could double the performance of that unit for certain workloads, while doing nothing at all for others. In general, if you can accelerate the entire system equally, doubling serial performance will always double program speed, regardless of the program.
That's the theory anyway. Practice says you can only make certain parts faster. So you might get away with doubling clock speed, but probably not halving memory latency, so your serial performance doesn't scale like you'd hope. Not to mention increasing serial performance is extremely expensive compared to parallel performance. But if it were possible, no one would ever bother with parallelism. It's a huge pain in the ass from a software perspective, and it's becoming big now mostly because we're starting to run out of tricks to increase serial performance.
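Amdahl's law puts a number on the tradeoff described in the comment above: if a fraction p of a program can run in parallel on n cores, the overall speedup is 1 / ((1 - p) + p / n). A minimal sketch of the arithmetic (ours, purely illustrative):

```cpp
#include <cstdio>

// Amdahl's law: with fraction p of the work parallelizable across
// n cores, overall speedup is 1 / ((1 - p) + p / n).
double AmdahlSpeedup(double p, int n) {
    return 1.0 / ((1.0 - p) + p / n);
}

int main() {
    // Even with 80% of the work parallel, 4 cores yield only ~2.5x,
    // whereas doubling serial speed always yields exactly 2x.
    std::printf("4 cores, p = 0.80: %.2fx\n", AmdahlSpeedup(0.80, 4));  // 2.50x
    std::printf("4 cores, p = 0.95: %.2fx\n", AmdahlSpeedup(0.95, 4));  // 3.48x
    return 0;
}
```

The serial fraction (1 - p) is exactly the part that doubling serial performance speeds up and that extra cores never touch, which is why the two approaches scale so differently.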