1-7x NVIDIA GeForce RTX 4090 GPU Scaling

Posted on November 9, 2022 (November 9, 2022) by Matt Bach
Always look at the date when you read an article. Some of the content in this article is most likely out of date, as it was written on November 9, 2022. For newer information, see our more recent articles.

Table of Contents

  • Introduction
  • Test Setup
  • Power Draw: Idle & OctaneBench
  • Performance: DaVinci Resolve Studio
  • Performance: RedShift
  • Performance: V-Ray
  • Performance: OctaneBench
  • Conclusion
Closeup photo of 1-7x NVIDIA GeForce RTX 4090 in mining rack

Introduction

About a month ago, NVIDIA began rolling out their new RTX 40 series GPUs, starting with the GeForce RTX 4090 24GB. The RTX 4090 is an incredibly powerful GPU, and in our content creation review, it easily blew past anything else on the market. In that same article, we included test results in both single and dual GPU configurations, but today, we want to take things to the extreme and see what kind of performance we can get with up to seven of these powerful cards.

Before we get too far, we want to clarify that here at Puget Systems, we have no intention of selling systems with this many RTX 4090s. These cards are both physically large and demand a ton of power, and we are currently planning on limiting our standard systems to a maximum of two RTX 4090s. Any more than that and you not only run into issues with the physical size of the cards, but also need more power than a standard 15-amp circuit in the USA can provide.

But sometimes our curiosity gets the better of us, and we can’t help testing a configuration that is far beyond what we typically offer to our customers. After seeing what two RTX 4090 GPUs are capable of, we had to push it to the maximum that our current Threadripper PRO workstation motherboards can handle and turn it up to seven.

Test Setup

With GPU supply being what it is right now, we unfortunately could not do this testing with a single brand and model of card. Instead, we raided cards from our Labs and R&D teams, borrowing both NVIDIA FE cards and AIB models. The one requirement we stuck to was that none of the cards were overclocked models (mostly because we didn’t have enough OC cards to do this testing, and didn’t want to mix them).

Test Platform

CPU: AMD Threadripper PRO 5995WX 64-Core
CPU Cooler: Supermicro SNK-P0064AP4
Motherboard: Asus Pro WX WRX80E-SAGE SE WIFI
RAM: 8x Micron DDR4-3200 16GB ECC Reg. (128GB total)
GPUs: 4x NVIDIA GeForce RTX 4090 24GB FE, 2x PNY GeForce RTX 4090 XLR8 24GB, 1x Asus GeForce RTX 4090 TUF 24GB
PSU: 4x Super Flower LEADEX Platinum 1600W
Storage: Samsung 980 Pro 2TB
OS: Windows 11 Pro 64-bit (2009)

Benchmark Software

DaVinci Resolve 18.0.4
PugetBench for DaVinci Resolve 0.93.1 (GPU Effects only)
OctaneBench 2020.1.5
RedShift 3.5.09
V-Ray Benchmark 5.02.00
Photo of multiple GPUs on counter - NVIDIA GeForce RTX 4090 24GB FE, PNY GeForce RTX 4090 XLR8 24GB, Asus GeForce RTX 4090 TUF 24GB
Top down photo of 1-7x NVIDIA GeForce RTX 4090 in mining rack
Angled photo of 1-7x NVIDIA GeForce RTX 4090 in mining rack

In addition to the base hardware itself, we used an 8 GPU mining rack and 6x Lian Li 600mm PCI-e 4.0 riser cables (PW-PCI-4-60X) in order to get all the GPUs connected to the motherboard. For those that are curious, the total price for this configuration is somewhere in the neighborhood of $28-30k.

Considering we were using a bunch of PCIe riser cables and up to four power supplies, things went remarkably smoothly. The only significant issue we had was that we were unable to get the RTX 4090 cards working properly at Gen4 speeds. Not only would the entire system hang as soon as it tried to boot to the OS drive, but the BIOS was also extremely laggy – taking more than a second to respond to inputs, and often repeating a single input two or three times.

We did not have this problem if we switched to an RTX 30 series GPU (or if we plugged the RTX 4090 straight into the motherboard), but for these 4090s with the PCIe extension cables, we ended up having to set the PCIe mode to Gen3. Luckily, PCIe x16 Gen3 is still excellent, and should only have a minor impact on overall performance for the applications we are testing.
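
To put the Gen3 fallback in perspective, the theoretical link bandwidth can be worked out from the per-lane transfer rate and the line encoding. The sketch below is only a back-of-the-envelope calculation, not a measurement from this test system: it uses the published PCIe signalling rates (8 GT/s for Gen3, 16 GT/s for Gen4) and 128b/130b encoding, and it ignores packet and protocol overhead. The function name is ours, chosen for illustration.

```python
# Per-lane raw signalling rate in gigatransfers per second (GT/s).
PCIE_GT_PER_S = {3: 8.0, 4: 16.0}

# PCIe Gen3 and Gen4 both use 128b/130b line encoding.
ENCODING_EFFICIENCY = 128 / 130


def pcie_bandwidth_gb_s(gen, lanes=16):
    """Theoretical one-direction bandwidth in GB/s, before protocol overhead."""
    bits_per_s = PCIE_GT_PER_S[gen] * 1e9 * ENCODING_EFFICIENCY * lanes
    return bits_per_s / 8 / 1e9


for gen in (3, 4):
    print(f"PCIe Gen{gen} x16: {pcie_bandwidth_gb_s(gen):.2f} GB/s per direction")
```

On paper, Gen3 x16 offers roughly half the bandwidth of Gen4 x16 (about 15.75 GB/s versus 31.5 GB/s per direction). How much of that difference shows up in practice depends on how often a workload moves data across the bus.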

Power Draw: Idle & OctaneBench

Before we get into performance, we wanted to give a little look into the overall power draw of the system. The RTX 4090 is one of the most power-hungry GPUs ever released, and we needed a total of four 1600W power supplies just to provide all the PCIe power cables the cards (and motherboard) required.

1-7x NVIDIA GeForce RTX 4090 GPU Idle and Load Power Draw

The chart above shows the total power draw from the wall across all power supplies, both at idle and at maximum load during OctaneBench. We could have used synthetic tests to get even higher power draw, but we opted to stick to realistic workloads, even if this configuration isn’t exactly realistic itself.

Idle power draw was fairly decent, all things considered, going from 150W with a single GPU to 353W with all seven. That is only about 34 watts for each GPU added on average, with some variance every couple of cards as we added additional power supplies to the mix.

The power draw under load was where things got a bit out of hand. A single RTX 4090 only had a total system power draw of 510 watts in OctaneBench, but each card we added increased the power draw by about 375 watts. At the very top, we peaked at 2,750 watts, or roughly the equivalent of two standard electric space heaters going at full blast. Our AC was certainly working overtime during these tests!
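
The per-card figures can be recomputed from the one-GPU and seven-GPU readings quoted in this section. The sketch below is simple arithmetic on those four numbers. The only value not taken from the article is the 120 V nominal voltage used to turn a 15-amp circuit into a wattage, which is our assumption, and the helper names are illustrative.

```python
import math

# Wall-power readings quoted in the article, keyed by GPU count (watts).
IDLE_W = {1: 150, 7: 353}
LOAD_W = {1: 510, 7: 2750}

# Assumption (not from the article): nominal 120 V on a US 15 A circuit.
CIRCUIT_W = 15 * 120


def watts_per_added_gpu(readings):
    """Average increase in wall power for each GPU added after the first."""
    (n_lo, w_lo), (n_hi, w_hi) = sorted(readings.items())
    return (w_hi - w_lo) / (n_hi - n_lo)


def circuits_needed(load_w, circuit_w=CIRCUIT_W):
    """Minimum number of circuits whose combined rating covers the load."""
    return math.ceil(load_w / circuit_w)


print(f"Idle: +{watts_per_added_gpu(IDLE_W):.0f} W per added GPU")
print(f"Load: +{watts_per_added_gpu(LOAD_W):.0f} W per added GPU")
print(f"15 A circuits for 7 GPUs: {circuits_needed(LOAD_W[7])}")
```

Averaged over the six added cards, this gives about 34 W per GPU at idle and about 373 W per GPU under load. Under the 120 V assumption, the seven-card peak would need to be spread across at least two 15-amp circuits.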

Technically, in terms of straight wattage, we could have gotten away with just two 1600W power supplies. However, the main limiter was actually the number of PCIe power cables each PSU supported. And with how power-hungry the RTX 4090 can be, we wanted to play it safe and avoid using things like PCIe splitters or pigtails.
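
The difference between sizing by wattage and sizing by cable count can be sketched as two separate ceiling divisions. The PSU rating and peak draw below come from the article. The cable counts (per GPU, per PSU, and for the motherboard) are hypothetical placeholders chosen only to illustrate the calculation, so substitute the real numbers for your own hardware.

```python
import math

PSU_W = 1600          # rating of each power supply (from the article)
PEAK_WALL_W = 2750    # peak draw at the wall with seven GPUs (from the article)
N_GPUS = 7

# Hypothetical values, NOT from the article: adjust for your cards and PSUs.
CABLES_PER_GPU = 4        # assumed PCIe power cables feeding each card
CABLES_PER_PSU = 9        # assumed PCIe/CPU cable outputs on each PSU
MOTHERBOARD_CABLES = 2    # assumed extra cables for the motherboard


def psus_for_wattage(load_w, psu_w=PSU_W):
    """PSUs needed if only total wattage mattered."""
    return math.ceil(load_w / psu_w)


def psus_for_cables(n_gpus, per_gpu, per_psu, extra=0):
    """PSUs needed to supply every cable without splitters or pigtails."""
    return math.ceil((n_gpus * per_gpu + extra) / per_psu)


print("By wattage:", psus_for_wattage(PEAK_WALL_W))
print("By cables: ", psus_for_cables(N_GPUS, CABLES_PER_GPU,
                                     CABLES_PER_PSU, MOTHERBOARD_CABLES))
```

With these placeholder cable counts, wattage alone calls for two supplies while the cable count calls for four. That is the same pattern described above, although the real connector counts may differ.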

Performance: DaVinci Resolve Studio

1-7x NVIDIA GeForce RTX 4090 GPU Scaling Performance in DaVinci Resolve Studio

To start off our performance testing, we want to look at the “GPU Effects” portion of our DaVinci Resolve benchmark. While DaVinci Resolve Studio doesn’t scale as efficiently as many of the other applications we will be testing, Resolve is known in the video editing world for how well it can utilize multi-GPU configurations for OpenFX and noise reduction.

Going from one card to two, and two cards to three, both result in about a 50% increase in performance. This results in a 3x RTX 4090 configuration performing about 2.3x faster than a single card. After that, however, the gains drop off sharply, and you are only looking at about a 10% performance gain for each additional RTX 4090. And, once you get up to six cards, we saw no performance advantage to adding a seventh card.

At the very peak, with six RTX 4090 GPUs, we saw exactly a 3x increase in performance over a single GPU. That isn’t terrible by any means, but we expect to see even bigger gains when we start looking at GPU rendering engines like Redshift, V-Ray, and Octane.
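
The step-by-step gains described above compound, which is how two 50% steps and three 10% steps end up at roughly 3x. The sketch below multiplies out the approximate per-card gains quoted in this section. They are rounded figures from the text, not raw benchmark scores, so the results are only indicative.

```python
# Approximate gain from adding the 2nd, 3rd, ... 7th card (from the text).
STEP_GAINS = [0.50, 0.50, 0.10, 0.10, 0.10, 0.00]


def cumulative_speedups(step_gains):
    """Speedup over one GPU after each card is added, starting at 1.0."""
    speedups = [1.0]
    for gain in step_gains:
        speedups.append(speedups[-1] * (1 + gain))
    return speedups


for n_gpus, speedup in enumerate(cumulative_speedups(STEP_GAINS), start=1):
    efficiency = speedup / n_gpus
    print(f"{n_gpus} GPU(s): {speedup:.2f}x ({efficiency:.0%} of ideal scaling)")
```

The compounded values land at about 2.25x for three cards and just under 3x for six, in line with the figures reported above. Dividing by the card count also shows that each added GPU contributes less than the one before it.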

Performance: RedShift

1-7x NVIDIA GeForce RTX 4090 GPU Scaling Performance in RedShift
1-7x NVIDIA GeForce RTX 4090 Relative GPU Scaling Performance in Redshift

The first GPU rendering engine we want to examine is Redshift. Redshift is different from many other rendering benchmarks in that it times how long it takes to render a single scene, rather than reporting how many samples per second the system can process. The issue with this method is that as hardware gets faster and faster, the test takes less and less time to finish. This can impact the measurement directly (especially since Redshift only reports results in whole seconds rather than fractions), and it can also make things like scene load time take up a bigger portion of the result rather than isolating the rendering portion.

Because of this, we feel that the current Redshift benchmark is a bit misleading when it comes to incredibly powerful configurations like this. The performance scaling was decent up through four RTX 4090 cards, but the benefit per GPU dropped off after that. We still ended up with a peak performance that was 4.5x higher than a single RTX 4090, but we suspect that actual Redshift users rendering large scenes would see a larger benefit in most cases than what this benchmark is currently capable of showing.
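
A toy model shows how whole-second reporting and a fixed load time can understate scaling. Every number in the sketch below is hypothetical: the 120-second render and 5-second overhead are made up for illustration, the render portion is assumed to scale perfectly with GPU count, and the reported time is assumed to be rounded to the nearest second (the benchmark may truncate instead).

```python
# Hypothetical inputs, NOT measurements from the article.
RENDER_S = 120    # assumed render time on one GPU, in seconds
OVERHEAD_S = 5    # assumed fixed scene-load time that does not scale


def reported_seconds(render_s, n_gpus, overhead_s):
    """Whole-second result for an ideally scaling render plus fixed overhead."""
    return round(overhead_s + render_s / n_gpus)


def apparent_speedup(render_s, n_gpus, overhead_s):
    """Speedup computed from the reported whole-second times."""
    one_gpu = reported_seconds(render_s, 1, overhead_s)
    return one_gpu / reported_seconds(render_s, n_gpus, overhead_s)


for n in (1, 2, 4, 7):
    t = reported_seconds(RENDER_S, n, OVERHEAD_S)
    s = apparent_speedup(RENDER_S, n, OVERHEAD_S)
    print(f"{n} GPU(s): {t:>3d} s reported, apparent speedup {s:.2f}x")
```

Even though rendering scales perfectly in this model, the seven-GPU run reports 22 seconds against 125 seconds for one GPU, an apparent speedup of under 5.7x. Longer renders shrink the gap, which fits the expectation that large production scenes would show a bigger benefit than the benchmark does.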

Performance: V-Ray

1-7x NVIDIA GeForce RTX 4090 GPU Scaling Performance in VRay Benchmark
1-7x NVIDIA GeForce RTX 4090 Relative GPU Scaling Performance in VRay

Next up is V-Ray, which we tested in both CUDA and RTX modes. CUDA mode shows diminishing returns once you get above four RTX 4090s, but RTX mode had terrific scaling all the way up to six cards. It dropped off a bit with the seventh card, but we were still able to end up with a maximum of about 6x faster performance with seven RTX 4090s versus a single RTX 4090.

Performance: OctaneBench

1-7x NVIDIA GeForce RTX 4090 GPU Scaling Performance in OctaneBench
1-7x NVIDIA GeForce RTX 4090 Relative GPU Scaling Performance in OctaneBench

Last up is OctaneBench, which typically has the best GPU scaling out of anything we currently test in our standard suite of benchmarks. We weren’t completely sure how it would hold up with this many cards, but it came through just about perfectly. For each card we added to the system, we got within a few percent of perfect scaling.

In fact, with seven RTX 4090 GPUs we ended up with about 6.9x higher performance than a single card. That is about as good as you can ask for and really shows off how well OctaneRender is able to take advantage of multi-GPU configurations.
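
One way to compare the applications on a common scale is scaling efficiency: the measured speedup divided by the number of GPUs. The sketch below uses the approximate headline speedups quoted in each performance section. One caveat: the text does not state which GPU count produced the Redshift peak, so seven cards is our assumption there.

```python
# (GPU count, speedup over one GPU), taken from the figures quoted in the text.
RESULTS = {
    "DaVinci Resolve (GPU Effects)": (6, 3.0),
    "Redshift": (7, 4.5),  # GPU count for the 4.5x peak is assumed to be 7
    "V-Ray (RTX mode)": (7, 6.0),
    "OctaneBench": (7, 6.9),
}


def scaling_efficiency(speedup, n_gpus):
    """Fraction of ideal (linear) scaling achieved; 1.0 means perfect."""
    return speedup / n_gpus


for app, (n_gpus, speedup) in RESULTS.items():
    eff = scaling_efficiency(speedup, n_gpus)
    print(f"{app:<30} {speedup:.1f}x on {n_gpus} GPUs -> {eff:.0%} efficient")
```

By this measure OctaneBench sits at roughly 99% of ideal scaling, V-Ray in RTX mode at about 86%, Redshift at about 64% under the stated assumption, and Resolve's GPU Effects tests at 50%.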

Conclusion

While the overwhelming majority of users are only going to have one, maybe two, GPUs in their system, taking things to the extreme like this is a great way to see just what is possible with modern, off-the-shelf hardware. While we certainly used some non-standard things like four power supplies, PCIe extension cables, and a GPU mining rack, you could (with deep enough pockets) build this exact same rig yourself.

Now, should you actually build this? Probably not, and we have zero intention of ever selling anything like this to our customers. It was surprisingly stable, with the only real issue being we couldn’t run the RTX 4090s at PCIe Gen4 speeds, but that doesn’t make it a good idea. The better solution if you need this much GPU computing power is likely a small cluster, with only a few GPUs in each system. Sticking with GeForce will limit you to two GPUs in most cases, but with Quadro, you can easily do three, sometimes four, cards per system.

Distance photo of 1-7x NVIDIA GeForce RTX 4090 in mining rack

Feasibility aside, however, it is very impressive what this amount of processing power is capable of. Not every application is going to be able to scale well enough to take advantage of this many cards, but GPU rendering engines like OctaneRender in particular can see some incredible performance gains. The NVIDIA GeForce RTX 4090 is an extremely fast GPU in the first place, and when you combine this many of them, the performance (along with the power draw and heat output) goes through the roof.

For example, this exact configuration is currently number 12 on the OctaneBench database, trailing behind configurations that mostly consist of 14+ GPUs. And this is with non-overclocked GPUs; if we wanted to get into that, we would probably be looking at a #11, possibly #10 spot. In fact, someone out there has already tested an 8x RTX 4090 configuration, which is sitting at the #7 spot for OctaneBench!

All in all, this was a very fun project. The build unfortunately had to be torn down almost immediately to return the GPUs to their respective departments, but it is amazing what computers are capable of when you take things to these extremes.

Tags: DaVinci Resolve, GPU, GPU Scaling, NVIDIA, OctaneBench, Redshift, Rendering, RTX 4090, V-Ray
