
What is the most reliable hardware in our Puget Systems workstations?

Posted on August 21, 2019 by Matt Bach
Always look at the date when you read an article. Some of the content in this article is most likely out of date, as it was written on August 21, 2019. For newer information, see our more recent articles.

Table of Contents

  • Introduction
  • Shop failure analysis
  • Field failure analysis
  • Conclusion

Introduction

Here at Puget Systems, one of our top priorities is to provide the highest quality workstations possible for our customers. While the performance testing we do in our Labs department is a big part of this, the product qualification we do behind the scenes is equally important. In the end, high performance is terrific, but it means nothing if the workstation is not reliable as well.

As a part of our constant drive to offer only the highest quality components possible, we track and regularly review the failure rates for each part we carry. This is typically an internal process, but every once in a while, we like to share some of our analysis with the public. Today, we want to give a high-level view of how reliable the different types of components we use in our workstations are in terms of both their shop (failures caught in our production process) and field (failures after the system has shipped to the customer) failure rates. In addition, we want to recognize what has turned out to be the most reliable hardware used in our workstations.

We do want to note that since every single part we carry goes through a very comprehensive qualification process, our failure rates should be much lower than the industry average. There are a lot of great products out there, but also a lot of… not so great… ones. This means that our relative failure rates between certain hardware groups (like ECC vs Standard RAM and GeForce vs Quadro) likely do not match up with what you would see from other workstation manufacturers or if you were to build your own system.

Shop failure analysis

Computer hardware is amazingly complex, and even with our rigorous qualification process, there is always the chance that a part can be defective. However, one of the goals of our production process (where we build, install, and test every single system we sell) is to catch as many of these defective parts as possible. This includes the obvious actions like checking for physical defects, benchmarking the system to check for irregular performance, and ensuring that the cooling is adequate; but it also includes us deliberately trying to break things.

We don't take this to the extreme, of course, but we would much rather cause a part to fail prematurely while it is still in our shop (where we can easily replace it) than have it fail a day, month, or year after a customer receives their machine. Because of this, our "Shop Failures" include what most people would define as DOA (dead on arrival) failures, but also failures that occurred during our extensive burn-in process.

Figure: Puget Systems shop hardware failure rates

To see how reliable different groups of hardware are, we decided to pull data on a quarterly basis going back about 3 years. To help make sense of this data, we will organize it by hardware group:

CPU (Intel) – There isn't much to talk about here – processors have long been one of the most reliable parts in any computer.

Motherboard – Overall, initial motherboard reliability is trending in the right direction. From Q1 2017 to Q2 2018, the failure rate was on a steady decline. Since then, it has been a bit volatile but the average is flat overall.

RAM – We decided to chart standard and ECC (which includes both registered and non-registered models) RAM separately because there are a few very interesting data points. First of all, the initial quality of ECC RAM has been rock solid for years and is one of the least likely parts to give us problems during our production process. Standard RAM has also been very good in the last few years, but it is interesting to see how much it improved in Q4 of 2017. Two things happened at this time: We moved to DDR4-2666 RAM (from DDR4-2400), and we moved to the (new at the time) Intel 8th Gen processors like the Intel Core i7 8700K. Either one of these could be the cause of the improvement, although it is likely a combination of both.

GPU – Similar to RAM, we split up the failure rate between GeForce and Quadro video cards. Starting with GeForce, we have seen a slow, but steady, increase in shop failures since about Q3 2017. We are not quite sure why this is, since we have not changed brands much during this time period, but it is an alarming trend. For Quadro, the shop failure rate has a pair of huge spikes in Q2 2017 and Q4 2018, which are almost entirely due to a couple of bad batches of Quadro P600 cards with defective HDMI or Mini DP ports.

Storage – For storage, there are two stories depending on whether the drive is a platter drive or an SSD. First, for WD platter drives, the initial quality saw a small improvement in Q1 2018, and has been stable (and terrific) since then. The bigger story (and actually what prompted us to put this post together) was the initial quality of Samsung SSDs. Simply put, even though we sell thousands of Samsung drives every year, we only ever have a handful that give us any problems during our production process.

Power Supply – Rounding out our data, power supplies have been pretty consistent over the last 3 years. We have primarily been using EVGA power supplies during this time, and it is good to see this stable reliability since inconsistent quality is a major problem we've had with other power supply brands.

Field failure analysis

While we strive to make the most reliable workstations possible, hardware can eventually break no matter how much time and effort we put into it. In many ways, these "field" failures that happen after we have shipped the system are much more important than the "shop" failures. If a part fails on us during the production process, it is an inconvenience, but we can typically replace it and resume the process fairly quickly. When a part dies on a customer, however, it can be a very annoying process to get it replaced – even with our industry-leading support and repair department.

Figure: Puget Systems field hardware failure rates

Once again, we are charting the failure rates since Q3 2016 (the last three years). However, note that the dates shown are when we purchased and installed the part, not the date that the part failed. In other words, the older the date on the chart, the longer that part has been running in the field. Since older parts are often more susceptible to failure, this means that it is completely normal for the "field" failure rate to increase as you look further back in time.
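To make the install-date bookkeeping concrete, here is a minimal sketch of how a field failure rate per install quarter could be computed. It is an illustration only, not the tooling behind the charts: the record layout, the function names, and the sample data are all hypothetical.

```python
from collections import Counter
from datetime import date


def quarter_label(d: date) -> str:
    """Return a label such as 'Q3 2016' for a calendar date."""
    return f"Q{(d.month - 1) // 3 + 1} {d.year}"


def field_failure_rate_by_install_quarter(parts):
    """parts: iterable of (install_date, failure_date_or_None) pairs.

    Each part is counted in the quarter it was installed, even if it
    failed years later.
    """
    installed = Counter()
    failed = Counter()
    for install_date, failure_date in parts:
        cohort = quarter_label(install_date)
        installed[cohort] += 1
        if failure_date is not None:
            failed[cohort] += 1
    return {cohort: failed[cohort] / installed[cohort] for cohort in installed}


# Made-up sample data, for illustration only.
sample_parts = [
    (date(2016, 8, 10), date(2019, 2, 1)),  # failed in 2019, counted under Q3 2016
    (date(2016, 9, 5), None),
    (date(2019, 4, 2), None),
    (date(2019, 5, 20), None),
]
cohort_rates = field_failure_rate_by_install_quarter(sample_parts)
print(cohort_rates)
```

Because the failure is attributed to the install quarter, older cohorts have had more time to accumulate failures, which is why their rates tend to look higher.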

With that noted, let's examine each category individually:

CPU (Intel) – While Intel CPUs have been very reliable over the last two years, it is interesting to see a spike in failures on systems that are about three years old. Looking at the data, this increase in failures appears to be primarily driven by the fact that three years ago we were using the Broadwell-based Intel X-series CPUs (such as the Core i7 6900K), which have a higher failure rate than the newer models.

Motherboard – Here, we see the kind of trend that you might expect from computer hardware in general. As we look further back in time, the failure rate increases at a steady rate. What this means is that the older your system is, the more likely your motherboard is to develop issues.

RAM – Interestingly, there isn't a huge difference in reliability between standard and ECC RAM over the three-year period we charted. ECC RAM is slightly more reliable, but not by much. In either case, it appears that the age of the RAM has only a minor impact on its reliability.

GPU – For video cards, reliability differs quite a bit between GeForce and Quadro cards. On the Quadro side, the cards are extremely reliable for the first year, but the failure rate rises sharply for cards installed around Q2 2018 or earlier. This is interesting because we have primarily used the Quadro P-series all the way back to the end of 2016, so it was not a matter of a new product line changing the quality of the cards. For GeForce, however, reliability has actually been getting a bit worse recently, with an uptick in failure rate over the last year. This is a worrying trend since it means that newer GeForce cards are having problems at a higher rate than 2-3 year old cards.

Storage – For the WD platter drives, we are seeing only a very small increase in failures over time. Samsung SSDs have a few failures if you go all the way back to 2016, but otherwise have been nearly perfect in terms of reliability.

Power Supply – The field failure rate for power supplies over the last three years is very similar to motherboards – as the power supply gets older, it becomes more and more likely to fail.

Another way we can look at this data is to group the reliability by year and stick all of the results onto a single chart. This doesn't give us quite the fine detail of our above charts, but it helps to give a clearer look at how the reliability of each hardware type changes depending on the age of the system.
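As a rough sketch of this kind of yearly grouping, the snippet below buckets parts by hardware group and by whole years in service, then computes a failure rate per bucket. The tuple layout, function names, and reference date handling are assumptions made for illustration, and the sample records are invented.

```python
from collections import defaultdict
from datetime import date


def years_in_service(install_date: date, as_of: date) -> int:
    """Whole years since installation (0 means less than a year old)."""
    years = as_of.year - install_date.year
    if (as_of.month, as_of.day) < (install_date.month, install_date.day):
        years -= 1
    return years


def failure_rate_by_age(parts, as_of: date):
    """parts: iterable of (hardware_group, install_date, failed_in_field)."""
    installed = defaultdict(int)
    failed = defaultdict(int)
    for group, install_date, failed_in_field in parts:
        key = (group, years_in_service(install_date, as_of))
        installed[key] += 1
        failed[key] += int(failed_in_field)
    return {key: failed[key] / installed[key] for key in installed}


# Invented records, for illustration only.
sample_records = [
    ("PSU", date(2016, 9, 1), True),
    ("PSU", date(2016, 10, 1), False),
    ("PSU", date(2019, 1, 1), False),
    ("SSD", date(2017, 3, 1), False),
]
age_rates = failure_rate_by_age(sample_records, as_of=date(2019, 8, 21))
print(age_rates)
```

Grouping by whole years trades the quarter-by-quarter detail for a view that is easier to compare across hardware types.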

Figure: Hardware reliability over time

Looking at the data this way reveals some very interesting information. First, if you have a relatively new system (less than a year old), by far the most likely component to break is an NVIDIA GeForce card. As far as what is least likely to develop problems on a fairly new system, Samsung SSDs are very reliable with only a single drive having failed in the field, but Quadro GPUs take the cake with zero failures within this time period.

However, as the system gets older, power supplies, motherboards, and to a slightly lesser extent Quadro GPUs all decrease in reliability. By the three-year mark, you are more likely to encounter a problem with your motherboard or PSU than with anything else. On the positive side, Samsung SSDs fare the best over this three-year period, followed by ECC RAM and standard RAM.

Conclusion

While the failure rate for many types of components changes depending on the age of the system, if we simply sum up the failure rates for each hardware group, we get a great idea of the overall reliability for each type of component.
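One plausible reading of "summing up" is adding each group's shop and field failure rates into a single overall figure; that interpretation is an assumption, as are the function name and the placeholder numbers in this sketch.

```python
def overall_failure_rates(shop, field):
    """Add shop and field failure rates (in percent) per hardware group.

    Returns (group, total) pairs sorted from most to least reliable.
    """
    groups = set(shop) | set(field)
    totals = {g: shop.get(g, 0.0) + field.get(g, 0.0) for g in groups}
    return sorted(totals.items(), key=lambda item: item[1])


# Placeholder percentages, not real data.
shop_rates = {"group_a": 0.5, "group_b": 3.0}
field_rates = {"group_a": 1.5, "group_b": 1.0}

ranking = overall_failure_rates(shop_rates, field_rates)
print(ranking)
```

In this made-up example, group_b fails less often in the field than group_a, yet its high shop failure rate makes it the less reliable group overall.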

Figure: Puget Systems overall hardware failure rates

Looking at the data this way, there are a number of things that stand out. First, the fact that WD platter hard drives are as reliable as Intel CPUs over a three-year period was unexpected. Second, while Quadro GPUs are more reliable in the field than GeForce cards, a couple of bad batches of Quadro P600 cards with defective video ports mean that, as a whole, Quadro has been less reliable than GeForce for us.

Since the title of this post is "What is the most reliable hardware in our Puget Systems workstations?", however, let's go ahead and answer that question:

Whether you are looking at initial reliability or reliability over time, it is clear that the Samsung SSDs are easily the most reliable hardware we have used in our workstations over the past three years.

Figure: Most Reliable Hardware in Puget Systems Workstations: Samsung SSDs

Keep in mind that we do not restrict our customers to enterprise-class drives or anything like that – most of what we use are the SATA and NVMe-based consumer EVO and PRO product lines. And yet, they are ~50% more reliable than ECC RAM (for which reliability is the entire point), and 3x more reliable than Intel CPUs.
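For readers wondering how figures like "~50% more reliable" and "3x more reliable" can be derived, one common approach is to take the ratio of two failure rates. Whether the comparison above was computed exactly this way is an assumption, and the rates below are placeholders rather than measured values.

```python
def times_more_reliable(rate_a: float, rate_b: float) -> float:
    """How many times lower failure rate A is compared to failure rate B."""
    if rate_a <= 0:
        raise ValueError("rate_a must be positive to form a ratio")
    return rate_b / rate_a


# Placeholder failure rates in percent, not real data.
part_a = 0.5
part_b = 0.75
part_c = 1.5

ratio_ab = times_more_reliable(part_a, part_b)  # 1.5, i.e. 50% more reliable
ratio_ac = times_more_reliable(part_a, part_c)  # 3.0, i.e. 3x more reliable
print(ratio_ab, ratio_ac)
```

Note that a ratio like this is only meaningful when both rates cover the same time period and the same kind of failure.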

One thing we do want to make clear is that this DOES NOT mean you don't need to back up your data if you are using a Samsung SSD. Yes, the reliability is excellent, but there is always the chance that the drive can fail. In addition, a reliable drive doesn't protect you against malware, viruses, lightning strikes, or simply accidentally deleting something you didn't mean to. Your data is far more valuable than any piece of hardware in your computer, and you should always take active measures to protect it.

Tags: Hardware, reliability
