Well, it depends.
Are you looking for memory that has access to higher speeds and is compatible with more platforms? Or are you looking for endurance memory that can work 24/7, catch more errors, but sacrifices a bit of speed to do it?
RAM (Random Access Memory) modules are a crucial part of every system, but not all modules are the same. Aside from the capacity, frequency, and latency, the modules can either be Error-Correcting Code (ECC) modules, or non-ECC modules.
The difference between the two is that ECC memory will protect your system from a potential crash by correcting any errors in the data, while non-ECC memory doesn’t correct such errors.
Think of non-ECC memory as your speed-oriented memory, while ECC is your endurance / reliability memory.
Since not all platforms support ECC memory, and not every system needs it, let’s discuss what ECC memory is, how it works, and whether you need it.
What is ECC Memory?
To understand how the Error-Correcting Code (ECC) works, first you need to understand what a single-bit error is. Because that’s the major problem that ECC was made to handle.
A single-bit error is when a single bit (a binary 0 or 1) of the data stored in RAM is accidentally changed to the opposite value.
This kind of error is tiny, and a computer may not recognize it as incorrect automatically, which can lead to many problems.
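To make that concrete, here is a minimal Python sketch of what one flipped bit does to a stored number. The function name `flip_bit` and the example value (a price stored as a whole number of cents) are made up for illustration, and real memory obviously isn't corrupted on purpose like this.

```python
def flip_bit(value: int, bit_index: int) -> int:
    """Return `value` with the bit at `bit_index` inverted (0 -> 1 or 1 -> 0)."""
    return value ^ (1 << bit_index)


# Hypothetical example: a price stored as an integer number of cents ($10.00).
price_in_cents = 1000

# Simulate a single-bit error in bit 13 (worth 8192).
corrupted = flip_bit(price_in_cents, 13)

print(f"{price_in_cents:016b} -> {price_in_cents}")  # 0000001111101000 -> 1000
print(f"{corrupted:016b} -> {corrupted}")            # 0010001111101000 -> 9192
```

Only one bit out of sixteen differs, yet the stored value jumps from 1000 to 9192. Nothing about the corrupted number looks invalid on its own, which is why the computer can't spot the problem without some extra information.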
You can think of single-bit errors like metaphorical weeds in your lawn. Your lawn is your memory, and the ECC part of your memory is the choice of herbicide that’s available.
Non-ECC memory won’t take out any of the “weeds.”
ECC will annihilate all weeds, but a bit slower.
Single-bit errors can occur because of magnetic or electrical interference inside the computer, or because of the background radiation that every system is exposed to.
Voltage stress, temperature variation, impact shock, or even data being read or written in a different way than originally intended can also lead to a single-bit error.
ECC memory will take care of these errors and fix them before they turn into a bigger problem.
ECC memory is very similar to non-ECC memory. The biggest difference between the two is that ECC memory usually has a bit of extra memory dedicated solely to making sure that the actual memory doesn’t crash and burn if something bad happens.
ECC is basically just an extra memory chip on a normal stick of RAM that helps make sure that every bit of data that goes in and out is exactly what it’s supposed to be.
What it does is create a short check code (not encryption, just a few extra bits calculated from the data) whenever data is written into the main memory, and store that code in the extra bit of memory I told you about.
When the data stored in the main memory needs to be accessed, it then creates a new code and checks that piece of code against the code that was previously generated.
If it finds that they’re both the same, and that the data has not been tampered with in any way, it allows the data to be read.
But if it finds that the new code differs from the stored code, it tries to fix the issue by decoding the difference between the two to find exactly where the problem lies.
And if it can’t, it at the very least makes sure that you know that something has gone wrong instead of silently continuing to work.
It’s similar to comparing MD5 hashes when downloading a program to make sure that what you downloaded was what you actually wanted and not a different rogue undercover file.
This is why ECC is slightly slower, because it has to create and check those extra codes, but also a more reliable way to take care of your metaphorical lawn.
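The write-check-fix cycle described above can be sketched in a few lines of Python. This is a toy example, not how any particular memory module works: it protects just 4 data bits with a classic Hamming code plus one overall parity bit, while real ECC memory typically protects 64 data bits with 8 check bits and does the work in hardware. The function names `encode` and `decode` are made up for this sketch.

```python
from typing import List, Optional, Tuple


def encode(nibble: int) -> List[int]:
    """Turn 4 data bits into an 8-bit codeword (4 data bits + 4 check bits)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8                       # index 0: overall parity, 1..7: Hamming positions
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]  # check bit for positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]  # check bit for positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]  # check bit for positions 4, 5, 6, 7
    code[0] = sum(code[1:]) % 2            # makes the total number of 1s even
    return code


def decode(code: List[int]) -> Tuple[Optional[int], str]:
    """Re-check a codeword; fix one flipped bit, or report two flipped bits."""
    syndrome = 0
    for position in range(1, 8):
        if code[position]:
            syndrome ^= position         # ends up as the position of a single bad bit
    parity_is_odd = sum(code) % 2 == 1

    fixed = list(code)
    if syndrome == 0 and not parity_is_odd:
        status = "ok"
    elif parity_is_odd:
        fixed[syndrome] ^= 1             # single-bit error: flip it back
        status = "corrected"
    else:
        return None, "uncorrectable"     # two bits flipped: detect, but can't fix

    nibble = fixed[3] | (fixed[5] << 1) | (fixed[6] << 2) | (fixed[7] << 3)
    return nibble, status


word = encode(0b1011)
print(decode(word))                      # (11, 'ok')

damaged = list(word)
damaged[6] ^= 1                          # simulate one flipped bit
print(decode(damaged))                   # (11, 'corrected')

damaged[3] ^= 1                          # simulate a second flipped bit
print(decode(damaged))                   # (None, 'uncorrectable')
```

One flipped bit is quietly repaired and the original value 11 comes back. With two flipped bits the sketch can't tell which ones are wrong, so it reports the problem instead of handing back bad data.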
According to Intelligent Memory, you can expect roughly one single-bit error every 14 to 40 hours per gigabit (125 MB) of RAM.
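To get a feel for what that figure would mean for a whole system, here is a rough back-of-the-envelope calculation in Python. It takes the quoted rate at face value and assumes errors scale linearly with the amount of RAM, which is a simplification. The function name and the 16 GB example system are made up for illustration.

```python
def expected_errors_per_day(ram_gigabytes: float, hours_per_error_per_gigabit: float) -> float:
    """Estimate single-bit errors per day, assuming errors scale linearly with capacity."""
    gigabits = ram_gigabytes * 8                      # 1 gigabyte = 8 gigabits
    errors_per_hour = gigabits / hours_per_error_per_gigabit
    return errors_per_hour * 24


# Hypothetical system with 16 GB of RAM, using both ends of the quoted range.
best_case = expected_errors_per_day(16, 40)   # one error per 40 hours per gigabit
worst_case = expected_errors_per_day(16, 14)  # one error per 14 hours per gigabit

print(f"Best case:  about {best_case:.0f} single-bit errors per day")
print(f"Worst case: about {worst_case:.0f} single-bit errors per day")
```

Under those assumptions, a 16 GB machine would see somewhere between roughly 77 and 219 single-bit errors a day. Most of them would land in memory that isn't holding anything important at that moment, but the estimate shows why the odds add up on a machine that runs around the clock.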
The Need for Error Correction in Servers and Workstations
In a system used for everyday browsing and gaming, error correction isn’t a necessity. However, the stakes are higher in the world of servers and professional workstations.
If you’re running a business that specializes in finance, a single-bit error can lead to a server crash that could potentially wipe transactions from your server.
Such a memory error can also lead to a data transcription error, which could lead to misplacing a decimal or changing a number.
This is obviously detrimental to the integrity and trustworthiness of businesses. If you went to purchase your cat a new toy and ended up paying $100.00 instead of $10.00, that would be quite tragic.
If you visited your doctor and the bill came out to $56,987 instead of $59.45, you’re either an American, the victim of a memory error that ECC would have caught, or both. And you’d most likely never use their services again.
Not to mention the terrible day you’d have after getting a bill that size for a mere checkup.
When calculating an airflow simulation for an airplane’s wing, you don’t want memory errors messing up the results, because there are lives at stake.
Wasted time investment is another thing that ECC Memory can prevent. If you’re rendering high-res, complex images or training a deep learning model, having to start over after weeks of processing, because of some memory errors, is a huge waste of time and money for your business.
ECC memory is a necessity in the medical industry too. With patient care, the accuracy of records is critical, and a single-bit error can lead to a wrong diagnosis, which could be fatal. The choice in memory can make or break a business’s records and reputation.
Without ECC memory, not only is there a chance of errors like this happening, but you also won’t know they happened until someone reviews the data and finds a mistake. And sometimes, that can be too late.
Make sure you double-check the receipt on that cat toy.
Which Platforms Support ECC Memory?
In the server world, both Intel’s Xeon CPUs and AMD’s Epyc server lineups support ECC memory. Note that for you to use ECC memory, both the processor and the motherboard you’re using will need to support ECC memory.
Remember the lawn and herbicide analogy? Think of your processor and motherboard as the tools you use to work efficiently with ECC. You need the proper tools to spread herbicide where you want it in your lawn.
In the mainstream platforms used today, most of Intel’s processors (even some budget-oriented Celeron models) will support ECC memory, provided you use a motherboard that is compatible with such memory.
With AMD, all Ryzen processors support ECC memory with a compatible X570 chipset motherboard, whereas the B550 chipset doesn’t support ECC memory with Ryzen 2000 processors. Ryzen processors with integrated graphics, also known as accelerated processing units (APUs), such as the 3000 G-Series and 4000 G-Series, only support ECC in their PRO versions.
Here’s an overview table from ASUS showing Ryzen CPU ECC memory support:
The Downsides of Using ECC Memory
The most common issue with ECC memory is compatibility. Even though most modern platforms support it, you must make sure that the specific processor and motherboard combination works with ECC memory.
This is less of a problem in the server world, where ECC memory is commonly used and server hardware usually supports it by default.
Pricing is also different: ECC memory modules are more expensive than non-ECC modules because of the added functionality. Depending on the capacity you’re buying, the price difference is about 10-20%.
There is also a slight performance hit, because of the additional time that ECC memory takes to check for any errors. According to Corsair, you can expect a hit of about 2%.
Wrapping Things Up – Do You Need ECC Memory?
When you’re building a professional workstation or a server that needs to run 24/7, ECC memory is a must.
To go without ECC in this scenario would be like using a greyhound to pull your wagon when what you really need is a sturdy workhorse.
Both the price difference and performance hit are worth it when you consider that you won’t have to worry about the possibility of a single-bit error causing you headaches.
You wouldn’t want to double-check every receipt to make sure it was correct, right? Extend the same courtesy to your clients and customers.
But if all you’re doing is playing games and using your computer for other non-critical work, you most likely don’t need ECC and would leave performance on the table if you went for it.
Over to you
Have you used ECC memory before? What was your experience? Are you planning on using it in your next workstation? Let us know in the comments, or on our forum!