Understanding S.M.A.R.T.: How to Check SSD/HDD Health and Replacement Timing

S.M.A.R.T. is a technology for monitoring the health of storage devices, and it plays an important role in predicting HDD and SSD failures and preventing data loss in advance.

By regularly checking the health of storage devices, it is possible to prepare for backup or replacement before a failure occurs.

Some tools, introduced later in this article, can run in the background and monitor the storage constantly.

Nothing can be done about a failure that strikes without warning, but if warning signs are detected early, there is enough time to prepare backups or replacement storage.

This article explains in detail the basic mechanism of S.M.A.R.T., its advantages, and specific methods for data protection.

Key Points of This Article

S.M.A.R.T. is a technology for monitoring the health of storage devices
Monitors various values such as reallocated sector count and temperature
Aims to detect signs of failure and prevent data loss in advance
Allows preparation for backup and storage replacement
Minimizes downtime due to failure and enables smooth return to daily work

MEMO

Failures cannot always be predicted, and devices sometimes break without warning, so it is important to back up data regularly rather than only when a failure seems near.

This article also explains basic knowledge such as storage standards and mainstream storage configurations, as well as how to choose storage from the perspectives of performance and compatibility.

About S.M.A.R.T. for Storage (SSD/HDD)

Let’s take a closer look at self-diagnosis technology for storage, its role, benefits, and methods for data protection.

What is S.M.A.R.T.?
Understanding Storage Health
Preventing Data Loss in Advance

What is S.M.A.R.T.?

S.M.A.R.T. (Self-Monitoring, Analysis, and Reporting Technology) is a technology for monitoring the health of storage devices and is built into SSDs and HDDs.

Its main purpose is to detect abnormalities early so that data loss can be prevented.

Especially for storage devices used over a long period, backing up data or replacing the device before a failure occurs can protect important data.

S.M.A.R.T. monitors various values such as reallocated sector count, temperature, error rate, and power-on hours, and uses these values to assess the health of the storage and predict failures.

Details of the monitoring items will be explained later.

Understanding Storage Health

Storage-checking tools analyze S.M.A.R.T. data and clearly show the storage status, such as normal or abnormal.

Because it is hard to tell what each S.M.A.R.T. item and value means, these tools show the status of each item as well as an overall health rating based on all items.

This allows users to understand the health of their storage devices and prepare for backup or replacement.
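As a rough illustration of how such a tool could derive a per-item status and an overall rating, here is a minimal Python sketch. The `Attribute` class, the `caution_at` and `bad_at` limits, and the sample numbers are all hypothetical; real tools apply their own criteria.

```python
from dataclasses import dataclass
from enum import IntEnum


class Status(IntEnum):
    GOOD = 0
    CAUTION = 1
    BAD = 2


@dataclass
class Attribute:
    name: str
    raw_value: int
    caution_at: int  # hypothetical limit: raw value at which to warn
    bad_at: int      # hypothetical limit: raw value treated as failing


def attribute_status(attr: Attribute) -> Status:
    """Judge a single item by comparing its raw value with the limits."""
    if attr.raw_value >= attr.bad_at:
        return Status.BAD
    if attr.raw_value >= attr.caution_at:
        return Status.CAUTION
    return Status.GOOD


def overall_status(attrs: list[Attribute]) -> Status:
    """The overall rating is the worst status found among all items."""
    return max((attribute_status(a) for a in attrs), default=Status.GOOD)


# Made-up sample readings for one drive.
readings = [
    Attribute("Reallocated Sectors Count", raw_value=3, caution_at=1, bad_at=100),
    Attribute("Current Pending Sector Count", raw_value=0, caution_at=1, bad_at=100),
    Attribute("Uncorrectable Sector Count", raw_value=0, caution_at=1, bad_at=100),
]

for a in readings:
    print(f"{a.name}: {attribute_status(a).name}")
print("Overall:", overall_status(readings).name)
```

Taking the worst item as the overall rating is only one possible design; it errs on the side of warning early.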

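A drive that supports S.M.A.R.T. records these values and compares them with thresholds set by the manufacturer, but it does not notify the user on its own. Reading the values, interpreting them, and presenting the health status in an understandable form is done by the tools.

Therefore, S.M.A.R.T. support alone does not mean that a user will learn about a deteriorating drive; a tool is needed to read and report the data.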

Preventing Data Loss in Advance

By using S.M.A.R.T. and tools that analyze its values, users can understand the health of their storage devices.

When a failure is likely, it is possible to prepare by backing up data or getting replacement parts in advance.

Knowing roughly when a failure might occur gives time to prepare a backup or a replacement.

Without such warning signs, there may be no backup at all, and even a regular backup may be out of date by the time the failure happens.

Also, if replacement storage is arranged only after a failure, selecting and purchasing it can take several days, during which the computer cannot be used.

In this sense, preparing backups and replacement storage in advance, and replacing the drive at a convenient time before it fails, enables a smooth return to daily work.

Ken

It’s better for peace of mind to know in advance rather than having a sudden failure!

I regularly check my storage health using the software “Crystal Disk Info”, and it is quite accurate.

In fact, I was able to have enough time to prepare backups and replacement storage in advance, and my data was safe, so it has been very helpful.

However, since it is not 100% accurate, it is not recommended to rely on this alone, and regular backups are still necessary.

Ken

Having this tool doesn’t mean you only need to back up when it warns you, so be careful!

Items Monitored by S.M.A.R.T.

This section explains the important elements that can be monitored by S.M.A.R.T.

List of Monitoring Items
Reallocated Sectors Count
Uncorrectable Sector Count
Current Pending Sector Count
Temperature
Spin Retry Count (HDD only)

First, the list of items that can be monitored is summarized, and then the items that are closely related to storage health and failure are explained in detail.

List of Monitoring Items

The items that can be monitored by S.M.A.R.T. may vary depending on the type and manufacturer of the storage, but generally, the following items are monitored.

These items are used as indicators of storage health and performance.

No.	Item Name	Overview
1	Read Error Rate	Frequency of errors during read operations
2	Reallocated Sectors Count	Number of sectors replaced when bad sectors are detected
3	Seek Error Rate	Frequency of errors during disk seek operations
4	Power-On Hours	Total accumulated power-on time of the device
5	Spin-Up Time	Time taken for the disk to reach operating speed (HDD only)
6	Start/Stop Count	Number of times the device has started and stopped
7	Temperature	Internal temperature of the storage
8	Current Pending Sector Count	Number of bad sectors waiting to be reallocated
9	Uncorrectable Sector Count	Number of sectors that could not be corrected during read or write
10	Power Cycle Count	Number of times the device has been powered on
11	G-Sense Error Rate	Frequency of errors due to shock or vibration
12	Reallocated Event Count	Number of times sector reallocation has occurred
13	Spin Retry Count	Number of times spin-up failed and was retried (HDD only)
14	Load/Unload Cycle Count	Number of times the head was loaded and unloaded (HDD only)
15	End-to-End Error	Number of errors during data transfer from cache to host
16	Wear Leveling Count	Number of times SSD wear leveling has occurred
17	Total LBAs Written	Total number of logical block addresses written to the device
18	Total LBAs Read	Total number of logical block addresses read from the device
19	Program Fail Count	Number of times data writing failed on SSD
20	Erase Fail Count	Number of times data erasure failed on SSD
21	Remaining Life	Estimated remaining life percentage of the SSD
22	Host Reads	Number of read commands from the host
23	Host Writes	Number of write commands from the host
24	Temperature Throttle	Number of times performance was limited due to high temperature
25	SATA Downshift Error Count	Number of times SATA transfer speed was reduced

Let’s take a closer look at the items that are closely related to storage health and failure.

Reallocated Sectors Count

Reallocated Sectors Count indicates the number of sectors replaced with spare sectors when bad sectors are detected on the disk.

Storage devices have a function to automatically replace physically damaged sectors with spare sectors to maintain data reliability.

A sector is the smallest unit of data storage on an SSD or HDD.

Storage devices divide data into small blocks called sectors to manage and store data efficiently.

Each sector has a unique address, and data is read and written in these units.

On HDDs, data is stored on circular disks (platters), which are divided into concentric circles called tracks.

Tracks are further divided into sectors, and data is stored in each sector. Usually, each sector stores 512 bytes or 4096 bytes (4KB) of data.

On SSDs, data is stored using flash memory, so there are no physical disks or tracks, but data is still managed in units called sectors.
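The relationship between sector addresses, sector sizes, and byte counts can be worked through with a little arithmetic. The following is a minimal Python sketch; the function names are made up for illustration, and the assumption that one logical block equals one 512-byte sector does not hold for every drive (the unit behind values such as Total LBAs Written varies by manufacturer).

```python
SECTOR_SIZES = (512, 4096)  # bytes per sector, as mentioned above


def byte_offset(sector_address: int, sector_size: int) -> int:
    """Byte position at which the sector with this address begins."""
    return sector_address * sector_size


def sectors_needed(data_bytes: int, sector_size: int) -> int:
    """Whole sectors required to hold data_bytes (rounded up)."""
    return -(-data_bytes // sector_size)


def lbas_to_gib(lba_count: int, sector_size: int = 512) -> float:
    """Convert a count of logical blocks into GiB.

    Assumes each logical block is one sector of sector_size bytes.
    """
    return lba_count * sector_size / 2**30


for size in SECTOR_SIZES:
    print(f"5000 bytes need {sectors_needed(5000, size)} sector(s) of {size} bytes")
```

Because data is read and written in whole sectors, even a small file occupies at least one full sector.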

If the number of reallocated sectors increases, it is highly likely that physical damage to the storage is progressing and the end of the storage’s life is approaching.

If the number increases rapidly, immediate action such as backup or replacement is necessary.
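One way to tell a slow increase from a rapid one is to keep the count from each check and compare recent readings. The sketch below is a minimal Python example under assumed conditions: readings are taken at regular intervals, and the `rapid_increase` limit of 10 sectors between two checks is an arbitrary illustrative value, not a published criterion.

```python
def reallocation_trend(history: list[int], rapid_increase: int = 10) -> str:
    """Classify the trend of Reallocated Sectors Count readings.

    history holds the counts from oldest to newest, recorded at regular
    check intervals. rapid_increase is a hypothetical limit for how much
    growth between two consecutive checks counts as "rapid".
    """
    if len(history) < 2:
        return "not enough data"
    latest_delta = history[-1] - history[-2]
    if latest_delta >= rapid_increase:
        return "rapid increase"
    if history[-1] > history[0]:
        return "increasing"
    return "stable"


print(reallocation_trend([0, 0, 0]))
print(reallocation_trend([0, 1, 2]))
print(reallocation_trend([2, 3, 40]))
```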

Uncorrectable Sector Count

Uncorrectable Sector Count indicates the number of sectors whose errors could not be corrected during reading or writing.

If this number increases, the risk of problems with reading and writing data increases, which can lead to data loss or system instability.

Normally, storage devices replace damaged sectors with spare sectors, but if for some reason they cannot be replaced, they are counted as uncorrectable sectors.

If this number continues to increase, the device may be nearing the end of its life, just like with reallocated sectors.

Possible reasons why a sector cannot be replaced include:

Lack of spare sectors
Physical damage
Complete data loss
Firmware or internal errors
Storage area limit
Large-scale errors or data corruption

When storage is nearing the end of its life, many bad sectors may occur, and all spare sectors may be used up.

If there are not enough spare sectors, new bad sectors cannot be replaced and remain as uncorrectable sectors.
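The effect of running out of spare sectors can be shown with a toy model. This Python sketch only mirrors the description above; the `SpareSectorPool` class is hypothetical and does not represent how any drive firmware is actually implemented.

```python
class SpareSectorPool:
    """Toy model: bad sectors are replaced until the spares run out."""

    def __init__(self, spare_sectors: int) -> None:
        self.spare_remaining = spare_sectors
        self.reallocated = 0
        self.uncorrectable = 0

    def handle_bad_sector(self) -> str:
        """Replace a bad sector if a spare is left, otherwise count it."""
        if self.spare_remaining > 0:
            self.spare_remaining -= 1
            self.reallocated += 1
            return "reallocated"
        self.uncorrectable += 1
        return "uncorrectable"


pool = SpareSectorPool(spare_sectors=2)
outcomes = [pool.handle_bad_sector() for _ in range(3)]
print(outcomes)
print("reallocated:", pool.reallocated, "uncorrectable:", pool.uncorrectable)
```

With two spares and three bad sectors, the third one can no longer be replaced, which is the situation described for a drive near the end of its life.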

Current Pending Sector Count

Current Pending Sectors are unstable sectors where problems occurred when reading or writing data and that are waiting to be replaced.

Because reading and writing these sectors is unreliable, the storage prepares to replace them with spare sectors.

If the number of pending sectors increases, the storage may be deteriorating.

If sectors cannot be read or written normally, the risk of data loss or system errors increases.
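The life cycle of a pending sector can be pictured as a small state machine. The Python sketch below is a simplified model with hypothetical names; in particular, the rule that a pending sector returns to normal when a later access succeeds is an assumption about typical drive behavior, and actual firmware differs between manufacturers.

```python
from enum import Enum


class SectorState(Enum):
    HEALTHY = "healthy"
    PENDING = "pending"          # waiting to be reallocated
    REALLOCATED = "reallocated"  # replaced by a spare sector


def next_state(state: SectorState, access_succeeded: bool) -> SectorState:
    """Advance one sector after a read or write attempt (simplified)."""
    if state is SectorState.HEALTHY:
        return SectorState.HEALTHY if access_succeeded else SectorState.PENDING
    if state is SectorState.PENDING:
        # Assumption: success clears the pending mark, failure triggers replacement.
        return SectorState.HEALTHY if access_succeeded else SectorState.REALLOCATED
    return SectorState.REALLOCATED


def pending_count(states: list[SectorState]) -> int:
    """Value a tool would show as Current Pending Sector Count in this model."""
    return sum(1 for s in states if s is SectorState.PENDING)


sectors = [SectorState.HEALTHY, SectorState.HEALTHY, SectorState.HEALTHY]
sectors[1] = next_state(sectors[1], access_succeeded=False)
print("pending sectors:", pending_count(sectors))
```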

Temperature

If the temperature of the storage is too high, internal parts may be damaged and the risk of failure increases.

If high temperatures continue for a long time, the life of the storage may be significantly shortened.

As a rough guideline, storage devices are best operated in the range of 35°C to 45°C, but depending on the environment and usage, this range may be exceeded.

The temperature tends to rise especially with high-speed storage such as the latest PCIe NVMe SSDs, and when large amounts of data are transferred over a long time.

For this reason, some motherboards come with a heatsink on the M.2 slot for cooling; if yours does not, it is recommended to prepare and attach a heatsink yourself.
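Since sustained heat matters more than a brief spike, a simple check is to look for runs of consecutive readings above the upper end of the range. The following Python sketch assumes temperature samples taken at regular intervals; the 45°C limit comes from the range above, while `min_consecutive=3` is an arbitrary illustrative choice.

```python
def longest_hot_streak(samples_c: list[float], limit_c: float = 45.0) -> int:
    """Length of the longest run of consecutive samples above limit_c."""
    longest = current = 0
    for temperature in samples_c:
        if temperature > limit_c:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


def is_sustained_heat(
    samples_c: list[float], limit_c: float = 45.0, min_consecutive: int = 3
) -> bool:
    """True if the drive stayed above limit_c for min_consecutive samples in a row."""
    return longest_hot_streak(samples_c, limit_c) >= min_consecutive


samples = [40, 46, 47, 44, 48, 49, 50]
print("longest hot streak:", longest_hot_streak(samples))
print("sustained heat:", is_sustained_heat(samples))
```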

Spin Retry Count (HDD only)

Spin Retry Count indicates the number of times an HDD failed to spin up the disk and had to retry.

Spin-up refers to the process where the disk starts spinning from a stopped state and reaches the speed required for operation.

HDDs have magnetic disks (platters) inside that spin at high speed, enabling data reading and writing. Spin-up is the action of starting this rotation.

Spin retries suggest problems with the HDD’s motor or mechanical parts, so an increasing count indicates a high risk of drive failure.

Tools for Analyzing Storage with S.M.A.R.T.

This section explains tools that can check, analyze, and report S.M.A.R.T. information.

Crystal Disk Info
Speccy
HD Tune

Crystal Disk Info

Crystal Disk Info is a convenient tool that uses S.M.A.R.T. to make checking the health of storage devices easy.

It displays S.M.A.R.T. items and values for each storage device, and based on those values, it judges the health status as one of four: “Good”, “Caution”, “Bad”, or “Unknown”.

As explained earlier, S.M.A.R.T. monitors many items, including those closely related to failure, but honestly, the raw values alone are hard to interpret.

This software sorts the status into four categories, which makes it intuitive and easy to understand, so it is recommended.

Also, the criteria for judging storage health are based on academic papers, so it is considered reliable.

I also check this tool regularly, and when the status changed to “Bad”, the storage failed within one week to one month, so I trust it quite a bit.

Personally, I make decisions based on the health status as follows:

Caution: Prepare backup or replacement storage and replace when there is time
Bad: Stop all PC work and prioritize backup and replacement

About five of my storage devices have failed so far, and each one really did fail within one week to one month of the status changing to “Bad”.

Therefore, to minimize further deterioration, I stop PC work and perform the backup or replacement.

Treat this status as a high priority.
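The personal rules above amount to a simple lookup from health status to action. Here is a minimal Python sketch; the actions for “Caution” and “Bad” follow the rules above, while the entries for “Good” and “Unknown” are assumptions added only to complete the example.

```python
# "Caution" and "Bad" follow the author's rules; "Good" and "Unknown"
# are assumptions added for illustration.
RECOMMENDED_ACTION = {
    "Good": "Keep making regular backups",
    "Caution": "Prepare backup or replacement storage and replace when there is time",
    "Bad": "Stop all PC work and prioritize backup and replacement",
    "Unknown": "Status could not be determined; keep making regular backups",
}


def recommended_action(status: str) -> str:
    """Return the action for a health status, rejecting unrecognized names."""
    try:
        return RECOMMENDED_ACTION[status]
    except KeyError:
        raise ValueError(f"unrecognized health status: {status!r}") from None


print(recommended_action("Caution"))
```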

However, it is not absolute: some failures can be detected as the number of reallocated or uncorrectable sectors gradually increases, but there have also been cases where devices broke suddenly.

This is not a problem with the tool, but rather the nature of storage devices themselves.

It also has a startup function that launches the tool when the PC starts, a resident function that monitors in the background, and a notification function that alerts you to status changes by email or sound, all of which are convenient.
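The idea behind a status-change notification can be sketched as a comparison between two snapshots of health statuses. This Python example is a generic illustration, not how Crystal Disk Info is implemented; the `status_changes` function and the drive names are hypothetical, and a real monitor would send an email or play a sound instead of printing.

```python
def status_changes(previous: dict[str, str], current: dict[str, str]) -> list[str]:
    """Describe every drive whose health status differs between two checks."""
    messages = []
    for drive, status in current.items():
        old = previous.get(drive)
        # Drives seen for the first time have nothing to compare against.
        if old is not None and old != status:
            messages.append(f"{drive}: {old} -> {status}")
    return messages


last_check = {"Disk 1": "Good", "Disk 2": "Good"}
this_check = {"Disk 1": "Good", "Disk 2": "Caution", "Disk 3": "Good"}

for message in status_changes(last_check, this_check):
    print("Status changed:", message)
```

A background monitor would repeat this comparison at a fixed interval, keeping the latest snapshot as the baseline for the next check.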

Speccy

Speccy is a system information tool provided by Piriform (the developer of CCleaner).

This tool provides detailed information about computer hardware and software, allowing users to check the system status and configuration at a glance.

For storage information, it displays the model number, manufacturer, and other basic information for each storage device, as well as S.M.A.R.T. information.

Speccy displays the status for each item individually, so it may not be suitable for those who want an overall judgment.

However, if someone is familiar with these items and wants to focus on a specific value, it is a good choice.

HD Tune

HD Tune is a tool mainly used to measure the performance and check the status of storage devices such as SSDs and HDDs.

The “Health” tab of the tool displays S.M.A.R.T. information in detail.

“Status” shows the status of each item, and “Health Status” shows the overall judgment.

Summary: Know When to Replace Storage with S.M.A.R.T.!

This article explained S.M.A.R.T., the technology for monitoring storage health, in detail, covering its role and benefits, the items it monitors, and the tools that use it.

Here is a summary of the key points.

Key Points of This Article

S.M.A.R.T. is a technology for monitoring the health of storage devices
Monitors various values such as reallocated sector count and temperature
Aims to detect signs of failure and prevent data loss in advance
Allows preparation for backup and storage replacement
Minimizes downtime due to failure and enables smooth return to daily work

S.M.A.R.T. is a technology for predicting storage failures and preventing data loss in advance.

By monitoring values such as reallocated sector count, uncorrectable sector count, and temperature, and analyzing these values, it is possible to determine the health of the storage.

This allows users to understand the status of their storage and prepare backups or replacement storage as needed.

Replacing storage at a convenient time before it fails avoids any period when the computer cannot be used, so daily work can continue smoothly.
