What is the S.M.A.R.T. value in Nexthink?
S.M.A.R.T. - Self-Monitoring, Analysis and Reporting Technology
S.M.A.R.T. allows Nexthink to monitor the health of the hard drive.
Description of Nexthink and S.M.A.R.T.
Following the IT best practices used by software and hardware vendors, Nexthink applies an algorithm to the S.M.A.R.T. values reported by the Collector and stores only one value: the S.M.A.R.T. index, which is the result of combining those values.
A low S.M.A.R.T. index is not necessarily alarming. It does not tell you that the hard drive will fail, or that the endpoint will have issues, right away, in 56 minutes or in 2 weeks. What it does tell you is that the values reported by the Collector have lowered the degree of confidence Nexthink has in the hard drive. When the index is very low, it is time to ask an IT technician to go on-site and run the manufacturer's proprietary software to obtain the official diagnostic.
The algorithm takes the following items into account to estimate the index:
Power Cycle Count
Reallocation Event Count
Current Pending Sector Count
Reallocated Sectors Count
Uncorrectable Sector Count
Load Cycle Count
Write Error Rate
For SSDs, the calculation is based on "Power Cycle Count" only, as the other metrics do not apply to SSDs.
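To illustrate the general idea of folding several attributes into a single index, here is a minimal Python sketch. It is not the Nexthink algorithm, which is not published: the weights, the limits, the normalization, and the function name smart_index are all assumptions made up for this example. Only the attribute names and the SSD rule come from the list above.

```python
# Illustrative only: NOT the Nexthink algorithm. Weights and limits are
# hypothetical values chosen for the example.
HYPOTHETICAL_WEIGHTS = {
    "power_cycle_count": 0.10,
    "reallocation_event_count": 0.15,
    "current_pending_sector_count": 0.20,
    "reallocated_sectors_count": 0.20,
    "uncorrectable_sector_count": 0.20,
    "load_cycle_count": 0.05,
    "write_error_rate": 0.10,
}

# Hypothetical raw value at which an attribute counts as fully "worn".
HYPOTHETICAL_LIMITS = {
    "power_cycle_count": 10_000,
    "reallocation_event_count": 100,
    "current_pending_sector_count": 50,
    "reallocated_sectors_count": 100,
    "uncorrectable_sector_count": 50,
    "load_cycle_count": 600_000,
    "write_error_rate": 1_000,
}


def smart_index(raw: dict, is_ssd: bool = False) -> float:
    """Combine raw attribute values into one 0-100 confidence index.

    Each attribute is scaled to a 0..1 wear score (capped at 1), the scores
    are averaged using the weights, and the result is inverted so that a
    higher index means more confidence in the drive.
    """
    # For SSDs, only "Power Cycle Count" is considered.
    names = ["power_cycle_count"] if is_ssd else list(HYPOTHETICAL_WEIGHTS)
    total_weight = sum(HYPOTHETICAL_WEIGHTS[n] for n in names)
    wear = sum(
        HYPOTHETICAL_WEIGHTS[n] * min(raw.get(n, 0) / HYPOTHETICAL_LIMITS[n], 1.0)
        for n in names
    )
    return round(100.0 * (1.0 - wear / total_weight), 1)


if __name__ == "__main__":
    healthy_hdd = {name: 0 for name in HYPOTHETICAL_WEIGHTS}
    print(smart_index(healthy_hdd))                              # no wear reported
    print(smart_index({"power_cycle_count": 5_000}, is_ssd=True))  # SSD, one attribute
```

The point of the sketch is only that many raw readings collapse into one number, so the index alone cannot say which attribute caused a drop.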
Hard disk manufacturers, who design and produce the hardware and the associated software, have their own protocols and their own ways of interrogating their components. This information is not necessarily made public: part of it may be, but not all of it. It is therefore very difficult to provide detailed information about the actual state of the disk.
To reduce the space used by the Engine database and to prevent the Engine process from being overloaded with S.M.A.R.T. values, the Engine does not keep the history of the S.M.A.R.T. values.
S.M.A.R.T. index monitoring can be quite tricky, because Nexthink can only read the values that the system reports and cannot go as deep as the proprietary software to retrieve detailed information about the hardware status. Hard drive manufacturers such as Seagate, Western Digital and others have the right tools to troubleshoot and give the final diagnostic regarding the health of the hardware components of a hard drive.
The lower the index, the lower the confidence Nexthink has in the short-term health of the hard drive.
The higher the index, the higher the confidence Nexthink has in the long-term health of the hard drive.
Example of how to use the S.M.A.R.T. values in a real life situation
The idea here is to focus less on the value itself and more on how it evolves. If one or several devices have a S.M.A.R.T. index of 90% one week and 80% the next, it is not really alarming, but you should keep an eye on them. If the index of another device goes from 90% one week to 55% the following week, you should closely monitor the evolution of the S.M.A.R.T. index for that device.
The best tool for creating a historical trend is the Portal, as it is the designated tool for long term analysis and metrics evolution monitoring.
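For illustration only, the week-over-week comparison described above can be sketched in a few lines of Python, for example on index values exported from the Portal. The 30-point threshold, the function names, and the device names are assumptions made for this example; they are not Nexthink settings or part of any Nexthink API.

```python
from typing import Dict, Tuple

# Hypothetical threshold: a week-over-week drop of this many percentage
# points or more deserves close monitoring. Not a Nexthink setting.
DROP_THRESHOLD = 30.0


def classify_trend(previous: float, current: float,
                   threshold: float = DROP_THRESHOLD) -> str:
    """Classify a device by how much its index dropped since last week."""
    drop = previous - current
    if drop >= threshold:
        return "monitor closely"
    if drop > 0:
        return "keep an eye on"
    return "stable"


def flag_devices(weekly_index: Dict[str, Tuple[float, float]]) -> Dict[str, str]:
    """Map each device name to a trend label from (last week, this week) values."""
    return {
        name: classify_trend(previous, current)
        for name, (previous, current) in weekly_index.items()
    }


if __name__ == "__main__":
    # Made-up sample values mirroring the 90% -> 80% and 90% -> 55% cases.
    samples = {
        "laptop-a": (90.0, 80.0),
        "laptop-b": (90.0, 55.0),
        "laptop-c": (85.0, 85.0),
    }
    for name, status in flag_devices(samples).items():
        print(f"{name}: {status}")
```

Comparing the drop rather than the absolute value matches the advice above: a device at 80% that was at 90% last week is treated differently from one that fell from 90% to 55%.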
A S.M.A.R.T. index of X% does not reflect any single specific value but the overall health of a given hard disk. For example, if your device is powered on and working non-stop for 2 weeks, the S.M.A.R.T. index will decrease because there is a potential risk that the performance of the hard disk will degrade, but it does not mean that the hard disk is in danger and must be replaced.
For example, suppose alerts are raised on a laptop that has been running for months without a reboot. The operating system is working fine. In that case, a S.M.A.R.T. parameter is reporting that the disk has been powered on for a long time.
Based on that, the algorithm used by Nexthink emphasizes this information and lowers the S.M.A.R.T. index of the disk. The S.M.A.R.T. index is useful for knowing that the hard disk of a given device could potentially be at risk. Nexthink provides this first indication so that you can decide whether or not to investigate further. You can then use a third-party application of your choice to troubleshoot the possible issue.
If the third-party application reports no error even though a computer has been running for 6 months, there is probably no error, but Nexthink will still lower the index because the algorithm considers that this scenario could lead to future hard disk issues.