CN103646114B - Method and device for extracting feature data from hard disk SMART data - Google Patents
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
Abstract
The present invention proposes a method and device for extracting feature data from hard disk SMART (Self-Monitoring, Analysis and Reporting Technology) data. The method comprises the following steps: obtaining a SMART data set of sample hard disks, wherein the SMART data set includes Q pieces of SMART data and Q pieces of hard disk type information respectively corresponding to the Q pieces of SMART data; normalizing the Q pieces of SMART data to generate Q pieces of normalized SMART data; correcting the Q pieces of normalized SMART data according to the Q pieces of hard disk type information, respectively, to generate a corrected SMART data set; and generating hard disk feature data according to the corrected SMART data set. With the method of the embodiments of the present invention, failure early-warning testing and analysis of different hard disks can be performed with a single failure early-warning model, which improves the accuracy of the model and reduces the cost of model training, testing and analysis.
Description
Technical field
The present invention relates to the technical field of storage, and more particularly to a method and device for extracting feature data from hard disk SMART (Self-Monitoring, Analysis and Reporting Technology) data.
Background
Because hard disk failures are reflected in a disk's SMART (Self-Monitoring, Analysis and Reporting Technology) data, failure early-warning analysis can use that data to assess whether a hard disk is likely to fail within a coming period. At present, a machine learning algorithm can be used to train a failure early-warning model on a certain attribute in the SMART data; the model then analyzes a disk's SMART data to predict whether the disk can operate stably within a coming period.
However, the feature values of different attributes in SMART data are represented inconsistently and are too discrete, which makes it difficult to predict the joint effect of several different attributes on a hard disk. Moreover, some attributes have missing feature values when the model is trained, which makes the SMART data harder to analyze and the model's predictions inaccurate. In addition, different vendors calculate the feature values of their hard disk data in different ways, which hinders a unified numerical feature representation. A separate failure early-warning model therefore has to be trained on the SMART data of each vendor's hard disks, and the repeated model training significantly increases the analysis cost.
Summary of the invention
The present invention aims to solve the above technical problems at least to some extent.
To this end, a first object of the present invention is to propose a method for extracting feature data from hard disk SMART data. The method does not require multiple failure early-warning models: failure early-warning testing and analysis of different hard disks can be performed with a single model, which improves the accuracy of the failure early-warning model and reduces the cost of model training, testing and analysis.
A second object of the present invention is to propose a device for extracting feature data from hard disk SMART data.
To achieve the above objects, an embodiment of a first aspect of the present invention proposes a method for extracting feature data from hard disk SMART data, comprising the following steps: obtaining a SMART data set of sample hard disks, wherein the SMART data set includes Q pieces of SMART data and Q pieces of hard disk type information respectively corresponding to the Q pieces of SMART data; normalizing the Q pieces of SMART data to generate Q pieces of normalized SMART data; correcting the Q pieces of normalized SMART data according to the Q pieces of hard disk type information, respectively, to generate a corrected SMART data set; and generating hard disk feature data according to the corrected SMART data set.
In the method of the embodiment of the present invention, the SMART data of the sample hard disks are normalized, and the normalized SMART data are corrected according to the hard disk type information. The hard disk SMART data thus share the same value range, and the correction partitions the normalized SMART data by hard disk type. As a result, failure early-warning testing and analysis of different hard disks can be performed with a single failure early-warning model, which improves the accuracy of the model and reduces the cost of model training, testing and analysis.
To achieve the above objects, an embodiment of a second aspect of the present invention proposes a device for extracting feature data from hard disk SMART data, including: a first acquisition module, configured to obtain a SMART data set of sample hard disks, wherein the SMART data set includes Q pieces of SMART data and Q pieces of hard disk type information respectively corresponding to the Q pieces of SMART data; a first generation module, configured to normalize the Q pieces of SMART data to generate Q pieces of normalized SMART data; a correction module, configured to correct the Q pieces of normalized SMART data according to the Q pieces of hard disk type information, respectively, to generate a corrected SMART data set; and a second generation module, configured to generate hard disk feature data according to the corrected SMART data set.
In the device of the embodiment of the present invention, the SMART data of the sample hard disks are normalized, and the normalized SMART data are corrected according to the hard disk type information. The hard disk SMART data thus share the same value range, and the correction partitions the normalized SMART data by hard disk type. As a result, failure early-warning testing and analysis of different hard disks can be performed with a single failure early-warning model, which improves the accuracy of the model and reduces the cost of model training, testing and analysis.
Additional aspects and advantages of the present invention will be set forth in part in the following description, and in part will become apparent from the following description or be learned through practice of the present invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the present invention will become apparent and easy to understand from the following description of the embodiments with reference to the accompanying drawings, in which:
Fig. 1 is a flow chart of a method for extracting feature data from hard disk SMART data according to one embodiment of the present invention;
Fig. 2 is a flow chart of a method for extracting feature data from hard disk SMART data according to another embodiment of the present invention;
Fig. 3 is a schematic diagram of the normalization result for gradient data in hard disk SMART data according to a specific embodiment of the present invention;
Fig. 4 is a schematic diagram of the normalization result for attribute data in hard disk SMART data according to a specific embodiment of the present invention;
Fig. 5 is a structural diagram of a device for extracting feature data from hard disk SMART data according to one embodiment of the present invention;
Fig. 6 is a structural diagram of a device for extracting feature data from hard disk SMART data according to another embodiment of the present invention; and
Fig. 7 is a structural diagram of a device for extracting feature data from hard disk SMART data according to yet another embodiment of the present invention.
Detailed description
Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary, are intended only to explain the present invention, and should not be construed as limiting it.
In the description of the present invention, it should be understood that orientation or positional terms such as "center", "longitudinal", "transverse", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner" and "outer" are based on the orientations or positional relationships shown in the drawings. They are used only to facilitate and simplify the description of the present invention, and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation; they therefore should not be construed as limiting the present invention. In addition, the terms "first" and "second" are used for descriptive purposes only and should not be understood as indicating or implying relative importance.
In the description of the present invention, it should be noted that, unless otherwise expressly specified and limited, the terms "installed", "connected" and "coupled" should be understood broadly: a connection may, for example, be fixed, detachable or integral; it may be mechanical or electrical; and it may be direct, indirect through an intermediary, or internal between two elements. A person of ordinary skill in the art can understand the specific meanings of these terms in the present invention according to the specific circumstances.
At present, a machine learning algorithm can be used to train a failure early-warning model on a certain attribute in the SMART data, so as to predict whether a hard disk can operate stably within a coming period. However, existing failure early-warning models cannot reflect the joint effect of several SMART attributes on a hard disk, nor can the SMART data of hard disks of several models be fed into one failure early-warning model for training. Hard disk failure warnings are therefore inaccurate, and the early-warning process has to analyze a different failure early-warning model for each hard disk model, so the analysis cost is high. If the attribute values in the SMART data of different hard disk models can be processed so that each attribute value is represented uniformly, the data of hard disks of several different models can be fed into one failure early-warning model for training, which reduces the number of training runs and the analysis cost. To this end, the present invention proposes a method for extracting feature data from hard disk SMART data.
Fig. 1 is a flow chart of a method for extracting feature data from hard disk SMART data according to one embodiment of the present invention.
As shown in Fig. 1, the method for extracting feature data from hard disk SMART data comprises the following steps.
S11, obtain a SMART data set of sample hard disks.
In one embodiment of the present invention, the SMART data set includes Q pieces of SMART data and Q pieces of hard disk type information respectively corresponding to them. The SMART data set is the collection of attribute data recorded in SMART for Q hard disks of the same type and/or different types, such as seek error rate and hard disk temperature, together with the corresponding hard disk type information. The hard disk type information is information about the hard disk provided by the hard disk vendor, such as the hard disk model and the hard disk ID (identity). For example, when a machine learning algorithm is being trained, the SMART data set includes attribute data such as the seek error rate, power-on count and temperature of multiple different hard disks, together with information such as the model and ID of each hard disk.
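As a rough illustration of the data described in S11, the sketch below models a SMART data set in Python. The class and field names (`SmartRecord`, `model`, `disk_id`, `attributes`) and the sample values are hypothetical, chosen only for illustration; the text does not prescribe a storage layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SmartRecord:
    """One piece of SMART data plus its hard disk type information (hypothetical layout)."""
    model: str    # hard disk model, as provided by the vendor
    disk_id: str  # hard disk ID
    # attribute name -> time-ordered readings of that attribute
    attributes: Dict[str, List[float]] = field(default_factory=dict)


# A SMART data set with Q = 2 sample disks of different models (made-up values).
smart_data_set = [
    SmartRecord("model-A", "disk-001",
                {"seek_error_rate": [3.0, 4.0, 6.0], "temperature": [35.0, 36.0, 36.0]}),
    SmartRecord("model-B", "disk-002",
                {"seek_error_rate": [120.0, 118.0, 125.0], "temperature": [41.0, 40.0, 42.0]}),
]
Q = len(smart_data_set)
```

Keeping the type information next to the readings makes it straightforward to look up a per-model correction later in the pipeline.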
S12, normalize the Q pieces of SMART data to generate Q pieces of normalized SMART data.
In one embodiment of the present invention, each attribute data item in the SMART data of the Q hard disks of the same type and/or different types can be normalized separately, so that attribute data with different value ranges in the SMART data are normalized into the same value range. This enables unified analysis and processing of the SMART data of different types of hard disks.
S13, correct the Q pieces of normalized SMART data according to the Q pieces of hard disk type information, respectively, to generate a corrected SMART data set.
In order to obtain the test results of different hard disks separately from the test results of the hard disk failure early-warning model, in one embodiment of the present invention a corresponding data offset can be set for each hard disk according to its type information, and the Q pieces of normalized SMART data are corrected according to the data offset of each hard disk, thereby partitioning the SMART data set.
S14, generate hard disk feature data according to the corrected SMART data set.
In the method for extracting feature data from hard disk SMART data of the embodiment of the present invention, the SMART data of the sample hard disks are normalized, and the normalized SMART data are corrected according to the hard disk type information. The hard disk SMART data thus share the same value range, and the correction partitions the normalized SMART data by hard disk type. As a result, failure early-warning testing and analysis of different hard disks can be performed with a single failure early-warning model, which improves the accuracy of the model and reduces the cost of model training, testing and analysis.
To further improve the performance of the trained failure early-warning model, the gradient values corresponding to each attribute data item in each piece of SMART data can also be obtained by the least squares method before the Q pieces of SMART data are normalized. Specifically, Fig. 2 is a flow chart of a method for extracting feature data from hard disk SMART data according to another embodiment of the present invention.
As shown in Fig. 2, the method for extracting feature data from hard disk SMART data comprises the following steps.
S21, obtain a SMART data set of sample hard disks.
In one embodiment of the present invention, the SMART data set includes Q pieces of SMART data and Q pieces of hard disk type information respectively corresponding to them. The SMART data set is the collection of attribute data recorded in SMART for Q hard disks of the same type and/or different types, such as seek error rate and hard disk temperature, together with the corresponding hard disk type information. The hard disk type information is information about the hard disk provided by the hard disk vendor, such as the hard disk model and the hard disk ID (identity). For example, when a machine learning algorithm is being trained, the SMART data set includes attribute data such as the seek error rate, power-on count and temperature of multiple different hard disks, together with information such as the model and ID of each hard disk.
S22, obtain, for each piece of SMART data, S gradient data sets respectively corresponding to its S first attribute data subsets.
In an embodiment of the present invention, the SMART data set also includes S attribute data sets respectively corresponding to S attributes, and each piece of SMART data includes S first attribute data subsets respectively corresponding to the S attributes. A first attribute data subset is the set of data corresponding to a certain attribute in SMART. For example, a first attribute data subset may be the set of data corresponding to the seek error rate attribute in SMART.
In an embodiment of the present invention, first, M attribute data items at a time are selected successively from the first attribute data subset corresponding to the s-th attribute in each piece of SMART data, to generate P second attribute data subsets, wherein each second attribute data subset includes M attribute data items, P = N - M + 1, N is the total number of attribute data items in the first attribute data subset corresponding to attribute s, and s = 1, ..., S. The count P = N - M + 1 corresponds to a window of M consecutive items that advances one item at a time over the N items.
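The window selection can be sketched in Python as follows. This assumes that selecting M attribute data items "successively" means taking M consecutive readings and advancing one position at a time, which is what P = N - M + 1 suggests; the function name `sliding_subsets` and the sample readings are hypothetical.

```python
from typing import List, Sequence


def sliding_subsets(values: Sequence[float], m: int) -> List[List[float]]:
    """Return the P = N - M + 1 windows of M consecutive items."""
    n = len(values)
    if not 1 <= m <= n:
        raise ValueError("M must satisfy 1 <= M <= N")
    return [list(values[i:i + m]) for i in range(n - m + 1)]


# N = 5 readings of one attribute and M = 3, so P = 5 - 3 + 1 = 3 windows.
windows = sliding_subsets([5.0, 6.0, 8.0, 7.0, 9.0], m=3)
```

Under this reading, the i-th window (counting from 1) ends at reading number M + i - 1, which matches the index used in the gradient formula.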
Then, P gradient data items respectively corresponding to the P second attribute data subsets are calculated, and the gradient data set corresponding to attribute s is generated from the P gradient data items. Specifically, after the P second attribute data subsets are obtained, P fitting coefficients respectively corresponding to them can be obtained by the weighted least squares method. The i-th of the P fitting coefficients is $k_i = (Z - b \cdot Y)/X$, where $i = 1, \dots, P$, and X, Y, Z and b are obtained by the following formulas:
where $w_j$ is the preset weight corresponding to the j-th attribute data item in the first attribute data subset corresponding to attribute s, $x_j$ is the detection time of the j-th attribute data item in that subset, and $y_j$ is the j-th attribute data item in that subset.
After the i-th fitting coefficient $k_i$ is obtained, the i-th gradient data item $\mathrm{Grad}_i$ of the P gradient data items can be obtained by the following formula:
$\mathrm{Grad}_i = k_i \cdot (M - 1) \cdot y_{M+i-1}$
where $k_i \cdot (M - 1)$ is the difference in value between the two attribute data items at the ends of the straight line fitted by the weighted least squares method, and its sign indicates the trend of the whole line. Multiplying this difference by the y value of the last attribute data item gives the final gradient value, whose magnitude reflects both the overall trend and the strength of that trend.
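A minimal Python sketch of the gradient computation for one window follows. The expressions for X, Y, Z and b are not reproduced in this text, so the slope below uses the standard closed-form weighted least squares line fit; that substitution is an assumption, not necessarily the exact formulas of the source. The function names and sample values are hypothetical, and detection times are assumed to be evenly spaced with unit spacing so that k * (M - 1) is the change along the fitted line across the window.

```python
from typing import Sequence


def weighted_slope(x: Sequence[float], y: Sequence[float], w: Sequence[float]) -> float:
    """Slope k of the line y = k*x + b that minimizes sum(w_j * (y_j - k*x_j - b)**2)."""
    sw = sum(w)
    swx = sum(wj * xj for wj, xj in zip(w, x))
    swy = sum(wj * yj for wj, yj in zip(w, y))
    swxx = sum(wj * xj * xj for wj, xj in zip(w, x))
    swxy = sum(wj * xj * yj for wj, xj, yj in zip(w, x, y))
    denom = sw * swxx - swx * swx
    if denom == 0:
        raise ValueError("detection times must not all be equal")
    return (sw * swxy - swx * swy) / denom


def window_gradient(x: Sequence[float], y: Sequence[float], w: Sequence[float]) -> float:
    """Grad_i = k_i * (M - 1) * y_{M+i-1} for one window of M readings."""
    m = len(y)
    k = weighted_slope(x, y, w)
    return k * (m - 1) * y[-1]  # y[-1] is the last reading in the window


# One window with M = 3, unit-spaced detection times and equal weights.
grad = window_gradient(x=[0.0, 1.0, 2.0], y=[1.0, 3.0, 5.0], w=[1.0, 1.0, 1.0])
```

For this window the fitted slope is 2, so the gradient is 2 * (3 - 1) * 5 = 20. A falling series gives a negative slope and therefore a negative gradient, which is how the sign carries the trend.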
After the P gradient data items are obtained, the gradient data set corresponding to attribute s can be generated from them.
It should be understood that the above steps finally yield the S gradient data sets corresponding to the S first attribute data subsets in each piece of SMART data.
S23, add the S gradient data sets obtained for each piece of SMART data to that piece of SMART data as S new first attribute data subsets.
S24, normalize the Q pieces of SMART data to generate Q pieces of normalized SMART data.
In an embodiment of the present invention, the Q pieces of SMART data can be normalized by the following formula:
$g(x) = \operatorname{sign}(x) \times \log_y |x|$,
where x is an attribute data item in the Q pieces of SMART data and g(x) is the normalized attribute data corresponding to x. The base y can be calculated from the following inequality:
$y^z \le Value < (y + \Delta y)^z$,
where z is a preset threshold, Value is the factory default value of the attribute corresponding to attribute data x, and $\Delta y$ is a preset precision.
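The normalization can be sketched in Python as follows. Two points are assumptions rather than statements from the text: the base y is found by rounding the z-th root of Value down to a multiple of the precision Δy, which is one direct way to satisfy the inequality, and g(0) is taken to be 0 because the logarithm is undefined there. The function names and sample numbers are hypothetical.

```python
import math


def choose_base(value: float, z: float, dy: float) -> float:
    """Pick y on a grid of step dy so that y**z <= value < (y + dy)**z."""
    root = value ** (1.0 / z)           # exact solution of y**z == value
    return math.floor(root / dy) * dy   # round down to the preset precision


def normalize(x: float, y: float) -> float:
    """g(x) = sign(x) * log_y(|x|); g(0) is taken to be 0 (an assumption)."""
    if x == 0:
        return 0.0
    sign = 1.0 if x > 0 else -1.0
    return sign * math.log(abs(x), y)


# Hypothetical numbers: factory default 200, threshold z = 100, precision 0.001.
y = choose_base(value=200.0, z=100.0, dy=0.001)
g_default = normalize(200.0, y)
```

With this choice of base, the factory default value maps to roughly z, so attributes with very different raw ranges land on a comparable scale.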
For example, when the largest gradient values, those in the range (1900, 2000), account for more than 70% of the total, the calculation yields Grad = 1.078 and y = 1.071, giving the normalized gradient plot shown in Fig. 3 and the normalized attribute data plot shown in Fig. 4.
S25, correct the Q pieces of normalized SMART data according to the Q pieces of hard disk type information, respectively, to generate a corrected SMART data set.
In order to obtain the test results of different hard disks separately from the test results of the hard disk failure early-warning model, in an embodiment of the present invention, Q correction values respectively corresponding to the Q pieces of hard disk type information can be obtained from that information, and each piece of normalized SMART data is corrected by its corresponding correction value. For example, a corresponding data offset can be set for each hard disk according to its type information, and the Q pieces of normalized SMART data are corrected according to the data offset of each hard disk, thereby partitioning the SMART data set.
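One way to picture the correction step is the Python sketch below. The text does not say how the offset is chosen, so spacing the models by a fixed band width larger than the normalized value range is an assumption, as are the names `assign_offsets`, `correct` and `band_width` and the sample values.

```python
from typing import Dict, List, Tuple


def assign_offsets(models: List[str], band_width: float) -> Dict[str, float]:
    """Give each distinct hard disk model its own offset: 0, band_width, 2 * band_width, ..."""
    distinct = sorted(set(models))
    return {model: index * band_width for index, model in enumerate(distinct)}


def correct(normalized: List[Tuple[str, List[float]]],
            offsets: Dict[str, float]) -> List[List[float]]:
    """Shift each disk's normalized values by the offset of its model."""
    return [[v + offsets[model] for v in values] for model, values in normalized]


# Normalized SMART data for two disks of different models (made-up values).
normalized = [("model-A", [10.0, 12.5]), ("model-B", [11.0, 9.0])]
offsets = assign_offsets([model for model, _ in normalized], band_width=1000.0)
corrected = correct(normalized, offsets)
```

Because each model's values end up in a disjoint band, results produced by one shared model can still be attributed to a specific hard disk type.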
S26, generate hard disk feature data according to the corrected SMART data set.
In an embodiment of the present invention, each corrected SMART data set includes S corrected attribute data sets respectively corresponding to the S attributes. Specifically, S pieces of training feature data corresponding to the S attributes can be obtained, and the training feature values in each piece of training feature data are sorted to generate S feature sequences $(V_i)$ corresponding to the S attributes. The feature value f(v) of each attribute value v in the corrected attribute data set corresponding to each attribute can then be obtained by the following preset mapping rule:
After the feature value of each attribute value is obtained, the hard disk feature data can be generated from the feature values of the attribute values in the corrected attribute data set corresponding to each attribute. In this way, the mapping rule ensures that every training data item in the corrected SMART data set has a feature value that can be used to train the failure early-warning model, which avoids missing feature values during model training and improves the accuracy of the failure prediction model.
In the method for extracting feature data from hard disk SMART data of the embodiment of the present invention, before the multiple pieces of SMART data are normalized, the gradient data set corresponding to the attribute data in each piece of SMART data is obtained by the least squares method, and the corresponding first attribute data set is updated with the gradient data set. This makes the trend of the SMART data stand out and, combined with a machine learning algorithm, improves the performance of the trained failure early-warning model.
To implement the above embodiments, the present invention also proposes a device for extracting feature data from hard disk SMART data.
The device for extracting feature data from hard disk SMART data includes: a first acquisition module, configured to obtain a SMART data set of sample hard disks, wherein the SMART data set includes Q pieces of SMART data and Q pieces of hard disk type information respectively corresponding to the Q pieces of SMART data; a first generation module, configured to normalize the Q pieces of SMART data to generate Q pieces of normalized SMART data; a correction module, configured to correct the Q pieces of normalized SMART data according to the Q pieces of hard disk type information, respectively, to generate a corrected SMART data set; and a second generation module, configured to generate hard disk feature data according to the corrected SMART data set.
Fig. 5 is a structural diagram of a device for extracting feature data from hard disk SMART data according to one embodiment of the present invention.
As shown in Fig. 5, the device for extracting feature data from hard disk SMART data includes: a first acquisition module 100, a first generation module 200, a correction module 300 and a second generation module 400.
Specifically, the first acquisition module 100 is configured to obtain a SMART data set of sample hard disks. The SMART data set includes Q pieces of SMART data and Q pieces of hard disk type information respectively corresponding to them. In other words, the SMART data set is the collection of attribute data recorded in SMART for Q hard disks of the same type and/or different types, such as seek error rate and hard disk temperature, together with the corresponding hard disk type information. The hard disk type information is information about the hard disk provided by the hard disk vendor, such as the hard disk model and the hard disk ID (identity). For example, when a machine learning algorithm is being trained, the first acquisition module 100 can obtain, from the SMART data set, attribute data such as the seek error rate, power-on count and temperature of multiple different hard disks, together with information such as the model and ID of each hard disk.
The first generation module 200 is configured to normalize the Q pieces of SMART data to generate Q pieces of normalized SMART data. Specifically, in one embodiment of the present invention, the first generation module 200 can normalize each attribute data item in the SMART data of the Q hard disks of the same type and/or different types separately, so that attribute data with different value ranges in the SMART data are normalized into the same value range. This enables unified analysis and processing of the SMART data of different types of hard disks.
The correction module 300 is configured to correct the Q pieces of normalized SMART data according to the Q pieces of hard disk type information, respectively, to generate a corrected SMART data set. Specifically, in one embodiment of the present invention, the correction module 300 can set a corresponding data offset for each hard disk according to its type information, and correct the Q pieces of normalized SMART data according to the data offset of each hard disk, thereby partitioning the SMART data set. The test results of different hard disks can thus be obtained separately from the test results of the hard disk failure early-warning model.
The second generation module 400 is configured to generate hard disk feature data according to the corrected SMART data set.
In the device for extracting feature data from hard disk SMART data of the embodiment of the present invention, the SMART data of the sample hard disks are normalized, and the normalized SMART data are corrected according to the hard disk type information. The hard disk SMART data thus share the same value range, and the correction partitions the normalized SMART data by hard disk type. As a result, failure early-warning testing and analysis of different hard disks can be performed with a single failure early-warning model, which improves the accuracy of the model and reduces the cost of model training, testing and analysis.
Fig. 6 is a structural diagram of a device for extracting feature data from hard disk SMART data according to another embodiment of the present invention.
As shown in Fig. 6, the device for extracting feature data from hard disk SMART data includes: a first acquisition module 100, a first generation module 200, a correction module 300, a second generation module 400, a second acquisition module 500 and an adding module 600.
In an embodiment of the present invention, the SMART data set includes S attribute data sets respectively corresponding to S attributes, and each piece of SMART data includes S first attribute data subsets respectively corresponding to the S attributes. A first attribute data subset is the set of data corresponding to a certain attribute in SMART. For example, a first attribute data subset may be the set of data corresponding to the seek error rate attribute in SMART.
Specifically, the second acquisition module 500 is configured to obtain, for each piece of SMART data, S gradient data sets respectively corresponding to its S first attribute data subsets. The adding module 600 is configured to add the S gradient data sets obtained for each piece of SMART data to that piece of SMART data as S new first attribute data subsets.
Fig. 7 is a structural diagram of a device for extracting feature data from hard disk SMART data according to yet another embodiment of the present invention.
As shown in Fig. 7, the device for extracting feature data from hard disk SMART data includes: a first acquisition module 100, a first generation module 200, a correction module 300, a second generation module 400, a second acquisition module 500 and an adding module 600. The second acquisition module 500 includes a first generation unit 510 and a second generation unit 520; the first generation module 200 includes a processing unit 210; the correction module 300 includes a first acquisition unit 310 and a correction unit 320; and the second generation module 400 includes a second acquisition unit 410, a sorting unit 420 and a third acquisition unit 430. The second generation unit 520 includes a first acquisition subunit 521 and a second acquisition subunit 522.
Specifically, the first generation unit 510 is configured to select M attribute data items at a time, successively, from the first attribute data subset corresponding to the s-th attribute in each piece of SMART data, to generate P second attribute data subsets, wherein each second attribute data subset includes M attribute data items, P = N - M + 1, N is the total number of attribute data items in the first attribute data subset corresponding to attribute s, and s = 1, ..., S.
The second generation unit 520 is configured to calculate P gradient data items respectively corresponding to the P second attribute data subsets, and to generate the gradient data set corresponding to attribute s from the P gradient data items.
The processing unit 210 is configured to normalize the Q pieces of SMART data by the following formula:
$g(x) = \operatorname{sign}(x) \times \log_y |x|$,
where x is an attribute data item in the Q pieces of SMART data and g(x) is the normalized attribute data corresponding to x. The base y can be calculated from the following inequality:
$y^z \le Value < (y + \Delta y)^z$,
where z is a preset threshold, Value is the factory default value of the attribute corresponding to attribute data x, and $\Delta y$ is a preset precision.
The first acquisition unit 310 is configured to obtain, according to the Q hard disk type information, Q correction values respectively corresponding to the Q hard disk type information.
The correcting unit 320 is configured to correct the corresponding normalized SMART data according to the Q correction values respectively corresponding to the Q hard disk type information.
The second acquisition unit 410 is configured to obtain S training characteristic data respectively corresponding to the S attributes.
The sorting unit 420 is configured to sort the training characteristic values in each training characteristic data, so as to generate S characteristic sequences $(V_i)$ corresponding to the S attributes.
The third acquisition unit 430 is configured to obtain, by the following preset mapping rule, the characteristic value f(v) of each attribute value v in the corrected attribute data set corresponding to each attribute:
The third generating unit 440 is configured to generate the hard disk characteristic data from the obtained characteristic values of the attribute values in the corrected attribute data set corresponding to each attribute.
The first acquisition subunit 521 is configured to obtain, by weighted least squares, the P fitting coefficients respectively corresponding to the P second attribute data subsets, where the i-th fitting coefficient $k_i$ of the P fitting coefficients is
$$k_i = (Z - b \cdot Y) / X,$$
where i = 1, …, P,
$w_j$ is the preset weight corresponding to the j-th attribute data in the first attribute data subset corresponding to attribute s, $x_j$ is the detection time of the j-th attribute data in that subset, and $y_j$ is the j-th attribute data in that subset.
The second acquisition subunit 522 is configured to obtain the i-th gradient data $\mathrm{Grad}_i$ of the P gradient data by the following formula:
$$\mathrm{Grad}_i = k_i \times (M - 1) \times y_{M+i-1}.$$
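The gradient computation can be sketched as follows. Because the definitions of X, Y, Z and b are not reproduced in this text, the sketch substitutes the standard closed-form slope of a weighted least-squares line y = k·x + b, which may differ in form from the expression above. It also assumes that the weights are indexed like the attribute data and sliced per window. The names `weighted_slope` and `gradient_series` and the sample data are hypothetical.

```python
from typing import List, Sequence


def weighted_slope(x: Sequence[float], y: Sequence[float],
                   w: Sequence[float]) -> float:
    """Slope k of the weighted least-squares line y = k*x + b."""
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    denom = sw * swxx - swx * swx
    if denom == 0:
        raise ValueError("detection times in a window must not all be equal")
    return (sw * swxy - swx * swy) / denom


def gradient_series(times: Sequence[float], values: Sequence[float],
                    weights: Sequence[float], m: int) -> List[float]:
    """Grad_i = k_i * (M - 1) * y_{M+i-1} for each of the P = N - M + 1 windows."""
    grads = []
    for i in range(len(values) - m + 1):
        k = weighted_slope(times[i:i + m], values[i:i + m], weights[i:i + m])
        # values[i + m - 1] is the last attribute data of the i-th window.
        grads.append(k * (m - 1) * values[i + m - 1])
    return grads


# Hypothetical data lying exactly on y = 2x + 1, so every slope is 2.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
values = [1.0, 3.0, 5.0, 7.0, 9.0]
weights = [1.0, 1.0, 2.0, 2.0, 3.0]
grads = gradient_series(times, values, weights, 4)
```

With M = 4 and N = 5 the result contains P = 2 gradient data, one per window.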
The device for extracting characteristic data from hard disk SMART data according to the embodiment of the present invention obtains, by the least squares method, the gradient data set corresponding to the attribute data in each SMART data, and updates the corresponding first attribute data set with the gradient data set. This makes the variation trend of the SMART data stand out, so that, combined with a machine learning algorithm, the trained fault early-warning model performs better.
Any process or method description in a flowchart, or otherwise described herein, may be understood as representing a module, fragment or portion of code that includes one or more executable instructions for implementing specific logical functions or steps of the process. The scope of the preferred embodiments of the present invention includes other implementations in which functions may be performed out of the order shown or discussed, including substantially concurrently or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present invention belong.
The logic and/or steps represented in a flowchart or otherwise described herein, for example an ordered list of executable instructions for implementing logical functions, may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from the instruction execution system, apparatus or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate or transport a program for use by, or in connection with, an instruction execution system, apparatus or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program is printed, since the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting or otherwise processing it in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that each part of the present invention may be implemented in hardware, software, firmware or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or a combination of the following techniques known in the art may be used: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those of ordinary skill in the art will appreciate that all or part of the steps of the above embodiment methods may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.
In addition, the functional units in the embodiments of the present invention may be integrated in one processing module, or each unit may exist physically on its own, or two or more units may be integrated in one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented as a software functional module and is sold or used as an independent product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
In the description of this specification, a description with reference to the terms "one embodiment", "some embodiments", "example", "specific example" or "some examples" means that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in a suitable manner in any one or more embodiments or examples.
Although embodiments of the present invention have been shown and described, those of ordinary skill in the art will understand that various changes, modifications, substitutions and variations may be made to these embodiments without departing from the principle and spirit of the present invention, the scope of which is defined by the claims and their equivalents.
Claims (12)
1. A method for extracting characteristic data from hard disk SMART (Self-Monitoring, Analysis and Reporting Technology) data, characterized by comprising:
obtaining a SMART data set of sample hard disks, wherein the SMART data set includes Q SMART data and Q hard disk type information respectively corresponding to the Q SMART data;
normalizing the Q SMART data to generate Q normalized SMART data;
correcting the Q normalized SMART data respectively according to the Q hard disk type information to generate a corrected SMART data set; and
generating hard disk characteristic data according to the corrected SMART data set;
wherein the SMART data set includes S attribute data sets respectively corresponding to S attributes, each SMART data includes S first attribute data subsets respectively corresponding to the S attributes, and before the normalizing of the Q SMART data the method further comprises:
obtaining S gradient data sets respectively corresponding to the S first attribute data subsets in each SMART data; and
adding the S gradient data sets obtained for each SMART data to that SMART data as S new first attribute data subsets.
2. The method according to claim 1, characterized in that obtaining the S gradient data sets respectively corresponding to the S first attribute data subsets in each SMART data specifically comprises:
successively selecting M attribute data from the first attribute data subset corresponding to the s-th attribute in each SMART data to generate P second attribute data subsets, wherein P = N - M + 1, N is the total number of attribute data in the first attribute data subset corresponding to attribute s, and s = 1, …, S; and
calculating P gradient data respectively corresponding to the P second attribute data subsets, and generating the gradient data set corresponding to attribute s from the P gradient data.
3. The method according to claim 2, characterized in that calculating the P gradient data respectively corresponding to the P second attribute data subsets specifically comprises:
obtaining, by weighted least squares, P fitting coefficients respectively corresponding to the P second attribute data subsets, wherein the i-th fitting coefficient $k_i$ of the P fitting coefficients is
$$k_i = (Z - b \cdot Y) / X,$$
wherein i = 1, …, P,
$w_j$ is the preset weight corresponding to the j-th attribute data in the first attribute data subset corresponding to attribute s, $x_j$ is the detection time of the j-th attribute data in that subset, and $y_j$ is the j-th attribute data in that subset; and
obtaining the i-th gradient data $\mathrm{Grad}_i$ of the P gradient data by the following formula:
$$\mathrm{Grad}_i = k_i \times (M - 1) \times y_{M+i-1}.$$
4. The method according to any one of claims 1 to 3, characterized in that normalizing the Q SMART data specifically comprises:
normalizing the Q SMART data by the following formula:
$$g(x) = \operatorname{sign}(x) \times \log_y |x|,$$
wherein x is one attribute data in the Q SMART data, g(x) is the normalized attribute data corresponding to x, and y can be calculated from the following inequality:
$$y^z \le \mathrm{Value} < (y + \Delta y)^z,$$
wherein z is a preset threshold, Value is the factory default value of the attribute to which the attribute data x belongs, and Δy is a preset precision.
5. The method according to any one of claims 1 to 3, characterized in that correcting the Q normalized SMART data respectively according to the Q hard disk type information specifically comprises:
obtaining, according to the Q hard disk type information, Q correction values respectively corresponding to the Q hard disk type information; and
correcting the corresponding normalized SMART data according to the Q correction values respectively corresponding to the Q hard disk type information.
6. The method according to any one of claims 1 to 3, characterized in that the corrected SMART data set includes S corrected attribute data sets respectively corresponding to the S attributes, and generating the hard disk characteristic data according to the corrected SMART data set specifically comprises:
obtaining S training characteristic data respectively corresponding to the S attributes;
sorting the training characteristic values in each training characteristic data to generate S characteristic sequences $(V_i)$ corresponding to the S attributes;
obtaining, by the following preset mapping rule, the characteristic value f(v) of each attribute value v in the corrected attribute data set corresponding to each attribute:
generating the hard disk characteristic data from the obtained characteristic values of the attribute values in the corrected attribute data set corresponding to each attribute.
7. A device for extracting characteristic data from hard disk SMART data, characterized by comprising:
a first acquisition module, configured to obtain a SMART data set of sample hard disks, wherein the SMART data set includes Q SMART data and Q hard disk type information respectively corresponding to the Q SMART data;
a first generation module, configured to normalize the Q SMART data to generate Q normalized SMART data;
a correcting module, configured to correct the Q normalized SMART data respectively according to the Q hard disk type information to generate a corrected SMART data set; and
a second generation module, configured to generate hard disk characteristic data according to the corrected SMART data set;
wherein the SMART data set includes S attribute data sets respectively corresponding to S attributes, each SMART data includes S first attribute data subsets respectively corresponding to the S attributes, and the device further comprises, before the first generation module:
a second acquisition module, configured to obtain S gradient data sets respectively corresponding to the S first attribute data subsets in each SMART data; and
an addition module, configured to add the S gradient data sets obtained for each SMART data to that SMART data as S new first attribute data subsets.
8. The device according to claim 7, characterized in that the second acquisition module specifically comprises:
a first generating unit, configured to successively select M attribute data from the first attribute data subset corresponding to the s-th attribute in each SMART data to generate P second attribute data subsets, wherein P = N - M + 1, N is the total number of attribute data in the first attribute data subset corresponding to attribute s, and s = 1, …, S; and
a second generating unit, configured to calculate P gradient data respectively corresponding to the P second attribute data subsets, and to generate the gradient data set corresponding to attribute s from the P gradient data.
9. The device according to claim 8, characterized in that the second generating unit specifically comprises:
a first acquisition subunit, configured to obtain, by weighted least squares, P fitting coefficients respectively corresponding to the P second attribute data subsets, wherein the i-th fitting coefficient $k_i$ of the P fitting coefficients is
$$k_i = (Z - b \cdot Y) / X,$$
wherein i = 1, …, P,
$w_j$ is the preset weight corresponding to the j-th attribute data in the first attribute data subset corresponding to attribute s, $x_j$ is the detection time of the j-th attribute data in that subset, and $y_j$ is the j-th attribute data in that subset; and
a second acquisition subunit, configured to obtain the i-th gradient data $\mathrm{Grad}_i$ of the P gradient data by the following formula:
$$\mathrm{Grad}_i = k_i \times (M - 1) \times y_{M+i-1}.$$
10. The device according to any one of claims 7 to 9, characterized in that the first generation module specifically comprises:
a processing unit, configured to normalize the Q SMART data by the following formula:
$$g(x) = \operatorname{sign}(x) \times \log_y |x|,$$
wherein x is one attribute data in the Q SMART data, g(x) is the normalized attribute data corresponding to x, and y can be calculated from the following inequality:
$$y^z \le \mathrm{Value} < (y + \Delta y)^z,$$
wherein z is a preset threshold, Value is the factory default value of the attribute to which the attribute data x belongs, and Δy is a preset precision.
11. The device according to any one of claims 7 to 9, characterized in that the correcting module specifically comprises:
a first acquisition unit, configured to obtain, according to the Q hard disk type information, Q correction values respectively corresponding to the Q hard disk type information; and
a correcting unit, configured to correct the corresponding normalized SMART data according to the Q correction values respectively corresponding to the Q hard disk type information.
12. The device according to any one of claims 7 to 9, characterized in that the corrected SMART data set includes S corrected attribute data sets respectively corresponding to the S attributes, and the second generation module specifically comprises:
a second acquisition unit, configured to obtain S training characteristic data respectively corresponding to the S attributes;
a sorting unit, configured to sort the training characteristic values in each training characteristic data to generate S characteristic sequences $(V_i)$ corresponding to the S attributes;
a third acquisition unit, configured to obtain, by the following preset mapping rule, the characteristic value f(v) of each attribute value v in the corrected attribute data set corresponding to each attribute:
a third generating unit, configured to generate the hard disk characteristic data from the obtained characteristic values of the attribute values in the corrected attribute data set corresponding to each attribute.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310733574.5A CN103646114B (en) | 2013-12-26 | 2013-12-26 | Characteristic extracting method and device in hard disk SMART data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103646114A (en) | 2014-03-19 |
CN103646114B (en) | 2017-04-05 |
Family
ID=50251327
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310733574.5A Active CN103646114B (en) | 2013-12-26 | 2013-12-26 | Characteristic extracting method and device in hard disk SMART data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103646114B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105589795A (en) * | 2014-12-31 | 2016-05-18 | 中国银联股份有限公司 | Disk failure prediction method and device based on prediction model |
CN105260279B (en) * | 2015-11-04 | 2019-01-01 | 四川效率源信息安全技术股份有限公司 | Method and apparatus based on SMART data dynamic diagnosis hard disk failure |
CN107025153B (en) * | 2016-01-29 | 2021-02-12 | 阿里巴巴集团控股有限公司 | Disk failure prediction method and device |
CN110399238B (en) * | 2019-06-27 | 2023-09-22 | 浪潮电子信息产业股份有限公司 | Disk fault early warning method, device, equipment and readable storage medium |
CN110929305A (en) * | 2019-08-08 | 2020-03-27 | 北京盛赞科技有限公司 | Hard disk protection method, device, equipment and computer readable storage medium |
CN113380316B (en) * | 2020-02-25 | 2024-07-09 | 深信服科技股份有限公司 | Disk information mining method, device, equipment and storage medium |
CN111611117B (en) * | 2020-05-22 | 2022-06-10 | 浪潮电子信息产业股份有限公司 | Hard disk fault prediction method, device, equipment and computer readable storage medium |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1627277A (en) * | 2003-12-13 | 2005-06-15 | 张国飙 | Intelligent hard disk |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101764846B (en) * | 2009-12-18 | 2012-07-11 | 西南交通大学 | Implement method of remote centralized disk array operation monitoring system |
US8572315B2 (en) * | 2010-11-05 | 2013-10-29 | International Business Machines Corporation | Smart optimization of tracks for cloud computing |
Non-Patent Citations (1)
Title |
---|
固态硬盘当缓存 Intel Smart Response技术实战;PCFAN评测室;《电脑迷 》;20110731;32-33 * |
Also Published As
Publication number | Publication date |
---|---|
CN103646114A (en) | 2014-03-19 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||