CN107229636B - Word classification method and device - Google Patents
- Publication number: CN107229636B
- Application number: CN201610173402.0A
- Authority: CN (China)
- Prior art keywords: category, evaluation, comment data, classification, service
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G06F16/36: Creation of semantic tools, e.g. ontology or thesauri
- G06F16/355: Class or cluster creation or modification
- G06F16/951: Indexing; Web crawling techniques
Abstract
The invention discloses a word classification method comprising: obtaining comment data of a specified service; performing first-level classification on the comment data according to the first categories contained in the specified service, to obtain comment data of each first category; performing second-level classification on the comment data of each first category according to the second categories contained in that first category, to obtain comment data of each second category; and extracting the evaluation terms from the comment data of each second category and establishing a correspondence between each evaluation term and the second category to which it belongs. Because the method classifies words for a specified service, the evaluation words related to service quality can be compiled service by service, which improves the effectiveness of monitoring.
Description
Technical Field
The invention relates to the technical field of data processing, in particular to a word classification method and device.
Background
For real and virtual products alike, word of mouth has a growing influence on a product. Word of mouth comes from user evaluations, so if user evaluations can be grasped in time, the product can be modified in a targeted manner and improved.
In a prior-art public opinion monitoring system, professional public opinion software captures key information from the mass of information on the internet and adds it to the system's public word bank. The public's reaction to public opinion events can then be learned from the public word bank.
Such a system can extract keywords from user evaluations on the internet. However, those evaluations concern many kinds of services and products, and a service operator cannot correctly judge the quality of an individual service or product from the keywords in a public word bank.
Disclosure of Invention
The embodiment of the invention provides a word classification method that can compile the evaluation words related to service quality on a per-service basis, thereby improving the effectiveness of monitoring. The embodiment of the invention also provides a corresponding device.
The invention provides a word classification method in a first aspect, which comprises the following steps:
obtaining comment data of a specified service;
according to the first categories contained in the specified service, carrying out first-level classification on the comment data to obtain comment data of each first category;
according to a second category contained in each first category, carrying out second hierarchical classification on the comment data of each first category to obtain the comment data of each second category;
and extracting the evaluation terms in the comment data of each second category, and establishing the corresponding relation between the evaluation terms and the second category to which the evaluation terms belong.
The second aspect of the present invention provides a word classification device, including:
the acquisition unit is used for acquiring comment data of the specified service;
the first classification unit is used for performing first-level classification on the comment data acquired by the acquisition unit according to a first category contained in the specified service to obtain comment data of each first category;
the second classification unit is used for performing second hierarchical classification on the comment data of each first category obtained by the classification of the first classification unit according to a second category contained in each first category to obtain the comment data of each second category;
the extracting unit is used for extracting the evaluation words in the comment data of each second category classified by the second classifying unit;
and the relationship establishing unit is used for establishing the corresponding relationship between the evaluation terms extracted by the extracting unit and the second category to which the evaluation terms belong.
Compared with the prior art that the quality of each service or product cannot be judged correctly according to the keywords in the public word stock of the public opinion monitoring system, the word classification method provided by the embodiment of the invention can classify words aiming at the specified service, so that the evaluation words related to the quality of the service can be counted according to each service, and the monitoring effectiveness is improved.
Drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a public opinion monitoring system according to an embodiment of the present invention;
FIG. 2 is a diagram of an embodiment of a word classification method according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating a classification architecture of a release conference scenario in an embodiment of the invention;
FIG. 4 is a diagram illustrating a classification architecture of a movie scenario in an embodiment of the invention;
FIG. 5 is a diagram illustrating a classification architecture of an animation scenario in an embodiment of the invention;
FIG. 6 is a diagram illustrating a classification architecture of a game scenario in an embodiment of the invention;
FIG. 7 is a diagram illustrating a classification architecture of a literature scenario in an embodiment of the invention;
FIG. 8 is a diagram of an embodiment of a word classification device according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of another embodiment of a word classification device in the embodiment of the invention;
FIG. 10 is a schematic diagram of another embodiment of a word classification device in the embodiment of the invention;
FIG. 11 is a schematic diagram of another embodiment of a word classification device in the embodiment of the invention;
FIG. 12 is a schematic diagram of another embodiment of a word classification device in the embodiment of the invention.
Detailed Description
The embodiment of the invention provides a word classification method that can compile the evaluation words related to service quality on a per-service basis, thereby improving the effectiveness of monitoring. The embodiment of the invention also provides a corresponding device. Details are given below.
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The word classification method provided by the embodiment of the invention is applied to a public opinion monitoring system and can provide an evaluation-system word bank for positive and negative public opinion analysis of general entertainment business events, activities, products, and the like. That is, a dedicated evaluation word bank can be provided for a game, an animation, a literary work, a movie, a release conference, and so on. The evaluation word bank is not limited to these fields: in the future, content related to dissemination, such as high-impact events and promotional materials, may also be monitored, and the fields covered by the evaluation word bank change dynamically with demand. A content provider can therefore make targeted improvements with the help of the evaluation word bank. Because the word bank in this application contains both positive and negative evaluation words, the content provider can learn the strengths and weaknesses of its works in time and go on to produce better content. Moreover, the evaluation terms in this application are tied to a specific product, work, or service, which may also improve the efficiency of machine learning.
As shown in FIG. 1, the public opinion monitoring system generally comprises a user platform, a plurality of classification nodes, and a storage array. A user can evaluate a service through the user platform by entering comment data for that service. Comment data received on the user platform can be distributed to the classification nodes, which classify it by service. After the service to which the comment data belongs is determined, the comment data is written to the storage array. The storage array can be divided into a plurality of storage areas, for example a movie storage area, an animation storage area, a literature storage area, and a release conference storage area. Each storage area can be further subdivided into smaller storage areas, each of which stores the comment data of one specific service.
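The two-level storage layout described for FIG. 1 can be pictured with a small in-memory sketch. This is only an illustration under assumptions: the class name `StorageArray`, its methods, and the example service names are hypothetical and do not come from the patent, which does not describe how the storage array is implemented.

```python
from collections import defaultdict


class StorageArray:
    """Hypothetical stand-in for the storage array: one storage area per
    service type, subdivided into small areas per specific service."""

    def __init__(self):
        # service type -> specific service -> list of comments
        self._areas = defaultdict(lambda: defaultdict(list))

    def store(self, service_type, service_name, comment):
        """Write one comment into the small storage area of its service."""
        self._areas[service_type][service_name].append(comment)

    def fetch(self, service_type, service_name):
        """Return a copy of the comments stored for one specific service."""
        return list(self._areas.get(service_type, {}).get(service_name, []))


storage = StorageArray()
storage.store("movie", "Film A", "The plot is warm")
storage.store("game", "Game B", "Too many bugs")
```

A word classification device would then call something like `storage.fetch("movie", "Film A")` to obtain the comment data of the specified service.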
When the user needs to know the relevant evaluation of the specified service, the word classification device can acquire the comment data of the specified service from the storage area of the specified service stored in the storage device, and then classify the comment data to extract the evaluation word.
The device for performing word classification in the embodiment of the present invention may be an independent server for word classification, or may be a service cluster including a plurality of classification nodes.
FIG. 2 is a diagram of an embodiment of a word classification method according to an embodiment of the present invention.
As shown in fig. 2, an embodiment of the method for word classification in the embodiment of the present invention includes:
101. Obtain comment data of the specified service.
The specified service may be a service specified by a content developer, such as a movie, a literary work, or a cartoon.
102. Perform first-level classification on the comment data according to the first categories contained in the specified service, to obtain comment data of each first category.
The specified service may include a plurality of first categories. For example, a movie may include the first categories of movie characteristics, characters, movie internals, and overall.
The comment data for the movie may then be classified under these first categories.
103. Perform second-level classification on the comment data of each first category according to the second categories contained in that first category, to obtain comment data of each second category.
A first category may in turn include a plurality of second categories. For example, movie characteristics may include second categories such as lines/dialogue, plot/drama, suspense, sound effects, color, photography/picture, music, details, special effects, and scenes.
Classifying the comment data of movie characteristics under these second categories yields the comment data of each second category.
Of course, a second category may be further divided into third categories, a third category into fourth categories, and so on. Such subdivision cannot be listed exhaustively, so this application uses division down to the second category only as an example.
104. Extract the evaluation terms from the comment data of each second category, and establish a correspondence between each evaluation term and the second category to which it belongs.
Evaluation terms may be extracted from the comment data by keyword extraction. For example, suppose the comment data in the second category "plot" is: "The theme of 'The Star on the Earth' is excellent; the subject matter is good and novel." The keyword "novel" can be extracted from it, and a correspondence between "plot" and "novel" is then established.
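Step 104 can be sketched as follows. The patent only says that a keyword extraction method may be used, so this sketch assumes the simplest possible one, matching tokens against a fixed lexicon. `EVALUATION_LEXICON` and both function names are hypothetical.

```python
import re

# Hypothetical lexicon of known evaluation words (an assumption of this sketch).
EVALUATION_LEXICON = {"novel", "warm", "classic", "boring"}


def extract_evaluation_terms(comment, lexicon=EVALUATION_LEXICON):
    """Return the lexicon words found in the comment, in order, without repeats."""
    found = []
    for token in re.findall(r"[a-z]+", comment.lower()):
        if token in lexicon and token not in found:
            found.append(token)
    return found


def build_correspondence(comments_by_second_category):
    """Map each second category to the evaluation terms found in its comments."""
    correspondence = {}
    for second_category, comments in comments_by_second_category.items():
        terms = correspondence.setdefault(second_category, [])
        for comment in comments:
            for term in extract_evaluation_terms(comment):
                if term not in terms:
                    terms.append(term)
    return correspondence


correspondence = build_correspondence(
    {"plot": ["The theme is excellent; the subject matter is good and novel."]}
)
```

With this lexicon, only "novel" is picked up from the example comment, which mirrors the plot-novel correspondence in the text. A production system would more likely use a segmentation and keyword-ranking tool suited to the comment language.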
In the embodiment of the invention, comment data of a specified service are acquired; according to the first categories contained in the specified service, carrying out first-level classification on the comment data to obtain comment data of each first category; according to a second category contained in each first category, carrying out second hierarchical classification on the comment data of each first category to obtain the comment data of each second category; and extracting the evaluation terms in the comment data of each second category, and establishing the corresponding relation between the evaluation terms and the second category to which the evaluation terms belong. Compared with the prior art that the quality of each service or product cannot be judged correctly according to the keywords in the public word stock of the public opinion monitoring system, the word classification method provided by the embodiment of the invention can classify words aiming at the specified service, so that the evaluation words related to the quality of the service can be counted according to each service, and the monitoring effectiveness is improved.
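The two classification levels of steps 102 and 103 can be sketched in the same spirit. The patent does not specify how a comment is assigned to a category, so this sketch assumes plain cue-word matching. `TAXONOMY`, its cue words, and both function names are hypothetical.

```python
# Hypothetical taxonomy: first category -> second category -> cue words.
TAXONOMY = {
    "movie characteristics": {
        "plot": ["plot", "storyline"],
        "music": ["music", "soundtrack"],
    },
    "characters": {
        "director": ["director", "directed"],
        "actors": ["actor", "actress", "acting"],
    },
}


def classify_first_level(comments, taxonomy):
    """Group comments under every first category whose cue words they mention."""
    result = {first: [] for first in taxonomy}
    for comment in comments:
        text = comment.lower()
        for first, seconds in taxonomy.items():
            cues = [cue for words in seconds.values() for cue in words]
            if any(cue in text for cue in cues):
                result[first].append(comment)
    return result


def classify_second_level(first_level, taxonomy):
    """Split each first category's comments among its second categories."""
    result = {}
    for first, comments in first_level.items():
        for second, cues in taxonomy[first].items():
            result[(first, second)] = [
                c for c in comments if any(cue in c.lower() for cue in cues)
            ]
    return result


comments = ["The plot is warm", "The director did well"]
first_level = classify_first_level(comments, TAXONOMY)
second_level = classify_second_level(first_level, TAXONOMY)
```

Keying the second-level result by the pair (first category, second category) keeps categories with the same name under different parents apart.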
Optionally, after the extracting of the evaluation terms in the comment data of each second category, the method may further include:
determining attributes of the evaluation terms;
the establishing of the corresponding relationship between the evaluation terms and the second category to which the evaluation terms belong includes:
and establishing a corresponding relation between the evaluation terms and the second category to which the evaluation terms belong according to the attributes of the evaluation terms.
The attribute of an evaluation word can be positive or negative. Positive evaluation words help the content provider keep doing what it already does well, and negative evaluation words help the content provider revise the content in time, so that better service can be provided to users.
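A minimal sketch of attribute-aware correspondences follows. It assumes the attribute is looked up in two fixed word sets; the patent does not say how the attribute is determined, and `POSITIVE`, `NEGATIVE`, and the function names are hypothetical.

```python
# Hypothetical polarity lexicons (assumptions of this sketch).
POSITIVE = {"warm", "classic", "novel"}
NEGATIVE = {"boring", "sloppy"}


def term_attribute(term):
    """Return 'positive', 'negative', or 'unknown' for an evaluation term."""
    if term in POSITIVE:
        return "positive"
    if term in NEGATIVE:
        return "negative"
    return "unknown"


def build_by_attribute(pairs):
    """Build one correspondence table per attribute from (second category, term) pairs."""
    tables = {"positive": {}, "negative": {}}
    for second_category, term in pairs:
        attribute = term_attribute(term)
        if attribute in tables:  # terms of unknown polarity are skipped
            tables[attribute].setdefault(second_category, []).append(term)
    return tables


tables = build_by_attribute(
    [("plot", "warm"), ("plot", "boring"), ("overall", "classic"), ("plot", "odd")]
)
```

Keeping separate positive and negative tables matches the way the description presents its examples, with one table of positive evaluation words and one of negative evaluation words per service.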
In the embodiment of the present invention, relationships between first categories and second categories are established for release conferences, movies, animations, games, and literary works, and third categories are also established for some second categories. The word classification process for these different services is described below with reference to FIGS. 3 to 7.
FIG. 3 is a schematic diagram of the classification architecture of a release conference scenario in the embodiment of the present invention.
As shown in FIG. 3, the first categories of a release conference service may include content, people, service, promotion, and overall. The second categories under content may include theme, product, facility, music, venue, environment, and time. The second categories under people may include guests, teams, presenters, performers, and audience. The second categories under service may include ticketing, invitations, and live services. The second categories under promotion may include media lineup and promotional material.
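The FIG. 3 hierarchy can be written down as a plain nested mapping. The category names below are transcribed from the description, with "overall" as the reading assumed here for the fifth first category (it matches the last row of Table 1). The variable and function names are hypothetical.

```python
# First category -> second categories for a release conference.
# "overall" is the reading assumed here for the fifth first category.
RELEASE_CONFERENCE_TAXONOMY = {
    "content": ["theme", "product", "facility", "music", "venue", "environment", "time"],
    "people": ["guests", "teams", "presenters", "performers", "audience"],
    "service": ["ticketing", "invitations", "live services"],
    "promotion": ["media lineup", "promotional material"],
    "overall": ["overall"],
}


def find_first_category(second_category, taxonomy=RELEASE_CONFERENCE_TAXONOMY):
    """Return the first category that contains the given second category, or None."""
    for first, seconds in taxonomy.items():
        if second_category in seconds:
            return first
    return None
```

The taxonomies of FIGS. 4 to 7 could be stored the same way, one mapping per service type.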
The comment data of the conference service may include positive comment data and negative comment data, the comment data may be classified from the positive side and the negative side, the positive comment may be understood with reference to table 1, for example, and the negative comment may be understood with reference to table 2, for example.
Table 1: Correspondence between positive evaluation words and categories for the release conference service
First category | Second category | Evaluation word
Content | Music | Fits the occasion
Content | Venue | Grand
People | Presenter | Unique
People | Audience | Orderly
Service | Invitation | Exquisite
Promotion | Media material | Fresh
Overall | Overall | Gorgeous
Table 2: correspondence between negative evaluation words and categories of conference release business
Tables 1 and 2 give only a few examples of positive and negative evaluation words for the release conference service; in practice, the public opinion monitoring system may hold correspondences between many evaluation words and the second categories. From these evaluations, the organizer of the release conference can learn its strengths and weaknesses, carry the strengths into the next release conference, and remedy the weaknesses as far as possible.
Fig. 4 is a schematic diagram of a classification architecture of movie scenes in an embodiment of the present invention.
As shown in FIG. 4, the first categories of a movie service may include movie characteristics, characters, movie internals, and overall. The second categories under movie characteristics may include lines/dialogue, plot, suspense, sound effects, photography, color, music, details, special effects, scenes, and the like. The second categories under characters may include director and actors. The second categories under movie internals may include values, emotions, story, and style. The overall category is a judgment of the work as a whole, so its second category can also be regarded as overall.
The review data of the movie service may include positive review data and negative review data, and the review data may be classified from positive and negative, respectively, the positive reviews may be understood with reference to table 3, for example, and the negative reviews may be understood with reference to table 4, for example.
Table 3: Correspondence between positive evaluation words and categories for the movie service
First category | Second category | Evaluation word
Movie characteristics | Plot | Warm
Movie characteristics | Special effects | Impactful
Characters | Actors | Playing with games
Characters | Director | Well directed
Movie internals | Story | Vivid and moving
Movie internals | Values | Positive energy
Overall | Overall | Classic
Table 4: correspondence between negative evaluation words and categories of film services
Tables 3 and 4 give only a few examples of positive and negative evaluation words for the movie service; in practice, the public opinion monitoring system may hold correspondences between many evaluation words and the second categories. From these evaluations, the producer of the film can learn its strengths and weaknesses, carry the strengths into the next film, and remedy the weaknesses as far as possible.
The taxonomy architecture diagrams of scenes for animations, games and literary works can be understood with reference to fig. 5-7, respectively.
The first and second categories of animation, games and literary works can be understood with reference to fig. 5 to 7 in combination with the above-described conference service of fig. 3 and the movie service of fig. 4, which are not listed in the present application.
In this embodiment of the present invention, after the establishing of the correspondence between the evaluation term and the second category to which the evaluation term belongs, the method may further include:
obtaining comment data updated by the specified service;
determining updated evaluation terms in the second category according to the updated comment data;
and updating the corresponding relation according to the updated evaluation words.
After learning of some evaluation words, a content provider may try to improve the specified service, for example by patching a game to fix its defects. New comment data may then appear. Updated evaluation words can be extracted from the new comment data, and the correspondence between the second category and the evaluation words is updated accordingly. In this way the content provider can learn of new evaluation words in real time and make further improvements.
Optionally, the updating the corresponding relationship according to the updated evaluation term may include:
when the updated evaluation term is already included in the second category, incrementing the count of the existing evaluation term that is identical to the updated evaluation term;
when the updated evaluation term is not included in the second category, establishing a corresponding relationship between the updated evaluation term and the second category.
In the embodiment of the invention, when the evaluation words of a movie already include the correspondence plot-warm, the count for plot-warm can simply be incremented instead of listing the correspondence again. When the evaluation words of the movie do not yet include plot-tightly interlinked, the correspondence plot-tightly interlinked can be added.
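The update rule can be sketched with a counter per second category. Storing each correspondence as a `collections.Counter` is an assumption of this sketch, not something the patent prescribes, and the function name is hypothetical.

```python
from collections import Counter


def update_correspondence(correspondence, second_category, updated_terms):
    """Apply the update rule to one second category.

    A term already recorded for the category has its count incremented;
    a term not yet recorded gets a new entry with a count of 1.
    """
    counts = correspondence.setdefault(second_category, Counter())
    for term in updated_terms:
        counts[term] += 1  # a missing key starts from 0 in a Counter
    return correspondence


correspondence = {"plot": Counter({"warm": 3})}
update_correspondence(correspondence, "plot", ["warm", "tightly interlinked"])
```

Because a `Counter` treats missing keys as zero, both branches of the rule collapse into a single increment, while the stored counts still show how often each evaluation word has been seen.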
Optionally, before the obtaining of the comment data of the specified service, the method may further include:
and determining the business to which the received comment data belongs according to the keywords in the received comment data.
In the embodiment of the present invention, determining the service to which the received comment data belongs may be understood with reference to the description in fig. 1, and redundant description is not repeated here.
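Determining the service from keywords can be sketched as a simple scoring pass. The keyword lists and the tie-breaking behaviour are assumptions of this sketch, and `SERVICE_KEYWORDS` and `determine_service` are hypothetical names.

```python
# Hypothetical keyword lists per service (assumptions of this sketch).
SERVICE_KEYWORDS = {
    "movie": ["film", "director", "actor"],
    "game": ["gameplay", "level", "patch"],
}


def determine_service(comment, service_keywords=SERVICE_KEYWORDS):
    """Return the service whose keywords occur most often in the comment.

    Returns None when no keyword of any service occurs. On a tie, the
    service listed first in the mapping wins.
    """
    text = comment.lower()
    scores = {
        service: sum(text.count(keyword) for keyword in keywords)
        for service, keywords in service_keywords.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

A comment for which `determine_service` returns None would need some other handling, which the description does not cover.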
Referring to fig. 8, an embodiment of the apparatus 20 for word classification provided by the embodiment of the present invention includes:
an obtaining unit 201, configured to obtain comment data of a specified service;
a first classification unit 202, configured to perform first-level classification on the comment data acquired by the acquisition unit 201 according to a first category included in the specified service, so as to obtain comment data of each first category;
a second classification unit 203, configured to perform second-level classification on the comment data of each first category obtained by the classification by the first classification unit 202 according to a second category included in each first category, so as to obtain comment data of each second category;
an extracting unit 204, configured to extract an evaluation term in the comment data of each second category classified by the second classifying unit 203;
a relationship establishing unit 205, configured to establish a correspondence relationship between the evaluation term extracted by the extracting unit 204 and the second category to which the evaluation term belongs.
In the embodiment of the present invention, the obtaining unit 201 obtains comment data of a specified service; the first classification unit 202 performs first-level classification on the comment data acquired by the acquisition unit 201 according to a first category included in the specified service, so as to obtain comment data of each first category; the second classification unit 203 performs second-level classification on the comment data of each first category obtained by the classification by the first classification unit 202 according to a second category included in each first category, so as to obtain comment data of each second category; the extracting unit 204 extracts the evaluation terms in the comment data of each second category classified by the second classifying unit 203; the relationship establishing unit 205 establishes a correspondence relationship between the evaluation term extracted by the extracting unit 204 and the second category to which the evaluation term belongs. Compared with the prior art that the quality of each service or product cannot be judged correctly according to the keywords in the public word bank of the public opinion monitoring system, the word classification device provided by the embodiment of the invention can classify words aiming at the specified service, so that the evaluation words related to the quality of the service can be counted according to each service, and the monitoring effectiveness is improved.
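The way units 201 to 205 hand data to one another can be sketched as a small class that receives each unit as a callable. The class name, the constructor parameters, and the use of plain Python callables are all assumptions of this sketch. The patent describes the units functionally and does not prescribe an implementation.

```python
class WordClassificationDevice:
    """Hypothetical composition of units 201-205 as injected callables."""

    def __init__(self, obtain, classify_first, classify_second, extract):
        self.obtain = obtain                    # role of obtaining unit 201
        self.classify_first = classify_first    # role of first classification unit 202
        self.classify_second = classify_second  # role of second classification unit 203
        self.extract = extract                  # role of extracting unit 204

    def run(self, service):
        """Run the pipeline and return second category -> evaluation terms."""
        comments = self.obtain(service)
        first_level = self.classify_first(comments)
        second_level = self.classify_second(first_level)
        correspondence = {}  # built here, in the role of relationship establishing unit 205
        for second_category, items in second_level.items():
            terms = correspondence.setdefault(second_category, [])
            for comment in items:
                for term in self.extract(comment):
                    if term not in terms:
                        terms.append(term)
        return correspondence


device = WordClassificationDevice(
    obtain=lambda service: ["The plot is warm"],
    classify_first=lambda comments: {"movie characteristics": comments},
    classify_second=lambda first: {"plot": first["movie characteristics"]},
    extract=lambda comment: ["warm"] if "warm" in comment else [],
)
result = device.run("some movie")
```

Injecting the units keeps each stage replaceable, for example swapping in a different extraction method without touching the rest of the pipeline.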
Alternatively, on the basis of the embodiment corresponding to fig. 8 and referring to fig. 9, in another alternative embodiment of the apparatus for word classification provided by the embodiment of the present invention, the apparatus 20 further includes a first determining unit 206,
the first determining unit 206, configured to determine the attribute of the evaluation term extracted by the extracting unit 204;
the relationship establishing unit 205 is configured to establish a correspondence relationship between the evaluation term and the second category to which the evaluation term belongs according to the attribute of the evaluation term determined by the first determining unit 206.
Alternatively, on the basis of the embodiment corresponding to fig. 8, referring to fig. 10, in another alternative embodiment of the apparatus 20 for word classification provided in the embodiment of the present invention, the apparatus further includes a second determining unit 207 and an updating unit 208,
the obtaining unit 201 is further configured to obtain comment data updated by the specified service;
the second determining unit 207 is configured to determine, according to the updated comment data acquired by the acquiring unit 201, an updated evaluation term in the second category;
the updating unit 208 is configured to update the corresponding relationship according to the updated evaluation term determined by the second determining unit 207.
Alternatively, on the basis of the embodiment corresponding to fig. 10, in another alternative embodiment of the apparatus 20 for word classification provided by the embodiment of the present invention,
the update unit 208 is configured to:
when the updated evaluation term is already included in the second category, incrementing the count of the existing evaluation term that is identical to the updated evaluation term;
when the updated evaluation term is not included in the second category, establishing a corresponding relationship between the updated evaluation term and the second category.
Alternatively, on the basis of the embodiment corresponding to fig. 8, referring to fig. 11, in another alternative embodiment of the apparatus 20 for word classification provided in the embodiment of the present invention, the apparatus further includes a third determining unit 209,
the third determining unit 209 is configured to determine, before the obtaining unit 201 obtains the comment data of the specified service, a service to which the received comment data belongs according to a keyword in the received comment data.
The word classification devices described in fig. 8 to 11 can be understood by referring to the descriptions of fig. 1 to 7, and the description is not repeated here.
Fig. 12 is a schematic structural diagram of the word classification device 20 according to the embodiment of the present invention. The apparatus 20 for word classification is applied to a public opinion monitoring system, the apparatus 20 for word classification includes a processor 210, a memory 250 and a transceiver 230, the memory 250 may include a read-only memory and a random access memory, and provides operating instructions and data to the processor 210. A portion of the memory 250 may also include non-volatile random access memory (NVRAM).
In some embodiments, memory 250 stores the following elements, executable modules or data structures, or a subset thereof, or an expanded set thereof:
in an embodiment of the present invention, by calling the operation instructions stored in the memory 250 (which may be stored in the operating system), the processor 210 performs the following steps:
obtaining comment data of a specified service;
according to the first categories contained in the specified service, carrying out first-level classification on the comment data to obtain comment data of each first category;
according to a second category contained in each first category, carrying out second hierarchical classification on the comment data of each first category to obtain the comment data of each second category;
and extracting the evaluation terms in the comment data of each second category, and establishing the corresponding relation between the evaluation terms and the second category to which the evaluation terms belong.
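The four steps above can be sketched as a small pipeline. This is only an illustration under stated assumptions: both classification levels use keyword matching, the taxonomy fragment and the evaluation-term lexicon are hypothetical, and comments are split on whitespace without punctuation handling. The patent does not specify the classifier used.

```python
from collections import defaultdict

# Hypothetical taxonomy fragment for a "release meeting" service:
# first category -> second category -> keywords that signal it.
TAXONOMY = {
    "content": {"music": ["music", "song"], "venue": ["venue", "hall"]},
    "characters": {"presenter": ["presenter", "host"]},
}
# Hypothetical lexicon of evaluation terms.
EVALUATION_TERMS = {"great", "boring", "crowded", "funny"}

def classify_and_extract(comments):
    """Return {second category: [evaluation terms]} for a list of comments."""
    # First-level classification: group comments under each first category.
    by_first = defaultdict(list)
    for comment in comments:
        words = set(comment.lower().split())
        for first, seconds in TAXONOMY.items():
            if any(words & set(kws) for kws in seconds.values()):
                by_first[first].append(comment)

    # Second-level classification within each first category.
    by_second = defaultdict(list)
    for first, first_comments in by_first.items():
        for comment in first_comments:
            words = set(comment.lower().split())
            for second, kws in TAXONOMY[first].items():
                if words & set(kws):
                    by_second[second].append(comment)

    # Extract evaluation terms and relate them to their second category.
    correspondence = defaultdict(list)
    for second, second_comments in by_second.items():
        for comment in second_comments:
            for word in comment.lower().split():
                if word in EVALUATION_TERMS:
                    correspondence[second].append(word)
    return dict(correspondence)

result = classify_and_extract([
    "the music was great",
    "the host was funny",
    "the hall felt crowded",
])
```

Because this sketch works at comment granularity, a comment that mentions two second categories would contribute all of its evaluation terms to both; a finer-grained association would be needed to avoid that.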
In the prior art, the quality of each service or product cannot be judged correctly from the keywords in the public word bank of a public opinion monitoring system. By contrast, the word classification device provided by the embodiment of the invention classifies words for a specified service, so that the evaluation words related to the quality of that service can be counted per service, which improves the effectiveness of monitoring.
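As an illustration of how per-service counts might feed a quality judgment, the sketch below computes the share of positive evaluation terms in each second category. The metric, the function name `positive_share`, and the tallies are assumptions; the patent states only that evaluation words related to quality are counted.

```python
def positive_share(tallies):
    """Map each second category to the fraction of its evaluation terms that are positive."""
    shares = {}
    for category, counts in tallies.items():
        total = counts["positive"] + counts["negative"]
        # Categories with no evaluation terms yet have no defined share.
        shares[category] = counts["positive"] / total if total else None
    return shares

# Hypothetical tallies of evaluation terms for one specified service.
tallies = {
    "music": {"positive": 8, "negative": 2},
    "ticketing": {"positive": 1, "negative": 3},
    "invitation": {"positive": 0, "negative": 0},
}
shares = positive_share(tallies)
```

A low share for a category such as "ticketing" would flag that aspect of the service for closer monitoring.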
The processor 210 controls the operation of the apparatus 20 for word classification; the processor 210 may also be referred to as a CPU (Central Processing Unit). In a particular application, the components of the apparatus 20 for word classification are coupled together by a bus system 220, which may include a power bus, a control bus, a status signal bus, and the like, in addition to a data bus. For clarity of illustration, however, the various buses are all labeled as bus system 220 in the figure.
The method disclosed in the above embodiments of the present invention may be applied to, or implemented by, the processor 210. The processor 210 may be an integrated circuit chip with signal processing capability. In implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor 210 or by instructions in the form of software. The processor 210 may be a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps and logic blocks disclosed in the embodiments of the present invention. A general purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in connection with the embodiments of the present invention may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The storage medium is located in the memory 250; the processor 210 reads the information in the memory 250 and completes the steps of the above method in combination with its hardware.
Optionally, the processor 210 is further configured to determine an attribute of the evaluation term, and to establish the correspondence between the evaluation term and the second category to which it belongs according to that attribute.
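A minimal sketch of attribute-aware correspondence follows. It assumes the attribute is polarity (positive or negative), in line with the claims' distinction between positive and negative evaluation words, and that polarity is looked up in a hypothetical lexicon; the names `determine_attribute` and `establish_by_attribute` are illustrative only.

```python
from collections import Counter, defaultdict

# Hypothetical polarity lexicon; how attributes are determined is an assumption.
POSITIVE = {"great", "smooth", "funny"}
NEGATIVE = {"boring", "crowded", "late"}

def determine_attribute(term):
    """Return 'positive', 'negative', or None for a term not in the lexicon."""
    if term in POSITIVE:
        return "positive"
    if term in NEGATIVE:
        return "negative"
    return None

def establish_by_attribute(terms_by_category):
    """Build {second category: {attribute: Counter of terms}}."""
    relation = defaultdict(lambda: defaultdict(Counter))
    for category, terms in terms_by_category.items():
        for term in terms:
            attribute = determine_attribute(term)
            if attribute is not None:
                relation[category][attribute][term] += 1
    return relation

relation = establish_by_attribute({"venue": ["crowded", "great", "crowded"]})
```

Grouping by attribute keeps positive and negative terms separate within each second category, which makes later counting straightforward.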
Optionally, the processor 210 is further configured to obtain updated comment data of the specified service; determine updated evaluation terms in the second category according to the updated comment data; and update the correspondence according to the updated evaluation terms.
Optionally, the processor 210 is configured to, when the updated evaluation term is already included in the second category, increment the count recorded for that evaluation term; and, when the updated evaluation term is not included in the second category, establish a correspondence between the updated evaluation term and the second category.
Optionally, the processor 210 is further configured to determine, according to a keyword in the received comment data, the service to which the received comment data belongs.
The word classification device provided in fig. 12 can be understood with reference to the descriptions of fig. 1 to fig. 11, and will not be described in detail herein.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by associated hardware instructed by a program, which may be stored in a computer-readable storage medium, and the storage medium may include: ROM, RAM, magnetic or optical disks, and the like.
The method and device for word classification provided by the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the invention, and the description of the embodiments is intended only to aid understanding of the method and its core idea. A person skilled in the art may, following the idea of the invention, vary the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the invention.
Claims (8)
1. A method of word classification, comprising:
obtaining comment data of a specified service;
according to the first categories contained in the specified service, carrying out first-level classification on the comment data to obtain comment data of each first category;
according to a second category contained in each first category, carrying out second hierarchical classification on the comment data of each first category to obtain the comment data of each second category;
extracting the evaluation words in the comment data of each second category, and establishing a corresponding relation between the evaluation words and the second category to which the evaluation words belong, wherein the first category and the second category are both the categories of the specified service, and the evaluation words comprise positive evaluation words and negative evaluation words;
obtaining comment data updated by the specified service;
determining updated evaluation terms in the second category according to the updated comment data;
when the updated evaluation terms are already included in the second category, accumulating the number of the evaluation terms at the same evaluation terms as the updated evaluation terms;
when the updated evaluation term is not included in the second category, establishing a corresponding relation between the updated evaluation term and the second category;
counting the evaluation words related to the superiority and inferiority of the specified service class;
when the designated service is a release meeting, a first category contained in the release meeting comprises content, characters, services, promotions and a whole, wherein a second category contained in the content comprises at least one of subject, product, facility, music, field, environment and time, a second category contained in the characters comprises at least one of guests, teams, presenters, performers and audiences, a second category contained in the services comprises at least one of ticketing, invitation and live services, a second category contained in the promotions is at least one of media lineup and material promotions, and a second category contained in the whole is also the whole.
2. The method of claim 1, wherein after the extracting of the evaluation terms in each of the second categories of comment data, the method further comprises:
determining attributes of the evaluation terms;
the establishing of the corresponding relationship between the evaluation terms and the second category to which the evaluation terms belong includes:
and establishing a corresponding relation between the evaluation terms and the second category to which the evaluation terms belong according to the attributes of the evaluation terms.
3. The method according to claim 1 or 2, wherein before the obtaining of comment data of a specified service, the method further comprises:
and determining the business to which the received comment data belongs according to the keywords in the received comment data.
4. An apparatus for word classification, comprising:
the acquisition unit is used for acquiring comment data of the specified service;
the first classification unit is used for performing first-level classification on the comment data acquired by the acquisition unit according to a first category contained in the specified service to obtain comment data of each first category;
the second classification unit is used for performing second hierarchical classification on the comment data of each first category obtained by the classification of the first classification unit according to a second category contained in each first category to obtain the comment data of each second category;
the extracting unit is used for extracting the evaluation words in the comment data of each second category classified by the second classifying unit;
a relationship establishing unit, configured to establish a correspondence between the evaluation terms extracted by the extracting unit and a second category to which the evaluation terms belong, where the first category and the second category are both categories of the specified service, and the evaluation terms include positive evaluation terms and negative evaluation terms;
the device is used for acquiring comment data of the specified business update; determining updated evaluation terms in the second category according to the updated comment data; when the updated evaluation terms are already included in the second category, accumulating the number of the evaluation terms at the same evaluation terms as the updated evaluation terms; when the updated evaluation term is not included in the second category, establishing a corresponding relation between the updated evaluation term and the second category;
the device is used for counting the evaluation words related to the superiority and inferiority of the specified business class;
the apparatus is configured such that, when the specified service is a release meeting, a first category contained in the release meeting comprises content, characters, services, promotions and a whole, wherein a second category contained in the content comprises at least one of subject, product, facility, music, field, environment and time, a second category contained in the characters comprises at least one of guests, teams, presenters, performers and audiences, a second category contained in the services comprises at least one of ticketing, invitation and live services, a second category contained in the promotions comprises at least one of media lineup and promotional material, and a second category contained in the whole is also the whole.
5. The apparatus according to claim 4, characterized in that the apparatus further comprises a first determination unit,
the first determining unit is used for determining the attribute of the evaluation term extracted by the extracting unit;
the relationship establishing unit is configured to establish a correspondence relationship between the evaluation term and the second category to which the evaluation term belongs according to the attribute of the evaluation term determined by the first determining unit.
6. The apparatus according to claim 4 or 5, characterized in that the apparatus further comprises a third determination unit,
the third determining unit is configured to determine, according to a keyword in the received comment data, a service to which the received comment data belongs, before the obtaining unit obtains the comment data of the specified service.
7. An apparatus for word classification, comprising: a memory and a processor;
the memory is used for storing operation instructions;
the processor is used for calling the operation instruction to execute the steps of the word classification method according to any one of claims 1-3.
8. A computer-readable storage medium, characterized in that a program is stored in the computer-readable storage medium for implementing the steps of the method of word classification according to any one of claims 1 to 3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610173402.0A CN107229636B (en) | 2016-03-24 | 2016-03-24 | Word classification method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107229636A (en) | 2017-10-03 |
CN107229636B (en) | 2021-08-13 |
Family
ID=59931791
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610173402.0A Active CN107229636B (en) | 2016-03-24 | 2016-03-24 | Word classification method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107229636B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TW201346813A (en) * | 2011-12-28 | 2013-11-16 | Intel Corp | System and method for identifying reviewers with incentives |
CN103729459A (en) * | 2014-01-10 | 2014-04-16 | 北京邮电大学 | Method for establishing sentiment classification model |
CN104123302A (en) * | 2013-04-27 | 2014-10-29 | 四川火狐无线科技有限公司 | Searching method, device and system |
CN104978328A (en) * | 2014-04-03 | 2015-10-14 | 北京奇虎科技有限公司 | Hierarchical classifier obtaining method, text classification method, hierarchical classifier obtaining device and text classification device |
CN105005589A (en) * | 2015-06-26 | 2015-10-28 | 腾讯科技(深圳)有限公司 | Text classification method and text classification device |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7599847B2 (en) * | 2000-06-09 | 2009-10-06 | Airport America | Automated internet based interactive travel planning and management system |
US20040054627A1 (en) * | 2002-09-13 | 2004-03-18 | Rutledge David R. | Universal identification system for printed and electronic media |
CN103488635A (en) * | 2012-06-11 | 2014-01-01 | 腾讯科技(深圳)有限公司 | Method and device for acquiring product information |
CN102890707A (en) * | 2012-08-28 | 2013-01-23 | 华南理工大学 | System for mining emotional tendencies of brief network comments based on conditional random field |
US9996504B2 (en) * | 2013-07-08 | 2018-06-12 | Amazon Technologies, Inc. | System and method for classifying text sentiment classes based on past examples |
CN104462132A (en) * | 2013-09-23 | 2015-03-25 | 华为技术有限公司 | Comment information display method and device |
CN105354183A (en) * | 2015-10-19 | 2016-02-24 | Tcl集团股份有限公司 | Analytic method, apparatus and system for internet comments of household electrical appliance products |
Non-Patent Citations (2)
Title |
---|
Chinese microblog sentiment classification based on convolution neural network with content extension method; Xiao Sun et al.; 2015 International Conference on Affective Computing and Intelligent Interaction (ACII); 20151207; pp. 408-414 * |
Research on the application of text sentiment analysis in product reviews; 魏慧玲 (Wei Huiling); China Master's Theses Full-text Database, Information Science and Technology; 20140615 (No. 6); pp. I138-1208 * |
Also Published As
Publication number | Publication date |
---|---|
CN107229636A (en) | 2017-10-03 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TG01 | Patent term adjustment |