CN107229636B - Word classification method and device - Google Patents


Info

Publication number
CN107229636B
Authority
CN
China
Prior art keywords
category
evaluation
comment data
classification
service
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610173402.0A
Other languages
Chinese (zh)
Other versions
CN107229636A (en)
Inventor
孙义康
万翼龙
项丹
周大龙
余洋洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201610173402.0A
Publication of CN107229636A
Application granted
Publication of CN107229636B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/355 Class or cluster creation or modification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a word classification method, which comprises the following steps: obtaining comment data of a specified service; according to the first categories contained in the specified service, carrying out first-level classification on the comment data to obtain comment data of each first category; according to a second category contained in each first category, carrying out second hierarchical classification on the comment data of each first category to obtain the comment data of each second category; and extracting the evaluation terms in the comment data of each second category, and establishing the corresponding relation between the evaluation terms and the second category to which the evaluation terms belong. The word classification method provided by the embodiment of the invention can classify words aiming at the specified service, so that the evaluation words related to the quality of the service can be counted according to each service, and the monitoring effectiveness is improved.

Description

Word classification method and device
Technical Field
The invention relates to the technical field of data processing, in particular to a word classification method and device.
Background
For both physical and virtual products, word of mouth has an increasingly important influence on the product. Word of mouth derives from user evaluations; if these evaluations can be grasped in time, the product can be modified in a targeted manner and thereby improved.
In a prior-art public opinion monitoring system, key information is captured from the mass of information on the internet by professional public opinion software and then added to a public word bank of the public opinion monitoring system. The public reaction to public opinion events can thus be learned from the public word bank.
The prior-art public opinion monitoring system is applied to the internet and can extract some keywords from user evaluations there. However, the evaluations on the internet relate to many kinds of services or products, so service operators cannot correctly judge the quality of each individual service or product from the keywords in a public word bank.
Disclosure of Invention
The embodiment of the invention provides a word classification method, which can count evaluation words related to the quality of each service according to each service, thereby improving the effectiveness of monitoring. The embodiment of the invention also provides a corresponding device.
The invention provides a word classification method in a first aspect, which comprises the following steps:
obtaining comment data of a specified service;
according to the first categories contained in the specified service, carrying out first-level classification on the comment data to obtain comment data of each first category;
according to a second category contained in each first category, carrying out second hierarchical classification on the comment data of each first category to obtain the comment data of each second category;
and extracting the evaluation terms in the comment data of each second category, and establishing the corresponding relation between the evaluation terms and the second category to which the evaluation terms belong.
The second aspect of the present invention provides a word classification device, including:
the acquisition unit is used for acquiring comment data of the specified service;
the first classification unit is used for performing first-level classification on the comment data acquired by the acquisition unit according to a first category contained in the specified service to obtain comment data of each first category;
the second classification unit is used for performing second hierarchical classification on the comment data of each first category obtained by the classification of the first classification unit according to a second category contained in each first category to obtain the comment data of each second category;
the extracting unit is used for extracting the evaluation words in the comment data of each second category classified by the second classifying unit;
and the relationship establishing unit is used for establishing the corresponding relationship between the evaluation terms extracted by the extracting unit and the second category to which the evaluation terms belong.
Compared with the prior art that the quality of each service or product cannot be judged correctly according to the keywords in the public word stock of the public opinion monitoring system, the word classification method provided by the embodiment of the invention can classify words aiming at the specified service, so that the evaluation words related to the quality of the service can be counted according to each service, and the monitoring effectiveness is improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a schematic diagram of a public opinion monitoring system according to an embodiment of the present invention;
FIG. 2 is a diagram of an embodiment of a word classification method according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating a classification architecture of a release meeting scenario in an embodiment of the invention;
FIG. 4 is a diagram illustrating a classification scheme for movie scenes according to an embodiment of the present invention;
FIG. 5 is a diagram illustrating a classification scheme of a cartoon scene according to an embodiment of the present invention;
FIG. 6 is a diagram illustrating a classification scheme of a game scenario according to an embodiment of the present invention;
FIG. 7 is a diagram illustrating a classification scheme of a literature scene according to an embodiment of the present invention;
FIG. 8 is a diagram of an embodiment of a word classification device according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of another embodiment of a word classification device in the embodiment of the invention;
FIG. 10 is a schematic diagram of another embodiment of a word classification device in the embodiment of the invention;
FIG. 11 is a schematic diagram of another embodiment of a word classification device in the embodiment of the invention;
FIG. 12 is a schematic diagram of another embodiment of the word classification device in the embodiment of the invention.
Detailed Description
The embodiment of the invention provides a word classification method, which can count evaluation words related to the quality of each service according to each service, thereby improving the effectiveness of monitoring. The embodiment of the invention also provides a corresponding device. The following are detailed below.
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The word classification method provided by the embodiment of the invention is applied to a public opinion monitoring system, and can provide an evaluation word bank for positive and negative public opinion analysis of general entertainment business events, activities, products and the like. That is, a dedicated evaluation word bank can be provided for a game, an animation, a literary work, a movie, a release meeting, and the like. The evaluation word bank is not limited to the above fields; in the future, content related to dissemination, such as highly influential events and promotional materials, may also be monitored, and the fields covered by the evaluation word bank change dynamically according to demand. The content provider can therefore make targeted improvements with the help of the evaluation word bank. The word bank in the present application includes positive evaluation words and negative evaluation words, so that the content provider can grasp the strengths and weaknesses of its works in time and go on to produce better content. Moreover, the evaluation words of the present application are tied to a particular product, work, or service, which may also improve the efficiency of machine learning.
As shown in fig. 1, the public opinion monitoring system generally comprises a user platform, a plurality of classification nodes and a storage array. A user can evaluate a service through the user platform by entering comment data for that service. Comment data received on the user platform can be distributed to the classification nodes, which may classify it by service. After the service to which the comment data belongs is determined, the comment data is written to the storage array for storage. The storage array can be divided into a plurality of storage areas, for example a movie storage area, an animation storage area, a literature storage area, a release meeting storage area and the like. Each storage area can be further subdivided into a plurality of smaller storage areas, each of which stores the comment data of a specific service.
When the user needs to know the relevant evaluation of the specified service, the word classification device can acquire the comment data of the specified service from the storage area of the specified service stored in the storage device, and then classify the comment data to extract the evaluation word.
The device for performing word classification in the embodiment of the present invention may be an independent server for word classification, or may be a service cluster including a plurality of classification nodes.
FIG. 2 is a diagram of an embodiment of a word classification method according to an embodiment of the present invention.
As shown in fig. 2, an embodiment of the method for word classification in the embodiment of the present invention includes:
101. Obtaining comment data of the specified service.
The specified service may be a service specified by a content developer, such as: a movie, a literary work, a cartoon, etc.
102. According to the first category contained in the specified service, performing first-level classification on the comment data to obtain comment data of each first category.
The specified service may include a plurality of first categories. For example, a movie may include the first categories of movie characteristics, characters, movie internals, and overall.
The comment data of the movie can then be classified under these first categories: movie characteristics, characters, movie internals, and overall.
103. According to a second category contained in each first category, performing second-level classification on the comment data of each first category to obtain the comment data of each second category.
Each first category may in turn include a plurality of second categories. For example, movie characteristics can include second categories such as lines/dialogue, plot/story, suspense, sound effects, color, photography/picture, music, details, special effects, and scenes.
Classifying the comment data of movie characteristics according to lines/dialogue, plot/story, suspense, sound effects, color, photography/picture, music, details, special effects, and scenes yields the comment data of each second category.
Of course, a second category may be further divided into third categories, and a third category into fourth categories. Because such subdivision can continue indefinitely and cannot be listed exhaustively, the present application only uses division down to the second category as an example.
104. Extracting the evaluation terms in the comment data of each second category, and establishing the corresponding relation between the evaluation terms and the second category to which the evaluation terms belong.
The evaluation terms may be extracted from the comment data by a keyword extraction method. For example, if the comment data in the second category "plot" is "the theme of 'The Star on the Earth' is excellent; the subject matter is good and novel", the keyword "novel" can be extracted from it, and the correspondence between "plot" and "novel" is then established.
In the embodiment of the invention, comment data of a specified service are acquired; according to the first categories contained in the specified service, carrying out first-level classification on the comment data to obtain comment data of each first category; according to a second category contained in each first category, carrying out second hierarchical classification on the comment data of each first category to obtain the comment data of each second category; and extracting the evaluation terms in the comment data of each second category, and establishing the corresponding relation between the evaluation terms and the second category to which the evaluation terms belong. Compared with the prior art that the quality of each service or product cannot be judged correctly according to the keywords in the public word stock of the public opinion monitoring system, the word classification method provided by the embodiment of the invention can classify words aiming at the specified service, so that the evaluation words related to the quality of the service can be counted according to each service, and the monitoring effectiveness is improved.
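Steps 101 to 104 can be illustrated with a minimal Python sketch. The patent does not prescribe a classifier or a keyword extraction algorithm, so simple substring matching is swapped in here purely for illustration. The names TAXONOMY, EVALUATION_WORDS and the three functions are hypothetical, as are the trigger keywords and sample comments.

```python
from collections import defaultdict

# Hypothetical taxonomy: first category -> second category -> trigger keywords.
TAXONOMY = {
    "movie characteristics": {
        "plot": ["plot", "story"],
        "special effects": ["special effects", "cgi"],
    },
    "characters": {
        "actor": ["actor", "acting"],
        "director": ["director"],
    },
}

# Hypothetical evaluation words to look for (assumption, not from the patent).
EVALUATION_WORDS = ["novel", "warm", "stunning", "wooden"]


def classify_first_level(comments, taxonomy):
    """Step 102: group comments by first category (substring match)."""
    by_first = defaultdict(list)
    for comment in comments:
        text = comment.lower()
        for first, seconds in taxonomy.items():
            if any(kw in text for kws in seconds.values() for kw in kws):
                by_first[first].append(comment)
    return by_first


def classify_second_level(by_first, taxonomy):
    """Step 103: within each first category, group comments by second category."""
    by_second = defaultdict(list)
    for first, comments in by_first.items():
        for comment in comments:
            text = comment.lower()
            for second, kws in taxonomy[first].items():
                if any(kw in text for kw in kws):
                    by_second[(first, second)].append(comment)
    return by_second


def extract_correspondence(by_second, evaluation_words):
    """Step 104: map each (first, second) category to the evaluation words found."""
    correspondence = defaultdict(set)
    for key, comments in by_second.items():
        for comment in comments:
            text = comment.lower()
            for word in evaluation_words:
                if word in text:
                    correspondence[key].add(word)
    return correspondence


# Step 101: comment data of the specified service (made-up samples).
comments = ["The plot is good and novel", "The actor felt wooden"]
by_first = classify_first_level(comments, TAXONOMY)
by_second = classify_second_level(by_first, TAXONOMY)
correspondence = extract_correspondence(by_second, EVALUATION_WORDS)
```

Substring matching is deliberately naive (for instance, "novel" would also match "novelty"); a production system would likely use tokenization and a trained classifier instead.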
Optionally, after the extracting of the evaluation terms in the comment data of each second category, the method may further include:
determining attributes of the evaluation terms;
the establishing of the corresponding relationship between the evaluation terms and the second category to which the evaluation terms belong includes:
and establishing a corresponding relation between the evaluation terms and the second category to which the evaluation terms belong according to the attributes of the evaluation terms.
The attribute of an evaluation word can be positive or negative. Positive evaluation words help the content provider identify the aspects it should keep doing well, and negative evaluation words help the content provider modify the content in time, so that better service can be provided for users.
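One way to picture the attribute-based correspondence is the following sketch, which assumes a small polarity lexicon. The patent does not say how the attribute is determined, so the lexicon lookup, the names POSITIVE_WORDS, NEGATIVE_WORDS, determine_attribute and build_tables, and the negative sample words are all hypothetical.

```python
# Hypothetical polarity lexicon (assumption); a real system would need a far larger one.
POSITIVE_WORDS = {"warm", "classic", "exquisite"}
NEGATIVE_WORDS = {"boring", "chaotic"}


def determine_attribute(word):
    """Return the attribute of an evaluation word: positive, negative or unknown."""
    if word in POSITIVE_WORDS:
        return "positive"
    if word in NEGATIVE_WORDS:
        return "negative"
    return "unknown"


def build_tables(pairs):
    """Split (first category, second category, word) rows into per-attribute tables."""
    tables = {"positive": [], "negative": [], "unknown": []}
    for first, second, word in pairs:
        tables[determine_attribute(word)].append((first, second, word))
    return tables


pairs = [
    ("movie characteristics", "plot", "warm"),
    ("movie characteristics", "plot", "boring"),
    ("overall", "overall", "classic"),
]
tables = build_tables(pairs)
```

Keeping a separate table per attribute mirrors the way the description presents positive and negative correspondences in separate tables.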
In the embodiment of the present invention, a relationship between a first category and a second category is established for a release meeting, a movie, an animation, a game, and a literary work, a third category is also established for some second categories, and a process of performing word classification for different services in the embodiment of the present invention is described below with reference to fig. 3 to 7, respectively.
Fig. 3 is a schematic diagram of a classification architecture of a release meeting scenario in the embodiment of the present invention.
As shown in fig. 3, the first categories of the release meeting service may include: content, people, service, promotion, and overall. The second categories under content may include: theme, product, facility, music, venue, environment, and time. The second categories under people may include guests, teams, presenters, performers, and audience. The second categories under service may include ticketing, invitations, and live services. The second categories under promotion may include media lineups and promotional material.
The comment data of the release meeting service may include positive comment data and negative comment data, so the comment data may be classified from both the positive side and the negative side. Positive evaluations may be understood with reference to Table 1, for example, and negative evaluations with reference to Table 2.
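The two-level structure described for the release meeting scenario can be written down as a nested mapping. The sketch below is only one possible encoding: the name RELEASE_MEETING_TAXONOMY and the helper first_category_of are hypothetical, and the single second category under "overall" (the whole-event category) is an assumption made by analogy with the movie scenario.

```python
# First category -> list of second categories, following the description of FIG. 3.
RELEASE_MEETING_TAXONOMY = {
    "content": ["theme", "product", "facility", "music", "venue", "environment", "time"],
    "people": ["guest", "team", "presenter", "performer", "audience"],
    "service": ["ticketing", "invitation", "live service"],
    "promotion": ["media lineup", "promotional material"],
    # Assumption: the whole-event category has itself as its only second category.
    "overall": ["overall"],
}


def first_category_of(second_category, taxonomy=RELEASE_MEETING_TAXONOMY):
    """Look up the first category that contains a given second category."""
    for first, seconds in taxonomy.items():
        if second_category in seconds:
            return first
    return None
```

For example, first_category_of("music") returns "content", and a second category from another service, such as "plot", returns None.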
Table 1: Correspondence between positive evaluation words and categories of the release meeting service
First category | Second category | Evaluation word
Content | Music | Fitting the occasion
Content | Venue | Imposing
People | Host | Unique
People | Audience | Orderly
Service | Invitation | Exquisite
Promotion | Media material | Fresh
Overall | Overall | Gorgeous
Table 2: Correspondence between negative evaluation words and categories of the release meeting service
[Table 2 is reproduced only as images in the source: Figure BDA0000949415080000061, Figure BDA0000949415080000071]
Tables 1 and 2 give only a few examples of positive and negative evaluation words for the release meeting service; in practice, a public opinion monitoring system may include correspondences between many evaluation words and the second categories. Through these evaluations, the organizer of the release meeting can learn its own strengths and weaknesses, carry the strengths over to the next release meeting, and remedy the weaknesses there as far as possible.
Fig. 4 is a schematic diagram of a classification architecture of movie scenes in an embodiment of the present invention.
As shown in fig. 4, the first categories of the movie service may include: movie characteristics, characters, movie internals, and overall. The second categories under movie characteristics may include lines/dialogue, plot, suspense, sound effects, photography, color, music, details, special effects, scenes, and the like. The second categories under characters may include director and actors. The second categories under movie internals may include values, feelings, story and style. Overall is an evaluation of the movie as a whole, and its second category can likewise be regarded as overall.
The comment data of the movie service may include positive comment data and negative comment data, so the comment data may be classified from both the positive side and the negative side. Positive evaluations may be understood with reference to Table 3, for example, and negative evaluations with reference to Table 4.
Table 3: Correspondence between positive evaluation words and categories of the movie service
First category | Second category | Evaluation word
Movie characteristics | Plot | Warm
Movie characteristics | Special effects | Impactful
Characters | Actor | Playing with games
Characters | Director | Well directed
Movie internals | Story | Vivid and moving
Movie internals | Values | Positive energy
Overall | Overall | Classic
Table 4: Correspondence between negative evaluation words and categories of the movie service
[Table 4 is reproduced only as images in the source: Figure BDA0000949415080000072, Figure BDA0000949415080000081]
Tables 3 and 4 give only a few examples of positive and negative evaluation words for the movie service; in practice, a public opinion monitoring system may include correspondences between many evaluation words and the second categories. Through these evaluations, the producer of the film can learn its own strengths and weaknesses, carry the strengths over to the next film, and remedy the weaknesses there as far as possible.
The taxonomy architecture diagrams of scenes for animations, games and literary works can be understood with reference to fig. 5-7, respectively.
The first and second categories of animation, games and literary works can be understood with reference to fig. 5 to 7 in combination with the above-described conference service of fig. 3 and the movie service of fig. 4, which are not listed in the present application.
In this embodiment of the present invention, after the establishing of the correspondence between the evaluation term and the second category to which the evaluation term belongs, the method may further include:
obtaining comment data updated by the specified service;
determining updated evaluation terms in the second category according to the updated comment data;
and updating the corresponding relation according to the updated evaluation words.
After learning of certain evaluation words, the content provider may try to improve the specified service, for example by releasing a patch for a game to fix its defects. New comment data may then appear; updated evaluation words can be extracted from the new comment data, and the correspondence between the second category and the evaluation words is updated accordingly. The content provider can thus learn of new evaluation words in real time and make further improvements.
Optionally, the updating the corresponding relationship according to the updated evaluation term may include:
when the updated evaluation term is already included in the second category, incrementing the count of the existing evaluation term that is identical to the updated evaluation term;
when the updated evaluation term is not included in the second category, establishing a corresponding relationship between the updated evaluation term and the second category.
In the embodiment of the invention, when the evaluation words for the movie already include the plot-warm correspondence, the count of plot-warm can simply be incremented, without listing the plot-warm correspondence again. When the evaluation words for the movie do not yet include the evaluation word "tightly interlinked" for plot, the correspondence between plot and "tightly interlinked" can be added.
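The two update branches can be sketched as follows, assuming the correspondence is stored as a per-category mapping from evaluation word to occurrence count. That storage layout, the function name update_correspondence, the starting count of 3 and the new word "gripping" are all hypothetical.

```python
def update_correspondence(correspondence, second_category, updated_words):
    """Apply updated evaluation words to one second category.

    correspondence: dict mapping second category -> {evaluation word: count}.
    """
    counts = correspondence.setdefault(second_category, {})
    for word in updated_words:
        if word in counts:
            # Word already recorded for this category: accumulate its count.
            counts[word] += 1
        else:
            # Word not yet recorded: establish a new correspondence.
            counts[word] = 1
    return correspondence


# Existing state: "warm" has already been seen three times for "plot".
correspondence = {"plot": {"warm": 3}}
update_correspondence(correspondence, "plot", ["warm", "gripping"])
```

After the call, the count for "warm" is 4 and a new entry for "gripping" exists with a count of 1, so no correspondence is listed twice.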
Optionally, before the obtaining of the comment data of the specified service, the method may further include:
and determining the business to which the received comment data belongs according to the keywords in the received comment data.
In the embodiment of the present invention, determining the service to which the received comment data belongs may be understood with reference to the description in fig. 1, and redundant description is not repeated here.
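A minimal sketch of keyword-based routing is given below. It assumes each service has a list of characteristic keywords and that a comment goes to the service with the most keyword hits; the patent only states that keywords are used, so this scoring rule, the names SERVICE_KEYWORDS and determine_service, and the sample comments are hypothetical.

```python
# Hypothetical keyword lists per service (assumption, not from the patent).
SERVICE_KEYWORDS = {
    "movie": ["movie", "film", "director"],
    "game": ["game", "level", "patch"],
    "literature": ["novel", "chapter", "author"],
}


def determine_service(comment, service_keywords=SERVICE_KEYWORDS):
    """Return the service whose keywords occur most often, or None if none occur."""
    text = comment.lower()
    scores = {
        service: sum(text.count(kw) for kw in kws)
        for service, kws in service_keywords.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Route received comments into per-service storage areas (a plain dict here).
storage = {}
for comment in [
    "The film has a great director",
    "This patch broke the last level",
    "Nice weather",
]:
    storage.setdefault(determine_service(comment), []).append(comment)
```

Comments that match no service are kept under the key None in this sketch, so they can be inspected rather than silently dropped.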
Referring to fig. 8, an embodiment of the apparatus 20 for word classification provided by the embodiment of the present invention includes:
an obtaining unit 201, configured to obtain comment data of a specified service;
a first classification unit 202, configured to perform first-level classification on the comment data acquired by the acquisition unit 201 according to a first category included in the specified service, so as to obtain comment data of each first category;
a second classification unit 203, configured to perform second-level classification on the comment data of each first category obtained by the classification by the first classification unit 202 according to a second category included in each first category, so as to obtain comment data of each second category;
an extracting unit 204, configured to extract an evaluation term in the comment data of each second category classified by the second classifying unit 203;
a relationship establishing unit 205, configured to establish a correspondence relationship between the evaluation term extracted by the extracting unit 204 and the second category to which the evaluation term belongs.
In the embodiment of the present invention, the obtaining unit 201 obtains comment data of a specified service; the first classification unit 202 performs first-level classification on the comment data acquired by the acquisition unit 201 according to a first category included in the specified service, so as to obtain comment data of each first category; the second classification unit 203 performs second-level classification on the comment data of each first category obtained by the classification by the first classification unit 202 according to a second category included in each first category, so as to obtain comment data of each second category; the extracting unit 204 extracts the evaluation terms in the comment data of each second category classified by the second classifying unit 203; the relationship establishing unit 205 establishes a correspondence relationship between the evaluation term extracted by the extracting unit 204 and the second category to which the evaluation term belongs. Compared with the prior art that the quality of each service or product cannot be judged correctly according to the keywords in the public word bank of the public opinion monitoring system, the word classification device provided by the embodiment of the invention can classify words aiming at the specified service, so that the evaluation words related to the quality of the service can be counted according to each service, and the monitoring effectiveness is improved.
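The cooperation of units 201 to 205 can be pictured as a small composition of pluggable callables. This is only an illustrative sketch: the class WordClassificationDevice, its field names and the stand-in lambdas are hypothetical, and the patent does not specify how any unit is implemented.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Comments = List[str]


@dataclass
class WordClassificationDevice:
    obtain: Callable[[str], Comments]                                 # obtaining unit 201
    classify_first: Callable[[Comments], Dict[str, Comments]]         # first classification unit 202
    classify_second: Callable[[str, Comments], Dict[str, Comments]]   # second classification unit 203
    extract: Callable[[Comments], List[str]]                          # extracting unit 204

    def run(self, service: str) -> Dict[Tuple[str, str], List[str]]:
        """Chain the units and build the correspondence (relationship establishing unit 205)."""
        correspondence: Dict[Tuple[str, str], List[str]] = {}
        comments = self.obtain(service)
        for first, first_comments in self.classify_first(comments).items():
            for second, second_comments in self.classify_second(first, first_comments).items():
                correspondence[(first, second)] = self.extract(second_comments)
        return correspondence


# Trivial stand-in units, used only to show the data flow.
device = WordClassificationDevice(
    obtain=lambda service: ["The plot is warm"],
    classify_first=lambda comments: {"movie characteristics": comments},
    classify_second=lambda first, comments: {"plot": comments},
    extract=lambda comments: [w for c in comments for w in c.lower().split() if w == "warm"],
)
result = device.run("movie")
```

Passing the units in as callables keeps the sketch neutral about whether they run on a single server or on separate classification nodes.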
Optionally, on the basis of the embodiment corresponding to fig. 8 and referring to fig. 9, in another optional embodiment of the apparatus for word classification provided by the embodiment of the present invention, the apparatus 20 further includes a first determining unit 206,
the first determining unit 206, configured to determine the attribute of the evaluation term extracted by the extracting unit 204;
the relationship establishing unit 205 is configured to establish a correspondence relationship between the evaluation term and the second category to which the evaluation term belongs according to the attribute of the evaluation term determined by the first determining unit 206.
Optionally, on the basis of the embodiment corresponding to fig. 8, referring to fig. 10, in another optional embodiment of the apparatus 20 for word classification provided in the embodiment of the present invention, the apparatus further includes a second determining unit 207 and an updating unit 208,
the obtaining unit 201 is further configured to obtain comment data updated by the specified service;
the second determining unit 207 is configured to determine, according to the updated comment data acquired by the acquiring unit 201, an updated evaluation term in the second category;
the updating unit 208 is configured to update the corresponding relationship according to the updated evaluation term determined by the second determining unit 207.
Optionally, on the basis of the embodiment corresponding to fig. 10, in another optional embodiment of the apparatus 20 for word classification provided by the embodiment of the present invention,
the update unit 208 is configured to:
when the updated evaluation term is already included in the second category, increment the count of the existing evaluation term that is identical to the updated evaluation term;
when the updated evaluation term is not included in the second category, establish a corresponding relationship between the updated evaluation term and the second category.
Optionally, on the basis of the embodiment corresponding to fig. 8, referring to fig. 11, in another optional embodiment of the apparatus 20 for word classification provided in the embodiment of the present invention, the apparatus further includes a third determining unit 209,
the third determining unit 209 is configured to determine, before the obtaining unit 201 obtains the comment data of the specified service, a service to which the received comment data belongs according to a keyword in the received comment data.
The word classification devices described in fig. 8 to 11 can be understood by referring to the descriptions of fig. 1 to 7, and the description is not repeated here.
Fig. 12 is a schematic structural diagram of the word classification device 20 according to the embodiment of the present invention. The apparatus 20 for word classification is applied to a public opinion monitoring system, the apparatus 20 for word classification includes a processor 210, a memory 250 and a transceiver 230, the memory 250 may include a read-only memory and a random access memory, and provides operating instructions and data to the processor 210. A portion of the memory 250 may also include non-volatile random access memory (NVRAM).
In some embodiments, memory 250 stores the following elements, executable modules or data structures, or a subset thereof, or an expanded set thereof:
in this embodiment of the present invention, by calling the operation instructions stored in the memory 250 (which may be stored in the operating system), the processor 210 performs the following operations:
obtaining comment data of a specified service;
according to the first categories contained in the specified service, carrying out first-level classification on the comment data to obtain comment data of each first category;
according to a second category contained in each first category, carrying out second hierarchical classification on the comment data of each first category to obtain the comment data of each second category;
and extracting the evaluation terms in the comment data of each second category, and establishing the corresponding relation between the evaluation terms and the second category to which the evaluation terms belong.
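As a non-limiting illustration, the four operations above might be sketched as follows. The sketch assumes keyword matching for both classification levels and a fixed lexicon for evaluation terms; the taxonomy contents, the lexicon, and the function name `classify_and_extract` are hypothetical and are not prescribed by the embodiment.

```python
from collections import defaultdict

# Hypothetical taxonomy: first category -> second category -> trigger keywords.
TAXONOMY = {
    "content": {"music": ["music", "song"], "venue": ["venue", "hall"]},
    "service": {"ticketing": ["ticket"]},
}
# Hypothetical lexicon of evaluation terms.
EVALUATION_TERMS = {"great", "loud", "slow"}


def classify_and_extract(comments):
    """Return {second category: {evaluation term: count}} for the comments."""
    correspondence = defaultdict(lambda: defaultdict(int))
    for comment in comments:
        words = comment.lower().split()
        for first, seconds in TAXONOMY.items():
            # First-level classification: keep the comment for this first
            # category only if any keyword under it appears.
            if not any(k in words for kws in seconds.values() for k in kws):
                continue
            for second, keywords in seconds.items():
                # Second-level classification within the first category.
                if any(k in words for k in keywords):
                    # Extract evaluation terms and record the relationship.
                    for word in words:
                        if word in EVALUATION_TERMS:
                            correspondence[second][word] += 1
    return {second: dict(terms) for second, terms in correspondence.items()}


result = classify_and_extract(["The music was great", "Ticket pickup was slow"])
```

Narrowing to a first category before testing second categories means each comment is only compared against the second categories of the first categories it actually matches.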
Compared with the prior art that the quality of each service or product cannot be judged correctly according to the keywords in the public word bank of the public opinion monitoring system, the word classification device provided by the embodiment of the invention can classify words aiming at the specified service, so that the evaluation words related to the quality of the service can be counted according to each service, and the monitoring effectiveness is improved.
The processor 210 controls the operation of the apparatus 20 for word classification, and the processor 210 may also be referred to as a CPU (Central Processing Unit). Memory 250 may include both read-only memory and random access memory and provides instructions and data to processor 210. A portion of the memory 250 may also include non-volatile random access memory (NVRAM). In a particular application, the various components of the apparatus 20 for word classification are coupled together by a bus system 220, wherein the bus system 220 may include a power bus, a control bus, a status signal bus, and the like, in addition to a data bus. For clarity of illustration, however, the various buses are labeled as bus system 220 in the figures.
The method disclosed in the above embodiments of the present invention may be applied to the processor 210, or implemented by the processor 210. The processor 210 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor 210 or by instructions in the form of software. The processor 210 may be a general purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the various methods, steps and logic blocks disclosed in the embodiments of the present invention. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present invention may be directly performed by a hardware decoding processor, or performed by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The storage medium is located in the memory 250, and the processor 210 reads the information in the memory 250 and completes the steps of the above method in combination with its hardware.
Optionally, the processor 210 is further configured to determine an attribute of the evaluation term, and to establish the corresponding relationship between the evaluation term and the second category to which the evaluation term belongs according to the attribute of the evaluation term.
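As a non-limiting illustration, establishing the corresponding relationship according to an attribute might be sketched as follows. The sketch assumes that the attribute is the polarity of the term (positive or negative), in line with the positive and negative evaluation terms recited in the claims; the lexicon `POLARITY` and the function name `link_by_attribute` are hypothetical.

```python
# Hypothetical polarity lexicon; the attribute is assumed to be polarity.
POLARITY = {"great": "positive", "slow": "negative"}


def link_by_attribute(second_category, terms, correspondence=None):
    """Attach each term to the second category under its attribute."""
    correspondence = {} if correspondence is None else correspondence
    buckets = correspondence.setdefault(
        second_category, {"positive": [], "negative": []}
    )
    for term in terms:
        attribute = POLARITY.get(term)
        # Terms with no known attribute are skipped in this sketch.
        if attribute is not None and term not in buckets[attribute]:
            buckets[attribute].append(term)
    return correspondence


linked = link_by_attribute("ticketing", ["slow", "great", "unknown"])
```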
Optionally, the processor 210 is further configured to obtain updated comment data of the specified service; determine updated evaluation terms in the second category according to the updated comment data; and update the corresponding relationship according to the updated evaluation terms.
Optionally, the processor 210 is configured to: when the updated evaluation term is already included in the second category, accumulate the count of the existing evaluation term that is the same as the updated evaluation term; and when the updated evaluation term is not included in the second category, establish a corresponding relationship between the updated evaluation term and the second category.
Optionally, the processor 210 is further configured to determine, according to a keyword in the received comment data, a service to which the received comment data belongs.
The word classification device provided in fig. 12 can be understood with reference to the descriptions of fig. 1 to fig. 11, and will not be described in detail herein.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by associated hardware instructed by a program, which may be stored in a computer-readable storage medium, and the storage medium may include: ROM, RAM, magnetic or optical disks, and the like.
The method and the device for word classification provided by the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the invention, and the description of the embodiments is intended only to help in understanding the method and core idea of the invention. Meanwhile, a person skilled in the art may, according to the idea of the invention, vary the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the invention.

Claims (8)

1. A method of word classification, comprising:
obtaining comment data of a specified service;
according to the first categories contained in the specified service, carrying out first-level classification on the comment data to obtain comment data of each first category;
according to a second category contained in each first category, carrying out second hierarchical classification on the comment data of each first category to obtain the comment data of each second category;
extracting the evaluation words in the comment data of each second category, and establishing a corresponding relation between the evaluation words and the second category to which the evaluation words belong, wherein the first category and the second category are both the categories of the specified service, and the evaluation words comprise positive evaluation words and negative evaluation words;
obtaining comment data updated by the specified service;
determining updated evaluation terms in the second category according to the updated comment data;
when the updated evaluation term is already included in the second category, accumulating the count of the existing evaluation term that is the same as the updated evaluation term;
when the updated evaluation term is not included in the second category, establishing a corresponding relation between the updated evaluation term and the second category;
counting the evaluation terms related to the quality of the specified service;
when the specified service is a release meeting, a first category contained in the release meeting comprises content, characters, services, promotions and a whole, wherein a second category contained in the content comprises at least one of subject, product, facility, music, venue, environment and time, a second category contained in the characters comprises at least one of guests, teams, presenters, performers and audiences, a second category contained in the services comprises at least one of ticketing, invitation and live services, a second category contained in the promotions comprises at least one of media lineup and promotional materials, and a second category contained in the whole is also the whole.
2. The method of claim 1, wherein after the extracting of the evaluation terms in each of the second categories of comment data, the method further comprises:
determining attributes of the evaluation terms;
the establishing of the corresponding relationship between the evaluation terms and the second category to which the evaluation terms belong includes:
and establishing a corresponding relation between the evaluation terms and the second category to which the evaluation terms belong according to the attributes of the evaluation terms.
3. The method according to claim 1 or 2, wherein before the obtaining of comment data of a specified service, the method further comprises:
and determining the service to which the received comment data belongs according to the keywords in the received comment data.
4. An apparatus for word classification, comprising:
the acquisition unit is used for acquiring comment data of the specified service;
the first classification unit is used for performing first-level classification on the comment data acquired by the acquisition unit according to a first category contained in the specified service to obtain comment data of each first category;
the second classification unit is used for performing second hierarchical classification on the comment data of each first category obtained by the classification of the first classification unit according to a second category contained in each first category to obtain the comment data of each second category;
the extracting unit is used for extracting the evaluation words in the comment data of each second category classified by the second classifying unit;
a relationship establishing unit, configured to establish a correspondence between the evaluation terms extracted by the extracting unit and a second category to which the evaluation terms belong, where the first category and the second category are both categories of the specified service, and the evaluation terms include positive evaluation terms and negative evaluation terms;
the device is further used for acquiring updated comment data of the specified service; determining updated evaluation terms in the second category according to the updated comment data; when the updated evaluation term is already included in the second category, accumulating the count of the existing evaluation term that is the same as the updated evaluation term; when the updated evaluation term is not included in the second category, establishing a corresponding relation between the updated evaluation term and the second category;
the device is further used for counting the evaluation terms related to the quality of the specified service;
wherein, when the specified service is a release meeting, a first category contained in the release meeting comprises content, characters, services, promotions and a whole, wherein a second category contained in the content comprises at least one of subject, product, facility, music, venue, environment and time, a second category contained in the characters comprises at least one of guests, teams, presenters, performers and audiences, a second category contained in the services comprises at least one of ticketing, invitation and live services, a second category contained in the promotions comprises at least one of media lineup and promotional materials, and a second category contained in the whole is also the whole.
5. The apparatus according to claim 4, characterized in that the apparatus further comprises a first determination unit,
the first determining unit is used for determining the attribute of the evaluation term extracted by the extracting unit;
the relationship establishing unit is configured to establish a correspondence relationship between the evaluation term and the second category to which the evaluation term belongs according to the attribute of the evaluation term determined by the first determining unit.
6. The apparatus according to claim 4 or 5, characterized in that the apparatus further comprises a third determination unit,
the third determining unit is configured to determine, according to a keyword in the received comment data, a service to which the received comment data belongs, before the obtaining unit obtains the comment data of the specified service.
7. An apparatus for word classification, comprising: a memory and a processor;
the memory is used for storing operation instructions;
the processor is used for calling the operation instruction to execute the steps of the word classification method according to any one of claims 1-3.
8. A computer-readable storage medium, characterized in that a program is stored in the computer-readable storage medium for implementing the steps of the method of word classification according to any one of claims 1 to 3.
CN201610173402.0A 2016-03-24 2016-03-24 Word classification method and device Active CN107229636B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610173402.0A CN107229636B (en) 2016-03-24 2016-03-24 Word classification method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610173402.0A CN107229636B (en) 2016-03-24 2016-03-24 Word classification method and device

Publications (2)

Publication Number Publication Date
CN107229636A CN107229636A (en) 2017-10-03
CN107229636B true CN107229636B (en) 2021-08-13

Family

ID=59931791

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610173402.0A Active CN107229636B (en) 2016-03-24 2016-03-24 Word classification method and device

Country Status (1)

Country Link
CN (1) CN107229636B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW201346813A (en) * 2011-12-28 2013-11-16 Intel Corp System and method for identifying reviewers with incentives
CN103729459A (en) * 2014-01-10 2014-04-16 北京邮电大学 Method for establishing sentiment classification model
CN104123302A (en) * 2013-04-27 2014-10-29 四川火狐无线科技有限公司 Searching method, device and system
CN104978328A (en) * 2014-04-03 2015-10-14 北京奇虎科技有限公司 Hierarchical classifier obtaining method, text classification method, hierarchical classifier obtaining device and text classification device
CN105005589A (en) * 2015-06-26 2015-10-28 腾讯科技(深圳)有限公司 Text classification method and text classification device

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7599847B2 (en) * 2000-06-09 2009-10-06 Airport America Automated internet based interactive travel planning and management system
US20040054627A1 (en) * 2002-09-13 2004-03-18 Rutledge David R. Universal identification system for printed and electronic media
CN103488635A (en) * 2012-06-11 2014-01-01 腾讯科技(深圳)有限公司 Method and device for acquiring product information
CN102890707A (en) * 2012-08-28 2013-01-23 华南理工大学 System for mining emotional tendencies of brief network comments based on conditional random field
US9996504B2 (en) * 2013-07-08 2018-06-12 Amazon Technologies, Inc. System and method for classifying text sentiment classes based on past examples
CN104462132A (en) * 2013-09-23 2015-03-25 华为技术有限公司 Comment information display method and device
CN105354183A (en) * 2015-10-19 2016-02-24 Tcl集团股份有限公司 Analytic method, apparatus and system for internet comments of household electrical appliance products

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Chinese microblog sentiment classification based on convolution neural network with content extension method; Xiao Sun et al.; 2015 International Conference on Affective Computing and Intelligent Interaction (ACII); 20151207; pp. 408-414 *
Research on the Application of Text Sentiment Analysis in Product Reviews; 魏慧玲 (Wei Huiling); China Master's Theses Full-text Database, Information Science and Technology; 20140615 (No. 6); pp. I138-1208 *

Also Published As

Publication number Publication date
CN107229636A (en) 2017-10-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TG01 Patent term adjustment