WO2020022536A1

WO2020022536A1 - Book recommendation method utilizing similarity between books

Info

Publication number: WO2020022536A1
Application number: PCT/KR2018/008505
Authority: WO
Inventors: 김강산; 박지훈
Original assignee: (주)브레인콜라
Priority date: 2018-07-27
Filing date: 2018-07-27
Publication date: 2020-01-30

Abstract

Disclosed is a book recommendation method utilizing similarity between books. According to one aspect of the present invention, provided is a book recommendation method based on similarity between books. The book recommendation method utilizing similarity between books comprises: (1) a step in which a book recommendation server receives book data, for a plurality of books, including at least one from among the title of a book, the body text of the book, and introduction information of the book; (2) a step in which the book recommendation server extracts at least one keyword from the book data on the basis of term frequency and inverse document frequency; (3) a step in which the book recommendation server calculates the degree of similarity between a keyword of a first book and a keyword of a second book; and (4) a step in which the book recommendation server outputs at least one book that is similar to the first book according to the degree of similarity.

Description

Book recommendation method using similarity between books

The present invention relates to a book recommendation method utilizing the similarity between books.

Services for recording and managing the books one has read, and for sharing evaluations of those books, have existed for some time. Book recommendation services are also in operation. In Korea such services are still taking shape, whereas overseas, and particularly in the United States, they are firmly established. These book recommendation services are currently offered through smartphone applications as well as internet websites.

Among Korean patent publications relating to book recommendation services is 'Book SNS system and its provision method' (No. 10-2014-0038017, published 2014.03.28). That publication discloses the concept of sharing other users' content over an SNS (Social Networking Service) and using it for book recommendation. 'Methods and Systems for Providing Social Network Services by Sharing Reading Information' (No. 10-2014-0133647, published 2014.11.20) likewise discloses only the concept of readers sharing information about the books they have read with other readers.

All prior book recommendation methods are limited to recommending books to readers using SNS or personalized information, and have not disclosed how to analyze, classify, and recommend the actual contents of books.

Accordingly, the inventors of the present invention propose a method that analyzes the contents of books and, on that basis, recommends books in which a reader may be interested.

The present invention provides a book recommendation method utilizing the similarity between books.

According to an aspect of the invention, there is provided a book recommendation method based on similarity between books, including: (1) a book recommendation server receiving book data for a plurality of books, the book data including any one or more of the title of a book, the body text of the book, and introduction information of the book; (2) the book recommendation server extracting one or more keywords from the book data based on a term frequency and an inverse document frequency; (3) the book recommendation server calculating a similarity between the keyword of a first book and the keyword of a second book; and (4) the book recommendation server outputting one or more similar books that are similar to the first book according to the similarity.

According to another aspect of the invention, there is provided a book recommendation method based on similarity between books, including: (1) a book recommendation server receiving book data for a plurality of books, the book data including any one or more of the title of a book, the body text of the book, and introduction information of the book; (2) the book recommendation server extracting one or more keywords from the book data based on a term frequency and an inverse document frequency; (3) the book recommendation server assigning one or more book themes based on the keywords; (4) the book recommendation server calculating a similarity between the book theme of a first book and the book theme of a second book; and (5) the book recommendation server outputting one or more similar books that are similar to the first book according to the similarity.

According to another aspect of the invention, there is provided a book recommendation method based on similarity between books, including: (1) a book recommendation server receiving book data for a plurality of books, the book data including any one or more of the title of a book, the body text of the book, and introduction information of the book; (2) the book recommendation server generating at least one group in the book data using a Latent Dirichlet Allocation (LDA) algorithm; (3) the book recommendation server calculating a similarity between the group of a first book and the group of a second book; and (4) the book recommendation server outputting one or more similar books that are similar to the first book according to the similarity.

According to another aspect of the invention, there is provided a book recommendation method based on similarity between books, including: (1) a book recommendation server receiving book data for a plurality of books, the book data including any one or more of the title of a book, the body text of the book, and introduction information of the book; (2) the book recommendation server extracting one or more keywords from the book data based on a term frequency and an inverse document frequency, or assigning one or more book themes based on the keywords; generating at least one group in the book data based on a Latent Dirichlet Allocation (LDA) algorithm; (4) the book recommendation server calculating a first similarity between the keyword of a first book and the keyword of a second book, a second similarity between the book theme of the first book and the book theme of the second book, or a third similarity between the group of the first book and the group of the second book; and (5) the book recommendation server outputting one or more similar books that are similar to the first book according to any one or more of the first to third similarities.

According to another aspect of the invention, there is provided a book recommendation method based on similarity between books, including: (1) a book recommendation server receiving book data for a plurality of books, the book data including any one or more of the title of a book, the body text of the book, and introduction information of the book; (2) the book recommendation server selecting one of a first and a third similarity calculation method; (3) when the first similarity calculation method is selected, the book recommendation server extracting one or more keywords from the book data based on a term frequency and an inverse document frequency, and when the third similarity calculation method is selected, the book recommendation server generating one or more groups in the book data based on a Latent Dirichlet Allocation (LDA) algorithm; (4) when the first similarity calculation method is selected, the book recommendation server calculating a first similarity between the keyword of a first book and the keyword of a second book, and when the third similarity calculation method is selected, calculating a third similarity between the group of the first book and the group of the second book; and (5) the book recommendation server outputting one or more similar books that are similar to the first book according to the first similarity or the third similarity.

In addition, step (2) may be a step of selecting one of the first or the third similarity calculation method according to information of a book or information of a user.

In addition, the first book may be a book to which a user has given an evaluation score higher than a predetermined reference value, a book included in a shopping cart of an online shopping mall, or a book whose related information the user has viewed online.

The outputting of the similar books may further include any one or more of: excluding books already read by the user from the one or more similar books; outputting the one or more similar books in order of book score; and outputting the one or more similar books in consideration of user information.

In step (2), the book recommendation server may extract one or more keywords from the book data based on a term frequency and an inverse document frequency, assign a book theme based on the keywords, generate at least one group in the book data based on a Latent Dirichlet Allocation (LDA) algorithm, or generate a storyflow emotion graph for each of the plurality of books. The book recommendation server may then calculate a first similarity between the keyword of the first book and the keyword of the second book, a second similarity between the book theme of the first book and the book theme of the second book, a third similarity between the group of the first book and the group of the second book, or a fourth similarity between the storyflow emotion graph of the first book and the storyflow emotion graph of the second book. In step (5), the book recommendation server may output one or more similar books that are similar to the first book according to any one or more of the first to fourth similarities.

In addition, step (2) may further include: generating a window for analyzing a number of words corresponding to a predetermined ratio of the total number of words included in each book; and generating a storyflow emotion graph by sequentially moving the window in the story direction of the book, that is, from the word at a first position in each book toward the word at a second position, and analyzing the emotion of the words contained in the window at each position.

The generating of the storyflow emotion graph by sequentially moving the window in the story direction and analyzing the emotion of the words contained in the window may include: assigning an emotion value to each word included in the window with reference to the words included in a predefined emotion word table; deriving a unit emotion sum value by summing the emotion values of the words in the window; and generating data for the storyflow emotion graph based on the unit emotion sum value.

In addition, in the generating of the storyflow emotion graph, the window may be moved in steps equal to the average number of words per page of the book being analyzed.
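The window-based analysis described above can be illustrated with a minimal sketch. It assumes the book is already tokenized into a list of words. The table `EMOTION_VALUES`, its entries, and the names `storyflow_emotion_graph`, `window_ratio`, and `step` are hypothetical and chosen only for illustration; the specification does not fix the window ratio or the contents of the emotion word table.

```python
# Hypothetical emotion word table: word -> emotion value (illustrative values only).
EMOTION_VALUES = {"happy": 2, "joy": 3, "sad": -2, "fear": -3}


def storyflow_emotion_graph(words, window_ratio=0.1, step=1,
                            emotion_values=EMOTION_VALUES):
    """Return the unit emotion sum for each window position along the story.

    words          -- the book as a list of words, in story order
    window_ratio   -- window size as a fraction of the total word count
    step           -- how many words the window advances each time
                      (for example, the average number of words per page)
    """
    window_size = max(1, int(len(words) * window_ratio))
    # Words missing from the emotion word table contribute 0.
    values = [emotion_values.get(word, 0) for word in words]
    sums = []
    for start in range(0, len(values) - window_size + 1, step):
        sums.append(sum(values[start:start + window_size]))
    return sums


story = ["happy", "the", "sad", "joy"]
graph = storyflow_emotion_graph(story, window_ratio=0.5, step=1)
```

Plotting the returned list against the window position gives one possible form of the storyflow emotion graph. A larger `step` produces a coarser graph with fewer points.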

Step (4) may include calculating the fourth similarity based on any one or more of: a first similarity determination basic value indicating the difference between the unit emotion sum value of the first book in the story direction and the unit emotion sum value of the second book in the story direction; and a second similarity determination basic value indicating the amount of change of the first similarity determination basic value.
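As a rough sketch of the two basic values, the example below takes the per-window emotion sums of two books and computes their pointwise difference and the change of that difference between consecutive positions. Truncating to the shorter series and converting the differences into a score with `1 / (1 + mean absolute difference)` are assumptions made for illustration, not something the specification states; the function names are hypothetical.

```python
def similarity_basis_values(sums_a, sums_b):
    """Return (first_basis, second_basis) for two series of unit emotion sums.

    first_basis  -- difference between the two books at each story position
    second_basis -- change of that difference between consecutive positions
    Assumption: the series are compared only over their common length.
    """
    n = min(len(sums_a), len(sums_b))
    first_basis = [sums_a[i] - sums_b[i] for i in range(n)]
    second_basis = [first_basis[i + 1] - first_basis[i] for i in range(n - 1)]
    return first_basis, second_basis


def fourth_similarity(sums_a, sums_b):
    """Map the first basis values to a score in (0, 1]; 1.0 means identical graphs.

    The mapping is an illustrative assumption, not taken from the source.
    """
    first_basis, _ = similarity_basis_values(sums_a, sums_b)
    if not first_basis:
        return 0.0
    mean_abs = sum(abs(d) for d in first_basis) / len(first_basis)
    return 1.0 / (1.0 + mean_abs)


first, second = similarity_basis_values([1, 3, 2], [0, 1, 4])
```

With these inputs, `first` holds the per-position gaps between the two emotion graphs and `second` shows whether the gap is widening or narrowing as the story proceeds.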

According to another aspect of the invention, there may be provided a computer program stored in a recording medium for executing the method.

According to the present invention, a book recommendation method can be provided.

FIG. 1 is a block diagram of a book recommendation system according to an embodiment of the present invention.

FIGS. 2 to 5 are flowcharts according to an embodiment of the present invention.

FIG. 6 is a block diagram of a book recommendation system according to another embodiment of the present invention.

FIG. 7 is a conceptual diagram illustrating the generation of a window and a storyflow emotion graph.

FIG. 8 is an emotion value table of words representing the emotion of happiness contained in the emotion word database.

FIG. 9 illustrates a page of a book containing words that have emotion values.

FIG. 10 is a graph showing the change in the unit emotion sum value along the storyflow direction.

FIG. 11 is a table showing the unit emotion sum values of books according to an embodiment.

FIG. 12 is an emotion graph according to the storyflow.

FIG. 13 is a table showing the similarity determination basic values.

FIGS. 14 to 17 are flowcharts according to an embodiment of the present invention.

FIG. 18 is a block diagram of a book recommendation system according to another embodiment of the present invention.

As the invention allows for various changes and numerous embodiments, particular embodiments will be illustrated in the drawings and described in detail in the written description. However, this is not intended to limit the present invention to specific embodiments, and the invention should be understood to include all modifications, equivalents, and substitutes included in its spirit and scope. In the following description, detailed description of related known technology is omitted where it is determined that such description may obscure the gist of the present invention.

Terms such as first and second may be used to describe various components, but the components should not be limited by the terms. The terms are used only for the purpose of distinguishing one component from another.

The terminology used herein is for the purpose of describing particular example embodiments only and is not intended to limit the present invention. Singular expressions include plural expressions unless the context clearly indicates otherwise. In this application, the terms "comprise" and "have" are intended to indicate the presence of a feature, number, step, operation, component, part, or combination thereof described in the specification, and it is to be understood that they do not exclude in advance the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

[Description of reference numerals]

100, 200, 300: Book Recommendation System

110: control unit

120: TF-IDF unit

130: LDA unit

140: similarity determination unit

150: book recommendation unit

160: input and output unit

170: user behavior extraction unit

101: Book Keyword Database

102: Theme Keyword Database

103: Book Theme Database

105: Book Group Database

107: User Behavior Database

109: Book Information Database

220: emotion graph generation unit

230: emotion graph similarity determination unit

240: book recommendation unit

250: input / output unit

260: user behavior extraction unit

201: Emotional Word Database

203: Emotion Graph Database

205: User Behavior Database

207: Book Information Database

Hereinafter, embodiments according to the present invention will be described in detail with reference to the accompanying drawings. In the description with reference to the accompanying drawings, the same or corresponding components are given the same reference numerals, and redundant description thereof is omitted.

In addition, terms such as first and second used below are merely identification symbols for distinguishing the same or corresponding components, and the same or corresponding components are not limited by terms such as first and second.

In addition, the term coupling is used as a comprehensive concept that covers not only the case where components are in direct physical contact with each other, but also the case where another component is interposed between the components so that each component is in contact with that other component.

FIG. 1 is a block diagram of a book recommendation system according to the present invention. Referring to FIG. 1, the book recommendation system includes a control unit 110, a TF-IDF unit 120, an LDA unit 130, a similarity determination unit 140, a book recommendation unit 150, an input/output unit 160, a user behavior extraction unit 170, a book keyword database 101, a theme keyword database 103, a book group database 105, a user behavior database 107, and a book information database 109. The book recommendation system may be a book recommendation server. The databases may also be configured as a single database. In addition, although the description and the claims refer to books, the application of the present invention is not limited to books; it can be applied to any distinguishable text, including forms such as web novels and comics.

First, the control unit 110 controls components such as the TF-IDF unit, the LDA unit, the similarity determination unit, the book recommendation unit, the input/output unit, and the user behavior extraction unit. For example, the control unit may receive a control signal from a user and control the TF-IDF unit to extract the keywords of a book according to that signal. The control unit may also control the databases connected to it. In addition, it may assign a theme to a book according to the keywords of the book, and it may have the similarity of books measured in consideration of the genre of the book, the taste of the user, and the like. This is described in more detail below. Meanwhile, although FIG. 1 shows the various databases as included in the server that constitutes the book recommendation system, the databases may be external to the server.

Next, the TF-IDF unit 120 extracts keywords that can represent a book by referring to the term frequency and inverse document frequency in the text of the book. TF-IDF (Term Frequency-Inverse Document Frequency) is a weight used in information retrieval and text mining. It is a statistical value that indicates how important a word is in a particular document within a document group or set of documents. It can be used to extract the key words of a document, to rank search results in a search engine, or to obtain a degree of similarity between documents. Extracting keywords by TF-IDF is well known and can follow conventional methods.

More specifically, the TF-IDF unit reads the text of the book, the title of the book, a short text introducing the book, and the like from the book information database. It then analyzes one or more of the text of the book, the title of the book, and the text introducing the book on a TF-IDF basis to extract keywords. One or more keywords representing the book may be extracted; for example, about 50 may be extracted. The extracted keywords are stored in the book keyword database, so that a plurality of keywords representing each book are stored there. Each keyword has a weight, and the keywords representing a book can be expressed as a vector based on those weights.
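As a minimal sketch of the keyword extraction described above, the example below computes a plain TF-IDF weight (relative term frequency multiplied by the logarithm of the inverse document frequency) and keeps the highest-weighted words per book. The function name `tfidf_keywords`, the sample data, and the choice of this particular TF-IDF variant are assumptions for illustration; the specification only says that conventional methods may be used, and real book text would also need tokenization suited to its language.

```python
import math
from collections import Counter


def tfidf_keywords(docs, top_k=50):
    """Return, for each book, its top_k (word, weight) pairs ranked by TF-IDF.

    docs maps a book identifier to a list of already tokenized words.
    """
    n_docs = len(docs)
    # Document frequency: the number of books in which each word appears.
    df = Counter()
    for words in docs.values():
        df.update(set(words))

    keywords = {}
    for book_id, words in docs.items():
        tf = Counter(words)
        total = len(words)
        weights = {
            word: (count / total) * math.log(n_docs / df[word])
            for word, count in tf.items()
        }
        # Highest weight first; ties are broken alphabetically for stable output.
        ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
        keywords[book_id] = ranked[:top_k]
    return keywords


books = {
    "book_a": "golf golf golf swing club the the".split(),
    "book_b": "household budget saving household the the".split(),
    "book_c": "golf budget travel the the".split(),
}
result = tfidf_keywords(books, top_k=3)
for book_id, kws in result.items():
    print(book_id, [word for word, _ in kws])
```

A word that occurs in every book, such as "the" in this sample, receives a weight of zero and therefore never outranks a word that is distinctive for one book. The (word, weight) pairs can be stored as the keyword vector of the book.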

The control unit may assign a theme or genre to each book based on the keywords extracted by the TF-IDF unit. More specifically, the control unit may compare the keywords extracted for each book by the TF-IDF unit with the keywords stored in the theme keyword database to assign a theme. In the theme keyword database, a theme is assigned to each keyword. For example, the theme "economic" is assigned to the keyword "household". As another example, the theme "sport" may be assigned to the keyword "golf". The theme keyword database contains thousands of different themes. This theme classification can be created by using the classification of the database of an external online bookstore.

Meanwhile, keywords and themes are not limited to one-to-one assignment and may be assigned in a one-to-many manner. For example, "household book" may be assigned to the theme "economic", but may also be assigned to the themes "diaries" and "phrases" in addition to "economic". The themes assigned to each keyword may be expressed in vector form. For the keywords of a book analyzed by the TF-IDF unit, the control unit in turn assigns themes, so that one book can be assigned one or more theme values, and these theme values can form a theme vector. These book themes can be stored in the book theme database.
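The one-to-many assignment of themes can be sketched as a lookup that accumulates theme weights over the keywords of a book. The table `THEME_KEYWORDS`, its weights, and the rule of multiplying keyword weight by theme weight are hypothetical; the specification does not say how the theme values are combined.

```python
from collections import defaultdict

# Hypothetical theme keyword table: keyword -> {theme: weight}.
# A keyword may map to several themes (one-to-many assignment).
THEME_KEYWORDS = {
    "household": {"economic": 1.0, "diaries": 0.5},
    "golf": {"sport": 1.0},
    "budget": {"economic": 0.8},
}


def assign_themes(book_keywords, theme_keywords=THEME_KEYWORDS):
    """Build a theme vector {theme: value} from (keyword, weight) pairs.

    Keywords absent from the theme keyword table are ignored.
    """
    theme_vector = defaultdict(float)
    for keyword, keyword_weight in book_keywords:
        for theme, theme_weight in theme_keywords.get(keyword, {}).items():
            theme_vector[theme] += keyword_weight * theme_weight
    return dict(theme_vector)


themes = assign_themes([("household", 0.4), ("budget", 0.5), ("travel", 0.3)])
```

Here the book receives values for the themes "economic" and "diaries", while the keyword "travel" contributes nothing because it has no entry in the illustrative table.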

Next, the LDA unit 130 generates one or more groups in the book data by using a Latent Dirichlet Allocation (LDA) algorithm. More specifically, the LDA unit analyzes various book data, such as the text, title, and page number of the book, to generate groups within the book data. For example, assume that a first group and a third group are generated when the book data of a first book is analyzed, and that a second group and a fourth group are generated when the book data of a second book is analyzed. If the similarity determination unit finds that the characteristics of the first group and the second group are similar, the two may be determined to be the same group. This is described in detail below. The LDA algorithm may follow conventional methods.

The similarity determination unit 140 determines the similarity between books. More specifically, the similarity determination unit compares the keyword sets extracted for the books by the TF-IDF unit to calculate the similarity of each book. It may derive the similarity between keywords by calculating the cosine similarity between the keyword sets representing each book. This makes it possible to create a list of books that are similar to a specific book. Such similarity information may be stored in the book keyword database, the book information database, or a separate database (not shown).
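The cosine similarity between keyword sets can be sketched as follows, treating each book as a sparse vector of keyword weights. The function names and the sample vectors are illustrative assumptions.

```python
import math


def cosine_similarity(vec_a, vec_b):
    """Cosine similarity of two sparse vectors given as {term: weight} dicts."""
    dot = sum(weight * vec_b.get(term, 0.0) for term, weight in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # An empty vector is treated as similar to nothing.
    return dot / (norm_a * norm_b)


def most_similar(target_id, keyword_vectors, top_n=5):
    """Rank all other books by cosine similarity to the target book."""
    target = keyword_vectors[target_id]
    scored = [
        (other_id, cosine_similarity(target, vector))
        for other_id, vector in keyword_vectors.items()
        if other_id != target_id
    ]
    return sorted(scored, key=lambda pair: -pair[1])[:top_n]


vectors = {
    "book_a": {"golf": 0.9, "swing": 0.4},
    "book_b": {"golf": 0.8, "club": 0.3},
    "book_c": {"budget": 0.7, "saving": 0.5},
}
ranking = most_similar("book_a", vectors)
```

The same function applies unchanged to theme vectors or to group-based vectors, since it only requires two weight dictionaries.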

In addition, the similarity determination unit may calculate the similarity between the book themes derived for each book by the control unit. In the same manner as described above, the similarity determination unit may derive the similarity between books by comparing the book theme values stored for each book in the book theme database.

In addition, the similarity determination unit may determine the similarity between books using the book group information generated by the LDA unit and stored in the book group database. For example, assume that the data of a first book is grouped into a first group, a second group, and a third group, the data of a second book into a fourth group and a fifth group, and the data of a third book into a sixth group and a seventh group. In this case, the similarity determination unit compares the first, second, and third groups with the fourth and fifth groups, and compares the first, second, and third groups with the sixth and seventh groups. As a result, the similarity determination unit may determine, for example, that the similarity between the first book and the second book is higher than the similarity between the first book and the third book. This similarity can also be derived by calculating the cosine similarity between groups.

Next, the book recommendation unit 150 recommends books based on the similarity between books derived by the similarity determination unit. As described above, recommended books may be output in order of similarity based on the first similarity measured from keywords, the second similarity measured from themes, the third similarity measured from groups, and the like.

Since the first similarity is based on keywords and does not refer to a preset theme keyword database, the book recommendation unit can find similar books regardless of the genre of the book. In contrast, since the second similarity is based on the theme of the book, that is, its genre, and more specifically is calculated using a theme keyword database in which previously categorized themes are stored, the book recommendation unit can find similar books based on the genre of the book. In addition, since the third similarity is determined from groups generated from the book content by the LDA algorithm, the book recommendation unit can derive the similarity between books without being limited to keywords or genres.

Therefore, the book recommendation unit may make the final recommendation by combining the first, second, and third similarities in consideration of the user's taste or needs, the user's reading behavior, and the characteristics of the book. For example, when the first book is a technical book consisting largely of specialized terminology, such as medical terminology, the book is characterized by that terminology, so similar books may be recommended by increasing the weight of the keyword-based first similarity. In contrast, when the second book is a novel with strong genre characteristics, similar books may be recommended by increasing the weight of the second similarity, which is based on the theme, that is, the genre. In this way, the book recommendation unit may make the final recommendation by adjusting the weights of the first to third similarities based on the characteristics or genres of the books.
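One simple way to realize such weighting is a normalized weighted average of the three similarities. The sketch below assumes exactly that; the function name, the presets in `WEIGHT_PRESETS`, and the specific weight values are hypothetical, since the specification does not state how the weights are chosen or combined.

```python
def combined_similarity(first, second, third, weights=(1.0, 1.0, 1.0)):
    """Weighted average of the keyword, theme, and group similarities."""
    w1, w2, w3 = weights
    total = w1 + w2 + w3
    if total == 0:
        raise ValueError("at least one weight must be non-zero")
    return (w1 * first + w2 * second + w3 * third) / total


# Hypothetical presets: emphasize keywords for terminology-heavy technical
# books and themes for novels with strong genre characteristics.
WEIGHT_PRESETS = {
    "technical": (3.0, 1.0, 1.0),
    "genre_fiction": (1.0, 3.0, 1.0),
}

score = combined_similarity(0.9, 0.1, 0.1, WEIGHT_PRESETS["technical"])
```

With the "technical" preset, a pair of books that share many keywords but few themes still scores relatively high, which matches the intent of the medical terminology example.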

Meanwhile, as described above, the control unit may select the first similarity or the third similarity based on book-dependent information, such as the genre of the book, and user-dependent information, such as user behavior. According to the method selected by the control unit, the similarity determination unit calculates the similarity of the books, and the book recommendation unit can finally derive the most appropriate similar books. The book information can be a variety of information, including the genre of the book, the price of the book, the number of pages of the book, the year of publication of the book, the author nationality of the book, the number of authors, the book publisher, the title of the book, the number of prints, the number of trials, the bookstore, the discount rate, and so on. The user's information may include user age, user gender, user occupation, user religion, user's first language, user residential area, user online website visit history, user college major, user family relationship, user income, and so on.

Next, the book recommendation unit may recommend books by reflecting not only the first to third similarities but also user behavior. Specifically, the book recommendation unit may take as its starting point books to which the user has given an evaluation score higher than a predetermined reference value, books included in a shopping cart of an online shopping mall, or books whose information the user has viewed online, and recommend books similar to them to the user. This role may be played by a similar book extraction module (not shown) in the book recommendation unit.

More specifically, the book recommendation system ultimately recommends books that the user may find interesting. To this end, the book recommendation unit of the book recommendation system may analyze the user's behavior patterns and recommend books similar to the books related to that behavior. That is, in order to provide recommendations tailored to the individual user, the user's online behavior is identified and used in selecting the recommended books. For example, when a user gives a high evaluation score to a specific book on an online website or the like, books similar to that book are selected through the similarity determination unit and recommended. In addition, when a user purchases a specific book in an online bookstore such as Amazon (registered trademark), or has put it in a shopping cart in order to purchase it, books similar to that book may be selected and provided to the user as recommendation information. In addition, when a user has viewed a book review for a specific book in an online community or on a review site, books similar to that book may be selected. The selection of similar books described above may be performed on similar books sorted or classified by the similarity determination unit based on the first to third similarities mentioned above.

Various book information, and the user's behavior information regarding books, may be extracted from websites in the external online space, for example an SNS site, an online bookstore site, or an online book review site, through the user behavior extraction unit 170. The user behavior information extracted through the user behavior extraction unit may be stored in the user behavior database.

In addition, the book recommendation unit may screen the selected books before recommending them. Once the similar book extraction module of the book recommendation unit has selected similar books in consideration of the user's behavior, a selection recommendation module (not shown) of the book recommendation unit finally outputs the selected similar books according to predetermined criteria. That is, the output role may be played by the selection recommendation module in the book recommendation unit. The selection recommendation module may recommend the remaining books after excluding those that the user has already read from the similar books extracted by the similar book extraction module. It may also recommend the remaining books after excluding books rated low by the user, other users, critics, and the like. Alternatively, in consideration of the user information, books that are inappropriate for the user to read may be excluded. In short, the selection recommendation module excludes, from among the similar books, those that should not be recommended under the predetermined criteria. Meanwhile, the selection recommendation module may refer to the book information database for book selection. Various information about each book may be included in the book information database. In addition, the selection recommendation module may make the final recommendation by linking the user behavior database and the book information database. For example, information about the books already read by a particular user may be stored in the user behavior database.

In addition, the selection recommendation module may sort and output the books extracted by the similar book extraction module according to specific criteria. For example, one or more books may be output in order of book rating. In this case, the rating of each book can be retrieved from the book information database and used. The books may also be sorted and output based on information such as the cumulative sales of each similar book or its sales for each period.
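The screening and ordering performed by the selection recommendation module can be sketched as a filter followed by a sort. The `Candidate` record, the `min_rating` threshold, and the tie-breaking by similarity are assumptions made for illustration; the specification leaves the predetermined criteria open.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    book_id: str
    similarity: float  # similarity to the book the user showed interest in
    rating: float      # book rating, e.g. taken from the book information database


def select_recommendations(candidates, already_read, min_rating=3.0, limit=10):
    """Drop books the user has read or that are rated low, then sort by rating."""
    kept = [
        c for c in candidates
        if c.book_id not in already_read and c.rating >= min_rating
    ]
    # Highest rating first; similarity breaks ties between equal ratings.
    kept.sort(key=lambda c: (-c.rating, -c.similarity))
    return kept[:limit]


candidates = [
    Candidate("book_b", similarity=0.91, rating=4.2),
    Candidate("book_c", similarity=0.85, rating=2.1),
    Candidate("book_d", similarity=0.80, rating=4.7),
    Candidate("book_e", similarity=0.78, rating=4.7),
]
picks = select_recommendations(candidates, already_read={"book_e"})
```

Sorting by cumulative sales or by sales in a given period would only require a different sort key.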

In addition, similar books may be sorted and output using information personalized for each user. For example, using the age information in the user's personal information, books preferred by users of a similar age may be output in order of preference. In addition, in consideration of the gender or inclinations recorded in the personal information, specific books among the similar books may be output first. These operations may be performed using the user personal information stored in the user behavior database.

Turning to the databases, as described above, the book keyword database contains a set of keywords representing each book. The theme keyword database contains information on the relationship between keywords and themes, such as the keywords for each theme and the themes for each keyword, and may include theme weights for each keyword and vice versa. The book theme database contains a set of themes representing each book. The book group database contains the groups representing each book. As described above, the user behavior database may include various book information from an SNS site, an online bookstore site, an online book review site, and the like, as well as user behavior information regarding books. Finally, the book information database may include information related to books, such as book titles, book authors, and book publication years.

FIGS. 2 to 5 are operational flowcharts in accordance with the present invention. In describing FIGS. 2 to 5, descriptions that duplicate those already given for the book recommendation system are omitted. Referring to FIG. 2, the book recommendation server receives book data (S810). Next, the book recommendation server extracts keywords from the book data using the TF-IDF method (S830). Next, the book recommendation server calculates the similarity between the extracted keyword sets (S850). Thereafter, the book recommendation server outputs recommended books based on the calculated similarity (S870).
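The flow of FIG. 2 can be illustrated with a minimal sketch. The description does not specify how the keyword sets are compared, so the Jaccard overlap used here is an assumption, as are the function names, the whitespace tokenization, and the toy book data.

```python
import math
from collections import Counter


def tfidf_keywords(docs, top_k=3):
    """Pick the top_k terms with the highest TF-IDF score per book (S830)."""
    tokenized = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(tokenized)
    doc_freq = Counter()
    for words in tokenized.values():
        doc_freq.update(set(words))  # count each term once per book

    keywords = {}
    for name, words in tokenized.items():
        tf = Counter(words)
        scores = {
            w: (count / len(words)) * math.log(n_docs / doc_freq[w])
            for w, count in tf.items()
        }
        # Sort by score (descending), breaking ties alphabetically.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        keywords[name] = set(ranked[:top_k])
    return keywords


def jaccard(a, b):
    """Overlap of two keyword sets (S850); an assumed similarity measure."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(first, keywords):
    """Rank the other books by similarity to the book `first` (S870)."""
    others = [name for name in keywords if name != first]
    return sorted(
        others,
        key=lambda o: jaccard(keywords[first], keywords[o]),
        reverse=True,
    )


# S810: toy book data (hypothetical introduction texts).
books = {
    "book1": "dragon wizard castle dragon quest the",
    "book2": "wizard dragon spell castle the the",
    "book3": "stock market invest stock profit the",
}
keywords = tfidf_keywords(books)
print(recommend("book1", keywords))  # ['book2', 'book3']
```

A word such as "the" that appears in every book receives an inverse document frequency of zero, so it never ranks among the keywords.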

Referring to FIG. 3, steps S810 and S830 are the same, and the subsequent steps are different. The book recommendation server assigns book themes according to the extracted keywords (S835). After that, the book recommendation server calculates the similarity between the assigned book themes (S837). Thereafter, recommended books are output based on the calculated similarity (S870).
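As a rough illustration of steps S835 and S837, the sketch below maps keywords to weighted themes and compares the resulting theme vectors. The contents of `THEME_WEIGHTS` are hypothetical stand-ins for the theme keyword database, and cosine similarity is an assumed measure that the description does not prescribe.

```python
import math

# Hypothetical theme keyword database: keyword -> {theme: weight}.
THEME_WEIGHTS = {
    "dragon": {"fantasy": 1.0},
    "wizard": {"fantasy": 0.8, "adventure": 0.2},
    "quest": {"adventure": 1.0},
    "stock": {"finance": 1.0},
    "market": {"finance": 0.5, "economy": 0.5},
}


def assign_themes(keywords):
    """Accumulate theme weights over a book's keywords (S835)."""
    themes = {}
    for kw in keywords:
        for theme, weight in THEME_WEIGHTS.get(kw, {}).items():
            themes[theme] = themes.get(theme, 0.0) + weight
    return themes


def cosine(u, v):
    """Cosine similarity between two sparse theme vectors (S837)."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


themes_a = assign_themes({"dragon", "quest"})
themes_b = assign_themes({"wizard"})
themes_c = assign_themes({"stock", "market"})

print(round(cosine(themes_a, themes_b), 3))  # 0.857
print(cosine(themes_a, themes_c))            # 0.0
```

Two books with no keyword in common can still score as similar when their keywords map to the same themes, which is the practical difference from comparing keyword sets directly.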

Referring to FIG. 4, step S810 is the same, and the subsequent steps are different. First, the contents of the books are grouped by the LDA method (S840). Thereafter, the similarity between the groups of the grouped books is calculated (S845). Thereafter, recommended books are output based on the calculated similarity (S870).

Referring to FIG. 5, step S810 is the same, and the subsequent steps are different. First, the method used to calculate the similarity between books is determined according to the characteristics of the book or the needs of the user (S820). When the TF-IDF method is selected, the similarity between keyword sets is measured according to the TF-IDF method. In contrast, when the LDA method is selected, the similarity between books is measured according to the LDA method and output.

FIG. 6 is a block diagram of a book recommendation system according to another embodiment of the present invention. Referring to FIG. 6, the book recommendation system includes a control unit 110, an emotion graph generation unit 220, an emotion graph similarity determination unit 230, a book recommendation unit 240, an input/output unit 250, an emotion word database 201, an emotion graph database 203, a user behavior database 205, and a book information database 207. This book recommendation system may be a book recommendation server.

First, the control unit 110 controls components such as the emotion graph generation unit, the emotion graph similarity determination unit, the book recommendation unit, and the input/output unit. For example, upon receiving a control signal from the user, the control unit may control the emotion graph generation unit to generate an emotion graph according to the control signal. Of course, the control unit may also control the databases connected to it. In FIG. 6, the databases are included in the server that constitutes the book recommendation system; however, the databases may exist outside the server. In addition, although the description and the claims refer to books, the application of the present invention is not limited to books and extends to any distinct type of text, such as web novels and comics.

Next, the emotion graph generation unit 220 generates an emotion graph for the story flow of a book. The storyflow refers to the unfolding of the story in order from the first chapter to the last chapter of the book. Generally, novels, essays, travel writing, and the like, unlike expository or argumentative texts, contain literary elements, and emotion flows through the text as a whole. For example, William Shakespeare's Romeo and Juliet has a storyflow that includes various emotions: the feud between the two families, the love between Romeo and Juliet, Juliet's forced betrothal to Paris, the duel between Romeo and Paris, Romeo's suicide, Juliet's suicide, and the reconciliation of the two families. Specifically, taking happiness as the emotion of interest, the emotion level is low during the feud between the two families at the beginning, high during the love between Romeo and Juliet in the middle, and low again because of the duel toward the end, but it rises again with the reconciliation. As described above, in books containing literary elements, such as novels and essays, emotion changes along the story flow. The emotion graph generation unit represents these changes as emotion graphs.

In addition, the emotion graph generation unit may use the genre information of a book, which may be stored in the book information database, to select the books for which a storyflow emotion graph is to be generated. More specifically, when the genre information of the target book is 'novel', the emotion graph generation unit takes the story flow of the book as input in order to generate the emotion graph. In contrast, when the genre information of the book is 'dissertation', it determines that there is no emotion to be extracted and does not attempt to generate an emotion graph.

Also, the emotion graph generation unit may analyze the emotion in the book through a window. FIG. 7 is a conceptual diagram illustrating the generation of a storyflow emotion graph according to the present invention. Referring to FIG. 7, the words constituting the story flow of a book are loaded into a memory (not shown), and the words stored in the memory are read sequentially, one portion at a time, for emotion analysis. For example, suppose a book is a novel of 10,000 words. Emotions may be analyzed in units of one hundred words, and the group of words subject to one emotion analysis is the content of the window. The window is moved in order from the first word of the book to the last, and the emotion values of the words contained in the window are calculated at each position.

The size of the window may be set arbitrarily by the user. In an embodiment of the present invention, the window may be sized to hold a number of words corresponding to 1% of the total number of words in the book. For example, if an entire novel contains ten thousand words, the window holds one hundred words. Of course, the size of the window is not limited to this example, and windows of various sizes may be used.

The size of the window may also be set to the average number of words per page of the book, so that the window corresponds roughly to one page of the book.

Also, the window can be moved in the storyflow direction of the book. For example, suppose a window that holds one hundred words is moved across a book of 10,000 words. The first window spans the first word to the 100th word, and the emotion is extracted by analyzing that part of the book. The window is then moved one word in the storyflow direction, so that it spans the second word to the 101st word, and the emotion of those words is analyzed. By repeating this process, the last window spans the 9,901st word to the 10,000th word. In total, 9,901 emotion analyses are performed to complete the emotion analysis of the entire book. Because only one word is added and one word removed between consecutive windows, the emotion analysis result changes smoothly from the first result to the 9,901st. The above is summarized as follows.

Total number of words in the book = n

Window size = number of words that fit in the window = m

The k-th word in the book's storyflow order = word(k)

Words analyzed in the window at position a = word(a) to word(a + m - 1)

Number of window positions when moving one word at a time = n - m + 1

Example)

Total number of words in the book = 10,000

Window size = 100

Words analyzed in the window at the first position = word(1) to word(100)

Words analyzed in the window at the second position = word(2) to word(101)

.....

Words analyzed in the window at position 9,901 = word(9,901) to word(10,000)
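The window positions can be enumerated with a short sketch. The function name `window_bounds` and the `step` parameter are illustrative assumptions. With n = 10,000 words and a window of m = 100 words moved one word at a time, the last window starts at word n - m + 1 = 9,901.

```python
def window_bounds(n, m, step=1):
    """Yield 1-based (first, last) word indices for each window position.

    n: total number of words in the book
    m: number of words held by the window
    step: how many words the window advances per move (assumed parameter)
    """
    for a in range(1, n - m + 2, step):
        yield a, a + m - 1


bounds = list(window_bounds(10_000, 100))
print(len(bounds))   # 9901
print(bounds[0])     # (1, 100)
print(bounds[1])     # (2, 101)
print(bounds[-1])    # (9901, 10000)

# Moving the window by its own size gives non-overlapping sections.
paged = list(window_bounds(10_000, 100, step=100))
print(len(paged))    # 100
```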

Although the window may be moved one word at a time, the present invention is not limited thereto. The emotion value may instead be calculated by moving the window one page at a time. In this case, because all the words in the window are replaced at each move, the emotion value may change considerably between positions. For example, when the window size is set to one page of the book and the moving unit is also set to one page, the emotion result value for each page is obtained in sequence, and the emotion values are distributed discretely.

Next, the method of calculating the emotion value of a window is described. The emotion graph generation unit calculates the emotion level by analyzing the words included in the window. In detail, the words in the window are searched for words related to a specific emotion, and the scores assigned to the words found are summed to obtain a value for the level of that emotion. FIG. 8 is an emotion value table of words representing the emotion of happiness, included in the emotion word database 201. In the emotion value table, the word 'laugh' may have an emotion value of '4' and the word 'happy' a value of '3'. Emotion values need not be positive; some words have negative emotion values. For example, the word 'hate' has an emotion value of '-1' and the word 'revenge' has a value of '-3'. The emotion values are not necessarily limited to positive or negative numbers; any representation that expresses the degree of emotion suffices. For example, they may be expressed as high, middle, and low.

When words in the window are present in the emotion value table, the unit emotion sum is calculated by summing the emotion values of those words. That is, the unit emotion sum is the sum of the emotion values of the words included in one window. FIG. 9 shows a page containing words that have emotion values. For example, the words 'laugh', 'love', 'happiness', and 'hatred', which relate to the emotion of happiness, are found on the page of FIG. 9. Summing their emotion values gives a unit emotion sum of 7 for this page: laugh (4) + love (2) + happiness (3) + hatred (-2). By repeatedly calculating the unit emotion sum in the storyflow direction, an emotion graph of the book for the emotion of happiness can be derived. FIG. 10 is a graph showing the unit emotion sum along the story flow direction.
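The calculation of the unit emotion sum can be sketched as follows. The table reuses the four values quoted in the worked example (laugh 4, love 2, happiness 3, hatred -2); the sample sentence, the function names, and the exact-match lookup are assumptions made for illustration.

```python
# Emotion values for the emotion of happiness, taken from the worked example.
HAPPINESS_TABLE = {"laugh": 4, "love": 2, "happiness": 3, "hatred": -2}


def unit_emotion_sum(words, table):
    """Sum the emotion values of the words in one window.

    Words that are not in the table contribute nothing.
    """
    return sum(table.get(w, 0) for w in words)


def emotion_graph(words, table, window, step=1):
    """Slide the window along the story flow and collect one sum per position."""
    return [
        unit_emotion_sum(words[i:i + window], table)
        for i in range(0, len(words) - window + 1, step)
    ]


# Hypothetical page text containing the four emotion words.
page = "they laugh with love and happiness despite old hatred".split()

print(unit_emotion_sum(page, HAPPINESS_TABLE))         # 7
print(emotion_graph(page, HAPPINESS_TABLE, window=3))  # [4, 6, 2, 5, 3, 3, -2]
```

With a step of one word, consecutive windows share all but one word, so neighbouring values in the graph differ by at most the values of the word that enters and the word that leaves.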

In this example, the emotion graph is generated only for the emotion of happiness, but the present invention is not limited thereto, and emotion graphs may be generated for various emotions such as happiness, surprise, regret, loneliness, and fear. To this end, the emotion word database may hold an emotion table for each emotion, and each emotion table may hold various words. In addition, the calculation of an emotion value need not be limited to exact word matches; the emotion value may be calculated per morpheme, the minimum unit of meaning.

The emotion graph similarity determination unit 230 determines the similarity between the emotion graphs of books. More specifically, the similarity between a first book and a second book is determined based on a first similarity determination base value, which represents the difference between the unit emotion sum of the first book and the unit emotion sum of the second book, and a second similarity determination base value, which represents the amount of change in the first similarity determination base value. In addition, as described above, the comparison need not be limited to one emotion; the similarities between the graphs of several emotions may be combined to derive the final similarity between the two books.

FIG. 11 is a table showing unit emotion sum values for each section of books 1 to 4. Referring to FIGS. 11 and 12, the differences between the per-section unit emotion sum values of book 1 and book 2 are as shown in FIG. 13. The first similarity determination base value is the difference between the unit emotion sum value of the first book and that of the second book along the story direction, and the second similarity determination base value is the amount of change in the first similarity determination base value. Referring to FIG. 13, the first similarity determination base values between book 1 and book 2 are '1, 3, 5, 7, 9, 11, 8, 6, 4, 2, 1, 1, 3, 5'. In contrast, the first similarity determination base values between book 1 and book 3 are all '1'. The emotion graph similarity determination unit judges that two books are more similar the smaller the difference between their unit emotion sums; that is, the smaller the first similarity determination base value, the more similar the books. Therefore, referring to FIG. 13, the emotion graph similarity determination unit may determine that book 3 is more similar to book 1 than book 2 is. Turning to the second similarity determination base value, because the first similarity determination base value between book 1 and book 2 changes greatly, the second similarity determination base value is also large. In contrast, because the first similarity determination base value between book 1 and book 3 does not change, the second similarity determination base value is '0'. Therefore, the emotion graph similarity determination unit may determine that books 1 and 3, which have a low second similarity determination base value, are more similar than books 1 and 2. The emotion graph similarity determination unit may determine the similarity between two books using either or both of the first and second similarity determination base values. In addition, the similarity between two books may be measured by weighting the similarity determination base values according to the nature of the books.

Between book 1 and book 3, and likewise between book 1 and book 4, the first similarity determination base value does not change, so the second similarity determination base value is '0' in both cases. However, because the first similarity determination base values differ, the emotion graph similarity determination unit may determine that books 1 and 3, whose first similarity determination base value is smaller, are more similar than books 1 and 4.
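The two base values can be computed as in the sketch below. The per-section sums for book 1 are hypothetical, since the figure data are not reproduced in the text; books 2 to 4 are constructed so that their differences from book 1 match the values quoted above (the listed sequence for book 2, a constant 1 for book 3, and an assumed constant 3 for book 4). Combining the two base values by a weighted mean is also an assumption, as the description leaves the combination open.

```python
def first_values(graph_a, graph_b):
    """Per-section difference between two books' unit emotion sums."""
    return [abs(a - b) for a, b in zip(graph_a, graph_b)]


def second_values(first):
    """Section-to-section change in the first base value."""
    return [abs(b - a) for a, b in zip(first, first[1:])]


def mean(xs):
    return sum(xs) / len(xs) if xs else 0.0


def dissimilarity(graph_a, graph_b, w1=1.0, w2=1.0):
    """Weighted mix of both base values; smaller means more similar (assumed)."""
    first = first_values(graph_a, graph_b)
    return w1 * mean(first) + w2 * mean(second_values(first))


# Hypothetical unit emotion sums per section for book 1.
book1 = [2, 4, 6, 8, 10, 12, 10, 8, 6, 4, 2, 2, 4, 6]
diffs = [1, 3, 5, 7, 9, 11, 8, 6, 4, 2, 1, 1, 3, 5]   # quoted for books 1 and 2
book2 = [a + d for a, d in zip(book1, diffs)]
book3 = [a + 1 for a in book1]   # constant difference of 1
book4 = [a + 3 for a in book1]   # constant difference of 3 (assumed)

others = {"book2": book2, "book3": book3, "book4": book4}
ranking = sorted(others, key=lambda name: dissimilarity(book1, others[name]))
print(ranking)  # ['book3', 'book4', 'book2']
```

Books 3 and 4 both have a second base value of zero, so only the first base value separates them, which matches the comparison made in the text.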

The book recommendation unit 240 selects, as the basis for recommendations to the user, books to which the user has given an evaluation score higher than a predetermined reference value, books included in a shopping cart of an online shopping mall, or books whose information the user has viewed in the online space. This role may be played by a similar book extraction module (not shown) in the book recommendation unit.

More specifically, the book recommendation system ultimately recommends books that the user may find interesting. To this end, the book recommendation unit of the book recommendation system may analyze the user's behavior patterns and recommend books similar to the books associated with that behavior. That is, in order to provide recommendations customized to the user, the user's online behavior is identified and used in selecting the books to recommend. For example, when a user gives a high evaluation score to a specific book on a website, books similar to that book are selected through the emotion graph similarity determination unit and recommended. In addition, when a user has purchased a specific book at an online bookstore such as Amazon (registered trademark) or has put it in a shopping cart in order to purchase it, books similar to that book may be selected and recommended to the user. Likewise, when a user has viewed a review of a specific book in an online community or on a review site, books similar to that book may be selected.

Various book information and the user's behavior information about books may be extracted from websites in the external online space, for example, an SNS site, an online bookstore site, or an online book review site, through the user behavior extraction unit 160. The user behavior information extracted through the user behavior extraction unit may be stored in the user behavior database.

Also, the book recommendation unit may screen the selected books before recommending them. When the similar book extraction module selects similar books in consideration of the user's behavior, a selection recommendation module (not shown) in the book recommendation unit finally outputs the selected similar books according to a predetermined criterion. The selection recommendation module may recommend the similar books extracted by the similar book extraction module except those that the user has already read. It may also exclude books that have received low ratings from the user, other users, or critics. Alternatively, considering the user information, it may exclude books that are inappropriate for the user to read. In short, the selection recommendation module excludes from the similar books those that should not be recommended under the predetermined criteria. Meanwhile, the selection recommendation module may refer to the book information database, which may include various information about each book, for book selection. In addition, the selection recommendation module may make the final recommendation by linking the user behavior database and the book information database. For example, information about books already read by a particular user may be stored in the user behavior database.

In addition, the selection recommendation module may sort the books extracted by the similar book extraction module according to a specific criterion before outputting them. For example, one or more books may be output in order of book rating. In this case, the score of each book can be retrieved from the book information database. The books may also be sorted and output based on information such as the cumulative sales of each similar book or its sales in a given period.

FIGS. 14 to 17 are operational flowcharts in accordance with the present invention. Descriptions that duplicate those given for the book recommendation system with reference to FIGS. 1 to 13 are omitted. Referring to FIG. 14, the book recommendation server generates an emotion graph according to the story flow of books (S910). Next, the book recommendation server determines the similarity between a book of interest and other books (S930). As described above, the similarity may be determined by measuring the similarity of the emotion graphs. Books found to be similar through the similarity calculation are output as recommended books (S950).

FIG. 15 is a flowchart illustrating in more detail the step of generating an emotion graph according to the storyflow. Referring to FIG. 15, the book recommendation server determines whether a story exists using the genre information of the book (S911). Next, the book recommendation server divides the book into sections of a predetermined size (S913). Next, the book recommendation server counts the occurrences of the emotion words present in each section (S915). Thereafter, the book recommendation server calculates a unit emotion sum value for each section by applying the weight of each emotion word, and generates an emotion graph from those values (S917). The information on the graph is stored in the emotion graph database (S919).
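Steps S911 to S917 can be sketched as a single function. The genre labels in `STORY_GENRES`, the emotion weights, and the sample words are assumptions for illustration; storing the result (S919) is left out.

```python
from collections import Counter

# Assumed genre labels for books that have a story flow.
STORY_GENRES = {"novel", "essay", "travel"}

# Assumed emotion word weights for one emotion.
EMOTION_WEIGHTS = {"laugh": 4, "love": 2, "happiness": 3, "hatred": -2}


def build_emotion_graph(words, genre, weights, section_size):
    """Return one unit emotion sum per section, or None if no story exists."""
    # S911: skip books whose genre implies there is no story to follow.
    if genre not in STORY_GENRES:
        return None

    graph = []
    # S913: divide the book into sections of a predetermined size.
    for start in range(0, len(words), section_size):
        section = words[start:start + section_size]
        # S915: count the occurrences of each emotion word in the section.
        counts = Counter(w for w in section if w in weights)
        # S917: weight each count and sum to get the unit emotion sum.
        graph.append(sum(weights[w] * c for w, c in counts.items()))
    return graph


words = "laugh laugh love hatred happiness love".split()
print(build_emotion_graph(words, "novel", EMOTION_WEIGHTS, 3))         # [10, 3]
print(build_emotion_graph(words, "dissertation", EMOTION_WEIGHTS, 3))  # None
```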

FIG. 16 is a diagram illustrating a preprocessing step for extracting a book to be recommended to a user. Referring to FIG. 16, the book recommendation server identifies books that the user has rated highly, books that the user has put in a shopping cart of an online shopping mall or saved using a wish list function of the online shopping mall, and books whose information the user has viewed (S920). The book recommendation server treats such a book as a book of interest to the user and analyzes the similarity between the book of interest and other books (S930). The book recommendation server outputs similar books based on the similarity (S950).

FIG. 17 is a flowchart illustrating in detail the step of outputting a recommended book based on the similarity calculation. Referring to FIG. 17, the book recommendation server extracts a list of similar books to be recommended (S951). The book recommendation server filters the list of similar books by applying user behavior information, for example, information on books that the user has already read (S953). User behavior information may include a variety of information, such as gender, age, and SNS usage behavior. Next, the book recommendation server filters the list of similar books by applying book information, for example, rating information or sales information (S955). The filtered books are then finally output (S957). Although the term filtering is used, it may also be understood as sorting the books. In addition, the steps S951 and S955 need not be performed strictly in sequence, and their order may be changed.
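The output flow of FIG. 17 can be sketched as follows. The `Book` fields, the rating threshold, and the sample data are hypothetical; the sketch treats the user-based step as exclusion of books already read and the book-based step as a rating cut-off followed by sorting on rating and sales.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Book:
    title: str
    rating: float     # average score from the book information database
    total_sales: int  # cumulative sales (assumed field)


def output_recommendations(similar_books, read_titles, min_rating=3.0):
    """Filter and order a list of similar books for final output."""
    # S951: similar_books is the extracted list of similar books.
    # S953: apply user behavior information (drop books already read).
    unread = [b for b in similar_books if b.title not in read_titles]
    # S955: apply book information (drop low-rated books).
    well_rated = [b for b in unread if b.rating >= min_rating]
    # S957: output, highest rating first and then highest sales.
    return sorted(
        well_rated,
        key=lambda b: (b.rating, b.total_sales),
        reverse=True,
    )


similar = [
    Book("A", 4.5, 1200),
    Book("B", 2.1, 9000),
    Book("C", 4.5, 3000),
    Book("D", 3.8, 500),
]
recommended = output_recommendations(similar, read_titles={"D"})
print([b.title for b in recommended])  # ['C', 'A']
```

Because both filters only remove items, swapping the order of the two filtering steps gives the same result here.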

FIG. 18 is a block diagram of a book recommendation system according to another embodiment of the present invention. Descriptions that duplicate the above are omitted. Referring to FIG. 18, the book recommendation system may combine the various kinds of similarity information obtained from the TF-IDF unit, the LDA unit, and the emotion graph generation unit, for example, keyword similarity between books, similarity between book themes, similarity between groups of books, and similarity between the story flows of books, and recommend books to the user accordingly.

Through the book recommendation system and book recommendation method as described above, it is possible to recommend optimal books to the user.

The methods and processes described above may be implemented, for example, as instructions for execution by a processor, controller, or other processing device, and may be encoded on or stored in a machine-readable medium such as a compact disc read-only memory (CD-ROM), a magnetic or optical disk, flash memory, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), or another machine-readable medium.

Such a medium may be implemented as any device that contains, stores, communicates, propagates, or transports executable instructions for use by or in connection with an instruction-executable system, apparatus, or device. Alternatively or additionally, the logic may be implemented as analog or digital logic using hardware such as one or more integrated circuits or one or more processors executing instructions; in software, as functions defined in an API (application programming interface) or DLL (dynamic link library), as functions available in shared memory, or as local or remote procedure calls; or as a combination of hardware and software.

In other implementations, the method may be represented by a signal or a propagated-signal medium. For example, the instructions implementing the logic of any given program may take the form of an electrical, magnetic, optical, electromagnetic, infrared, or other type of signal. The system described above may receive such a signal at a communication interface, such as a fiber optic interface, an antenna, or another analog or digital signal interface, recover the instructions from the signal, store them in machine-readable memory, and/or execute them using a processor.

Although an embodiment of the present invention has been described above, those of ordinary skill in the art may modify and change the present invention in various ways by adding, changing, or deleting elements without departing from the spirit of the present invention as set forth in the claims, and such modifications are also included within the scope of the present invention.

The present invention can be used for a method of recommending books using the similarity between books.

Claims

(1) a book recommendation server receiving book data for a plurality of books including any one or more of a book title, a book body, and book introduction information;

(2) the book recommendation server extracting one or more keywords from the book data based on a word frequency and an inverse document frequency;

(3) the book recommendation server calculating a similarity between the keyword of the first book and the keyword of the second book; And

(4) a book recommendation method using the similarity between books including the book recommendation server outputting one or more similar books similar to the first book according to the similarity.
(1) a book recommendation server receiving book data for a plurality of books including any one or more of a book title, a book body, and book introduction information;

(2) the book recommendation server extracting one or more keywords from the book data based on a word frequency and an inverse document frequency;

(3) the book recommendation server assigning one or more book themes based on the keywords;

(4) calculating, by the book recommendation server, the similarity between the book theme of the first book and the book theme of the second book;

(5) the book recommendation method using the book similarity between the book recommendation server comprising the step of outputting one or more similar books similar to the first book according to the similarity.
(1) a book recommendation server receiving book data for a plurality of books including any one or more of a book title, a book body, and book introduction information;

(2) the book recommendation server generating at least one group in the book data using a latent Dirichlet allocation (LDA) algorithm;

(3) the book recommendation server calculating a similarity between the group of the first book and the group of the second book; And

(4) a book recommendation method using the similarity between books including the book recommendation server outputting one or more similar books similar to the first book according to the similarity.
(1) a book recommendation server receiving book data for a plurality of books including any one or more of a book title, a book body, and book introduction information;

(2) the book recommendation server extracting one or more keywords from the book data based on word frequency and inverse document frequency, assigning one or more book themes based on the keywords, or generating at least one group in the book data based on a latent Dirichlet allocation (LDA) algorithm;

(4) the book recommendation server calculating a first similarity between the keyword of the first book and the keyword of the second book, a second similarity between the book theme of the first book and the book theme of the second book, or a third similarity between the group of the first book and the group of the second book;

(5) a book recommendation method using the book similarity between the book recommendation server comprising the step of outputting one or more similar books similar to the first book according to any one or more of the first similarity to the third similarity.
(1) a book recommendation server receiving book data for a plurality of books including any one or more of a book title, a book body, and book introduction information;

(2) the book recommendation server selecting one of first and third similarity calculation methods;

(3) when the first similarity calculation method is selected, the book recommendation server extracts one or more keywords from the book data based on a word frequency and an inverse document frequency;

When the third similarity calculation method is selected, the book recommendation server generating one or more groups in the book data based on a latent Dirichlet allocation (LDA) algorithm;

(4) when a first similarity calculation method is selected, the book recommendation server calculates a first similarity between the keyword of the first book and the keyword of the second book,

Calculating a third similarity between the group of the first book and the group of the second book when the third similarity calculation scheme is selected;

(5) The book recommendation method using the book similarity between the book recommendation server comprising the step of outputting one or more similar books similar to the first book according to the first or third similarity.
The method of claim 5,

Step (2),

The book recommendation method using the similarity between books, the step of selecting one of the first or the third similarity calculation method according to the information of the book or the user information.
According to claim 1,

The first book,

A book recommendation method using the similarity between books that have been given a higher evaluation score by a user than a predetermined reference value, books included in a shopping cart of an online shopping mall, or books in which relevant book information has been viewed in an online space by a user.
The method according to any one of claims 1 to 3,

Outputting the similar book,

Excluding books read by the user from the one or more similar books;

Outputting the similar books in order of book score among the one or more similar books;

The book recommendation method using the similarity between books further comprising any one or more of the step of outputting the similar book in consideration of the user information of the one or more similar books.
The method of claim 4, wherein

Step (2),

The book recommendation server extracts one or more keywords from the book data based on word frequency and inverse document frequency, assigns one or more book themes based on the keywords, generates one or more groups in the book data based on the LDA (Latent Dirichlet Allocation) algorithm, or generates a storyflow emotion graph for each of a plurality of books.

Step (4),

The book recommendation server calculates a first similarity between a keyword of a first book and a keyword of a second book, a second similarity between a book theme of the first book and a book theme of the second book, a third similarity between the group of the first book and the group of the second book, or a fourth similarity between the storyflow emotion graph of the first book and the storyflow emotion graph of the second book;

Step (5),

And the recommendation server outputs one or more similar books similar to the first book according to one or more of the first to fourth similarities.
The method of claim 9,

Step (2),

Generating a window for analyzing words as many as a predetermined ratio with respect to the total number of words included in each book; And

The emotion of words contained in the window of the moved book is analyzed by sequentially moving the window from the word at the first position included in each book to the story direction of the book in the direction of the word at the second position. The book recommendation method using the similarity between books further comprising the step of generating a storyflow emotion graph.
The method of claim 10,

The emotion of words contained in the window of the moved book is analyzed by sequentially moving the window from the word at the first position included in each book to the story direction of the book in the direction of the word at the second position. To create a storyflow emotion graph,

Assigning an emotion value for each word included in the window with reference to words included in a predefined emotion word table;

Deriving a unit emotion summation summating emotion values for each word in the window; And

The book recommendation method using the similarity between books further comprising the step of generating data for the story flow emotion graph based on the unit emotion sum total value.
The method of claim 11,

Step (4),

A first similar judgment basic value representing a difference between a unit feeling sum value of the first book in the story direction and a unit feeling sum value of the second book in the story direction; And

The book recommendation method using the similarity between books, characterized in that for calculating a fourth similarity based on any one or more of the second similar determination basis value indicating the amount of change of the first similar determination basis value.