CN111274472A - Information recommendation method and device, server and readable storage medium - Google Patents
- Publication number
- CN111274472A (application number CN201811475043.XA)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Embodiments of the present application provide an information recommendation method, an information recommendation device, a server, and a readable storage medium. A plurality of positive samples and a plurality of unknown samples in a training sample set are subjected to gravity center clustering over at least one iteration cycle, yielding more accurate negative samples that do not match a target service event. This avoids the prediction errors that arise when samples that actually match the target service event are learned as negative samples during training of the information recommendation model. The trained information recommendation model then predicts the matching degree between each sample to be predicted and the target service event, which determines whether information related to the target service event needs to be recommended to the service requester terminal of that sample. This can improve the accuracy of information recommendation and provide more convenient and efficient target services to the service requester.
Description
Technical Field
The present application relates to the field of computer technologies, and in particular, to an information recommendation method, an information recommendation apparatus, a server, and a readable storage medium.
Background
At present, with the popularization of intelligent terminals, applications (APPs) providing everyday convenience services emerge one after another, serving people's basic daily needs such as food and clothing. These applications can collect a large amount of user service record information each day. This information reflects users' habits and preferences, and can also be used to predict user behavior that is unrelated to the services the application itself provides. However, how to determine from the service record information of a large number of users whether each user is interested in a given pre-recommended target service, so that information related to the matching target service can be recommended to that user in a targeted manner, remains a technical problem to be solved by those skilled in the art.
Disclosure of Invention
In view of this, an object of the embodiments of the present application is to provide an information recommendation method, an apparatus, a server and a readable storage medium, which can recommend matched information related to a target service to a service requester in a targeted manner.
According to an aspect of embodiments of the present application, there is provided an electronic device that may include one or more storage media and one or more processors in communication with the storage media. The one or more storage media store machine-readable instructions executable by the processor. When the electronic device operates, the processor communicates with the storage medium through a bus and executes the machine-readable instructions to perform the information recommendation method.
According to another aspect of the embodiments of the present application, there is provided an information recommendation method applied to a server, where the method may include:
obtaining a training sample set, the training sample set comprising a plurality of positive samples and a plurality of unknown samples;
obtaining a plurality of negative samples which are not matched with a target service event through gravity center clustering of at least one iteration cycle based on the plurality of positive samples and the plurality of unknown samples, wherein each positive sample is matched with the target service event;
training according to the positive samples and the negative samples to obtain an information recommendation model, and inputting each sample to be predicted into the information recommendation model obtained through training to obtain the matching degree of each sample to be predicted and the target service event;
and for each sample to be predicted, determining whether information related to the target service event needs to be recommended to a service requester terminal of the sample to be predicted or not according to the matching degree of the sample to be predicted and the target service event.
In a possible implementation, the step of obtaining a training sample set may include:
obtaining a plurality of initial positive samples and a plurality of initial unknown samples;
and normalizing the sample characteristics of each initial positive sample and each initial unknown sample to obtain a plurality of positive samples and a plurality of unknown samples in the same value space range.
In one possible embodiment, the step of obtaining a plurality of initial positive samples and a plurality of initial unknown samples may include:
obtaining first service record information of a plurality of first users matched with the target service event and second service record information of a plurality of second users except the first users;
extracting feature information from the first service record information to obtain a plurality of initial positive samples, and extracting feature information from the second service record information to obtain a plurality of initial unknown samples.
In a possible implementation, the step of normalizing the sample characteristics of each initial positive sample and each initial unknown sample to obtain a plurality of positive samples and a plurality of unknown samples within the same value space range may include:
acquiring the default rate (that is, the proportion of missing values) of each sample feature across the initial positive samples and the initial unknown samples;
filling in the missing values of sample features whose default rate is lower than a first default threshold, and eliminating sample features whose default rate is greater than a second default threshold, to obtain a plurality of processed initial positive samples and a plurality of processed initial unknown samples;
and mapping the plurality of initial positive samples and the plurality of initial unknown samples obtained by processing into the same value space range to obtain a plurality of positive samples and a plurality of unknown samples in the same value space range.
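The normalization steps above can be sketched as follows. This is a minimal illustration, not the patented implementation. It reads the "default value" of a feature as its proportion of missing entries, which is an assumption. It also collapses the first and second thresholds into one hypothetical parameter `max_missing_rate`, fills gaps with the column mean, and uses min-max scaling to [0, 1] as the common value range. None of these choices is specified by the source.

```python
from typing import List, Optional


def preprocess(rows: List[List[Optional[float]]],
               max_missing_rate: float = 0.5) -> List[List[float]]:
    """Drop sparse features, fill remaining gaps, and scale each feature to [0, 1].

    `None` marks a missing entry. The single threshold and the mean-fill
    strategy are assumptions made for this sketch.
    """
    n = len(rows)
    kept_columns = []
    for j in range(len(rows[0])):
        present = [r[j] for r in rows if r[j] is not None]
        rate = 1 - len(present) / n
        if rate > max_missing_rate or not present:
            continue  # eliminate features that are missing too often
        mean = sum(present) / len(present)
        filled = [r[j] if r[j] is not None else mean for r in rows]
        lo, hi = min(filled), max(filled)
        span = hi - lo
        # Constant columns carry no information; map them to 0.0.
        kept_columns.append([(v - lo) / span if span else 0.0 for v in filled])
    # Transpose the kept columns back into one row per sample.
    return [list(row) for row in zip(*kept_columns)]


samples = [[1.0, None, 10.0],
           [3.0, None, None],
           [5.0, 2.0, 30.0]]
normalized = preprocess(samples)
```

In this example the second feature is missing in two of three samples and is eliminated, while the single gap in the third feature is filled before scaling.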
In a possible embodiment, the step of obtaining, through gravity center clustering for at least one iteration cycle based on the plurality of positive samples and the plurality of unknown samples, a plurality of negative samples that do not match the target service event may include:
aiming at each iteration cycle, obtaining a clustered positive cluster and a clustered negative cluster through iteration clustering in the iteration cycle;
for each sample among the plurality of positive samples of the positive cluster and the plurality of negative samples of the negative cluster, calculating a first distance between the sample and the gravity center of the clustered positive cluster and a second distance between the sample and the gravity center of the clustered negative cluster;
and traversing each sample, taking the sample as a positive sample if the first distance calculated for the sample is smaller than the second distance, and taking the sample as a negative sample if the first distance is larger than the second distance.
In a possible implementation manner, the step of obtaining, in each iteration cycle, a clustered positive cluster and a clustered negative cluster through iterative clustering in the iteration cycle may include:
calculating the center of gravity of the plurality of positive samples in a first iteration cycle;
and taking all samples with the distance between the samples and the calculated center of gravity smaller than a first preset threshold value as a first clustered positive cluster, and taking all samples with the distance between the samples and the center of gravity larger than a second preset threshold value as a first clustered negative cluster.
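The first iteration cycle described above might look like the following sketch. It assumes Euclidean distance and takes the gravity center to be the per-feature mean. The source fixes neither choice, and the names `initial_clusters`, `near`, and `far` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Sample = Sequence[float]


def center_of_gravity(samples: Sequence[Sample]) -> Tuple[float, ...]:
    """Per-feature mean of the samples (assumed meaning of 'gravity center')."""
    n = len(samples)
    return tuple(sum(col) / n for col in zip(*samples))


def initial_clusters(positives: Sequence[Sample],
                     candidates: Sequence[Sample],
                     near: float,
                     far: float) -> Tuple[List[Sample], List[Sample]]:
    """Split candidates by distance to the gravity center of the known positives.

    `near` plays the role of the first preset threshold and `far` the second.
    Samples whose distance lies between the two land in neither cluster.
    """
    center = center_of_gravity(positives)
    pos = [s for s in candidates if math.dist(s, center) < near]
    neg = [s for s in candidates if math.dist(s, center) > far]
    return pos, neg


positives = [(0.0, 0.0), (2.0, 0.0)]
unknown = [(1.0, 1.0), (10.0, 0.0), (4.0, 0.0)]
first_pos, first_neg = initial_clusters(positives, positives + unknown,
                                        near=1.5, far=5.0)
```

Using two thresholds rather than one leaves an undecided band between them, so only samples that are clearly close or clearly far seed the first clusters.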
In a possible embodiment, the step of obtaining, through gravity center clustering for at least one iteration cycle based on the plurality of positive samples and the plurality of unknown samples, a plurality of negative samples that do not match the target service event may include:
all samples with the distances to the centers of gravity of the plurality of positive samples smaller than a first preset threshold value are taken as a first clustered positive cluster, and all samples with the distances to the centers of gravity of the plurality of positive samples larger than a second preset threshold value are taken as a first clustered negative cluster;
calculating a first distance between each sample and a center of gravity of the first positive cluster and a second distance between each sample and a center of gravity of the first negative cluster, among a plurality of positive samples of the first positive cluster and a plurality of negative samples of the first negative cluster;
traversing each sample, if the first distance calculated by the sample is smaller than the second distance, taking the sample as a positive sample, and if the first distance is larger than the second distance, taking the sample as a negative sample to obtain a corresponding second positive cluster and a corresponding second negative cluster;
calculating a first distance between each sample and a center of gravity of the second positive cluster and a second distance between each sample and a center of gravity of the second negative cluster among a plurality of positive samples of the second positive cluster and a plurality of negative samples of the second negative cluster;
traversing each sample, if the first distance calculated by the sample is smaller than the second distance, taking the sample as a positive sample, and if the first distance is larger than the second distance, taking the sample as a negative sample to obtain a third positive cluster and a third negative cluster which correspond to each other;
and taking the third positive cluster as a new second positive cluster, taking the third negative cluster as a new second negative cluster, returning to the step of calculating a first distance between each sample and the gravity center of the second positive cluster and a second distance between each sample and the gravity center of the second negative cluster from the plurality of positive samples of the second positive cluster and the plurality of negative samples of the second negative cluster until an iteration stop condition is met, and taking each negative sample in the finally obtained third negative cluster as a negative sample which is not matched with the target service event.
In a possible embodiment, the iteration stop condition comprises at least one of the following conditions:
the positive samples in the third positive cluster and the negative samples in the third negative cluster no longer change;
the iteration times reach the set times;
the moving distance of the centers of gravity of the third positive cluster and the third negative cluster is less than a set distance.
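The iterative reassignment and its three stop conditions can be illustrated with the sketch below. It assumes Euclidean distance and mean-based gravity centers. A sample that is equidistant from both centers keeps its previous label, and the loop ends early if a cluster becomes empty. The source does not address either case, and `refine_clusters`, `max_iters`, and `min_shift` are hypothetical names.

```python
import math
from typing import List, Sequence, Tuple

Sample = Sequence[float]


def center_of_gravity(samples: Sequence[Sample]) -> Tuple[float, ...]:
    """Per-feature mean of the samples."""
    n = len(samples)
    return tuple(sum(col) / n for col in zip(*samples))


def refine_clusters(pos: Sequence[Sample], neg: Sequence[Sample],
                    max_iters: int = 100,
                    min_shift: float = 1e-6) -> Tuple[List[Sample], List[Sample]]:
    """Reassign every sample to the nearer gravity center until a stop condition holds."""
    samples = list(pos) + list(neg)
    is_pos = [True] * len(pos) + [False] * len(neg)

    def split(flags):
        p = [s for s, f in zip(samples, flags) if f]
        n = [s for s, f in zip(samples, flags) if not f]
        return p, n

    for _ in range(max_iters):                      # stop: iteration count reached
        p, n = split(is_pos)
        if not p or not n:
            break                                   # assumption: stop if a cluster empties
        cp, cn = center_of_gravity(p), center_of_gravity(n)
        updated = []
        for s, f in zip(samples, is_pos):
            d1, d2 = math.dist(s, cp), math.dist(s, cn)
            updated.append(f if d1 == d2 else d1 < d2)
        unchanged = updated == is_pos
        is_pos = updated
        if unchanged:
            break                                   # stop: membership no longer changes
        p2, n2 = split(is_pos)
        if p2 and n2:
            shift = max(math.dist(cp, center_of_gravity(p2)),
                        math.dist(cn, center_of_gravity(n2)))
            if shift < min_shift:
                break                               # stop: gravity centers barely moved
    return split(is_pos)


final_pos, final_neg = refine_clusters(
    [(0.0, 0.0), (1.0, 0.0), (6.0, 0.0)],
    [(9.0, 0.0), (10.0, 0.0)],
)
```

In this example the sample at (6, 0) starts in the positive cluster but lies closer to the negative gravity center, so it moves to the negative cluster in the first pass and the second pass confirms that nothing else changes.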
In a possible implementation, the step of training to obtain an information recommendation model according to the positive samples and the negative samples may include:
and training a support vector machine model according to the positive samples and the negative samples, and taking the support vector machine model obtained by training as the information recommendation model.
In a possible implementation, the step of training to obtain an information recommendation model according to the positive samples and the negative samples may include:
training a support vector machine model from the plurality of positive samples and the plurality of negative samples;
performing sample selection on the plurality of positive samples and the plurality of negative samples according to the trained support vector machine model to obtain a plurality of target positive samples and a plurality of target negative samples;
and training an ensemble tree model based on the plurality of target positive samples and the plurality of target negative samples, and taking the trained ensemble tree model as the trained information recommendation model.
In one possible implementation, the training of the support vector machine model according to the plurality of positive samples and the plurality of negative samples may include:
mapping the positive samples and the negative samples through a preset kernel function;
inputting each positive sample and each negative sample after mapping transformation into a support vector machine model for training, and calculating the distance between each positive sample and each negative sample after mapping transformation and a hyperplane;
calculating a loss function value of the support vector machine model according to the distance between each positive sample and each negative sample obtained through calculation and the hyperplane;
and adjusting the model parameters and the learning rate of the support vector machine model by a gradient descent method according to the calculated loss function value; then, based on the adjusted support vector machine model, returning to the step of inputting each positive sample and each negative sample after mapping transformation into the support vector machine model for training and calculating the distance between each positive sample and each negative sample after mapping transformation and the hyperplane, until an iteration stop condition is met, and outputting the support vector machine model obtained by training.
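A training loop of this shape can be sketched as a linear support vector machine fitted by batch subgradient descent on a regularized hinge loss. Several points are assumptions: the kernel mapping is taken to be the identity, the signed functional margin y(w·x + b) stands in for the distance to the hyperplane, the learning rate decays by a fixed factor each epoch, and a fixed epoch count serves as the stop condition. The function name and hyperparameter values are illustrative only.

```python
from typing import List, Sequence, Tuple


def train_linear_svm(samples: Sequence[Sequence[float]],
                     labels: Sequence[int],
                     epochs: int = 200,
                     lr: float = 0.1,
                     reg: float = 0.01,
                     decay: float = 0.99) -> Tuple[List[float], float]:
    """Minimize reg/2 * ||w||^2 + mean(max(0, 1 - y * (w.x + b))).

    Labels are +1 for positive samples and -1 for negative samples.
    """
    n, dim = len(samples), len(samples[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [reg * wi for wi in w]
        grad_b = 0.0
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:  # sample violates the margin and contributes to the loss
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
        lr *= decay  # adjust the learning rate along with the parameters
    return w, b


def decision(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Signed score; positive values fall on the positive side of the hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


train_x = [(0.0, 0.0), (0.0, 1.0), (3.0, 3.0), (4.0, 3.0)]
train_y = [-1, -1, 1, 1]
w, b = train_linear_svm(train_x, train_y)
```

A non-linear preset kernel function would replace the raw features with their mapped counterparts before this loop; the update rule itself would stay the same.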
In a possible implementation, the step of performing sample selection on the plurality of positive samples and the plurality of negative samples according to the trained support vector machine model to obtain a plurality of target positive samples and a plurality of target negative samples may include:
inputting each positive sample and each negative sample into a trained support vector machine model to obtain the matching rate of each positive sample and each negative sample with the target service event;
and eliminating the samples with the error values larger than a third preset threshold value according to the matching rate of each positive sample and each negative sample with the target service event to obtain a plurality of target positive samples and a plurality of target negative samples.
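The sample selection step can be sketched as below. The source does not define the error value, so this sketch assumes it is the absolute difference between a sample's label (1 for positive, 0 for negative) and the matching rate in [0, 1] returned by the trained model. The name `select_samples` and the default threshold are hypothetical.

```python
from typing import List, Sequence, Tuple


def select_samples(samples: Sequence[Sequence[float]],
                   labels: Sequence[int],
                   match_rates: Sequence[float],
                   max_error: float = 0.4) -> Tuple[list, list]:
    """Keep only samples whose matching rate is close enough to their label.

    `max_error` plays the role of the third preset threshold.
    """
    target_pos: List[Sequence[float]] = []
    target_neg: List[Sequence[float]] = []
    for sample, label, rate in zip(samples, labels, match_rates):
        if abs(label - rate) > max_error:
            continue  # the model disagrees strongly with the label; drop the sample
        (target_pos if label == 1 else target_neg).append(sample)
    return target_pos, target_neg


kept_pos, kept_neg = select_samples(
    samples=[(0.9, 0.8), (0.2, 0.1), (0.5, 0.5), (0.1, 0.3)],
    labels=[1, 1, 0, 0],
    match_rates=[0.95, 0.30, 0.70, 0.10],
)
```

Here the second sample (labeled positive but scored 0.30) and the third (labeled negative but scored 0.70) are eliminated, leaving a cleaner set for training the next model.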
In a possible implementation manner, the step of determining whether information related to the target service event needs to be recommended to the service requester terminal of the sample to be predicted according to the matching degree of the sample to be predicted and the target service event may include:
judging whether the matching degree of the sample to be predicted and the target service event is greater than a fourth preset threshold value or not;
and if the matching degree of the sample to be predicted and the target service event is greater than a fourth preset threshold value, determining that information related to the target service event needs to be recommended to a service requester terminal of the sample to be predicted.
In one possible embodiment, the method may further include:
and if the fact that the information related to the target service event needs to be recommended to the service requester terminal of the sample to be predicted is determined, adding the sample to be predicted to a plurality of positive samples in the training samples.
In one possible embodiment, the method may further include:
if the matching degree of the sample to be predicted and the target service event is greater than a fifth preset threshold and not greater than a fourth preset threshold, sending prompt information to a service requester terminal of the sample to be predicted so as to prompt a user of the service requester terminal to select whether to receive information related to the target service event;
if first indication information sent by the service requester terminal is received, indicating that the user declines to receive the information related to the target service event, determining that the information related to the target service event does not need to be recommended to the sample to be predicted, and adding the sample to be predicted to the plurality of negative samples which are not matched with the target service event;
if second indication information sent by the service requester terminal is received, indicating that the user agrees to receive the information related to the target service event, determining that the information related to the target service event needs to be recommended to the sample to be predicted, and adding the sample to be predicted to the plurality of positive samples in the training samples;
and if the matching degree of the sample to be predicted and the target service event is not larger than a fifth preset threshold, determining that information related to the target service event does not need to be recommended to the sample to be predicted, and adding the sample to be predicted into a plurality of negative samples which are not matched with the target service event.
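The two-threshold decision logic above can be summarized in a short sketch. The function name `decide`, the callback `ask_user`, and the string labels are hypothetical; `fourth` and `fifth` stand for the fourth and fifth preset thresholds, with `fifth` assumed to be the smaller of the two.

```python
from typing import Callable, Tuple


def decide(match: float, fourth: float, fifth: float,
           ask_user: Callable[[], bool]) -> Tuple[bool, str]:
    """Return (recommend?, which training set the sample is added to)."""
    if match > fourth:
        return True, "positive"            # confident match: recommend directly
    if match > fifth:
        # Uncertain band: prompt the user and follow the answer.
        return (True, "positive") if ask_user() else (False, "negative")
    return False, "negative"               # weak match: do not recommend


high = decide(0.9, fourth=0.8, fifth=0.4, ask_user=lambda: False)
mid_yes = decide(0.6, fourth=0.8, fifth=0.4, ask_user=lambda: True)
mid_no = decide(0.6, fourth=0.8, fifth=0.4, ask_user=lambda: False)
low = decide(0.4, fourth=0.8, fifth=0.4, ask_user=lambda: True)
```

The user is consulted only in the middle band, so confident predictions never trigger a prompt, and every outcome feeds a labeled sample back into the training set.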
In a possible implementation manner, after the step of determining whether information related to the target service event needs to be recommended to the service requester terminal of the sample to be predicted according to the matching degree of the sample to be predicted and the target service event, the method includes:
if the fact that the information related to the target service event needs to be recommended to the service requester terminal of the sample to be predicted is determined, an information acquisition request is sent to each third-party server providing the service of the target service event;
and receiving information which is sent by each third-party server in response to the information acquisition request and is related to the target service event, and recommending the information related to the target service event to the service requester terminal of the sample to be predicted.
According to another aspect of the embodiments of the present application, there is provided an information recommendation apparatus applied to a server, the apparatus may include:
an obtaining module configured to obtain a training sample set, where the training sample set includes a plurality of positive samples and a plurality of unknown samples;
a gravity center clustering module, configured to perform gravity center clustering for at least one iteration cycle based on the multiple positive samples and the multiple unknown samples to obtain multiple negative samples that are not matched with a target service event, where each positive sample is matched with the target service event;
the training module is used for training according to the positive samples and the negative samples to obtain an information recommendation model, inputting each sample to be predicted into the information recommendation model obtained through training, and obtaining the matching degree of each sample to be predicted and the target service event;
and the determining module is used for determining whether information related to the target service event needs to be recommended to the service requester terminal of the sample to be predicted or not according to the matching degree of the sample to be predicted and the target service event for each sample to be predicted.
According to another aspect of embodiments of the present application, there is provided a readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, may perform the steps of the information recommendation method described above.
Based on any one of the above aspects, the multiple positive samples and the multiple unknown samples in the training sample set are clustered through the gravity center of at least one iteration cycle, so that multiple more accurate negative samples unmatched with the target service event can be obtained, and prediction errors caused by learning by taking the samples originally matched with the target service event as the characteristic information of the negative samples in the process of training the information recommendation model are avoided. And then predicting the matching degree of each sample to be predicted and the target service event according to the information recommendation model obtained by training, thereby determining whether information related to the target service event needs to be recommended to the service requester terminal of each sample to be predicted or not, so that the accuracy of information recommendation can be improved, and more convenient and efficient target services are provided for the service requester.
In order to make the aforementioned objects, features and advantages of the embodiments of the present application more comprehensible, embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained from the drawings without inventive effort.
FIG. 1 is a schematic block diagram illustrating interaction of an information recommendation system provided by an embodiment of the present application;
FIG. 2 illustrates a schematic diagram of exemplary hardware and software components of an electronic device that may implement the server, the service requester terminal, and the service provider terminal of FIG. 1 provided by an embodiment of the present application;
FIG. 3 is a flowchart illustrating an information recommendation method according to an embodiment of the present application;
FIG. 4 is a flowchart illustrating step S120 in the information recommendation method illustrated in FIG. 3;
FIG. 5 is a second flowchart illustrating an information recommendation method according to an embodiment of the present application;
FIG. 6 is a block diagram of functional modules of an information recommendation device provided in an embodiment of the present application;
FIG. 7 shows a second functional block diagram of an information recommendation device according to an embodiment of the present application.
Detailed Description
In order to make the purpose, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it should be understood that the drawings in the present application are for illustrative and descriptive purposes only and are not used to limit the scope of protection of the present application. Additionally, it should be understood that the schematic drawings are not necessarily drawn to scale. The flowcharts used in this application illustrate operations implemented according to some of the embodiments of the present application. It should be understood that the operations of the flow diagrams may be performed out of order, and steps without logical context may be performed in reverse order or simultaneously. One skilled in the art, under the guidance of this application, may add one or more other operations to, or remove one or more operations from, the flowchart.
In addition, the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
To enable those skilled in the art to utilize the present disclosure, the following embodiments are presented in conjunction with a specific application scenario, the "online ride-hailing scenario". It will be apparent to those skilled in the art that the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the application. Although the present application is described primarily in the context of the "online ride-hailing scenario", it should be understood that this is only one exemplary embodiment. The application can be applied to any other type of transportation. For example, the present application may be applied to different transportation system environments, including terrestrial, marine, or airborne environments, among others, or any combination thereof. The vehicle of the transportation system may include a taxi, a private car, a ride-share (hitch) car, a bus, a train, a bullet train, a high speed rail, a subway, a ship, an airplane, a spacecraft, a hot air balloon, or an unmanned vehicle, etc., or any combination thereof. The application may also include any service system for online ride-hailing, for example, a system for sending and/or receiving express delivery, and a service system for transactions between buyers and sellers. Applications of the system or method of the present application may include web pages, plug-ins for browsers, client terminals, customization systems, internal analysis systems, or artificial intelligence robots, among others, or any combination thereof.
It should be noted that in the embodiments of the present application, the term "comprising" is used to indicate the presence of the features stated hereinafter, but does not exclude the addition of further features.
The terms "passenger," "requestor," "service person," "service requestor," and "customer" are used interchangeably in this application to refer to an individual, entity, or tool that can request or order a service. The terms "driver," "provider," "service provider," and "provider" are used interchangeably in this application to refer to an individual, entity, or tool that can provide a service. The term "user" in this application may refer to an individual, entity or tool that requests a service, subscribes to a service, provides a service, or facilitates the provision of a service. For example, the user may be a passenger, a driver, an operator, etc., or any combination thereof. In the present application, "passenger" and "passenger terminal" may be used interchangeably, and "driver" and "driver terminal" may be used interchangeably.
In order to solve at least one technical problem described in the background of the present application, the inventors found through careful research that current practice works as follows. A large number of positive samples matched with a target service event are collected, and all remaining samples, for which it is uncertain whether they match the target service event, are used as negative samples. An information recommendation model is trained on the collected positive and negative samples, and the trained model is then used to identify whether each service requester is interested in the target service, so that information related to the target service can be recommended in a targeted manner to the service requesters who are interested in it.
As one example, consider an online ride-hailing application (e.g., DiDi Chuxing). The application can collect a large amount of travel service record information each day, and this information includes behavior information of service requesters (e.g., passengers, drivers, etc.), such as usage habits and behavior preferences. The travel service record information can also be used to predict behavior of a service requester that is unrelated to ride-hailing. For example, it may reveal the service requester's intention to buy or rent a car; if that intention can be determined, service requesters who intend to buy or rent a car can be identified and given corresponding, targeted information recommendations.
For the above example, the current scheme is as follows. First, a large amount of travel service record information is collected from service requesters who are known to intend to buy or rent a car, and feature information is extracted from it to obtain positive samples. Meanwhile, travel service record information is collected from other service requesters, for whom it is uncertain whether they intend to buy or rent a car, and feature information is extracted from it to obtain negative samples. Then, an information recommendation model is trained on the collected positive and negative samples, and the trained model is used to identify whether each service requester intends to buy or rent a car, so that information related to buying or renting a car can be recommended in a targeted manner to the service requesters interested in the target service.
However, after careful study, the inventors found that if all samples other than the positive samples are treated as negative samples when training the information recommendation model, the prediction results contain large errors. The negative samples identified in the current scheme are not all samples that fail to match the target service event; they also include samples that do match it. As a result, the feature information of samples that actually match the target service event, that is, part of the feature information of positive samples, may be mistakenly learned as feature information of negative samples during training, which inevitably causes prediction errors in the resulting information recommendation model.
It should be noted that the shortcomings of the above prior-art solutions were identified by the inventors through practice and careful study. Therefore, both the discovery of these problems and the solutions proposed for them in the following embodiments should be regarded as contributions made by the inventors in the course of the present application.
Based on the inventors' research on the above technical problems, embodiments of the present application provide an information recommendation method, an apparatus, a server, and a readable storage medium. By performing gravity center clustering on a plurality of positive samples and a plurality of unknown samples in a training sample set over at least one iteration cycle, a plurality of more accurate negative samples that do not match a target service event can be obtained, which avoids the prediction errors caused when samples that actually match the target service event are learned as negative samples during training of the information recommendation model. The matching degree between each sample to be predicted and the target service event is then predicted with the trained information recommendation model, so as to determine whether information related to the target service event needs to be recommended to the service requester terminal corresponding to each sample to be predicted. This improves the accuracy of information recommendation and provides more convenient and efficient target services for service requesters.
Fig. 1 is a schematic diagram of an architecture of an information recommendation system 100 according to an alternative embodiment of the present application. For example, the information recommendation system 100 may be an online transportation service platform relied upon for transportation services such as taxi, designated driving service, express service, carpooling service, bus service, driver rental service, or regular service, or a combination of any of the above. The information recommendation system 100 may include a server 110, a network 120, a service requester terminal 130, a service provider terminal 140, and a database 150, and the server 110 may include a processor for performing an instruction operation therein. The information recommendation system 100 shown in fig. 1 is only one possible example, and in other possible embodiments, the information recommendation system 100 may include only one of the components shown in fig. 1 or may also include other components.
In some embodiments, the server 110 may be a single server or a group of servers. The set of servers can be centralized or distributed (e.g., the servers 110 can be a distributed system). In some embodiments, the server 110 may be local or remote to the terminal. For example, the server 110 may access information stored in the service requester terminal 130, the service provider terminal 140, and the database 150, or any combination thereof, via the network 120. As another example, the server 110 may be directly connected to at least one of the service requester terminal 130, the service provider terminal 140, and the database 150 to access information and/or data stored therein. In some embodiments, the server 110 may be implemented on a cloud platform; by way of example only, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud (community cloud), a distributed cloud, an inter-cloud, a multi-cloud, and the like, or any combination thereof. In some embodiments, the server 110 may be implemented on an electronic device 200 having one or more of the components shown in FIG. 2 in the present application.
In some embodiments, the server 110 may include a processor. The processor may process information and/or data related to the service request to perform one or more of the functions described herein. For example, in an express service, the processor may determine the target vehicle based on a service request obtained from the service requester terminal 130. A processor may include one or more processing cores (e.g., a single-core processor or a multi-core processor). Merely by way of example, a processor may include a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), an Application Specific Instruction Set Processor (ASIP), a Graphics Processing Unit (GPU), a Physical Processing Unit (PPU), a Digital Signal Processor (DSP), a Field Programmable Gate Array (FPGA), a Programmable Logic Device (PLD), a controller, a microcontroller unit, a Reduced Instruction Set Computer (RISC), a microprocessor, or the like, or any combination thereof.
In some embodiments, the user of the service requestor terminal 130 may be someone other than the actual demander of the service. For example, the user a of the service requester terminal 130 may use the service requester terminal 130 to initiate a service request for the service actual demander B (for example, the user a may call a car for his friend B), or receive service information or instructions from the server 110. In some embodiments, the user of the service provider terminal 140 may be the actual provider of the service or may be another person than the actual provider of the service. For example, user C of the service provider terminal 140 may use the service provider terminal 140 to receive a service request serviced by the service provider entity D (e.g., user C may pick up an order for driver D employed by user C), and/or information or instructions from the server 110. In some embodiments, "service requester" and "service requester terminal" may be used interchangeably, and "service provider" and "service provider terminal" may be used interchangeably.
In some embodiments, the service requester terminal 130 may comprise a mobile device, a tablet computer, a laptop computer, or a built-in device in a motor vehicle, etc., or any combination thereof. In some embodiments, the mobile device may include a smart home device, a wearable device, a smart mobile device, a virtual reality device, an augmented reality device, or the like, or any combination thereof. In some embodiments, the smart home devices may include smart lighting devices, control devices for smart electrical devices, smart monitoring devices, smart televisions, smart cameras, or walkie-talkies, or the like, or any combination thereof. In some embodiments, the wearable device may include a smart bracelet, smart shoelaces, smart glasses, a smart helmet, a smart watch, a smart garment, a smart backpack, a smart accessory, and the like, or any combination thereof. In some embodiments, the smart mobile device may include a smartphone, a Personal Digital Assistant (PDA), a gaming device, a navigation device, or a point of sale (POS) device, or the like, or any combination thereof. In some embodiments, the virtual reality device and/or the augmented reality device may include a virtual reality helmet, virtual reality glasses, a virtual reality patch, an augmented reality helmet, augmented reality glasses, an augmented reality patch, or the like, or any combination thereof. For example, the virtual reality device and/or augmented reality device may include various virtual reality products and the like. In some embodiments, the built-in devices in the motor vehicle may include an on-board computer, an on-board television, and the like.
In some embodiments, a database 150 may be connected to the network 120 to communicate with one or more components of the information recommendation system 100 (e.g., the server 110, the service requestor terminal 130, the service provider terminal 140, etc.). One or more components in the information recommendation system 100 may access data or instructions stored in the database 150 via the network 120. In some embodiments, the database 150 may be directly connected to one or more components in the information recommendation system 100 (e.g., the server 110, the service requestor terminal 130, the service provider terminal 140, etc.); alternatively, in some embodiments, database 150 may also be part of server 110.
In some embodiments, one or more components (e.g., server 110, service requestor terminal 130, service provider terminal 140, etc.) in the information recommendation system 100 may have access to the database 150. In some embodiments, one or more components in the information recommendation system 100 may read and/or modify information related to a service requestor, a service provider, or the public, or any combination thereof, when certain conditions are met. For example, server 110 may read and/or modify information for one or more users after receiving a service request.
In some embodiments, the exchange of information by one or more components in the information recommendation system 100 may be accomplished by requesting a service. The object of the service request may be any product. In some embodiments, the product may be a tangible product or an intangible product. Tangible products may include food, pharmaceuticals, commodities, chemical products, appliances, clothing, automobiles, homes, or luxury goods, and the like, or any combination thereof. Intangible products may include a service product, a financial product, a knowledge product, an internet product, or the like, or any combination thereof. The internet product may include a stand-alone host product, a network product, a mobile internet product, a commercial host product, an embedded product, or the like, or any combination thereof. The internet product may be used in software, programs, or systems of a mobile terminal, etc., or any combination thereof. The mobile terminal may include a tablet, a laptop, a mobile phone, a Personal Digital Assistant (PDA), a smart watch, a point of sale (POS) device, a vehicle-mounted computer, a vehicle-mounted television, a wearable device, or the like, or any combination thereof. The internet product may be, for example, any software and/or application used in a computer or mobile phone. The software and/or applications may relate to social interaction, shopping, transportation, entertainment, learning, or investment, or the like, or any combination thereof. In some embodiments, the transportation-related software and/or applications may include travel software and/or applications, vehicle dispatch software and/or applications, mapping software and/or applications, and the like.
In the vehicle dispatch software and/or application, the vehicle may include a horse, a carriage, a human-powered vehicle (e.g., unicycle, bicycle, tricycle, etc.), an automobile (e.g., taxi, bus, private car, etc.), a train, a subway, a ship, an aircraft (e.g., airplane, helicopter, space shuttle, rocket, hot air balloon, etc.), etc., or any combination thereof.
Fig. 2 is a schematic diagram of exemplary hardware and software components of an electronic device 200, provided by some embodiments of the present application, on which the server 110, the service requester terminal 130, and the service provider terminal 140 may be implemented to realize the concepts of the present application. For example, the processor 220 may be used on the electronic device 200 to perform the functions described herein.
The electronic device 200 may be a general-purpose computer or a special-purpose computer, both of which may be used to implement the information recommendation method of the present application. Although only a single computer is shown, for convenience, the functions described herein may be implemented in a distributed fashion across multiple similar platforms to balance processing loads.
For example, the electronic device 200 may include a network port 210 connected to a network, one or more processors 220 for executing program instructions, a communication bus 230, and storage media 240 of different forms, such as a disk, ROM, or RAM, or any combination thereof. Illustratively, the computer platform may also include program instructions stored in ROM, RAM, or other types of non-transitory storage media, or any combination thereof. The method of the present application may be implemented in accordance with these program instructions. The electronic device 200 also includes an Input/Output (I/O) interface 250 between the computer and other input/output devices (e.g., keyboard, display screen).
For ease of illustration, only one processor is depicted in the electronic device 200. However, it should be noted that the electronic device 200 in the present application may also comprise a plurality of processors, and thus the steps described in the present application as performed by one processor may also be performed by a plurality of processors jointly or separately. For example, if the processor of the electronic device 200 executes steps A and B, it should be understood that steps A and B may also be executed by two different processors, either jointly or separately. For example, a first processor performs step A and a second processor performs step B, or the first processor and the second processor perform steps A and B together.
Fig. 3 is a flowchart illustrating an information recommendation method according to some embodiments of the present application, which may be performed by the server 110 shown in fig. 1. It should be understood that, in other embodiments, the order of some steps in the information recommendation method described in this embodiment may be interchanged according to actual needs, or some steps may be omitted or deleted. The detailed steps of the information recommendation method are described below.
Step S110, a training sample set is obtained.
In this embodiment, the training sample set may include a plurality of positive samples and a plurality of unknown samples. In an alternative embodiment, the step S110 may obtain the training sample set by:
First, a plurality of initial positive samples and a plurality of initial unknown samples are obtained. For example, first service record information of a plurality of first users who are matched with the target service event, and second service record information of a plurality of second users other than the first users, may be obtained. Feature information is extracted from the first service record information to obtain the plurality of initial positive samples, and feature information is extracted from the second service record information to obtain the plurality of initial unknown samples.
Taking an online ride-hailing application as an example, assume that the target service event is buying or renting a car. All users who intend to buy or rent a car can be taken as the plurality of first users. After the first users and a plurality of second users other than the first users are determined, feature information of each first user (such as usage habits, behavior preferences, age, years of driving experience, gender, and the like) is extracted from the first travel orders generated by that first user (such as travel orders generated when driving, taking a hitch ride, and the like) to serve as the plurality of initial positive samples. Meanwhile, feature information of each second user is extracted from the second travel orders generated by that second user to serve as the plurality of initial unknown samples.
Still taking the online ride-hailing application as an example, the first user or the second user may be any user of the application, such as a driver, a passenger, a bicycle user, or the like.
On this basis, the sample features of each initial positive sample and each initial unknown sample are normalized to obtain a plurality of positive samples and a plurality of unknown samples within the same value range.
As an embodiment, before the sample features of each initial positive sample and each initial unknown sample are normalized, the missing-value rate of each sample feature of each initial positive sample and each initial unknown sample may be obtained. Then, missing values are filled in for the sample features whose missing-value rate is lower than a first missing-value threshold, and the sample features whose missing-value rate is greater than a second missing-value threshold are eliminated, yielding a plurality of processed initial positive samples and a plurality of processed initial unknown samples. With this design, noise present in the initial positive samples and the initial unknown samples can be removed, which avoids training errors caused by that noise later on.
The missing values of a sample feature whose missing-value rate is lower than the first missing-value threshold may be filled in as follows: with the average value of that sample feature; or with any value between the highest value and the lowest value of that sample feature; or with the median value between the highest value and the lowest value of that sample feature, and so on. This embodiment places no limitation on the filling method.
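The filtering and filling described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes that the "default value" of a feature means the fraction of samples in which that feature is missing, that missing entries are marked with `None`, and that the mean is used for filling. The function name `clean_features` and both threshold values are hypothetical.

```python
from statistics import mean

def clean_features(samples, fill_below=0.3, drop_above=0.6):
    """Fill or drop feature columns according to their missing-value rate.

    samples: list of equal-length rows; None marks a missing value.
    fill_below / drop_above: illustrative thresholds (assumptions).
    """
    n_rows = len(samples)
    kept_columns = []
    for j in range(len(samples[0])):
        column = [row[j] for row in samples]
        present = [v for v in column if v is not None]
        missing_rate = 1 - len(present) / n_rows
        if missing_rate > drop_above:
            continue  # too sparse: eliminate this feature
        if missing_rate < fill_below and present:
            fill = mean(present)  # mean filling; the median is another option
            column = [fill if v is None else v for v in column]
        kept_columns.append(column)
    # Transpose the kept columns back into rows.
    return [list(row) for row in zip(*kept_columns)]

rows = [[1.0, None, 5.0], [3.0, None, None], [None, None, 7.0], [2.0, 4.0, 6.0]]
cleaned = clean_features(rows)
```

Features whose missing rate falls between the two thresholds are left untouched here, because the text does not say how they are handled.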
Then, the processed multiple initial positive samples and multiple initial unknown samples may be mapped into the same value space range, resulting in multiple positive samples and multiple unknown samples within the same value space range.
For example, the processed initial positive samples and initial unknown samples may be mapped to the same value range using any feasible normalization method, such as min-max (dispersion) normalization, a normal distribution density function, or a non-linear function.
Step S120, based on the positive samples and the unknown samples, a plurality of negative samples that do not match the target service event are obtained through gravity center clustering over at least one iteration cycle.
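As an illustration of mapping samples into one value range, the following sketch applies min-max (dispersion) normalization per feature. It assumes the positive and unknown samples are passed in together so that they share the same per-feature minimum and maximum; the function name is hypothetical.

```python
def min_max_normalize(samples):
    """Scale every feature column into [0, 1] using its own min and max."""
    columns = list(zip(*samples))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    normalized = []
    for row in samples:
        normalized.append([
            # A constant column carries no information; map it to 0.0.
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, lows, highs)
        ])
    return normalized

scaled = min_max_normalize([[20, 1], [30, 3], [40, 5]])
```

Normalizing both sample groups in one call keeps the distances used later in the clustering step comparable across features.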
In a possible implementation, in each iteration cycle, a clustered positive cluster and a clustered negative cluster are obtained through iterative clustering. For each sample among the positive samples of the positive cluster and the negative samples of the negative cluster, a first distance between the sample and the center of gravity of the positive cluster and a second distance between the sample and the center of gravity of the negative cluster are calculated. On this basis, every sample, whether positive or negative, is traversed: if its first distance is smaller than its second distance, the sample is taken as a positive sample; if its first distance is greater than its second distance, the sample is taken as a negative sample.
By iteratively clustering over a plurality of iteration cycles in this way, inaccurate negative samples can be filtered out in each cycle, which avoids the prediction errors caused when samples that actually match the target service event are mistakenly learned as negative samples during training of the information recommendation model.
In addition, in the gravity center clustering process, in a first iteration cycle, the gravity centers of a plurality of positive samples are calculated, all samples, the distance between which and the calculated gravity centers is smaller than a first preset threshold value, are taken as a first positive cluster after clustering, and all samples, the distance between which and the gravity centers is larger than a second preset threshold value, are taken as a first negative cluster after clustering. In each subsequent iteration cycle, a new positive cluster and a new negative cluster are determined.
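The first iteration cycle can be sketched as below. This is an illustrative reading of the text: it assumes Euclidean distance, assumes "all samples" means the positive and unknown samples together, and uses hypothetical names and threshold values. Samples whose distance lies between the two thresholds belong to neither cluster.

```python
import math

def centroid(points):
    """Center of gravity: the per-dimension mean of the points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def initial_clusters(positives, unknowns, near_threshold, far_threshold):
    """Split all samples by their distance to the positives' centroid."""
    center = centroid(positives)
    all_samples = positives + unknowns
    positive_cluster = [s for s in all_samples
                        if math.dist(s, center) < near_threshold]
    negative_cluster = [s for s in all_samples
                        if math.dist(s, center) > far_threshold]
    return positive_cluster, negative_cluster

pos, neg = initial_clusters(
    positives=[[0.0, 0.0], [0.0, 2.0]],
    unknowns=[[0.0, 1.5], [10.0, 1.0], [3.0, 1.0]],
    near_threshold=1.5,
    far_threshold=5.0,
)
```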
On the basis of the above, the gravity center clustering process will be further described below by using a specific example.
Fig. 4 shows a process for performing the gravity center clustering of step S120 according to an embodiment of the present application. Each substep included in step S120 is described in detail below.
Substep S121, all samples whose distance to the center of gravity of the plurality of positive samples is smaller than a first preset threshold are taken as a first positive cluster, and all samples whose distance to the center of gravity of the plurality of positive samples is larger than a second preset threshold are taken as a first negative cluster.
Substep S122, for each sample among the positive samples of the first positive cluster and the negative samples of the first negative cluster, a first distance between the sample and the center of gravity of the first positive cluster and a second distance between the sample and the center of gravity of the first negative cluster are calculated.
Substep S123, each sample is traversed; if the first distance calculated for the sample is smaller than the second distance, the sample is taken as a positive sample, and if the first distance is greater than the second distance, the sample is taken as a negative sample, so as to obtain a corresponding second positive cluster and a corresponding second negative cluster.
Substep S124, for each sample among the positive samples of the second positive cluster and the negative samples of the second negative cluster, a first distance between the sample and the center of gravity of the second positive cluster and a second distance between the sample and the center of gravity of the second negative cluster are calculated.
Substep S125, each sample is traversed; if the first distance calculated for the sample is smaller than the second distance, the sample is taken as a positive sample, and if the first distance is greater than the second distance, the sample is taken as a negative sample, so as to obtain a corresponding third positive cluster and a corresponding third negative cluster.
Substep S126, the third positive cluster is taken as a new second positive cluster, and the third negative cluster is taken as a new second negative cluster.
Substep S127, it is determined whether an iteration stop condition is satisfied. If not, the process returns to substep S124. If so, each negative sample in the finally obtained third negative cluster is taken as a negative sample that does not match the target service event.
Thus, by repeating the iterative clustering process of substeps S124 to S127, the centers of gravity of the positive cluster and the negative cluster are continuously updated and new positive and negative clusters are continuously formed, so that the positive samples in each new positive cluster match the target service event increasingly well, while the negative samples in each new negative cluster match it increasingly poorly. Finally, when the iteration stop condition is satisfied, each negative sample in the finally obtained third negative cluster is taken as a negative sample that does not match the target service event.
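The repeated reassignment described above can be sketched as a short loop. This is a simplified illustration under stated assumptions: Euclidean distance, a sample at equal distance from both centers stays in its current cluster (the text does not cover ties), and the loop stops when membership no longer changes, when a cluster would become empty, or after a hypothetical `max_iterations`.

```python
import math

def centroid(points):
    """Center of gravity: the per-dimension mean of the points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def refine_clusters(positive_cluster, negative_cluster, max_iterations=100):
    """Reassign samples to the nearer center of gravity until stable."""
    for _ in range(max_iterations):
        pos_center = centroid(positive_cluster)
        neg_center = centroid(negative_cluster)
        new_pos, new_neg = [], []
        for cluster, was_positive in ((positive_cluster, True),
                                      (negative_cluster, False)):
            for sample in cluster:
                first = math.dist(sample, pos_center)   # first distance
                second = math.dist(sample, neg_center)  # second distance
                if first < second or (first == second and was_positive):
                    new_pos.append(sample)
                else:
                    new_neg.append(sample)
        if not new_pos or not new_neg:
            break  # keep the previous clusters rather than an empty one
        if new_pos == positive_cluster and new_neg == negative_cluster:
            break  # membership unchanged: clusters have converged
        positive_cluster, negative_cluster = new_pos, new_neg
    return positive_cluster, negative_cluster

pos, neg = refine_clusters([[0.0, 0.0], [1.0, 0.0], [6.0, 0.0]],
                           [[10.0, 0.0], [9.0, 0.0], [4.0, 0.0]])
```

In this toy input the sample at 6.0 starts in the positive cluster and the sample at 4.0 in the negative cluster; after one pass they swap, and the second pass confirms that nothing else moves.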
Wherein the iteration stop condition may include at least one of the following conditions:
1) the positive samples in the third positive cluster and the negative samples in the third negative cluster do not change any more; 2) the iteration times reach the set times; 3) the moving distance of the centers of gravity of the third positive cluster and the third negative cluster is less than the set distance.
Wherein, in the condition 1), the positive samples in the third positive cluster and the negative samples in the third negative cluster no longer change, indicating that the best positive cluster and negative cluster have been formed, the iteration can be stopped. In the condition 2), in order to save the operation amount, the maximum value of the iteration times may be set, and if the iteration times reaches the set times, the iteration of the iteration cycle may be stopped, and each negative sample obtained finally may be used as a negative sample that does not match the target service event. In condition 3), if the center of gravity of the third positive cluster and the third negative cluster moves by a distance less than the set distance, which indicates that the current positive cluster and negative cluster can substantially satisfy the condition, the iteration may be stopped.
It should be noted that the above-mentioned iteration stop conditions may be used in combination or alternatively, for example, the iteration may be stopped when the positive samples in the third positive cluster and the negative samples in the third negative cluster do not change any more, or the iteration may be stopped when the number of iterations reaches a set number, or the iteration may be stopped when the moving distance of the centers of gravity of the third positive cluster and the third negative cluster is smaller than the set distance. Or, the iteration may be stopped when the iteration number reaches a set number and the moving distance of the center of gravity of the third positive cluster and the third negative cluster is smaller than the set distance.
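One way to combine the three stop conditions is sketched below, stopping as soon as any one of them holds. The dictionary layout, the function name, and the default limits are hypothetical; other combinations (for example requiring two conditions at once) are equally consistent with the text.

```python
import math

def should_stop(prev, curr, iteration, max_iterations=50, min_shift=1e-3):
    """Return True when any of the three stop conditions is met.

    prev / curr: dicts with 'pos_ids' and 'neg_ids' (sets of sample ids)
    and 'pos_center' and 'neg_center' (coordinate lists).
    """
    # Condition 1: cluster membership no longer changes.
    membership_stable = (prev["pos_ids"] == curr["pos_ids"]
                         and prev["neg_ids"] == curr["neg_ids"])
    # Condition 2: the iteration budget is used up.
    budget_spent = iteration >= max_iterations
    # Condition 3: both centers of gravity moved less than min_shift.
    centers_settled = (
        math.dist(prev["pos_center"], curr["pos_center"]) < min_shift
        and math.dist(prev["neg_center"], curr["neg_center"]) < min_shift
    )
    return membership_stable or budget_spent or centers_settled
```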
In addition, in an actual implementation process, the iteration stop condition may not be limited to the above example, and a person skilled in the art may design an iteration stop condition different from the above example according to actual requirements.
Therefore, in this embodiment, by performing gravity center clustering on the plurality of positive samples and the plurality of unknown samples in the training sample set over at least one iteration cycle, a plurality of more accurate negative samples that do not match the target service event can be obtained, which avoids the prediction errors caused when samples that actually match the target service event are learned as negative samples during training of the information recommendation model.
Referring to fig. 3 again, in step S130, an information recommendation model is obtained by training according to a plurality of positive samples and a plurality of negative samples, and each sample to be predicted is input into the information recommendation model obtained by training, so as to obtain a matching degree between each sample to be predicted and the target service event.
On the basis of the scheme, the inventor of the application further finds that the gravity center clustering can only process linear features of each sample, but cannot process nonlinear features of each sample, so that a certain number of noise samples can be introduced into positive samples and negative samples obtained after the gravity center clustering, and the accuracy of a subsequent training information recommendation model is influenced.
To solve the above problem, the inventors, after further study, introduced Support Vector Machine (SVM) training on top of the gravity center clustering in order to handle the nonlinear features of each sample and thereby further reduce noise samples.
For example, in one possible implementation, the support vector machine model may be trained based on a plurality of positive samples and a plurality of negative samples, and the trained support vector machine model may be used as the information recommendation model.
In the above embodiment, training the support vector machine model according to the plurality of positive samples and the plurality of negative samples may be implemented as follows:
First, the plurality of positive samples and the plurality of negative samples are mapped by a predetermined kernel function.
In detail, suppose the support vector machine maps the input space to a high-dimensional feature space through some nonlinear transformation $\varphi(x)$. The dimension of the feature space may be very high. If the solution of the support vector machine uses only inner product operations, and there exists a function $K(x, x')$ in the low-dimensional input space that is exactly equal to the inner product in the high-dimensional space, namely $K(x, x') = \langle \varphi(x), \varphi(x') \rangle$, then the support vector machine does not need to compute the complex nonlinear transformation; the inner product is obtained directly from $K(x, x')$, which greatly simplifies the calculation. Such a function $K(x, x')$ is called a kernel function.
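A small worked check may make the kernel idea concrete. The sketch below uses a degree-2 polynomial kernel on two-dimensional inputs, chosen only because its explicit mapping is easy to write down; it is an assumption for illustration and not one of the kernels the embodiment names. Writing the nonlinear mapping as `phi` and the kernel as `kernel`, the kernel evaluated in the input space equals the inner product of the mapped vectors.

```python
import math

def dot(a, b):
    """Plain inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))

def phi(x):
    """Explicit nonlinear mapping from 2-D input to a 3-D feature space."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def kernel(x, y):
    """Degree-2 polynomial kernel, computed entirely in the input space."""
    return dot(x, y) ** 2

x, y = (1.0, 2.0), (3.0, 4.0)
explicit = dot(phi(x), phi(y))  # inner product after mapping
implicit = kernel(x, y)         # same quantity without ever computing phi
```

Both values equal 121 up to floating-point rounding, since the inner product of `x` and `y` is 11.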
In this embodiment, the kernel function may be selected in the following manner: if the positive samples and the negative samples are linearly separable and the number of features and the number of samples exceed a certain value, a linear kernel function may be selected; if the positive samples and the negative samples are linearly inseparable and the number of features and the number of samples are below a certain value, a Gaussian radial basis function (RBF) kernel function may be selected; if the positive samples and the negative samples are linearly separable and the number of features and the number of samples are uncertain, a Gaussian kernel function may be selected; and a Sigmoid kernel function may be selected when a neural network is ultimately to be generated.
Then, each mapped positive sample and each mapped negative sample is input into the support vector machine model for training, and the distance between each mapped sample and the hyperplane is calculated. Next, a loss function value of the support vector machine model is calculated from these distances, and the model parameters and the learning rate of the support vector machine model are adjusted by a gradient descent method according to the loss function value. Based on the adjusted model, the procedure returns to the step of inputting each mapped positive sample and each mapped negative sample into the support vector machine model and calculating their distances to the hyperplane, until an iteration stop condition is met, at which point the trained support vector machine model is output.
The iteration stop condition may be: the number of iterations reaches a set threshold, the loss function value no longer decreases, and/or the loss function value falls below a set loss function value.
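The training loop and its stop conditions can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the application: it trains a linear model on already-mapped samples, uses the regularized hinge loss (the text does not name a loss function), keeps the learning rate fixed rather than adjusting it, and uses the functional margin as an unnormalized distance to the hyperplane. All function names, parameters and sample values are hypothetical.

```python
import math

def decision(w, b, x):
    """Signed, unnormalized distance of x to the hyperplane w.x + b = 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def distance_to_hyperplane(w, b, x):
    """Geometric distance of x to the hyperplane."""
    return abs(decision(w, b, x)) / math.sqrt(sum(wi * wi for wi in w))

def train_linear_svm(samples, labels, lr=0.1, reg=0.01,
                     max_iters=1000, target_loss=1e-3, tol=1e-9):
    """Subgradient descent on the regularized hinge loss; labels are +1 / -1."""
    n, dim = len(samples), len(samples[0])
    w, b = [0.0] * dim, 0.0
    prev_loss = float("inf")
    for _ in range(max_iters):                      # stop: iteration count reached
        loss = 0.5 * reg * sum(wi * wi for wi in w)
        grad_w, grad_b = [reg * wi for wi in w], 0.0
        for x, y in zip(samples, labels):
            margin = y * decision(w, b, x)
            if margin < 1.0:                        # sample violates the margin
                loss += (1.0 - margin) / n
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        # stop: loss below the set value, or loss no longer decreasing
        if loss < target_loss or prev_loss - loss < tol:
            break
        prev_loss = loss
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

# Toy data standing in for mapped positive (+1) and negative (-1) samples.
X = [(2.0, 2.0), (3.0, 3.0), (2.0, 3.0), (-2.0, -2.0), (-3.0, -1.0), (-1.0, -3.0)]
y = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(X, y)
```

A kernelized model would replace the inner product in `decision` with kernel evaluations against the training samples; the stop conditions stay the same.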
On the basis of the support vector machine model obtained through the training, in another possible implementation manner, the multiple positive samples and the multiple negative samples may be subjected to sample selection according to the trained support vector machine model, so as to obtain multiple target positive samples and multiple target negative samples. And then, training the ensemble tree model based on the multiple target positive samples and the multiple target negative samples, and taking the ensemble tree model obtained through training as a trained information recommendation model.
Optionally, the sample selection on the plurality of positive samples and the plurality of negative samples according to the trained support vector machine model may be performed as follows: each positive sample and each negative sample is input into the trained support vector machine model to obtain the matching rate of each positive sample and each negative sample with the target service event. Then, according to these matching rates, samples whose error values are greater than a third preset threshold are removed to obtain a plurality of target positive samples and a plurality of target negative samples.
In this way, noise samples among the positive samples and the negative samples are further removed by the trained support vector machine model, which further improves the accuracy of the positive samples and the negative samples. On this basis, the ensemble tree model is trained on the more accurate target positive samples and target negative samples, and the trained ensemble tree model is used as the trained information recommendation model.
On this basis, each sample to be predicted is input into the trained information recommendation model to obtain the matching degree of the sample to be predicted and the target service event.
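The sample selection step can be sketched as below. The text does not define the error value, so this sketch assumes it is the absolute difference between a sample's label (1 for positive, 0 for negative) and the matching rate returned by the trained model. The function `select_samples`, the stand-in scoring function `toy_match_rate` and the threshold value are all hypothetical.

```python
import math

def select_samples(samples, labels, match_rate, max_error):
    """Keep only samples whose |label - matching rate| does not exceed max_error."""
    target_pos, target_neg = [], []
    for x, label in zip(samples, labels):
        error = abs(label - match_rate(x))   # assumed definition of the error value
        if error > max_error:
            continue                         # treated as a noise sample and removed
        (target_pos if label == 1 else target_neg).append(x)
    return target_pos, target_neg

def toy_match_rate(x):
    """Stand-in for the trained model: a logistic score on the first feature."""
    return 1.0 / (1.0 + math.exp(-x[0]))

samples = [(3.0,), (-0.5,), (-3.0,), (2.0,)]
labels = [1, 1, 0, 0]
target_pos, target_neg = select_samples(samples, labels, toy_match_rate, max_error=0.5)
```

Here the second sample (labeled positive but scored low) and the fourth (labeled negative but scored high) are dropped, and the remaining samples would go on to train the tree model.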
The sample to be predicted can be understood as feature information of service record information generated by a service requester using the application program, which is collected by the current application program in real time. For example, still taking the current application as a car-booking application, and taking the target service event as a car-buying or car-renting example, the car-booking application may collect, in real time, feature information of each car-booking order generated by the current registered user (e.g., passenger, driver) as a sample to be predicted, and then input the sample to be predicted into the information recommendation model, so as to obtain the interest level of the current registered user in car-buying or car-renting, that is, the matching degree of the sample to be predicted and the target service event.
Step S140, for each sample to be predicted, determining, according to the matching degree between the sample to be predicted and the target service event, whether information related to the target service event needs to be recommended to the service requester terminal 130 of the sample to be predicted.
In a possible implementation manner, it may be determined whether the matching degree of the sample to be predicted and the target service event is greater than a fourth preset threshold, and if the matching degree of the sample to be predicted and the target service event is greater than the fourth preset threshold, it is determined that information related to the target service event needs to be recommended to the service requester terminal 130 of the sample to be predicted.
On this basis, if it is determined that the information related to the target service event needs to be recommended to the service requester terminal 130 of the sample to be predicted, the sample to be predicted is added to the plurality of positive samples in the training samples.
In addition, even when the matching degree between the sample to be predicted and the target service event is not greater than the fourth preset threshold, concluding outright that information related to the target service event does not need to be recommended to the service requester terminal 130 of the sample to be predicted may still be inaccurate. Therefore, in this embodiment, it is further determined whether the matching degree is greater than a fifth preset threshold. That is, if the matching degree is not greater than the fourth preset threshold but is still greater than the fifth preset threshold, a prompt message is sent to the service requester terminal 130 of the sample to be predicted to prompt the user of the service requester terminal 130 to select whether to receive the information related to the target service event. If first indication information sent by the service requester terminal 130, indicating that the information related to the target service event is not to be received, is received, it is determined that the information does not need to be recommended to the service requester terminal 130 of the sample to be predicted, and the sample to be predicted is added to the plurality of negative samples that do not match the target service event. If second indication information sent by the service requester terminal 130, indicating that the information related to the target service event is to be received, is received, it is determined that the information needs to be recommended to the service requester terminal 130 of the sample to be predicted, and the sample to be predicted is added to the plurality of positive samples in the training samples.
If the matching degree of the sample to be predicted and the target service event is not greater than the fifth preset threshold, it may be determined that information related to the target service event does not need to be recommended to the service requester terminal 130 of the sample to be predicted, and the sample to be predicted is added to the plurality of negative samples that do not match the target service event.
Based on the design, whether the service requester terminal 130 of the sample to be predicted needs to recommend the information related to the target service event can be determined according to the matching degree of the sample to be predicted and the target service event, and meanwhile, the sample to be predicted can be used as a positive sample or a negative sample again according to the determination result, so that the number of the positive samples or the negative samples is increased, the training step is returned, and the iterative training is continuously performed, so that the prediction accuracy of the information recommendation model is continuously improved, and more convenient and efficient target services are provided for the service requester.
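The two-threshold decision and the feedback of each decided sample into the training set can be sketched as follows. The names `Decision`, `decide` and `label_for_retraining`, and the threshold values in the usage lines, are hypothetical; the sketch also assumes that the fifth preset threshold is lower than the fourth.

```python
from enum import Enum

class Decision(Enum):
    RECOMMEND = "recommend"   # matching degree above the fourth threshold
    ASK_USER = "ask_user"     # between the fifth and fourth thresholds
    SKIP = "skip"             # not above the fifth threshold

def decide(match_degree, fourth_threshold, fifth_threshold):
    """Map a matching degree to one of the three outcomes described in the text."""
    if match_degree > fourth_threshold:
        return Decision.RECOMMEND
    if match_degree > fifth_threshold:
        return Decision.ASK_USER
    return Decision.SKIP

def label_for_retraining(decision, user_accepts=None):
    """Return 1 if the sample joins the positive samples, 0 if it joins the negatives."""
    if decision is Decision.RECOMMEND:
        return 1
    if decision is Decision.ASK_USER:
        if user_accepts is None:
            raise ValueError("the user's reply to the prompt is required")
        return 1 if user_accepts else 0
    return 0

high = decide(0.9, fourth_threshold=0.8, fifth_threshold=0.5)
middle = decide(0.6, fourth_threshold=0.8, fifth_threshold=0.5)
low = decide(0.3, fourth_threshold=0.8, fifth_threshold=0.5)
```

In the middle band the label depends on the user's reply to the prompt, so `label_for_retraining(middle, user_accepts=False)` yields a negative sample.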
In a further embodiment, referring to fig. 5, after the step S140, the information recommendation method may further include the steps of:
Step S150, if it is determined that information related to the target service event needs to be recommended to the service requester terminal 130 of the sample to be predicted, sending an information acquisition request to each third-party server providing a service of the target service event.
Step S160, receiving the information related to the target service event sent by each third-party server in response to the information acquisition request, and recommending the information related to the target service event to the service requester terminal 130 of the sample to be predicted.
Still taking the target service event as an example of a car buying or renting, if it is determined that information related to the car buying or renting needs to be recommended to the service requester terminal 130 of the sample to be predicted, an information acquisition request is sent to each third party server providing a car buying or renting service (e.g., a third party server of a car selling net, a car home net, a melon seed used car direct selling net, a hi car renting net, a state car renting net, etc.).
Optionally, the information acquisition request may include characteristic information of the sample to be predicted, such as behavior habits, preferences, driving age, gender, and the like of the driver or passenger. Then, each third-party server providing the car buying or renting service acquires the information related to the car buying or renting matched with the characteristic information of the sample to be predicted after responding to the information acquisition request, and then returns the information to the server 110. The server 110, upon receiving the information related to the car buying or car renting returned from each third-party server, recommends the information related to the car buying or car renting to the service requester terminal 130 of the sample to be predicted.
Fig. 6 is a functional block diagram of an information recommendation apparatus 300 according to some embodiments of the present application, where the functions implemented by the information recommendation apparatus 300 may correspond to the steps executed by the method described above. The information recommendation apparatus 300 may be understood as the server 110 or a processor of the server 110, or as a component that is independent of the server 110 or the processor and implements the functions of the present application under the control of the server 110. As shown in fig. 6, the information recommendation apparatus 300 may include an obtaining module 310, a gravity center clustering module 320, a training module 330, and a determining module 340. The functions of the respective functional modules of the information recommendation apparatus 300 are described in detail below.
An obtaining module 310 may be configured to obtain a training sample set, where the training sample set includes a plurality of positive samples and a plurality of unknown samples. It is understood that the obtaining module 310 may be configured to perform the step S110, and for a detailed implementation of the obtaining module 310, reference may be made to the content related to the step S110.
The gravity center clustering module 320 may be configured to perform gravity center clustering for at least one iteration cycle based on a plurality of positive samples and a plurality of unknown samples to obtain a plurality of negative samples that do not match the target service event, where each positive sample matches the target service event. It is understood that the center of gravity clustering module 320 can be used to perform the above step S120, and for the detailed implementation of the center of gravity clustering module 320, reference can be made to the above description regarding step S120.
The training module 330 may be configured to obtain an information recommendation model through training according to the multiple positive samples and the multiple negative samples, and input each sample to be predicted into the information recommendation model obtained through training to obtain a matching degree between each sample to be predicted and the target service event. It is understood that the training module 330 can be used to perform the above step S130, and for the detailed implementation of the training module 330, reference can be made to the above description of step S130.
The determining module 340 may be configured to determine, for each sample to be predicted, whether information related to the target service event needs to be recommended to the service requester terminal 130 of the sample to be predicted according to the matching degree between the sample to be predicted and the target service event. It is understood that the determining module 340 can be used to perform the step S140, and the detailed implementation manner of the determining module 340 can refer to the above-mentioned contents related to the step S140.
In one possible implementation, the obtaining module 310 may obtain a plurality of positive samples and a plurality of unknown samples by:
obtaining a plurality of initial positive samples and a plurality of initial unknown samples;
and normalizing the sample characteristics of each initial positive sample and each initial unknown sample to obtain a plurality of positive samples and a plurality of unknown samples in the same value space range.
In one possible implementation, the obtaining module 310 may obtain a plurality of initial positive samples and a plurality of initial unknown samples by:
obtaining first service record information of a plurality of first users matched with the target service event and second service record information of a plurality of second users except the first users;
feature information is extracted from the first service record information to obtain a plurality of initial positive samples, and feature information is extracted from the second service record information to obtain a plurality of initial unknown samples.
In a possible implementation, the obtaining module 310 may specifically obtain a plurality of positive samples and a plurality of unknown samples within the same value space range by:
acquiring default values of all sample characteristics of each initial positive sample and each initial unknown sample;
filling default values of the sample features with the default values lower than the first default value threshold value, and eliminating the sample features with the default values larger than the second default value threshold value to obtain a plurality of processed initial positive samples and a plurality of processed initial unknown samples;
and mapping the plurality of initial positive samples and the plurality of initial unknown samples obtained by processing into the same value space range to obtain a plurality of positive samples and a plurality of unknown samples in the same value space range.
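The preprocessing steps above can be sketched as follows, under several assumptions that are not stated in the text: the "default value" of a feature is read as its missing-value rate, a single threshold `drop_above` stands in for both the first and second thresholds, missing entries are filled with the feature mean, and min-max scaling to [0, 1] is used as the common value range. The function name and sample rows are hypothetical.

```python
def preprocess(rows, drop_above=0.5):
    """Drop sparse features, fill remaining gaps, and scale each feature to [0, 1].

    rows: list of equal-length feature lists, with None marking a missing value.
    """
    kept_columns = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        present = [v for v in column if v is not None]
        missing_rate = 1.0 - len(present) / len(column)
        if missing_rate > drop_above or not present:
            continue                                   # feature removed as too sparse
        mean = sum(present) / len(present)
        filled = [mean if v is None else v for v in column]
        lo, hi = min(filled), max(filled)
        span = hi - lo
        kept_columns.append([(v - lo) / span if span else 0.0 for v in filled])
    return [list(row) for row in zip(*kept_columns)]   # back to one list per sample

raw = [
    [1.0, None, 10.0],
    [3.0, None, None],
    [5.0, 2.0, 30.0],
]
normalized = preprocess(raw)
```

The second feature is missing in two of three rows and is dropped; the third has one gap, which is filled with the mean before scaling.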
In one possible embodiment, the centroid clustering module 320 is specifically configured to obtain a plurality of negative examples that do not match the target service event by:
aiming at each iteration cycle, obtaining a clustered positive cluster and a clustered negative cluster through iteration clustering in the iteration cycle;
calculating a first distance between each sample and the gravity center of the clustered positive cluster and a second distance between each sample and the gravity center of the clustered negative cluster in a plurality of positive samples of the positive cluster and a plurality of negative samples of the negative cluster;
and traversing each sample, if the first distance calculated for the sample is smaller than the second distance, taking the sample as a positive sample, and if the first distance is larger than the second distance, taking the sample as a negative sample.
In a possible implementation, the centroid clustering module 320 is specifically configured to perform centroid clustering for the first iteration cycle by:
calculating the gravity centers of a plurality of positive samples in a first iteration cycle;
and taking all samples with the distance to the gravity center smaller than a first preset threshold value as a first clustered positive cluster, and taking all samples with the distance to the gravity center larger than a second preset threshold value as a first clustered negative cluster.
In a possible embodiment, the gravity center clustering module 320 is specifically configured to obtain a plurality of negative examples that do not match the target service event by:
all samples with the distances from the gravity centers of the plurality of positive samples smaller than a first preset threshold value are taken as a first clustered positive cluster, and all samples with the distances from the gravity centers of the plurality of positive samples larger than a second preset threshold value are taken as a first clustered negative cluster;
calculating a first distance between each sample and a center of gravity of the first positive cluster and a second distance between each sample and a center of gravity of the first negative cluster, among a plurality of positive samples of the first positive cluster and a plurality of negative samples of the first negative cluster;
traversing each sample, if the first distance calculated by the sample is smaller than the second distance, taking the sample as a positive sample, and if the first distance is larger than the second distance, taking the sample as a negative sample to obtain a corresponding second positive cluster and a corresponding second negative cluster;
calculating a first distance between each sample and a center of gravity of the second positive cluster and a second distance between each sample and a center of gravity of the second negative cluster among a plurality of positive samples of the second positive cluster and a plurality of negative samples of the second negative cluster;
traversing each sample, if the first distance calculated by the sample is smaller than the second distance, taking the sample as a positive sample, and if the first distance is larger than the second distance, taking the sample as a negative sample to obtain a third positive cluster and a third negative cluster which correspond to each other;
and taking the third positive cluster as a new second positive cluster, taking the third negative cluster as a new second negative cluster, and returning to the step of calculating a first distance between each sample and the gravity center of the second positive cluster and a second distance between each sample and the gravity center of the second negative cluster from a plurality of positive samples of the second positive cluster and a plurality of negative samples of the second negative cluster until an iteration stop condition is met, and taking each finally obtained negative sample in the third negative cluster as a negative sample which is not matched with the target service event.
In one possible embodiment, the iteration stop condition comprises at least one of the following conditions:
the positive samples in the third positive cluster and the negative samples in the third negative cluster do not change any more;
the iteration times reach the set times;
the moving distance of the centers of gravity of the third positive cluster and the third negative cluster is less than the set distance.
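The iterative gravity center clustering and its stop conditions can be sketched as below. This is a simplified sketch under assumptions: Euclidean distance is used, samples are tuples, samples whose distance to the initial gravity center lies between the two thresholds are left out of both clusters, and a tie between the two distances is assigned to the negative cluster (the text does not cover ties). The function names, thresholds and points are hypothetical.

```python
import math

def centroid(points):
    """Gravity center (arithmetic mean) of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def gravity_cluster(positives, unknowns, t_pos, t_neg, max_iters=100, min_shift=1e-6):
    """Return (positive cluster, negative cluster) after iterative reassignment."""
    seed = centroid(positives)
    pool = list(positives) + list(unknowns)
    pos = [p for p in pool if math.dist(p, seed) < t_pos]   # first positive cluster
    neg = [p for p in pool if math.dist(p, seed) > t_neg]   # first negative cluster
    for _ in range(max_iters):                              # stop: iteration count
        if not pos or not neg:
            break
        c_pos, c_neg = centroid(pos), centroid(neg)
        new_pos, new_neg = [], []
        for p in pos + neg:
            closer_to_pos = math.dist(p, c_pos) < math.dist(p, c_neg)
            (new_pos if closer_to_pos else new_neg).append(p)
        unchanged = set(new_pos) == set(pos)
        pos, neg = new_pos, new_neg
        if unchanged or not pos or not neg:                 # stop: clusters stable
            break
        shift = max(math.dist(centroid(pos), c_pos), math.dist(centroid(neg), c_neg))
        if shift < min_shift:                               # stop: centers barely move
            break
    return pos, neg

positives = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
unknowns = [(0.5, 0.5), (2.2, 2.2), (5.0, 5.0), (6.0, 5.0), (5.0, 6.0)]
pos_cluster, neg_cluster = gravity_cluster(positives, unknowns, t_pos=1.0, t_neg=2.5)
```

In this toy run the point (2.2, 2.2) starts in the negative cluster because it is farther than `t_neg` from the initial gravity center, and is moved to the positive cluster once both gravity centers are compared.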
In a possible implementation manner, the training module 330 may specifically train to obtain the information recommendation model by:
and training a support vector machine model according to the plurality of positive samples and the plurality of negative samples, and taking the support vector machine model obtained by training as an information recommendation model.
In a possible implementation manner, the training module 330 may specifically train to obtain the information recommendation model by:
training a support vector machine model according to the plurality of positive samples and the plurality of negative samples;
performing sample selection on a plurality of positive samples and a plurality of negative samples according to the trained support vector machine model to obtain a plurality of target positive samples and a plurality of target negative samples;
and training the ensemble tree model based on the multiple target positive samples and the multiple target negative samples, and taking the ensemble tree model obtained by training as a trained information recommendation model.
In one possible implementation, the training module 330 may specifically train the support vector machine model by:
mapping and transforming the positive samples and the negative samples through a preset kernel function;
inputting each positive sample and each negative sample after mapping transformation into a support vector machine model for training, and calculating the distance between each positive sample and each negative sample after mapping transformation and a hyperplane;
calculating a loss function value of the support vector machine model according to the distance between each positive sample and each negative sample obtained by calculation and the hyperplane;
and adjusting model parameters and a learning rate of the support vector machine model by using a gradient descent method according to the calculated loss function value, returning and inputting each positive sample and each negative sample after mapping transformation into the support vector machine model for training based on the adjusted support vector machine model, calculating the distance between each positive sample and each negative sample after mapping transformation and the hyperplane until an iteration stop condition is met, and outputting the support vector machine model obtained by training.
In a possible implementation, the training module 330 may obtain a plurality of target positive samples and a plurality of target negative samples by:
inputting each positive sample and each negative sample into a trained support vector machine model to obtain the matching rate of each positive sample and each negative sample with a target service event;
and eliminating the samples with the error values larger than a third preset threshold value according to the matching rate of each positive sample and each negative sample with the target service event to obtain a plurality of target positive samples and a plurality of target negative samples.
In one possible implementation, the determining module 340 may determine whether the information related to the target service event needs to be recommended to the service requester terminal 130 of the sample to be predicted by:
judging whether the matching degree of the sample to be predicted and the target service event is greater than a fourth preset threshold value or not;
if the matching degree of the sample to be predicted and the target service event is greater than the fourth preset threshold, it is determined that information related to the target service event needs to be recommended to the service requester terminal 130 of the sample to be predicted.
In a possible implementation manner, if it is determined that the information related to the target service event needs to be recommended to the service requester terminal 130 of the sample to be predicted, the determining module 340 is further specifically configured to:
the sample to be predicted is added to a plurality of positive samples in the training sample.
In a possible implementation manner, if the matching degree between the sample to be predicted and the target service event is greater than a fifth preset threshold and not greater than a fourth preset threshold, the determining module 340 may be further configured to:
sending a prompt message to the service requester terminal 130 of the sample to be predicted to prompt a user of the service requester terminal 130 to select whether to receive information related to the target service event;
if first indication information sent by the service requester terminal 130, indicating that the information related to the target service event is not to be received, is received, determining that the information related to the target service event does not need to be recommended to the service requester terminal 130 of the sample to be predicted, and adding the sample to be predicted to a plurality of negative samples which are not matched with the target service event;
if second indication information of receiving the information related to the target service event sent by the service requester terminal 130 is received, determining that the information related to the target service event needs to be recommended to the sample to be predicted, and adding the sample to be predicted to a plurality of positive samples in the training samples; and
if the matching degree between the sample to be predicted and the target service event is not greater than the fifth preset threshold, the determining module 340 is specifically further configured to determine that information related to the target service event does not need to be recommended to the sample to be predicted, and add the sample to be predicted to a plurality of negative samples that are not matched with the target service event.
In a possible implementation manner, please further refer to fig. 7, if it is determined that the information related to the target service event needs to be recommended to the service requester terminal 130 of the sample to be predicted, the information recommending apparatus may further include an information recommending module 350, where the information recommending module 350 may be configured to send an information obtaining request to each third-party server providing the service of the target service event, receive the information related to the target service event sent by each third-party server in response to the information obtaining request, and recommend the information related to the target service event to the service requester terminal 130 of the sample to be predicted.
The modules may be connected to or communicate with each other via a wired or wireless connection. The wired connection may include a metal cable, an optical cable, a hybrid cable, etc., or any combination thereof. The wireless connection may comprise a connection over a LAN, WAN, Bluetooth, ZigBee, NFC, or the like, or any combination thereof. Two or more modules may be combined into a single module, and any one module may be divided into two or more units.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to corresponding processes in the method embodiments, and are not described in detail in this application. In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical division, and there may be other divisions in actual implementation, and for example, a plurality of modules or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or modules through some communication interfaces, and may be in an electrical, mechanical or other form.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present application, or the part of it that contributes over the prior art, may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (34)
1. An information recommendation method, applied to a server, the method comprising:
obtaining a training sample set, the training sample set comprising a plurality of positive samples and a plurality of unknown samples;
obtaining a plurality of negative samples which are not matched with a target service event through gravity center clustering of at least one iteration cycle based on the plurality of positive samples and the plurality of unknown samples, wherein each positive sample is matched with the target service event;
training according to the positive samples and the negative samples to obtain an information recommendation model, and inputting each sample to be predicted into the information recommendation model obtained through training to obtain the matching degree of each sample to be predicted and the target service event;
and for each sample to be predicted, determining, according to the matching degree of the sample to be predicted and the target service event, whether information related to the target service event needs to be recommended to the service requester terminal of the sample to be predicted.
2. The information recommendation method according to claim 1, wherein the step of obtaining a training sample set comprises:
obtaining a plurality of initial positive samples and a plurality of initial unknown samples;
and normalizing the sample characteristics of each initial positive sample and each initial unknown sample to obtain a plurality of positive samples and a plurality of unknown samples in the same value space range.
3. The information recommendation method of claim 2, wherein the step of obtaining a plurality of initial positive samples and a plurality of initial unknown samples comprises:
obtaining first service record information of a plurality of first users matched with the target service event and second service record information of a plurality of second users except the first users;
extracting feature information from the first service record information to obtain a plurality of initial positive samples, and extracting feature information from the second service record information to obtain a plurality of initial unknown samples.
4. The information recommendation method according to claim 2, wherein the step of normalizing the sample features of each initial positive sample and each initial unknown sample to obtain a plurality of positive samples and a plurality of unknown samples within the same value space range comprises:
acquiring a default value of each sample feature of each initial positive sample and each initial unknown sample;
filling in default values for the sample features whose default value is lower than a first default value threshold, and eliminating the sample features whose default value is greater than a second default value threshold, to obtain a plurality of processed initial positive samples and a plurality of processed initial unknown samples;
and mapping the plurality of initial positive samples and the plurality of initial unknown samples obtained by processing into the same value space range to obtain a plurality of positive samples and a plurality of unknown samples in the same value space range.
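As an illustration only, the preprocessing of claim 4 might be sketched as follows. The sketch assumes that the "default value" of a feature means the fraction of samples in which that feature is missing, that gaps are filled with the feature mean, and that the common value range is [0, 1] obtained by min-max scaling; none of these choices is fixed by the claim. The function name `preprocess`, its parameters and the threshold values are hypothetical.

```python
from statistics import mean

def preprocess(samples, fill_below=0.3, drop_above=0.6):
    """samples: equal-length rows of floats; None marks a missing entry."""
    columns = []
    for column in zip(*samples):
        present = [v for v in column if v is not None]
        missing_rate = 1 - len(present) / len(column)
        if missing_rate > drop_above:
            continue  # eliminated: above the second threshold
        if missing_rate >= fill_below:
            continue  # band not covered by the claim; dropped here by assumption
        fill = mean(present)  # assumption: fill gaps with the feature mean
        filled = [fill if v is None else v for v in column]
        lo, hi = min(filled), max(filled)
        span = (hi - lo) or 1.0  # a constant feature maps to 0.0
        columns.append([(v - lo) / span for v in filled])
    # Transpose back to one row per sample, all values now in [0, 1].
    return [list(row) for row in zip(*columns)]
```

Positive and unknown samples would be passed through the same call together so that both end up in the same value range.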
5. The information recommendation method according to claim 1, wherein said step of obtaining a plurality of negative samples that do not match the target service event through gravity center clustering of at least one iteration cycle based on the plurality of positive samples and a plurality of unknown samples comprises:
for each iteration cycle, obtaining a clustered positive cluster and a clustered negative cluster through iterative clustering in the iteration cycle;
for each sample among the plurality of positive samples of the positive cluster and the plurality of negative samples of the negative cluster, calculating a first distance between the sample and the center of gravity of the clustered positive cluster and a second distance between the sample and the center of gravity of the clustered negative cluster;
and traversing each sample, taking the sample as a positive sample if the first distance calculated for the sample is smaller than the second distance, and taking the sample as a negative sample if the first distance is larger than the second distance.
6. The information recommendation method according to claim 5, wherein the step of obtaining a clustered positive cluster and a clustered negative cluster through iterative clustering in each iteration cycle comprises:
calculating the center of gravity of the plurality of positive samples in a first iteration cycle;
and taking all samples whose distance to the calculated center of gravity is smaller than a first preset threshold as a first clustered positive cluster, and taking all samples whose distance to the center of gravity is larger than a second preset threshold as a first clustered negative cluster.
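The first iteration cycle of claim 6 can be sketched roughly as below. The sketch assumes Euclidean distance and treats "all samples" as the positive samples together with the unknown samples; the names `initial_clusters`, `near` and `far` are hypothetical stand-ins for the first and second preset thresholds.

```python
import math

def dist(a, b):
    """Euclidean distance between two points (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    """Center of gravity: the per-coordinate mean of the points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def initial_clusters(positives, unknowns, near, far):
    center = centroid(positives)
    everything = list(positives) + list(unknowns)
    # Samples close to the positive center seed the positive cluster.
    pos_cluster = [s for s in everything if dist(s, center) < near]
    # Samples far from it seed the negative cluster; the rest stay unassigned.
    neg_cluster = [s for s in everything if dist(s, center) > far]
    return pos_cluster, neg_cluster
```

Samples whose distance falls between the two thresholds belong to neither seed cluster in this sketch, which matches the claim wording but is otherwise a design choice.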
7. The information recommendation method according to any one of claims 1-6, wherein the step of obtaining a plurality of negative samples that do not match the target service event through gravity center clustering of at least one iteration cycle based on the plurality of positive samples and the plurality of unknown samples comprises:
taking all samples whose distance to the center of gravity of the plurality of positive samples is smaller than a first preset threshold as a first positive cluster, and taking all samples whose distance to the center of gravity of the plurality of positive samples is larger than a second preset threshold as a first negative cluster;
for each sample among the plurality of positive samples of the first positive cluster and the plurality of negative samples of the first negative cluster, calculating a first distance between the sample and the center of gravity of the first positive cluster and a second distance between the sample and the center of gravity of the first negative cluster;
traversing each sample, taking the sample as a positive sample if the first distance calculated for the sample is smaller than the second distance, and taking the sample as a negative sample if the first distance is larger than the second distance, to obtain a corresponding second positive cluster and second negative cluster;
for each sample among the plurality of positive samples of the second positive cluster and the plurality of negative samples of the second negative cluster, calculating a first distance between the sample and the center of gravity of the second positive cluster and a second distance between the sample and the center of gravity of the second negative cluster;
traversing each sample, taking the sample as a positive sample if the first distance calculated for the sample is smaller than the second distance, and taking the sample as a negative sample if the first distance is larger than the second distance, to obtain a corresponding third positive cluster and third negative cluster;
and taking the third positive cluster as a new second positive cluster and the third negative cluster as a new second negative cluster, and returning to the step of calculating, for each sample among the plurality of positive samples of the second positive cluster and the plurality of negative samples of the second negative cluster, the first distance and the second distance, until an iteration stop condition is met, and taking each negative sample in the finally obtained third negative cluster as a negative sample that does not match the target service event.
8. The information recommendation method according to claim 7, wherein the iteration stop condition includes at least one of the following conditions:
the positive samples in the third positive cluster and the negative samples in the third negative cluster no longer change;
the number of iterations reaches a set number;
the moving distance of the centers of gravity of the third positive cluster and the third negative cluster is less than a set distance.
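The iterative reassignment of claim 7, together with the three stop conditions of claim 8, might look like the following minimal sketch. It assumes Euclidean distance, leaves a sample in its current cluster when the two distances are equal (a case the claims do not address), and stops early if a cluster would become empty. The names `refine_clusters`, `max_iter` and `min_shift` are hypothetical.

```python
import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def refine_clusters(pos, neg, max_iter=100, min_shift=1e-6):
    """Reassign samples to the nearer center of gravity until a stop condition holds."""
    for _ in range(max_iter):  # stop condition: set number of iterations
        c_pos, c_neg = centroid(pos), centroid(neg)
        new_pos, new_neg, moved = [], [], 0
        labelled = [(s, True) for s in pos] + [(s, False) for s in neg]
        for sample, was_pos in labelled:
            d1, d2 = dist(sample, c_pos), dist(sample, c_neg)
            # Equal distances: keep the previous label (assumption).
            is_pos = d1 < d2 or (d1 == d2 and was_pos)
            (new_pos if is_pos else new_neg).append(sample)
            moved += is_pos != was_pos
        if not new_pos or not new_neg:
            break  # assumption: keep the last split in which both clusters exist
        shift = max(dist(c_pos, centroid(new_pos)), dist(c_neg, centroid(new_neg)))
        pos, neg = new_pos, new_neg
        # Stop conditions: membership unchanged, or centers barely moved.
        if moved == 0 or shift < min_shift:
            break
    return pos, neg
```

The negative cluster returned at the end plays the role of the negative samples that do not match the target service event.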
9. The information recommendation method according to claim 1, wherein the step of training an information recommendation model according to the positive samples and the negative samples comprises:
training a support vector machine model according to the plurality of positive samples and the plurality of negative samples, and taking the trained support vector machine model as the information recommendation model.
10. The information recommendation method according to claim 1, wherein the step of training an information recommendation model according to the positive samples and the negative samples comprises:
training a support vector machine model from the plurality of positive samples and the plurality of negative samples;
performing sample selection on the plurality of positive samples and the plurality of negative samples according to the trained support vector machine model to obtain a plurality of target positive samples and a plurality of target negative samples;
and training an ensemble tree model based on the plurality of target positive samples and the plurality of target negative samples, and taking the trained ensemble tree model as the information recommendation model.
11. The information recommendation method according to claim 9 or 10, wherein the step of training a support vector machine model based on the plurality of positive samples and the plurality of negative samples comprises:
mapping the positive samples and the negative samples through a preset kernel function;
inputting each positive sample and each negative sample after mapping transformation into a support vector machine model for training, and calculating the distance between each positive sample and each negative sample after mapping transformation and a hyperplane;
calculating a loss function value of the support vector machine model according to the calculated distance between each positive sample and each negative sample and the hyperplane;
and adjusting model parameters and a learning rate of the support vector machine model by a gradient descent method according to the calculated loss function value, returning, based on the adjusted support vector machine model, to the step of inputting each positive sample and each negative sample after mapping transformation into the support vector machine model for training and calculating the distance between each positive sample and each negative sample after mapping transformation and the hyperplane, until an iteration stop condition is met, and outputting the trained support vector machine model.
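The training loop of claim 11 can be illustrated with a small sketch. It is not the method of the application: it assumes a hinge loss with L2 regularization, plain batch (sub)gradient descent, a geometric decay as the learning rate adjustment, and an explicit quadratic feature map standing in for the preset kernel function. The names `feature_map`, `train_svm`, `decision` and all hyperparameter values are hypothetical.

```python
def feature_map(x):
    """Hypothetical explicit mapping used in place of the preset kernel."""
    return list(x) + [v * v for v in x]

def train_svm(positives, negatives, epochs=200, lr=0.1, decay=0.99, reg=0.01):
    data = [(feature_map(x), 1.0) for x in positives]
    data += [(feature_map(x), -1.0) for x in negatives]
    n = len(data)
    w, b = [0.0] * len(data[0][0]), 0.0
    for _ in range(epochs):  # stop condition: fixed number of epochs
        grad_w, grad_b, loss = [reg * wi for wi in w], 0.0, 0.0
        for z, y in data:
            # Signed, unnormalized distance to the hyperplane, times the label.
            margin = y * (sum(wi * zi for wi, zi in zip(w, z)) + b)
            if margin < 1:  # sample violates the margin: hinge loss is active
                loss += 1 - margin
                grad_w = [g - y * zi / n for g, zi in zip(grad_w, z)]
                grad_b -= y / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
        lr *= decay  # learning rate adjustment (assumed schedule)
        if loss == 0.0:
            break  # stop condition: no margin violations left
    return w, b

def decision(w, b, x):
    """Positive values fall on the positive side of the hyperplane."""
    return sum(wi * zi for wi, zi in zip(w, feature_map(x))) + b
```

For example, `w, b = train_svm([(0.0,), (0.2,)], [(1.0,), (0.8,)])` followed by `decision(w, b, (0.1,))` gives a score whose sign indicates the predicted side.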
12. The information recommendation method according to claim 10, wherein the step of performing sample selection on the positive samples and the negative samples according to the trained support vector machine model to obtain target positive samples and target negative samples comprises:
inputting each positive sample and each negative sample into a trained support vector machine model to obtain the matching rate of each positive sample and each negative sample with the target service event;
and eliminating, according to the matching rate of each positive sample and each negative sample with the target service event, the samples whose error values are larger than a third preset threshold, to obtain a plurality of target positive samples and a plurality of target negative samples.
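A possible reading of the sample selection in claim 12 is sketched below. It assumes that the matching rate lies in [0, 1] and that the error value of a sample is the absolute difference between its label (1 for positive, 0 for negative) and its matching rate; the claim does not define the error value, so this is an assumption. `select_samples`, `match_rate` and `max_error` are hypothetical names, with `match_rate` standing for the trained support vector machine model.

```python
def select_samples(positives, negatives, match_rate, max_error=0.4):
    """Keep only samples whose assumed error |label - rate| is within max_error."""
    target_pos = [s for s in positives if abs(1.0 - match_rate(s)) <= max_error]
    target_neg = [s for s in negatives if abs(0.0 - match_rate(s)) <= max_error]
    return target_pos, target_neg
```

The surviving target samples would then be used to train the second-stage tree model of claim 10.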
13. The information recommendation method according to claim 1, wherein the step of determining whether information related to the target service event needs to be recommended to the service requester terminal of the sample to be predicted according to the matching degree of the sample to be predicted and the target service event comprises:
judging whether the matching degree of the sample to be predicted and the target service event is greater than a fourth preset threshold value or not;
and if the matching degree of the sample to be predicted and the target service event is greater than a fourth preset threshold value, determining that information related to the target service event needs to be recommended to a service requester terminal of the sample to be predicted.
14. The information recommendation method of claim 13, further comprising:
if it is determined that information related to the target service event needs to be recommended to the service requester terminal of the sample to be predicted, adding the sample to be predicted to the plurality of positive samples in the training sample set.
15. The information recommendation method of claim 13, further comprising:
if the matching degree between the sample to be predicted and the target service event is greater than a fifth preset threshold and not greater than the fourth preset threshold, sending prompt information to the service requester terminal of the sample to be predicted, to prompt a user of the service requester terminal to select whether to receive information related to the target service event;
if first indication information, sent by the service requester terminal and indicating that the information related to the target service event is not to be received, is received, determining that the information related to the target service event does not need to be recommended to the service requester terminal of the sample to be predicted, and adding the sample to be predicted to the plurality of negative samples that do not match the target service event;
if second indication information, sent by the service requester terminal and indicating that the information related to the target service event is to be received, is received, determining that the information related to the target service event needs to be recommended to the service requester terminal of the sample to be predicted, and adding the sample to be predicted to the plurality of positive samples in the training sample set;
and if the matching degree between the sample to be predicted and the target service event is not greater than the fifth preset threshold, determining that the information related to the target service event does not need to be recommended to the service requester terminal of the sample to be predicted, and adding the sample to be predicted to the plurality of negative samples that do not match the target service event.
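The two-threshold decision of claims 13 to 15 can be summarized in a short sketch. The threshold values 0.8 and 0.5 are placeholders, and `Action`, `decide` and `record_outcome` are hypothetical names; `user_accepts` stands for the first or second indication information returned by the service requester terminal.

```python
from enum import Enum

class Action(Enum):
    RECOMMEND = "recommend"  # matching degree above the fourth threshold
    ASK_USER = "ask_user"    # between the fifth and the fourth threshold
    SKIP = "skip"            # not above the fifth threshold

def decide(match_degree, fourth_threshold=0.8, fifth_threshold=0.5):
    if match_degree > fourth_threshold:
        return Action.RECOMMEND
    if match_degree > fifth_threshold:
        return Action.ASK_USER
    return Action.SKIP

def record_outcome(sample, action, positives, negatives, user_accepts=None):
    """Return True if information is to be recommended, and grow the sample sets."""
    recommend = action is Action.RECOMMEND or (
        action is Action.ASK_USER and bool(user_accepts))
    # Recommended samples feed the positive set, all others the negative set.
    (positives if recommend else negatives).append(sample)
    return recommend
```

Feeding each outcome back into the positive or negative set is what lets later training rounds use the newly labelled samples.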
16. The information recommendation method according to claim 1, wherein after the step of determining whether the information related to the target service event needs to be recommended to the service requester terminal of the sample to be predicted according to the matching degree of the sample to be predicted and the target service event, the method comprises:
if it is determined that information related to the target service event needs to be recommended to the service requester terminal of the sample to be predicted, sending an information acquisition request to each third-party server providing a service for the target service event;
and receiving information which is sent by each third-party server in response to the information acquisition request and is related to the target service event, and recommending the information related to the target service event to the service requester terminal of the sample to be predicted.
17. An information recommendation device applied to a server, the device comprising:
an obtaining module configured to obtain a training sample set, where the training sample set includes a plurality of positive samples and a plurality of unknown samples;
a gravity center clustering module, configured to perform gravity center clustering for at least one iteration cycle based on the multiple positive samples and the multiple unknown samples to obtain multiple negative samples that are not matched with a target service event, where each positive sample is matched with the target service event;
a training module, configured to train an information recommendation model according to the plurality of positive samples and the plurality of negative samples, and to input each sample to be predicted into the trained information recommendation model to obtain a matching degree between each sample to be predicted and the target service event;
and a determining module, configured to determine, for each sample to be predicted and according to the matching degree between the sample to be predicted and the target service event, whether information related to the target service event needs to be recommended to the service requester terminal of the sample to be predicted.
18. The information recommendation device of claim 17, wherein the obtaining module obtains the plurality of positive samples and the plurality of unknown samples by:
obtaining a plurality of initial positive samples and a plurality of initial unknown samples;
and normalizing the sample features of each initial positive sample and each initial unknown sample to obtain a plurality of positive samples and a plurality of unknown samples within the same value space range.
19. The information recommendation device of claim 18, wherein the obtaining module obtains the plurality of initial positive samples and the plurality of initial unknown samples by:
obtaining first service record information of a plurality of first users matched with the target service event and second service record information of a plurality of second users except the first users;
extracting feature information from the first service record information to obtain a plurality of initial positive samples, and extracting feature information from the second service record information to obtain a plurality of initial unknown samples.
20. The information recommendation device according to claim 18, wherein the obtaining module obtains the plurality of positive samples and the plurality of unknown samples in the same value space range by:
acquiring a default value of each sample feature of each initial positive sample and each initial unknown sample;
filling in default values for the sample features whose default value is lower than a first default value threshold, and eliminating the sample features whose default value is greater than a second default value threshold, to obtain a plurality of processed initial positive samples and a plurality of processed initial unknown samples;
and mapping the plurality of initial positive samples and the plurality of initial unknown samples obtained by processing into the same value space range to obtain a plurality of positive samples and a plurality of unknown samples in the same value space range.
21. The information recommendation device according to claim 17, wherein the gravity center clustering module is specifically configured to obtain the plurality of negative samples that do not match the target service event by:
for each iteration cycle, obtaining a clustered positive cluster and a clustered negative cluster through iterative clustering in the iteration cycle;
for each sample among the plurality of positive samples of the positive cluster and the plurality of negative samples of the negative cluster, calculating a first distance between the sample and the center of gravity of the clustered positive cluster and a second distance between the sample and the center of gravity of the clustered negative cluster;
and traversing each sample, taking the sample as a positive sample if the first distance calculated for the sample is smaller than the second distance, and taking the sample as a negative sample if the first distance is larger than the second distance.
22. The information recommendation device according to claim 21, wherein the gravity center clustering module is specifically configured to perform gravity center clustering for a first iteration cycle by:
calculating the center of gravity of the plurality of positive samples in a first iteration cycle;
and taking all samples whose distance to the calculated center of gravity is smaller than a first preset threshold as a first clustered positive cluster, and taking all samples whose distance to the center of gravity is larger than a second preset threshold as a first clustered negative cluster.
23. The information recommendation device according to any one of claims 17-22, wherein the gravity center clustering module is specifically configured to obtain the plurality of negative samples that do not match the target service event by:
taking all samples whose distance to the center of gravity of the plurality of positive samples is smaller than a first preset threshold as a first positive cluster, and taking all samples whose distance to the center of gravity of the plurality of positive samples is larger than a second preset threshold as a first negative cluster;
for each sample among the plurality of positive samples of the first positive cluster and the plurality of negative samples of the first negative cluster, calculating a first distance between the sample and the center of gravity of the first positive cluster and a second distance between the sample and the center of gravity of the first negative cluster;
traversing each sample, taking the sample as a positive sample if the first distance calculated for the sample is smaller than the second distance, and taking the sample as a negative sample if the first distance is larger than the second distance, to obtain a corresponding second positive cluster and second negative cluster;
for each sample among the plurality of positive samples of the second positive cluster and the plurality of negative samples of the second negative cluster, calculating a first distance between the sample and the center of gravity of the second positive cluster and a second distance between the sample and the center of gravity of the second negative cluster;
traversing each sample, taking the sample as a positive sample if the first distance calculated for the sample is smaller than the second distance, and taking the sample as a negative sample if the first distance is larger than the second distance, to obtain a corresponding third positive cluster and third negative cluster;
and taking the third positive cluster as a new second positive cluster and the third negative cluster as a new second negative cluster, and returning to the step of calculating, for each sample among the plurality of positive samples of the second positive cluster and the plurality of negative samples of the second negative cluster, the first distance and the second distance, until an iteration stop condition is met, and taking each negative sample in the finally obtained third negative cluster as a negative sample that does not match the target service event.
24. The information recommendation device according to claim 23, wherein the iteration stop condition comprises at least one of:
the positive samples in the third positive cluster and the negative samples in the third negative cluster no longer change;
the number of iterations reaches a set number;
the moving distance of the centers of gravity of the third positive cluster and the third negative cluster is less than a set distance.
25. The information recommendation device according to claim 17, wherein the training module is configured to obtain the information recommendation model by training in the following manner:
training a support vector machine model according to the plurality of positive samples and the plurality of negative samples, and taking the trained support vector machine model as the information recommendation model.
26. The information recommendation device according to claim 17, wherein the training module is configured to obtain the information recommendation model by training in the following manner:
training a support vector machine model from the plurality of positive samples and the plurality of negative samples;
performing sample selection on the plurality of positive samples and the plurality of negative samples according to the trained support vector machine model to obtain a plurality of target positive samples and a plurality of target negative samples;
and training an ensemble tree model based on the plurality of target positive samples and the plurality of target negative samples, and taking the trained ensemble tree model as the information recommendation model.
27. The information recommendation device according to claim 25 or 26, wherein the training module trains the support vector machine model by:
mapping the positive samples and the negative samples through a preset kernel function;
inputting each positive sample and each negative sample after mapping transformation into a support vector machine model for training, and calculating the distance between each positive sample and each negative sample after mapping transformation and a hyperplane;
calculating a loss function value of the support vector machine model according to the calculated distance between each positive sample and each negative sample and the hyperplane;
and adjusting model parameters and a learning rate of the support vector machine model by a gradient descent method according to the calculated loss function value, returning, based on the adjusted support vector machine model, to the step of inputting each positive sample and each negative sample after mapping transformation into the support vector machine model for training and calculating the distance between each positive sample and each negative sample after mapping transformation and the hyperplane, until an iteration stop condition is met, and outputting the trained support vector machine model.
28. The information recommendation device of claim 27, wherein the training module obtains the plurality of target positive samples and the plurality of target negative samples by:
inputting each positive sample and each negative sample into a trained support vector machine model to obtain the matching rate of each positive sample and each negative sample with the target service event;
and eliminating, according to the matching rate of each positive sample and each negative sample with the target service event, the samples whose error values are larger than a third preset threshold, to obtain a plurality of target positive samples and a plurality of target negative samples.
29. The information recommendation device of claim 17, wherein the determining module determines whether the information related to the target service event needs to be recommended to the service requester terminal of the sample to be predicted by:
judging whether the matching degree of the sample to be predicted and the target service event is greater than a fourth preset threshold value or not;
and if the matching degree of the sample to be predicted and the target service event is greater than a fourth preset threshold value, determining that information related to the target service event needs to be recommended to a service requester terminal of the sample to be predicted.
30. The information recommendation device according to claim 17, wherein if it is determined that the information related to the target service event needs to be recommended to the service requester terminal of the sample to be predicted, the determining module is further configured to:
adding the sample to be predicted to a plurality of positive samples in the training sample.
31. The information recommendation device according to claim 29, wherein if the matching degree between the sample to be predicted and the target service event is not greater than the fourth preset threshold, the determining module is further configured to:
sending prompt information to the service requester terminal of the sample to be predicted to prompt a user of the service requester terminal to select whether to receive information related to the target service event;
if first indication information, sent by the service requester terminal and indicating that the information related to the target service event is not to be received, is received, determining that the information related to the target service event does not need to be recommended to the service requester terminal of the sample to be predicted, and adding the sample to be predicted to the plurality of negative samples that do not match the target service event;
and if second indication information, sent by the service requester terminal and indicating that the information related to the target service event is to be received, is received, determining that the information related to the target service event needs to be recommended to the service requester terminal of the sample to be predicted, and adding the sample to be predicted to the plurality of positive samples in the training sample set.
32. The information recommendation device according to claim 17, wherein if it is determined that the information related to the target service event needs to be recommended to the service requester terminal of the sample to be predicted, the device further comprises:
an information recommending module, configured to send an information acquisition request to each third-party server providing a service for the target service event, receive the information related to the target service event that each third-party server sends in response to the information acquisition request, and recommend the information related to the target service event to the service requester terminal of the sample to be predicted.
33. A server, comprising: a processor, a storage medium and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating via the bus when the server is running, the processor executing the machine-readable instructions to perform the steps of the information recommendation method according to any one of claims 1-16.
34. A readable storage medium, characterized in that the readable storage medium has stored thereon a computer program which, when being executed by a processor, performs the steps of the information recommendation method according to any one of claims 1-16.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811475043.XA CN111274472A (en) | 2018-12-04 | 2018-12-04 | Information recommendation method and device, server and readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111274472A | 2020-06-12 |
Family
ID=70998649
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811475043.XA Pending CN111274472A (en) | 2018-12-04 | 2018-12-04 | Information recommendation method and device, server and readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111274472A (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111861178A (en) * | 2020-07-13 | 2020-10-30 | 北京嘀嘀无限科技发展有限公司 | Service matching model training method, service matching method, device and medium |
CN113077304A (en) * | 2021-03-22 | 2021-07-06 | 海南太美航空股份有限公司 | Flight information recommendation method and system and electronic equipment |
CN113191812A (en) * | 2021-05-12 | 2021-07-30 | 深圳索信达数据技术有限公司 | Service recommendation method, computer device and computer-readable storage medium |
CN113204654A (en) * | 2021-04-21 | 2021-08-03 | 北京达佳互联信息技术有限公司 | Data recommendation method and device, server and storage medium |
CN113407689A (en) * | 2021-06-15 | 2021-09-17 | 北京三快在线科技有限公司 | Method and device for model training and business execution |
CN115238837A (en) * | 2022-09-23 | 2022-10-25 | 荣耀终端有限公司 | Data processing method and device, electronic equipment and storage medium |
CN115239025A (en) * | 2022-09-21 | 2022-10-25 | 荣耀终端有限公司 | Payment prediction method and electronic equipment |
CN118133010A (en) * | 2024-02-23 | 2024-06-04 | 北京航空航天大学 | Graph model-based manufacturing cloud service recommendation model training method and recommendation method |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103345645A (en) * | 2013-06-27 | 2013-10-09 | 复旦大学 | Commodity image category forecasting method based on online shopping platform |
US20170061021A1 (en) * | 2015-08-28 | 2017-03-02 | Yandex Europe Ag | Method and apparatus for generating a recommended content list |
CN107871144A (en) * | 2017-11-24 | 2018-04-03 | 税友软件集团股份有限公司 | Invoice trade name sorting technique, system, equipment and computer-readable recording medium |
CN107911491A (en) * | 2017-12-27 | 2018-04-13 | 广东欧珀移动通信有限公司 | Information recommendation method, device and storage medium, server and mobile terminal |
CN108304441A (en) * | 2017-11-14 | 2018-07-20 | 腾讯科技(深圳)有限公司 | Network resource recommended method, device, electronic equipment, server and storage medium |
CN108763538A (en) * | 2018-05-31 | 2018-11-06 | 北京嘀嘀无限科技发展有限公司 | A kind of method and device in the geographical locations determining point of interest POI |
2018-12-04: CN201811475043.XA filed in CN, published as CN111274472A (status: Pending)
Non-Patent Citations (1)
Title |
---|
Gao Qiuyun (高秋云): "Activity Recommendation Based on User Interest Degree and Geographic Location" (基于用户兴趣度和地理位置的活动推荐) * |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111861178B (en) * | 2020-07-13 | 2024-06-07 | 北京嘀嘀无限科技发展有限公司 | Training method of service matching model, service matching method, equipment and medium |
CN111861178A (en) * | 2020-07-13 | 2020-10-30 | 北京嘀嘀无限科技发展有限公司 | Service matching model training method, service matching method, device and medium |
CN113077304B (en) * | 2021-03-22 | 2023-01-13 | 海南太美航空股份有限公司 | Flight information recommendation method and system and electronic equipment |
CN113077304A (en) * | 2021-03-22 | 2021-07-06 | 海南太美航空股份有限公司 | Flight information recommendation method and system and electronic equipment |
CN113204654A (en) * | 2021-04-21 | 2021-08-03 | 北京达佳互联信息技术有限公司 | Data recommendation method and device, server and storage medium |
CN113204654B (en) * | 2021-04-21 | 2024-03-29 | 北京达佳互联信息技术有限公司 | Data recommendation method, device, server and storage medium |
CN113191812B (en) * | 2021-05-12 | 2024-02-02 | 深圳索信达数据技术有限公司 | Service recommendation method, computer equipment and computer readable storage medium |
CN113191812A (en) * | 2021-05-12 | 2021-07-30 | 深圳索信达数据技术有限公司 | Service recommendation method, computer device and computer-readable storage medium |
CN113407689A (en) * | 2021-06-15 | 2021-09-17 | 北京三快在线科技有限公司 | Method and device for model training and business execution |
CN115239025A (en) * | 2022-09-21 | 2022-10-25 | 荣耀终端有限公司 | Payment prediction method and electronic equipment |
CN115238837B (en) * | 2022-09-23 | 2023-04-18 | 荣耀终端有限公司 | Data processing method and device, electronic equipment and storage medium |
CN115238837A (en) * | 2022-09-23 | 2022-10-25 | 荣耀终端有限公司 | Data processing method and device, electronic equipment and storage medium |
CN118133010A (en) * | 2024-02-23 | 2024-06-04 | 北京航空航天大学 | Graph model-based manufacturing cloud service recommendation model training method and recommendation method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111274472A (en) | Information recommendation method and device, server and readable storage medium | |
TWI676783B (en) | Method and system for estimating time of arrival | |
TWI670677B (en) | Systems and methods for recommending an estimated time of arrival | |
CN114944059B (en) | Method and system for determining estimated arrival time | |
WO2018191856A1 (en) | System and method for determining safety score of driver | |
US20170364933A1 (en) | User maintenance system and method | |
CN108780562B (en) | System and method for updating service sequences | |
EP3642769A1 (en) | Systems and methods for service request allocation | |
GB2550523A (en) | Methods and systems for order processing | |
CN111353092B (en) | Service pushing method, device, server and readable storage medium | |
TW202009807A (en) | Systems and methods for allocating orders | |
JP7047096B2 (en) | Systems and methods for determining estimated arrival times for online-to-offline services | |
CN112236787A (en) | System and method for generating personalized destination recommendations | |
CN111104585B (en) | Question recommending method and device | |
CN111367575B (en) | User behavior prediction method and device, electronic equipment and storage medium | |
CN111680382A (en) | Grade prediction model training method, grade prediction device and electronic equipment | |
CN111259119B (en) | Question recommending method and device | |
CN111222903B (en) | System and method for processing data from an online on-demand service platform | |
US11017340B2 (en) | Systems and methods for cheat examination | |
CN111274471B (en) | Information pushing method, device, server and readable storage medium | |
CN111259229B (en) | Question recommending method and device | |
WO2019128477A1 (en) | Systems and methods for assigning service requests | |
CN111275062A (en) | Model training method, device, server and computer readable storage medium | |
CN111353090B (en) | Service distribution method, device, server and readable storage medium | |
CN111831763B (en) | Map processing method, device, equipment and computer readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | AD01 | Patent right deemed abandoned | Effective date of abandoning: 20240614 |