CN117522566A - Credit transaction risk identification method, credit transaction risk identification device, electronic equipment and storage medium - Google Patents
Credit transaction risk identification method, credit transaction risk identification device, electronic equipment and storage medium
- Publication number
- CN117522566A, CN202311634725.1A
- Authority
- CN
- China
- Prior art keywords
- credit
- sample
- client
- customer
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/03—Credit; Loans; Processing thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/27—Regression, e.g. linear or logistic regression
Abstract
The application provides a credit transaction risk identification method, a credit transaction risk identification device, electronic equipment and a storage medium, wherein the method comprises the following steps: performing cluster analysis on credit transaction data of a plurality of credit clients, determining a plurality of credit client groups, and extracting sample credit clients from each credit client group; determining an association between each sample credit customer based on the distance between each sample credit customer; establishing a network diagram by taking each sample credit client as a node and the association relation among each sample credit client as an edge; and determining credit transaction risk identification results of each sample credit customer based on the credit transaction data of each sample credit customer in the network map. The method and the device provided by the application improve the recognition efficiency of the credit transaction risk and improve the recognition accuracy of the credit transaction risk.
Description
Technical Field
The present application relates to the field of computer technologies, and in particular, to a credit transaction risk identification method, apparatus, electronic device, and storage medium.
Background
When a bank handles a credit application made by a customer (e.g., for a credit card), it usually needs to assess the risk that the customer poses. In the prior art, huge volumes of credit customer data are usually analyzed manually, so the identification efficiency of credit transaction risks is low and the identification accuracy is poor.
Therefore, how to improve the recognition efficiency and recognition accuracy of credit transaction risk is a technical problem to be solved in the industry.
Disclosure of Invention
The application provides a credit transaction risk identification method, a credit transaction risk identification device, electronic equipment and a storage medium, which are used for solving the technical problems of how to improve the identification efficiency and identification accuracy of credit transaction risks.
The application provides a credit transaction risk identification method, which comprises the following steps:
performing cluster analysis on credit transaction data of a plurality of credit clients, determining a plurality of credit client groups, and extracting sample credit clients from each credit client group;
determining an association between each sample credit customer based on the distance between each sample credit customer;
establishing a network diagram by taking each sample credit client as a node and the association relation among each sample credit client as an edge;
and determining credit transaction risk identification results of each sample credit customer based on the credit transaction data of each sample credit customer in the network map.
In some embodiments, the determining the association between the sample credit customers based on the distance between the sample credit customers includes:
determining a preset distance threshold and the distance between each sample credit client;
comparing the distance between each sample credit customer with the preset distance threshold;
under the condition that the distance is smaller than the preset distance threshold, determining the association relation value of the two sample credit customers corresponding to the distance as a first value;
under the condition that the distance is larger than or equal to the preset distance threshold, determining that the association relation value of the two sample credit customers corresponding to the distance is a second value;
constructing an adjacency matrix based on the association relation values of each sample credit client; the adjacency matrix is used for representing the association relation of each sample credit client.
In some embodiments, after the constructing the adjacency matrix based on the association values of the respective sample credit customers, the method further comprises:
and determining the network viscosity of the network graph based on the number of the first numerical value and the second numerical value corresponding to each sample credit client in the adjacency matrix.
In some embodiments, after the determining the network viscosity of the network map based on the adjacency matrix, the method further comprises:
determining node attributes of each sample credit customer based on credit transaction data of each sample credit customer;
establishing a network autoregressive model based on the adjacency matrix and node attributes of each sample credit client; the network autoregressive model is used for predicting the network structure of the network graph.
In some embodiments, after the network map is built with each sample credit client as a node and the association relationship between each sample credit client as an edge, the method further includes:
the network map is visually displayed based on at least one of a force-directed graph, a Sankey diagram, and a tree diagram.
In some embodiments, after determining the credit transaction risk identification result for each sample credit customer based on the credit transaction data for each sample credit customer in the network map, the method further comprises:
in the case that the credit transaction risk identification result of the sample credit client is abnormal, determining each credit client in a credit client group to which the sample credit client belongs as a candidate abnormal credit client;
verifying the credit transaction data of each candidate abnormal credit customer, and determining the abnormal credit customers among the candidate abnormal credit customers based on the credit transaction data verification result of each candidate abnormal credit customer.
In some embodiments, the verifying the credit transaction data of each candidate abnormal credit customer, determining an abnormal credit customer among each candidate abnormal credit customer based on the credit transaction data verification results of each candidate abnormal credit customer, includes:
acquiring social media data of each candidate abnormal credit client;
inputting credit transaction data and/or social media data of each candidate abnormal credit customer into a credit transaction risk identification model to obtain a credit transaction risk identification result of each candidate abnormal credit customer output by the credit transaction risk identification model;
determining the candidate abnormal credit clients with abnormal credit transaction risk recognition results as abnormal credit clients;
the credit transaction risk recognition model is obtained by training a neural network model serving as an initial model based on credit transaction data and/or social media data of a plurality of abnormal credit customers.
The application provides a credit transaction risk identification device, comprising:
a clustering unit for performing cluster analysis on credit transaction data of a plurality of credit clients, determining a plurality of credit client groups, and extracting sample credit clients from the respective credit client groups;
a determining unit configured to determine an association relationship between the respective sample credit customers based on a distance between the respective sample credit customers;
the drawing building unit is used for building a network drawing by taking each sample credit client as a node and the association relation among each sample credit client as an edge;
and the identification unit is used for determining credit transaction risk identification results of all sample credit customers based on the credit transaction data of all sample credit customers in the network graph.
The present application provides an electronic device comprising a memory, in which a computer program is stored, and a processor arranged to perform the credit transaction risk identification method by means of the computer program.
The present application provides a computer readable storage medium comprising a stored program, wherein the program when run performs the credit transaction risk identification method.
The credit transaction risk identification method, device, electronic equipment and storage medium provided by the application perform cluster analysis on credit transaction data of a plurality of credit clients, determine a plurality of credit client groups, and extract sample credit clients from each credit client group; determine an association relation between the sample credit customers based on the distance between them; establish a network diagram by taking each sample credit client as a node and the association relations among the sample credit clients as edges; and determine the credit transaction risk identification result of each sample credit customer based on the credit transaction data of each sample credit customer in the network diagram. Cluster analysis and per-group sample extraction make it feasible to analyze huge volumes of credit customer data, and analyzing credit transactions through a network diagram allows potential credit transaction risks to be mined from the association relations among sample credit customers. This narrows the range in which abnormal credit customers must be identified, and improves both the efficiency and the accuracy of credit transaction risk identification.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
In order to more clearly illustrate the technical solutions of the present application or the prior art, the following description will briefly introduce the drawings used in the embodiments or the description of the prior art, and it is obvious that, in the following description, the drawings are some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a credit transaction risk identification method provided by the present application;
FIG. 2 is a schematic diagram of a credit transaction risk identification device provided by the present application;
FIG. 3 is a schematic structural diagram of an electronic device provided in the present application.
Detailed Description
In order to make the present application solution better understood by those skilled in the art, the following description will be made in detail and with reference to the accompanying drawings in the embodiments of the present application, it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, shall fall within the scope of the present application.
It should be noted that the terms "first," "second," and the like herein are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that embodiments of the present application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or units or modules is not necessarily limited to those steps or units or modules that are expressly listed or inherent to such process, method, article, or apparatus.
In the technical scheme of the application, the collection, storage, use, processing, transmission, provision, disclosure and other handling of customers' personal information all comply with the relevant laws and regulations, necessary security measures are adopted, and public order and good morals are not violated.
Fig. 1 is a schematic flow chart of a credit transaction risk identification method provided in the present application, and as shown in fig. 1, the method includes steps 110, 120, 130 and 140.
Step 110, performing cluster analysis on credit transaction data of a plurality of credit clients, determining a plurality of credit client groups, and extracting sample credit clients from each credit client group.
Specifically, the execution subject of the credit transaction risk identification method provided by the embodiment of the application is a credit transaction risk identification device. The apparatus may be implemented in software, such as a credit transaction risk identification program running in a computer; it may also be a device, such as a mobile terminal, tablet computer, desktop computer, server, or the like, that performs the credit transaction risk identification method.
A credit customer refers to an individual or business that requires borrowing or credit support. Credit transaction data refers to data related to credit transactions including information on loan amount, loan period, loan interest rate, repayment record, and the like. Such data is typically recorded in databases of credit institutions such as banks, financial institutions, etc. for assessing customer credit risk, formulating loan policies, performing risk control, etc.
Cluster analysis is a data analysis method that divides the objects in a data set into a plurality of groups with similar characteristics: objects within a group are highly similar to each other, and objects in different groups differ more strongly. In the financial field, cluster analysis is often used to categorize customer groups in order to better understand customer characteristics, behavior, and requirements, and thereby to formulate targeted marketing strategies, product designs, and risk management measures.
Thus, a cluster analysis algorithm may be employed to process credit transaction data for multiple credit customers, dividing the multiple credit customers into multiple credit customer groups. The individual credit clients in each credit client group have a high degree of similarity. The credit transaction data consists of mixed variables, including both continuous variables (measured in monetary units) and nominal variables, so the clustering algorithm may employ at least one of an algorithm based on the Gower distance, a clustering algorithm based on Partitioning Around Medoids (PAM), and an algorithm based on the silhouette coefficient (Silhouette Coefficient).
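As an illustration of how a distance over mixed variables can be computed, the following is a minimal sketch of the Gower distance between two customer records. The field names (`loan_amount`, `rate`, `product`) and the sample values are hypothetical and are not taken from the application; continuous fields contribute a range-normalized absolute difference and nominal fields contribute 0 on a match and 1 otherwise.

```python
def gower_distance(a, b, numeric_ranges, nominal_keys):
    """Gower distance between two records with mixed variable types.

    numeric_ranges maps each continuous field to its range (max - min)
    over the data set; nominal_keys lists the nominal (categorical) fields.
    """
    parts = []
    for key, value_range in numeric_ranges.items():
        # Range-normalized absolute difference, in [0, 1].
        parts.append(abs(a[key] - b[key]) / value_range if value_range else 0.0)
    for key in nominal_keys:
        # Simple matching: 0 if the categories agree, 1 otherwise.
        parts.append(0.0 if a[key] == b[key] else 1.0)
    return sum(parts) / len(parts)


# Hypothetical records; field names and values are illustrative only.
customers = [
    {"loan_amount": 10000.0, "rate": 0.05, "product": "card"},
    {"loan_amount": 20000.0, "rate": 0.05, "product": "mortgage"},
    {"loan_amount": 10000.0, "rate": 0.07, "product": "card"},
]
numeric_fields = ["loan_amount", "rate"]
ranges = {
    f: max(c[f] for c in customers) - min(c[f] for c in customers)
    for f in numeric_fields
}
d01 = gower_distance(customers[0], customers[1], ranges, ["product"])
d02 = gower_distance(customers[0], customers[2], ranges, ["product"])
```

A pairwise matrix of such distances could then be passed to a medoid-based clustering routine such as PAM, which only needs dissimilarities rather than coordinates.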
At least one sample may be drawn from each credit client group as a sample credit client, or samples may be drawn at a preset percentage, and the sample credit clients are analyzed as representatives of their credit client group. The extraction may be random, or may be based on the location of the individual samples in the credit client group. For example, the cluster center of each credit client group after clustering may be calculated, the distance between each credit client in the group and the cluster center may be calculated, the credit clients may be arranged in ascending order of that distance, and a preset number of credit clients may be selected as sample credit clients according to their position in that order.
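The location-based extraction described above can be sketched as follows. This is only an illustration under stated assumptions: customers are represented by hypothetical numeric feature vectors, the cluster center is taken as the coordinate-wise mean, and Euclidean distance is used; none of these choices is fixed by the application.

```python
import math


def draw_samples(group, center, count):
    """Return the ids of the `count` group members closest to `center`."""
    # Ascending order of distance to the cluster center.
    ranked = sorted(group, key=lambda item: math.dist(item[1], center))
    return [customer_id for customer_id, _ in ranked[:count]]


# Hypothetical credit client group: (customer id, numeric feature vector).
group = [
    ("c1", (0.0, 0.0)),
    ("c2", (4.0, 0.0)),
    ("c3", (2.0, 1.0)),
    ("c4", (2.0, 3.0)),
]
# Cluster center assumed to be the coordinate-wise mean of the group.
center = tuple(sum(vec[i] for _, vec in group) / len(group) for i in range(2))
samples = draw_samples(group, center, 2)
```

Selecting the members nearest the center favors typical customers of the group; random extraction would instead reflect the spread of the group.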
Step 120, determining an association relationship between each sample credit client based on the distance between each sample credit client.
Specifically, after the sample credit customers are drawn, distances between the respective sample credit customers may be calculated, and the association relationship between the respective sample credit customers may be determined based on these distances. The association here may be a business connection that the credit client has in the credit transaction. The smaller the distance between two sample credit customers, the more likely an association is to exist between the two sample credit customers; the greater the distance between two sample credit customers, the less likely an association is to exist between the two sample credit customers.
Step 130, establishing a network diagram by taking each sample credit client as a node and the association relation among each sample credit client as an edge.
Specifically, graph computation has been developed in the big data age, which is a technique of researching complex association relationships between things, and describing, characterizing, analyzing, and computing them. The network map may show associations of everything, such as between sample credit customers.
The network graph is mainly composed of nodes (Vertex) and edges (Edge). Nodes represent things and edges represent relationships between things. The network graph can be established with each sample credit client as a node and the association between each sample credit client as an edge. The network map may be used to identify credit transaction risk.
Step 140, determining the credit transaction risk recognition result of each sample credit customer based on the credit transaction data of each sample credit customer in the network map.
Specifically, the credit transaction risk of each sample credit customer can be identified according to the credit transaction data of the sample credit customers that are closely connected to each other in the network map, so as to obtain the credit transaction risk identification result of each sample credit customer.
For example, sample credit customers with close connections may be determined in the network map, and the credit transaction data of these sample credit customers may be analyzed. If the credit transaction data of any sample credit customer contains an abnormal credit transaction, the credit transaction risk identification result of that sample credit customer may be considered abnormal (that is, the sample credit customer is an abnormal credit customer), and the credit transaction risk identification results of the sample credit customers closely connected to that sample credit customer are also determined to be abnormal. The credit transaction risk identification result can be verified manually.
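The example above can be sketched as a one-hop propagation over the adjacency matrix. This sketch assumes that a "close connection" means a direct edge (value 1) between two nodes and that flags are not propagated further than one hop; both are illustrative assumptions rather than requirements stated in the application.

```python
def flag_abnormal(adjacency, has_abnormal_transaction):
    """Flag each abnormal sample and every sample directly connected to it."""
    n = len(adjacency)
    flagged = list(has_abnormal_transaction)
    for i in range(n):
        if has_abnormal_transaction[i]:
            for j in range(n):
                # A value of 1 marks an edge between samples i and j.
                if adjacency[i][j] == 1:
                    flagged[j] = True
    return flagged


# Hypothetical network of four sample credit customers.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
# Only customer 0 has an abnormal credit transaction in its own data.
seed = [True, False, False, False]
result = flag_abnormal(adjacency, seed)
```

Here customer 1 is flagged because it is adjacent to customer 0, while customers 2 and 3 are not; the flagged set is a list of candidates that would still be verified.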
For another example, a community is a tightly connected collection of nodes in a network graph. The links between nodes within a community are relatively strong, while the links between nodes in different communities are relatively weak. A community detection algorithm (Community Detection, also known as a community discovery algorithm) may be employed to divide the network graph into communities. Community detection algorithms include algorithms based on spectral clustering, algorithms based on hierarchical clustering, algorithms based on modularity (Modularity), and the like. Common community detection algorithms are the Girvan-Newman algorithm, the Louvain algorithm, the Clauset-Newman-Moore algorithm, the label propagation algorithm, and the like. A community detection algorithm can be applied to the network graph to divide the whole graph into a plurality of communities; the sample credit clients in the same community are more closely connected, and the credit transaction data of each sample credit client is then analyzed within each community, so that the credit transaction risk identification result of each sample credit client is determined.
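Modularity-based methods such as Louvain and Clauset-Newman-Moore search for a partition that maximizes the modularity score Q. The following minimal sketch only evaluates Q for a given candidate partition of an undirected, unweighted graph; it is not an implementation of any of the named algorithms, and the example graph is hypothetical.

```python
def modularity(adjacency, community_of):
    """Newman modularity Q of a partition of an undirected graph.

    adjacency is a symmetric 0/1 matrix; community_of[i] is the community
    label assigned to node i.
    """
    n = len(adjacency)
    degrees = [sum(row) for row in adjacency]
    two_m = sum(degrees)  # twice the number of edges
    if two_m == 0:
        return 0.0
    q = 0.0
    for i in range(n):
        for j in range(n):
            if community_of[i] == community_of[j]:
                # Observed edge minus the edge expected by chance.
                q += adjacency[i][j] - degrees[i] * degrees[j] / two_m
    return q / two_m


# Hypothetical graph: two triangles (0-1-2 and 3-4-5) joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
graph = [[0] * 6 for _ in range(6)]
for u, v in edges:
    graph[u][v] = graph[v][u] = 1

q_split = modularity(graph, [0, 0, 0, 1, 1, 1])   # one community per triangle
q_single = modularity(graph, [0, 0, 0, 0, 0, 0])  # everything in one community
```

Splitting the graph into its two triangles scores higher than keeping all nodes together, which is the signal a modularity-based detector uses to prefer that division.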
After determining the credit transaction risk recognition results of the respective sample credit clients, the credit transaction risk recognition results of the respective credit clients in the credit client group to which the respective sample credit clients belong may be determined according to the similarity of the respective credit clients in the same credit client group.
The credit transaction risk identification method provided by the embodiment of the application performs cluster analysis on credit transaction data of a plurality of credit clients, determines a plurality of credit client groups, and extracts sample credit clients from each credit client group; determines an association relation between the sample credit customers based on the distance between them; establishes a network diagram by taking each sample credit client as a node and the association relations among the sample credit clients as edges; and determines the credit transaction risk identification result of each sample credit customer based on the credit transaction data of each sample credit customer in the network diagram. Cluster analysis and per-group sample extraction make it feasible to analyze huge volumes of credit customer data, and analyzing credit transactions through a network diagram allows potential credit transaction risks to be mined from the association relations among sample credit customers. This narrows the range in which abnormal credit customers must be identified, and improves both the efficiency and the accuracy of credit transaction risk identification.
It should be noted that the embodiments of the present application may be freely combined, reordered, or executed separately, and do not depend on a fixed execution sequence.
In some embodiments, step 120 comprises:
determining a preset distance threshold and the distance between each sample credit client;
comparing the distance between each sample credit customer with a preset distance threshold;
under the condition that the distance is smaller than a preset distance threshold value, determining the association relation value of two sample credit customers corresponding to the distance as a first value;
under the condition that the distance is larger than or equal to a preset distance threshold value, determining the association relation value of two sample credit customers corresponding to the distance as a second value;
constructing an adjacency matrix based on the association relation values of each sample credit client; the adjacency matrix is used to represent the associations of individual sample credit customers.
Specifically, the distance between each sample credit customer can be calculated through a Gower distance algorithm or another distance algorithm, and the preset distance threshold is set according to actual conditions.
The distance between each pair of sample credit clients is compared with the preset distance threshold. The strength of the association relationship can be represented by an association relationship value: a first value indicates a stronger association and a second value indicates a weaker association. For example, the first value may be "1" and the second value may be "0". If the distance is smaller than the preset distance threshold, an association relationship may exist between the two sample credit clients, and the association relationship value can be represented by "1". If the distance is greater than or equal to the preset distance threshold, an association relationship between the two sample credit customers probably does not exist, and the association relationship value can be represented by "0".
If a two-dimensional array is used to represent the association between each sample credit client, the value of the association is the value of the element in the two-dimensional array, so that an adjacency matrix representing the association of each sample credit client can be obtained.
If the preset distance threshold is set smaller, there are fewer first values in the adjacency matrix, so the network diagram generated later has fewer edges (connections) and shows fewer association relationships between sample credit customers; this avoids visual clutter in the network diagram and highlights closely connected customers. If the preset distance threshold is set larger, there are more first values in the adjacency matrix, so the network diagram generated later has more edges (connections) and shows more association relationships between sample credit customers; the network diagram may then become visually cluttered, and closely connected customers cannot be highlighted. Therefore, the preset distance threshold needs to be set reasonably.
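The thresholding step and the effect of the threshold on edge count can be sketched as follows. The distance values are hypothetical, and leaving the diagonal at the second value (no self-loops) is an assumption that the application does not address.

```python
def build_adjacency(distances, threshold, first_value=1, second_value=0):
    """Turn a pairwise distance matrix into an adjacency matrix.

    A pair whose distance is below `threshold` gets `first_value`;
    every other pair gets `second_value`.
    """
    n = len(distances)
    adjacency = [[second_value] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Assumption: a node is never linked to itself.
            if i != j and distances[i][j] < threshold:
                adjacency[i][j] = first_value
    return adjacency


# Hypothetical pairwise distances between three sample credit customers.
distances = [
    [0.0, 0.2, 0.7],
    [0.2, 0.0, 0.4],
    [0.7, 0.4, 0.0],
]
adjacency = build_adjacency(distances, threshold=0.5)
sparse = build_adjacency(distances, threshold=0.3)
```

With the threshold at 0.5 two pairs are linked, while lowering it to 0.3 keeps only the closest pair, which matches the trade-off between clutter and coverage described above.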
In addition, the preset percentage of the samples is extracted from the credit client group, the number of nodes in the network diagram can be determined, and the network diagram generated in the later stage can be appropriate in size by reasonably setting the preset percentage, so that a better visual effect can be obtained.
According to the credit transaction risk identification method provided by the embodiment of the invention, the preset distance threshold value can be set, the distance between each sample credit client is compared with the preset distance threshold value, and the adjacency matrix is generated to represent the association relation of each sample credit client, so that the connection analysis of each node in the network diagram is conveniently carried out through the adjacency matrix, and the identification efficiency of the credit transaction risk is improved.
In some embodiments, after constructing the adjacency matrix based on the association values of the respective sample credit customers, the method further comprises:
the network viscosity of the network map is determined based on the number of first and second values corresponding to each sample credit client in the adjacency matrix.
In particular, network viscosity refers to the degree of closeness of connections between nodes in a network graph.
The step of determining the network viscosity of the network map from the adjacency matrix may comprise: 1. constructing an adjacency matrix; and constructing an adjacency matrix according to the edge list or the connection relation of the network graph. Rows and columns of the matrix represent nodes in the network, respectively, and elements in the matrix represent connection strengths between the nodes; 2. normalizing the adjacency matrix; the adjacency matrix may be normalized such that each element is between 0 and 1. Common normalization methods include dividing each row by the number of degrees of that row, or dividing the entire matrix by its largest element; 3. calculating the network viscosity; the viscosity can be measured in different ways, and common indicators include average degrees of nodes, average connection strength of a network, clustering coefficients, and the like. The network viscosity in the embodiment of the application can be determined according to the quantity proportion of the first value and the second value corresponding to each sample credit client. For example, the ratio of the number of the first values "1" to the total number of the first values "1" and the second values "0" can be used as the network viscosity.
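The ratio described in the example can be sketched as below. Whether diagonal entries are counted is not specified in the application, so this sketch excludes them by default and exposes the choice through a hypothetical `include_diagonal` parameter.

```python
def network_viscosity(adjacency, include_diagonal=False):
    """Share of first values (1) among all counted adjacency entries."""
    ones = 0
    zeros = 0
    for i, row in enumerate(adjacency):
        for j, value in enumerate(row):
            # Assumption: self-pairs are skipped unless explicitly requested.
            if i == j and not include_diagonal:
                continue
            if value == 1:
                ones += 1
            else:
                zeros += 1
    total = ones + zeros
    return ones / total if total else 0.0


# Hypothetical adjacency matrix for three sample credit customers.
matrix = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
viscosity = network_viscosity(matrix)
viscosity_with_diagonal = network_viscosity(matrix, include_diagonal=True)
```

For a 0/1 matrix without self-loops, the default variant equals the usual graph density, so a value near 1 indicates that most pairs of sample credit customers are associated.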
Furthermore, the relationship density of each sample credit customer, i.e., the number of other credit customers to which that sample credit customer is connected, can also be calculated in the network graph.
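Read this way, the relationship density of a node is its degree in the network graph. A short sketch under that reading follows; the function name `relationship_density` and the 0/1 encoding of the adjacency matrix are assumptions made for the example.

```python
def relationship_density(adj):
    """Number of other nodes each node is connected to (node degree)."""
    return [
        sum(1 for j, value in enumerate(row) if j != i and value == 1)
        for i, row in enumerate(adj)
    ]

degrees = relationship_density([
    [0, 1, 1],
    [1, 0, 0],
    [1, 0, 0],
])
```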
The network graph can be analyzed and studied through network viscosity and relationship density. For example, the network size of the network graph can be specified in order to study credit customer network relationships quantitatively: with other conditions held constant, the influence of network size on the overall network viscosity and on the magnitude of the estimated values is studied; likewise, with other conditions held constant, the influence of the relationship density between individuals on the overall network viscosity and on the magnitude of the estimated values is studied.
In some embodiments, after determining the network viscosity of the network map based on the adjacency matrix, the method further comprises:
determining node attributes of each sample credit customer based on credit transaction data of each sample credit customer;
establishing a network autoregressive model based on the adjacency matrix and node attributes of each sample credit client; the network autoregressive model is used for predicting the network structure of the network graph.
Specifically, in analyzing a network graph, it is generally necessary to consider both node attributes and the association relations between nodes. Node attributes are features associated with a node; for example, they may include identification attributes, category attributes, numerical attributes, and the like. When a social network analysis method is used to analyze a network graph, the data generally required include "source", "target" and "value", i.e., the indices i and j of the edge from the i-th node to the j-th node and the weight of that edge, as well as "name", "group" and "size", i.e., the name of each node, the number of the category to which the node belongs, and the size of the node. For the node corresponding to a sample credit customer, the node attribute may be the total amount that the sample credit customer can borrow. The thickness of an edge represents how closely the credit customers are connected.
A network autoregressive model for predicting the network structure of the network graph may be constructed. The network autoregressive model is a method for modeling and predicting the evolution of a dynamic network: based on past network structure and node attribute information, it uses an autoregressive model to predict future network evolution. Adjacency matrices corresponding to different times, together with the node attributes of each sample credit customer, can be input into the network autoregressive model, which learns from the past network structure and node attribute information to predict the evolution trend of the future network. In addition, when network data are lost or incomplete, the network autoregressive model can use the existing data to reconstruct the missing network structure: the missing part of the data is filled in through the prediction capability of the model, so that the network is reconstructed and recovered.
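The source does not give the equation of its network autoregressive model. For orientation only, the sketch below shows one-step prediction with a first-order network autoregression on node attributes, in which each node's next value depends on its own previous value and on the mean previous value of its neighbours. The function name `nar_step`, the coefficients `beta0`, `beta1`, `beta2` and the example values are all assumptions; in practice the coefficients would be estimated from adjacency matrices and node attributes observed at several times, and this form predicts node attributes rather than edges.

```python
def nar_step(adj, y_prev, beta0, beta1, beta2):
    """One-step prediction:
    y_i(t) = beta0 + beta1 * mean(y_j(t-1) over neighbours j of i) + beta2 * y_i(t-1)
    """
    n = len(adj)
    y_next = []
    for i in range(n):
        neighbours = [j for j in range(n) if j != i and adj[i][j] == 1]
        # A node without neighbours contributes no network effect.
        if neighbours:
            neighbour_mean = sum(y_prev[j] for j in neighbours) / len(neighbours)
        else:
            neighbour_mean = 0.0
        y_next.append(beta0 + beta1 * neighbour_mean + beta2 * y_prev[i])
    return y_next

adj = [
    [0, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
loan_totals = [10.0, 20.0, 30.0]  # hypothetical node attribute at time t-1
predicted = nar_step(adj, loan_totals, beta0=0.0, beta1=0.2, beta2=0.8)
```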
In the credit transaction risk identification method provided by the embodiment of the application, the network structure of the network graph is predicted by establishing a network autoregressive model, so that potential credit transaction risks among sample credit customers can be mined from the generated network graph, improving both the efficiency and the accuracy of credit transaction risk identification.
In some embodiments, step 130 is followed by:
visually displaying the network graph based on at least one of a force-directed graph, a Sankey diagram, and a tree diagram.
Specifically, a force-directed graph is a network graph visualization method. In a force-directed graph, each node represents an entity or object in the network and each edge represents a connection or relationship between them, so it can be used to display complex network relationships.
A Sankey diagram is a graphical representation method for visualizing flows of traffic, energy, resources, and the like. It displays the quantitative relations and flows among different nodes graphically, which makes the distribution and direction of the data easier to understand. In a Sankey diagram, each node represents a state or process, and each link represents the transfer and flow of data or material between different states.
A Tree diagram (Tree diagram) is a graphic representation method for expressing hierarchical structure information, which can clearly show parent-child relationships and constituent structures between various hierarchies. In a tree graph, the overall structure resembles a tree, with each node representing a hierarchy or class and the edges representing the relationships between the nodes.
An interactive graph can be generated in any of the three forms above. In the graph, the size of a circle represents the size of a credit customer, namely the total amount that the customer and the customer's family can borrow; the thickness of an edge represents how closely two credit customers are connected; and two different circle colors represent the two classifications. By setting the zoom parameter, non-restoring interactive operations can be performed: network nodes or edges can be dragged and the layout is transformed automatically, so that the relations between credit customers are displayed more clearly and intuitively. By setting the font size parameter, hovering the mouse over a network node (circle) displays the identifier of the credit customer represented by that node, which facilitates judgment and analysis.
In the credit transaction risk identification method provided by the embodiment of the application, the network graph can be visually displayed through at least one of a force-directed graph, a Sankey diagram and a tree diagram, so that the network graph is presented intuitively and the efficiency of credit transaction risk identification is improved.
In some embodiments, after step 140, the method further comprises:
in the case that the credit transaction risk identification result of the sample credit client is abnormal, determining each credit client in a credit client group to which the sample credit client belongs as a candidate abnormal credit client;
verifying the credit transaction data of each candidate abnormal credit customer, and determining the abnormal credit customer in each candidate abnormal credit customer based on the credit transaction data verification result of each candidate abnormal credit customer.
In particular, in the event that the credit transaction risk identification result of the sample credit client is abnormal, each credit client in the credit client group to which the sample credit client belongs may be determined as a candidate abnormal credit client. The candidate abnormal credit customer is not necessarily a real abnormal credit customer and therefore needs to be further verified.
The credit transaction data of each candidate abnormal credit customer may be verified manually or by computer. For example, manual verification may be carried out through follow-up visits to candidate abnormal credit customers; social media data of candidate abnormal credit customers can also be collected and analyzed, as a basis for either manual or computer verification.
If a candidate abnormal credit customer passes the credit transaction data verification, the candidate is a normal credit customer; if the candidate does not pass, the candidate is an abnormal credit customer.
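The two stages described above, expanding an abnormal sample to its whole group and then verifying each candidate, can be sketched as follows. The data layout, the function name `find_abnormal_customers` and the `verify` callback are hypothetical; the callback stands in for whatever manual or computer verification is actually used and returns True when a candidate passes.

```python
def find_abnormal_customers(groups, sample_results, verify):
    """groups: group id -> list of customer ids in that credit customer group.
    sample_results: sample customer id -> (group id, True if the result is abnormal).
    verify: callable returning True when a candidate passes verification.
    """
    candidates = []
    for _sample_id, (group_id, is_abnormal) in sample_results.items():
        if not is_abnormal:
            continue
        # Every customer in the abnormal sample's group becomes a candidate.
        for customer in groups[group_id]:
            if customer not in candidates:
                candidates.append(customer)
    # Candidates that fail verification are the abnormal credit customers.
    return [customer for customer in candidates if not verify(customer)]

groups = {"g1": ["a", "b", "c"], "g2": ["d", "e"]}
sample_results = {"a": ("g1", True), "d": ("g2", False)}
failed_checks = {"b"}  # hypothetical verification outcome
abnormal = find_abnormal_customers(
    groups, sample_results, verify=lambda customer: customer not in failed_checks
)
```

Only group g1 is expanded, because only its sample was flagged, and only the candidate that fails verification is reported.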
According to the credit transaction risk identification method provided by the embodiment of the application, the credit transaction data of each candidate abnormal credit customer are verified, so that the identification accuracy of the credit transaction risk is improved.
In some embodiments, verifying credit transaction data for each candidate abnormal credit customer, determining an abnormal credit customer among each candidate abnormal credit customer based on credit transaction data verification results for each candidate abnormal credit customer, comprising:
acquiring social media data of each candidate abnormal credit client;
inputting credit transaction data and/or social media data of each candidate abnormal credit customer into a credit transaction risk identification model to obtain a credit transaction risk identification result of each candidate abnormal credit customer output by the credit transaction risk identification model;
determining a candidate abnormal credit client with abnormal credit transaction risk identification results as an abnormal credit client;
the credit transaction risk recognition model is obtained by training an initial model based on credit transaction data and/or social media data of a plurality of sample abnormal credit clients by taking the neural network model as the initial model.
In particular, social media data refers to various digital content generated by credit customers on a social media platform, including text, image, video, and voice data. By analyzing the social media data of the credit client, the behavior and the interest of the client on the social media platform can be obtained. If the customer has abnormal transaction activity or is interested in abnormal transaction activity, the credit customer is at a higher risk of conducting an abnormal credit transaction.
The neural network model can be used as an initial model, and a credit transaction risk identification model is obtained after training and is used for identifying the credit transaction risk of each candidate abnormal credit customer, so as to obtain the credit transaction risk identification result of the candidate abnormal credit customer.
The pre-training process for the credit transaction risk identification model is as follows:
First, credit transaction data and/or social media data of a large number of credit customers are collected. The sample label of each of these credit customers is abnormal (abnormal credit transactions exist); that is, these credit customers are sample abnormal credit customers. Second, with the neural network model as the initial model, the credit transaction data and/or social media data of the plurality of sample abnormal credit customers are input into the initial model, which outputs credit transaction risk prediction results. The credit transaction risk prediction result is taken as the predicted value and the sample label as the actual value; the parameters of the initial model are adjusted according to the difference between the predicted value and the actual value to improve its prediction capability, finally yielding the credit transaction risk identification model.
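The adjust-by-difference loop described above can be illustrated with a deliberately small model. The sketch below trains a single logistic neuron by stochastic gradient descent as a stand-in for the neural network model; the two features, the learning rate, the number of epochs and the function names are hypothetical. It also assumes that normally labeled samples (label 0) are available alongside abnormal ones (label 1), which the source does not state.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(features, labels, epochs=500, lr=0.5):
    """Fit weights w and bias b of one logistic neuron by stochastic gradient descent."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            predicted = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            error = predicted - y  # difference between predicted and actual value
            # Move each parameter against the gradient of the log loss.
            w = [wi - lr * error * xi for wi, xi in zip(w, x)]
            b -= lr * error
    return w, b

def is_abnormal(w, b, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5

# Hypothetical, already normalized features; label 1 means abnormal.
features = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
labels = [0, 0, 1, 1]
weights, bias = train(features, labels)
```

A trained model of this kind would then be applied to each candidate abnormal credit customer, for example `is_abnormal(weights, bias, [0.85, 0.85])`.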
The neural network model may be a feedforward neural network, a convolutional neural network, a time-series neural network, or the like, and the type of the neural network model is not particularly limited in the embodiments of the present application.
According to the credit transaction risk identification method provided by the embodiment of the application, the credit transaction risk of the candidate abnormal credit clients is identified by adopting the neural network method, so that the identification accuracy of the credit transaction risk is improved.
On the basis of the above embodiments, the embodiments of the present application provide a credit transaction risk identification method, which is mainly used for identifying a transaction risk that may exist in a bank credit card service, including:
Step one, generating a network adjacency matrix
By analyzing the characteristics of the credit data, cluster analysis and grouped sample extraction are performed on the large volume of credit data, the distance between each pair of samples is calculated, the connection relations of credit customers are judged from the calculated distances, and an adjacency matrix is generated.
Step two, establishing a network graph
Each sample is taken as a node, and the node attributes are imported directly. With the two kinds of data described above (node attributes and adjacency matrix), a network graph can be built directly.
Step three, establishing a network autoregressive model to judge the viscosity between customers
Network viscosity is calculated from the adjacency matrix, and the network relations of the data set are analyzed by generating a network autoregressive model.
Step four, visually displaying the network diagram
An interactable network diagram can be drawn for visual display.
The apparatus provided in the embodiments of the present application will be described below, and the apparatus described below and the method described above may be referred to correspondingly.
Fig. 2 is a schematic structural diagram of a credit transaction risk identification device provided in the present application. As shown in Fig. 2, the device includes:
a clustering unit 210 for performing cluster analysis on credit transaction data of a plurality of credit customers, determining a plurality of credit customer groups, and extracting sample credit customers from the respective credit customer groups;
a determining unit 220 for determining an association relationship between the respective sample credit customers based on the distances between the respective sample credit customers;
a graph building unit 230 for building a network graph with each sample credit customer as a node and with the association relations between the sample credit customers as edges;
an identification unit 240 for determining credit transaction risk identification results for each sample credit customer based on the credit transaction data for each sample credit customer in the network map.
The credit transaction risk identification device provided by the embodiment of the application performs cluster analysis on credit transaction data of a plurality of credit customers, determines a plurality of credit customer groups, and extracts sample credit customers from each credit customer group; determines the association relations between the sample credit customers based on the distances between them; establishes a network graph with each sample credit customer as a node and the association relations between sample credit customers as edges; and determines the credit transaction risk identification result of each sample credit customer based on the credit transaction data of each sample credit customer in the network graph. Cluster analysis and grouped sample extraction make it feasible to analyze very large volumes of credit customer data, and constructing a network graph to analyze credit transactions allows potential credit transaction risks among sample credit customers to be mined from their association relations. This narrows the range within which abnormal credit customers must be identified and improves both the efficiency and the accuracy of credit transaction risk identification.
In some embodiments, the determining unit is used for:
determining a preset distance threshold and the distance between each sample credit client;
comparing the distance between each sample credit customer with a preset distance threshold;
under the condition that the distance is smaller than a preset distance threshold value, determining the association relation value of two sample credit customers corresponding to the distance as a first value;
under the condition that the distance is larger than or equal to a preset distance threshold value, determining the association relation value of two sample credit customers corresponding to the distance as a second value;
constructing an adjacency matrix based on the association relation values of each sample credit client; the adjacency matrix is used to represent the associations of individual sample credit customers.
In some embodiments, the determining unit is used for:
determining the network viscosity of the network graph based on the numbers of first values and second values corresponding to each sample credit customer in the adjacency matrix.
In some embodiments, the determining unit is used for:
determining node attributes of each sample credit customer based on credit transaction data of each sample credit customer;
establishing a network autoregressive model based on the adjacency matrix and node attributes of each sample credit client; the network autoregressive model is used for predicting the network structure of the network graph.
In some embodiments, the device further comprises:
a display unit for visually displaying the network graph based on at least one of a force-directed graph, a Sankey diagram, and a tree diagram.
In some embodiments, the identification unit is configured to:
in the case that the credit transaction risk identification result of the sample credit client is abnormal, determining each credit client in a credit client group to which the sample credit client belongs as a candidate abnormal credit client;
verifying the credit transaction data of each candidate abnormal credit customer, and determining the abnormal credit customer in each candidate abnormal credit customer based on the credit transaction data verification result of each candidate abnormal credit customer.
In some embodiments, the identification unit is configured to:
acquiring social media data of each candidate abnormal credit client;
inputting credit transaction data and/or social media data of each candidate abnormal credit customer into a credit transaction risk identification model to obtain a credit transaction risk identification result of each candidate abnormal credit customer output by the credit transaction risk identification model;
determining a candidate abnormal credit client with abnormal credit transaction risk identification results as an abnormal credit client;
the credit transaction risk identification model is obtained by taking a neural network model as an initial model and training the initial model based on credit transaction data and/or social media data of a plurality of sample abnormal credit customers.
Fig. 3 is a schematic structural diagram of an electronic device provided in the present application. As shown in Fig. 3, the electronic device may include a processor 310, a communication interface 320, a memory 330 and a communication bus 340, wherein the processor 310, the communication interface 320 and the memory 330 communicate with each other through the communication bus 340. The processor 310 may invoke logic instructions in the memory 330 to perform the following method:
performing cluster analysis on credit transaction data of a plurality of credit clients, determining a plurality of credit client groups, and extracting sample credit clients from each credit client group; determining an association between each sample credit customer based on the distance between each sample credit customer; establishing a network diagram by taking each sample credit client as a node and the association relation among each sample credit client as an edge; and determining credit transaction risk identification results of the sample credit customers based on the credit transaction data of the sample credit customers in the network map.
In addition, the logic instructions in the memory described above may be implemented in the form of software functional units and, when sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium and comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
The processor in the electronic device provided by the embodiment of the present application may call the logic instructions in the memory to implement the above method. Its specific implementation is consistent with that of the foregoing method and achieves the same beneficial effects, which are not described again here.
The present application also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the methods provided by the above embodiments.
The specific embodiment is consistent with the foregoing method embodiment, and the same beneficial effects can be achieved, and will not be described herein.
Embodiments of the present application provide a computer program product comprising a computer program which, when executed by a processor, implements a method as described above.
The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the solution without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and are not limiting thereof; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the corresponding technical solutions.
Claims (10)
1. A method of credit transaction risk identification, comprising:
performing cluster analysis on credit transaction data of a plurality of credit clients, determining a plurality of credit client groups, and extracting sample credit clients from each credit client group;
determining an association between each sample credit customer based on the distance between each sample credit customer;
establishing a network diagram by taking each sample credit client as a node and the association relation among each sample credit client as an edge;
and determining credit transaction risk identification results of each sample credit customer based on the credit transaction data of each sample credit customer in the network map.
2. The credit transaction risk identification method of claim 1, wherein the determining the association between the respective sample credit customers based on the distance between the respective sample credit customers includes:
determining a preset distance threshold and the distance between each sample credit client;
comparing the distance between each sample credit customer with the preset distance threshold;
under the condition that the distance is smaller than the preset distance threshold, determining the association relation value of the two sample credit customers corresponding to the distance as a first value;
under the condition that the distance is larger than or equal to the preset distance threshold, determining that the association relation value of the two sample credit customers corresponding to the distance is a second value;
constructing an adjacency matrix based on the association relation values of each sample credit client; the adjacency matrix is used for representing the association relation of each sample credit client.
3. The credit transaction risk identification method of claim 2, wherein after constructing the adjacency matrix based on the association values of the respective sample credit customers, the method further comprises:
and determining the network viscosity of the network graph based on the number of the first numerical value and the second numerical value corresponding to each sample credit client in the adjacency matrix.
4. The credit transaction risk identification method of claim 3, wherein after the determining the network viscosity of the network map based on the adjacency matrix, the method further comprises:
determining node attributes of each sample credit customer based on credit transaction data of each sample credit customer;
establishing a network autoregressive model based on the adjacency matrix and node attributes of each sample credit client; the network autoregressive model is used for predicting the network structure of the network graph.
5. The credit transaction risk identification method according to claim 1, wherein after the network map is established with each sample credit client as a node and with the association relationship between each sample credit client as an edge, the method further comprises:
visually displaying the network graph based on at least one of a force-directed graph, a Sankey diagram, and a tree diagram.
6. The credit transaction risk identification method of claim 1, wherein after the determining the credit transaction risk identification result for each sample credit customer based on the credit transaction data for each sample credit customer in the network map, the method further comprises:
in the case that the credit transaction risk identification result of the sample credit client is abnormal, determining each credit client in a credit client group to which the sample credit client belongs as a candidate abnormal credit client;
verifying the credit transaction data of each candidate abnormal credit customer, and determining the abnormal credit customer in each candidate abnormal credit customer based on the credit transaction data verification result of each candidate abnormal credit customer.
7. The credit transaction risk identification method of claim 1, wherein the verifying the credit transaction data of each candidate abnormal credit customer, and determining the abnormal credit customer in each candidate abnormal credit customer based on the credit transaction data verification result of each candidate abnormal credit customer, comprises:
acquiring social media data of each candidate abnormal credit client;
inputting credit transaction data and/or social media data of each candidate abnormal credit customer into a credit transaction risk identification model to obtain a credit transaction risk identification result of each candidate abnormal credit customer output by the credit transaction risk identification model;
determining the candidate abnormal credit clients with abnormal credit transaction risk recognition results as abnormal credit clients;
the credit transaction risk identification model is obtained by training a neural network model serving as an initial model based on credit transaction data and/or social media data of a plurality of sample abnormal credit customers.
8. A credit transaction risk identification device, comprising:
a clustering unit for performing cluster analysis on credit transaction data of a plurality of credit clients, determining a plurality of credit client groups, and extracting sample credit clients from the respective credit client groups;
a determining unit configured to determine an association relationship between the respective sample credit customers based on a distance between the respective sample credit customers;
a graph building unit for building a network graph with each sample credit client as a node and with the association relation between each sample credit client as an edge;
and an identification unit for determining credit transaction risk identification results of each sample credit customer based on the credit transaction data of each sample credit customer in the network graph.
9. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to execute the credit transaction risk identification method according to any of claims 1 to 7 by means of the computer program.
10. A computer readable storage medium, characterized in that the computer readable storage medium comprises a stored program, wherein the program when run performs the credit transaction risk identification method of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311634725.1A CN117522566A (en) | 2023-11-30 | 2023-11-30 | Credit transaction risk identification method, credit transaction risk identification device, electronic equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311634725.1A CN117522566A (en) | 2023-11-30 | 2023-11-30 | Credit transaction risk identification method, credit transaction risk identification device, electronic equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117522566A true CN117522566A (en) | 2024-02-06 |
Family
ID=89766270
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311634725.1A Pending CN117522566A (en) | 2023-11-30 | 2023-11-30 | Credit transaction risk identification method, credit transaction risk identification device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117522566A (en) |
-
2023
- 2023-11-30 CN CN202311634725.1A patent/CN117522566A/en active Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108876600A (en) | | Warning information pushing method, device, computer equipment and medium |
CN110390465A (en) | | Risk control analysis and processing method, device and computer equipment for business data |
US20150269669A1 | | Loan risk assessment using cluster-based classification for diagnostics |
CN112991079B (en) | | Multi-card co-occurrence medical treatment fraud detection method, system, cloud end and medium |
CN112700324A (en) | | User loan default prediction method based on combination of Catboost and restricted Boltzmann machine |
WO2022143431A1 | | Method and apparatus for training anti-money laundering model |
CN111563187A (en) | | Relationship determination method, device and system and electronic equipment |
CN113344700A (en) | | Risk control model construction method and device based on multi-objective optimization and electronic equipment |
CN112581271A (en) | | Merchant transaction risk monitoring method, device, equipment and storage medium |
CN111861487A (en) | | Financial transaction data processing method, and fraud monitoring method and device |
CN115545886A (en) | | Overdue risk identification method, device, equipment and storage medium |
CN113762973A (en) | | Data processing method and device, computer readable medium and electronic equipment |
CN116307765A (en) | | Artificial intelligence government affair data review method and system |
CN115439928A (en) | | Operation behavior identification method and device |
CN114612239A (en) | | Stock public opinion monitoring and risk control system based on algorithm, big data and artificial intelligence |
CN117522403A (en) | | GCN abnormal customer early warning method and device based on subgraph fusion |
CN117670359A (en) | | Abnormal transaction data identification method and device, storage medium and electronic equipment |
CN112990989A (en) | | Value prediction model input data generation method, device, equipment and medium |
CN116739764A (en) | | Transaction risk detection method, device, equipment and medium based on machine learning |
CN111652708A (en) | | Risk assessment method and device applied to house mortgage loan products |
CN117575595A (en) | | Payment risk identification method, device, computer equipment and storage medium |
CN110619564B (en) | | Anti-fraud feature generation method and device |
Oztas et al. | | Enhancing Anti-Money Laundering: Development of a Synthetic Transaction Monitoring Dataset |
Xiao et al. | | Explainable fraud detection for few labeled time series data |
CN110458684A (en) | | Financial anti-fraud detection method based on bidirectional long short-term memory neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||