CN112866319A

CN112866319A - Log data processing method, system and storage medium

Info

Publication number: CN112866319A
Application number: CN201911191616.0A
Authority: CN
Inventors: 戴婧; 文勇; 赵甜; 黄强; 李正新; 张文英; 曾媛媛
Original assignee: SF Technology Co Ltd
Current assignee: SF Technology Co Ltd
Priority date: 2019-11-28
Filing date: 2019-11-28
Publication date: 2021-05-28
Anticipated expiration: 2039-11-28
Also published as: CN112866319B

Abstract

The application relates to a log data processing method, system, and storage medium. The method comprises the following steps: more than one service server, corresponding to different service types, each acquire their corresponding log data; each service server generates a corresponding log file from the log data according to a preset format; each service server sends its log file to a split processing cluster; the split processing cluster splits the log data according to the generation time of each piece of log data in the log file to obtain a first split log file and a second split log file, where the first split log file comprises full log data and the second split log file comprises real-time log data; and the split processing cluster stores the first split log file and the second split log file in their corresponding storage devices. The method can improve log data processing efficiency.

Description

Log data processing method, system and storage medium

Technical Field

The present application relates to the field of computer technologies, and in particular, to a method, a system, and a storage medium for processing log data.

Background

With the development of computer technology, the service functions that companies provide are becoming more complex and the number of system servers keeps growing, so effective processing of log data plays a crucial role in the unified management and maintenance of each module of each system. In the conventional art, log data for different systems and different service functions is stored on different servers. When log data needs to be checked, all the servers are traversed in sequence to query the relevant log data.

However, this approach makes operations such as querying log data and quickly locating problems cumbersome, so log data processing efficiency is low.

Disclosure of Invention

In view of the foregoing, it is desirable to provide a log data processing method, system and storage medium capable of improving log data processing efficiency.

A method of log data processing, the method comprising:

acquiring, by more than one service server corresponding to different service types, the log data corresponding to each service server;

generating, by each service server, a corresponding log file from the log data according to a preset format;

sending, by each service server, its corresponding log file to a split processing cluster;

splitting, by the split processing cluster, the log data according to the generation time of each piece of log data in the log file to obtain a first split log file and a second split log file, where the first split log file comprises full log data and the second split log file comprises real-time log data;

and storing, by the split processing cluster, the first split log file and the second split log file in their corresponding storage devices.

A log data processing system comprises more than one service server corresponding to different service types, a split processing cluster, and storage devices, wherein:

each service server is used for acquiring its corresponding log data;

each service server is also used for generating a corresponding log file from the log data according to a preset format;

each service server is also used for sending its corresponding log file to the split processing cluster;

the split processing cluster is used for splitting the log data according to the generation time of each piece of log data in the log file to obtain a first split log file and a second split log file, where the first split log file comprises full log data and the second split log file comprises real-time log data;

and the split processing cluster is further used for storing the first split log file and the second split log file in their corresponding storage devices.

A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of the log data processing method described above.

According to the log data processing method, system, and storage medium, more than one service server corresponding to different service types acquire their corresponding log data. Because the log data of service servers of different service types has different log formats, each service server generates a corresponding log file from its log data according to the preset format, so that the log data in the log files shares the same log format. The split processing cluster splits the log data according to the generation time of each piece of log data in the log file to obtain a first split log file and a second split log file, and stores the two files separately, so that log queries are more convenient, problems are located more quickly, and log data processing efficiency is improved.

Drawings

FIG. 1 is a diagram illustrating an application environment of a log data processing method in one embodiment;

FIG. 2 is a flowchart illustrating a method for processing log data according to an embodiment;

FIG. 3 is a schematic diagram illustrating a log data acquisition process of a user terminal according to an embodiment;

FIG. 4 is a flow diagram illustrating the split processing of log data in one embodiment;

FIG. 5 is a flow chart illustrating a system-associated method of processing log data in one embodiment;

FIG. 6 is a diagram illustrating an internal structure of a computer device according to an embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.

The log data processing method provided by the application can be applied to the application environment shown in FIG. 1. The application environment includes more than one service server 102 corresponding to different service types, a split processing cluster 104, and storage devices 106. Each service server 102 communicates with the split processing cluster 104 through a network, and the split processing cluster 104 communicates with each storage device 106 through a network. Each service server 102 may be implemented as an independent server or as a server cluster composed of a plurality of servers. Each storage device 106 may be a terminal or a server. The terminal may be a desktop terminal or a mobile terminal, and the mobile terminal may be at least one of a mobile phone, a tablet computer, a notebook computer, and the like.

Each service server 102 acquires its corresponding log data; each service server 102 generates a corresponding log file from the log data according to a preset format; each service server 102 sends its log file to the split processing cluster 104; the split processing cluster 104 splits the log data according to the generation time of each piece of log data in the log file to obtain a first split log file and a second split log file; and the split processing cluster 104 stores the first split log file and the second split log file in the corresponding storage devices 106. Those skilled in the art will understand that the application environment shown in FIG. 1 is only part of the scenario related to the present application and does not limit the application environment of the present application.

In an embodiment, as shown in FIG. 2, a log data processing method is provided. The method is described as applied to the service servers 102, the split processing cluster 104, and the storage devices 106 in FIG. 1, and includes the following steps:

S202, more than one service server corresponding to different service types acquire their corresponding log data.

A service server is a server that processes the corresponding service logic, and the service servers of different service types may be a general-purpose server and a micro server. The general-purpose server is a server that processes the service logic of the user terminal and of the general-purpose server itself. The micro server is a server that processes the service logic of a microservice. Log data records hardware, software, and system problems, and can also record events occurring in the system, so that it can be used to find the cause of an error or to find traces left by an attacker after an attack.

Specifically, more than one service server corresponding to different service types is deployed in the log data processing system, and each service server acquires its own corresponding log data.

In an embodiment, the service servers of different service types include a general-purpose server and a micro server, and step S202, that is, the step of acquiring the log data corresponding to more than one service server corresponding to different service types, specifically includes: acquiring, through the general-purpose server, the log data generated by the user terminal and the log data generated locally by the general-purpose server; and acquiring, through the micro server, the log data generated locally by the micro server.

Specifically, when the user operates the front-end page of the user terminal, the user terminal generates corresponding front-end log data. The application layer service and the management background service in the general-purpose server generate log data when providing their respective services. The log data generated by the user terminal and by the general-purpose server locally is acquired through the general-purpose server. The micro server generates log data while providing microservices, and this locally generated log data is acquired through the micro server. In this way, log data can be acquired quickly.

In one embodiment, the step of acquiring, through the general-purpose server, the log data generated by the user terminal and by the general-purpose server locally specifically includes: the general-purpose server receives log data sent by different user terminals; and the general-purpose server acquires the log data that it generates locally. The step of sending the log data by the user terminal includes: the user terminal acquires the log data it generates through a front-end page framework into which a log data interceptor has been integrated in advance; and the user terminal asynchronously transmits the log data it generates to the general-purpose server.

The log data interceptor intercepts all normal and abnormal log data and is used to acquire the log data generated when a user operates the front-end page of the user terminal. The front-end page framework is a framework that developers use to build web pages, for example Bootstrap (a web page framework), AmazeUI (Amaze User Interface, a cross-screen front-end framework), or LayUI (Lay User Interface, a front-end framework).

Specifically, a log data interceptor is integrated into the front-end page framework in advance. When the user operates the user terminal, corresponding log data is generated, and the user terminal acquires this log data through the front-end page framework with the integrated log data interceptor. The user terminal asynchronously transmits the log data generated by the user's operations to the corresponding general-purpose server, and the general-purpose server receives the log data sent by different user terminals. The general-purpose server also generates log data when providing its services, and it acquires this locally generated log data directly.

In an embodiment, as shown in the log data acquisition flow diagram of a user terminal in FIG. 3, a user accesses a page through the user terminal, and the page loads the resources corresponding to the user's operation from the internet. When the page finishes loading the resources, the user terminal acquires all normal and abnormal log data it generates through browser event listeners integrated into the front-end framework in advance. The user terminal then asynchronously transmits the log data it generates to the corresponding server.

In one embodiment, a shared exception interceptor may be added to the front-end page framework in advance, and the abnormal log data generated when the user operates the user terminal is acquired through this exception interceptor. The exception interceptor can call shared code and access the corresponding server to store the abnormal log data. In this way, all normal and abnormal log data corresponding to the different service servers can be acquired more quickly.
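The interception and asynchronous transmission described above can be sketched as follows. This is a minimal illustration in Python rather than in a browser front-end framework; the decorator name `intercepted`, the queue, the `sender` thread, and the `received` list that stands in for the general-purpose server are all assumptions made for the example and do not come from the application.

```python
import queue
import threading

log_queue = queue.Queue()   # hypothetical buffer between the page code and the sender
received = []               # stand-in for the general-purpose server's storage


def intercepted(func):
    """Record a normal or abnormal log entry for every call of the wrapped function."""
    def wrapper(*args, **kwargs):
        try:
            result = func(*args, **kwargs)
        except Exception as exc:
            # Abnormal log data: record the exception, then let it propagate.
            log_queue.put({"level": "ERROR", "event": func.__name__, "detail": repr(exc)})
            raise
        # Normal log data.
        log_queue.put({"level": "INFO", "event": func.__name__, "detail": "ok"})
        return result
    return wrapper


def sender():
    """Drain the queue in the background so logging never blocks the caller."""
    while True:
        entry = log_queue.get()
        if entry is None:          # sentinel: stop the sender
            break
        received.append(entry)     # a real system would send this over the network


@intercepted
def click_ok():
    return "done"


@intercepted
def click_broken():
    raise ValueError("bad input")


worker = threading.Thread(target=sender)
worker.start()
click_ok()
try:
    click_broken()
except ValueError:
    pass
log_queue.put(None)
worker.join()
```

Both the successful call and the failing call leave an entry in `received`, which mirrors the idea that the interceptor captures normal and abnormal log data alike.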

S204, each service server generates a corresponding log file from the log data according to a preset format.

Specifically, log data from different sources has different log configuration files, and each service server generates, according to the preset format, a log file in which the log data from the different sources has a uniform format.

In one embodiment, the log file may be configured through Log4j (Log for Java, a logging library for Java). Log4j is a logging component whose log format is configurable, and it includes three important components: the logger, the appender (output destination), and the layout (log formatter). The logger controls which logging statements are enabled or disabled and applies level restrictions to log information. The appender specifies whether the log is printed to the console or to a file. The layout controls the display format of log information.

In one embodiment, the log format may include different log fields, such as the date, time, log level, code location, log content, and error code. Each service server can customize the order of the log fields according to service requirements. The log fields are not limited herein.
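As an illustration of a preset format, the following minimal sketch uses Python's standard `logging` module rather than Log4j. The field order, the `|` separator, the `error_code` extra field, and the logger name `order-service` are assumptions chosen for the example.

```python
import io
import logging

# Hypothetical preset format: date and time, level, code location, error code, content.
PRESET_FORMAT = "%(asctime)s|%(levelname)s|%(module)s:%(lineno)d|%(error_code)s|%(message)s"


def build_logger(name, stream):
    """Return a logger that writes every record in the shared preset format."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(PRESET_FORMAT, datefmt="%Y-%m-%d %H:%M:%S"))
    logger.handlers = [handler]
    return logger


buffer = io.StringIO()
log = build_logger("order-service", buffer)
# The error code is passed as an extra attribute so the formatter can place it.
log.error("payment failed", extra={"error_code": "E1001"})
line = buffer.getvalue().strip()
fields = line.split("|")
```

Because every server would use the same format string, a downstream consumer can split each line into the same five fields regardless of which service produced it.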

S206, each service server sends its corresponding log file to the split processing cluster.

The split processing cluster is a cluster that splits log data. Specifically, each service server stores its own log file and can send that log file to the split processing cluster.

In an embodiment, step S206, that is, the step of each service server sending its log file to the split processing cluster, specifically includes: each service server monitors its log file through collection code; and when a service server detects that its log file has been updated, it sends the updated log file to the split processing cluster.

The collection code is code that monitors a log file and collects log data from it. Specifically, collection code is deployed on each service server, and the log data in a log file is updated over time. Each service server monitors its log file through the collection code. When a service server detects that its log file has been updated, it sends the updated log file to the split processing cluster.

In one embodiment, the collection code may be written in JAVA (an object-oriented programming language), deployed on each service server, and configured with the path that it monitors. All function code running on a service server records the corresponding log data and outputs it to the corresponding log file. The collection code monitors all log files and reads the log data when a monitored log file changes. The collection code records the current read position, and the next time it detects that the log file has been updated, it continues reading from the position recorded last time. In this way, each service server sends only the latest log data to the split processing cluster, which further improves log processing efficiency.
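The read-position bookkeeping described above can be sketched as follows. The application states that the collection code may be written in JAVA; this Python version is only an illustration, and the class name `LogCollector` and the polling approach are assumptions. The sketch also assumes that writers append whole lines.

```python
import os
import tempfile
from typing import List


class LogCollector:
    """Reads only the lines appended to a file since the previous poll."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.offset = 0  # byte position recorded after the last read

    def poll(self) -> List[str]:
        with open(self.path, "rb") as handle:
            handle.seek(self.offset)      # resume where the last read stopped
            chunk = handle.read()
            self.offset = handle.tell()   # remember the new position
        return chunk.decode("utf-8").splitlines()


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "service.log")
    with open(log_path, "w", encoding="utf-8") as f:
        f.write("first\nsecond\n")
    collector = LogCollector(log_path)
    first_batch = collector.poll()        # everything written so far
    with open(log_path, "a", encoding="utf-8") as f:
        f.write("third\n")
    second_batch = collector.poll()       # only the newly appended line
    third_batch = collector.poll()        # nothing new
```

Keeping the offset means each poll forwards only new log data, so nothing is sent twice.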

S208, the split processing cluster splits the log data according to the generation time of each piece of log data in the log file to obtain a first split log file and a second split log file; the first split log file includes full log data and the second split log file includes real-time log data.

The first split log file is a historical log file that records all log data corresponding to each service server; for example, log data from 6 months ago can be queried through the first split log file. The first split log file includes full log data, which is the log data obtained by screening all log data of the corresponding log file according to preset log fields. The second split log file is a real-time log file for a preset time period and records the log data of each service server within that period; for example, if the preset time period is 15 days, only log data from the most recent 15 days can be queried through the second split log file. The second split log file includes real-time log data, which is the log data screened from all log data of the log file according to the preset time period. The full log data includes the real-time log data.

Specifically, a split processing cluster is deployed in the log data processing system, and each piece of log data in the log file includes its generation time. The split processing cluster splits the log data according to the generation time of each piece of log data in the log file to obtain the first split log file and the second split log file.

In an embodiment, the split processing cluster splits the log data by setting a split time threshold against which the generation time of each piece of log data in the log file is compared, and then splitting the log data according to that threshold. For example, if the split time threshold is 15 days, the split processing cluster generates the first split log file from all log data, both within and older than 15 days, and generates the second split log file from the log data within 15 days.

In one embodiment, the split processing cluster may receive the log data sent by each service server through KAFKA (a distributed publish-subscribe messaging system) and may split the log data through FLINK (a distributed stream processing engine). As shown in FIG. 4, FLINK automatically consumes the log data in KAFKA and, after reading it, cleans and filters it through the ACSP_LOG_SOURCE, TRIM_JSON, NULL_FILTER, ACSP_LOG_BDP_FORMAT, ACSP_LOG_HIVE (SINK), and ACSP_LOG_ES (SINK) components. The ACSP_LOG_SOURCE component consumes data from the ESG_ACSP_CORE_BUS_LOG topic of KAFKA. The TRIM_JSON component formats each log line consumed from KAFKA to obtain the data content in JSON format, and converts that content into a JSON string through a JSON utility class. The NULL_FILTER component intercepts empty content produced during formatting and does not pass it downstream. The ACSP_LOG_BDP_FORMAT component splits the JSON data content, extracts a preset number of fields, for example five, and stores them in the HIVE (a data warehouse analysis system) database. The ACSP_LOG_ES component stores the log data in ES (a distributed search engine), a third-party search platform that supports fast searches.
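The cleaning stages can be pictured as a chain of small transformations. The following sketch uses plain Python generators and does not use the KAFKA or FLINK APIs; the function names loosely mirror the TRIM_JSON, NULL_FILTER, and ACSP_LOG_BDP_FORMAT components, and the three kept field names are hypothetical.

```python
import json

KEPT_FIELDS = ("time", "level", "message")  # hypothetical preset fields


def trim_json(lines):
    """Parse each raw line; yield a dict, or None when no usable JSON object results."""
    for line in lines:
        try:
            parsed = json.loads(line.strip())
        except json.JSONDecodeError:
            parsed = None
        yield parsed if isinstance(parsed, dict) and parsed else None


def null_filter(records):
    """Drop empty results so they are not passed downstream."""
    for record in records:
        if record is not None:
            yield record


def keep_fields(records, fields=KEPT_FIELDS):
    """Keep only the preset fields of each record."""
    for record in records:
        yield {name: record.get(name) for name in fields}


raw_lines = [
    '  {"time": "2019-11-28T10:00:00", "level": "INFO", "message": "ok", "host": "a"}  ',
    "",
    "not json",
    "{}",
]
cleaned = list(keep_fields(null_filter(trim_json(raw_lines))))
```

Only the first line survives the chain, and its extra `host` field is dropped before the record would be handed to a storage sink.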

S210, the split processing cluster stores the first split log file and the second split log file in their corresponding storage devices.

Specifically, the split processing cluster obtains the addresses of the storage devices corresponding to the first split log file and the second split log file. According to these addresses, the split processing cluster stores the first split log file and the second split log file in the corresponding storage devices.

In one embodiment, the storage devices may be HIVE and an ES retrieval engine. Specifically, the split processing cluster may store the first split log file in HIVE physical storage, which facilitates subsequent statistics over the full log data. For example, a page may be built in the management background service through which log data from any time can be queried, such as log data from 3 months ago. The split processing cluster may store the second split log file in the ES retrieval engine and may set the period for which the second split log file is kept there. For example, the second split log file is kept in the ES retrieval engine for only 15 days, and log data older than 15 days is automatically deleted. A page can be built in the management background service to query the real-time log data of the most recent 15 days through the ES retrieval engine.
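The 15-day retention of the real-time store can be illustrated with a small in-memory stand-in. This sketch does not use the ES API; the class `RealTimeStore` and its `purge` method are hypothetical names, and a real deployment would rely on the storage engine's own expiry mechanism.

```python
from datetime import datetime, timedelta


class RealTimeStore:
    """In-memory stand-in for a store that keeps only recent log data."""

    def __init__(self, retention=timedelta(days=15)):
        self.retention = retention
        self.entries = []  # list of (generated_at, message) tuples

    def add(self, generated_at, message):
        self.entries.append((generated_at, message))

    def purge(self, now):
        """Delete entries older than the retention period; return how many were removed."""
        cutoff = now - self.retention
        before = len(self.entries)
        self.entries = [(t, m) for t, m in self.entries if t >= cutoff]
        return before - len(self.entries)


now = datetime(2019, 11, 28)
store = RealTimeStore()
store.add(now - timedelta(days=1), "recent")
store.add(now - timedelta(days=20), "stale")
removed = store.purge(now)
```

After the purge only the entry from one day ago remains, while the full store would still hold both entries.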

In one embodiment, as shown in FIG. 5, a customer accesses the user terminal in the extranet application through the internet, the user terminal displays a front-end page, and the customer's operations on the front-end page generate log data. The user terminal asynchronously transmits the log data to a JETTY (lightweight container) application layer, while normal service calls also take place between the user terminal and the application layer. The application layer communicates with the DUBBOX (open source microservice architecture) microservices through a PUBLIC (public gateway), communicates with the management background service through a domain name address, and asynchronously transmits the log data generated by the user terminal and by the application layer to the management background service. The management background service stores its own log data, together with the received log data of the user terminal and the application layer, in a Log4j log file on its server. The logs generated by a microservice are recorded in that microservice's own Log4j log file. A collection program is deployed on the management background service and on each microservice; it periodically monitors and collects all log files there and sends the collected log files, namely BEE (archive) files, to the KAFKA cluster. FLINK automatically consumes the log data of the log files in KAFKA and splits all of it. FLINK stores all log data in full in HIVE physical storage, which facilitates subsequent full statistics, and stores the log data of the most recent 15 days in the ES retrieval engine. A query and display page can be built in the management background to query the real-time log data in ES and the historical log data in HIVE.

In one embodiment, the front-end log data generated by the user terminal, the log data generated by the application layer, the log data generated by the management background service, and the log data generated by the microservices have different log formats. The corresponding service servers generate log files with the same log format according to the preset format and send all of these log files to the KAFKA cluster. The log data in the received log files can be sampled in KAFKA to check whether its log format meets the service requirements. When the sampled log data does not meet the service requirements, the preset format can be reset and a corresponding log file that meets the service requirements is generated again.
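A sampling check of this kind might look like the following sketch. The required field list, the `|` separator, and the function name `sample_conforms` are assumptions for illustration; the application does not specify how the sample is drawn or validated.

```python
import random

# Hypothetical preset fields, in order.
REQUIRED_FIELDS = ("date", "time", "level", "location", "content")


def sample_conforms(lines, sample_size, separator="|", seed=0):
    """Check that a random sample of lines has exactly the expected number of fields."""
    rng = random.Random(seed)
    sample = rng.sample(lines, min(sample_size, len(lines)))
    return all(len(line.split(separator)) == len(REQUIRED_FIELDS) for line in sample)


good_lines = [
    "2019-11-28|10:00:00|INFO|OrderService:42|order created",
    "2019-11-28|10:00:01|WARN|OrderService:57|slow response",
]
all_good = sample_conforms(good_lines, sample_size=2)
# Sampling all three lines is guaranteed to include the malformed one.
with_bad = sample_conforms(good_lines + ["malformed line"], sample_size=3)
```

A failed check would be the signal to reset the preset format and regenerate the log files.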

In the above log data processing method, more than one service server corresponding to different service types acquire their corresponding log data. Because the log data of service servers of different service types has different log formats, each service server generates a corresponding log file from its log data according to the preset format, so that the log data in the log files shares the same log format. The split processing cluster splits the log data according to the generation time of each piece of log data in the log file to obtain a first split log file and a second split log file, and stores the two files separately, so that log queries are more convenient, problems are located more quickly, and log data processing efficiency is improved.

In an embodiment, step S204, that is, the step of each service server generating a corresponding log file from the log data according to the preset format, specifically includes: each service server determines the same preset format according to preset service requirements; and each service server generates a corresponding log file from the log data according to the preset fields in that preset format and the order of the preset fields.

Specifically, the log data acquired by the service servers has different log formats, and each service server determines the same preset format according to preset service requirements. The preset format may include preset fields and the order of the preset fields. Each service server generates a corresponding log file from the log data according to the preset fields in the preset format and the order of those fields.

In one embodiment, the preset fields include the log level. Log4j defines six log levels for the log information to be output, which are, in ascending order, TRACE, DEBUG, INFO, WARN, ERROR, and FATAL. When logs are output, only log information at or above the level specified in the configuration is actually output. TRACE is a very low log level; it traces function calls, and does not include variable parameters but only indicates the calling relationship between functions. DEBUG is a fine-grained level mainly used to print running information during development. INFO is a coarse-grained level used to highlight the running process of the application and to print information that is of interest or importance to the user. WARN indicates a potential error situation: the information is not an error, but it gives the developer a hint. ERROR indicates that an error event has occurred that does not prevent the system from continuing to run, and is used to print error and exception information. FATAL is a high log level and indicates a severe error event that will cause the application to exit.

In the above embodiment, a unified log format is set through the preset fields, so that log data can be managed more conveniently according to service requirements and queried uniformly on a single page.
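The level threshold can be expressed as a simple rank comparison. The sketch below is a Python illustration of the rule, not Log4j code; the helper name `is_enabled` is an assumption.

```python
# Log levels in ascending order of severity.
LEVELS = ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL"]
RANK = {name: index for index, name in enumerate(LEVELS)}


def is_enabled(record_level, configured_level):
    """Return True when a record at record_level passes the configured threshold."""
    return RANK[record_level] >= RANK[configured_level]


records = [("DEBUG", "cache miss"), ("INFO", "started"), ("ERROR", "db timeout")]
# With the threshold set to INFO, the DEBUG record is suppressed.
emitted = [message for level, message in records if is_enabled(level, "INFO")]
```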

In an embodiment, step S208, that is, the step of the split processing cluster splitting the log data according to the generation time of each piece of log data in the log file to obtain the first split log file and the second split log file, includes: the split processing cluster determines the generation time of each piece of log data in the log file according to the time field in the preset format; the split processing cluster sorts the generation times of the log data in the log file in descending order to obtain a time sorting result; the split processing cluster screens, from all log data of the log file, first split log data that meets a first preset condition according to the time sorting result and the preset fields to be stored, and generates the first split log file from the first split log data; and the split processing cluster screens, from all log data of the log file, second split log data that meets a second preset condition according to the time sorting result, and generates the second split log file from the second split log data.

Specifically, the preset format includes a time field, and each piece of log data has a corresponding generation time. The split processing cluster determines the generation time of each piece of log data in the log file according to the time field in the preset format and sorts the generation times in descending order to obtain a time sorting result. The split processing cluster determines, according to the service requirements, the preset fields to be stored and the first and second preset conditions for the log data. It then screens first split log data that meets the first preset condition from all log data of the log file according to the time sorting result and the preset fields to be stored, and generates the first split log file from it. It likewise screens second split log data that meets the second preset condition from all log data of the log file according to the time sorting result, and generates the second split log file from it.

In an embodiment, the first preset condition may be that the log data corresponding to 5 fields is screened from all log data in the time sorting result for full storage; for example, the 5 fields may be timeMillis, thread (code position), level (log level), loggerName (log name), and message (log content). The second preset condition may be that the log data of the most recent 15 days is screened from all log data in the time sorting result for real-time storage.

In the above embodiment, the log data is split according to the first preset condition and the second preset condition, so that log data can be managed by category according to service requirements, which further improves log data processing efficiency.
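The sorting and the two screening conditions can be sketched together as follows. The field names are assumed for the example, the generation time is assumed to be stored as epoch milliseconds under `timeMillis`, and the function name `split_sorted` is hypothetical.

```python
FIELDS = ("timeMillis", "thread", "level", "loggerName", "message")  # assumed names
DAY_MS = 24 * 60 * 60 * 1000


def split_sorted(records, now_ms, days=15):
    """Return (full, realtime): all records projected to FIELDS, and recent records."""
    # Sort by generation time, newest first.
    ordered = sorted(records, key=lambda r: r["timeMillis"], reverse=True)
    # First condition: keep only the preset fields, for every record.
    full = [{name: r.get(name) for name in FIELDS} for r in ordered]
    # Second condition: keep records generated within the last `days` days.
    cutoff = now_ms - days * DAY_MS
    realtime = [r for r in ordered if r["timeMillis"] >= cutoff]
    return full, realtime


now_ms = 100 * DAY_MS
records = [
    {"timeMillis": 50 * DAY_MS, "thread": "t1", "level": "INFO",
     "loggerName": "a", "message": "old", "host": "h1"},
    {"timeMillis": 99 * DAY_MS, "thread": "t2", "level": "WARN",
     "loggerName": "b", "message": "newest", "host": "h2"},
    {"timeMillis": 90 * DAY_MS, "thread": "t3", "level": "ERROR",
     "loggerName": "c", "message": "recent", "host": "h3"},
]
full, realtime = split_sorted(records, now_ms)
```

The full result keeps all three records without the extra `host` field, while the real-time result keeps only the two records from the last 15 days.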

In an embodiment, the storage device includes a full storage device and a real-time storage device, and step S210, in which the shunting processing cluster stores the first shunting log file and the second shunting log file to the corresponding storage devices respectively, includes: the shunting processing cluster acquires the address of the full storage device and the address of the real-time storage device respectively; the shunting processing cluster stores the first shunting log file into a physical memory of the full storage device according to the address of the full storage device; and the shunting processing cluster stores the second shunting log file to the real-time storage device according to the address of the real-time storage device.

The full storage device is a device for storing the first shunting log file in full, and the real-time storage device is a device for storing the second shunting log file in real time. Specifically, a full storage device and a real-time storage device are deployed in the log data processing system, and the shunting processing cluster can obtain the address of the full storage device and the address of the real-time storage device respectively. The shunting processing cluster can find the full storage device according to its address and then store the first shunting log file into a physical memory of the full storage device. The shunting processing cluster can find the real-time storage device according to its address and store the second shunting log file to the real-time storage device.
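The address-based storage step can be sketched as below. The sketch makes several assumptions that are not stated in the application: each "address" is modelled as a local directory path, records are written as JSON lines, and the function name store_split_files and the role keys "full" and "realtime" are hypothetical.

```python
import json
from pathlib import Path


def store_split_files(first_split, second_split, addresses):
    """Write each shunting log file to the device found at its address."""
    targets = {
        "full": first_split,       # first shunting log file -> full storage device
        "realtime": second_split,  # second shunting log file -> real-time storage device
    }
    written = {}
    for role, records in targets.items():
        device_dir = Path(addresses[role])  # locate the device by its address
        device_dir.mkdir(parents=True, exist_ok=True)
        out_path = device_dir / f"{role}.log"
        # Append so that repeated calls update the stored file incrementally.
        with out_path.open("a", encoding="utf-8") as fh:
            for record in records:
                fh.write(json.dumps(record, default=str) + "\n")
        written[role] = out_path
    return written
```

In a real deployment the two addresses would more plausibly point at separate storage services; the directory stand-in only shows that the cluster needs nothing beyond the two addresses to route each file.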

In one embodiment, the developer may add a tracking identifier for the user terminal's request to the function code corresponding to the user terminal, so that the log data generated by the user terminal can be associated with the log data generated by each service server according to the tracking identifier. A log data query page is generated in the management background service; through links on the query page, the full log data in the full storage device and the real-time log data in the real-time storage device can be queried.
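The association by tracking identifier can be illustrated with a small sketch. It assumes that every record is a dict carrying the identifier under a key called trace_id; that key name and the function name group_by_trace_id are hypothetical and do not appear in the application.

```python
from collections import defaultdict


def group_by_trace_id(terminal_logs, server_logs, key="trace_id"):
    """Group terminal-side and server-side records that share a tracking identifier."""
    grouped = defaultdict(lambda: {"terminal": [], "server": []})
    for record in terminal_logs:
        grouped[record[key]]["terminal"].append(record)
    for record in server_logs:
        grouped[record[key]]["server"].append(record)
    return dict(grouped)
```

With such a grouping, a query page could show the terminal-side and server-side entries of one request side by side.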

In the above embodiment, the shunting processing cluster can quickly locate the full storage device and the real-time storage device according to their addresses, and can then update and store the first shunting log file and the second shunting log file in the corresponding storage devices in real time, so that the storage efficiency of log data is improved.

In a specific embodiment, the log data processing method comprises the following steps:

The general-purpose server receives log data sent by different user terminals.

The user terminal acquires the log data generated by the user terminal through a front-end page framework into which a log data interceptor has been integrated in advance.

The user terminal asynchronously transmits the log data generated by the user terminal to the general-purpose server.

The general-purpose server acquires log data generated locally by the general-purpose server.

The micro server acquires log data generated locally by the micro server.

Each service server determines the same preset format according to the preset service requirement.

Each service server generates a corresponding log file from the log data according to the preset fields in the same preset format and the order of the preset fields.

Each service server monitors its corresponding log file through collection code.

When a service server detects that its corresponding log file has been updated, the service server sends the updated log file to the shunting processing cluster.

The shunting processing cluster determines the generation time of each piece of log data in the log file according to the time field in the preset format.

The shunting processing cluster sorts the log data in the log file in descending order of generation time to obtain a corresponding time sorting result.

The shunting processing cluster screens out, from all log data of the log file, first shunting log data meeting a first preset condition according to the time sorting result and the preset fields to be stored, and generates a first shunting log file from the first shunting log data.

The shunting processing cluster screens out, from all log data of the log file, second shunting log data meeting a second preset condition according to the time sorting result, and generates a second shunting log file from the second shunting log data.

The shunting processing cluster acquires the address of the full storage device and the address of the real-time storage device respectively.

The shunting processing cluster stores the first shunting log file into a physical memory of the full storage device according to the address of the full storage device.

The shunting processing cluster stores the second shunting log file to the real-time storage device according to the address of the real-time storage device.

According to the log data processing method, log data is obtained from more than one service server corresponding to different service types. Because the log data of service servers of different service types has different log formats, each service server generates its log data into a corresponding log file according to the preset format, so that the log data in the log files has the same log format. The shunting processing cluster can shunt the log data according to the generation time of each piece of log data in the log file to obtain a first shunting log file and a second shunting log file, which are stored separately, so that log query operations are more convenient, problems are located more quickly, and log data processing efficiency is improved.

It should be understood that although the steps in the above specific embodiments are shown in order, the steps are not necessarily performed in that order. Unless explicitly stated otherwise, the steps are not strictly limited to the order shown and described, and may be performed in other orders. Moreover, at least some of the steps in the above specific embodiments may include multiple sub-steps or multiple stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

In one embodiment, referring to fig. 1, there is provided a log data processing system comprising: more than one service server 102 corresponding to different service types, a shunting processing cluster 104, and a storage device 106, wherein:

each service server 102 is configured to obtain corresponding log data.

Each service server 102 is further configured to generate a corresponding log file from the log data according to a preset format.

Each service server 102 is further configured to send the corresponding log file to the shunting processing cluster 104.

The shunting processing cluster 104 is configured to shunt the log data according to the generation time of each piece of log data in the log file to obtain a first shunting log file and a second shunting log file; the first shunting log file includes full log data and the second shunting log file includes real-time log data.

The shunting processing cluster 104 is further configured to store the first shunting log file and the second shunting log file to the corresponding storage devices 106, respectively.

In one embodiment, each service server 102 is further configured to obtain log data generated locally by the user terminal and the general-purpose server, and to obtain log data generated locally by the micro server.

In one embodiment, each service server 102 is further configured to receive log data sent by different user terminals, and to acquire log data generated locally by the general-purpose server. The step of sending the log data by the user terminal comprises: the user terminal acquires the log data generated by the user terminal through a front-end page framework into which a log data interceptor has been integrated in advance; and the user terminal asynchronously transmits the log data generated by the user terminal to the general-purpose server.
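The idea of intercepting log data and transmitting it asynchronously can be illustrated as follows. The application describes a front-end page framework on the user terminal; the sketch below swaps in a plain Python queue and worker thread purely to show the non-blocking hand-off. The class name AsyncLogSender and the transport callable are hypothetical.

```python
import queue
import threading


class AsyncLogSender:
    """Buffer intercepted log records and deliver them on a background thread."""

    def __init__(self, transport):
        self._transport = transport  # hypothetical callable that delivers one record
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def intercept(self, record):
        """Called where the log is produced; returns without waiting for delivery."""
        self._queue.put(record)

    def _drain(self):
        while True:
            record = self._queue.get()
            if record is None:  # sentinel: stop the worker
                break
            self._transport(record)

    def close(self):
        """Flush the remaining records and stop the worker thread."""
        self._queue.put(None)
        self._worker.join()
```

Because the caller only enqueues, a slow or unavailable server does not delay the page logic that produced the log entry.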

In an embodiment, each service server 102 is further configured to determine the same preset format according to a preset service requirement, and to generate a corresponding log file from the log data according to the preset fields in the same preset format and the order of the preset fields.
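Generating log lines with the same preset fields in the same order might look like the sketch below. The field list reuses the five example field names given earlier in the description purely for illustration; JSON as the line format, the per-server field_map, and the function name to_preset_line are assumptions.

```python
import json

# Assumed preset fields and their order; names reused from the description's example.
PRESET_FIELDS = ("timekill", "thread", "level", "logernname", "message")


def to_preset_line(raw, field_map):
    """Render one raw record as a JSON line with the preset fields in preset order.

    field_map maps a preset field name to the name this server uses for it;
    fields that are not listed are looked up under the preset name itself.
    """
    normalized = {field: raw.get(field_map.get(field, field)) for field in PRESET_FIELDS}
    return json.dumps(normalized, ensure_ascii=False)
```

Each service server would supply its own field_map, so that servers with different native log layouts still emit lines that the shunting processing cluster can parse uniformly.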

In one embodiment, each service server 102 is further configured to monitor its corresponding log file through collection code; when a service server 102 detects that its corresponding log file has been updated, the service server 102 sends the updated log file to the shunting processing cluster 104.
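A polling-based sketch of the monitoring step is given below. It is one possible reading, not the application's collection code: it assumes the log file is line-oriented UTF-8 text, it forwards only the lines appended since the previous poll rather than the whole file, and the class name LogFileWatcher and the send callable are hypothetical.

```python
import os


class LogFileWatcher:
    """Detect appended log lines by polling the file size."""

    def __init__(self, path, send):
        self._path = path
        self._send = send  # hypothetical callable that forwards lines to the cluster
        self._offset = 0

    def poll(self):
        """Forward complete lines appended since the last poll; return their count."""
        if not os.path.exists(self._path):
            return 0
        size = os.path.getsize(self._path)
        if size < self._offset:  # file was truncated or rotated: start over
            self._offset = 0
        if size == self._offset:
            return 0
        with open(self._path, "rb") as fh:
            fh.seek(self._offset)
            chunk = fh.read()
        end = chunk.rfind(b"\n")
        if end == -1:  # only a partial line so far; wait for the rest
            return 0
        complete = chunk[: end + 1]
        self._offset += len(complete)
        lines = complete.decode("utf-8").splitlines()
        self._send(lines)
        return len(lines)
```

Calling poll() on a timer gives the "send when updated" behaviour; tracking a byte offset avoids resending log data the cluster has already received.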

In an embodiment, the shunting processing cluster 104 is further configured to: determine, according to the time field in the preset format, the generation time of each piece of log data in the log file; sort the log data in the log file in descending order of generation time to obtain a corresponding time sorting result; screen out, from all log data of the log file, first shunting log data meeting a first preset condition according to the time sorting result and the preset fields to be stored, and generate a first shunting log file from the first shunting log data; and screen out, from all log data of the log file, second shunting log data meeting a second preset condition according to the time sorting result, and generate a second shunting log file from the second shunting log data.

In an embodiment, the shunting processing cluster 104 is further configured to: obtain the address of the full storage device and the address of the real-time storage device respectively; store the first shunting log file into a physical memory of the full storage device according to the address of the full storage device; and store the second shunting log file to the real-time storage device according to the address of the real-time storage device.

The log data processing system obtains log data from more than one service server corresponding to different service types. Because the log data of service servers of different service types has different log formats, each service server generates its log data into a corresponding log file according to the preset format, so that the log data in the log files has the same log format. The shunting processing cluster can shunt the log data according to the generation time of each piece of log data in the log file to obtain a first shunting log file and a second shunting log file, which are stored separately, so that log query operations are more convenient, problems are located more quickly, and log data processing efficiency is improved.

In one embodiment, a computer device is provided, which may be one of the service servers 102 in fig. 1 described above, and whose internal structure may be as shown in fig. 6. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device is used for storing data related to log data processing. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by the processor to implement a log data processing method.

Those skilled in the art will appreciate that the architecture shown in fig. 6 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as a particular computing device may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.

In one embodiment, a computer device is provided, comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of the log data processing method described above. Here, the steps of the log data processing method may be steps in the log data processing methods of the respective embodiments described above.

In one embodiment, a computer-readable storage medium is provided, in which a computer program is stored, which, when executed by a processor, causes the processor to perform the steps of the above-described log data processing method. Here, the steps of the log data processing method may be steps in the log data processing methods of the respective embodiments described above.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), Direct Rambus Dynamic RAM (DRDRAM), and Rambus Dynamic RAM (RDRAM).

The technical features of the above embodiments can be arbitrarily combined. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features contains no contradiction, it should be considered to fall within the scope of the present specification.

The above-mentioned embodiments express only several embodiments of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these all fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims

1. A method of processing log data, the method comprising:

2. The method according to claim 1, wherein the service servers of different service types include a general-purpose server and a micro server, and the obtaining of the respective log data by more than one service server corresponding to different service types includes:

acquiring, through the general-purpose server, log data generated locally by a user terminal and the general-purpose server;

and acquiring, through the micro server, the log data generated locally by the micro server.

3. The method according to claim 2, wherein the acquiring, through the general-purpose server, of log data generated locally by the user terminal and the general-purpose server comprises:

the general-purpose server receives log data sent by different user terminals;

the general-purpose server acquires log data generated locally by the general-purpose server;

the step of sending the log data by the user terminal comprises:

the user terminal acquires log data generated by the user terminal through a front-end page framework into which a log data interceptor has been integrated in advance;

and the user terminal asynchronously transmits the log data generated by the user terminal to the general-purpose server.

4. The method according to claim 1, wherein the generating, by each of the service servers, the log data into a corresponding log file according to a preset format includes:

each service server determines the same preset format according to preset service requirements;

5. The method according to claim 1, wherein the sending, by each of the service servers, the log file corresponding to each of the service servers to the shunting processing cluster respectively includes:

each service server monitors its corresponding log file through collection code;

and when a service server detects that its corresponding log file has been updated, the service server sends the updated log file to the shunting processing cluster.

6. The method according to claim 1, wherein the shunting processing cluster shunting the log data according to the generation time of each piece of log data in the log file to obtain a first shunting log file and a second shunting log file includes:

the shunting processing cluster determines the generation time of each piece of log data in the log file according to the time field in the preset format;

the shunting processing cluster sorts the log data in the log file in descending order of generation time to obtain a corresponding time sorting result;

the shunting processing cluster screens out, from all log data of the log file, first shunting log data meeting a first preset condition according to the time sorting result and the preset fields to be stored, and generates a first shunting log file from the first shunting log data;

7. The method according to any one of claims 1 to 6, wherein the storage devices include a full storage device and a real-time storage device, and the shunting processing cluster storing the first shunting log file and the second shunting log file to the corresponding storage devices respectively includes:

the shunting processing cluster acquires the address of the full storage device and the address of the real-time storage device respectively;

the shunting processing cluster stores the first shunting log file into a physical memory of the full storage device according to the address of the full storage device;

and the shunting processing cluster stores the second shunting log file to the real-time storage device according to the address of the real-time storage device.

8. A log data processing system, characterized in that the system comprises more than one service server corresponding to different service types, a shunting processing cluster, and a storage device,

each service server is used for acquiring the corresponding log data;

9. The system according to claim 8, wherein each service server is further configured to obtain log data generated locally by the user terminal and the general-purpose server, and to obtain log data generated locally by the micro server.

10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.