US20100241893A1 - Interpretation and execution of a customizable database request using an extensible computer process and an available computing environment - Google Patents
Interpretation and execution of a customizable database request using an extensible computer process and an available computing environment
- Publication number
- US20100241893A1 (application US12/406,875)
- Authority
- US
- United States
- Prior art keywords
- input data
- interpretation
- computing environment
- output
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2453—Query optimisation
- G06F16/24532—Query optimisation of parallel queries
Definitions
- This disclosure relates generally to interpretation and execution of a customizable database request using an extensible computer process and an available computing environment.
- a database analyst may seek to request information from a database but may be prevented from doing so by a lack of an ability to customize a database query.
- the database analyst may also be unable to distribute the processing of the query across a distributed computational environment, which may include one or more servers.
- the database analyst may be restricted to a limited set of queries that may limit the effectiveness of the analyst's ability to obtain information from the database. The analyst may therefore seek data inefficiently using an excessive number of queries.
- the data analyst may also be required to transfer the processed information of the database to a separate process to analyze the data.
- the database analyst may therefore be required to spend an excessive amount of time obtaining information, which may lead to a delay, an additional cost of the analyst's time, an additional time for a processor usage, and a greater possibility of incurring a human made error.
- the database analyst may ultimately fail to find a desired information.
- a method includes generating an interpretation of a customizable database request which includes an extensible computer process and providing an input guidance to available processors of an available computing environment. The method further includes automatically distributing an execution of the interpretation across the available computing environment operating concurrently and in parallel, wherein a component of the execution is limited to at least a part of an input data. The method also includes automatically assembling a response using a distributed output of the execution.
- the input guidance may be provided to each of the available processors and may be comprised of certain portions of the input data.
- the input guidance may be used to determine which of the available processors are to perform functions related to the at least the part of the input data.
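The role of the input guidance can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the guidance is a mapping from a processor index to the rows that processor should handle, and that rows are assigned by a stable hash of a key column. The helper name `build_input_guidance` and the sample rows are hypothetical and do not come from the disclosure.

```python
from collections import defaultdict
from zlib import crc32


def build_input_guidance(rows, key, num_processors):
    """Map each processor index to the rows it should handle (hypothetical helper)."""
    guidance = defaultdict(list)
    for row in rows:
        # A stable hash keeps all rows sharing a key on the same processor.
        processor = crc32(str(row[key]).encode("utf-8")) % num_processors
        guidance[processor].append(row)
    return dict(guidance)


# Illustrative input data only.
rows = [
    {"userid": "u1", "ts": 10},
    {"userid": "u2", "ts": 12},
    {"userid": "u1", "ts": 15},
]
guidance = build_input_guidance(rows, key="userid", num_processors=4)
```

Hashing on a key is only one possible policy; the disclosure does not prescribe how the portions of the input data are chosen.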
- the method may further include providing an information to the extensible computer process about its context in the customizable database request, and processing an interpretation of the customizable database request based on the information provided.
- the extensible computer process may be a developer-provided computer program, and the information provided may include at least one of a format of the input data and an output data, whether the input data and the output data are ordered and in which form, grouping information, statistics of the input data and the output data, a distribution information, a length of the input data and the output data, and a custom parameter.
- the custom parameter may be at least one of a number, a string, a list of numbers or strings, a content of a file in the available computing environment, and a result of the customizable database request.
- the method may further include post processing an output of each of the available processors when automatically assembling the response.
- the post processing may include at least one database operation including at least one of an aggregation operation, a sorting operation, and an invocation of another extensible computer process.
- the method may further include pre-processing an input of each of the available processors when providing the input guidance to the available processors.
- the available computing environment may be comprised of at least two servers.
- the customizable database request may specify the input data for the extensible computer process.
- the input data may be structured in a form comprising at least one of a database table and an output of a different database query.
- the input data may be unstructured in a form comprising a content of at least one file in a computing environment.
- the method may further include detecting a fault in the execution of the interpretation, and automatically rectifying an output effect of the fault. Rectifying the output effect of the fault may include at least one of reprocessing an operation, excluding a corrupted data, and logging the corrupted data.
- the customizable database request may be comprised of at least one of a predetermined function, a developer created function, and an analyst created function.
- a system may include a query planning module to generate an interpretation of a database request which includes an extensible computer process, and a parallelization module to provide an information to available processors of an available computing environment and to automatically distribute an execution of the interpretation across the available computing environment operating concurrently and in parallel.
- a component of the execution may be limited to at least a part of an input data.
- the system may further include a response organization module to automatically assemble a response using a distributed output of the execution.
- the information may be used to provide each of the available processors certain portions of the input data, and to determine which of the available processors are to perform functions related to the at least the part of the input data.
- the system may include a reference module to provide an extensible computer process information about its context in the database request.
- the system may include a dynamic interpretation module to process information that affects the interpretation of the database request based on the information provided, wherein the extensible computer process is a developer-provided computer program.
- the information provided may include a format of the input data and an output data, whether the input data and the output data is ordered and in which form, grouping information, statistics of the input data, a distribution information, a length of the input data and the output data, and custom parameters.
- the custom parameters may be at least one of a number, a string, a list of numbers or strings, a content of a file in the available computing environment, and a result of the database request.
- In yet another aspect, a method includes generating an interpretation of a customizable database request which includes an extensible computer process, and providing an input guidance to available processors of an available computing environment.
- the input guidance determines which of the available processors are to perform functions related to at least a part of an input data.
- the method further includes pre-processing an input of each of the available processors when providing the input guidance to the available processors, and automatically distributing an analysis phase of the interpretation across the available computing environment operating concurrently and in parallel. A component of the analysis phase is limited to at least a part of the input data.
- the method further includes automatically distributing an additional analysis phase of the interpretation across the available computing environment, and automatically assembling a response using a distributed output of the additional analysis phase.
- the method also includes post processing an output of each of the available processors when automatically assembling the response.
- the post processing includes at least one database operation including one or more of an aggregation operation, a sorting operation, and an invocation of another extensible computer process.
- the method may include providing an extensible computer process information about its context in the customizable database request, and processing information that affects the interpretation of the customizable database request based on the information provided.
- the extensible computer process is a developer-provided computer program.
- the information provided includes at least one of a format of the input data and an output data, whether the input data and the output data is ordered and in which form, grouping information, statistics of the input data, a distribution information, a length of the input data and the output data, and custom parameters.
- the custom parameters are one or more of a number, a string, a list of numbers or strings, a content of a file in the available computing environment, and a result of the customizable database request.
- FIG. 1 is a system view illustrating processing of a customizable database query using a developer extensible operation and an available computing environment, according to one embodiment.
- FIG. 2 is an exploded view of the available computing environment, according to one embodiment.
- FIG. 3 is an exploded view of a query planning module, according to one embodiment.
- FIG. 4 is an exploded view of a monitoring module, according to another embodiment.
- FIG. 5 is an illustration of processing input data to generate a query response, according to another embodiment.
- FIG. 6 is a system view of an alternate embodiment of processing of a customizable database query using a developer extensible operation and an available computing environment.
- FIG. 7 is an illustration of processing input data to generate a query response, according to an alternate embodiment.
- FIG. 8 is a diagrammatic system view of a data processing system in which any of the embodiments disclosed herein may be performed, according to one embodiment.
- FIG. 9 is a process flow of interpreting and executing a customizable database request, according to one embodiment.
- FIG. 10 is a process flow of automatically distributing an analysis phase and an additional analysis phase of the interpretation of a customizable database request across the available computing environment, according to one embodiment.
- FIG. 1 is a system view illustrating processing of a customizable database query using a developer extensible operation and an available computing environment, according to one embodiment.
- FIG. 1 illustrates an extensible computer process 100 , a query planning module 102 , an analysis phase 104 , an additional analysis phase 106 A-N, an available computing environment 112 , a monitoring module 114 , a response 116 , a user interface 118 , an analyst 120 , a developer 122 , a customizable database request 124 , and servers 126 .
- FIG. 1 illustrates an analyst 120 providing a customizable database request 124 to an extensible computer process 100.
- the analyst 120 may be a database analyst who is familiar with SQL (i.e., Structured Query Language).
- SQL may be a database computer language designed for the retrieval and management of data in relational database management systems (RDBMS), database schema creation and modification, and database object access control management.
- the analyst 120 may have limited knowledge of other programming languages, and may have a substantially limited ability to create programs, to modify software, and to manage software distributed across multiple processors.
- the analyst 120 may be tasked with searching for data rather than developing programs.
- the customizable database request 124 may consist of a SQL instruction and/or it may be written in any programming language.
- the customizable database request 124 may be customized to include a function (e.g., a nested SQL command, a mathematical equation, a variable, a standard deviation, etc.).
- the function may be created by the analyst 120 , the developer 122 , and/or it may be a predetermined function.
- the function may be customized to search multiple records at once and to retrieve and/or manipulate data in multiple forms (e.g., tables, images, unstructured data 584, text files, programs, sound files, photos, etc.).
- the function may access data in one form and generate data in another form.
- the customizable database request 124 may further specify an input data 510 for the extensible computer process 100 .
- the customizable database request 124 may allow the process to be scaled in accordance with a changing system hardware and/or performance of a system.
- the function may allow procedural code implemented by a user (e.g., an analyst 120, a developer 122, etc.) to be uploaded to a database and executed at each node of a system.
- the customizable database request 124 may take in input using a set of rows in a table (e.g., a persistent table in a database, the output of a SQL SELECT statement and/or the output of another function, etc.).
- the customizable database request 124 may result in an output that includes a relation of a set of rows (e.g., an output unrelated to the input).
- the customizable database request 124 and/or a function of the customizable database request 124 may be placed into a SQL SELECT query and/or any other query as though it were itself a table.
- This integration with SQL may allow for composing SQL and procedural code invocations in any form and shape.
- the code may be written in Java, Python, and/or any other language.
- the customizable database request 124 may include a function that is written in Java that is then invoked as part of a SQL query statement.
- the function may convert sets of rows to sets of rows.
- the function may be parallelized to operate on rows across multiple nodes simultaneously.
- the function may be invoked on arbitrary sets of rows and/or rows grouped together by a PARTITION BY clause. Within a partition, rows may be further sorted using an ORDER BY clause.
- a function may split strings into words.
- the function may be invoked once for every row in an input table.
- the function may include Java procedural code that takes each document and emits a row for each word.
- the function may define a column that appears in its output rows.
- a function may be created to compute the 10 most-frequently occurring words in a body of text using the function to split strings into words.
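The word-splitting example can be sketched in Python as a function from rows to rows, followed by a count of the most frequent words. This is only an illustration: the column names `text` and `word`, the whitespace tokenization, and the sample documents are assumptions, and the disclosure describes the procedural code as Java rather than Python.

```python
from collections import Counter


def split_words(rows):
    """Row function: for each input row, emit one output row per word."""
    for row in rows:
        for word in row["text"].lower().split():
            yield {"word": word}  # the function defines its own output column


def top_words(rows, n=10):
    """Count the words emitted by split_words and keep the n most frequent."""
    counts = Counter(r["word"] for r in split_words(rows))
    return counts.most_common(n)


# Illustrative documents only.
documents = [
    {"text": "the quick brown fox"},
    {"text": "the lazy dog and the fox"},
]
result = top_words(documents, n=2)
```

Because `split_words` handles each row independently, it could be invoked on different rows on different nodes, with only the counting step needing the combined output.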
- a function of the customizable database request 124 may perform sessionization by mapping each click in a clickstream to a unique session identifier.
- the function may define a session as a sequence of clicks by a particular user where no more than n seconds pass between successive clicks (e.g., if a click from a user isn't seen for n seconds, a new session is started).
- the function may use a userid and/or a timestamp attribute.
- the function may include as parameters the name of the timestamp attribute and the number of seconds between clicks that results in starting a new session.
- a clickstream table may be partitioned by userid, and partition tuples may be sequenced by timestamp.
- the sessionize function may then be invoked against each of the ordered partitions and/or emit the input tuples with an appropriate sessionid added.
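A minimal Python sketch of the sessionization described above follows. It partitions clicks by `userid`, orders each partition by the timestamp attribute, and starts a new session whenever the gap between successive clicks exceeds the timeout. The session identifier format (`userid-counter`), the parameter names, and the sample clicks are assumptions made for illustration.

```python
from itertools import groupby
from operator import itemgetter


def sessionize(clicks, timestamp_attr="ts", timeout=60):
    """Emit each input click with a sessionid added (illustrative sketch)."""
    # Equivalent in spirit to PARTITION BY userid ORDER BY timestamp.
    ordered = sorted(clicks, key=itemgetter("userid", timestamp_attr))
    output = []
    for userid, partition in groupby(ordered, key=itemgetter("userid")):
        session = 0
        previous = None
        for click in partition:
            ts = click[timestamp_attr]
            # More than `timeout` seconds since the last click starts a new session.
            if previous is not None and ts - previous > timeout:
                session += 1
            previous = ts
            output.append({**click, "sessionid": f"{userid}-{session}"})
    return output


# Illustrative clickstream only.
clicks = [
    {"userid": "u1", "ts": 200},
    {"userid": "u2", "ts": 5},
    {"userid": "u1", "ts": 0},
    {"userid": "u1", "ts": 30},
]
sessions = sessionize(clicks, timestamp_attr="ts", timeout=60)
```

Each partition is processed independently of the others, so different users' partitions could be sessionized on different nodes.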
- the customizable database request 124 may be received by an extensible computer process 100 , which may be designed to take into consideration future growth by allowing the addition and/or modification of functionality.
- the addition of new functionality and/or the modification of existing functionality may be accomplished with limited impact to existing system functions.
- a developer 122 may be familiar with a type of programming involving database analysis, query modification, and/or data searches.
- the developer 122 may possess limited knowledge regarding programs to distribute an analysis across multiple computing systems.
- the developer 122 may support and/or design software for the analyst 120 .
- the developer 122 may adapt the extensible computer process 100 to add new functions, modify existing functions, and/or add additional language ability to the software.
- the extensible computer process 100 may communicate with a query planning module 102 to generate a query interpretation of the customizable database request 124 .
- the query interpretation may be formatted to be distributable (e.g., separated into individual tasks for separate processes, etc.).
- the query interpretation may convert the customizable database request 124 from any computer language (e.g., a machine-readable artificial language designed to express computations that can be performed by a machine, C++, SQL, Perl, Java, Prolog, etc.) into a preferred programming language.
- the query interpretation may automatically format the customizable database query to be processed using a distributable, multiphase analysis.
- the query planning module 102 may generate an interpretation (e.g., the query interpretation) of the customizable database request, which may include an extensible computer process.
- the query planning module 102 may optimize the analysis phase and/or the additional analysis phase using a parameter (e.g., an expected output file size, an input file format, a table dimension, etc.).
- the query planning module may provide an input guidance to available processors of the available computing environment.
- the input guidance may include certain portions of the input data, and the input guidance may be used to determine which of the available processors are to perform functions related to different parts of the input data.
- the query planning module 102 may use the parameter to allocate a system resource (e.g., memory, power supply output, processor usage, a number of servers applied, a sequence of processors used, a timing of processes analyzed, etc.).
- the allocation of a system resource may include a distribution of processes across an available computing environment 112 , a selection of a type of analysis to apply, and/or a selection of input data to review.
- the execution of the interpretation may be automatically distributed across an available computing environment operating concurrently and in parallel, and a component of the execution may be limited to a part of the input data.
- the part of the input data may be a subset of the input data, which may allow the execution to be divided into separate tasks to be processed by different machines.
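The idea of dividing the execution into tasks that each see only a subset of the input can be sketched in Python as follows. Threads from the standard library stand in for the separate machines, and the sum-of-squares workload and all function names are hypothetical stand-ins rather than anything specified in the disclosure.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_parts(data, num_parts):
    """Divide the input into contiguous subsets of nearly equal size."""
    size, extra = divmod(len(data), num_parts)
    parts, start = [], 0
    for i in range(num_parts):
        end = start + size + (1 if i < extra else 0)
        parts.append(data[start:end])
        start = end
    return parts


def execute_component(part):
    """One component of the execution, limited to its own part of the input."""
    return sum(x * x for x in part)


def run_distributed(data, num_workers=3):
    parts = split_into_parts(data, num_workers)
    # Worker threads stand in for the servers of the computing environment.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_outputs = list(pool.map(execute_component, parts))
    # Assemble a single response from the distributed output.
    return sum(partial_outputs)


total = run_distributed(list(range(10)))
```

In a real deployment the parts would be shipped to separate servers; the thread pool only illustrates that each task needs no data beyond its own subset.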
- the available computing environment 112 may be comprised of servers that are and/or will be available to process data.
- the available computing environment 112 may be better illustrated in FIG. 2 .
- the query interpretation may be dynamically determined based on a context (e.g., a repeated pattern of requested information, an association between an analyst's customizable database request 124 and an input data 510 , an available input data 510 , etc.).
- the context of the customizable database request 124 may include the type of requested information, the language of the request, and/or the expected response 116 .
- the analysis phase 104 and/or the additional analysis phase 106 A-N may be adjusted to provide a response 116 that includes GPS coordinates (e.g., latitude and/or longitude, etc.).
- the query interpretation may automatically provide alternate responses based on a variation of the requested parameters, such as by expanding or contracting a search parameter to provide alternate responses, varying search parameters, and searching for peak values.
- the interpretation of the customizable database request generated by the query planning module 102 may be processed based on a contextual information provided to the extensible computer process.
- the extensible computer process may be a developer-provided computer program.
- the information provided may include a format of the input data and the output data, whether the input data and the output data are ordered and in which form, grouping information, statistics of the input data and the output data, a distribution information, a length of the input data and the output data, and a custom parameter.
- the custom parameter may be a number, a string, and/or a list of numbers or strings.
- the custom parameter may further include a content of a file in the available computing environment, and/or a result of the customizable database request (e.g., the response 116 ).
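One way to picture the contextual information passed to the extensible computer process is as a small record. The Python sketch below is an assumption-laden illustration: the class `RequestContext`, its field names, and the helper `describe_invocation` are hypothetical and are not defined by the disclosure.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class RequestContext:
    """Hypothetical container for the context information listed above."""
    input_format: List[str]                     # columns of the input data
    output_format: List[str]                    # columns of the output data
    ordered_by: List[str] = field(default_factory=list)
    grouped_by: List[str] = field(default_factory=list)
    input_length: Optional[int] = None          # row count, if known
    custom_parameters: Dict[str, Any] = field(default_factory=dict)


def describe_invocation(context: RequestContext) -> str:
    """Summarize how a process might be invoked given its context."""
    if context.grouped_by:
        mode = "once per group of " + ", ".join(context.grouped_by)
    else:
        mode = "once per row"
    if context.ordered_by:
        mode += ", rows ordered by " + ", ".join(context.ordered_by)
    return mode


context = RequestContext(
    input_format=["userid", "ts", "url"],
    output_format=["userid", "ts", "url", "sessionid"],
    ordered_by=["ts"],
    grouped_by=["userid"],
    custom_parameters={"timeout": 60},
)
summary = describe_invocation(context)
```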
- the query interpretation generated by the query planning module 102 may be communicated to an analysis phase 104 , which may be automatically distributed across an available computing environment 112 .
- the automatic distribution of the query interpretation may allow separate machines to analyze the query using portions of an input data 510 simultaneously, in parallel, in an overlapping sequence, and/or in series.
- the analysis phase 104 may include a component that is limited to a part of the input data 510 .
- the component may process a part of a “map” phase of a MapReduce analysis (e.g., a framework for computing a distributable problem).
- the component may process a part of the analysis phase 104 using its part of the input data 510 .
- the analysis phase 104 may also include an additional component that uses the output of the component to generate an additional output (e.g., the additional component operates in series with the component, the additional component uses the output of the component as one of several inputs, etc.).
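The map-style component and a subsequent component that consumes its output can be sketched in Python in the general shape of a MapReduce computation. The region and amount columns, the sample parts, and the function names are illustrative assumptions; the disclosure does not specify this workload.

```python
from collections import defaultdict


def map_component(part):
    """Map step for one part of the input: emit (key, value) pairs."""
    return [(row["region"], row["amount"]) for row in part]


def shuffle(mapped_parts):
    """Group the values emitted by all map components by key."""
    grouped = defaultdict(list)
    for pairs in mapped_parts:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_component(key, values):
    """Reduce step: combine all values for one key into a single output row."""
    return key, sum(values)


# Each inner list is the part of the input data seen by one component.
parts = [
    [{"region": "east", "amount": 5}, {"region": "west", "amount": 2}],
    [{"region": "east", "amount": 1}],
]
mapped = [map_component(p) for p in parts]
totals = dict(reduce_component(k, v) for k, v in shuffle(mapped).items())
```

Here the reduce step plays the part of the additional component: it operates in series with the map components and uses their output as its input.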
- the analysis phase 104 may process the query interpretation using the input data 510 , which may be acquired from the database 108 A-N.
- the input data 510 may include structured data and/or unstructured data 584 , as illustrated in FIG. 5 .
- the input data of the analysis phase may be generated using a combination of multiple data sources (e.g., multiple tables, storage devices, etc.).
- the portion of the input data used by a component of the analysis phase 104 may also be generated using a combination of multiple data sources.
- the analysis phase 104 may communicate with a monitoring module 114 and/or the additional analysis phase 106 A-N, which may be automatically distributed across the available computing environment (e.g., currently available servers, virtual machines, processors, etc.).
- the additional analysis phase 106 A-N may access a greater amount of information than the amount of the input data 510 used by the analysis phase 104.
- the additional analysis phase 106 A-N may operate in parallel, in series, or in any other pattern with the analysis phase 104 .
- the response 116 may be automatically assembled using a distributed output of the additional analysis phase 106 A-N.
- the output of the additional analysis phase 106 A-N may be distributed across multiple processors, servers, and/or virtual machines, and a complete resulting output may require an accumulation of all distributed parts of the additional analysis phase 106 A-N output.
- the assembled output may be the response 116 .
- the response 116 may be displayed through a user interface (e.g., a web browser, a terminal, a PC, a server, a monitor, etc.).
- the monitoring module 114 may observe the input data 510 provided to the analysis phase 104 , the available computing environment 112 , the input to the additional analysis phase 106 A-N, the processing of information by the additional analysis phase 106 A-N, and the assembled response 116 .
- the monitoring module 114 may manage the automatic distribution of the analysis phase 104 and/or the additional analysis phase 106 A-N across the available computing environment 112 .
- the monitoring module 114 may assemble the distributed output of the additional analysis phase 106 A-N to generate the response 116 .
- the monitoring module 114 may detect a fault (e.g., an exception, a hardware failure, a system crash, a processor failure, a data error, a processing error, etc.) in the analysis phase 104 and/or the additional analysis phase 106 A-N.
- the monitoring module 114 may automatically rectify an output effect (e.g., a data corruption, a propagating data error, a system failure, etc.) of the fault.
- the rectification may include one or more of reprocessing an operation (e.g., a component of the analysis phase 104 , the additional analysis phase 106 A-N, etc.), excluding a corrupted data, and/or logging a corrupted data.
- the rectification may include isolating a fault generating process and/or hardware mechanism.
- the monitoring module 114 may rectify an output effect automatically (e.g., without intervention by the developer 122 and/or analyst 120 ).
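The rectification behaviors described above (reprocessing an operation, logging, and excluding corrupted data) can be sketched in Python. The retry count, the logger name, and the sample parsing component are assumptions for illustration; the disclosure does not specify a retry policy.

```python
import logging

logger = logging.getLogger("monitoring_sketch")  # hypothetical logger name


def run_with_rectification(component, parts, max_attempts=2):
    """Run a component on each part; retry on a fault, then log and exclude."""
    outputs, excluded = [], []
    for index, part in enumerate(parts):
        for attempt in range(1, max_attempts + 1):
            try:
                outputs.append(component(part))
                break  # the part was processed successfully
            except Exception as fault:
                # Detect and log the fault, then reprocess the operation.
                logger.warning("fault in part %d (attempt %d): %s", index, attempt, fault)
        else:
            # Every attempt failed: exclude the corrupted part from the output.
            excluded.append(index)
    return outputs, excluded


def parse_total(part):
    """Sample component: fails on values that are not integers."""
    return sum(int(value) for value in part)


outputs, excluded = run_with_rectification(
    parse_total, [["1", "2"], ["3", "oops"], ["4"]]
)
```

Retrying helps with transient faults such as a failed process, while a deterministic data error like the one in this sample fails on every attempt and is therefore excluded, all without intervention by the developer or analyst.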
- FIG. 2 is an exploded view of the available computing environment 112 illustrated in FIG. 1 , according to one embodiment.
- FIG. 2 illustrates the available computing environment 112 , the servers 126 A-N, and the databases 108 A-N, according to one embodiment.
- the available computing environment 112 may include one or more servers that are currently or will be open to process information within a preferred time frame.
- the servers 126 A-N of the available computing environment 112 may be comprised of one or more separate servers, virtual machines, client devices, and/or separate processors of a single server.
- the servers 126 A-N may communicate with one or more databases (e.g., databases 108 A-N), which may be included within the available computing environment 112 .
- the servers 126 A-N and the databases 108 A-N may communicate with each other via a LAN, a WAN, a MAN, and/or any other network arrangement.
- the databases 108 A-N may include direct attached storage devices, volatile and/or non-volatile memory.
- FIG. 3 is an exploded view of the query planning module, according to one embodiment.
- FIG. 3 includes the query planning module 102 , an optimization module 330 , a SQL instruction module 332 , a dynamic interpretation module 334 , a function module 336 , a developer operation module 338 , a translation module 340 , and a reference module 342 .
- the query planning module 102 may include multiple modules to perform various functions.
- the optimization module 330 may optimize the analysis phase 104 and/or the additional analysis phase 106 A-N using a parameter included with the customizable database request.
- the parameter may include a prediction and/or expectation regarding the response 116 (e.g., an output memory requirement, a number of generated responses, a range of response outputs, a type of input data 510 , etc.).
- the SQL instruction module 332 may interpret a SQL command, a nested SQL instruction, etc.
- the dynamic interpretation module 334 may dynamically determine a query interpretation of the customizable database request 124 based on a context (e.g., a scope and/or format of the customizable database request 124 , an aspect of the input data 510 , the available computing environment, etc.). The analysis may be dynamically altered in accordance with the query interpretation.
- the function module 336 may alter the query interpretation based on a function (e.g., a predetermined function, an analyst and/or developer created function, etc.).
- the function may be an equation, a programming command, a sequence of commands, etc.
- the developer operation module 338 may generate the query interpretation based on an operation added and/or modified by a developer in the extensible computer process 100 .
- the translation module 340 may generate the query interpretation by translating the customizable database request 124 from any language (e.g., a computer programming language such as SQL, Java, dBase, and/or a human language such as Indonesian, Russian, Spanish, and/or Chinese).
- the reference module 342 may provide an extensible computer process information about its context in the database request.
- FIG. 4 is an exploded view of the monitoring module, according to another embodiment.
- FIG. 4 illustrates the monitoring module 114 , a detection module 450 , a rectification module 452 , a parallelization module 454 , an additional parallelization module 456 , and a response organization module 458 .
- the detection module 450 may observe an input and/or an output of the analysis phase 104, the servers 126 A-N, the available computing environment 112, and the additional analysis phase 106 A-N.
- the detection module 450 may also observe the operation and transmitted data of the database 108 A-N, the query planning module, and/or the extensible computer process 100 .
- the detection module 450 may automatically detect a fault in the analysis phase 104 and/or the additional analysis phase 106 A-N.
- the rectification module 452 may automatically rectify an output effect (e.g., a process failure, a system crash, a corrupted data, a propagating failure, etc.) of the fault.
- the automatic rectification may include an isolation of the fault generating mechanism (e.g., a process, a server, a component, etc.).
- the automatic rectification may include re-executing an interrupted process (e.g., the analysis phase 104 , the component, the additional analysis phase 106 A-N, etc.).
- the automatic rectification may include logging the fault and/or the corrupted data.
- the rectified data may be excluded (e.g., from a query response, a repeated analysis phase 104 , etc.).
- the parallelization module 454 may automatically distribute the analysis phase of the query interpretation across an available computing environment.
- the additional parallelization module 456 may automatically distribute the additional analysis phase of the query interpretation across the available computing environment.
- the parallelization module 454 and/or the additional parallelization module 456 may consider a number of processors available, the number of analyses to be performed, and/or the sequence of the distributed processes.
- the response organization module 458 may automatically assemble the response 116 using the distributed output of the additional analysis phase.
- the response organization module 458 may wait for a completion of all necessary processes prior to assembling the response 116 .
- the response organization module 458 may further post process an output of each of the available processors when automatically assembling the response.
- the post processing may include a database operation, such as an aggregation operation, a sorting operation, and/or an invocation of a separate extensible computer process (e.g., an external program, a developer created function, a third-party software, etc.).
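Post-processing during response assembly can be illustrated with a short Python sketch that aggregates the per-processor outputs and then sorts the result. It assumes each processor returns a dictionary of partial counts; the function name `assemble_response` and the sample outputs are hypothetical.

```python
from collections import Counter


def assemble_response(processor_outputs, top_n=None):
    """Aggregate partial counts from each processor, then sort the result."""
    merged = Counter()
    for output in processor_outputs:
        merged.update(output)  # aggregation: add up counts per key
    # Sorting: highest count first, ties broken alphabetically by key.
    ordered = sorted(merged.items(), key=lambda item: (-item[1], item[0]))
    return ordered[:top_n] if top_n is not None else ordered


# Illustrative partial outputs from two processors.
processor_outputs = [{"a": 2, "b": 1}, {"b": 3, "c": 1}]
response = assemble_response(processor_outputs)
```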
- FIG. 5 is an illustration of processing input data to generate a query response, according to another embodiment.
- FIG. 5 illustrates the analysis phase 104, the additional analysis phase 106, the input data 510, the response 116, a component 560, an additional component 562, a table 564, text 566, an object 568, an audio file 570, a video file 572, an output table 574, an output text 576, an output object 578, an output audio file 580, an output video file 582, and an unstructured data 584.
- FIG. 5 illustrates a variety of types and forms that may be taken by the input data 510 and/or the response 116 .
- the input data 510 may include the table 564 , the text 566 , the object 568 , the audio file 570 , and/or the video file 572 .
- the input data may be structured in a form including a database table and/or an output of a different database query.
- the response 116 may include the output table 574 , the output text 576 , the output object 578 , the output audio file 580 , and/or the output video file 582 .
- the table 564 and/or the output table 574 may be structured data.
- the text 566 , the object 568 , the audio file 570 , the video file 572 , the output text 576 , the output object 578 , the output audio file 580 , and/or the output video file 582 may be unstructured data 584 .
- the input data 510 may be unstructured in a form including a content of at least one file in a computing environment.
- the unstructured data 584 may include a mix of data types, including images and audio files, text, programs, and/or word processing files.
- the input data 510 may be communicated to the analysis phase 104 , which may process the data in the component 560 and/or the additional component 562 .
- the output of the analysis phase 104 may be received by the additional analysis phase 106 A-N, which may generate the response 116 .
- the additional analysis phase 106 A-N may consist of one or more phases.
- the response 116 may be formed of the same and/or a different data type from the input data 510 .
- FIG. 6 is a system view of an alternate embodiment of processing of a customizable database query using a developer extensible operation and an available computing environment 112 .
- FIG. 6 illustrates the query planning module 102 , the analysis phase 104 , the additional analysis phase 106 , the database 108 , the input data 510 , the monitoring module 114 , the response 116 , the analyst 120 , the developer 122 , M 686 A-C, R 688 A-B, and intermediate files 690 .
- the query planning module 102 may receive a customizable database request 124 from the analyst 120 .
- the developer 122 may contribute to and/or modify the customizable database request 124 .
- the query planning module 102 may communicate a query interpretation to the analysis phase 104 .
- the analysis phase 104 may receive an input data 510 from the database 108 .
- the input data 510 may be divided into split 0 - 4 .
- the analysis phase may include multiple components M 686 A-C.
- the additional analysis phase 106 may include the R 688 A-B.
- the M 686 A-C may each represent a map operation performed on a limited data input (e.g., split 0 and 1 , split 2 and 4 , split 3 , etc.).
- the M 686 A-C may generate intermediate files 690 , which may be communicated to the additional analysis phase 106 .
- the R 688 A-B may represent reduce operations in which the outputs of the map phases are accessible to each of the reduce operations.
- the R 688 A-B of the additional analysis phase 106 may produce output file 0 - 1 (e.g., the response 116 ).
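- the data flow of FIG. 6 can be sketched in Python as follows, under stated assumptions: the records in each split are made up, the dictionary `assignment` mirrors the example pairing of splits with map operations, and the `crc32`-based `partition` function is one illustrative way to route intermediate data to a reduce operation, not something the disclosure specifies.

```python
import zlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Five input splits, mirroring "split 0-4"; the records themselves are invented.
splits = [[("a", 1)], [("b", 2)], [("a", 3)], [("c", 4)], [("b", 5)]]
# Which splits each map operation reads (M A: 0 and 1, M B: 2 and 4, M C: 3).
assignment = {"M_A": [0, 1], "M_B": [2, 4], "M_C": [3]}
NUM_REDUCERS = 2


def partition(key):
    """Pick a reduce operation for a key; crc32 keeps the choice stable across runs."""
    return zlib.crc32(key.encode("utf-8")) % NUM_REDUCERS


def map_operation(split_ids):
    """Read only the assigned splits and fill one intermediate bucket per reducer."""
    buckets = [[] for _ in range(NUM_REDUCERS)]
    for sid in split_ids:
        for key, value in splits[sid]:
            buckets[partition(key)].append((key, value))
    return buckets


def reduce_operation(reducer_id, intermediate):
    """Read this reducer's bucket from every map output and sum the values per key."""
    totals = defaultdict(int)
    for buckets in intermediate:
        for key, value in buckets[reducer_id]:
            totals[key] += value
    return dict(totals)


with ThreadPoolExecutor() as pool:
    intermediate = list(pool.map(map_operation, assignment.values()))
    output_files = list(
        pool.map(lambda r: reduce_operation(r, intermediate), range(NUM_REDUCERS))
    )

merged = {key: total for out in output_files for key, total in out.items()}
print(output_files, merged)
```

- each map operation sees only its own splits, while each reduce operation reads from every map output, which matches the limited-input and full-access roles described for the M 686 A-C and the R 688 A-B.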
- FIG. 7 is an illustration of processing input data to generate a response, according to an alternate embodiment.
- FIG. 7 illustrates the input data 510 , the analysis phase 104 , the additional analysis phase 106 , the response 116 , the developer 122 , the M 686 A-B and the R 688 A-B.
- the input data 510 may include two text files (e.g., the dog, the cat).
- the analysis phase 104 may separate the text files into separate parts (e.g., the, dog, the, cat, etc.).
- the output of the operations M 686 A-B may be automatically redistributed to the parts of the additional analysis phase 106 .
- the outputs of the operations M 686 A-B may be sorted and/or categorized.
- the operations of the additional analysis phase, R 688 A-B, may form the response 116 .
- the query response may include a count of each word (e.g., 1 “cat,” 1 “dog,” 2 “the,” etc.).
- the M 686 A-B may each be limited to a part of the input data 510 .
- the R 688 A-B may be capable of accessing all outputs of the analysis phase 104 .
- the developer 122 may customize and/or affect the operations (e.g., the M 686 A-B, the R 688 A-B, etc.) while the distribution of the analysis phase 104 and/or the additional analysis phase 106 is automatically handled.
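- the word count of FIG. 7 can be illustrated with a short Python sketch. The function names `map_words`, `shuffle`, and `reduce_counts` are hypothetical, and the example runs in a single process, whereas the described system would distribute these steps automatically.

```python
from collections import defaultdict

documents = ["the dog", "the cat"]  # the two text files in the example


def map_words(text):
    """M: split one file into (word, 1) pairs; it sees only its own part of the input."""
    return [(word, 1) for word in text.split()]


def shuffle(mapped):
    """Redistribute the map outputs so all pairs for a word are grouped, sorted by word."""
    groups = defaultdict(list)
    for pairs in mapped:
        for word, one in pairs:
            groups[word].append(one)
    return dict(sorted(groups.items()))


def reduce_counts(word, ones):
    """R: count the occurrences of one word."""
    return word, sum(ones)


mapped = [map_words(doc) for doc in documents]
counts = dict(reduce_counts(word, ones) for word, ones in shuffle(mapped).items())
print(counts)  # {'cat': 1, 'dog': 1, 'the': 2}
```

- only `map_words` and `reduce_counts` correspond to operations a developer would customize; the grouping step stands in for the redistribution that is handled automatically.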
- FIG. 8 is a diagrammatic system view of a data processing system in which any of the embodiments disclosed herein may be performed, according to one embodiment.
- the diagrammatic system view 800 of FIG. 8 illustrates a processor 802 , a main memory 804 , a static memory 806 , a bus 808 , a video display 810 , an alpha-numeric input device 812 , a cursor control device 814 , a drive unit 816 , a signal generation device 818 , a network interface device 820 , a machine readable medium 822 , instructions 824 , and a network 826 , according to one embodiment.
- the diagrammatic system view 800 may indicate a personal computer and/or the data processing system in which one or more operations disclosed herein are performed.
- the processor 802 may be a microprocessor, a state machine, an application specific integrated circuit, a field programmable gate array, etc. (e.g., Intel® Pentium® processor).
- the main memory 804 may be a dynamic random access memory and/or a primary memory of a computer system.
- the static memory 806 may be a hard drive, a flash drive, and/or other memory information associated with the data processing system.
- the bus 808 may be an interconnection between various circuits and/or structures of the data processing system.
- the video display 810 may provide graphical representation of information on the data processing system.
- the alpha-numeric input device 812 may be a keypad, a keyboard and/or any other input device of text (e.g., a special device to aid the physically handicapped).
- the cursor control device 814 may be a pointing device such as a mouse.
- the drive unit 816 may be the hard drive, a storage system, and/or other longer term storage subsystem.
- the signal generation device 818 may be a BIOS and/or a functional operating system of the data processing system.
- the network interface device 820 may be a device that performs interface functions such as code conversion, protocol conversion and/or buffering required for communication to and from the network 826 .
- the machine readable medium 822 may provide instructions on which any of the methods disclosed herein may be performed.
- the instructions 824 may provide source code and/or data code to the processor 802 to enable any one or more operations disclosed herein.
- FIG. 9 is a process flow of interpreting and executing a customizable database request, according to one embodiment.
- an interpretation of a customizable database request may be generated (e.g., using the translation module 340 and/or the query planning module 102 ), which may include an extensible computer process.
- an input guidance may be provided to available processors of an available computing environment 112 .
- an input of each of the available processors may be pre-processed (e.g., using the query planning module 102 ) when providing the input guidance to the available processors.
- an information may be provided to the extensible computer process about its context in the customizable database request (e.g., using the dynamic interpretation module 334 and/or the reference module 342 ).
- an interpretation of the customizable database request may be processed (e.g., using the query planning module 102 ) based on the information provided.
- an execution of the interpretation may be automatically distributed (e.g., using the analysis phase 104 ) across the available computing environment operating concurrently and in parallel (e.g., using the reference module 342 ).
- a fault may be detected (e.g., using the detection module 450 of the monitoring module 114 ) in the execution of the interpretation.
- a response may be automatically assembled (e.g., by the response organization module 458 ) using a distributed output of the execution.
- an output of each of the available processors may be post processed (e.g., by the response organization module 458 ) when automatically assembling the response.
- FIG. 10 is a process flow of automatically distributing an analysis phase and an additional analysis phase of the interpretation of a customizable database request across the available computing environment, according to one embodiment.
- an interpretation of a customizable database request which includes an extensible computer process may be generated (e.g., using the SQL instruction module 332 , the translation module 340 , and/or the optimization module 330 of the query planning module 102 ).
- an input guidance may be provided to available processors of an available computing environment.
- the extensible computer process may be provided information about its context in the customizable database request (e.g., using the reference module 342 ).
- an information may be processed (e.g., using the dynamic interpretation module 334 ) that affects the interpretation of the customizable database request based on the information provided.
- an input of each of the available processors may be pre-processed when providing the input guidance to the available processors.
- an analysis phase of the interpretation may be automatically distributed (e.g., using the parallelization module 454 ) across the available computing environment operating concurrently and in parallel.
- an additional analysis phase of the interpretation may be automatically distributed (e.g., using the additional parallelization module 456 ) across the available computing environment.
- an output of each of the available processors may be post processed when the response is automatically assembled (e.g., using the response organization module 458 ).
- the various devices, modules, analyzers, generators, etc. described herein may be enabled and operated using hardware circuitry (e.g., CMOS based logic circuitry), firmware, software and/or any combination of hardware, firmware, and/or software (e.g., embodied in a machine readable medium).
- the various structures and methods may be embodied using transistors, logic gates, and electrical circuits (e.g., application specific integrated (ASIC) circuitry and/or in Digital Signal Processor (DSP) circuitry).
- the extensible computer process 100 , the query planning module 102 , the analysis phase 104 , the additional analysis phase 106 A-N, the monitoring module 114 , the user interface 118 , the optimization module 330 , the SQL instruction module 332 , the dynamic interpretation module 334 , the function module 336 , the developer operation module 338 , the translation module 340 , the reference module 342 , the detection module 450 , the rectification module 452 , the parallelization module 454 , the additional parallelization module 456 , the response organization module 458 , the component 560 , the additional component 562 , the M 686 A-C, and the R 688 A-B of FIGS. 1-10 may be enabled using software and/or circuitry.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Debugging And Monitoring (AREA)
- Stored Programmes (AREA)
Abstract
Interpretation and execution of a customizable database request using an extensible computer process and an available computing environment is disclosed. In an embodiment, a method includes generating an interpretation of a customizable database request which includes an extensible computer process and providing an input guidance to available processors of an available computing environment. The method further includes automatically distributing an execution of the interpretation across the available computing environment operating concurrently and in parallel, wherein a component of the execution is limited to at least a part of an input data. The method also includes automatically assembling a response using a distributed output of the execution.
Description
- This disclosure relates generally to interpretation and execution of a customizable database request using an extensible computer process and an available computing environment.
- A database analyst may seek to request information from a database but may be prevented from doing so by a lack of an ability to customize a database query. The database analyst may also be unable to distribute the processing of the query across a distributed computational environment, which may include one or more servers. The database analyst may be restricted to a limited set of queries that may limit the effectiveness of the analyst's ability to obtain information from the database. The analyst may therefore seek data inefficiently using an excessive number of queries. The data analyst may also be required to transfer the processed information of the database to a separate process to analyze the data. The database analyst may therefore be required to spend an excessive amount of time obtaining information, which may lead to a delay, an additional cost of the analyst's time, an additional time for a processor usage, and a greater possibility of incurring a human made error. The database analyst may ultimately fail to find a desired information.
- Interpretation and execution of a customizable database request using an extensible computer process and an available computing environment is disclosed. In an aspect, a method includes generating an interpretation of a customizable database request which includes an extensible computer process and providing an input guidance to available processors of an available computing environment. The method further includes automatically distributing an execution of the interpretation across the available computing environment operating concurrently and in parallel, wherein a component of the execution is limited to at least a part of an input data. The method also includes automatically assembling a response using a distributed output of the execution.
- The input guidance may be provided to each of the available processors and may be comprised of certain portions of the input data. The input guidance may be used to determine which of the available processors are to perform functions related to at least the part of the input data. The method may further include providing an information to the extensible computer process about its context in the customizable database request, and processing an interpretation of the customizable database request based on the information provided. The extensible computer process may be a developer-provided computer program, and the information provided may include at least one of a format of the input data and an output data, whether the input data and the output data are ordered and in which form, grouping information, statistics of the input data and the output data, a distribution information, a length of the input data and the output data, and a custom parameter.
- The custom parameter may be at least one of a number, a string, a list of numbers or strings, a content of a file in the available computing environment, and a result of the customizable database request. The method may further include post processing an output of each of the available processors when automatically assembling the response. The post processing may include at least one database operation including at least one of an aggregation operation, a sorting operation, and an invocation of another extensible computer process.
- The method may further include pre-processing an input of each of the available processors when providing the input guidance to the available processors. The available computing environment may be comprised of at least two servers. The customizable database request may specify the input data for the extensible computer process. The input data may be structured in a form comprising at least one of a database table and an output of a different database query.
- The input data may be unstructured in a form comprising a content of at least one file in a computing environment. The method may further include detecting a fault in the execution of the interpretation, and automatically rectifying an output effect of the fault. Rectifying the output effect of the fault may include at least one of reprocessing an operation, excluding a corrupted data, and logging the corrupted data. The customizable database request may be comprised of at least one of a predetermined function, a developer created function, and an analyst created function.
- In another aspect, a system may include a query planning module to generate an interpretation of a database request which includes an extensible computer process, and a parallelization module to provide an information to available processors of an available computing environment and to automatically distribute an execution of the interpretation across the available computing environment operating concurrently and in parallel. A component of the execution may be limited to at least a part of an input data. The system may further include a response organization module to automatically assemble a response using a distributed output of the execution.
- The information may be used to provide each of the available processors certain portions of the input data, and to determine which of the available processors are to perform functions related to at least the part of the input data. The system may include a reference module to provide an extensible computer process information about its context in the database request. The system may include a dynamic interpretation module to process information that affects the interpretation of the database request based on the information provided, wherein the extensible computer process is a developer-provided computer program.
- The information provided may include a format of the input data and an output data, whether the input data and the output data are ordered and in which form, grouping information, statistics of the input data, a distribution information, a length of the input data and the output data, and custom parameters. The custom parameters may be at least one of a number, a string, a list of numbers or strings, a content of a file in the available computing environment, and a result of the database request.
- In yet another aspect, a method includes generating an interpretation of a customizable database request which includes an extensible computer process, and providing an input guidance to available processors of an available computing environment. The input guidance determines which of the available processors are to perform functions related to at least a part of an input data. The method further includes pre-processing an input of each of the available processors when providing the input guidance to the available processors, and automatically distributing an analysis phase of the interpretation across the available computing environment operating concurrently and in parallel. A component of the analysis phase is limited to at least a part of the input data.
- The method further includes automatically distributing an additional analysis phase of the interpretation across the available computing environment, and automatically assembling a response using a distributed output of the additional analysis phase. The method also includes post processing an output of each of the available processors when automatically assembling the response. The post processing includes at least one database operation including one or more of an aggregation operation, a sorting operation, and an invocation of another extensible computer process.
- The method may include providing an extensible computer process information about its context in the customizable database request, and processing information that affects the interpretation of the customizable database request based on the information provided. In the aspect, the extensible computer process is a developer-provided computer program, and the information provided includes at least one of a format of the input data and an output data, whether the input data and the output data are ordered and in which form, grouping information, statistics of the input data, a distribution information, a length of the input data and the output data, and custom parameters. The custom parameters are one or more of a number, a string, a list of numbers or strings, a content of a file in the available computing environment, and a result of the customizable database request.
- Other aspects and example embodiments are provided in the drawings and the detailed description that follows.
- Example embodiments are illustrated by way of example and not limitation in the figures of the accompanying drawings, in which like references indicate similar elements and in which:
- FIG. 1 is a system view illustrating processing of a customizable database query using a developer extensible operation and an available computing environment, according to one embodiment.
- FIG. 2 is an exploded view of the available computing environment, according to one embodiment.
- FIG. 3 is an exploded view of a query planning module, according to one embodiment.
- FIG. 4 is an exploded view of a monitoring module, according to another embodiment.
- FIG. 5 is an illustration of processing input data to generate a query response, according to another embodiment.
- FIG. 6 is a system view of an alternate embodiment of processing of a customizable database query using a developer extensible operation and an available computing environment.
- FIG. 7 is an illustration of processing input data to generate a query response, according to an alternate embodiment.
- FIG. 8 is a diagrammatic system view of a data processing system in which any of the embodiments disclosed herein may be performed, according to one embodiment.
- FIG. 9 is a process flow of interpreting and executing a customizable database request, according to one embodiment.
- FIG. 10 is a process flow of automatically distributing an analysis phase and an additional analysis phase of the interpretation of a customizable database request across the available computing environment, according to one embodiment.
- Other features of the present embodiments will be apparent from the accompanying drawings and from the detailed description that follows.
- Interpretation and execution of a customizable database request using an extensible computer process and an available computing environment is disclosed. Although the present embodiments have been described with reference to specific example embodiments, it will be evident that various modifications and changes may be made to these embodiments without departing from the broader spirit and scope of the various embodiments.
- FIG. 1 is a system view illustrating processing of a customizable database query using a developer extensible operation and an available computing environment, according to one embodiment. In particular, FIG. 1 illustrates an extensible computer process 100, a query planning module 102, an analysis phase 104, an additional analysis phase 106A-N, an available computing environment 112, a monitoring module 114, a response 116, a user interface 118, an analyst 120, a developer 122, a customizable database request 124, and servers 126.
- FIG. 1 illustrates an analyst 120 providing a customizable database request 124 to an extensible computer process 100. The analyst 120 may be a database analyst who is familiar with SQL (e.g., a Structured Query Language). SQL may be a database computer language designed for the retrieval and management of data in relational database management systems (RDBMS), database schema creation and modification, and database object access control management. The analyst 120 may have limited knowledge of other programming languages, and may have a substantially limited ability to create programs, to modify software, and to manage software distributed across multiple processors. The analyst 120 may be tasked with searching for data rather than developing programs.
- The customizable database request 124 may consist of a SQL instruction and/or it may be written in any programming language. The customizable database request 124 may be customized to include a function (e.g., a nested SQL command, a mathematical equation, a variable, a standard deviation, etc.). The function may be created by the analyst 120, the developer 122, and/or it may be a predetermined function. The function may be customized to search multiple records at once, to retrieve and/or manipulate data in multiple forms (e.g., tables, images, unstructured data 584, text files, programs, sound files, photos, etc.). The function may access data in one form and generate data in another form. The customizable database request 124 may further specify an input data 510 for the extensible computer process 100.
- The customizable database request 124 may allow the process to be scaled in accordance with a changing system hardware and/or performance of a system. The function may allow user-implemented procedural code to be uploaded to a database and executed at each node of a system. A user (e.g., an analyst 120, a developer 122, etc.) may provide code that may operate on individual rows and/or on groups of rows. The customizable database request 124 may take in input using a set of rows in a table (e.g., a persistent table in a database, the output of a SQL SELECT statement and/or the output of another function, etc.). The customizable database request 124 may result in an output that includes a relation of a set of rows (e.g., an output unrelated to the input). The customizable database request 124 and/or a function of the customizable database request 124 may be placed into a SQL SELECT query and/or any other query as though it were itself a table. This integration with SQL may allow for composing SQL and procedural code invocations in any form and shape. The code may be written in Java, Python, and/or any other language.
- In an embodiment, the customizable database request 124 may include a function that is written in Java that is then invoked as part of a SQL query statement. The function may convert sets of rows to sets of rows. The function may be parallelized to operate on rows across multiple nodes simultaneously. The function may be invoked on arbitrary sets of rows and/or rows grouped together by a PARTITION BY clause. Within a partition, rows may be further sorted using an ORDER BY clause.
- In an embodiment, a function may split strings into words. In the embodiment, the function may be invoked once for every row in an input table. The function may include Java procedural code that takes each document and emits a row for each word. The function may define a column that appears in its output rows. In another embodiment, a function may be created to compute the 10 most-frequently occurring words in a body of text using the function to split strings into words.
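- A row function of this kind might look like the following Python sketch (the embodiment mentions Java; Python is used here only for brevity). The column name `body`, the lowercasing of words, and the helper `top_words` are assumptions made for illustration.

```python
from collections import Counter

OUTPUT_COLUMNS = ("word",)  # the column the function declares for its output rows


def split_words(row):
    """Row function: take one input row (a dict with a 'body' column), emit one row per word."""
    for word in row["body"].lower().split():
        yield {"word": word}


def top_words(table, n=10):
    """Invoke split_words once per input row and return the n most frequent words."""
    counts = Counter(out["word"] for row in table for out in split_words(row))
    return counts.most_common(n)


documents = [
    {"id": 1, "body": "the dog chased the cat"},
    {"id": 2, "body": "The cat slept"},
]
print(top_words(documents, n=2))
```

- Because `split_words` needs only a single row at a time, it could be invoked on different rows by different nodes, with the counting step applied to the combined output.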
- In yet another embodiment, a function of the customizable database request 124 may perform sessionization by mapping each click in a clickstream to a unique session identifier. The function may define a session as a sequence of clicks by a particular user where no more than n seconds pass between successive clicks (e.g., if a click from a user isn't seen for n seconds, a new session is started). The function may use a userid and/or a timestamp attribute. The function may include as parameters the name of the timestamp attribute and the number of seconds between clicks that results in starting a new session. A clickstream table may be partitioned by userid, and partition tuples may be sequenced by timestamp. The sessionize function may then be invoked against each of the ordered partitions and/or emit the input tuples with an appropriate sessionid added.
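- The sessionization logic can be sketched in Python as below. This is an illustrative approximation: the `sessionid` format (`userid` plus a counter), the default parameter values, and the in-memory sort standing in for PARTITION BY userid and ORDER BY timestamp are all assumptions.

```python
from itertools import groupby
from operator import itemgetter


def sessionize(clicks, timestamp_attr="ts", timeout=60):
    """Add a sessionid to each click; a gap of more than `timeout` seconds starts a new session."""
    out = []
    # Stand-in for PARTITION BY userid ... ORDER BY timestamp.
    ordered = sorted(clicks, key=itemgetter("userid", timestamp_attr))
    for userid, partition in groupby(ordered, key=itemgetter("userid")):
        session, previous = 0, None
        for click in partition:
            if previous is not None and click[timestamp_attr] - previous > timeout:
                session += 1  # too long since the last click: start a new session
            previous = click[timestamp_attr]
            out.append({**click, "sessionid": f"{userid}-{session}"})
    return out


clicks = [
    {"userid": "u1", "ts": 0},
    {"userid": "u1", "ts": 30},
    {"userid": "u1", "ts": 200},
    {"userid": "u2", "ts": 10},
]
for row in sessionize(clicks):
    print(row)
```

- Since each partition is processed independently, partitions for different users could be handled by different processors without coordination.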
- The customizable database request 124 may be received by an
extensible computer process 100, which may be designed to take into consideration future growth by allowing the addition and/or modification of functionality. The addition of new functionality and/or the modification of existing functionality may be accomplished with limited impact to existing system functions. Adeveloper 122 may be familiar with a type of programming involving database analysis, query modification, and/or data searches. Thedeveloper 122 may possess limited knowledge regarding programs to distribute an analysis across multiple computing systems. Thedeveloper 122 may support and/or design software for theanalyst 120. Thedeveloper 122 may adapt theextensible computer process 100 to add new functions, modify existing functions, and/or add additional language ability to the software. - The
extensible computer process 100 may communicate with aquery planning module 102 to generate a query interpretation of the customizable database request 124. The query interpretation may be formatted to be distributable (e.g., separated into individual tasks for separate processes, etc.). The query interpretation may convert the customizable database request 124 from any computer language (e.g., a machine-readable artificial language designed to express computations that can be performed by a machine, C++, SQL, Perl, Java, Prolog, etc.) into a preferred programming language. The query interpretation may automatically format the customizable database query to be processed using a distributable, multiphase analysis. - The
query planning module 102 may generate an interpretation (e.g., the query interpretation) of the customizable database request, which may include an extensible computer process. Thequery planning module 102 may optimize the analysis phase and/or the additional analysis phase using a parameter (e.g., an expected output file size, an input file format, a table dimension, etc.). The query planning module may provide an input guidance to available processors of the available computing environment. The input guidance may include certain portions of the input data, and the input guidance may be used to determine which of the available processors are to perform functions related to different parts of the input data. - The
query planning module 102 may use the parameter to allocate a system resource (e.g., memory, power supply output, processor usage, a number of servers applied, a sequence of processors used, a timing of processes analyzed, etc.). The allocation of a system resource may include a distribution of processes across anavailable computing environment 112, a selection of a type of analysis to apply, and/or a selection of input data to review. The execution of the interpretation may be automatically distributed across an available computing environment operating concurrently and in parallel, and a component of the execution may be limited to a part of the input data. The part of the input data may be a subset of the input data, which may allow the execution to be divided into separate tasks to be processed by different machines. - The available computing environment 112 (e.g., networked processors, virtual machines, multiple processors of a server,
multiple servers 126A-N and 128A-N, etc.) may be comprised of servers that are and/or will be available to process data. Theavailable computing environment 112 may be better illustrated inFIG. 2 . - The query interpretation may be dynamically determined based on a context (e.g., a repeated pattern of requested information, an association between an analyst's customizable database request 124 and an
input data 510, an available input data 510, etc.). The context of the customizable database request 124 may include the type of requested information, the language of the request, and/or the expected response 116. For example, if the analyst's request includes a name and address, the analysis phase 104 and/or the additional analysis phase 106A-N may be adjusted to provide a response 116 that includes GPS coordinates (e.g., latitude and/or longitude, etc.). In another embodiment, the query interpretation may automatically provide alternate responses based on a variation of the requested parameters, such as by expanding or contracting a search parameter to provide alternate responses, varying search parameters, and searching for peak values. - The interpretation of the customizable database request generated by the
query planning module 102 may be processed based on a contextual information provided to the extensible computer process. The extensible computer process may be a developer provided-computer program. The information provided may include a format of the input data and the output data, whether the input data and the output data are ordered and in which form, grouping information, statistics of the input data and the output data, a distribution information, a length of the input data and the output data, and a custom parameter. - The custom parameter may be a number, a string, and/or a list of numbers of strings. The custom parameter may further include a content of a file in the available computing environment, and/or a result of the customizable database request (e.g., the response 116).
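The contextual information described above can be pictured as a small record handed to the developer-provided program. The following is only a minimal sketch under stated assumptions: the names `ProcessContext` and `filter_rows`, and the `column` and `threshold` custom parameters, are hypothetical and do not appear in the disclosure.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class ProcessContext:
    """Hypothetical container for the information provided to an extensible process."""
    input_format: List[str]                 # column names of the input data
    output_format: List[str]                # column names of the output data
    input_ordered_by: Optional[str] = None  # ordering information, if any
    grouped_by: Optional[str] = None        # grouping information, if any
    input_length: Optional[int] = None      # length of the input data, if known
    custom_parameters: Dict[str, Any] = field(default_factory=dict)


def filter_rows(rows: List[Tuple], context: ProcessContext) -> List[Tuple]:
    """Example developer-provided process: keep rows at or above a threshold.

    The column position comes from the input format, and the column name and
    threshold come from the custom parameters (a string and a number).
    """
    column = context.custom_parameters["column"]
    threshold = context.custom_parameters["threshold"]
    index = context.input_format.index(column)
    return [row for row in rows if row[index] >= threshold]
```

In this sketch the process never hard-codes the table layout; it reads the layout from the context, which is one way a single program could be reused across differently shaped requests.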
- The query interpretation generated by the
query planning module 102 may be communicated to an analysis phase 104, which may be automatically distributed across an available computing environment 112. The automatic distribution of the query interpretation may allow separate machines to analyze the query using portions of an input data 510 simultaneously, in parallel, in an overlapping sequence, and/or in series. - The
analysis phase 104 may include a component that is limited to a part of the input data 510. The component may process a part of a “map” phase of a MapReduce analysis (e.g., a framework for computing a distributable problem). The component may process a part of the analysis phase 104 using its part of the input data 510. The analysis phase 104 may also include an additional component that uses the output of the component to generate an additional output (e.g., the additional component operates in series with the component, the additional component uses the output of the component as one of several inputs, etc.). - The
analysis phase 104 may process the query interpretation using the input data 510, which may be acquired from the database 108A-N. The input data 510 may include structured data and/or unstructured data 584, as illustrated in FIG. 5. The input data of the analysis phase may be generated using a combination of multiple data sources (e.g., multiple tables, storage devices, etc.). The portion of the input data used by a component of the analysis phase 104 may also be generated using a combination of multiple data sources. - The
analysis phase 104 may communicate with a monitoring module 114 and/or the additional analysis phase 106A-N, which may be automatically distributed across the available computing environment (e.g., currently available servers, virtual machines, processors, etc.). The additional analysis phase 106A-N may access a greater amount of information than the amount of the input data 510 used by the analysis phase 104. The additional analysis phase 106A-N may operate in parallel, in series, or in any other pattern with the analysis phase 104. - The
response 116 may be automatically assembled using a distributed output of the additional analysis phase 106A-N. The output of the additional analysis phase 106A-N may be distributed across multiple processors, servers, and/or virtual machines, and a complete resulting output may require an accumulation of all distributed parts of the additional analysis phase 106A-N output. The assembled output may be the response 116. The response 116 may be displayed through a user interface (e.g., a web browser, a terminal, a PC, a server, a monitor, etc.). - The
monitoring module 114 may observe the input data 510 provided to the analysis phase 104, the available computing environment 112, the input to the additional analysis phase 106A-N, the processing of information by the additional analysis phase 106A-N, and the assembled response 116. The monitoring module 114 may manage the automatic distribution of the analysis phase 104 and/or the additional analysis phase 106A-N across the available computing environment 112. The monitoring module 114 may assemble the distributed output of the additional analysis phase 106A-N to generate the response 116. - The
monitoring module 114 may detect a fault (e.g., an exception, a hardware failure, a system crash, a processor failure, a data error, a processing error, etc.) in the analysis phase 104 and/or the additional analysis phase 106A-N. The monitoring module 114 may automatically rectify an output effect (e.g., a data corruption, a propagating data error, a system failure, etc.) of the fault. The rectification may include one or more of reprocessing an operation (e.g., a component of the analysis phase 104, the additional analysis phase 106A-N, etc.), excluding a corrupted data, and/or logging a corrupted data. The rectification may include isolating a fault generating process and/or hardware mechanism. The monitoring module 114 may rectify an output effect automatically (e.g., without intervention by the developer 122 and/or analyst 120). -
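The three rectification options named above (reprocessing an operation, excluding corrupted data, and logging corrupted data) can be sketched in a few lines. This is an illustrative sketch only; the function name `run_component`, the retry limit, and the logger name are assumptions rather than details from the disclosure.

```python
import logging
from typing import Any, Callable, Iterable, List, Tuple

logger = logging.getLogger("monitoring_sketch")  # hypothetical logger name


def run_component(
    operation: Callable[[Any], Any],
    records: Iterable[Any],
    max_attempts: int = 3,
) -> Tuple[List[Any], List[Any]]:
    """Apply `operation` to each record, rectifying faults without intervention.

    A failing record is reprocessed up to `max_attempts` times. If it still
    fails, it is treated as corrupted: logged and excluded from the output.
    Returns (results, excluded_records).
    """
    results: List[Any] = []
    excluded: List[Any] = []
    for record in records:
        for attempt in range(1, max_attempts + 1):
            try:
                results.append(operation(record))
                break  # success, move on to the next record
            except Exception as exc:  # fault detected
                logger.warning("fault on %r (attempt %d): %s", record, attempt, exc)
        else:
            # All attempts failed: exclude and log the corrupted record.
            excluded.append(record)
            logger.error("excluding corrupted record %r", record)
    return results, excluded
```

For example, `run_component(int, ["1", "x", "3"])` returns the parsed values for the two valid records and reports `"x"` as excluded, so a single bad record does not fail the whole phase.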
FIG. 2 is an exploded view of the available computing environment 112 illustrated in FIG. 1, according to one embodiment. In particular, FIG. 2 illustrates the available computing environment 112, the servers 126A-N, and the databases 108A-N, according to one embodiment. The available computing environment 112 may include one or more servers that are currently or will be open to process information within a preferred time frame. The servers 126A-N of the available computing environment 112 may be comprised of one or more separate servers, virtual machines, client devices, and/or separate processors of a single server. The servers 126A-N may communicate with one or more databases (e.g., databases 108A-N), which may be included within the available computing environment 112. The servers 126A-N and the databases 108A-N may communicate with each other via a LAN, a WAN, a MAN, and/or any other network arrangement. In addition, the databases 108A-N may include direct attached storage devices, volatile and/or non-volatile memory. -
FIG. 3 is an exploded view of the query planning module, according to one embodiment. In particular, FIG. 3 includes the query planning module 102, an optimization module 330, a SQL instruction module 332, a dynamic interpretation module 334, a function module 336, a developer operation module 338, a translation module 340, and a reference module 342. - The
query planning module 102 may include multiple modules to perform various functions. For example, the optimization module 330 may optimize the analysis phase 104 and/or the additional analysis phase 106A-N using a parameter included with the customizable database request. The parameter may include a prediction and/or expectation regarding the response 116 (e.g., an output memory requirement, a number of generated responses, a range of response outputs, a type of input data 510, etc.). The SQL instruction module 332 may interpret a SQL command, a nested SQL instruction, etc. - The
dynamic interpretation module 334 may dynamically determine a query interpretation of the customizable database request 124 based on a context (e.g., a scope and/or format of the customizable database request 124, an aspect of the input data 510, the available computing environment, etc.). The analysis may be dynamically altered in accordance with the query interpretation. - The
function module 336 may alter the query interpretation based on a function (e.g., a predetermined function, an analyst and/or developer created function, etc.). The function may be an equation, a programming command, a sequence of commands, etc. The developer operation module 338 may generate the query interpretation based on an operation added and/or modified by a developer in the extensible computer process 100. The translation module 340 may generate the query interpretation by translating the customizable database request 124 from any language (e.g., a computer programming language such as SQL, Java, dBase, and/or a human language such as Indonesian, Russian, Spanish, and/or Chinese). The reference module 342 may provide an extensible computer process information about its context in the database request. -
FIG. 4 is an exploded view of the monitoring module, according to another embodiment. In particular, FIG. 4 illustrates the monitoring module 114, a detection module 450, a rectification module 452, a parallelization module 454, an additional parallelization module 456, and a response organization module 458. - The
detection module 450 may observe an input and/or an output of the analysis phase 104, the servers 126A-N, the available computing environment 112, and the additional analysis phase 106A-N. The detection module 450 may also observe the operation and transmitted data of the database 108A-N, the query planning module, and/or the extensible computer process 100. The detection module 450 may automatically detect a fault in the analysis phase 104 and/or the additional analysis phase 106A-N. - The
rectification module 452 may automatically rectify an output effect (e.g., a process failure, a system crash, a corrupted data, a propagating failure, etc.) of the fault. The automatic rectification may include an isolation of the fault generating mechanism (e.g., a process, a server, a component, etc.). The automatic rectification may include re-executing an interrupted process (e.g., the analysis phase 104, the component, the additional analysis phase 106A-N, etc.). The automatic rectification may include logging the fault and/or the corrupted data. The rectified data may be excluded (e.g., from a query response, a repeated analysis phase 104, etc.). - The
parallelization module 454 may automatically distribute the analysis phase of the query interpretation across an available computing environment. The additional parallelization module 456 may automatically distribute the additional analysis phase of the query interpretation across the available computing environment. The parallelization module 454 and/or the additional parallelization module 456 may consider a number of processors available, the number of analyses to be performed, and/or the sequence of the distributed processes. - The
response organization module 458 may automatically assemble the response 116 using the distributed output of the additional analysis phase. The response organization module 458 may wait for a completion of all necessary processes prior to assembling the response 116. The response organization module 458 may further post process an output of each of the available processors when automatically assembling the response. The post processing may include a database operation, such as an aggregation operation, a sorting operation, and/or an invocation of a separate extensible computer process (e.g., an external program, a developer created function, a third-party software, etc.). -
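The wait-then-post-process behavior described above can be illustrated with Python's standard thread pool. This is a rough sketch under assumptions: `assemble_response` is a hypothetical name, each distributed task is modeled as a callable returning a dictionary of partial counts, and summation followed by sorting stands in for the aggregation and sorting operations.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple


def assemble_response(
    partial_tasks: List[Callable[[], Dict[str, int]]],
    max_workers: int = 4,
) -> List[Tuple[str, int]]:
    """Run the distributed tasks, wait for all of them, then post process.

    Post processing here is an aggregation (summing partial counts per key)
    followed by a sort on the key.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(task) for task in partial_tasks]
        # result() blocks, so assembly starts only after every task completes.
        partial_outputs = [future.result() for future in futures]

    totals: Counter = Counter()
    for partial in partial_outputs:
        totals.update(partial)      # aggregation operation
    return sorted(totals.items())   # sorting operation
```

Because the response is built only after every future has returned, a slow processor delays the response but cannot cause a partial one.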
FIG. 5 is an illustration of processing input data to generate a query response, according to another embodiment. In particular, FIG. 5 illustrates the analysis phase 104, the additional analysis phase 106, the input data 510, the response 116, a component 560, an additional component 562, a table 564, text 566, an object 568, an audio file 570, a video file 572, an output table 574, an output text 576, an output object 578, an output audio file 580, an output video file 582, and an unstructured data 584. -
FIG. 5 illustrates a variety of types and forms that may be taken by the input data 510 and/or the response 116. The input data 510 may include the table 564, the text 566, the object 568, the audio file 570, and/or the video file 572. The input data may be structured in a form including a database table and/or an output of a different database query. The response 116 may include the output table 574, the output text 576, the output object 578, the output audio file 580, and/or the output video file 582. The table 564 and/or the output table 574 may be structured data. The text 566, the object 568, the audio file 570, the video file 572, the output text 576, the output object 578, the output audio file 580, and/or the output video file 582 may be unstructured data 584. The input data 510 may be unstructured in a form including a content of at least one file in a computing environment. The unstructured data 584 may include a mix of data types, including images and audio files, text, programs, and/or word processing files. - The
input data 510 may be communicated to the analysis phase 104, which may process the data in the component 560 and/or the additional component 562. The output of the analysis phase 104 may be received by the additional analysis phase 106A-N, which may generate the response 116. The additional analysis phase 106A-N may consist of one or more phases. The response 116 may be formed of the same and/or a different data type from the input data 510. -
FIG. 6 is a system view of an alternate embodiment of processing of a customizable database query using a developer extensible operation and an available computing environment 112. In particular, FIG. 6 illustrates the query planning module 102, the analysis phase 104, the additional analysis phase 106, the database 108, the input data 510, the monitoring module 114, the response 116, the analyst 120, the developer 122, M 686A-C, R 688A-B, and intermediate files 690. - The
query planning module 102 may receive a customizable database request 124 from the analyst 120. The developer 122 may contribute to and/or modify the customizable database request 124. The query planning module 102 may communicate a query interpretation to the analysis phase 104. The analysis phase 104 may receive an input data 510 from the database 108. The input data 510 may be divided into splits 0-4. The analysis phase may include multiple components M 686A-C. The additional analysis phase 106 may include the R 688A-B. The M 686A-C may each represent a map operation performed on a limited data input (e.g., split 0 and 1, split 2 and 4, split 3, etc.). The M 686A-C may generate intermediate files 690, which may be communicated to the additional analysis phase 106. The R 688A-B may represent reduce operations in which the outputs of the map phases are accessible by each of the reduce operations. The R 688A-B of the additional analysis phase 106 may produce output files 0-1 (e.g., the response 116). -
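The data flow of FIG. 6 (five splits, three map operations, intermediate files, two reduce operations, two output files) can be sketched as follows. Only the split-to-map assignment comes from the example above; the names `M_A`, `M_B`, `M_C`, the CRC-based `partition` function, and the in-memory buckets standing in for intermediate files are assumptions made for illustration.

```python
import zlib
from collections import defaultdict
from typing import Any, Callable, Dict, Iterable, List, Tuple

# Split assignment taken from the example: split 0 and 1, split 2 and 4, split 3.
SPLIT_ASSIGNMENT = {"M_A": [0, 1], "M_B": [2, 4], "M_C": [3]}


def partition(key: str, num_reducers: int) -> int:
    """Deterministically choose which reduce operation receives a key (assumed scheme)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def run_map_task(split_ids, splits, map_fn, num_reducers):
    """One map operation: sees only its own splits, writes per-reducer buckets."""
    buckets: Dict[int, List[Tuple[str, Any]]] = defaultdict(list)
    for split_id in split_ids:
        for record in splits[split_id]:
            for key, value in map_fn(record):
                buckets[partition(key, num_reducers)].append((key, value))
    return buckets  # stands in for the intermediate files


def run_reduce_task(reducer_index, all_intermediate, reduce_fn):
    """One reduce operation: may read the intermediate output of every map operation."""
    grouped: Dict[str, List[Any]] = defaultdict(list)
    for buckets in all_intermediate:
        for key, value in buckets.get(reducer_index, []):
            grouped[key].append(value)
    return {key: reduce_fn(key, values) for key, values in sorted(grouped.items())}


def run_job(
    splits: List[List[Any]],
    map_fn: Callable[[Any], Iterable[Tuple[str, Any]]],
    reduce_fn: Callable[[str, List[Any]], Any],
    num_reducers: int = 2,
) -> List[Dict[str, Any]]:
    """Return one output dictionary per reduce operation (output files 0-1)."""
    intermediate = [
        run_map_task(ids, splits, map_fn, num_reducers)
        for ids in SPLIT_ASSIGNMENT.values()
    ]
    return [run_reduce_task(r, intermediate, reduce_fn) for r in range(num_reducers)]
```

The sketch runs the tasks sequentially for clarity; the asymmetry it is meant to show is that each map operation is confined to its splits, while each reduce operation can read from all map outputs.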
FIG. 7 is an illustration of processing input data to generate a response, according to an alternate embodiment. In particular, FIG. 7 illustrates the input data 510, the analysis phase 104, the additional analysis phase 106, the response 116, the developer 122, the M 686A-B and the R 688A-B. - The
input data 510 may include two text files (e.g., the dog, the cat). The analysis phase 104 may separate the text files into separate parts (e.g., the, dog, the, cat, etc.). The output of the operations M 686A-B may be automatically redistributed to the parts of the additional analysis phase 106. The outputs of the operations M 686A-B may be sorted and/or categorized. The operations of the additional analysis phase, R 688A-B, may form the response 116. The query response may include a count of each word (e.g., 1 “cat,” 1 “dog,” 2 “the,” etc.). The M 686A-B may each be limited to a part of the input data 510. The R 688A-B may be capable of accessing all outputs of the analysis phase 104. - The
developer 122 may customize and/or affect the operations (e.g., the M 686A-B, the R 688A-B, etc.) while the distribution of the analysis phase 104 and/or the additional analysis phase 106 is automatically handled. -
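The word-count example of FIG. 7 can be written out as a short sketch. The function names `map_words`, `shuffle`, `reduce_counts`, and `word_count` are hypothetical, and the sketch runs in a single process; in the described system the distribution of these steps would be handled automatically rather than by the developer.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_words(text: str) -> List[Tuple[str, int]]:
    """Map operation: limited to one text file, emits (word, 1) pairs."""
    return [(word, 1) for word in text.split()]


def shuffle(mapped_outputs: Iterable[List[Tuple[str, int]]]) -> Dict[str, List[int]]:
    """Redistribute map outputs so that all counts for a word are grouped together."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for pairs in mapped_outputs:
        for word, count in pairs:
            grouped[word].append(count)
    return grouped


def reduce_counts(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce operation: sums the grouped counts, with words in sorted order."""
    return {word: sum(counts) for word, counts in sorted(grouped.items())}


def word_count(files: List[str]) -> Dict[str, int]:
    """Chain the map, redistribution, and reduce steps over the input files."""
    return reduce_counts(shuffle(map_words(text) for text in files))


print(word_count(["the dog", "the cat"]))  # {'cat': 1, 'dog': 1, 'the': 2}
```

Only `map_words` and `reduce_counts` correspond to the operations a developer would customize; `shuffle` stands in for the automatic redistribution between the two phases.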
FIG. 8 is a diagrammatic system view of a data processing system in which any of the embodiments disclosed herein may be performed, according to one embodiment. Particularly, the diagrammatic system view 800 of FIG. 8 illustrates a processor 802, a main memory 804, a static memory 806, a bus 808, a video display 810, an alpha-numeric input device 812, a cursor control device 814, a drive unit 816, a signal generation device 818, a network interface device 820, a machine readable medium 822, instructions 824, and a network 826, according to one embodiment. - The diagrammatic system view 800 may indicate a personal computer and/or the data processing system in which one or more operations disclosed herein are performed. The
processor 802 may be a microprocessor, a state machine, an application specific integrated circuit, a field programmable gate array, etc. (e.g., Intel® Pentium® processor). The main memory 804 may be a dynamic random access memory and/or a primary memory of a computer system. - The
static memory 806 may be a hard drive, a flash drive, and/or other memory information associated with the data processing system. The bus 808 may be an interconnection between various circuits and/or structures of the data processing system. The video display 810 may provide graphical representation of information on the data processing system. The alpha-numeric input device 812 may be a keypad, a keyboard and/or any other input device of text (e.g., a special device to aid the physically handicapped). - The
cursor control device 814 may be a pointing device such as a mouse. The drive unit 816 may be the hard drive, a storage system, and/or other longer term storage subsystem. The signal generation device 818 may be a BIOS and/or a functional operating system of the data processing system. The network interface device 820 may be a device that performs interface functions such as code conversion, protocol conversion and/or buffering required for communication to and from the network 826. The machine readable medium 822 may provide instructions on which any of the methods disclosed herein may be performed. The instructions 824 may provide source code and/or data code to the processor 802 to enable any one or more operations disclosed herein. -
FIG. 9 is a process flow of interpreting and executing a customizable database request, according to one embodiment. In operation 902, an interpretation of a customizable database request may be generated (e.g., using the translation module 340 and/or the query planning module 102), which may include an extensible computer process. In operation 904, an input guidance may be provided to available processors of an available computing environment 112. In operation 906, an input of each of the available processors may be pre-processed (e.g., using the query planning module 102) when providing the input guidance to the available processors. In operation 908, an information may be provided to the extensible computer process about its context in the customizable database request (e.g., using the dynamic interpretation module 334 and/or the reference module 342). In operation 910, an interpretation of the customizable database request may be processed (e.g., using the query planning module 102) based on the information provided. In operation 912, an execution of the interpretation may be automatically distributed (e.g., using the analysis phase 104) across the available computing environment operating concurrently and in parallel (e.g., using the reference module 342). In operation 914, a fault may be detected (e.g., using the detection module 450 of the monitoring module 114) in the execution of the interpretation. In operation 918, a response may be automatically assembled (e.g., by the response organization module 458) using a distributed output of the execution. In operation 920, an output of each of the available processors may be post processed (e.g., by the response organization module 458) when automatically assembling the response. -
FIG. 10 is a process flow of automatically distributing an analysis phase and an additional analysis phase of the interpretation of a customizable database request across the available computing environment, according to one embodiment. In operation 1002, an interpretation of a customizable database request which includes an extensible computer process may be generated (e.g., using the SQL instruction module 332, the translation module 340, and/or the optimization module 330 of the query planning module 102). In operation 1004, an input guidance may be provided to available processors of an available computing environment. In operation 1006, an extensible computer process may be provided information about its context in the customizable database request (e.g., using the reference module 342). In operation 1008, an information may be processed (e.g., using the dynamic interpretation module 334) that affects the interpretation of the customizable database request based on the information provided. In operation 1010, an input of each of the available processors may be pre-processed when providing the input guidance to the available processors. In operation 1012, an analysis phase of the interpretation may be automatically distributed (e.g., using the parallelization module 454) across the available computing environment operating concurrently and in parallel. In operation 1014, an additional analysis phase of the interpretation may be automatically distributed (e.g., using the additional parallelization module 456) across the available computing environment. In operation 1016, an output of each of the available processors may be post processed when the response is automatically assembled (e.g., using the response organization module 458). 
- Although the present embodiments have been described with reference to specific example embodiments, it will be evident that various modifications and changes may be made to these embodiments without departing from the broader spirit and scope of the various embodiments. For example, the various devices, modules, analyzers, generators, etc. described herein may be enabled and operated using hardware circuitry (e.g., CMOS based logic circuitry), firmware, software and/or any combination of hardware, firmware, and/or software (e.g., embodied in a machine readable medium). For example, the various structures and methods may be embodied using transistors, logic gates, and electrical circuits (e.g., application specific integrated (ASIC) circuitry and/or in Digital Signal Processor (DSP) circuitry).
- Particularly, the
extensible computer process 100, the query planning module 102, the analysis phase 104, the additional analysis phase 106A-N, the monitoring module 114, the user interface 118, the optimization module 330, the SQL instruction module 332, the dynamic interpretation module 334, the function module 336, the developer operation module 338, the translation module 340, the reference module 342, the detection module 450, the rectification module 452, the parallelization module 454, the additional parallelization module 456, the response organization module 458, the component 560, the additional component 562, the M 686A-C, and the R 688A-B of FIGS. 1-10 may be enabled using software and/or circuitry. - In addition, it will be appreciated that the various operations, processes, and methods disclosed herein may be embodied in a machine-readable medium and/or a machine accessible medium compatible with a data processing system (e.g., a computer system), and may be performed in any order (e.g., including using means for achieving the various operations). Accordingly, the specification and drawings are to be regarded in an illustrative rather than a restrictive sense.
Claims (20)
1. A method comprising:
generating an interpretation of a customizable database request which includes an extensible computer process;
providing an input guidance to available processors of an available computing environment;
automatically distributing an execution of the interpretation across the available computing environment operating concurrently and in parallel, wherein a component of the execution is limited to at least a part of an input data; and
automatically assembling a response using a distributed output of the execution.
2. The method of claim 1 , wherein the input guidance is provided to each of the available processors and is comprised of certain portions of the input data, and wherein the input guidance is used to determine which of the available processors are to perform functions related to the at least the part of the input data.
3. The method of claim 1 further comprising:
providing an information to the extensible computer process about its context in the customizable database request; and
processing an interpretation of the customizable database request based on the information provided, wherein the extensible computer process is a developer provided-computer program, and wherein the information provided includes at least one of a format of the input data and an output data, whether the input data and the output data is ordered and in which form, grouping information, statistics of the input data and the output data, a distribution information, a length of the input data and the output data, and a custom parameter.
4. The method of claim 3 , wherein the custom parameter is at least one of a number, a string, a list of numbers of strings, a content of a file in the available computing environment, and a result of the customizable database request.
5. The method of claim 1 further comprising post processing an output of each of the available processors when automatically assembling the response.
6. The method of claim 5 wherein the post processing includes at least one database operation including at least one of an aggregation operation, a sorting operation, and an invocation of another extensible computer process.
7. The method of claim 1 further comprising pre-processing an input of each of the available processors when providing the input guidance to the available processors.
8. The method of claim 1 wherein the available computing environment is comprised of at least two servers.
9. The method of claim 1 , wherein the customizable database request specifies the input data for the extensible computer process.
10. The method of claim 7 , wherein the input data is structured in a form comprising at least one of a database table and an output of a different database query.
11. The method of claim 7 , wherein the input data is unstructured in a form comprising a content of at least one file in a computing environment.
12. The method of claim 1 , further comprising:
detecting a fault in the execution of the interpretation; and
automatically rectifying an output effect of the fault.
13. The method of claim 12 , wherein rectifying the output effect of the fault includes at least one of reprocessing an operation, excluding a corrupted data, and logging the corrupted data.
14. The method of claim 1 , wherein the customizable database request is comprised of at least one of a predetermined function, a developer created function, and an analyst created function.
15. A system comprising:
a query planning module to generate an interpretation of a database request which includes an extensible computer process;
a parallelization module to provide an information to available processors of an available computing environment and to automatically distribute an execution of the interpretation across the available computing environment operating concurrently and in parallel, wherein a component of the execution is limited to at least a part of an input data; and
a response organization module to automatically assemble a response using a distributed output of the execution.
16. The system of claim 15 , wherein the information is used to provide each of the available processors certain portions of the input data, and to determine which of the available processors are to perform functions related to the at least the part of the input data.
17. The system of claim 15 further comprising:
a reference module to provide an extensible computer process information about its context in the database request; and
a dynamic interpretation module to process information that affects the interpretation of the database request based on the information provided, wherein the extensible computer process is a developer provided-computer program, and wherein the information provided includes a format of the input data and an output data, whether the input data and the output data is ordered and in which form, grouping information, statistics of the input data, a distribution information, a length of the input data and the output data, and custom parameters.
18. The system of claim 17 , wherein the custom parameters are at least one of a number, a string, a list of numbers of strings, a content of a file in the available computing environment, and a result of the database request.
19. A method comprising:
generating an interpretation of a customizable database request which includes an extensible computer process;
providing an input guidance to available processors of an available computing environment, wherein the input guidance determines which of the available processors are to perform functions related to the at least a part of an input data;
pre-processing an input of each of the available processors when providing the input guidance to the available processors;
automatically distributing an analysis phase of the interpretation across the available computing environment operating concurrently and in parallel, wherein a component of the analysis phase is limited to at least a part of the input data;
automatically distributing an additional analysis phase of the interpretation across the available computing environment;
automatically assembling a response using a distributed output of the additional analysis phase; and
post processing an output of each of the available processors when automatically assembling the response, wherein the post processing includes at least one database operation including at least one of an aggregation operation, a sorting operation, and an invocation of another extensible computer process.
20. The method of claim 19 further comprising:
providing an extensible computer process information about its context in the customizable database request; and
processing information that affects the interpretation of the customizable database request based on the information provided, wherein the extensible computer process is a developer provided-computer program, and wherein the information provided includes at least one of a format of the input data and an output data, whether the input data and the output data is ordered and in which form, grouping information, statistics of the input data, a distribution information, a length of the input data and the output data, and custom parameters, wherein the custom parameters are at least one of a number, a string, a list of numbers of strings, a content of a file in the available computing environment, and a result of the customizable database request.
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/406,875 US20100241893A1 (en) | 2009-03-18 | 2009-03-18 | Interpretation and execution of a customizable database request using an extensible computer process and an available computing environment |
EP10753834A EP2409245A4 (en) | 2009-03-18 | 2010-02-05 | Interpretation and execution of a customizable database request using an extensible computer process and an available computing environment |
PCT/US2010/023260 WO2010107523A2 (en) | 2009-03-18 | 2010-02-05 | Interpretation and execution of a customizable database request using an extensible computer process and an available computing environment |
US12/784,527 US8903841B2 (en) | 2009-03-18 | 2010-05-21 | System and method of massively parallel data processing |
US12/877,136 US7966340B2 (en) | 2009-03-18 | 2010-09-08 | System and method of massively parallel data processing |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/406,875 US20100241893A1 (en) | 2009-03-18 | 2009-03-18 | Interpretation and execution of a customizable database request using an extensible computer process and an available computing environment |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/784,527 Continuation US8903841B2 (en) | 2009-03-18 | 2010-05-21 | System and method of massively parallel data processing |
Related Child Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/784,527 Continuation US8903841B2 (en) | 2009-03-18 | 2010-05-21 | System and method of massively parallel data processing |
US12/877,136 Continuation US7966340B2 (en) | 2009-03-18 | 2010-09-08 | System and method of massively parallel data processing |
Publications (1)
Publication Number | Publication Date |
---|---|
US20100241893A1 (en) | 2010-09-23 |
Family ID: 42738531
Family Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/406,875 Abandoned US20100241893A1 (en) | 2009-03-18 | 2009-03-18 | Interpretation and execution of a customizable database request using an extensible computer process and an available computing environment |
US12/784,527 Active 2030-11-11 US8903841B2 (en) | 2009-03-18 | 2010-05-21 | System and method of massively parallel data processing |
US12/877,136 Active US7966340B2 (en) | 2009-03-18 | 2010-09-08 | System and method of massively parallel data processing |
Family Applications After (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/784,527 Active 2030-11-11 US8903841B2 (en) | 2009-03-18 | 2010-05-21 | System and method of massively parallel data processing |
US12/877,136 Active US7966340B2 (en) | 2009-03-18 | 2010-09-08 | System and method of massively parallel data processing |
Country Status (3)
Country | Link |
---|---|
US (3) | US20100241893A1 (en) |
EP (1) | EP2409245A4 (en) |
WO (1) | WO2010107523A2 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110213775A1 (en) * | 2010-03-01 | 2011-09-01 | International Business Machines Corporation | Database Table Look-up |
US20130086116A1 (en) * | 2011-10-04 | 2013-04-04 | International Business Machines Corporation | Declarative specification of data integration workflows for execution on parallel processing platforms |
CN106453817A (en) * | 2015-08-13 | 2017-02-22 | Lg电子株式会社 | Mobile terminal and method of controlling the same |
Families Citing this family (90)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8713038B2 (en) * | 2009-04-02 | 2014-04-29 | Pivotal Software, Inc. | Integrating map-reduce into a distributed relational database |
US9665620B2 (en) | 2010-01-15 | 2017-05-30 | Ab Initio Technology Llc | Managing data queries |
US9177017B2 (en) * | 2010-09-27 | 2015-11-03 | Microsoft Technology Licensing, Llc | Query constraint encoding with type-based state machine |
US9489183B2 (en) | 2010-10-12 | 2016-11-08 | Microsoft Technology Licensing, Llc | Tile communication operator |
US9430204B2 (en) | 2010-11-19 | 2016-08-30 | Microsoft Technology Licensing, Llc | Read-only communication operator |
US9507568B2 (en) | 2010-12-09 | 2016-11-29 | Microsoft Technology Licensing, Llc | Nested communication operator |
US9395957B2 (en) | 2010-12-22 | 2016-07-19 | Microsoft Technology Licensing, Llc | Agile communication operator |
US8713039B2 (en) | 2010-12-23 | 2014-04-29 | Microsoft Corporation | Co-map communication operator |
US8538954B2 (en) * | 2011-01-25 | 2013-09-17 | Hewlett-Packard Development Company, L.P. | Aggregate function partitions for distributed processing |
US8856151B2 (en) | 2011-01-25 | 2014-10-07 | Hewlett-Packard Development Company, L.P. | Output field mapping of user defined functions in databases |
US20130238548A1 (en) * | 2011-01-25 | 2013-09-12 | Muthian George | Analytical data processing |
US9355145B2 (en) | 2011-01-25 | 2016-05-31 | Hewlett Packard Enterprise Development Lp | User defined function classification in analytical data processing systems |
US20120239612A1 (en) * | 2011-01-25 | 2012-09-20 | Muthian George | User defined functions for data loading |
US8612368B2 (en) * | 2011-03-01 | 2013-12-17 | International Business Machines Corporation | Systems and methods for processing machine learning algorithms in a MapReduce environment |
US9116955B2 (en) * | 2011-05-02 | 2015-08-25 | Ab Initio Technology Llc | Managing data queries |
US9569511B2 (en) * | 2011-08-25 | 2017-02-14 | Salesforce.Com, Inc. | Dynamic data management |
US8452792B2 (en) | 2011-10-28 | 2013-05-28 | Microsoft Corporation | De-focusing over big data for extraction of unknown value |
US10007698B2 (en) * | 2011-11-28 | 2018-06-26 | Sybase, Inc. | Table parameterized functions in database |
US11216454B1 (en) * | 2011-12-19 | 2022-01-04 | Actian Sub Iii, Inc. | User defined functions for database query languages based on call-back functions |
US9436740B2 (en) | 2012-04-04 | 2016-09-06 | Microsoft Technology Licensing, Llc | Visualization of changing confidence intervals |
CN103379114B (en) | 2012-04-28 | 2016-12-14 | 国际商业机器公司 | Method and apparatus for protecting private data in a MapReduce system |
US8984515B2 (en) | 2012-05-31 | 2015-03-17 | International Business Machines Corporation | System and method for shared execution of mixed data flows |
US9607045B2 (en) | 2012-07-12 | 2017-03-28 | Microsoft Technology Licensing, Llc | Progressive query computation using streaming architectures |
US9524184B2 (en) | 2012-07-31 | 2016-12-20 | Hewlett Packard Enterprise Development Lp | Open station canonical operator for data stream processing |
US9311380B2 (en) | 2013-03-29 | 2016-04-12 | International Business Machines Corporation | Processing spatial joins using a mapreduce framework |
US9317472B2 (en) * | 2013-06-07 | 2016-04-19 | International Business Machines Corporation | Processing element data sharing |
US9514214B2 (en) * | 2013-06-12 | 2016-12-06 | Microsoft Technology Licensing, Llc | Deterministic progressive big data analytics |
US9436692B1 (en) * | 2013-06-25 | 2016-09-06 | Emc Corporation | Large scale video analytics architecture |
CN104346344A (en) * | 2013-07-24 | 2015-02-11 | 北大方正集团有限公司 | Compiled book manufacturing method and device |
US10162829B2 (en) * | 2013-09-03 | 2018-12-25 | Adobe Systems Incorporated | Adaptive parallel data processing |
US9372766B2 (en) | 2014-02-11 | 2016-06-21 | Saudi Arabian Oil Company | Circumventing load imbalance in parallel simulations caused by faulty hardware nodes |
US10459767B2 (en) | 2014-03-05 | 2019-10-29 | International Business Machines Corporation | Performing data analytics utilizing a user configurable group of reusable modules |
GB2532469A (en) | 2014-11-20 | 2016-05-25 | Ibm | Self-optimizing table distribution with transparent replica cache |
US10417281B2 (en) | 2015-02-18 | 2019-09-17 | Ab Initio Technology Llc | Querying a data source on a network |
EP3365809B1 (en) * | 2015-10-23 | 2020-12-09 | Oracle International Corporation | System and method for sandboxing support in a multidimensional database environment |
US10740328B2 (en) | 2016-06-24 | 2020-08-11 | Microsoft Technology Licensing, Llc | Aggregate-query database system and processing |
US11860940B1 (en) | 2016-09-26 | 2024-01-02 | Splunk Inc. | Identifying buckets for query execution using a catalog of buckets |
US12013895B2 (en) | 2016-09-26 | 2024-06-18 | Splunk Inc. | Processing data using containerized nodes in a containerized scalable environment |
US20180095914A1 (en) | 2016-10-03 | 2018-04-05 | Ocient Llc | Application direct access to sata drive |
WO2018112056A1 (en) | 2016-12-14 | 2018-06-21 | Ocient Llc | Efficient database management system utilizing silo and manifest |
US10868863B1 (en) | 2016-12-14 | 2020-12-15 | Ocient Inc. | System and method for designating a leader using a consensus protocol within a database management system |
US10552435B2 (en) | 2017-03-08 | 2020-02-04 | Microsoft Technology Licensing, Llc | Fast approximate results and slow precise results |
US12099876B2 (en) | 2017-04-03 | 2024-09-24 | Ocient Inc. | Coordinating main memory access of a plurality of sets of threads |
US10235268B2 (en) | 2017-05-18 | 2019-03-19 | International Business Machines Corporation | Streams analysis tool and method |
US10754856B2 (en) | 2017-05-30 | 2020-08-25 | Ocient Inc. | System and method for optimizing large database management systems using bloom filter |
US11989194B2 (en) | 2017-07-31 | 2024-05-21 | Splunk Inc. | Addressing memory limits for partition tracking among worker nodes |
US11921672B2 (en) | 2017-07-31 | 2024-03-05 | Splunk Inc. | Query execution at a remote heterogeneous data store of a data fabric service |
US12118009B2 (en) * | 2017-07-31 | 2024-10-15 | Splunk Inc. | Supporting query languages through distributed execution of query engines |
US11182125B2 (en) | 2017-09-07 | 2021-11-23 | Ocient Inc. | Computing device sort function |
US11676062B2 (en) | 2018-03-06 | 2023-06-13 | Samsung Electronics Co., Ltd. | Dynamically evolving hybrid personalized artificial intelligence system |
US11880368B2 (en) | 2018-10-15 | 2024-01-23 | Ocient Holdings LLC | Compressing data sets for storage in a database system |
US11886436B2 (en) | 2018-10-15 | 2024-01-30 | Ocient Inc. | Segmenting a partition of a data set based on a data storage coding scheme |
US11249916B2 (en) | 2018-10-15 | 2022-02-15 | Ocient Holdings LLC | Single producer single consumer buffering in database systems |
US12050580B2 (en) | 2018-10-15 | 2024-07-30 | Ocient Inc. | Data segment storing in a database system |
US11256696B2 (en) | 2018-10-15 | 2022-02-22 | Ocient Holdings LLC | Data set compression within a database system |
US11709835B2 (en) | 2018-10-15 | 2023-07-25 | Ocient Holdings LLC | Re-ordered processing of read requests |
US11093223B2 (en) | 2019-07-18 | 2021-08-17 | Ab Initio Technology Llc | Automatically converting a program written in a procedural programming language into a dataflow graph and related systems and methods |
US11494380B2 (en) | 2019-10-18 | 2022-11-08 | Splunk Inc. | Management of distributed computing framework components in a data fabric service system |
US11093500B2 (en) | 2019-10-28 | 2021-08-17 | Ocient Holdings LLC | Enforcement of minimum query cost rules required for access to a database system |
US11106679B2 (en) | 2019-10-30 | 2021-08-31 | Ocient Holdings LLC | Enforcement of sets of query rules for access to data supplied by a plurality of data providers |
US11609911B2 (en) | 2019-12-19 | 2023-03-21 | Ocient Holdings LLC | Selecting a normalized form for conversion of a query expression |
US11922222B1 (en) | 2020-01-30 | 2024-03-05 | Splunk Inc. | Generating a modified component for a data intake and query system using an isolated execution environment image |
US11853364B2 (en) | 2020-01-31 | 2023-12-26 | Ocient Holdings LLC | Level-based queries in a database system and methods for use therewith |
US11061910B1 (en) | 2020-01-31 | 2021-07-13 | Ocient Holdings LLC | Servicing concurrent queries via virtual segment recovery |
US11599463B2 (en) | 2020-03-25 | 2023-03-07 | Ocient Holdings LLC | Servicing queries during data ingress |
US11238041B2 (en) | 2020-03-25 | 2022-02-01 | Ocient Holdings LLC | Facilitating query executions via dynamic data block routing |
US11580102B2 (en) | 2020-04-02 | 2023-02-14 | Ocient Holdings LLC | Implementing linear algebra functions via decentralized execution of query operator flows |
US11294916B2 (en) | 2020-05-20 | 2022-04-05 | Ocient Holdings LLC | Facilitating query executions via multiple modes of resultant correctness |
US11775529B2 (en) | 2020-07-06 | 2023-10-03 | Ocient Holdings LLC | Recursive functionality in relational database systems |
US11321288B2 (en) | 2020-08-05 | 2022-05-03 | Ocient Holdings LLC | Record deduplication in database systems |
US11755589B2 (en) | 2020-08-05 | 2023-09-12 | Ocient Holdings LLC | Delaying segment generation in database systems |
US11880716B2 (en) | 2020-08-05 | 2024-01-23 | Ocient Holdings LLC | Parallelized segment generation via key-based subdivision in database systems |
US11822532B2 (en) | 2020-10-14 | 2023-11-21 | Ocient Holdings LLC | Per-segment secondary indexing in database systems |
US12099504B2 (en) | 2020-10-19 | 2024-09-24 | Ocient Holdings LLC | Utilizing array field distribution data in database systems |
US11507578B2 (en) | 2020-10-19 | 2022-11-22 | Ocient Holdings LLC | Delaying exceptions in query execution |
US11675757B2 (en) | 2020-10-29 | 2023-06-13 | Ocient Holdings LLC | Maintaining row durability data in database systems |
US11297123B1 (en) | 2020-12-11 | 2022-04-05 | Ocient Holdings LLC | Fault-tolerant data stream processing |
US11314743B1 (en) | 2020-12-29 | 2022-04-26 | Ocient Holdings LLC | Storing records via multiple field-based storage mechanisms |
US11645273B2 (en) | 2021-05-28 | 2023-05-09 | Ocient Holdings LLC | Query execution utilizing probabilistic indexing |
US12072939B1 (en) | 2021-07-30 | 2024-08-27 | Splunk Inc. | Federated data enrichment objects |
US11803544B2 (en) | 2021-10-06 | 2023-10-31 | Ocient Holdings LLC | Missing data-based indexing in database systems |
US11983172B2 (en) | 2021-12-07 | 2024-05-14 | Ocient Holdings LLC | Generation of a predictive model for selection of batch sizes in performing data format conversion |
US12093272B1 (en) | 2022-04-29 | 2024-09-17 | Splunk Inc. | Retrieving data identifiers from queue for search of external data system |
US12124449B2 (en) | 2022-05-24 | 2024-10-22 | Ocient Holdings LLC | Processing left join operations via a database system based on forwarding input |
US20240289331A1 (en) | 2022-09-07 | 2024-08-29 | Ocient Holdings LLC | Dimensionality reduction and model training in a database system implementation of a k nearest neighbors model |
US12130817B2 (en) | 2022-10-27 | 2024-10-29 | Ocient Holdings LLC | Generating execution tracking rows during query execution via a database system |
US12093254B1 (en) | 2023-04-28 | 2024-09-17 | Ocient Holdings LLC | Query execution during storage formatting updates |
US12072887B1 (en) | 2023-05-01 | 2024-08-27 | Ocient Holdings LLC | Optimizing an operator flow for performing filtering based on new columns values via a database system |
US12117986B1 (en) | 2023-07-20 | 2024-10-15 | Ocient Holdings LLC | Structuring geospatial index data for access during query execution via a database system |
US12093231B1 (en) | 2023-07-28 | 2024-09-17 | Ocient Holdings LLC | Distributed generation of addendum part data for a segment stored via a database system |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5313629A (en) * | 1989-10-23 | 1994-05-17 | International Business Machines Corporation | Unit of work for preserving data integrity of a data-base by creating in memory a copy of all objects which are to be processed together |
US5634053A (en) * | 1995-08-29 | 1997-05-27 | Hughes Aircraft Company | Federated information management (FIM) system and method for providing data site filtering and translation for heterogeneous databases |
US5708828A (en) * | 1995-05-25 | 1998-01-13 | Reliant Data Systems | System for converting data from input data environment using first format to output data environment using second format by executing the associations between their fields |
US20020083039A1 (en) * | 2000-05-18 | 2002-06-27 | Ferrari Adam J. | Hierarchical data-driven search and navigation system and method for information retrieval |
US20030177111A1 (en) * | 1999-11-16 | 2003-09-18 | Searchcraft Corporation | Method for searching from a plurality of data sources |
US20070136251A1 (en) * | 2003-08-21 | 2007-06-14 | Idilia Inc. | System and Method for Processing a Query |
US20080147599A1 (en) * | 2006-12-18 | 2008-06-19 | Ianywhere Solutions, Inc. | Load balancing for complex database query plans |
US20080183688A1 (en) * | 2006-08-25 | 2008-07-31 | Chamdani Joseph I | Methods and systems for hardware acceleration of database operations and queries |
US7606805B2 (en) * | 2000-07-28 | 2009-10-20 | EasyAsk Acquisition, LLC | Distributed search system and method |
US7680765B2 (en) * | 2006-12-27 | 2010-03-16 | Microsoft Corporation | Iterate-aggregate query parallelization |
US20100198855A1 (en) * | 2009-01-30 | 2010-08-05 | Ranganathan Venkatesan N | Providing parallel result streams for database queries |
US7844620B2 (en) * | 2007-11-16 | 2010-11-30 | International Business Machines Corporation | Real time data replication for query execution in a massively parallel computer |
Family Cites Families (124)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US2009100A (en) * | 1931-01-08 | 1935-07-23 | Bendix Brake Co | Brake |
US5765146A (en) * | 1993-11-04 | 1998-06-09 | International Business Machines Corporation | Method of performing a parallel relational database query in a multiprocessor environment |
US7315860B1 (en) | 1994-09-01 | 2008-01-01 | Computer Associates Think, Inc. | Directory service system and method with tolerance for data entry storage and output |
EP0777883B1 (en) | 1994-09-01 | 2003-05-02 | Computer Associates Think, Inc. | X.500 system and methods |
US5943663A (en) | 1994-11-28 | 1999-08-24 | Mouradian; Gary C. | Data processing method and system utilizing parallel processing |
US5613071A (en) | 1995-07-14 | 1997-03-18 | Intel Corporation | Method and apparatus for providing remote memory access in a distributed memory multiprocessor system |
US6067542A (en) | 1995-10-20 | 2000-05-23 | Ncr Corporation | Pragma facility and SQL3 extension for optimal parallel UDF execution |
US5905982A (en) | 1997-04-04 | 1999-05-18 | International Business Machines Corporation | Handling null values in SQL queries over object-oriented data |
US6112198A (en) | 1997-06-30 | 2000-08-29 | International Business Machines Corporation | Optimization of data repartitioning during parallel query optimization |
US6618718B1 (en) | 1997-10-14 | 2003-09-09 | International Business Machines Corporation | Apparatus and method for dynamically generating query explain data |
US6604096B1 (en) | 1997-10-14 | 2003-08-05 | International Business Machines Corporation | Apparatus and method for locally caching query explain data |
JP2001527244A (en) | 1997-12-22 | 2001-12-25 | リンダ ジー デミシェル | Method and apparatus for efficiently partitioning query execution in object-relational mapping between client and server |
US6202067B1 (en) * | 1998-04-07 | 2001-03-13 | Lucent Technologies, Inc. | Method and apparatus for correct and complete transactions in a fault tolerant distributed database system |
US6625593B1 (en) * | 1998-06-29 | 2003-09-23 | International Business Machines Corporation | Parallel query optimization strategies for replicated and partitioned tables |
US6339769B1 (en) | 1998-09-14 | 2002-01-15 | International Business Machines Corporation | Query optimization by transparently altering properties of relational tables using materialized views |
US6609128B1 (en) | 1999-07-30 | 2003-08-19 | Accenture Llp | Codes table framework design in an E-commerce architecture |
US6615199B1 (en) | 1999-08-31 | 2003-09-02 | Accenture, Llp | Abstraction factory in a base services pattern environment |
US6477580B1 (en) | 1999-08-31 | 2002-11-05 | Accenture Llp | Self-described stream in a communication services patterns environment |
US6578068B1 (en) | 1999-08-31 | 2003-06-10 | Accenture Llp | Load balancer in environment services patterns |
US20020029207A1 (en) | 2000-02-28 | 2002-03-07 | Hyperroll, Inc. | Data aggregation server for managing a multi-dimensional database and database management system having data aggregation server integrated therein |
US6457020B1 (en) | 2000-03-20 | 2002-09-24 | International Business Machines Corporation | Query optimization using a multi-layered object cache |
AU2001257077A1 (en) | 2000-04-17 | 2001-10-30 | Brio Technology, Inc. | Analytical server including metrics engine |
WO2002021749A2 (en) | 2000-09-08 | 2002-03-14 | Plumtree Software | Providing a personalized web page by accessing different servers |
IL141599A0 (en) * | 2001-02-22 | 2002-03-10 | Infocyclone Inc | Information retrieval system |
US6961723B2 (en) | 2001-05-04 | 2005-11-01 | Sun Microsystems, Inc. | System and method for determining relevancy of query responses in a distributed network search mechanism |
US7039647B2 (en) | 2001-05-10 | 2006-05-02 | International Business Machines Corporation | Drag and drop technique for building queries |
US7263514B2 (en) | 2001-05-17 | 2007-08-28 | International Business Machines Corporation | Efficient object query processing technique on object's dynamic properties via pushdown |
US6775662B1 (en) | 2001-05-21 | 2004-08-10 | Oracle International Corporation | Group pruning from cube, rollup, and grouping sets |
US6968344B2 (en) | 2001-07-26 | 2005-11-22 | Tata Consultancy Services Limited | Method and apparatus for object-oriented access to a relational database management system (RDBMS) based on any arbitrary predicate |
WO2003065240A1 (en) | 2002-02-01 | 2003-08-07 | John Fairweather | System and method for managing collections of data on a network |
US6925463B2 (en) | 2002-04-15 | 2005-08-02 | International Business Machines Corporation | Method and system for query processing by combining indexes of multilevel granularity or composition |
US7149733B2 (en) | 2002-07-20 | 2006-12-12 | Microsoft Corporation | Translation of object queries involving inheritence |
US7676452B2 (en) | 2002-07-23 | 2010-03-09 | International Business Machines Corporation | Method and apparatus for search optimization based on generation of context focused queries |
US7461051B2 (en) | 2002-11-11 | 2008-12-02 | Transparensee Systems, Inc. | Search method and system and system using the same |
US7293024B2 (en) * | 2002-11-14 | 2007-11-06 | Seisint, Inc. | Method for sorting and distributing data among a plurality of nodes |
US7299225B2 (en) | 2002-11-26 | 2007-11-20 | International Business Machines Corporation | High performance predicate push-down for non-matching predicate operands |
US7136850B2 (en) | 2002-12-20 | 2006-11-14 | International Business Machines Corporation | Self tuning database retrieval optimization using regression functions |
US7599912B2 (en) | 2003-01-14 | 2009-10-06 | At&T Intellectual Property I, L.P. | Structured query language (SQL) query via common object request broker architecture (CORBA) interface |
WO2004063928A1 (en) | 2003-01-14 | 2004-07-29 | Accelia, Inc. | Database load reducing system and load reducing program |
US7668801B1 (en) | 2003-04-21 | 2010-02-23 | At&T Corp. | Method and apparatus for optimizing queries under parametric aggregation constraints |
CA2429910A1 (en) | 2003-05-27 | 2004-11-27 | Cognos Incorporated | System and method of query transformation |
EP1649390B1 (en) * | 2003-07-07 | 2014-08-20 | IBM International Group BV | Optimized sql code generation |
US20050027690A1 (en) | 2003-07-29 | 2005-02-03 | International Business Machines Corporation | Dynamic selection of optimal grouping sequence at runtime for grouping sets, rollup and cube operations in SQL query processing |
US7565370B2 (en) | 2003-08-29 | 2009-07-21 | Oracle International Corporation | Support Vector Machines in a relational database management system |
US7617205B2 (en) | 2005-03-30 | 2009-11-10 | Google Inc. | Estimating confidence for query revision models |
US7539660B2 (en) | 2003-10-23 | 2009-05-26 | International Business Machines Corporation | Method and system for generating SQL joins to optimize performance |
US7277873B2 (en) | 2003-10-31 | 2007-10-02 | International Business Machines Corporation | Method for discovering undeclared and fuzzy rules in databases |
US7657516B2 (en) | 2003-12-01 | 2010-02-02 | Siebel Systems, Inc. | Conversion of a relational database query to a query of a multidimensional data source by modeling the multidimensional data source |
US7447678B2 (en) | 2003-12-31 | 2008-11-04 | Google Inc. | Interface for a universal search engine |
US7321891B1 (en) | 2004-02-19 | 2008-01-22 | Ncr Corp. | Processing database queries |
US7627567B2 (en) | 2004-04-14 | 2009-12-01 | Microsoft Corporation | Segmentation of strings into structured records |
US7676453B2 (en) | 2004-04-22 | 2010-03-09 | Oracle International Corporation | Partial query caching |
US7440935B2 (en) | 2004-05-05 | 2008-10-21 | International Business Machines Corporation | Method and system for query directives and access plan hints |
US7266547B2 (en) * | 2004-06-10 | 2007-09-04 | International Business Machines Corporation | Query meaning determination through a grid service |
US7296007B1 (en) | 2004-07-06 | 2007-11-13 | Ailive, Inc. | Real time context learning by software agents |
US7660811B2 (en) | 2004-07-09 | 2010-02-09 | Microsoft Corporation | System that facilitates database querying |
US7668806B2 (en) | 2004-08-05 | 2010-02-23 | Oracle International Corporation | Processing queries against one or more markup language sources |
WO2006021944A1 (en) | 2004-08-12 | 2006-03-02 | Leanway Automatic Areanas Ltd. | Enhanced database structure configuration |
US7478080B2 (en) | 2004-09-30 | 2009-01-13 | International Business Machines Corporation | Canonical abstraction for outerjoin optimization |
US7617186B2 (en) | 2004-10-05 | 2009-11-10 | Omniture, Inc. | System, method and computer program for successive approximation of query results |
US7542990B2 (en) | 2004-10-26 | 2009-06-02 | Computer Associates Think, Inc. | System and method for providing a relational application domain model |
US7640237B2 (en) | 2005-01-11 | 2009-12-29 | International Business Machines Corporation | System and method for database query with on demand database query reduction |
US7668807B2 (en) | 2005-02-24 | 2010-02-23 | International Business Machines Corporation | Query rebinding for high-availability database systems |
US7565345B2 (en) | 2005-03-29 | 2009-07-21 | Google Inc. | Integration of multiple query revision models |
US7640230B2 (en) | 2005-04-05 | 2009-12-29 | Microsoft Corporation | Query plan selection control using run-time association mechanism |
US7606829B2 (en) | 2005-04-14 | 2009-10-20 | International Business Machines Corporation | Model entity operations in query results |
US7533088B2 (en) | 2005-05-04 | 2009-05-12 | Microsoft Corporation | Database reverse query matching |
US8386440B2 (en) * | 2005-05-10 | 2013-02-26 | Microsoft Corporation | Database corruption recovery systems and methods |
US7383247B2 (en) | 2005-08-29 | 2008-06-03 | International Business Machines Corporation | Query routing of federated information systems for fast response time, load balance, availability, and reliability |
US7565342B2 (en) | 2005-09-09 | 2009-07-21 | International Business Machines Corporation | Dynamic semi-join processing with runtime optimization |
US8005820B2 (en) | 2005-09-29 | 2011-08-23 | Teradata Us, Inc. | Optimizing the processing of in-list rows |
US7680775B2 (en) | 2005-12-13 | 2010-03-16 | Iac Search & Media, Inc. | Methods and systems for generating query and result-based relevance indexes |
US20070208726A1 (en) | 2006-03-01 | 2007-09-06 | Oracle International Corporation | Enhancing search results using ontologies |
US7702625B2 (en) | 2006-03-03 | 2010-04-20 | International Business Machines Corporation | Building a unified query that spans heterogeneous environments |
US7644062B2 (en) | 2006-03-15 | 2010-01-05 | Oracle International Corporation | Join factorization of union/union all queries |
US7647298B2 (en) | 2006-03-23 | 2010-01-12 | Microsoft Corporation | Generation of query and update views for object relational mapping |
US7680787B2 (en) | 2006-04-06 | 2010-03-16 | International Business Machines Corporation | Database query generation method and system |
US7937390B2 (en) | 2006-06-01 | 2011-05-03 | Mediareif Moestl & Reif Kommunikations-Und Informationstechnologien Oeg | Method for controlling a relational database system |
CN101093493B (en) | 2006-06-23 | 2011-08-31 | 国际商业机器公司 | Speech conversion method for database inquiry and converter |
US7567945B2 (en) | 2006-06-29 | 2009-07-28 | Yahoo! Inc. | Aggregation-specific confidence intervals for fact set queries |
US20080040334A1 (en) | 2006-08-09 | 2008-02-14 | Gad Haber | Operation of Relational Database Optimizers by Inserting Redundant Sub-Queries in Complex Queries |
US7647286B2 (en) | 2006-09-07 | 2010-01-12 | International Business Machines Corporation | System and method for managing a chaotic event by providing optimal and adaptive sequencing of decision sets with supporting data |
US7672934B1 (en) | 2006-10-19 | 2010-03-02 | Symantec Operating Corporation | Method for restoring documents from a database file |
US7921416B2 (en) * | 2006-10-20 | 2011-04-05 | Yahoo! Inc. | Formal language and translator for parallel processing of data |
US7590626B2 (en) | 2006-10-30 | 2009-09-15 | Microsoft Corporation | Distributional similarity-based models for query correction |
US7523123B2 (en) * | 2006-11-16 | 2009-04-21 | Yahoo! Inc. | Map-reduce with merge to process multiple relational datasets |
US7676457B2 (en) | 2006-11-29 | 2010-03-09 | Red Hat, Inc. | Automatic index based query optimization |
US7844608B2 (en) | 2006-12-15 | 2010-11-30 | Yahoo! Inc. | Clustered query support for a database query engine |
US20080147596A1 (en) | 2006-12-18 | 2008-06-19 | Mckenna William | Method and system for improving sql database query performance |
CN101206648A (en) | 2006-12-20 | 2008-06-25 | 鸿富锦精密工业(深圳)有限公司 | Network service generating system and method |
US7685119B2 (en) | 2006-12-20 | 2010-03-23 | Yahoo! Inc. | System and method for query expansion |
US7672925B2 (en) | 2006-12-28 | 2010-03-02 | Sybase, Inc. | Accelerating queries using temporary enumeration representation |
US20080162445A1 (en) | 2006-12-29 | 2008-07-03 | Ahmad Ghazal | Determining satisfiability and transitive closure of a where clause |
US7593931B2 (en) | 2007-01-12 | 2009-09-22 | International Business Machines Corporation | Apparatus, system, and method for performing fast approximate computation of statistics on query expressions |
US7640238B2 (en) | 2007-01-19 | 2009-12-29 | International Business Machines Corporation | Query planning for execution using conditional operators |
US7657505B2 (en) | 2007-01-19 | 2010-02-02 | Microsoft Corporation | Data retrieval from a database utilizing efficient eager loading and customized queries |
US7624122B2 (en) | 2007-01-25 | 2009-11-24 | Sap Ag | Method and system for querying a database |
US7680779B2 (en) | 2007-01-26 | 2010-03-16 | Sap Ag | Managing queries in a distributed database system |
US7865533B2 (en) | 2007-02-05 | 2011-01-04 | Microsoft Corporation | Compositional query comprehensions |
US20080256549A1 (en) | 2007-04-10 | 2008-10-16 | International Business Machines Corporation | System and Method of Planning for Cooperative Information Processing |
US8417762B2 (en) | 2007-04-10 | 2013-04-09 | International Business Machines Corporation | Mechanism for execution of multi-site jobs in a data stream processing system |
JP4073033B1 (en) | 2007-04-27 | 2008-04-09 | 透 降矢 | A database query processing system using multi-operation processing using composite relational operations that considers improvement of processing functions of join operations |
US7680746B2 (en) | 2007-05-23 | 2010-03-16 | Yahoo! Inc. | Prediction of click through rates using hybrid kalman filter-tree structured markov model classifiers |
CN101339551B (en) | 2007-07-05 | 2013-01-30 | 日电(中国)有限公司 | Natural language query demand extension equipment and its method |
US7689534B2 (en) | 2007-07-11 | 2010-03-30 | International Business Machines Corporation | Affecting database file performance by allowing delayed query language trigger firing |
US7984043B1 (en) * | 2007-07-24 | 2011-07-19 | Amazon Technologies, Inc. | System and method for distributed query processing using configuration-independent query plans |
US20090043745A1 (en) | 2007-08-07 | 2009-02-12 | Eric L Barsness | Query Execution and Optimization with Autonomic Error Recovery from Network Failures in a Parallel Computer System with Multiple Networks |
US8171047B2 (en) * | 2007-08-07 | 2012-05-01 | International Business Machines Corporation | Query execution and optimization utilizing a combining network in a parallel computer system |
US20090049024A1 (en) | 2007-08-14 | 2009-02-19 | Ncr Corporation | Dynamic query optimization between systems based on system conditions |
US8214585B2 (en) * | 2007-08-15 | 2012-07-03 | International Business Machines Corporation | Enabling parallel access volumes in virtual machine environments |
US8127283B2 (en) * | 2007-09-05 | 2012-02-28 | Intel Corporation | Enabling graphical notation for parallel programming |
US10089361B2 (en) | 2007-10-31 | 2018-10-02 | Oracle International Corporation | Efficient mechanism for managing hierarchical relationships in a relational database system |
EP2063364A1 (en) | 2007-11-19 | 2009-05-27 | Siemens Aktiengesellschaft | Module for building database queries |
US7991794B2 (en) | 2007-12-18 | 2011-08-02 | Oracle International Corporation | Pipelining operations involving DML and query |
US8903802B2 (en) | 2008-03-06 | 2014-12-02 | Cisco Technology, Inc. | Systems and methods for managing queries |
US8335780B2 (en) | 2008-03-11 | 2012-12-18 | James Madison Kelley | Scalable high speed relational processor for databases and networks |
US8082237B2 (en) | 2008-03-28 | 2011-12-20 | Oracle International Corporation | Applying the use of temporal data and temporal data models to roles and organizational structures |
US8606803B2 (en) | 2008-04-01 | 2013-12-10 | Microsoft Corporation | Translating a relational query to a multidimensional query |
US8713048B2 (en) | 2008-06-24 | 2014-04-29 | Microsoft Corporation | Query processing with specialized query operators |
US9135583B2 (en) | 2008-07-16 | 2015-09-15 | Business Objects S.A. | Systems and methods to create continuous queries associated with push-type and pull-type data |
US8150865B2 (en) | 2008-07-29 | 2012-04-03 | Oracle International Corporation | Techniques for coalescing subqueries |
US7984031B2 (en) | 2008-08-01 | 2011-07-19 | Microsoft Corporation | Query builder for testing query languages |
US8250086B2 (en) | 2008-09-02 | 2012-08-21 | Teradata U S, Inc. | Web services access with shared SQL |
US8775154B2 (en) | 2008-09-18 | 2014-07-08 | Xerox Corporation | Query translation through dictionary adaptation |
2009
- 2009-03-18: US application US12/406,875 (US20100241893A1), not active, abandoned
2010
- 2010-02-05: WO application PCT/US2010/023260 (WO2010107523A2), active, application filing
- 2010-02-05: EP application EP10753834A (EP2409245A4), not active, ceased
- 2010-05-21: US application US12/784,527 (US8903841B2), active
- 2010-09-08: US application US12/877,136 (US7966340B2), active
Patent Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5313629A (en) * | 1989-10-23 | 1994-05-17 | International Business Machines Corporation | Unit of work for preserving data integrity of a data-base by creating in memory a copy of all objects which are to be processed together |
US5708828A (en) * | 1995-05-25 | 1998-01-13 | Reliant Data Systems | System for converting data from input data environment using first format to output data environment using second format by executing the associations between their fields |
US5634053A (en) * | 1995-08-29 | 1997-05-27 | Hughes Aircraft Company | Federated information management (FIM) system and method for providing data site filtering and translation for heterogeneous databases |
US20030177111A1 (en) * | 1999-11-16 | 2003-09-18 | Searchcraft Corporation | Method for searching from a plurality of data sources |
US20020083039A1 (en) * | 2000-05-18 | 2002-06-27 | Ferrari Adam J. | Hierarchical data-driven search and navigation system and method for information retrieval |
US7606805B2 (en) * | 2000-07-28 | 2009-10-20 | EasyAsk Acquisition, LLC | Distributed search system and method |
US20070136251A1 (en) * | 2003-08-21 | 2007-06-14 | Idilia Inc. | System and Method for Processing a Query |
US20080183688A1 (en) * | 2006-08-25 | 2008-07-31 | Chamdani Joseph I | Methods and systems for hardware acceleration of database operations and queries |
US20080147599A1 (en) * | 2006-12-18 | 2008-06-19 | Ianywhere Solutions, Inc. | Load balancing for complex database query plans |
US7680765B2 (en) * | 2006-12-27 | 2010-03-16 | Microsoft Corporation | Iterate-aggregate query parallelization |
US7844620B2 (en) * | 2007-11-16 | 2010-11-30 | International Business Machines Corporation | Real time data replication for query execution in a massively parallel computer |
US20100198855A1 (en) * | 2009-01-30 | 2010-08-05 | Ranganathan Venkatesan N | Providing parallel result streams for database queries |
Non-Patent Citations (2)
Title |
---|
"A Scalable Implementation of Fault Tolerance for Massively Parallel Systems," by Deconinck et al. In: Proc. 2nd Int'l Conf. on Massively Parallel Computing Systems (1996), pp. 214-221. Available at: https://s-space.snu.ac.kr/handle/10371/6861 (last visited 5/1/13). Also IEEE. * |
"Fault Tolerance in Massively Parallel Systems," by Deconinck et al. In: Transputer Communications, Vol. 2(4), pp. 241-257 (1994). Available at: https://www.esat.kuleuven.be/electa/publications/fulltexts/pub_823.pdf (last visited 5/1/13). * |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110213775A1 (en) * | 2010-03-01 | 2011-09-01 | International Business Machines Corporation | Database Table Look-up |
US8359316B2 (en) * | 2010-03-01 | 2013-01-22 | International Business Machines Corporation | Database table look-up |
US20130086116A1 (en) * | 2011-10-04 | 2013-04-04 | International Business Machines Corporation | Declarative specification of data integraton workflows for execution on parallel processing platforms |
US20130254237A1 (en) * | 2011-10-04 | 2013-09-26 | International Business Machines Corporation | Declarative specification of data integraton workflows for execution on parallel processing platforms |
US9317542B2 (en) * | 2011-10-04 | 2016-04-19 | International Business Machines Corporation | Declarative specification of data integration workflows for execution on parallel processing platforms |
US9361323B2 (en) * | 2011-10-04 | 2016-06-07 | International Business Machines Corporation | Declarative specification of data integration workflows for execution on parallel processing platforms |
CN106453817A (en) * | 2015-08-13 | 2017-02-22 | Lg电子株式会社 | Mobile terminal and method of controlling the same |
US10042689B2 (en) * | 2015-08-13 | 2018-08-07 | Lg Electronics Inc. | Mobile terminal and method of controlling the same |
Also Published As
Publication number | Publication date |
---|---|
EP2409245A2 (en) | 2012-01-25 |
US20100241646A1 (en) | 2010-09-23 |
EP2409245A4 (en) | 2012-12-12 |
WO2010107523A3 (en) | 2010-11-18 |
US20100332461A1 (en) | 2010-12-30 |
US7966340B2 (en) | 2011-06-21 |
WO2010107523A2 (en) | 2010-09-23 |
US8903841B2 (en) | 2014-12-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20100241893A1 (en) | Interpretation and execution of a customizable database request using an extensible computer process and an available computing environment |
US8219581B2 (en) | Method and system for analyzing ordered data using pattern matching in a relational database | |
US10838960B2 (en) | Data analytics platform over parallel databases and distributed file systems | |
US11308161B2 (en) | Querying a data source on a network | |
US8601016B2 (en) | Pre-generation of structured query language (SQL) from application programming interface (API) defined query systems | |
JP5298117B2 (en) | Data merging in distributed computing | |
US10176229B2 (en) | Guided keyword-based exploration of data | |
US11556590B2 (en) | Search systems and methods utilizing search based user clustering | |
US20190213007A1 (en) | Method and device for executing the distributed computation task | |
US9514184B2 (en) | Systems and methods for a high speed query infrastructure | |
CN107391528B (en) | Front-end component dependent information searching method and equipment | |
EP3293645B1 (en) | Iterative evaluation of data through simd processor registers | |
US20110078569A1 (en) | Value help user interface system and method | |
US8971644B1 (en) | System and method for determining an annotation for an image | |
Liu et al. | DCODE: A distributed column-oriented database engine for big data analytics | |
Kumari et al. | Challenges of modern query processing | |
Kim et al. | Intelligent data management framework for advanced Web service | |
Khapane et al. | Natural language database interface | |
Phan | Efficient and scalable aggregation for large-scale data-intensive applications |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | AS | Assignment | Owner name: ASTER DATA SYSTEMS, INC., CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: FRIEDMAN, ERIC; PAWLOWSKI, PETER; REEL/FRAME: 022417/0365. Effective date: 20090317 |
 | AS | Assignment | Owner name: TERADATA US, INC., OHIO. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: ASTER DATA SYSTEMS, INC.; REEL/FRAME: 026636/0842. Effective date: 20110610 |
 | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |