CN115004170B

CN115004170B - Optimized query scheduling based on data freshness requirements

Info

Publication number: CN115004170B
Application number: CN202180009042.XA
Authority: CN
Inventors: 朱利叶斯·西塞克; 高拉夫·库马尔; 肖纳克·米斯特里; 凯伦·彼得森
Original assignee: Google LLC
Current assignee: Google LLC
Priority date: 2020-01-13
Filing date: 2021-01-12
Publication date: 2023-03-07
Anticipated expiration: 2041-01-12
Also published as: EP4091067B1; CN115004170A; US12111831B2; CN116069762A; US20220164352A1; CA3069090C; US11269879B2; EP4091067A1; CA3069090A1; US20210216547A1; WO2021146221A1

Abstract

A method for optimizing query scheduling includes receiving, at an information retrieval data processing system (200), a request (110) to accelerate query execution of a specified query (120) to a time (130) before a scheduled time (190). The method includes identifying in the specified query a specified field (140) corresponding to data in a database (250), and retrieving a data freshness requirement (160) for the specified field and a frequency of change (150) of the data corresponding to the specified field. The method includes determining, based on the frequency of change of the data corresponding to the specified field, whether executing the specified query at the time prior to the scheduled time, rather than at the scheduled time, violates the data freshness requirement. When execution does not violate the data freshness requirement, the method includes scheduling the specified query to execute at the time prior to the scheduled time.

Description

Optimized query scheduling based on data freshness requirements

Technical Field

The present disclosure relates to the field of query scheduling, and more particularly to the pre-scheduling of queries for execution ahead of a requested query execution time.

Background

A query is a request for information from an information retrieval system. There are three general approaches to formulating queries: menu-driven, query by example, and query language formulation. In the first case, a query is formulated and issued based on the selection of parameters in a menu. In the second case, the information retrieval system provides a blank record and allows the end user to specify the fields and values that define the query. In the third case, the end user formulates the query as a stylized query written in a query language. The third approach is the most complex because it requires the use of a specialized language, but it is also the most powerful because it is the least constrained mode of querying an information retrieval system.

Queries are typically issued on demand through a query interface, or programmatically during execution of a computer program. However, queries can also be issued in batch mode. That is, a query can be specified at one time, but execution of the query against the information retrieval system can be deferred until a later time. In this regard, it is common in information retrieval systems for multiple users to submit queries to the database for execution at the same time. Therefore, if the information retrieval system lacks sufficient computing resources to execute all submitted queries simultaneously, the information retrieval system must defer execution of one or more of these queries and process only a subset of the queries immediately. The process of determining which queries to defer and when to execute the deferred queries is called query scheduling.

One method of performing query scheduling is to execute incoming queries in the order in which they arrive, known as the "first-come, first-served" method. However, the first-come, first-served method cannot distinguish between queries with different response time requirements, and some queries are more time sensitive than others. If queries are simply scheduled in order of arrival, some time-sensitive queries may be forced to wait behind time-insensitive queries, which may adversely affect the availability and responsiveness of the information retrieval system.

Query scheduling can also be performed according to a fixed priority. In fixed-priority scheduling, each query is assigned a priority based on one or more attributes known when the query arrives, for example, the identity or type of the query requester. Thereafter, each query can be scheduled according to its assigned priority. Fixed-priority scheduling avoids the problem of the first-come, first-served method because time-sensitive queries can be prioritized over less time-sensitive queries. However, fixed-priority scheduling cannot account for the difference between "heavy" queries with relatively long execution times and "light" queries with relatively short execution times, e.g., on the order of milliseconds or seconds.

Summary of the Invention

Embodiments of the present disclosure address the shortcomings of the related art in query scheduling and provide a novel and non-obvious method, system and computer program product for optimized query scheduling according to data freshness requirements. In an embodiment of the present disclosure, a process for optimizing query scheduling includes receiving, in an information retrieval data processing system, a request to accelerate query execution of a specified query to a time prior to a scheduled time. A specified field corresponding to data in a database is then identified in the specified query. Thereafter, the data freshness requirement for the specified field and the frequency of change of the data corresponding to the specified field are retrieved. Then, based on the frequency of change of the data corresponding to the specified field, it is determined whether executing the specified query at the time prior to the scheduled time violates the data freshness requirement. The specified query is scheduled for execution at the time prior to the scheduled time only if execution of the specified query is determined not to violate the data freshness requirement. Otherwise, the scheduled time for executing the specified query is maintained.

In one aspect of the embodiment, the prior time is an under-scheduled time that has fewer scheduled queries, which consume fewer resources of the information retrieval data processing system than the resources available to the information retrieval data processing system at the located time or the scheduled time, and that has enough available resources to support execution of the specified query. In another aspect of the embodiment, the method further includes, upon determining that the specified query has an estimated execution cost below a threshold, maintaining the scheduled time for executing the specified query even if the specified query is determined not to violate the data freshness requirement. In yet another aspect of the embodiment, the estimated execution cost is calculated by matching at least a portion of each specified query with entries in a table of query fragments and corresponding historical execution times.
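The under-scheduled test in the first aspect above might be sketched as follows. This is only an illustration under stated assumptions: the `Slot` type, its fields, and the abstract resource units are hypothetical, since the disclosure does not specify how resources are measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    """Hypothetical view of one time slot in the query schedule."""
    capacity: float   # resources available in the slot (abstract units)
    committed: float  # resources consumed by queries already scheduled


def is_under_scheduled(slot: Slot, query_demand: float) -> bool:
    """Return True if the slot has spare resources that also cover the query."""
    # Queries already scheduled must consume less than what is available ...
    has_headroom = slot.committed < slot.capacity
    # ... and the remaining headroom must be enough for the accelerated query.
    fits = slot.committed + query_demand <= slot.capacity
    return has_headroom and fits
```

Under these assumptions, a slot with capacity 10 and 4 units committed can accept a query needing 5 units, while a slot with 8 units committed cannot.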

In another embodiment of the present disclosure, an information retrieval data processing system is adapted for optimized query scheduling according to data freshness requirements. The system includes a host computing platform that includes one or more computers, each having memory and at least one processor. The system also includes a query interface coupled to a database. The query interface receives, from requesters over a computer communications network, requests to schedule queries against the database and, in response, schedules the queries for execution so as to return different result sets to the requesters. Finally, the system includes an optimized query scheduling module.

The module includes computer program instructions that, when executed in the memory of the host computing platform, are operable to receive a request to accelerate query execution of a specified query to a time prior to a scheduled time, and to identify in the specified query a specified field corresponding to data in the database. The program instructions are also enabled to retrieve the data freshness requirement for the specified field and the frequency of change of the data corresponding to the specified field and, based on the frequency of change of the data corresponding to the specified field, to determine whether executing the specified query at the time prior to the scheduled time, rather than at the scheduled time, violates the data freshness requirement. Finally, the program instructions are enabled to schedule the specified query for execution at the time prior to the scheduled time on the condition that execution of the specified query at that time is determined not to violate the data freshness requirement. Otherwise, the program instructions are enabled to maintain the scheduled time for executing the specified query.

Additional aspects of the disclosure will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the disclosure. Aspects of the disclosure will be realized and attained by means of the elements and combinations particularly pointed out in the appended claims. It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure, as claimed.

Brief Description of the Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the principles of the disclosure. The embodiments shown herein are presently preferred; it should be understood, however, that the disclosure is not limited to the precise arrangements and instrumentalities shown, wherein:

FIG. 1 is a schematic diagram of a process for optimized query scheduling according to data freshness requirements;

FIG. 2 is a schematic diagram of an information retrieval data processing system configured for optimized query scheduling according to data freshness requirements; and,

FIG. 3 is a flowchart illustrating a process for optimized query scheduling according to data freshness requirements.

Detailed Description

Embodiments of the present disclosure provide optimized scheduling, according to data freshness requirements, of one or more queries in an information retrieval data processing system. According to an embodiment of the present disclosure, a request to accelerate query execution of a specified query from a scheduled time to a time prior to the scheduled time may be received in an information retrieval data processing system. In response to receipt of the request, a specified field corresponding to data in a database can be identified within the specified query. Likewise, the data freshness requirement for the specified field is retrieved, along with the frequency of change of the data corresponding to the specified field. Thereafter, it may be determined, based on the frequency of change of the data corresponding to the specified field, whether executing the specified query at the time prior to the scheduled time, rather than at the scheduled time, violates the data freshness requirement. If execution of the specified query at the time prior to the scheduled time is determined not to violate the data freshness requirement, the specified query may be scheduled for execution at that earlier time. Otherwise, the scheduled time for executing the specified query can be maintained.

In further illustration of an exemplary embodiment of the present disclosure, FIG. 1 illustrates a process for optimized query scheduling according to data freshness requirements for an information retrieval data processing system. As shown in FIG. 1, a request 110 is received requesting that the scheduling of a query 120 be accelerated to a specified time 130 earlier than the existing scheduled time of the query. A field 140 of a database or data model referenced by query 120 is identified. Thereafter, a data freshness requirement 160 for field 140 is retrieved: specifically, a previously stored indication of how recently the data of field 140 must have been updated when a query accessing that data is executed, whether the data is accessed directly or as part of an aggregation. Likewise, the observed volatility 150 of field 140 is retrieved: specifically, an indication of how often the data in field 140 has been updated in the past.

Thereafter, the freshness requirement 160 is compared to the observed volatility 150 at the specified time 130, relative to the time at which the request 110 was received, to determine the expected freshness of the data in field 140 at the specified time 130. If accelerating the scheduling of query 120 to the specified time 130 would result in a violation of the freshness requirement 160, the request 110 is rejected and query 120 remains in schedule 100 for execution at the previously scheduled time 190. Otherwise, query 120 is rescheduled in schedule 100 to the specified time 130. Optionally, an execution cost 170 for query 120 is determined based on at least a portion of query 120. If the execution cost is below a threshold, rescheduling to the specified time 130 is not permitted.
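The decision described for FIG. 1 can be sketched roughly as follows. The disclosure does not fix how volatility 150 or freshness requirement 160 are represented, so this sketch assumes, hypothetically, that volatility is a mean interval between updates and that the freshness requirement is a maximum tolerated lead time ahead of the scheduled time; `FieldProfile` and `resolve_execution_time` are illustrative names only.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class FieldProfile:
    mean_update_interval: timedelta  # assumed form of observed volatility 150
    max_staleness: timedelta         # assumed form of freshness requirement 160


def resolve_execution_time(
    requested: datetime,
    scheduled: datetime,
    profile: FieldProfile,
    estimated_cost: float,
    cost_threshold: float,
) -> datetime:
    """Return the time at which the query should run after an acceleration request."""
    if requested >= scheduled:
        raise ValueError("requested time must precede the scheduled time")
    # Optional cost gate: a cheap query is not worth moving, so keep its slot.
    if estimated_cost < cost_threshold:
        return scheduled
    lead = scheduled - requested
    # Assumption: an update is expected in the window when the field changes
    # at least once per window length on average.
    update_expected = lead >= profile.mean_update_interval
    # Violation: an update is expected and the lead exceeds the tolerated staleness.
    if update_expected and lead > profile.max_staleness:
        return scheduled  # reject the request, keep the previously scheduled time
    return requested      # accept the request, reschedule to the earlier time
```

With these assumptions, a field that changes about once a day can be queried two hours early, while a field that changes every ten minutes with a one-hour staleness limit cannot.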

The process described in connection with FIG. 1 may be implemented within an information retrieval data processing system 200. In further illustration, FIG. 2 schematically shows an information retrieval data processing system configured for optimized query scheduling. The system includes a host computing platform 210 that includes one or more computers, each having memory and at least one processor. The system also includes a query interface 260 to a database 250 (or to a data model that models the data in database 250). Query interface 260 is configured to receive queries over a computer communications network 220 from requesters 240 (e.g., query clients) executing in respective different computing devices 230, and to schedule execution of each received query in a query schedule 270, in which each query is assigned a specified time (day, date, time, or any combination thereof) for execution. Query interface 260 is also configured to provide the corresponding results of submitted and executed queries to the requesters among query clients 240.

Importantly, the system includes an optimized query scheduler module 300. Module 300 includes computer program instructions that, when executed in host computing platform 210, are enabled to receive individual requests from query clients 240 to accelerate execution of a specified query to a time prior to a previously scheduled time. The computer program instructions are also enabled, during execution, to consult an execution cost table 290 that associates different query portions with known execution costs, so as to identify an entry in table 290 that matches a portion of the specified query and thereby predict the execution cost of the specified query.
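A lookup against execution cost table 290 could look something like the sketch below. The table contents, the substring-based matching, and the summing of matched costs are all assumptions made for illustration; the disclosure says only that portions of a query are matched to entries with known execution costs.

```python
def estimate_cost(query: str, cost_table: dict[str, float]) -> float:
    """Sum the known costs of every table fragment that appears in the query."""
    # Normalize case and whitespace so fragments match regardless of formatting.
    normalized = " ".join(query.lower().split())
    return sum(
        cost for fragment, cost in cost_table.items() if fragment in normalized
    )


# Hypothetical table: query fragment -> historical execution time in seconds.
EXAMPLE_COST_TABLE = {
    "join customers": 12.0,
    "group by region": 3.5,
}
```

A query containing both example fragments would be estimated at the sum of their historical times, and a query matching no fragment would be estimated at zero, which under the cost gate described above would leave its scheduled time unchanged.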

The computer program instructions are further enabled during execution, if the predicted execution cost exceeds a threshold so as to warrant acceleration of the schedule, to identify a field referenced by the specified query and to determine, in a data freshness table 280, the known volatility of the data of the identified field and the data freshness required of the identified field when a query referencing that field is executed. Finally, the computer program instructions are enabled during execution to accelerate the scheduled execution of the query to a time before the previously scheduled time in query schedule 270, so long as the determined volatility does not produce data with a freshness value below the freshness requirement, for example, data that may change after the requested time and before the previously scheduled time. Otherwise, the computer program instructions are enabled to reject the request to accelerate scheduling in query schedule 270.

In still further illustration of the operation of the optimized query scheduler module 300, FIG. 3 is a flowchart illustrating a process for optimized query scheduling for an information retrieval data processing system. Beginning in block 310, a request is received to accelerate the scheduling of a query to a time prior to a previously scheduled time. In block 320, the query referenced by the request is identified along with the specified rescheduling time. In block 330, a field referenced by the query is selected, and in block 340, the freshness requirement for that field is retrieved.

Then, in block 350, the observed volatility of the field is also retrieved, and in block 360, the freshness of the data in the field at the specified time is predicted based on the observed volatility. In this regard, the data freshness table may be updated continuously or periodically with a measure of how often the data of different fields in the database or data model is updated. In decision block 370, the predicted freshness is compared to the freshness requirement to determine whether rescheduling the query to the specified time would violate the freshness requirement. That is, a violation occurs if the data of the field is predicted to be updated in the period between the specified time and the previously scheduled time, such that data read at the specified time would be stale. If not, the query is rescheduled to the specified time prior to the previously scheduled time. Otherwise, in block 390, the request is rejected.
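One way to maintain the update-frequency measure and to perform the prediction of block 360 is sketched below. It assumes, purely for illustration, that volatility is the mean gap between recorded update timestamps and that updates recur at that interval after the last observed update; both function names are hypothetical.

```python
from datetime import datetime, timedelta
from statistics import mean
from typing import Optional


def mean_update_interval(update_times: list[datetime]) -> Optional[timedelta]:
    """Average gap between consecutive updates, or None with fewer than two."""
    if len(update_times) < 2:
        return None
    ordered = sorted(update_times)
    gaps = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]
    return timedelta(seconds=mean(gaps))


def update_predicted_between(
    last_update: datetime, interval: timedelta, start: datetime, end: datetime
) -> bool:
    """Predict whether an update falls after `start` and no later than `end`."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    # Number of whole intervals needed to land strictly after `start`.
    periods = (start - last_update) // interval + 1
    next_update = last_update + periods * interval
    return next_update <= end
```

Here `start` would be the requested earlier time and `end` the previously scheduled time; a `True` result corresponds to the violation case in decision block 370 under this simple periodic model.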

The present disclosure may be embodied in a system, a method, a computer program product, or any combination thereof. The computer program product may include a computer-readable storage medium or media having computer-readable program instructions thereon for causing a processor to carry out aspects of the present disclosure. The computer-readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.

The computer-readable program instructions described herein can be downloaded from a computer-readable storage medium to respective computing/processing devices, or downloaded via a network to an external computer or external storage device. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. Aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.

These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the functions/acts specified in the flowchart and/or block diagram block or blocks.

The computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer-implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which includes one or more executable instructions for implementing the specified logical functions. In some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It should also be noted that each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.

Finally, the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the present disclosure. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof.

The corresponding structures, materials, acts, and equivalents of all means or step plus function elements in the claims below are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The description of the present disclosure has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the disclosure in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the disclosure. The embodiment was chosen and described in order to best explain the principles of the disclosure and the practical application, and to enable others of ordinary skill in the art to understand the disclosure for various embodiments with various modifications as are suited to the particular use contemplated.

Having thus described the disclosure of the present application in detail and by reference to embodiments thereof, it will be apparent that modifications and variations are possible without departing from the scope of the disclosure as defined in the appended claims.

Claims

1. A method for optimizing query scheduling, characterized in that the method comprises:

receiving, at an information retrieval data processing system configured to manage queries across a plurality of different computing devices, a query acceleration request to accelerate execution of a query previously scheduled for execution at a scheduled time to a time prior to the scheduled time;

identifying a specified field from the previously scheduled query, the specified field corresponding to data in a database;

retrieving a data freshness requirement for the specified field of the previously scheduled query and a frequency of change of the data corresponding to the specified field of the previously scheduled query;

determining, based on the frequency of change of the data corresponding to the specified field, whether executing the previously scheduled query at the time prior to the scheduled time instead of at the scheduled time violates the data freshness requirement; and

rescheduling the previously scheduled query to execute at the time prior to the scheduled time on the condition that execution of the previously scheduled query at the time prior to the scheduled time is determined not to violate the data freshness requirement, but otherwise maintaining the scheduled time at which the query was previously scheduled to execute.

2. The method of claim 1, wherein the prior time is an under-scheduled time that has fewer scheduled queries, which consume fewer resources of the information retrieval data processing system than the resources available to the information retrieval data processing system at the located time, and that has enough of the available resources to support execution of the previously scheduled query.

3. The method of claim 1, further comprising, upon determining that the previously scheduled query has an estimated execution cost below a threshold, maintaining the scheduled time for executing the previously scheduled query even if the previously scheduled query is determined not to violate the data freshness requirement.

4. The method of claim 3, wherein the estimated execution cost is computed by matching at least a portion of each previously scheduled query with entries in a table of query fragments and corresponding historical execution times.

5. An information retrieval data processing system suitable for optimized query scheduling, characterized in that the system includes:

a host computing platform comprising one or more computers each having memory and at least one processor;

a query interface coupled to the database and configured to:

receive a request to schedule a query against the database from a requester executing on a plurality of different computing devices over a communications network, and

schedule the query for execution and return a distinct set of results in response to the query to the requester; and

an optimized query scheduling module, the optimized query scheduling module comprising computer program instructions that, when executed in the memory of the host computing platform, perform operations comprising:

receiving, in an information retrieval data processing system, a query acceleration request for accelerating execution of a query previously scheduled for execution at a scheduled time to a time prior to the scheduled time;

retrieving a data freshness requirement for said specified field from a previously scheduled query and a frequency of change of said data corresponding to said specified field;

6. The system of claim 5, wherein the prior time is an under-scheduled time that has fewer scheduled queries, which consume fewer resources of the information retrieval data processing system than the resources available to the information retrieval data processing system at the located time, and that has enough of the available resources to support execution of the previously scheduled query.

7. The system of claim 5, wherein the program instructions further cause, if the previously scheduled query is determined to have an estimated execution cost below a threshold, the scheduled time for executing the previously scheduled query to be maintained even if the previously scheduled query is determined not to violate the data freshness requirement.

8. The system of claim 7, wherein the estimated execution cost is computed by matching at least a portion of each previously scheduled query with entries in a table of query fragments and corresponding historical execution times.

9. A computer storage medium, wherein a computer program is stored on the computer storage medium, and the computer program is executed by a computer to implement the method for optimizing query scheduling according to claim 1.

10. The computer storage medium of claim 9, wherein the prior time is an under-scheduled time that has fewer scheduled queries, which consume fewer resources of the information retrieval data processing system than the resources available to the information retrieval data processing system at the located time, and that has enough available resources to support execution of the previously scheduled query.

11. The computer storage medium of claim 9, wherein the method further comprises, upon determining that the previously scheduled query has an estimated execution cost below a threshold, maintaining the scheduled time for executing the previously scheduled query even if the previously scheduled query is determined not to violate the data freshness requirement.

12. The computer storage medium of claim 11, wherein the estimated execution cost is computed by matching at least a portion of each previously scheduled query with entries in a table of query fragments and corresponding historical execution times.