yiisoft / data

Data providers

Home Page: https://www.yiiframework.com/

maxResults: the dynamic page size

kamarton opened this issue

Google applies the concept of maxResults as an alternative to pageSize: it does not fix the number of items in a response, only the maximum. The number of items actually returned can vary between 0 and that maximum on any page.

The number of items in a response can be limited by any criterion, for example:

  • the response is too long
  • the response requires too many resources (e.g. RAM)
  • the server is overloaded
  • and more
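
The idea can be sketched roughly as follows. This is only an illustration: `readUpTo()` and its `complete` flag are hypothetical names, not part of yiisoft/data. The helper returns at most `$maxResults` items, but may return fewer (even none) when the caller-supplied stop condition fires.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: collects at most $maxResults items from $source,
 * but returns early as soon as $shouldStop() reports true.
 *
 * @param iterable<mixed> $source
 * @param callable(): bool $shouldStop
 * @return array{items: array<int, mixed>, complete: bool}
 */
function readUpTo(iterable $source, int $maxResults, callable $shouldStop): array
{
    $items = [];
    foreach ($source as $item) {
        if (count($items) >= $maxResults) {
            // The page is full and the source still has data.
            return ['items' => $items, 'complete' => false];
        }
        if ($shouldStop()) {
            // A limit was hit (time, memory, load, ...): return what we have.
            return ['items' => $items, 'complete' => false];
        }
        $items[] = $item;
    }

    // The source was exhausted before any limit was reached.
    return ['items' => $items, 'complete' => true];
}

// Example: stop when a 50 ms time budget is used up.
$deadline = microtime(true) + 0.05;
$page = readUpTo(range(1, 10), 3, fn (): bool => microtime(true) > $deadline);
```

A client of such an API would keep requesting pages until `complete` is true, without assuming that a short page means the end of the data.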

It has many applications:

  • frontend/backend/API:
    • Returns fewer items, but much faster. By the time the user has scrolled through the returned items, the next page has already been loaded.
    • Allows processing huge data sets (e.g. logs) in a way that is friendly to both the system and the user. For example, each response can cover a single daily partition, so the user sees on the fly how far the search has progressed and, if no longer interested, can navigate away or stop further scanning, which saves useless responses and processing.
    • The page calls ignore_user_abort() and, if connection_aborted() reports that the client has disconnected, stops any further processing (similar to handling a kill signal in a console process).
  • console:
    • Limits the number of items created by the process.
    • Does not let the process run forever, e.g. it handles the kill signal.
  • Paid services (such as Google BigQuery) charge for the amount of data processed, so it is unfortunate when processes keep running after their results are no longer needed. (BigQuery supports partitioning by default.)

I could imagine this feature in KeysetPaginator.

This stretches the boundaries between the layers somewhat and needs careful design. Whether it belongs in a separate package or in the framework itself is open to discussion, but one thing is certain: at the very least, an external package should be able to implement a similar concept.

Suggestion

interface StoppableStrategyDataInterface {
  public function withStoppableStrategy(StoppableStrategy $strategy);
}
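
A minimal sketch of how such an interface might be implemented, assuming the `with...` method returns a modified copy rather than changing the reader in place. The proposal does not define `StoppableStrategy`, so its `shouldStop()` method, the `IterableReader` class and its `read()` method are all hypothetical.

```php
<?php

declare(strict_types=1);

// Hypothetical strategy contract; the proposal only names the type.
interface StoppableStrategy
{
    /** Returns true when reading should stop and the partial result be returned. */
    public function shouldStop(): bool;
}

interface StoppableStrategyDataInterface
{
    /** Assumed to return a copy that consults $strategy while reading. */
    public function withStoppableStrategy(StoppableStrategy $strategy): self;
}

// Hypothetical reader over any iterable source.
final class IterableReader implements StoppableStrategyDataInterface
{
    /** @var iterable<mixed> */
    private iterable $rows;
    private ?StoppableStrategy $strategy = null;

    public function __construct(iterable $rows)
    {
        $this->rows = $rows;
    }

    public function withStoppableStrategy(StoppableStrategy $strategy): self
    {
        $new = clone $this; // the original reader stays untouched
        $new->strategy = $strategy;
        return $new;
    }

    /** Yields rows until the source ends or the strategy asks to stop. */
    public function read(): Generator
    {
        foreach ($this->rows as $row) {
            if ($this->strategy !== null && $this->strategy->shouldStop()) {
                return;
            }
            yield $row;
        }
    }
}
```

Because the strategy is consulted before every row, a reader configured this way can return fewer rows than requested, which is exactly the maxResults behaviour described above.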

example strategies

  • list (group)
    • AnyStrategy
    • ScoredStrategy
  • final
    • UserAbortStrategy (connection_aborted, kill signal handling)
    • PartitionStrategy (field and an associated value range)
    • ScanningCountStrategy
    • SystemResourceStrategy
      • SystemMemoryStrategy
      • SystemDiskStrategy
      • SystemCpuStrategy
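
A few of these strategies could look roughly like this. The sketch assumes a hypothetical `StoppableStrategy` interface with a single `shouldStop()` method; the class names follow the list above, but constructor arguments and behaviour are guesses rather than an agreed design.

```php
<?php

declare(strict_types=1);

// Hypothetical contract, not part of yiisoft/data.
interface StoppableStrategy
{
    public function shouldStop(): bool;
}

/** Final strategy: stops after a fixed number of scanned rows. */
final class ScanningCountStrategy implements StoppableStrategy
{
    private int $limit;
    private int $scanned = 0;

    public function __construct(int $limit)
    {
        $this->limit = $limit;
    }

    public function shouldStop(): bool
    {
        return ++$this->scanned > $this->limit;
    }
}

/** Final strategy: stops once PHP has allocated more than $maxBytes. */
final class SystemMemoryStrategy implements StoppableStrategy
{
    private int $maxBytes;

    public function __construct(int $maxBytes)
    {
        $this->maxBytes = $maxBytes;
    }

    public function shouldStop(): bool
    {
        return memory_get_usage(true) > $this->maxBytes;
    }
}

/** Final strategy: stops when the HTTP client has disconnected. */
final class UserAbortStrategy implements StoppableStrategy
{
    public function shouldStop(): bool
    {
        return connection_aborted() === 1;
    }
}

/** Group strategy: stops as soon as any child strategy asks to stop. */
final class AnyStrategy implements StoppableStrategy
{
    /** @var StoppableStrategy[] */
    private array $strategies;

    public function __construct(StoppableStrategy ...$strategies)
    {
        $this->strategies = $strategies;
    }

    public function shouldStop(): bool
    {
        foreach ($this->strategies as $strategy) {
            if ($strategy->shouldStop()) {
                return true;
            }
        }
        return false;
    }
}

// Example: stop after 1000 scanned rows, above 64 MiB, or on client disconnect.
$strategy = new AnyStrategy(
    new ScanningCountStrategy(1000),
    new SystemMemoryStrategy(64 * 1024 * 1024),
    new UserAbortStrategy()
);
```

A ScoredStrategy would presumably combine weighted votes from its children instead of stopping on the first one; it is left out here because the proposal does not say how the score is computed.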

Q                 A
Version           1.0.0 under development
PHP version       -
Operating system  -

How would such an implementation look for SQL and arrays?

It cannot be used with PHP arrays unless the array is filled from an external source.
For SQL, it is only useful for larger data sets.

without partitions:

SELECT * FROM ... WHERE large_text LIKE '%apple%' ORDER BY created_at LIMIT 3
-- after 5 mins
-- result: 1, 2, 1000000001

with date partitions:

SELECT * FROM ... WHERE large_text LIKE '%apple%' AND date <= '2019-01-01' ORDER BY created_at LIMIT 3
-- result: 1, 2
SELECT * FROM ... WHERE large_text LIKE '%apple%' AND date <= '2019-01-01' AND id > 2 ORDER BY created_at LIMIT 3
-- result: -
SELECT * FROM ... WHERE large_text LIKE '%apple%' AND date >= '2019-01-02' AND date < '2019-01-03' ORDER BY created_at LIMIT 3
-- result: -
...
SELECT * FROM ... WHERE large_text LIKE '%apple%' AND date >= '2019-02-01' AND date < '2019-02-02' ORDER BY created_at LIMIT 3
-- result: 1000000001
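
The partitioned variant can be sketched in PHP with an in-memory array standing in for the table. Everything here is hypothetical (`scanPartition()`, the `[day, afterId]` cursor, the sample rows); the sketch assumes rows are ordered by id and that `$maxResults` is at least 1. Each call scans a single daily partition, so it may return an empty page together with a cursor for the next partition.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical sketch: scans one daily partition per call.
 *
 * @param array<int, array{id: int, date: string, text: string}> $rows ordered by id
 * @return array{items: int[], next: ?array{0: string, 1: int}}
 */
function scanPartition(
    array $rows,
    string $day,
    int $afterId,
    string $lastDay,
    string $needle,
    int $maxResults
): array {
    $ids = [];
    foreach ($rows as $row) {
        // Stand-in for: WHERE large_text LIKE '%needle%' AND date = :day AND id > :afterId
        $matches = $row['date'] === $day
            && $row['id'] > $afterId
            && strpos($row['text'], $needle) !== false;
        if ($matches) {
            $ids[] = $row['id'];
            if (count($ids) === $maxResults) {
                // Page is full: stay in this partition and continue after the last id.
                return ['items' => $ids, 'next' => [$day, $row['id']]];
            }
        }
    }

    // Partition exhausted: move to the next day, or finish after the last one.
    $nextDay = (new DateTimeImmutable($day))->modify('+1 day')->format('Y-m-d');
    return ['items' => $ids, 'next' => $nextDay <= $lastDay ? [$nextDay, 0] : null];
}

$rows = [
    ['id' => 1, 'date' => '2019-01-01', 'text' => 'green apple'],
    ['id' => 2, 'date' => '2019-01-01', 'text' => 'apple pie'],
    ['id' => 3, 'date' => '2019-01-02', 'text' => 'banana'],
    ['id' => 4, 'date' => '2019-01-03', 'text' => 'pineapple'],
];

$found = [];
$cursor = ['2019-01-01', 0];
while ($cursor !== null) {
    $page = scanPartition($rows, $cursor[0], $cursor[1], '2019-01-03', 'apple', 3);
    $found = array_merge($found, $page['items']);
    // The caller may stop here at any time, e.g. when the user navigates away.
    $cursor = $page['next'];
}
```

The point of the cursor is that every response is bounded by one partition, so progress is visible to the client and the scan can be abandoned between any two calls.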

This is only useful for scan-based databases / data sources. Closing: on reflection, it does not make sense in the framework.