Interface Streaming
- All Known Implementing Classes:
Completion, CompoundOutputEval, Dissect, Drop, Enrich, Eval, Grok, InferencePlan, Insist, Keep, Project, RegexExtract, RegisteredDomain, Rename, Rerank, ResolvingProject, UriParts, UserAgent
public interface Streaming
This interface marks commands that neither add nor remove rows and are not sensitive to the exact order of the rows.
This is required to decide whether a command is compatible with HoistRemoteEnrichLimit and HoistRemoteEnrichTopN.

For the most part, such commands can be thought of as operating on a row-by-row basis. A more formal definition is that the command can be run on data nodes and that this sequence
    ... LIMIT X | MY_COMMAND
can safely be replaced by this sequence
    ... local LIMIT X | MY_COMMAND | LIMIT X
where "local" means that it's correct to apply the limit only on the data node, without a corresponding reduction on the coordinator. See Limit.local().

We also require the same condition to hold for TopN, that is, the following are equivalent
    ... TOP N [field1, ..., fieldN] | MY_COMMAND
    ... local TOP N [field1, ..., fieldN] | MY_COMMAND | TOP N [field1, ..., fieldN]
as long as MY_COMMAND preserves the columns that we order by.

Most commands that satisfy this will also satisfy the simpler (but stronger) conditions that the following are equivalent:
    ... LIMIT X | MY_COMMAND
    ... MY_COMMAND | LIMIT X
and
    ... TOP N [field1, ..., fieldN] | MY_COMMAND
    ... | MY_COMMAND | TOP N [field1, ..., fieldN]

It is not true, for example, for WHERE:
    ... TOP X [field] | WHERE side="dark"
If the first X rows do not contain any "dark" rows, the result is empty. However, if we switch to
    ... local TOP X [field] | WHERE side="dark" | TOP X [field]
and we have N nodes, then the first N*X rows may contain "dark" rows, and the final result is not empty in this case.
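
The TopN condition and the WHERE counterexample can be illustrated with a small in-memory model. This is only a sketch, not the Elasticsearch implementation: StreamingSketch, Row, topN, globalPlan, hoistedPlan, upperCaseSide, whereDark and sampleNodes are all hypothetical names, each data node is modelled as a plain list, and TOP X [field] is assumed to sort ascending by a single integer column.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical in-memory model; none of these names are Elasticsearch APIs.
class StreamingSketch {
    // A row with one sort column ("field") and one payload column ("side").
    record Row(int field, String side) {}

    // TOP x [field]: sort ascending by field and keep the first x rows.
    static List<Row> topN(List<Row> rows, int x) {
        return rows.stream()
            .sorted(Comparator.comparingInt(Row::field))
            .limit(x)
            .collect(Collectors.toList());
    }

    // ... TOP x [field] | command
    // A single top-N over the rows of all nodes, followed by the command.
    static List<Row> globalPlan(List<List<Row>> nodes, int x, Function<List<Row>, List<Row>> command) {
        List<Row> all = nodes.stream().flatMap(List::stream).collect(Collectors.toList());
        return command.apply(topN(all, x));
    }

    // ... local TOP x [field] | command | TOP x [field]
    // Each node applies its own top-N and the command; the coordinator reduces.
    static List<Row> hoistedPlan(List<List<Row>> nodes, int x, Function<List<Row>, List<Row>> command) {
        List<Row> merged = new ArrayList<>();
        for (List<Row> node : nodes) {
            merged.addAll(command.apply(topN(node, x)));
        }
        return topN(merged, x);
    }

    // A row-by-row command that keeps every row and preserves the sort column.
    static List<Row> upperCaseSide(List<Row> rows) {
        return rows.stream()
            .map(r -> new Row(r.field(), r.side().toUpperCase(Locale.ROOT)))
            .collect(Collectors.toList());
    }

    // WHERE side == "dark": removes rows, so it does not satisfy the condition.
    static List<Row> whereDark(List<Row> rows) {
        return rows.stream().filter(r -> r.side().equals("dark")).collect(Collectors.toList());
    }

    // Two data nodes; the globally smallest rows are all "light".
    static List<List<Row>> sampleNodes() {
        return List.of(
            List.of(new Row(1, "light"), new Row(2, "light"), new Row(7, "dark")),
            List.of(new Row(5, "dark"), new Row(6, "light"), new Row(8, "light")));
    }

    public static void main(String[] args) {
        List<List<Row>> nodes = sampleNodes();
        int x = 2;

        // Row-by-row command: both plans produce the same rows.
        System.out.println(globalPlan(nodes, x, StreamingSketch::upperCaseSide)
            .equals(hoistedPlan(nodes, x, StreamingSketch::upperCaseSide)));

        // WHERE: the global plan is empty, the hoisted plan keeps a "dark" row.
        System.out.println(globalPlan(nodes, x, StreamingSketch::whereDark));
        System.out.println(hoistedPlan(nodes, x, StreamingSketch::whereDark));
    }
}
```

With two nodes and X = 2, the hoisted plan inspects up to four rows before filtering, which is how a "dark" row that lies outside the global top two ends up in its result.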