Goldrush is a small Erlang app that provides fast event stream processing

Event processing compiled to a query module

  • per module protected event processing statistics

  • query module logic can be combined for any/all filters

  • query module logic can be reduced to efficiently match event processing

Complex event processing logic

  • match input events with greater than (gt) logic

  • match input events with less than (lt) logic

  • match input events with equal to (eq) logic

  • match input events with wildcard (wc) logic

  • match input events with notfound (nf) logic

  • match no input events (null blackhole) logic

  • match all input events (null passthrough) logic

Handle output events

  • Once a query has been composed, the output action can be overridden with an Erlang function. The function is applied to each output event from the query.

Handle job execution and timing

  • create input events that include runtime on successful function executions.

Handle fastest lookups of stored values.

  • provide state storage option to compile, caching the values in query module.

Usage

To use goldrush in your application, define it as a rebar dependency or add it to Erlang's code path.

Before composing modules, you'll need to define a query. The query syntax matches any number of `{erlang, terms}' and is composed as follows:

Simple Logic

  • Simple logic is defined as any logic matching a single event filter

Select all events where 'a' exists and is greater than 0.

    glc:gt(a, 0).

Select all events where 'a' exists and is equal to 0.

    glc:eq(a, 0).

Select all events where 'a' exists and is less than 0.

    glc:lt(a, 0).

Select all events where 'a' exists.

    glc:wc(a).

Select all events where 'a' does not exist.

    glc:nf(a).

Select no input events. Used as a black hole query.

    glc:null(false).

Select all input events. Used as a passthrough query.

    glc:null(true).
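
As an illustration of how a single filter behaves once compiled, the following is a minimal sketch rather than a definitive recipe. It assumes goldrush is compiled and on the code path; the module names `glc_simple_demo` and `simple_demo_query` are hypothetical and chosen only for this example.

```erlang
-module(glc_simple_demo).
-export([run/0]).

%% Compile a single-filter query, feed it two events, and read the counters.
run() ->
    %% The goldrush application has to be running before a query is compiled.
    {ok, _} = application:ensure_all_started(goldrush),
    %% simple_demo_query is a hypothetical name for the generated query module.
    {ok, Mod} = glc:compile(simple_demo_query, glc:gt(a, 0)),
    glc:handle(Mod, gre:make([{a, 2}], [list])),   % 'a' is greater than 0
    glc:handle(Mod, gre:make([{a, -1}], [list])),  % 'a' is not greater than 0
    {glc:input(Mod), glc:output(Mod), glc:filter(Mod)}.
```

The returned tuple holds the input, output and filtered counts, so one event is expected to pass the filter and one to be filtered out.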

Combined Logic

  • Combined logic is defined as logic matching multiple event filters

Select all events where both 'a' AND 'b' exist and are greater than 0.

    glc:all([glc:gt(a, 0), glc:gt(b, 0)]).

Select all events where 'a' OR 'b' exists and is greater than 0.

    glc:any([glc:gt(a, 0), glc:gt(b, 0)]).

Select all events where both 'a' AND 'b' exist, 'a' is greater than 1 and 'b' is less than 2.

    glc:all([glc:gt(a, 1), glc:lt(b, 2)]).

Select all events where 'a' exists and is greater than 1, OR 'b' exists and is less than 2.

    glc:any([glc:gt(a, 1), glc:lt(b, 2)]).
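
The difference between `glc:all/1` and `glc:any/1` can be seen by compiling the same filter list both ways and handling one event that satisfies only one of the filters. This is a sketch under the assumption that goldrush is on the code path; `glc_combined_demo`, `combined_demo_all` and `combined_demo_any` are hypothetical names.

```erlang
-module(glc_combined_demo).
-export([run/0]).

%% Compare all/1 and any/1 over the same pair of filters.
run() ->
    {ok, _} = application:ensure_all_started(goldrush),
    Filters = [glc:gt(a, 1), glc:lt(b, 2)],
    %% Hypothetical query module names, one per combinator.
    {ok, All} = glc:compile(combined_demo_all, glc:all(Filters)),
    {ok, Any} = glc:compile(combined_demo_any, glc:any(Filters)),
    %% 'a' satisfies gt(a, 1), but 'b' does not satisfy lt(b, 2).
    Event = gre:make([{a, 2}, {b, 5}], [list]),
    glc:handle(All, Event),
    glc:handle(Any, Event),
    {glc:output(All), glc:output(Any)}.
```

The event should be counted as output only by the `any` module, since `all` requires every filter to match.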

Reduced Logic

  • Reduced logic is defined as logic which can be simplified to improve efficiency.

Select all events where 'a' is equal to 1, 'b' is equal to 2 and 'c' is equal to 3 and collapse any duplicate logic.

    glc_lib:reduce(
        glc:all([
            glc:any([glc:eq(a, 1), glc:eq(b, 2)]),
            glc:any([glc:eq(a, 1), glc:eq(c, 3)])])).

The previous example reduces to, and is equivalent to:

    glc:all([glc:eq(a, 1), glc:eq(b, 2), glc:eq(c, 3)]).
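
A reduced query can be compiled like any other query. The sketch below feeds it an event that satisfies all three equality filters; it does not try to show the exact term returned by `glc_lib:reduce/1`. It assumes goldrush is on the code path, and `glc_reduce_demo` and `reduce_demo_query` are hypothetical names.

```erlang
-module(glc_reduce_demo).
-export([run/0]).

%% Reduce a query with duplicated logic, compile it, and handle one event.
run() ->
    {ok, _} = application:ensure_all_started(goldrush),
    Reduced = glc_lib:reduce(
        glc:all([
            glc:any([glc:eq(a, 1), glc:eq(b, 2)]),
            glc:any([glc:eq(a, 1), glc:eq(c, 3)])])),
    %% reduce_demo_query is a hypothetical name for the generated module.
    {ok, Mod} = glc:compile(reduce_demo_query, Reduced),
    %% This event satisfies eq(a, 1), eq(b, 2) and eq(c, 3).
    glc:handle(Mod, gre:make([{a, 1}, {b, 2}, {c, 3}], [list])),
    glc:output(Mod).
```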

To compose a module you will take your Query defined above and compile it.

    glc:compile(Module, Query).

  • At this point you will be able to handle an event using a compiled query.

Begin by constructing an event list.

    Event = gre:make([{'a', 2}], [list]).

Now pass it to your query module to be handled.

    glc:handle(Module, Event).
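
Events built with `gre:make/2` can also be inspected directly, which is useful inside output functions. The sketch below uses `gre:fetch/2` and `gre:find/2` from the `gre` module; the wrapper module name `gre_event_demo` is hypothetical.

```erlang
-module(gre_event_demo).
-export([run/0]).

%% Build a list-backed event and read fields from it.
run() ->
    Event = gre:make([{a, 2}, {b, 3}], [list]),
    %% fetch/2 returns the value and raises badarg if the key is absent;
    %% find/2 returns {true, Value} or false instead of raising.
    {gre:fetch(a, Event), gre:find(b, Event), gre:find(missing, Event)}.
```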

Handling output events

  • You can override the output action with an erlang function.

Write all input events as info reports to the error logger.

    glc:with(glc:null(true), fun(E) ->
         error_logger:info_report(gre:pairs(E)) end).

Write all input events where `error_level' exists and is less than 5 as info reports to the error logger.

    glc:with(glc:lt(error_level, 5), fun(E) ->
         error_logger:info_report(gre:pairs(E)) end).
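
The output function is not limited to logging. As a sketch, the example below forwards each matching event's `error_level` to the calling process as a message. It assumes goldrush is on the code path; `glc_output_demo` and `output_demo_query` are hypothetical names.

```erlang
-module(glc_output_demo).
-export([run/0]).

%% Forward matching events to the caller instead of logging them.
run() ->
    {ok, _} = application:ensure_all_started(goldrush),
    Self = self(),
    Query = glc:with(glc:lt(error_level, 5),
                     fun(E) -> Self ! {matched, gre:fetch(error_level, E)} end),
    %% output_demo_query is a hypothetical name for the generated module.
    {ok, Mod} = glc:compile(output_demo_query, Query),
    glc:handle(Mod, gre:make([{error_level, 3}], [list])),  % below 5, forwarded
    glc:handle(Mod, gre:make([{error_level, 7}], [list])),  % not below 5, dropped
    receive
        {matched, Level} -> {ok, Level}
    after 1000 ->
        timeout
    end.
```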

To compose a module with state data you will add a third argument (orddict).

    glc:compile(Module, Query, [{stored, value}]).

Return the stored value in this query module.

    {ok, value} = glc:get(Module, stored).
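
A short sketch of stored state end to end follows. It assumes the accessor is `glc:get/2`, taking the query module as its first argument, and that goldrush is on the code path; `glc_state_demo` and `state_demo_query` are hypothetical names.

```erlang
-module(glc_state_demo).
-export([run/0]).

%% Compile a passthrough query with one stored value and read it back.
run() ->
    {ok, _} = application:ensure_all_started(goldrush),
    %% state_demo_query is a hypothetical name for the generated module.
    {ok, Mod} = glc:compile(state_demo_query, glc:null(true),
                            [{stored, value}]),
    %% Assumption: get/2 looks the key up in the given query module.
    glc:get(Mod, stored).
```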

To execute a job through the query module, which inputs an event on success:

    Event = gre:make([{'a', 2}], [list]).
    Result = glc:run(Module, fun(Event, State) ->
        %% do not end with error | {error, _} or throw an exception
        ok  %% placeholder result so the fun has a body
    end, Event).

Note: jobs are linearized by default, as if the module had been compiled with
glc:compile(Module, Query, [{jobs_linearized, true}]).

To execute a queued job through the query module, which inputs an event on success:

    Event = gre:make([{'a', 2}], [list]).
    %% Id must be in <<"binary">> format or 'undefined' if auto-generated.
    Result = glc:insert_queue(Module, Id, fun(Event, State) ->
        %% do not end with error | {error, _} or throw an exception
        ok  %% placeholder result so the fun has a body
    end, Event).

Job and queue options:

  • width, defaults to number of schedulers if not provided

  • limit, defaults to 10k, hard limit before jobs are rejected

  • queue_limit, defaults to 250k, hard limit before queuing rejections

  • batch_limit, defaults to 10k, the max amount of jobs to process at a time

  • batch_delay, defaults to 1 [* 10], the time to wait for jobs to spool up

  • stats_enabled, defaults to true, provides statistics for events, jobs and queues

  • jobs_linearized, defaults to true, tries to execute the jobs serially.

Lolspeed can be achieved by setting either of the last two options to false.

Return the stored value in this query module.

    {ok, value} = glc:get(Module, stored).

Return all stored values in this query module.

    [...] = Module:get().

Return the number of input events for this query module.

    glc:input(Module).

Return the number of output events for this query module.

    glc:output(Module).

Return the number of filtered events for this query module.

    glc:filter(Module).
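
The three counters can be read together after a batch of events. The sketch below also resets them with `glc:reset_counters/1`, which is assumed here to be the function behind the "Add reset counters" changelog entry. It assumes goldrush is on the code path; `glc_counter_demo` and `counter_demo_query` are hypothetical names.

```erlang
-module(glc_counter_demo).
-export([run/0]).

%% Handle three events, read the counters, reset them, and read them again.
run() ->
    {ok, _} = application:ensure_all_started(goldrush),
    %% counter_demo_query is a hypothetical name for the generated module.
    {ok, Mod} = glc:compile(counter_demo_query, glc:eq(a, 1)),
    %% Only the event with a = 1 matches the filter.
    lists:foreach(
        fun(N) -> glc:handle(Mod, gre:make([{a, N}], [list])) end,
        [1, 2, 3]),
    Before = {glc:input(Mod), glc:output(Mod), glc:filter(Mod)},
    %% Assumption: reset_counters/1 clears every counter of the module.
    glc:reset_counters(Mod),
    After = {glc:input(Mod), glc:output(Mod), glc:filter(Mod)},
    {Before, After}.
```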

Build

    $ ./rebar compile

or

    $ make

CHANGELOG

0.2.0

  • Support sidejob style execution with enhanced timing and batching.

  • Add more statistics and provide stats & job linearization toggles

0.1.7

  • Support multiple functions specified using `with/2`

  • Add state storage option with generated accessors

0.1.6

  • Add notfound event matching

0.1.5

  • Rewrite to make highly crash resilient

    • per module supervision

    • statistics data recovery

  • Add wildcard event matching

  • Add reset counters