# Leakybuckets

## Bucket concepts

Leaky buckets are used for decision making. Under certain conditions,
enriched events are poured into these buckets. When a bucket is full,
it raises a new event, and the bucket is then destroyed. There are many
types of buckets, and new, useful bucket designs are welcome.

Usually, a single bucket configuration leads to the creation of many
buckets. They are differentiated by a field called stackkey: two events
that arrive with the same stackkey go into the same bucket.

The purpose of these buckets is to detect clients that exceed a
certain rate of attempts to do something (ssh connection, http
authentication failure, etc.). Thus, the most commonly used stackkey
field is the source_ip.

## Standard leaky buckets

Default buckets have two main configuration options:

* capacity: number of events the bucket can hold. When the capacity
  is reached and a new event is poured, a new event is raised. We
  call this type of event an overflow. This is an int.

* leakspeed: duration needed for an event to leak. When an event
  leaks, it disappears from the bucket.

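
As a rough illustration of how the two options interact, the fragment
below sketches a bucket for http authentication failures. The bucket
name and the `http_failed-auth` log type are hypothetical, chosen only
for this example; the field names are the ones described in this
document.

```yaml
# Hypothetical bucket: the name and log type are illustrative assumptions.
- type: leaky
  name: http_auth_bruteforce
  filter: "Meta.log_type == 'http_failed-auth'"
  capacity: 3        # the bucket holds up to 3 events
  leakspeed: "30s"   # one event leaks out every 30 seconds
  stackkey: "source_ip"
```

Under the semantics described above, four failures from the same
source_ip in quick succession overflow this bucket: the first three
fill it, and the fourth is poured into a full bucket. A client that
fails once a minute never overflows it, because each event has leaked
before the next one arrives.
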
## Trigger

A Trigger is a special type of bucket with a capacity of zero. Thus, when an
event is poured into a trigger, it always raises an overflow.

## Uniq

A Uniq bucket works like the standard leaky bucket except for one
thing: a filter returns a property for each event, and only one
occurrence of each property value is allowed in the bucket, hence the
name uniq.

## Counter

A Counter is a special type of bucket with an infinite capacity and an
infinite leakspeed (it never overflows, nor leaks). Nevertheless,
an event is raised after a fixed duration. The option is called
duration.

## Bayesian

A Bayesian bucket runs Bayesian inference instead of counting events.
Each condition evaluated against the events must have its likelihoods
specified in the yaml file under `prob_given_benign` and
`prob_given_evil`. The bucket keeps evaluating events until the
posterior goes above the threshold (triggering the overflow) or the
duration (specified by leakspeed) expires.

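
For reference, a single inference step can be written with Bayes' rule.
This is the textbook update, shown on the assumption that the bucket
applies it once for each condition that evaluates to true; how
conditions that evaluate to false are handled is not described in this
document. Here $p$ is the current prior (initially `bayesian_prior`),
and $e$ and $b$ are the condition's `prob_given_evil` and
`prob_given_benign`.

```latex
p' = \frac{e \, p}{e \, p + b \, (1 - p)}
```

For example, with $p = 0.5$, $e = 0.9$ and $b = 0.1$, the posterior is
$0.45 / (0.45 + 0.05) = 0.9$, which then serves as the prior for the
next condition.
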
## Available configuration options for buckets

### Fields for standard buckets

* type: mandatory field. Must be one of "leaky", "trigger", "uniq" or
  "counter".

* name: mandatory field. The value is free-form, but it is used to tag
  the events raised by the bucket.

* filter: mandatory field. It's a filter that is run to decide whether
  an event matches the bucket or not. The filter has to return
  a boolean. Filters are implemented with
  https://github.com/antonmedv/expr

* capacity: [mandatory for now, shouldn't be mandatory in the final
  version] the size of the bucket. Pouring an event into a bucket that
  already holds capacity events makes it overflow.

* leakspeed: a time duration, in a format accepted by
  https://golang.org/pkg/time/#ParseDuration. After each interval, an
  event is leaked from the bucket.

* stackkey: mandatory field. This field determines which instance of
  the bucket the matching events are poured into. When an event carries
  a stackkey value that has not been seen before, a new bucket is
  created.

* on_overflow: optional field that tells what to do when the
  bucket returns the overflow event. As of today, the possibilities
  are "ban,1h", "Reprocess" or "Delete".
  Reprocess sends the raised event back to the event pool to
  be matched against buckets.

### Fields for special buckets

#### Uniq

* uniq_filter: an expression that must comply with the syntax defined
  in https://github.com/antonmedv/expr and must return a string.
  All strings returned by this filter within the same bucket have to be
  different. Thus, if a string is seen twice, the second event is
  dismissed.

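
A Uniq bucket could be configured along the following lines. This is a
sketch based only on the fields described in this document; the bucket
name, the `http_access` log type and the `Meta.http_path` field are
hypothetical and would need to match what your parsers actually
produce.

```yaml
# Hypothetical example: the name, log type and Meta fields are assumptions.
- type: uniq
  name: http_path_scan
  filter: "Meta.log_type == 'http_access'"
  # Only one event per distinct path is kept in a given bucket;
  # repeated requests to the same path are dismissed.
  uniq_filter: "Meta.http_path"
  capacity: 10
  leakspeed: "10s"
  stackkey: "source_ip"
```

With such a configuration, a client requesting the same page over and
over adds a single event to its bucket, while a client probing many
different paths fills the bucket and eventually overflows it.
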
#### Trigger

Capacity and leakspeed are not relevant for this kind of bucket.

#### Counter

* duration: the Counter is destroyed once this interval
  has elapsed since its creation. The value must be in a format
  accepted by https://golang.org/pkg/time/#ParseDuration.
  This kind of bucket is often used with an infinite
  leakspeed and an infinite capacity [capacity set to -1 for now].

#### Bayesian

* bayesian_prior: the prior to start with.
* bayesian_threshold: the threshold the posterior must exceed to trigger the overflow.
* bayesian_conditions: list of Bayesian conditions with their likelihoods.

Bayesian conditions are built from:

* condition: the expr expression evaluated for this condition.
* prob_given_evil: the likelihood that an IP satisfies the condition, given
  that it is a malicious IP.
* prob_given_benign: the likelihood that an IP satisfies the condition, given
  that it is a benign IP.
* guillotine: boolean. When set, the condition is no longer evaluated once it
  has evaluated to true. This should be used if evaluating the condition is
  computationally expensive.

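
Putting these fields together, a Bayesian bucket might look like the
sketch below. The `bayesian` type value is an assumption (it is not in
the list of types given earlier in this document), as is the use of
`capacity: -1`, borrowed from the Counter convention. The bucket name,
conditions and probabilities are made up for illustration.

```yaml
# Hypothetical example: the type value, name, conditions and
# probabilities are all illustrative assumptions.
- type: bayesian
  name: suspicious_http_client
  filter: "Meta.log_type == 'http_access'"
  stackkey: "source_ip"
  leakspeed: "30s"          # how long the bucket keeps evaluating events
  capacity: -1              # assumed, as for counters
  bayesian_prior: 0.5
  bayesian_threshold: 0.97
  bayesian_conditions:
    - condition: "Meta.http_status == '404'"
      prob_given_evil: 0.8
      prob_given_benign: 0.2
    - condition: "Meta.http_verb == 'POST'"
      prob_given_evil: 0.6
      prob_given_benign: 0.3
      guillotine: true      # stop re-evaluating once it has matched
```
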
## Examples

```yaml
# ssh bruteforce
- type: leaky
  name: ssh_bruteforce
  filter: "Meta.log_type == 'ssh_failed-auth'"
  leakspeed: "10s"
  capacity: 5
  stackkey: "source_ip"
  on_overflow: ban,1h

# reporting of src_ip,dest_port seen
- type: counter
  name: counter
  filter: "Meta.service == 'tcp' && Event.new_connection == 'true'"
  distinct: "Meta.source_ip + ':' + Meta.dest_port"
  duration: 5m
  capacity: -1

# raise an overflow for every new tcp connection and reprocess it
- type: trigger
  name: "New connection"
  filter: "Meta.service == 'tcp' && Event.new_connection == 'true'"
  on_overflow: Reprocess
```

# Note on leakybuckets implementation

[The implementation is not settled enough to go into much detail here, but:]

The bucket code is triggered by runPour in pour.go, which calls the
`leaky.PourItemToHolders` function. All buckets are held in a single
struct called buckets, which is for now a `map[string]interface{}`.
The key of this map is derived from the filter configured for the
bucket and its stackkey. This looks complicated, but it allows us to
use only one struct. This is done in buckets.go.

On top of that, the implementation defines only the standard leaky
bucket. A goroutine is launched for every bucket (`bucket.go`). This
goroutine manages the life of the bucket.

For special buckets, hooks are defined at initialization time in
manager.go. The bucket goroutine calls these hooks when relevant, that
is, when events are poured and/or when a bucket overflows.