removed max_cache_entries because of calculations, and other PR fixes
galrose committed Jun 16, 2024
1 parent 70e11a5 commit 5811865
Showing 2 changed files with 59 additions and 53 deletions.
84 changes: 50 additions & 34 deletions processor/coralogixprocessor/README.md
@@ -1,12 +1,14 @@
# Coralogix Processor

<!-- status autogenerated section -->
| Status | |
| ------------- |-----------|
| Stability | [development]: traces |
| Distributions | [] |
| Warnings | [Statefulness](#warnings) |
| Issues | [![Open issues](https://img.shields.io/github/issues-search/open-telemetry/opentelemetry-collector-contrib?query=is%3Aissue%20is%3Aopen%20label%3Aprocessor%2Fcoralogix%20&label=open&color=orange&logo=opentelemetry)](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues?q=is%3Aopen+is%3Aissue+label%3Aprocessor%2Fcoralogix) [![Closed issues](https://img.shields.io/github/issues-search/open-telemetry/opentelemetry-collector-contrib?query=is%3Aissue%20is%3Aclosed%20label%3Aprocessor%2Fcoralogix%20&label=closed&color=blue&logo=opentelemetry)](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues?q=is%3Aclosed+is%3Aissue+label%3Aprocessor%2Fcoralogix) |
| [Code Owners](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/CONTRIBUTING.md#becoming-a-code-owner) | [@crobert-1](https://www.github.com/crobert-1), [@galrose](https://www.github.com/galrose) |

| Status | |
|----------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Stability | [development]: traces |
| Distributions | [] |
| Warnings | [Statefulness](#warnings) |
| Issues | [![Open issues](https://img.shields.io/github/issues-search/open-telemetry/opentelemetry-collector-contrib?query=is%3Aissue%20is%3Aopen%20label%3Aprocessor%2Fcoralogix%20&label=open&color=orange&logo=opentelemetry)](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues?q=is%3Aopen+is%3Aissue+label%3Aprocessor%2Fcoralogix) [![Closed issues](https://img.shields.io/github/issues-search/open-telemetry/opentelemetry-collector-contrib?query=is%3Aissue%20is%3Aclosed%20label%3Aprocessor%2Fcoralogix%20&label=closed&color=blue&logo=opentelemetry)](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues?q=is%3Aclosed+is%3Aissue+label%3Aprocessor%2Fcoralogix) |
| [Code Owners](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/CONTRIBUTING.md#becoming-a-code-owner) | [@crobert-1](https://www.github.com/crobert-1), [@galrose](https://www.github.com/galrose) |

[development]: https://github.com/open-telemetry/opentelemetry-collector#development
<!-- end autogenerated section -->
@@ -16,55 +18,69 @@
The Coralogix processor adds attributes to spans that enable features in Coralogix.

## Features

### DB Statement Blueprints

This feature enables the processor to create blueprints from SQL queries, this means replacing any variables with `?`.
The blueprint is also hashed to be able to be used with spanmetrics connector.
Long queries can be an issue when being stored in certain metric stores,
this way its possible to use the hash as the identifying dimension on the metric and be able the query certain blueprints.
The blueprint is also hashed to be able to be used with spanmetrics connector.
Long queries can be an issue when being stored in certain metric stores.
Blueprints alleviate this problem by using the hash as the identifying dimension on the metric, which enables
users to query metrics by blueprints.

The added attributes are `db.statement.blueprint` and `db.statement.blueprint.id`.
* `db.statement.blueprint` contains the blueprinted version of the statement, we require them to be sent to coralogix to display your blueprinted statement
* `db.statement.blueprint.id` contains a hash of the statement, this way we can add it as a dimension in the spanmetrics connector and use it to query your blueprints.

* `db.statement.blueprint` contains the blueprinted version of the statement. It must be sent to Coralogix so that the
  blueprinted statement can be displayed.
* `db.statement.blueprint.id` contains a hash of the statement, so it can be added as a dimension in the spanmetrics
  connector and used to query your blueprints.
* `sampling.priority` is added only if sampling is enabled and contains the value `100` for new blueprints (see the
  Sampling section).
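
As a rough illustration of the two blueprint attributes, the sketch below replaces literals in a statement with `?` and hashes the result into 8 bytes. The regular expression, the helper names `blueprint` and `blueprintID`, and the choice of FNV-1a are all assumptions made for this example; the processor's actual tokenizer and hash function are not shown in this commit.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"regexp"
)

// literalPattern matches single-quoted string literals and standalone
// numeric literals. It is a simplification for illustration only,
// not a full SQL tokenizer.
var literalPattern = regexp.MustCompile(`'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b`)

// blueprint replaces every matched literal in the statement with "?".
func blueprint(statement string) string {
	return literalPattern.ReplaceAllString(statement, "?")
}

// blueprintID hashes the blueprint into a 64-bit (8-byte) value.
// FNV-1a is an assumption chosen because it ships with the standard library.
func blueprintID(bp string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(bp))
	return h.Sum64()
}

func main() {
	bp := blueprint("SELECT * FROM users WHERE id = 42 AND name = 'alice'")
	fmt.Println(bp) // SELECT * FROM users WHERE id = ? AND name = ?
	fmt.Printf("%016x\n", blueprintID(bp))
}
```

Two statements that differ only in their literal values produce the same blueprint and therefore the same ID, which is what makes the ID usable as a low-cardinality metric dimension.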

#### Sampling

If sampling is enabled then it stores the found blueprints in an in-memory cache to be able to send only new blueprints that haven't been seen yet.
This only adds an attribute to the span named `sampling.priority`, if the blueprint is new then the sampling priority will be `100`.
If sampling is enabled then it stores the found blueprints in an in-memory cache to be able to send only new blueprints
that haven't been seen yet.
This only adds an attribute to the span named `sampling.priority`, if the blueprint is new then the sampling priority
will be `100`.

Using this key it's possible to use either
the [Tail Sampler](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/tailsamplingprocessor)
or
the [Probabilistic Sampler](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/probabilisticsamplerprocessor)
to only send new blueprints to coralogix.
If sampling is not enabled, nothing is cached and the `sampling.priority` attribute is not added.

Using this key it's possible to use either the Tail Sampler
or the Probabilistic Sampler to only send new blueprints to coralogix.
If sampling is not enabled, nothing is cached and the `sampling.priority` attribute is not added.
The cache is limited by the `max_cache_size_mib` configuration, if the cache is full it will remove the oldest entries
to make space for new ones.
The cache stores hashes of the queries, each hash is 8 bytes, so the number of maximum cache entries is calculated
by `max_cache_size_mib * 1024 * 1024 / 8`.
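
The entry calculation and the "remove the oldest entries" behavior can be sketched as follows. This is a minimal illustration, not the processor's cache: `maxEntries`, `seenCache`, and `isNew` are hypothetical names, and a simple FIFO over a slice stands in for whatever eviction structure the real implementation uses.

```go
package main

import "fmt"

// bytesPerHash is the size of one stored blueprint hash, as stated in the README.
const bytesPerHash = 8

// maxEntries applies the README formula: max_cache_size_mib * 1024 * 1024 / 8.
func maxEntries(maxCacheSizeMiB int64) int64 {
	return maxCacheSizeMiB * 1024 * 1024 / bytesPerHash
}

// seenCache is a hypothetical FIFO set of blueprint hashes.
type seenCache struct {
	capacity int
	order    []uint64 // insertion order, oldest first
	seen     map[uint64]struct{}
}

func newSeenCache(capacity int) *seenCache {
	return &seenCache{capacity: capacity, seen: make(map[uint64]struct{})}
}

// isNew reports whether id has not been seen before and records it,
// evicting the oldest entry when the cache is full.
func (c *seenCache) isNew(id uint64) bool {
	if _, ok := c.seen[id]; ok {
		return false
	}
	if len(c.order) > 0 && len(c.order) >= c.capacity {
		oldest := c.order[0]
		c.order = c.order[1:]
		delete(c.seen, oldest)
	}
	c.order = append(c.order, id)
	c.seen[id] = struct{}{}
	return true
}

func main() {
	fmt.Println(maxEntries(1024)) // 134217728 entries for the default 1024 MiB

	c := newSeenCache(2)
	if c.isNew(1) {
		// A span carrying a new blueprint would get sampling.priority = 100.
		fmt.Println("new blueprint, sampling.priority = 100")
	}
}
```

With the default of `1024` MiB the formula yields 134,217,728 entries, since one mebibyte holds 131,072 eight-byte hashes.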

## Config

* `db_statement_blueprints`
* `with_sampling` (default: `false`): Enables cache to store seen blueprints and adds the attribute `sampling.priority` to spans with new blueprints.
* `cache_config`
* `max_cache_size_mib` (default: `1048576`) The size of the cache in mebibytes to store seen blueprints hashes.
* `max_cached_entries` (default: `15000000`) The number of seen blueprints the cache will store.
* `sampling`: If enabled, adds the attribute `sampling.priority` with a value of `100` to spans with new blueprints.
Refer to the [Sampling section](#sampling) for more information.
* `max_cache_size_mib` (default: `1024`) The size of the cache in mebibytes to store seen blueprints hashes.

### Basic Setup

This setup is without sampling meaning no `sampling.priority` attribute will be added to spans.
The cache will be disabled.

```yaml
processors:
coralogix:
# Enables blueprints feature
db_statement_blueprints:
# Enables sampling, the cache will be with default values
with_sampling: true
```
### With Cache Config
### With Sampling Config
This setup will enable the cache to store seen blueprints and add the `sampling.priority` attribute to spans with new
blueprints.

```yaml
processors:
coralogix:
# Enables blueprints feature
db_statement_blueprints:
# Enables sampling, the cache will be with default values
with_sampling: true
# Custom cache configuration
cache_config:
# Maximum number of mebibytes stored in the cache
max_cache_size_mib: 1048576 #1GB
# Maximum number of hashes (IDs) stored in cache
max_cached_entries: 10000000 # 1 million
sampling:
max_cache_size_mib: 1024 #1GiB
```
28 changes: 9 additions & 19 deletions processor/coralogixprocessor/config.go
@@ -2,35 +2,25 @@ package coralogixprocessor // import "github.com/open-telemetry/opentelemetry-co

import "fmt"

type cacheConfig struct {
maxCacheSizeMebibytes int64 `mapstructure:"max_cache_size_mib"`
maxCachedEntries int64 `mapstructure:"max_cached_entries"`
type sampingConfig struct {
maxCacheSizeMib int64 `mapstructure:"max_cache_size_mib"`
maxCachedEntries int64 `mapstructure:"max_cached_entries"`
}

type databaseBlueprintsConfig struct {
withSampling bool `mapstructure:"with_sampling"`
cacheConfig `mapstructure:"cache_config"`
sampling sampingConfig `mapstructure:"sampling"`
}

type Config struct {
databaseBlueprintsConfig `mapstructure:"database_blueprints_config"`
}

func (c *Config) Validate() error {
if c.databaseBlueprintsConfig.withSampling == true {
if c.databaseBlueprintsConfig.cacheConfig.maxCacheSizeMebibytes <= 0 {
return fmt.Errorf("max_cache_size_bytes must be a positive integer")
}
if c.databaseBlueprintsConfig.cacheConfig.maxCachedEntries <= 0 {
return fmt.Errorf("max_cached_entries must be a positive integer")
}
} else {
if c.databaseBlueprintsConfig.cacheConfig.maxCacheSizeMebibytes > 0 {
return fmt.Errorf("max_cache_size_bytes cannot be set if with_sampling is false")
}
if c.databaseBlueprintsConfig.cacheConfig.maxCachedEntries > 0 {
return fmt.Errorf("buffer_items cannot be set if with_sampling is false")
}
if c.databaseBlueprintsConfig.sampling.maxCacheSizeMib <= 0 {
return fmt.Errorf("max_cache_size_mib must be a positive integer")
}
if c.databaseBlueprintsConfig.sampling.maxCachedEntries <= 0 {
return fmt.Errorf("max_cached_entries must be a positive integer")
}
return nil
}
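
For comparison, a standalone sketch of the same validation with exported fields is shown below. The exported names (`SamplingConfig`, `MaxCacheSizeMib`, and so on) and the `db_statement_blueprints` key are assumptions made for this example and are not part of the commit. Exported fields are used because mapstructure-based decoding populates only exported struct fields. The sketch keeps only the `max_cache_size_mib` check, in line with the commit title.

```go
package main

import (
	"errors"
	"fmt"
)

// SamplingConfig is a hypothetical exported variant of the sampling settings.
type SamplingConfig struct {
	MaxCacheSizeMib int64 `mapstructure:"max_cache_size_mib"`
}

// DatabaseBlueprintsConfig groups the blueprint-related settings.
type DatabaseBlueprintsConfig struct {
	Sampling SamplingConfig `mapstructure:"sampling"`
}

// Config is the top-level processor configuration in this sketch.
// The key name follows the README examples and is an assumption.
type Config struct {
	DatabaseBlueprints DatabaseBlueprintsConfig `mapstructure:"db_statement_blueprints"`
}

// Validate rejects a non-positive cache size.
func (c *Config) Validate() error {
	if c.DatabaseBlueprints.Sampling.MaxCacheSizeMib <= 0 {
		return errors.New("max_cache_size_mib must be a positive integer")
	}
	return nil
}

func main() {
	cfg := &Config{}
	cfg.DatabaseBlueprints.Sampling.MaxCacheSizeMib = 1024
	fmt.Println(cfg.Validate()) // <nil>

	cfg.DatabaseBlueprints.Sampling.MaxCacheSizeMib = 0
	fmt.Println(cfg.Validate()) // max_cache_size_mib must be a positive integer
}
```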
