
Commit

Merge pull request #6475 from cortexproject/CharlieTLe-patch-3
Update roadmap.md
CharlieTLe authored Jan 6, 2025
2 parents 42028f7 + b3273ff commit dea0fd8
Showing 1 changed file with 16 additions and 24 deletions.
40 changes: 16 additions & 24 deletions docs/roadmap.md
@@ -9,39 +9,31 @@ This document highlights some ideas for major features we'd like to implement in
To get a more complete overview of planned features and current work, see the [issue tracker](https://github.com/cortexproject/cortex/issues).
Note that these are not ordered by priority.

## Helm charts and other packaging
Last updated: January 4, 2025

We have a [helm chart](https://github.com/cortexproject/cortex-helm-chart) but it needs work before it can be effectively utilised by different backends. We also don't provide an official set of dashboards and alerts to our users yet. This is one of the most requested features and something we will tackle in the immediate future. We also plan on publishing debs, rpms along with guides on how to run Cortex on bare-metal.
## Short-term (< 6 months)

## Auth Gateway
### Support for Prometheus Remote Write 2.0

Cortex server has a simple authentication mechanism (X-Scope-OrgId) but users can't use the multitenancy features out of the box without complicated proxy configuration. It's hard to support all the different authentication mechanisms used by different companies, but we plan to have a simple but opinionated auth-gateway that provides value out of the box. The configuration could be as simple as:
[Prometheus Remote Write 2.0](https://prometheus.io/docs/specs/remote_write_spec_2_0/)

```
tenants:
- name: infra-team
  password: basic-auth-password
- name: api-team
  password: basic-auth-password2
```
* adds a new Protobuf Message with new features enabling more use cases and wider adoption on top of performance and cost savings
* deprecates the previous Protobuf Message from a 1.0 Remote-Write specification
* adds mandatory X-Prometheus-Remote-Write-*-Written HTTP response headers for reliability purposes

## Billing and Usage analytics
For more information tracking this, please see [issue #6116](https://github.com/cortexproject/cortex/issues/6116).

We have all the metrics to track how many series, samples and queries each tenant is sending but don't have dashboards that help with this. We plan to have dashboards and UIs that will help operators monitor and control each tenant's usage out of the box.
## Long-term (> 6 months)

## Downsampling
Downsampling means storing fewer samples, e.g. one per minute instead of one every 15 seconds.
This makes queries over long periods more efficient. It can reduce storage space slightly if the full-detail data is discarded.
### CNCF Graduation Status

## Per-metric retention
Cortex was accepted to the CNCF on September 20, 2018 and moved to the Incubating maturity level on August 20, 2020. The Cortex maintainers are working towards promoting the project to the graduation status.

Cortex blocks storage supports deleting all data for a tenant after a time period (e.g. 3 months, 1 year), but we would also like to have custom retention for subsets of metrics (e.g. delete server metrics but retain business metrics).
For more information tracking this, please see [issue #6075](https://github.com/cortexproject/cortex/issues/6075).

## Exemplar support
[Exemplars](https://docs.google.com/document/d/1ymZlc9yuTj8GvZyKz1r3KDRrhaOjZ1W1qZVW_5Gj7gA/edit)
let you link metric samples to other data, such as distributed tracing.
As of early 2021 Prometheus will collect exemplars and send them via remote write, but Cortex needs to be extended to handle them.
### Downsampling

## Scalability
[Downsampling](https://thanos.io/tip/components/compact.md/#downsampling) means storing fewer samples, e.g. one per minute instead of one every 15 seconds.
This makes queries over long periods more efficient. It can reduce storage space slightly if the full-detail data is discarded.

Scalability has always been a focus for the project, but there is a lot more work to be done. We can now scale to 100s of Millions of active series but 1 Billion active series is still an unknown.
For more information tracking this, please see [issue #4322](https://github.com/cortexproject/cortex/issues/4322).
