New component: cfgardenobserver #33618

Closed
1 of 3 tasks
jriguera opened this issue Jun 18, 2024 · 5 comments
@jriguera
Contributor

The purpose and use-cases of the new component

We would like to implement a new observer for Cloudfoundry containers/applications. The idea is not to use the main API but the local one, which is available as a Unix socket on each node and manages the container lifecycle. The main API would remain optional, used only to get application info (which involves a single HTTP GET request once the app ID is known).

We want applications that expose metrics (Prometheus/OpenMetrics format, /metrics endpoint) to be scraped automatically by an OpenTelemetry Collector running on each node. The idea is to build a "cfgardenobserver" (similar to k8sobserver or dockerobserver) that periodically polls the Garden API, discovers the local containers in the diego-cell, and feeds a receivercreator, which creates (or removes, if the container is gone) an instance of a prometheusclient (or, more likely, a simpleprometheusreceiver) with the proper settings (mTLS or extra metric labels) to scrape the app instance. The collected metrics will eventually be pushed to an OTLP endpoint. This is similar to how the k8sobserver uses the K8S downward API (a sort of reduced API for simple/local queries), which avoids querying the entire cluster and only returns information about containers (pods) currently running on the node.
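The poll-and-notify cycle described above can be sketched as a diff between two consecutive container snapshots. This is only an illustrative Python sketch, not the component's implementation (which would be written in Go): the `Endpoint` class and `diff_endpoints` function are hypothetical names, and the sample targets are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical, simplified stand-in for an observer endpoint."""
    id: str      # container handle
    target: str  # address to scrape, e.g. "10.0.0.1:8080"


def diff_endpoints(previous, current):
    """Compare two {id: Endpoint} snapshots taken on consecutive polls.

    Returns three lists (added, removed, changed), which an observer could
    turn into add/remove/change notifications for its listeners.
    """
    added = [current[i] for i in sorted(current.keys() - previous.keys())]
    removed = [previous[i] for i in sorted(previous.keys() - current.keys())]
    changed = [
        current[i]
        for i in sorted(current.keys() & previous.keys())
        if current[i] != previous[i]
    ]
    return added, removed, changed


# Made-up snapshots: "a" disappears, "b" changes its target, "c" is new.
old = {"a": Endpoint("a", "10.0.0.1:8080"), "b": Endpoint("b", "10.0.0.2:8080")}
new = {"b": Endpoint("b", "10.0.0.2:9090"), "c": Endpoint("c", "10.0.0.3:8080")}
added, removed, changed = diff_endpoints(old, new)
```

Keeping only the previous snapshot in memory means each poll needs a single list call against the local Garden API.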

Example configuration for the component

# Otel collector config
extensions:
  cfgarden_observer:
    endpoint: unix:///var/vcap/data/garden/garden.sock
    default_container_port: 8080
    default_metrics_path: /internal/metrics
    # Optional: only used to get application annotations/labels and extra information not provided by the low-level Garden API
    # Use it carefully because the API is slow and cannot be queried continuously (on each listcontainers run)
    cf_api:
        url: https://cf.api
        username: user
        password: pass


receivers:
  receiver_creator:
    watch_observers: [cfgarden_observer]
    receivers:
      prometheus_simple:
        # we only scrape apps from a specific org id
        rule: type == "container" && labels["network.org_id"] == "98368c71-f3fb-4897-81b1-b58e1f226ea1"
        config:
          collection_interval: 60s
          # This is not needed, as the endpoint would be built from the ContainerIP property using the default port `8080`
          # (mapped by default in all apps) with the default path /internal/metrics
          # endpoint: '`endpoint`'
        resource_attributes:
          org_id: labels["network.org_id"]
          space_id: labels["network.space_id"]
          app_id: labels["network.app_id"]
          org: labels["log_config.tags.organization_name"]
          space: labels["log_config.tags.space_name"]
          app: labels["log_config.tags.app_name"]
          app_index: labels["log_config.index"]
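
As a rough illustration of what the rule and the defaults in this configuration imply, the Python sketch below filters on the org ID and builds the default scrape URL from the container IP. The function names are hypothetical, and the plain `http` scheme is an assumption (the proposal also mentions mTLS setups); the IP and org ID are the sample values from this issue.

```python
ORG_ID = "98368c71-f3fb-4897-81b1-b58e1f226ea1"  # org ID used in the example rule


def matches_rule(endpoint_type, labels, org_id=ORG_ID):
    """Mimic: type == "container" && labels["network.org_id"] == org_id."""
    return endpoint_type == "container" and labels.get("network.org_id") == org_id


def default_scrape_url(container_ip, port=8080, path="/internal/metrics", scheme="http"):
    """Build a scrape URL from the container IP plus the configured defaults.

    The scheme is an assumption of this sketch, not part of the proposal.
    """
    if not path.startswith("/"):
        path = "/" + path
    return f"{scheme}://{container_ip}:{port}{path}"


labels = {"network.org_id": ORG_ID}
url = default_scrape_url("10.255.141.133") if matches_rule("container", labels) else None
```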

Telemetry data types supported

Metrics

Is this a vendor-specific component?

  • This is a vendor-specific component
  • If this is a vendor-specific component, I am a member of the OpenTelemetry organization.
  • If this is a vendor-specific component, I am proposing to contribute and support it as a representative of the vendor.

Code Owner(s)

@crobert-1

Sponsor (optional)

@crobert-1

Additional context

Cloudfoundry is a PaaS that implements its own container technology. The component that provides a local API on each node (also known as a "diego-cell") is named Garden. Garden can use different backends (runc, containerd, etc.) to manage the container lifecycle and abstract the OS. The API is available over a Unix socket or a regular network port. There is a Go library for interacting with containers. In this case, only GET/List operations would be performed, to list running containers and read their ContainerInfo (https://pkg.go.dev/code.cloudfoundry.org/garden#ContainerInfo). The ContainerInfo will be mapped to an observer.Endpoint of type container.

Applications running in a container can come from a Docker image or be built from source code with a buildpack.
By convention, buildpack apps "always" listen on port 8080 inside the container, which is mapped to external ports. Depending on the PaaS configuration, those ports can be secured with mTLS, unsecured, or both. Operators can pass the mTLS configuration to the prometheus_simple receiver.

Docker containers always use the first exposed port as the main application port, which is not required to be 8080. The port can be extracted from the port mappings provided by Garden.

In any case, port 61001 (inside the container) uses mTLS and always proxies traffic to the main application port.
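
The port conventions above could be applied to Garden's mapped-ports property roughly as follows. This Python sketch is illustrative only: `pick_ports` is a hypothetical helper, and it assumes that the first entry in the mapping list corresponds to the first exposed port, which the Garden API may not guarantee.

```python
import json

DEFAULT_APP_PORT = 8080  # buildpack convention described above
MTLS_PROXY_PORT = 61001  # in-container mTLS proxy port described above


def pick_ports(mapped_ports_json):
    """Return (app_port, mtls_port) from a garden.network.mapped-ports value.

    Assumption of this sketch: the first mapping is the main application port.
    mtls_port is None when port 61001 is not mapped.
    """
    mappings = json.loads(mapped_ports_json)
    container_ports = [m["ContainerPort"] for m in mappings]
    app_port = container_ports[0] if container_ports else DEFAULT_APP_PORT
    mtls_port = MTLS_PROXY_PORT if MTLS_PROXY_PORT in container_ports else None
    return app_port, mtls_port


# Buildpack-style mapping with the main port on 8080.
buildpack = '[{"HostPort":61144,"ContainerPort":8080},{"HostPort":61146,"ContainerPort":61001}]'
# Made-up Docker-style mapping whose first exposed port is not 8080.
docker = '[{"HostPort":61200,"ContainerPort":3000},{"HostPort":61201,"ContainerPort":61001}]'

buildpack_ports = pick_ports(buildpack)
docker_ports = pick_ports(docker)
```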

@jriguera jriguera added the needs triage and Sponsor Needed labels Jun 18, 2024
@crobert-1
Member

crobert-1 commented Jun 18, 2024

I'll sponsor this. Feel free to start submitting PRs, @jriguera 👍

@crobert-1 crobert-1 added the Accepted Component label and removed the Sponsor Needed and needs triage labels Jun 18, 2024
@jriguera
Contributor Author

jriguera commented Jun 19, 2024

To simplify the review process, we will split the functionality into two PRs:

  1. Garden API interaction: get the list of local running containers and make their properties available to the receivercreator.
  2. Add support for labels and annotations. This requires querying the CF API, as Garden containers do not have such properties. These annotations/labels are useful in receivercreator rules to define which containers (app instances) should be scraped, as not all containers expose a /metrics endpoint.
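
Since the CF API is slow and should not be queried on every container listing, point 2 would likely need some form of caching. Below is a minimal Python sketch of a time-based cache keyed by app ID. `AppInfoCache`, its `fetch` callback, and the TTL value are all hypothetical; the real lookup would be a CF API request.

```python
import time


class AppInfoCache:
    """Cache app info per app ID so the slow API is hit at most once per TTL."""

    def __init__(self, fetch, ttl_seconds=300.0, clock=time.monotonic):
        self._fetch = fetch    # callable(app_id) -> info; stands in for the CF API call
        self._ttl = ttl_seconds
        self._clock = clock    # injectable so the cache can be tested without sleeping
        self._entries = {}     # app_id -> (expires_at, info)

    def get(self, app_id):
        now = self._clock()
        hit = self._entries.get(app_id)
        if hit is not None and hit[0] > now:
            return hit[1]
        info = self._fetch(app_id)
        self._entries[app_id] = (now + self._ttl, info)
        return info


# Demo with a fake fetcher and a fake clock (no network access).
calls = []


def fake_fetch(app_id):
    calls.append(app_id)
    return {"labels": {"scrape": "true"}}


now = [0.0]
cache = AppInfoCache(fake_fetch, ttl_seconds=60.0, clock=lambda: now[0])
cache.get("62847b99-edb5-44cf-9602-34f85f8e6b0d")
cache.get("62847b99-edb5-44cf-9602-34f85f8e6b0d")  # served from the cache
now[0] = 61.0
cache.get("62847b99-edb5-44cf-9602-34f85f8e6b0d")  # entry expired, fetched again
```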

Regarding point 1, example of properties available in Garden API:

Container f5b4614a-746a-4758-565e-e050 is active:
    garden.network.external-ip=10.80.111.100
    log_config={"guid":"62847b99-edb5-44cf-9602-34f85f8e6b0d","index":0,"source_name":"CELL","tags":{"app_id":"62847b99-edb5-44cf-9602-34f85f8e6b0d","app_name":"myappname","instance_id":"0","organization_id":"98368c71-f3fb-4897-81b1-b58e1f226ea1","organization_name":"myorgname","process_id":"62847b99-edb5-44cf-9602-34f85f8e6b0d","process_instance_id":"f5b4614a-746a-4758-565e-e050","process_type":"web","source_id":"62847b99-edb5-44cf-9602-34f85f8e6b0d","space_id":"8986f12e-df81-44e2-b86b-8d763fa3d09f","space_name":"myspacename"}}
    garden.network.container-ip=10.255.141.133
    garden.network.host-ip=255.255.255.255
    network.app_id=62847b99-edb5-44cf-9602-34f85f8e6b0d
    network.org_id=98368c71-f3fb-4897-81b1-b58e1f226ea1
    network.ports=8080
    executor:owner=executor
    network.space_id=8986f12e-df81-44e2-b86b-8d763fa3d09f
    garden.network.interface=eth0
    garden.network.mapped-ports=[{"HostPort":61144,"ContainerPort":8080},{"HostPort":61145,"ContainerPort":2222},{"HostPort":61146,"ContainerPort":61001},{"HostPort":61147,"ContainerPort":61002}]
    garden.state=created
    network.container_workload=app
    network.policy_group_id=62847b99-edb5-44cf-9602-34f85f8e6b0d
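
The example configuration refers to nested keys such as log_config.tags.organization_name, while Garden returns log_config as a single JSON string. One possible way to bridge the two is to flatten that JSON into dotted label names, sketched below in Python. The function names and the flattening scheme are assumptions for illustration, not the behavior of the actual extension; the property values are a subset of the sample above.

```python
import json


def flatten(prefix, value, out):
    """Recursively flatten nested dicts into dotted keys with string values."""
    if isinstance(value, dict):
        for key, child in value.items():
            flatten(f"{prefix}.{key}", child, out)
    else:
        out[prefix] = str(value)


def properties_to_labels(properties):
    """Turn Garden container properties into a flat label map.

    Only log_config is treated as JSON here; every other property is copied
    unchanged. This choice is an assumption of the sketch.
    """
    labels = {}
    for name, raw in properties.items():
        if name == "log_config":
            flatten(name, json.loads(raw), labels)
        else:
            labels[name] = raw
    return labels


props = {
    "network.org_id": "98368c71-f3fb-4897-81b1-b58e1f226ea1",
    "garden.network.container-ip": "10.255.141.133",
    "log_config": json.dumps({
        "index": 0,
        "tags": {
            "app_name": "myappname",
            "organization_name": "myorgname",
            "space_name": "myspacename",
        },
    }),
}
labels = properties_to_labels(props)
```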

jpkrohling pushed a commit that referenced this issue Jul 15, 2024
**Description:** 
We would like to implement a new observer for
[Cloudfoundry](https://www.cloudfoundry.org/) containers/applications.
The idea is not to use the main API but the local one, which is
available as a Unix socket on each node and manages the container
lifecycle. The main API would remain optional, used only to get
application info (which involves a single HTTP GET request once the app
ID is known).


**Link to tracking Issue:**
[33618](#33618)

**Testing:** First component PR

**Documentation:** Added Readme

---------

Co-authored-by: Tomás Mota <[email protected]>
Co-authored-by: Jose Riguera <[email protected]>
Co-authored-by: Tomás Mota <[email protected]>
Co-authored-by: Curtis Robert <[email protected]>
Co-authored-by: José Riguera Lopez <[email protected]>
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

@github-actions github-actions bot added the Stale label Aug 20, 2024
@crobert-1 crobert-1 removed the Stale label Aug 20, 2024
@github-actions github-actions bot added the Stale label Oct 21, 2024
andrzej-stencel pushed a commit that referenced this issue Nov 5, 2024
…4513)

**Description:** 
First Component PR: #33727
This is the second PR for adding the cfgardenobserver, with the first
suggested implementation. There are definitely some decisions made that
require feedback, such as adding the CloudFoundry application labels to
the Endpoint labels, and the decision to use the `Container`
EndpointType at all.

**Link to tracking Issue:** #33618 

**Testing:** Unit testing of config and extension

**Documentation:** Updated readme with new configuration and endpoints

---------

Co-authored-by: sam clulow <[email protected]>
Co-authored-by: sam clulow <[email protected]>
Co-authored-by: José Riguera Lopez <[email protected]>
michael-burt pushed a commit to michael-burt/opentelemetry-collector-contrib that referenced this issue Nov 7, 2024
sbylica-splunk pushed a commit to sbylica-splunk/opentelemetry-collector-contrib that referenced this issue Dec 17, 2024
Contributor

This issue has been closed as inactive because it has been stale for 120 days with no activity.

@github-actions github-actions bot closed this as not planned Dec 20, 2024