Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Histogram sum is not correct in latest version (critical) #3284

Closed
krupyansky opened this issue Oct 14, 2022 · 5 comments · Fixed by #3293
Closed

Histogram sum is not correct in latest version (critical) #3284

krupyansky opened this issue Oct 14, 2022 · 5 comments · Fixed by #3293
Assignees
Labels
bug Something isn't working pkg:exporter:otlp Related to the OTLP exporter package

Comments

@krupyansky
Copy link

krupyansky commented Oct 14, 2022

Description

On localhost:8889/metrics I get the data:

# HELP enp_offering_eoffr_http_client_total_duration

# TYPE enp_offering_eoffr_http_client_total_duration histogram

enp_offering_eoffr_http_client_total_duration_sum{http_flavor="HTTP/1.1",http_method="POST",http_status_code="200",instance="23d4c8be87b8",job="enp-offering",to_service="catalog"} 2557

enp_offering_eoffr_http_client_total_duration_sum{http_flavor="HTTP/1.1",http_method="POST",http_status_code="200",instance="23d4c8be87b8",job="enp-offering",to_service="ordering"} 2557

I have one histogram with different values for key "to_service": catalog and ordering.

And I get always the same value for different buckets.
Also, values of sum for both buckets do not increase.

This bug is not reproduced in previous versions.

Environment

  • OS: iOS
  • Architecture: x86
  • Go Version: 1.18
  • opentelemetry-go version: latest

Steps To Reproduce

  1. Create one syncint64 histogram
  2. Record to the histogram milliseconds and add some attribute (ex. "to_service") with one value (ex. "catalog)". And repeat. Only this time add the same attribute with another value (ex. ordering).
  3. See error

Expected behavior

Expect correct values ​​for sum

@krupyansky krupyansky added the bug Something isn't working label Oct 14, 2022
@krupyansky krupyansky changed the title Histogram sum is not correct Histogram sum is not correct (critical) Oct 14, 2022
@krupyansky krupyansky changed the title Histogram sum is not correct (critical) Histogram sum is not correct in latest version (critical) Oct 14, 2022
@MrAlias
Copy link
Contributor

MrAlias commented Oct 14, 2022

I am not able to reproduce this behavior with the following code. Please provide a complete example of your usage that can be run to reproduce the bug and the commit hash of the package you are running against.

main.go:

package main

import (
	"context"
	"fmt"
	"log"
	"net/http"
	"os"
	"os/signal"

	"github.com/prometheus/client_golang/prometheus/promhttp"

	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/exporters/prometheus"
	"go.opentelemetry.io/otel/sdk/metric"
)

func main() {
	ctx := context.Background()

	// The exporter embeds a default OpenTelemetry Reader and
	// implements prometheus.Collector, allowing it to be used as
	// both a Reader and Collector.
	exporter, err := prometheus.New()
	if err != nil {
		log.Fatal(err)
	}
	provider := metric.NewMeterProvider(metric.WithReader(exporter))
	meter := provider.Meter("github.com/open-telemetry/opentelemetry-go/example/prometheus")

	// Start the prometheus HTTP server and pass the exporter Collector to it
	go serveMetrics()

	// This is the equivalent of prometheus.NewHistogramVec
	histogram, err := meter.SyncInt64().Histogram("enp_offering_eoffr_http_client_total_duration")
	if err != nil {
		log.Fatal(err)
	}
	histogram.Record(ctx, 23, attribute.String("to_service", "catalog"))
	histogram.Record(ctx, 187, attribute.String("to_service", "ordering"))

	ctx, _ = signal.NotifyContext(ctx, os.Interrupt)
	<-ctx.Done()
}

func serveMetrics() {
	log.Printf("serving metrics at localhost:2223/metrics")
	http.Handle("/metrics", promhttp.Handler())
	err := http.ListenAndServe(":2223", nil)
	if err != nil {
		fmt.Printf("error serving http: %v", err)
		return
	}
}

go.mod:

go 1.18

require (
	github.com/prometheus/client_golang v1.13.0
	go.opentelemetry.io/otel v1.11.0
	go.opentelemetry.io/otel/exporters/prometheus v0.32.3
	go.opentelemetry.io/otel/sdk/metric v0.32.3
)

require (
	github.com/beorn7/perks v1.0.1 // indirect
	github.com/cespare/xxhash/v2 v2.1.2 // indirect
	github.com/go-logr/logr v1.2.3 // indirect
	github.com/go-logr/stdr v1.2.2 // indirect
	github.com/golang/protobuf v1.5.2 // indirect
	github.com/matttproud/golang_protobuf_extensions v1.0.1 // indirect
	github.com/prometheus/client_model v0.2.0 // indirect
	github.com/prometheus/common v0.37.0 // indirect
	github.com/prometheus/procfs v0.8.0 // indirect
	go.opentelemetry.io/otel/metric v0.32.3 // indirect
	go.opentelemetry.io/otel/sdk v1.11.0 // indirect
	go.opentelemetry.io/otel/trace v1.11.0 // indirect
	golang.org/x/sys v0.0.0-20220919091848-fb04ddd9f9c8 // indirect
	google.golang.org/protobuf v1.28.1 // indirect
)

replace go.opentelemetry.io/otel => ../..

replace go.opentelemetry.io/otel/exporters/prometheus => ../../exporters/prometheus

replace go.opentelemetry.io/otel/sdk => ../../sdk

replace go.opentelemetry.io/otel/sdk/metric => ../../sdk/metric

replace go.opentelemetry.io/otel/metric => ../../metric

replace go.opentelemetry.io/otel/trace => ../../trace

Project version: b6a22ab

Running the above:

$ go run .
2022/10/14 13:30:51 reader is not registered
2022/10/14 13:30:51 serving metrics at localhost:2223/metrics

and in another terminal:

$ curl -s localhost:2223/metrics | grep 'enp_offering_eoffr_http_client_total_duration'
# HELP enp_offering_eoffr_http_client_total_duration
# TYPE enp_offering_eoffr_http_client_total_duration histogram
enp_offering_eoffr_http_client_total_duration_bucket{to_service="catalog",le="0"} 0
enp_offering_eoffr_http_client_total_duration_bucket{to_service="catalog",le="5"} 0
enp_offering_eoffr_http_client_total_duration_bucket{to_service="catalog",le="10"} 0
enp_offering_eoffr_http_client_total_duration_bucket{to_service="catalog",le="25"} 1
enp_offering_eoffr_http_client_total_duration_bucket{to_service="catalog",le="50"} 1
enp_offering_eoffr_http_client_total_duration_bucket{to_service="catalog",le="75"} 1
enp_offering_eoffr_http_client_total_duration_bucket{to_service="catalog",le="100"} 1
enp_offering_eoffr_http_client_total_duration_bucket{to_service="catalog",le="250"} 1
enp_offering_eoffr_http_client_total_duration_bucket{to_service="catalog",le="500"} 1
enp_offering_eoffr_http_client_total_duration_bucket{to_service="catalog",le="750"} 1
enp_offering_eoffr_http_client_total_duration_bucket{to_service="catalog",le="1000"} 1
enp_offering_eoffr_http_client_total_duration_bucket{to_service="catalog",le="2500"} 1
enp_offering_eoffr_http_client_total_duration_bucket{to_service="catalog",le="5000"} 1
enp_offering_eoffr_http_client_total_duration_bucket{to_service="catalog",le="7500"} 1
enp_offering_eoffr_http_client_total_duration_bucket{to_service="catalog",le="10000"} 1
enp_offering_eoffr_http_client_total_duration_bucket{to_service="catalog",le="+Inf"} 1
enp_offering_eoffr_http_client_total_duration_sum{to_service="catalog"} 23
enp_offering_eoffr_http_client_total_duration_count{to_service="catalog"} 1
enp_offering_eoffr_http_client_total_duration_bucket{to_service="ordering",le="0"} 0
enp_offering_eoffr_http_client_total_duration_bucket{to_service="ordering",le="5"} 0
enp_offering_eoffr_http_client_total_duration_bucket{to_service="ordering",le="10"} 0
enp_offering_eoffr_http_client_total_duration_bucket{to_service="ordering",le="25"} 0
enp_offering_eoffr_http_client_total_duration_bucket{to_service="ordering",le="50"} 0
enp_offering_eoffr_http_client_total_duration_bucket{to_service="ordering",le="75"} 0
enp_offering_eoffr_http_client_total_duration_bucket{to_service="ordering",le="100"} 0
enp_offering_eoffr_http_client_total_duration_bucket{to_service="ordering",le="250"} 1
enp_offering_eoffr_http_client_total_duration_bucket{to_service="ordering",le="500"} 1
enp_offering_eoffr_http_client_total_duration_bucket{to_service="ordering",le="750"} 1
enp_offering_eoffr_http_client_total_duration_bucket{to_service="ordering",le="1000"} 1
enp_offering_eoffr_http_client_total_duration_bucket{to_service="ordering",le="2500"} 1
enp_offering_eoffr_http_client_total_duration_bucket{to_service="ordering",le="5000"} 1
enp_offering_eoffr_http_client_total_duration_bucket{to_service="ordering",le="7500"} 1
enp_offering_eoffr_http_client_total_duration_bucket{to_service="ordering",le="10000"} 1
enp_offering_eoffr_http_client_total_duration_bucket{to_service="ordering",le="+Inf"} 1
enp_offering_eoffr_http_client_total_duration_sum{to_service="ordering"} 187
enp_offering_eoffr_http_client_total_duration_count{to_service="ordering"} 1

@MrAlias MrAlias added response needed Waiting on user input before progress can be made pkg:exporter:prometheus Related to the Prometheus exporter package labels Oct 14, 2022
@krupyansky
Copy link
Author

krupyansky commented Oct 15, 2022

I make repo for bug:
https://github.com/krupyansky/otel-bug

  1. make init
  2. make curl-metrics
  3. Go to browser and enter: localhost:8889/metrics
  4. and you can see bug

image

@MrAlias MrAlias removed the response needed Waiting on user input before progress can be made label Oct 15, 2022
@MrAlias
Copy link
Contributor

MrAlias commented Oct 15, 2022

This looks specific to the otlpmetric exporters. The histogram sum is a pointer that is not recreated for every iteration of the export transform loop:

Therefore the last one iterated over is use for all.

@MrAlias MrAlias added this to the Metric v0.33.0 milestone Oct 15, 2022
@MrAlias MrAlias added pkg:exporter:otlp Related to the OTLP exporter package and removed pkg:exporter:prometheus Related to the Prometheus exporter package labels Oct 15, 2022
@MrAlias MrAlias self-assigned this Oct 15, 2022
@MrAlias MrAlias moved this to In Progress in Go: Metric SDK (Beta) Oct 15, 2022
MrAlias added a commit to MrAlias/opentelemetry-go that referenced this issue Oct 15, 2022
Fixes open-telemetry#3284

The transform uses the same reference a histogram datapoint sum value
for all transformed metrics. This results in all transformed metrics
being exported with the same sum (see open-telemetry#3284). This changes the transform
to correctly reference a unique sum for each datapoint.
@krupyansky
Copy link
Author

Hello, @MrAlias!)

Can you tell me when the release of version 0.33.0 is planned?

Our team is looking forward to the release)

Repository owner moved this from In Progress to Done in Go: Metric SDK (Beta) Oct 17, 2022
MadVikingGod pushed a commit that referenced this issue Oct 17, 2022
* Fix HistogramDataPoints transform in otlpmetric

Fixes #3284

The transform uses the same reference a histogram datapoint sum value
for all transformed metrics. This results in all transformed metrics
being exported with the same sum (see #3284). This changes the transform
to correctly reference a unique sum for each datapoint.
@MrAlias
Copy link
Contributor

MrAlias commented Oct 17, 2022

@krupyansky you can track the release progress here: https://github.com/open-telemetry/opentelemetry-go/milestone/31

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working pkg:exporter:otlp Related to the OTLP exporter package
Projects
No open projects
Status: Done
Development

Successfully merging a pull request may close this issue.

2 participants