This issue reports a bug in the health check logic for TLS connections, specifically in the `connCheck` function in `internal/pool/conn_check.go`. In the presence of server-side disconnections, it leads to unintended exhaustion of the retry count and ultimately to command failures.
go-redis without TLS does not have this problem.
Expected Behavior
The connection health check is intended to work as follows. When a connection is picked from the pool, a health check is run on it. Bad connections are discarded immediately, and the code keeps picking until it finds a good one. If all pooled connections are bad, a new connection is made. Discarding a bad connection does not consume a retry. A retry is consumed only when an error occurs while a picked connection is being used to send a command.
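The intended pick loop can be sketched roughly as follows. This is a simplified model, not the go-redis implementation: the `conn` and `pool` types, the `healthy` method, and `newPool` are hypothetical names introduced only for illustration.

```go
package main

import "fmt"

// conn is a hypothetical stand-in for a pooled connection.
type conn struct {
	id     int
	closed bool // true if the server has already dropped this connection
}

// healthy is a placeholder for the real health check.
func (c *conn) healthy() bool { return !c.closed }

// pool is a hypothetical pool of idle connections.
type pool struct {
	idle   []*conn
	nextID int
}

// newPool builds a pool of size idle connections, all closed or all open.
func newPool(size int, closed bool) *pool {
	p := &pool{nextID: size}
	for i := 0; i < size; i++ {
		p.idle = append(p.idle, &conn{id: i, closed: closed})
	}
	return p
}

// get pops idle connections until one passes the health check.
// Bad connections are discarded here and never reach the caller,
// so they cannot consume a retry. If none is healthy, a new
// connection is created. The second result counts the discards.
func (p *pool) get() (*conn, int) {
	discarded := 0
	for len(p.idle) > 0 {
		c := p.idle[len(p.idle)-1]
		p.idle = p.idle[:len(p.idle)-1]
		if c.healthy() {
			return c, discarded
		}
		discarded++
	}
	p.nextID++
	return &conn{id: p.nextID}, discarded
}

func main() {
	// All 20 pooled connections were killed by the server.
	p := newPool(20, true)
	c, discarded := p.get()
	fmt.Printf("discarded=%d usable=%v\n", discarded, c.healthy())
}
```

In this model the caller always receives a usable connection on the first pick, regardless of how many pooled connections were bad.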
Current Behavior
When TLS is used, the argument passed to `connCheck` is a `tls.Conn`. The `tls.Conn` type does not implement the `syscall.Conn` interface, so the type assertion in `connCheck` always returns `ok` as `false`, and the health check is bypassed entirely for TLS connections. Bad connections in the pool are then used to send commands, resulting in errors, and every bad connection consumes a retry.
Possible Solution
Steps to Reproduce
Set up a client to use TLS, with a pool size of 20 and a retry count of 4. The issue is exposed whenever the retry count is lower than the pool size.
Run `CLIENT KILL TYPE normal` on Redis to kill all existing connections at once.
Observe command failures on the client side.
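The arithmetic behind these steps can be modeled without a Redis server. The sketch below is a simulation under stated assumptions, not go-redis code: it assumes a command gets one initial attempt plus `maxRetries` retries, that each attempt takes one pooled connection with no health check, and that a fresh connection is dialed once the pool is empty. `runCommand` and `errConnClosed` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

var errConnClosed = errors.New("connection closed by server")

// runCommand models sending one command when the pool holds staleConns
// dead connections and no health check filters them out.
// It returns the number of attempts made and the final error, if any.
func runCommand(staleConns, maxRetries int) (int, error) {
	attempts := 0
	for attempts <= maxRetries {
		attempts++
		if staleConns == 0 {
			// Pool is drained, so a freshly dialed connection is used.
			return attempts, nil
		}
		// A dead connection is picked, the send fails, and the
		// attempt is spent.
		staleConns--
	}
	return attempts, errConnClosed
}

func main() {
	// Pool size 20, retry count 4, all pooled connections killed.
	attempts, err := runCommand(20, 4)
	fmt.Printf("attempts=%d err=%v\n", attempts, err)
}
```

With these numbers the model spends all of its attempts on dead connections and fails while most of the pool is still stale. If the number of stale connections is no larger than the retry count, the same model succeeds, which is consistent with the condition stated in the first step.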
Context (Environment)
Many cloud services that host Redis offer managed instance replacements, during which connections to the old instance are killed in batches. Because of this bug, such replacements cause command failures for TLS clusters but not for non-TLS clusters.