-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[receiver/sqlserver] Enable direct connection to SQL Server instances #31915
[receiver/sqlserver] Enable direct connection to SQL Server instances #31915
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just left some comments to hopefully help give thought process behind some changes.
@@ -14,29 +14,45 @@ | |||
[sumo]: https://github.com/SumoLogic/sumologic-otel-collector | |||
<!-- end autogenerated section --> | |||
|
|||
The `sqlserver` receiver grabs metrics about a Microsoft SQL Server instance using the Windows Performance Counters. | |||
Because of this, it is a Windows only receiver. | |||
The `sqlserver` receiver grabs metrics about a Microsoft SQL Server instance. The receiver works by either using the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A lot of the complexity of this PR is enabling support on the non-Windows based OSs, while trying to keep Windows-specific functionality unchanged. I attempted to keep logic that was specific to windows in _windows
files, logic specific to non-windows in _others
files, and both in the generically named files. This leads to a bit of a challenge trying to follow some code, but I don't know of a better way given the nature of this receiver.
Happy to take suggestions on how to make this cleaner if anyone has any.
|
||
Make sure to run the collector as administrator in order to collect all performance counters for metrics. | ||
|
||
## Configuration | ||
|
||
The following settings are optional: | ||
- `collection_interval` (default = `10s`): The internal at which metrics should be emitted by this receiver. | ||
- `instance_name` (optional): The instance name identifies the specific SQL Server instance being monitored. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I chose to allow query-filtering based on instance name and not computer name as I believe instance name makes more sense for this. If anyone wants both, or only computer name I'm happy to discuss. It would be a relatively simple feature/enhancement.
- `server`: IP Address or hostname of SQL Server instance to connect to. | ||
- `port`: Port of the SQL Server instance to connect to. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
server
and port
here are separate options for the sake of the SQL db connection string, which requires them to be passed in separately. I thought this makes more sense than the alternative of a single option, and then parsing internally. Happy to discuss if anyone prefers the alternative.
receiver/sqlserverreceiver/config.go
Outdated
if !directDBConnectionEnabled(cfg) { | ||
if cfg.Server != "" || cfg.Username != "" || string(cfg.Password) != "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't necessarily have to return an error here, but I want to let the user know up front there may be something wrong or incomplete. There's no use-case for specifying some, but not all, of the connection options, and I could see it being a source of confusion if a isn't seeing metrics because they unknowingly forgot a configuration option.
package sqlserverreceiver // import "github.com/open-telemetry/opentelemetry-collector-contrib/receiver/sqlserverreceiver" | ||
|
||
func (cfg *Config) validateInstanceAndComputerName() error { | ||
return nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is simply to provide an alternative to the Windows-specific functionality that requires either both options to be set, or none. There's no such limitation on the non-Windows side.
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs | ||
INNER JOIN sys.master_files AS mf WITH (NOLOCK) | ||
ON vfs.[database_id] = mf.[database_id] AND vfs.[file_id] = mf.[file_id] | ||
%s' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As shown in getSQLServerDatabaseIOQuery
, this allows us to filter based on instance name.
recorder recordFunc | ||
var _ scraperhelper.Scraper = (*SQLServerScraperHelper)(nil) | ||
|
||
func NewScraper(id component.ID, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was initially hoping to directly use internal/sqlquery
's scraper functionality, but it wouldn't have access to the sqlserver receiver's metadata.yaml, and metric emitting functionality. It simply returns a set of metrics, with no way for us to use sqlserver metadata on it, AFAIK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is largely copied code from scraper.go
, it's not new code. The only change I'm aware of is the rename of the sqlServerScraper
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is copied from scraper_test.go
, it's not new.
421e648
to
0851a28
Compare
) | ||
|
||
// SQL Server Perf Counter (PC) Scraper. This is used to scrape metrics from Windows Perf Counters. | ||
type sqlServerPCScraper struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naming feedback on the two types of SQL Server scrapers would be helpful. I chose the direct connection scraper to have the default name since it's platform agnostic, and the Windows-specific scraper to have a non-default name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the overlap between metrics collected in different modes? Can the direct connection bring the same metrics as currently reported with perf counters?
0317752
to
792e8e7
Compare
All metrics reported by perf counters can be gathered from direct connection. Here's the breakdown of doc references for how to get all existing metrics from querying SQL Server: SQL Server, SQL statistics object
SQL Server, Transactions object
SQL Server, General Statistics object
SQL Server, Buffer Manager object
SQL Server, Access Methods object
For future reference, the mapping of SQL Server objects to metrics can be found here. |
792e8e7
to
06d65b1
Compare
@@ -158,6 +167,14 @@ metrics: | |||
unit: "{transactions}/s" | |||
gauge: | |||
value_type: double | |||
sqlserver.database.io.read_latency: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I decided to name this sqlserver.database.io.read_latency
instead of sqlserver.database_io.read_latency
to try to closer match semantic conventions for io
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to cumulative sum based on underlying metric description (refer to io_stall_read_ms
.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is quite large and I don't feel I can review it confidently. Can we break any parts out into smaller PRs?
**Description:** <Describe what has changed.> <!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> Add a query for the database IO metrics that SQL Server exposes. Read more about available metrics [here](https://learn.microsoft.com/en-us/sql/relational-databases/system-dynamic-management-views/sys-dm-io-virtual-file-stats-transact-sql?view=sql-server-ver16). This is a no-op update as this query will not be used until direct connection to the SQL Server instance is fully implemented. This is simply part of the effort. **Note:** This code is currently not reached. This is on purpose. **Link to tracking Issue:** <Issue number if applicable> This was originally part of #31915, but I'm breaking this out to make the original PR a more manageable size. #30297 **Testing:** <Describe what testing was performed and which tests were added.> Existing tests and added tests are passing. **Documentation:** <Describe the documentation added.> Purposefully none: This is currently dead code until all of #31915 gets merged, so it shouldn't be used.
f813757
to
fbf5f1a
Compare
18a76a3
to
afdb124
Compare
) **Description:** <Describe what has changed.> <!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> This adds config options for direct connection to a SQL server instance. For the sake of PR size, this adds no functionality, it's simply config options and validation. **Link to tracking Issue:** <Issue number if applicable> This was originally part of #31915, but I'm breaking this out to make the original PR a more manageable size. #30297 **Testing:** <Describe what testing was performed and which tests were added.> Existing tests and added tests are all passing. **Documentation:** <Describe the documentation added.> No documentation on purpose. The added config options currently have no impact on functionality, so users shouldn't be aware of this, and shouldn't try to use them yet.
**Description:** <Describe what has changed.> <!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> This is part of #31915, to try to make it more manageable in size. This variable will be used by the SQL Server receiver.
afdb124
to
7ab147e
Compare
7ab147e
to
3635289
Compare
…32471) **Description:** <Describe what has changed.> <!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> A major part of #31915 is adding support for other operating systems than Windows. This change moves Windows-specific logic into files named as such, and renames some specific functionality to match purpose, instead of generic names. `newSQLServerScraper` -> `newSQLServerPCScraper` (PC in this context stands for performance counters) `sqlServerScraper` -> `sqlServerPCScraper`` This only renames internal functions and files, there's no user-facing impact of this. **Link to tracking Issue:** <Issue number if applicable> #31915 #30297 **Testing:** <Describe what testing was performed and which tests were added.> Existing tests are passing.
3635289
to
143500c
Compare
…tems (#32542) **Description:** <Describe what has changed.> <!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> This change enables the SQL Server receiver to directly connect to SQL Server instances, and to run on operating systems other than Windows. This is part of #31915, but doesn't add any functionality for the end user. No metrics are currently being gathered from SQL server instances, to make the PR (hopefully) simpler to review. **Link to tracking Issue:** <Issue number if applicable> #31915 #30297 **Testing:** <Describe what testing was performed and which tests were added.> Tests added are passing.
Add changelog Fix test CI/CD failures make gotidy, fix naming collision Update receiver/sqlserverreceiver/README.md Co-authored-by: Dmitrii Anoshin <[email protected]> Convert unit to seconds, use errors.Join instead of multierr Fix linting failure Add extended_documentation for direct connection-only metrics Fix dep version Rename sqlserver.database_io.read_latency to sqlserver.database.io.read_latency Update metric description and type Update receiver/sqlserverreceiver/scraper.go Update receiver/sqlserverreceiver/scraper.go Update receiver/sqlserverreceiver/scraper.go Fix gomod Fix merge conflict with Windows-specific functionality Add back scraper.go file
12c989f
to
d1209fb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my limited view, all looks good. Approved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Description:
This PR enables direct connection to SQL Server instances, and also allows the SQL Server receiver to run on non-Windows platforms.
User-facing changes:
sqlserver.database_io.read_latency
. This is disabled by default, and way randomly chosen as a single metric to try to restrict this PR's size. I plan on following this PR up with more metrics from the same query, as well as more queries and metrics.Note: Existing users should see no change in functionality with this PR. Using existing configurations with this PR should have no performance or functional impact.
Link to tracking Issue:
Resolves #30297
Testing:
Tested on MacOS and Windows. Manual tested with direct connection to a docker container on both platforms, as well as ran all tests in package. Everything is passing.
Working demo for testing:
Docker container start command:
Collector config:
Documentation:
Updated README