-
Notifications
You must be signed in to change notification settings - Fork 2.7k
[receiver/hostmetrics] Cheaper parent PID and number of threads retrieval on Windows #38589
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[receiver/hostmetrics] Cheaper parent PID and number of threads retrieval on Windows #38589
Conversation
Thank you for making this PR @pjanotti! It was helpful for me to understand what you meant; I didn't realize Comparing to the performance data I posted in the other PR, here's a I accidentally removed old builds of my PR, but based on the profile I'm not sure it's particularly necessary to do a perfmon comparison; the profile shows this PR is either the same or better without increasing WMI usage. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what your plan is for integrating this, but here's my thoughts.
- The existing
getProcessHandlesInternal
function gets renamedgetProcessHandlesGopsutil
getProcessHandlesInternal
becomes a new function in eachprocess_scraper_*.go
platform specific file. Most just shell out togetProcessHandlesGopsutil
, whereas on Windows it will be the contents ofgetProcessHandlesInternalNew
from this PR- Debatably, this might be enough of a deviation in behaviour to constitute a featuregate. I think this is safe enough that it could start as a Beta featuregate. If we do go with a featuregate,
getProcessHandlesInternal
inprocess_scraper_windows.go
would check if the featuregate enabled, and if it's not then shell out togetProcessHandlesGopsutil
.
I think with those things in place, the PR would be ready to come out of draft.
if err != nil { | ||
errs.AddPartial(0, fmt.Errorf("error reading parent pid for process %q (pid %v): %w", executable.name, pid, err)) | ||
parentProcessID := int32(0) | ||
if !s.config.ExcludeParentPid { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're right that we don't need a specific config for this. I would like to still make the check conditional, but we can make it conditional on whether the process.parent_pid
resource attribute is enabled or not.
While on Windows with the new process enumeration method getting ppid no longer costs anything, on other platforms it will still be the old style of doing a certain operation per-process. It's cheap on other platforms, but as a matter of principle I do think any work this scraper does should be conditional on whether the user actually enabled that attribute/metric.
@braydonk I agree with your comments. I will start working on those changes later today. |
|
||
//go:build windows | ||
|
||
package processscraper |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file could be removed when we remove the featureflag to fallback to the old behavior.
receiver/hostmetricsreceiver/internal/scraper/processscraper/get_process_handles_windows.go
Outdated
Show resolved
Hide resolved
receiver/hostmetricsreceiver/internal/scraper/processscraper/get_process_handles_windows.go
Show resolved
Hide resolved
@@ -250,7 +250,7 @@ func (s *processScraper) getProcessMetadata(ctx context.Context) ([]*processMeta | |||
} | |||
|
|||
command, err := getProcessCommand(ctx, handle) | |||
if err != nil { | |||
if err != nil && !s.config.MuteProcessAllErrors { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we add muting here but the related change on line 277 still unmuted? Maybe remove this from this PR to fix consistently as a separate bug_fix if needed? Just filing an issue is fine as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point that is a leftover from benchmarking...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Failing unittest on windows is frequency of #38860, unrelated to this PR. |
…eval on Windows (open-telemetry#38589) (To replace PR open-telemetry#35337) Reduces the cost of getting number of threads and parent process ID by calling `CreateToolhelp32Snapshot` to do process enumeration. The data returned by this function already includes the number of threads and parent process ID - and avoiding calling `CreateToolhelp32Snapshot` for every enumerated process. Fix open-telemetry#32947 ```terminal > go test -benchmem -run=^$ -bench ^BenchmarkGetProcessMetadata$ -benchtime 10s goos: windows goarch: amd64 pkg: github.com/open-telemetry/opentelemetry-collector-contrib/receiver/hostmetricsreceiver/internal/scraper/processscraper cpu: Intel(R) Core(TM) Ultra 7 165H BenchmarkGetProcessMetadata/Old-IncludeParentPid-22 3 3928594600 ns/op 28357514 B/op 13029 allocs/op BenchmarkGetProcessMetadata/New-IncludeParentPid-22 171 69861262 ns/op 28222817 B/op 12591 allocs/op BenchmarkGetProcessMetadata/Old-ExcludeParentPid-22 172 68011864 ns/op 28232550 B/op 12614 allocs/op BenchmarkGetProcessMetadata/New-ExcludeParentPid-22 169 72172583 ns/op 28351193 B/op 12647 allocs/op PASS ok github.com/open-telemetry/opentelemetry-collector-contrib/receiver/hostmetricsreceiver/internal/scraper/processscraper 81.722s > go test -benchmem -run=^$ -bench ^BenchmarkGetProcessMetadata$ -benchtime 30s goos: windows goarch: amd64 pkg: github.com/open-telemetry/opentelemetry-collector-contrib/receiver/hostmetricsreceiver/internal/scraper/processscraper cpu: Intel(R) Core(TM) Ultra 7 165H BenchmarkGetProcessMetadata/Old-IncludeParentPid-22 8 4115364238 ns/op 28797698 B/op 13220 allocs/op BenchmarkGetProcessMetadata/New-IncludeParentPid-22 514 70165003 ns/op 28702086 B/op 12800 allocs/op BenchmarkGetProcessMetadata/Old-ExcludeParentPid-22 552 70230804 ns/op 28558975 B/op 12750 allocs/op BenchmarkGetProcessMetadata/New-ExcludeParentPid-22 504 73404366 ns/op 28507275 B/op 12707 allocs/op PASS ok github.com/open-telemetry/opentelemetry-collector-contrib/receiver/hostmetricsreceiver/internal/scraper/processscraper 170.587s ``` 
(To replace PR #35337)
Reduces the cost of getting number of threads and parent process ID by calling
CreateToolhelp32Snapshot
to do process enumeration. The data returned by this function already includes the number of threads and parent process ID - and avoiding callingCreateToolhelp32Snapshot
for every enumerated process.Fix #32947