Skip to content

[testbed] fix load gen worker tick interval math to avoid div by zero #38084

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
27 changes: 27 additions & 0 deletions .chloggen/fix-testbed-div-by-zero.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
# Use this changelog template to create an entry for release notes.

# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: bug_fix

# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver)
component: testbed

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: Fix batch interval calculation to avoid possible division by zero

# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
issues: [38084]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext:

# If your change doesn't affect end users or the exported elements of any package,
# you should instead start your pull request title with [chore] or use the "Skip Changelog" label.
# Optional: The change log or logs in which this entry should be included.
# e.g. '[user]' or '[user, api]'
# Include 'user' if the change is relevant to end users.
# Include 'api' if there is a change to a library API.
# Default: '[user]'
change_logs: [api]
16 changes: 15 additions & 1 deletion testbed/testbed/load_generator.go
Original file line number Diff line number Diff line change
Expand Up @@ -213,12 +213,14 @@ func (ps *ProviderSender) generate() {

var workers sync.WaitGroup

tickDuration := ps.perWorkerTickDuration(numWorkers)

for i := 0; i < numWorkers; i++ {
workers.Add(1)

go func() {
defer workers.Done()
t := time.NewTicker(time.Second / time.Duration(ps.options.DataItemsPerSecond/ps.options.ItemsPerBatch/numWorkers))
t := time.NewTicker(tickDuration)
defer t.Stop()

var prevErr error
Expand Down Expand Up @@ -338,3 +340,15 @@ func (ps *ProviderSender) generateLog() error {
}
}
}

// perWorkerTickDuration calculates the tick interval each worker must observe in order to
// produce the desired average DataItemsPerSecond given the constraints of ItemsPerBatch and numWorkers.
//
// Of particular note are cases when the batchesPerSecond required of each worker is less than one due to a high
// number of workers relative to the desired DataItemsPerSecond. If the total batchesPerSecond is less than the
// number of workers then we are dealing with fractional batches per second per worker, so we need float arithmetic.
func (ps *ProviderSender) perWorkerTickDuration(numWorkers int) time.Duration {
batchesPerSecond := float64(ps.options.DataItemsPerSecond) / float64(ps.options.ItemsPerBatch)
batchesPerSecondPerWorker := batchesPerSecond / float64(numWorkers)
return time.Duration(float64(time.Second) / batchesPerSecondPerWorker)
}
58 changes: 58 additions & 0 deletions testbed/testbed/load_generator_test.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,58 @@
package testbed

import (
"testing"
"time"
)

func Test_perWorkerTickDuration(t *testing.T) {
for _, test := range []struct {
name string
expectedTickDuration time.Duration
dataItemsPerSecond, itemsPerBatch, numWorkers int
}{
// Because of the way perWorkerTickDuration calculates the tick interval using dataItemsPerSecond,
// it is important to test its behavior with respect to a one-second Duration in particular.
{
name: "less than one second",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this change is to resolve the divide by zero risk, we should have a test case that would have previously resulted in the divide by zero panic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review @dehaansa! One of the test cases was as you describe (indeed it was the set of configuration that caused me to find this bug in the first place). I've made the comment and test name more explicit on this point.

expectedTickDuration: 100 * time.Millisecond,
dataItemsPerSecond: 100,
itemsPerBatch: 5,
numWorkers: 2,
},
{
name: "exactly one second",
expectedTickDuration: time.Second,
dataItemsPerSecond: 100,
itemsPerBatch: 5,
numWorkers: 20,
},
{
name: "more than one second",
expectedTickDuration: 5 * time.Second,
dataItemsPerSecond: 100,
itemsPerBatch: 5,
numWorkers: 100,
},
{
name: "default batch size and worker count",
expectedTickDuration: 8103727, // ~8.1ms
dataItemsPerSecond: 1234,
itemsPerBatch: 10,
numWorkers: 1,
},
} {
t.Run(test.name, func(t *testing.T) {
subject := &ProviderSender{
options: LoadOptions{
DataItemsPerSecond: test.dataItemsPerSecond,
ItemsPerBatch: test.itemsPerBatch,
},
}
actual := subject.perWorkerTickDuration(test.numWorkers)
if actual != test.expectedTickDuration {
t.Errorf("got %v; want %v", actual, test.expectedTickDuration)
}
})
}
}
Loading