Avoid default allocation for taps of length 1 in ScanSaveMem #1395

Open: ricardoV94 wants to merge 1 commit into main

ricardoV94 (Member) commented May 8, 2025

The check we had for whether a variable was a default scan buffer always failed for single-tap outputs. It includes a conservative test that the original value is not being broadcast to the number of initial taps, but that does not matter when there is only a single tap.

Also added checks in the test that we are actually keeping only buffers of the expected size.
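
One plausible reading of why a broadcast check would always reject single-tap outputs can be sketched in plain Python. This is a toy model, not PyTensor's code: the helper names `broadcastable`, `value_is_broadcast`, and `toy_is_default_buffer` are hypothetical, and the sketch assumes the conservative check compares the static broadcastable pattern of the initial value against that of the whole buffer.

```python
def broadcastable(shape):
    """Mark a dimension as broadcastable when its static length is 1."""
    return tuple(dim == 1 for dim in shape)


def value_is_broadcast(value_shape, buffer_shape):
    """Conservative test: a length-1 dim of the value meets a longer buffer dim."""
    return any(
        v and not b
        for v, b in zip(broadcastable(value_shape), broadcastable(buffer_shape))
    )


def toy_is_default_buffer(value_shape, buffer_shape, taps):
    """Hypothetical taps-aware variant (assumption, not the PR's implementation)."""
    if taps == 1:
        # With one tap the leading (time) dim of the value always has length 1,
        # so it carries no information about broadcasting and is skipped.
        value_shape, buffer_shape = value_shape[1:], buffer_shape[1:]
    return not value_is_broadcast(value_shape, buffer_shape)


n_steps = 10
single_tap = ((1, 3), (1 + n_steps, 3))   # initial value shape, buffer shape
two_taps = ((2, 3), (2 + n_steps, 3))
two_taps_broadcast = ((1, 3), (2 + n_steps, 3))

old_check_rejects_single_tap = value_is_broadcast(*single_tap)
single_tap_ok = toy_is_default_buffer(*single_tap, taps=1)
two_taps_ok = toy_is_default_buffer(*two_taps, taps=2)
two_taps_broadcast_ok = toy_is_default_buffer(*two_taps_broadcast, taps=2)
```

Under these assumptions the tap-unaware test flags every single-tap buffer as broadcast, while the taps-aware variant still rejects a genuinely broadcast two-tap initial value.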


📚 Documentation preview 📚: https://pytensor--1395.org.readthedocs.build/en/1395/

ricardoV94 changed the title from "Avoid large allocation for taps of length 1 in ScanSaveMem" to "Avoid default allocation for taps of length 1 in ScanSaveMem" on May 8, 2025
ricardoV94 requested a review from Copilot on May 8, 2025 15:49

Copilot AI left a comment

Pull Request Overview

This PR addresses an issue with the default scan buffer allocation for single-tapped outputs in ScanSaveMem and enhances the tests for buffer size validation. Key changes include:

  • Adjusting the test configuration by excluding "scan_pushout" and renaming an internal function from f_rnn to step for clarity.
  • Updating the implementation of default scan buffer handling by adding a new parameter (taps) to _is_default_scan_buffer and adapting buffer expansion and slicing logic accordingly.
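
As background for the buffer sizes mentioned above, the following NumPy sketch contrasts a full default buffer of length `taps + n_steps` with a trimmed buffer that keeps only the last `taps` states. It is a toy illustration under stated assumptions: the function names are hypothetical, and the actual rewrite may keep a different number of entries depending on the backend.

```python
import numpy as np


def scan_full_buffer(step, init, n_steps):
    """Store every state: buffer length is taps + n_steps."""
    taps = len(init)
    buf = np.empty(taps + n_steps, dtype=float)
    buf[:taps] = init  # initial taps occupy the first entries
    for t in range(n_steps):
        buf[taps + t] = step(buf[t : t + taps])
    return buf


def scan_trimmed_buffer(step, init, n_steps):
    """Keep only the last `taps` states when just the final value is needed."""
    buf = np.array(init, dtype=float)
    for _ in range(n_steps):
        new_value = step(buf)
        buf = np.roll(buf, -1)  # drop the oldest state
        buf[-1] = new_value
    return buf


# Two taps: a Fibonacci-style recurrence x[t] = x[t-1] + x[t-2].
full = scan_full_buffer(np.sum, [0.0, 1.0], 10)
trimmed = scan_trimmed_buffer(np.sum, [0.0, 1.0], 10)

# Single tap: x[t] = 2 * x[t-1].
full_single = scan_full_buffer(lambda w: 2 * w[0], [1.0], 5)
trimmed_single = scan_trimmed_buffer(lambda w: 2 * w[0], [1.0], 5)
```

Both variants agree on the final state, but the trimmed buffers have shapes `(2,)` and `(1,)` instead of `(12,)` and `(6,)`, which is the kind of size assertion the updated test is described as making.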

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| tests/scan/test_rewriting.py | Modified test configuration and assertions regarding scan buffer sizes and function naming. |
| pytensor/scan/rewriting.py | Updated _is_default_scan_buffer's signature and revised buffer handling logic using the taps value. |

@@ -1186,7 +1186,7 @@ def while_scan_merge_subtensor_last_element(fgraph, scan_node):
     return subtensor_merge_replacements


-def _is_default_scan_buffer(x: TensorVariable) -> bool:
+def _is_default_scan_buffer(x: TensorVariable, taps: int) -> bool:

Copilot AI May 8, 2025

Ensure that all callers of _is_default_scan_buffer supply the correct 'taps' value so that the default buffer check correctly distinguishes between single and multiple taps.

Suggested change
-def _is_default_scan_buffer(x: TensorVariable, taps: int) -> bool:
+def _is_default_scan_buffer(x: TensorVariable, taps: int) -> bool:
+    """
+    Determine if a scan buffer is the default buffer.
+    Parameters:
+        x (TensorVariable): The tensor variable to check.
+        taps (int): The number of taps (time steps) associated with the buffer.
+            Must be correctly supplied by the caller to ensure accurate checks.
+    Returns:
+        bool: True if the buffer is the default scan buffer, False otherwise.
+    """

Copilot uses AI. Check for mistakes.

@@ -1574,15 +1574,16 @@ def scan_save_mem_rewrite(fgraph, node, backend_supports_output_pre_allocation:
# If the memory for this output has been pre-allocated
# before going into the scan op (by an alloc node)
if idx < op_info.n_mit_sot + op_info.n_sit_sot:

Copilot AI May 8, 2025

Verify that deriving 'taps' from init_l[i] accurately reflects the intended tap count, and that this value is consistently used to compute extra_size in buffer expansion.

Suggested change
-if idx < op_info.n_mit_sot + op_info.n_sit_sot:
+if idx < op_info.n_mit_sot + op_info.n_sit_sot:
+    # Validate init_l[i] before using it to derive taps
+    if not isinstance(init_l[i], int) or init_l[i] < 0:
+        raise ValueError(f"Invalid tap count in init_l[{i}]: {init_l[i]}")

@@ -1626,14 +1627,13 @@ def scan_save_mem_rewrite(fgraph, node, backend_supports_output_pre_allocation:
# val == 0 means that we want to keep all intermediate
# results for that state, including the initial values.
if idx < op_info.n_mit_sot + op_info.n_sit_sot:
taps = init_l[op_info.n_mit_mot + idx]

Copilot AI May 8, 2025

Confirm that updating the slice boundary to use 'taps' (instead of init_l) maintains the intended behavior for buffer trimming in ScanSaveMem.

Suggested change
-taps = init_l[op_info.n_mit_mot + idx]
+taps = taps[op_info.n_mit_mot + idx]

@@ -1207,7 +1208,7 @@ def test_inplace3(self):


 class TestSaveMem:
-    mode = get_default_mode().including("scan_save_mem")
+    mode = get_default_mode().including("scan_save_mem").excluding("scan_pushout")

Copilot AI May 8, 2025

Confirm that excluding 'scan_pushout' aligns with the intended optimization behavior and does not conflict with other scan optimizations.
