fix(template_processing): get_filters now works for IS_NULL and IS_NOT_NULL operators #33296

Open · wants to merge 2 commits into base: master

@Prokos Prokos commented Apr 30, 2025

SUMMARY

The get_filters macro in the Jinja context lets you apply filters at custom locations in a virtual dataset's SQL. This works well except when a filter uses the IS_NULL or IS_NOT_NULL operator: such filters are silently skipped by get_filters, so they cannot be handled in the template, and implementations that rely on remove_filter (like the one outlined below) break.

These operators break because, unlike the others, they have no comparator. Changing the code so that it no longer requires a comparator for these two operators fixes the issue.
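The effect of the missing comparator can be illustrated with a minimal sketch. This is not Superset's actual get_filters implementation: the `FilterOperator` enum is reduced to three members, `collect_filters` and its `allow_missing_comparator` flag are hypothetical names, and the `subject`/`operator`/`comparator` keys are assumed to match the shape of the chart's filter payload. Only the `val or op in (...)` condition mirrors the change under review.

```python
from enum import Enum
from typing import Any


class FilterOperator(str, Enum):
    # Reduced stand-in for Superset's FilterOperator enum (assumption:
    # only the members needed for this illustration are defined).
    IN = "IN"
    IS_NULL = "IS NULL"
    IS_NOT_NULL = "IS NOT NULL"


# Operators that are complete without a comparator value.
NO_COMPARATOR_OPS = (
    FilterOperator.IS_NULL.value,
    FilterOperator.IS_NOT_NULL.value,
)


def collect_filters(
    form_filters: list[dict[str, Any]],
    column: str,
    allow_missing_comparator: bool,
) -> list[dict[str, Any]]:
    """Hypothetical, simplified version of the filter-selection logic."""
    selected = []
    for flt in form_filters:
        op = flt.get("operator")
        val = flt.get("comparator")
        # Old behaviour: a truthy comparator is always required.
        # New behaviour: IS NULL / IS NOT NULL pass without one.
        has_operand = bool(val) or (
            allow_missing_comparator and op in NO_COMPARATOR_OPS
        )
        if flt.get("subject") == column and op and has_operand:
            selected.append({"op": op, "col": column, "val": val})
    return selected


filters = [
    {"subject": "col1", "operator": "IS NULL"},  # no comparator at all
    {"subject": "col1", "operator": "IN", "comparator": ["value1A"]},
]

before = collect_filters(filters, "col1", allow_missing_comparator=False)
after = collect_filters(filters, "col1", allow_missing_comparator=True)
print([f["op"] for f in before])  # ['IN']
print([f["op"] for f in after])   # ['IS NULL', 'IN']
```

With the comparator check relaxed, the IS NULL filter is returned to the template instead of being dropped.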

TESTING INSTRUCTIONS

  1. Create a virtual dataset on BigQuery with the following SQL:
  • This is likely reproducible with less SQL, but it is vital that both column selection/grouping and filtering are handled in the template.
  • This is not a BigQuery-specific bug; that's just the database I use.
{% set finalColumns = columns if columns is defined else ['__ALL__'] %}

SELECT


  sum(measure) as measure,
  
  {% for col in finalColumns %}

  	{% if col == 'col1' or col == '__ALL__' or (col is mapping and 'sqlExpression' in col and col.sqlExpression == 'col1') %}
  		col1
  		{%- if col == '__ALL__' -%},{%- endif -%}
  		{%- if not loop.last -%},{%- endif -%}
  	{% endif %}
  	
  	{% if col == 'col2' or col == '__ALL__' or (col is mapping and 'sqlExpression' in col and col.sqlExpression == 'col2') %}
  		col2
  		{%- if col == '__ALL__' -%},{%- endif -%}
  		{%- if not loop.last -%},{%- endif -%}
  	{% endif %}
  	
  {% endfor %}
  
FROM UNNEST([
  STRUCT(1 as measure, 'value1A' as col1, 'value2A' as col2),
  STRUCT(2 as measure, 'value1A' as col1, 'value2B' as col2),
  STRUCT(5 as measure, 'value1B' as col1, 'value2B' as col2),
  STRUCT(15 as measure, NULL as col1, NULL as col2)
])

-- FILTERS

{% for filter in get_filters('col1', remove_filter=True) -%}
    {%- if loop.first %} WHERE 1 = 1 {% endif %}
    AND
    {% if filter.get('op') == 'IN' -%}
        col1 IN {{ filter.get('val')|where_in }}
    {% elif filter.get('op') == 'NOT IN' -%}
        col1 NOT IN {{ filter.get('val')|where_in }}
    {% elif filter.get('op') == 'ILIKE' -%}
        LOWER(col1) LIKE LOWER({{ filter.get('val') }})
    {% elif filter.get('op') == 'IS NULL' -%}
        col1 IS NULL
    {% elif filter.get('op') == 'IS NOT NULL' -%}
        col1 IS NOT NULL
    {% else -%}
        col1 {{ filter.get('op') }} {{ filter.get('val') }}
    {%- endif -%}
{%- endfor %}

{% for filter in get_filters('col2', remove_filter=True) -%}
    {% if loop.first %} WHERE 1 = 1 {% endif %}
    AND
    {% if filter.get('op') == 'IN' -%}
        col2 IN {{ filter.get('val')|where_in }}
    {% elif filter.get('op') == 'NOT IN' -%}
        col2 NOT IN {{ filter.get('val')|where_in }}
    {% elif filter.get('op') == 'ILIKE' -%}
        LOWER(col2) LIKE LOWER({{ filter.get('val') }})
    {% elif filter.get('op') == 'IS NULL' -%}
        col2 IS NULL
    {% elif filter.get('op') == 'IS NOT NULL' -%}
        col2 IS NOT NULL
    {% else -%}
        col2 {{ filter.get('op') }} {{ filter.get('val') }}
    {%- endif -%}
{%- endfor %}

-- GROUP BY
  
{% set ns = namespace(groupByEmitted=false) %}

{% for col in finalColumns %}

	{% if col == 'col1' or col == '__ALL__' or (col is mapping and 'sqlExpression' in col and col.sqlExpression == 'col1') %}
		{%- if not ns.groupByEmitted %}
			GROUP BY
			{% set ns.groupByEmitted = true %}
		{% endif -%}
		col1
		{%- if col == '__ALL__' -%},{%- endif -%}
		{%- if not loop.last -%},{%- endif -%}
	{% endif %}
	
	{% if col == 'col2' or col == '__ALL__' or (col is mapping and 'sqlExpression' in col and col.sqlExpression == 'col2') %}
		{%- if not ns.groupByEmitted %}
			GROUP BY
			{% set ns.groupByEmitted = true %}
		{% endif -%}
		col2
		{%- if not loop.last -%},{%- endif -%}
	{% endif %}

{% endfor %}
  
  2. Click "Create Chart"
  3. Set a filter on one of the columns but don't select it as a dimension:
(screenshot)
  4. On master, this breaks because the filter is applied to the outer query, where col1 is not selected, as the resulting SQL shows:
SELECT `measure` AS `measure`, `col2` AS `col2` 
FROM (
  SELECT
    sum(measure) as measure,
    col2
    
  FROM UNNEST([
    STRUCT(1 as measure, 'value1A' as col1, 'value2A' as col2),
    STRUCT(2 as measure, 'value1A' as col1, 'value2B' as col2),
    STRUCT(5 as measure, 'value1B' as col1, 'value2B' as col2),
    STRUCT(15 as measure, NULL as col1, NULL as col2)
  ])
	
  GROUP BY		
		col2
) AS `virtual_table` 
WHERE `col1` IS NULL
 LIMIT 1000
  5. With this fix implemented, the filter is applied inside the virtual dataset and the resulting SQL looks like this:
SELECT `measure` AS `measure`, `col2` AS `col2` 
FROM (

SELECT

  sum(measure) as measure,
  col2
  
FROM UNNEST([
  STRUCT(1 as measure, 'value1A' as col1, 'value2A' as col2),
  STRUCT(2 as measure, 'value1A' as col1, 'value2B' as col2),
  STRUCT(5 as measure, 'value1B' as col1, 'value2B' as col2),
  STRUCT(15 as measure, NULL as col1, NULL as col2)
])

 WHERE 1 = 1 
  AND
  col1 IS NULL
	
	GROUP BY	
		col2

) AS `virtual_table`
 LIMIT 1000
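To make the expected WHERE clause easier to check, the per-operator branching of the template in the testing instructions can be mirrored in plain Python. This is only a sketch under stated assumptions: `where_in`, `predicate_for`, and `where_clause` are hypothetical helpers (the `where_in` here is a simplified stand-in for the Jinja filter of the same name, not its real implementation), the filter dicts are assumed to carry `op` and `val` keys as the template reads them, and only the IN, NOT IN, IS NULL, and IS NOT NULL branches are covered.

```python
from typing import Any


def where_in(values: list[Any]) -> str:
    """Simplified stand-in for the where_in Jinja filter (assumption)."""
    quoted = []
    for value in values:
        if isinstance(value, str):
            # Escape embedded single quotes by doubling them.
            quoted.append("'" + value.replace("'", "''") + "'")
        else:
            quoted.append(str(value))
    return "(" + ", ".join(quoted) + ")"


def predicate_for(column: str, flt: dict[str, Any]) -> str:
    """Translate one filter dict into a SQL predicate (subset of operators)."""
    op = flt.get("op")
    val = flt.get("val")
    if op in ("IN", "NOT IN"):
        return f"{column} {op} {where_in(val)}"
    if op in ("IS NULL", "IS NOT NULL"):
        # These operators take no comparator, so val is ignored.
        return f"{column} {op}"
    raise ValueError(f"operator not covered by this sketch: {op!r}")


def where_clause(column: str, filters: list[dict[str, Any]]) -> str:
    """Build the clause the template emits for one column's filters."""
    if not filters:
        return ""
    predicates = " AND ".join(predicate_for(column, f) for f in filters)
    return f"WHERE 1 = 1 AND {predicates}"


print(where_clause("col1", [{"op": "IS NULL", "col": "col1", "val": None}]))
# WHERE 1 = 1 AND col1 IS NULL
print(where_clause("col1", [{"op": "IN", "col": "col1", "val": ["value1A"]}]))
# WHERE 1 = 1 AND col1 IN ('value1A')
```

When get_filters returns nothing for a column, as happened on master for IS NULL, the clause is empty and the filter has to be applied elsewhere.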

ADDITIONAL INFORMATION

@dosubot dosubot bot added the global:jinja Related to Jinja templating label Apr 30, 2025

@korbit-ai korbit-ai bot left a comment

Review by Korbit AI

Category | Issue | Status
Readability | Unclear Boolean Expression Layout | ▹ view

Files scanned: superset/jinja_context.py

Comment on lines 381 to 389
and (
    val
    or
    # IS_NULL and IS_NOT_NULL operators do not have a value
    op in (
        FilterOperator.IS_NULL.value,
        FilterOperator.IS_NOT_NULL.value,
    )
)
Unclear Boolean Expression Layout (category: Readability)

What is the issue?

The boolean expression is split across multiple lines in a way that makes it harder to read, with the 'or' operator dangling on its own line.

Why this matters

Having logical operators on separate lines makes it difficult to quickly scan and understand the boolean logic flow.

Suggested change:
            and (val or op in (FilterOperator.IS_NULL.value,
                                FilterOperator.IS_NOT_NULL.value))
Author

I disagree that this is an improvement, but happy to change if a more senior contributor does think so!

Makes sense. The current format does provide clear logical grouping. Let's keep it as is unless senior contributors indicate otherwise.

Author

Not me talking to an AI: after the ruff-format fix maybe the final logic is a bit too expanded but I'll leave it to a reviewer to decide on what formatting should be used. At least now it's passing the test.

Contributor

@github-actions github-actions bot left a comment

Congrats on making your first PR and thank you for contributing to Superset! 🎉 ❤️

We hope to see you in our Slack community too! Not signed up? Use our Slack App to self-register.

@Prokos Prokos changed the title fix(template_processing) get_filters now works for IS_NULL and IS_NOT_NULL operators fix(template_processing): get_filters now works for IS_NULL and IS_NOT_NULL operators Apr 30, 2025

codecov bot commented Apr 30, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.28%. Comparing base (2a97a6e) to head (28bd367).

Additional details and impacted files
@@             Coverage Diff             @@
##           master   #33296       +/-   ##
===========================================
+ Coverage        0   83.28%   +83.28%     
===========================================
  Files           0      553      +553     
  Lines           0    39936    +39936     
===========================================
+ Hits            0    33259    +33259     
- Misses          0     6677     +6677     
Flag | Coverage Δ
hive | 48.28% <ø> (?)
mysql | 75.46% <ø> (?)
postgres | 75.53% <ø> (?)
presto | 52.76% <ø> (?)
python | 83.28% <ø> (?)
sqlite | 75.03% <ø> (?)
unit | 61.36% <ø> (?)

Author

Prokos commented Apr 30, 2025

@sfirke Pre-commit now runs successfully locally :)

Member

sfirke commented Apr 30, 2025

I tagged a few committers who I believe are familiar with the Jinja internals. Hopefully one of them can take a quick look and we can get this merged!

Labels: global:jinja (Related to Jinja templating), size/S
2 participants