Skip to content

WIP: Filter Backends #455

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from
Closed

WIP: Filter Backends #455

wants to merge 3 commits into from

Conversation

n2ygk
Copy link
Contributor

@n2ygk n2ygk commented Aug 9, 2018

Fixes #

Description of the Change

  • Add optional jsonapi-style filter backends. See usage docs
    • query parameter validation -- raises 400 errors rather than silently ignoring "bad" parameters
    • sort - based on rest_framework.filters.OrderingFilter
    • keyword filter across multiple fields based on rest_framework.filters.SearchFilter
    • field-level filter based on django_filters.rest_framework.DjangoFilterBackend

This is an initial draft for your feedback/suggestions. Still no unit tests, and a couple of gaps in the implementation.

Checklist

  • PR only contains one change (considered splitting up PR)
  • unit-test added
  • documentation updated
  • changelog entry added to CHANGELOG.md
  • author name in AUTHORS

@n2ygk n2ygk requested a review from sliverc August 9, 2018 00:04
Copy link
Member

@sliverc sliverc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this. I certainly like how you add well written docstrings which is really helpful (DJA needs more of this)

There are some comments above, there are not complete but should give you feedback to continue.

One further thing I noticed is the naming of the classes.

I know that I have suggested to use JsonApi prefix in pagination but only later notices that we already have classes which are prefixed with JSONAPI like JSONAPIMetadata or JSONAPISettings.

As DRF also calls it classes with a uppercase JSON I think we should stick to JSONAPI as prefix.

We can leave the pagniation classes naming for now but might consider renaming it in the future.


This implements http://jsonapi.org/format/#fetching-sorting and raises 400 if any sort field is invalid.
"""
ignore_bad_sort_fields = False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't provide such a option as I think DJA should simply follow the specification. If someone doesn't want to raise an error when sort field is invalid they can simply use the OrderingFilter of DRF.

Copy link
Contributor Author

@n2ygk n2ygk Aug 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Serves me right for trying to anticipate your review;-) I originally left it out:-) But see below...

return super(JsonApiOrderingFilter, self).remove_invalid_fields(queryset, fields, view, request)


class JsonApiSearchFilter(JsonApiFilterMixin, SearchFilter):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you think it is better not to provide specific json api classes where there is no need? In this case we could simply provide the settings how to configure SearchFilter.

I would call the search param filter[search] kind of feels natural and follows the specification.

Copy link
Contributor Author

@n2ygk n2ygk Aug 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we just include JSONAPISearchFilter in DEFAULT_FILTER_BACKENDS or the filter_backends attribute then the default value of filter[search], filter[all] or whatever is predefined. Otherwise, including plain SearchFilter will require one to also set settings.JSON_API_SEARCH_PARAM or search_param attribute in order to comply with the requirement to use filter. So it's less typing to just use the JSONAPI-flavored class even if it doesn't do much.

Regarding the name of the filter, I like all (adverb) because it implies search all the (searchable) fields whereas search feels more like a verb usage to me which is anti-REST. But you can use one and I can use the other as it's configurable.

all is also more djangoistic: see __all__ in DRF serializers, so maybe filter[__all__]...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kind of think filter[all] is misleading as it implies that it takes into account all fields for filtering. This is actually not true as it only search in the defined search_fields. filter[search] might also be a bit odd though I agree.

Even though everyone can define the name as they like if we add it to DJA we have to come up with a recommendation. But this doesn't seem to be easy as the json api is silent on it and we are just two users and have two different opinions already 😄

While thinking it seems that adding a JSONAPISearchFilter to DJA doesn't add any value as someone can already use the DRF SearchFilter today without this change and define whatever parameter he likes.

Also in DJA we don't have to reimplement all features which are in DRF but only implement what is different according to the json api specification.

For this reasons I think we shouldn't provide a JSONAPISearchFilter. We might add a reference to the DRF documentation that there are other filter backends though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's call it filter[all-the-things] ;-)

I agree with your reasoning for calling it something that doesn't imply every field. How about filter[fields] or filter[search_fields] as the default?

I'm OK with documenting to just use SearchFilter and set search_param. I just needed a way to make sure when validating filter field names that the one we use for keyword searching doesn't come up as an error. I can just as easily change the code to look for DRF's settings.REST_FRAMEWORK['SEARCH_PARAM'].

I'll drop the code and document example usage of SearchFilter.

Raises ValidationError if not.
"""
for qp in request.query_params.keys():
bracket = qp.find('[')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a regex would be better here such as:

match = re.match(r'filter\[(.*?)\].*', qp)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to care about performance as this code executes for every request. Also the match would be something more complicated since it's searching for all keywords, whether or not they have brackets (e.g. filter[foo], sort, ...). It seems a regex is overkill and detracts from readability.

if qp[:
7] == 'filter[' and qp[-1] == ']' and qp != self.search_param:
key = qp[7:-1].replace(
'.', '__'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as above a regex would make this more readable and the slices obsolete.

Thinking about it again I guess replacing dots with __ wouldn't hurt for users who don't care about this notation as dots are disallowed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, I'm going to stop using yapf. It messed up the line breaks. It should be this:

if qp[:7] == 'filter[' and qp[-1] == ']' and qp != self.search_param:



class JsonApiQueryValidationFilter(JsonApiFilterMixin, BaseFilterBackend):
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite sure how to get this right. Especially when an API designer wants to define additional query params with different naming per view it would also need to be configurable.

All would be possible to do somehow but I kind of think it makes the life more difficult of an api designer and the gain is very little in the default case.

Wouldn't it be good enough just to verify the params in the other filter backend classes whether they exists and leave this filter class out for now?

From a maintenance perspective I would prefer not to provide this for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems inappropriate to verify the params in other classes as they only "own" their own params. Verification is already happening there. For instance, filter[blah] checks to see if blah is a defined filter.
Some of the params are not even backend filters, like include which is elsewhere in the serializer code.

The problem this class addresses is that a client would inadvertently misspell a param and then (if they don't notice that) wonder why the action they requested isn't happening. I'm all for making it easier on the client developer to notice these kinds of errors. If they want to "break" JSON:API and add additional parameters they can:

  1. Not include this backend filter at all and the default silent behavior of not complaining about params that are not recognized is achieved -- which was kind of the reason for the ignore_bad_sort_fields as well; If you really want your API to be silent to all query parameter errors, this is needed.
  2. Extend this backend filter to add additional params to the list by changing the jsonapi_query_keywords attribute (or add a JSON_API_QUERY_KEYWORDS to settings although I've currently left that out to make it difficult to break the standard.)

Needless to say I feel pretty strongly that this is a useful backend filter to help achieve JSON:API compliance for query parameter handling:

"If a server encounters a query parameter that does not follow the naming conventions above, and the server does not know how to process it as a query parameter from this specification, it MUST return 400 Bad Request."

Copy link
Contributor Author

@n2ygk n2ygk Aug 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't address the different naming per view part. I believe the attribute is actually defined for the view. It's certainly declared that way. I'm guessing backend_filters classes are mixed in with the view class? I need to write unit tests for these permutations to confirm.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this would make DJA better comply with JSON API spec which is one of ours goals of DJA.

I think though there is simple no hurry to implement this in DJA and I think we shouldn't rush into implementing this without having a proper understanding how to do it well.
Because once its landed in DJA and we will find that it is not a good fit we will have to go through a deprecation cycle to change it which makes maintaining DJA harder and the code more complex.

Another reason why I think this is hard to implement is that query params can be accessed wherever request is accessible. Query params are not only used in filter backends but also in other parts e.g. pagination is one example but I am sure there are others.
Also it kind of feels odd to add validation into a filter backend which should be there for filtering.
Maybe a middleware might be the better spot but I am not sure and would try to experiment with it first to get a clear understanding.

We could think of an experimental area for this. We haven't done this in the past and would need to rethink how to structure it though.

So why do we not just try it out in our projects (I can also think of one of our projects where it could be helpful)? Once we have it running in production and it is solid we can still consider adding it to DJA. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to finish up this PR, leaving it marked WIP, and then maybe we can make an experimental branch and merge it there? We can always rebase later. I have a working example and test case in my local code but wanted to merge in to the example and test cases here so others can try it and provide feedback.

I would argue that filtering for valid query parameters is a legit use of a filter;-) But this could be done as a view mixin too. I think middleware is too early in the stack as we haven't even gotten to DRF at that point, whereas backend filters are a DRF layer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sliverc FYI - I've closed the PR while I keep working on this. Do we want to create an experiment branch so others can test/contribute?

BTW, using DjangoFilterBackend adds a new dependency for django-filter. That could be a problem. It's not core Django or DRF but is the top-rated filtering package and the one reference from DRF docs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In terms of django-filter. We have the same issue with django-polymorphic. This is marked as optional dependency, we can do the same with django-filter. We just have to make sure that DJA can still be used when this dependency is not installed - meaning we have to be careful with imports.

Actually django-filter has the exact same issue just the other way around as it is also usable without Django REST Framework. All rest_framework related classes are part of the rest_framework module. We could create a django_filters module in our case.

In terms of how to move forward:
I am an advocate of small PRs as they are simpler to review and there quicker to get merged.

What I would suggest how we move forward:

  1. Create a PR for JSONAPIOrderingFilter - this is a fairly straight forward change and can be merged easily
  2. Create a PR for JSONAPIDjangoFilterBackend with its own module approach
  3. Create a PR for JsonApiQueryValidationFilter where we can experiment how this works best. (I will comment on the experimental discussion you started later on)

How does this sound?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sliverc you wrote:

In terms of django-filter. We have the same issue with django-polymorphic. This is marked as optional dependency, we can do the same with django-filter. We just have to make sure that DJA can still be used when this dependency is not installed - meaning we have to be careful with imports.

Actually there are no imports for django-polymorphic. I don't quite understand how this code works:

class PolymorphicSerializerMetaclass(SerializerMetaclass):
w.r.t. django-polymorphic.

Actually django-filter has the exact same issue just the other way around as it is also usable without Django REST Framework. All rest_framework related classes are part of the rest_framework module. We could create a django_filters module in our case.

I don't quite understand what you mean by creating a django_filters module. Are you saying submit this as a PR to django-filter? As it stands right now, I'll either have to wrap the JSONAPIDjangoFilter implementation with a try/except of from django_filters.rest_framework import DjangoFilterBackend or just make django-filter a dependency for DJA. Please advise.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, what I mean is to create a folder django_filters in DJA and add a init,py and backends.py to implement a custom backend. (Hint: Simplified each python file is called a module but more details at https://docs.python.org/3.6/tutorial/modules.html)

If a user of DJA doesn't import this newly created django_filters module or uses it in settings then he doesn't have to install django_filters.

@codecov-io
Copy link

Codecov Report

Merging #455 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@          Coverage Diff           @@
##           master    #455   +/-   ##
======================================
  Coverage    93.1%   93.1%           
======================================
  Files          54      54           
  Lines        3000    3000           
======================================
  Hits         2793    2793           
  Misses        207     207

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3bfff93...12cca8d. Read the comment docs.

@n2ygk
Copy link
Contributor Author

n2ygk commented Aug 12, 2018

closing (for now)

@n2ygk n2ygk closed this Aug 12, 2018
This was referenced Aug 20, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants