Support exclusion file filters #826

Merged 18 commits on Aug 12, 2024

Conversation

srikary12
Contributor

@srikary12 srikary12 commented Jun 19, 2024

Overview

Support exclude file filter in user search queries

Details

  • All of the exclude file filter terms need to be satisfied
  • Any one of the include file filter terms should be satisfied

Example

  • Search Query: what happened yesterday? -file:"tasks.org" -file:"work.md" file:"diary.org" file:"journal.org"
  • Behavior: The query will try to find relevant notes in either journal.org or diary.org, and never in tasks.org or work.md
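
The semantics described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the names `parse_file_filters` and `file_matches`, the two regular expressions, and the use of `fnmatch` for matching are all assumptions made for the example.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical patterns: an include term is file:"..." NOT preceded by "-",
# an exclude term is -file:"...".
INCLUDE_RE = re.compile(r'(?<!-)file:"(.+?)"')
EXCLUDE_RE = re.compile(r'-file:"(.+?)"')


def parse_file_filters(query):
    """Return (include_terms, exclude_terms) found in the query string."""
    return INCLUDE_RE.findall(query), EXCLUDE_RE.findall(query)


def file_matches(path, include, exclude):
    """Apply the filter semantics to a single file path.

    - Every exclude term must be satisfied: the path may match none of them.
    - At least one include term must match, if any include terms were given.
    """
    if any(fnmatchcase(path, term) for term in exclude):
        return False
    return not include or any(fnmatchcase(path, term) for term in include)


query = 'what happened yesterday? -file:"tasks.org" -file:"work.md" file:"diary.org" file:"journal.org"'
include, exclude = parse_file_filters(query)
for path in ["diary.org", "journal.org", "tasks.org", "work.md"]:
    print(path, file_matches(path, include, exclude))
```

With the example query, only diary.org and journal.org pass the filter; tasks.org and work.md are rejected by the exclude terms.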

Closes #728

@debanjum debanjum changed the title Feature: Added file filters Support exclusion file filters Jun 19, 2024
Member

@debanjum debanjum left a comment

Hey @srikary12, I just left one comment (to add tests), but otherwise the changes look good! Excited to get file filters to parity with word filters.

src/khoj/search_filter/file_filter.py (outdated, resolved)
@srikary12
Contributor Author

@debanjum Can you check and let me know if it's working fine?

@srikary12 srikary12 marked this pull request as draft June 24, 2024 13:42
file filter test cases
@srikary12 srikary12 marked this pull request as ready for review June 24, 2024 13:44
@srikary12
Contributor Author

@debanjum @sabaimran let me know if there are any issues

@srikary12
Contributor Author

@debanjum added the case

Member

@debanjum debanjum left a comment

Thanks for working through the feedback! I've left some comments to fix the unit tests. It may be a good idea to manually validate these changes locally and/or run the unit tests on your machine with pytest. It'll speed up the review cycle.

tests/test_file_filter.py (outdated, resolved)
tests/test_file_filter.py (outdated, resolved)
tests/test_file_filter.py (outdated, resolved)
tests/test_file_filter.py (outdated, resolved)
@srikary12
Contributor Author

Sure, I'll be working on it.

Thanks.

@srikary12 srikary12 marked this pull request as draft July 8, 2024 07:08
@srikary12 srikary12 marked this pull request as ready for review July 20, 2024 05:34
@srikary12
Contributor Author

srikary12 commented Jul 20, 2024

@debanjum @sabaimran Fixed the tests. Sorry for the delay.

@srikary12
Contributor Author

test_notes_search_with_only_filters failed, I'm not sure if it's related to this PR.

Previously we were applying an "Or" filter, which would exclude any
file mentioned in a query with multiple exclude file filters.

This is not what we naturally mean when we ask to exclude a file in a
query.
- Arrange tests in approximately simpler to complex order
- Name tests for `can_filter' with `test_can_filter_' prefix
- Name tests for `get_filter_terms' with `test_get_*' prefix
- Add single file (include?) and exclude file filter tests
@debanjum
Member

Hey @srikary12, I've tested the PR and pushed some changes. These changes are good to merge!

> test_notes_search_with_only_filters failed, I'm not sure if it's related to this PR.

The test failure was caused by the changes made in this PR (the Django ORM expected `.*.org` but the query had `*.org`). I've fixed the regex escape logic (and improved the exclusion filter logic).
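
For illustration, one way to turn a wildcard filter term such as `*.org` into a regex is sketched below. This is an assumption-laden example, not the fix pushed to this PR: the helper name `glob_to_regex` is hypothetical, and unlike the `.*.org` form mentioned above it also escapes the literal dot, so the result is slightly stricter.

```python
import re


def glob_to_regex(term):
    """Convert a simple wildcard term into a regex string (hypothetical helper).

    All regex metacharacters are escaped first, then the escaped "*"
    wildcard is turned back into ".*" so it matches any run of characters.
    """
    return re.escape(term).replace(r"\*", ".*")


pattern = glob_to_regex("*.org")
print(pattern)
print(bool(re.fullmatch(pattern, "notes/diary.org")))
print(bool(re.fullmatch(pattern, "notes/diary.md")))
```

Escaping before substituting the wildcard keeps characters like `.` or `+` in a file name from being read as regex operators, while a bare `*` at the start of a pattern (which is not a valid regex on its own) becomes `.*`.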

@debanjum debanjum self-requested a review August 12, 2024 11:33
@debanjum debanjum merged commit 05c0aa3 into khoj-ai:master Aug 12, 2024
6 checks passed
Development

Successfully merging this pull request may close these issues.

[IDEA] Support exclusion file filters
2 participants