Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use a compile time default matcher map #11

Merged
merged 3 commits into from
Jul 16, 2020

Conversation

ririsoft
Copy link
Contributor

@ririsoft ririsoft commented Jul 15, 2020

This change makes the default map of matchers defined at compile time, making Infer::new much cheaper for high workload.

This change should be (please double check) fully compatible as far as SemVer is concerned.

This also allows for further enhancement such having a completely static Infer experience with functions at crate level. Infer::new would only be needed when custom matchers are needed. I will probably submit a PR for this after this one.

Cheers

This change makes the default map of matchers
defined at compile time,
making `Infer::new` much cheaper for high workload.
@paolobarbolini
Copy link
Collaborator

Looks good to me. This would make it easy in the future to put Infer::get_from_path and Infer::add behind a feature flag and make this a no_std, no alloc crate.

@paolobarbolini
Copy link
Collaborator

paolobarbolini commented Jul 15, 2020

Thinking more about it we could make Infer::new a const fn so that it could be used like this:

static INFER: Infer = Infer::new();

@ririsoft
Copy link
Contributor Author

Why not ! I was just writing top level methods (that Infer would use internally) but the static INFER way can make it too.

With top level methods the API would be consumed this way:

// not necessary but this is just to highlight that here infer is an external crate.
extern crate infer;

fn my_func(buf: &[u8]) {
    match infer::get(buf) {
        Some(t) => do_other_stuff(buf),
        None => do_no_type_stuff(buf),
    }
}

With static INFER the API would be consumed this way:

// not necessary but this is just to highlight that here infer is an external crate.
extern crate infer;
use infer::INFER;

fn my_func(buf: &[u8]) {
    match INFER.get(buf) {
        Some(t) => do_other_stuff(buf),
        None => do_no_type_stuff(buf),
    }
}

I am totally fine with both, the later being very straightforward to implement. I could add a commit to this PR if you wish.

@paolobarbolini
Copy link
Collaborator

Let's see what @bojand thinks about it

@ririsoft
Copy link
Contributor Author

ririsoft commented Jul 15, 2020

I am afraid we won't be able to have this on stable rust for now without changing the API and the Matcher type:

function pointers in const fn are unstable
see issue #57563 <https://github.com/rust-lang/rust/issues/57563> for more information
add `#![feature(const_fn)]` to the crate attributes to enable

Forget about this, it works with a trick and without changing the API. I have some code ready for that if you want to play

@paolobarbolini
Copy link
Collaborator

I think I might have a workaround around it. I'll try it in the next few days.

@ririsoft
Copy link
Contributor Author

Sorry I updated my comment at the same time. Indeed there is a workaround. I have an implementation for it. I will submit a PR once we have some feedbacks from @bojand.

src/map.rs Show resolved Hide resolved
MATCHER_MAP
.iter()
.map(|(mt, mime, ext, matcher)| (mt, *mime, *ext, matcher))
.chain(custom)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make more sense to chain MATCHER_MAP to custom iterator? Idea being that if clients add custom matchers, perhaps we should evaluate those first before we evaluate the standard ones. Just a thought.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fully agree but I chose to keep the current behavior (custom matchers were pushed to the vector). Since we all agree I will push a new commit to this PR that does the change, but you will need to bump the Infer release to a major number (0.2 -> 0.3) since this change might affect current users. I will document that in the API.

Copy link
Collaborator

@paolobarbolini paolobarbolini Jul 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll probably have to do a 0.3 release anyway if we add no_std support, but it's probably better to leave it as is for the moment just so we can do one last 0.2 release integrating these changes and any other non breaking changes that might come up before the 0.3 release

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is up to you guys, the commit is ready to be pushed to this PR or another one if you prefer.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right... there is a semver-breaking implication to that change. Yup lets do the release dance and keep the breaking change out of this PR and release this change as a patch version; and then we can introduce the behaviour change in subsequent release where we're introducing API changes anyway.

@bojand
Copy link
Owner

bojand commented Jul 16, 2020

Hello, and thanks for the PR! I really appreciate the improvements and the collaboration 🙏.

I left a couple of minor comments / questions regarding the proposed changes, but otherwise looks good! When we address those we can merge this PR.

First a caveat, I don't do Rust development on a daily basis and context switch is sometimes difficult and I forget some of the mechanics of language or syntax, idioms, etc...

With regards to "top-level" API...

extern crate infer;

fn my_func(buf: &[u8]) {
    match infer::get(buf) {
        Some(t) => do_other_stuff(buf),
        None => do_no_type_stuff(buf),
    }
}

Just so I fully understand... what would the implementation of such API look like in the crate? All functions would be just be exported crate functions that refer to MATCHER_MAP to match. The calls are not tied to an instance of Infer struct at all? And then if a client needs custom matchers they would have to create an instance of Infer and add custom matchers, and within the implementation we would also have the same API that just wraps and calls the exported functions but chained with the iterator for custom matchers? Is that the general idea? Maybe I am missing something, especially around the context of the comment and issue linked in comment above. So I just want to make sure I understand the implementation details.

Purely from aesthetic perspective, perhaps this is a little bit "nicer" than having to initialize a new instance or refer to a static instance? But I think that's subjective or debatable.

Additionally wouldn't that have the (debatably relatively minor) advantage of not having an Infer instance allocated and created potentially at all if the client does not need custom matchers...?

The static instance change is relatively simple, whereas the "top-level-api-wrapped-in-instance" approach seems like a bit more rewriting and rewiring.

I feel either approach is beneficial and good, so I do not feel strongly about either one. Whatever makes sense holistically, and follows established practices is probably good.

@ririsoft
Copy link
Contributor Author

Hello, and thanks for the PR! I really appreciate the improvements and the collaboration 🙏.

Thanks for Infer and your time reviewing those enhancement proposals!

Just so I fully understand... what would the implementation of such API look like in the crate? All functions would be just be exported crate functions that refer to MATCHER_MAP to match. The calls are not tied to an instance of Infer struct at all? And then if a client needs custom matchers they would have to create an instance of Infer and add custom matchers, and within the implementation we would also have the same API that just wraps and calls the exported functions but chained with the iterator for custom matchers? Is that the general idea?

This is exactly the idea.

Purely from aesthetic perspective, perhaps this is a little bit "nicer" than having to initialize a new instance or refer to a static instance? But I think that's subjective or debatable.

This is indeed debatable. I also believe that this is a little bit "nicer" to use: you do not need to use infer::INFER; and then INFER.get(...), you can just do infer::get(...). The added value of having a static infer::INFER is not clear to me, since crate level functions already bring the no_std and no_alloc features (I may be too optimistic for the no_alloc, I need to check, but this can be easily achieved anyway).

Additionally wouldn't that have the (debatably relatively minor) advantage of not having an Infer instance allocated and created potentially at all if the client does not need custom matchers...?

Exactly

The static instance change is relatively simple, whereas the "top-level-api-wrapped-in-instance" approach seems like a bit more rewriting and rewiring.

Yes, but we are talking about couple of minutes here. I would be pleased to do the change since I need it for an internal project of mine.

I feel either approach is beneficial and good, so I do not feel strongly about either one. Whatever makes sense holistically, and follows established practices is probably good.

I would favor the top level functions instead of static INFER: Infer unless it shows additional value since it provides a simpler usage experience.

@paolobarbolini
Copy link
Collaborator

paolobarbolini commented Jul 16, 2020

This is indeed debatable. I also believe that this is a little bit "nicer" to use: you do not need to use infer::INFER; and then INFER.get(...), you can just do infer::get(...). The added value of having a static infer::INFER is not clear to me, since crate level functions already bring the no_std and no_alloc features (I may be too optimistic for the no_alloc, I need to check, but this can be easily achieved anyway).

I guess we could avoid exporting INFER ourselves while still giving users the possibility to do it themselves by making Infer::new a const function.
I completely agree on the fact that having to do infer::INFER.get(&buf) looks ugly, maybe we could declare INFER internally and just re-export every function from it for users who don't need custom matchers?

@ririsoft
Copy link
Contributor Author

I guess we could avoid exporting INFER ourselves while still giving users the possibility to do it themselves by making Infer::new a const function.

I believe this is a great compromise.
Once this PR is merged I will submit a new one with a proposal for you to review

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants