SCTK 30.1.0 detects classpath-exception-2.0 based only on word "classpath" in Java comments #2769
Comments
This makes sense, but https://github.com/nexB/scancode-toolkit/blob/develop/src/licensedcode/data/rules/classpath-exception-2.0_5.RULE has this text. Would you have a file with the problematic detection to link or attach?
Never mind, I can see the issue now. For instance, this C++ snippet from https://github.com/SanDisk-Open-Source/SSD_Dashboard/blob/f0240a983544a86989eec80a9a5210f2b14fa1c1/uefi/gcc/gcc-4.6.3/libjava/gnu/classpath/jdwp/natVMVirtualMachine.cc#L280:
using namespace gnu::classpath::jdwp::exception;
throw new InvalidLocationException ();
is detected by this rule and with
Signed-off-by: Philippe Ombredanne <[email protected]>
This applies the renaming done in the code to the actual rules. Signed-off-by: Philippe Ombredanne <[email protected]>
Rename all match filter functions to use more explicit names. Refactor the function that sets the lines into a LicenseMatch method. Add misc. new and improved license detection rules. Improve the order in which some match filters are processed. For instance, this helps ensure that non-spurious smaller matches are not merged and discarded in short spurious matches too early.
Refine non-continuous matches filter for #2769
Rename filter_if_only_known_words_rule() to filter_non_continuous_matches(). Also rename the "continuous" Rule field to "is_continuous".
Add a new filter_short_matches_scattered_on_too_many_lines() filter. This works by discarding some short matches that are scattered on too many lines to be a correct match.
Improve the overlapping filter for two-token matches that precede or follow longer matches and overlap only on the word "license". In these cases, the two-token matches may be spurious and may be discarded.
Add new and improved license detection rules, and improve existing license metadata.
Improve code formatting and logging.
Move model field comments to help text on the model field definitions, such as License and Rule. This will help generate API documentation later.
Signed-off-by: Philippe Ombredanne <[email protected]>
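The two filters named in this commit message can be illustrated with a rough sketch. This is not the ScanCode implementation: the `Match` class, the `is_continuous()` and `is_scattered()` helpers, the `max_short_length` threshold, and the "more lines than tokens" criterion are all assumptions made for illustration. The first sample match assumes the two-token rule matched the words "classpath" and "exception" in the C++ line quoted above, with "jdwp" sitting between them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Match:
    """Hypothetical, simplified stand-in for a license match."""
    rule_id: str
    positions: List[int]  # token positions matched in the scanned text
    lines: List[int]      # line number of each matched token


def is_continuous(match: Match) -> bool:
    """Return True when every matched token directly follows the previous one."""
    return all(b - a == 1 for a, b in zip(match.positions, match.positions[1:]))


def is_scattered(match: Match, max_short_length: int = 4) -> bool:
    """Return True for a short match that spans more lines than it has tokens.

    Both the threshold and the criterion are assumptions for this sketch.
    """
    if len(match.positions) > max_short_length:
        return False
    line_span = max(match.lines) - min(match.lines) + 1
    return line_span > len(match.positions)


def filter_matches(matches: List[Match]) -> List[Match]:
    """Keep only matches that are continuous and not scattered."""
    return [m for m in matches if is_continuous(m) and not is_scattered(m)]


# Tokens of the quoted C++ line: using(0) namespace(1) gnu(2) classpath(3) jdwp(4) exception(5)
gapped = Match("classpath-exception-2.0_5.RULE", positions=[3, 5], lines=[280, 280])
# Hypothetical match whose two tokens are adjacent but sit several lines apart.
scattered = Match("hypothetical-short.RULE", positions=[7, 8], lines=[1, 5])
# Hypothetical match with two adjacent tokens on a single line.
genuine = Match("hypothetical-notice.RULE", positions=[10, 11], lines=[3, 3])

kept = filter_matches([gapped, scattered, genuine])
print([m.rule_id for m in kept])
```

In this sketch the gapped match is dropped by the continuity check and the scattered one by the line-span check, so only the adjacent single-line match survives.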
Description
SCTK reports a license_score=7.33 for a set of Java files based only on the word "classpath" in comments.
The match details are:
matched_rule_identifier = classpath-exception-2.0_5.RULE
matched_rule_matcher = 2-aho
matched_rule_length = 2
matched_rule_match_coverage = 2
matched_rule_relevance = 11
Since the word "classpath" is likely to appear frequently in Java comments, it would be good to avoid this false positive.
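Until the detection itself is fixed, a scan consumer could post-filter the reported matches. The following is a minimal sketch, assuming each match is available as a dict keyed by the field names listed above (with values as strings, as they would be in a CSV export). The `drop_weak_short_matches()` helper, its thresholds, and the second sample row are hypothetical, and the thresholds would need tuning against real results.

```python
def drop_weak_short_matches(rows, max_rule_length=2, min_score=50.0):
    """Drop matches that come from very short rules AND have a low score.

    Both thresholds are illustrative assumptions, not ScanCode defaults.
    """
    kept = []
    for row in rows:
        is_short = int(row["matched_rule_length"]) <= max_rule_length
        is_weak = float(row["license_score"]) < min_score
        if is_short and is_weak:
            continue
        kept.append(row)
    return kept


rows = [
    # Values taken from the match details reported in this issue.
    {
        "matched_rule_identifier": "classpath-exception-2.0_5.RULE",
        "matched_rule_length": "2",
        "license_score": "7.33",
    },
    # Hypothetical long, high-scoring match that should be kept.
    {
        "matched_rule_identifier": "hypothetical-long.RULE",
        "matched_rule_length": "120",
        "license_score": "100.0",
    },
]

filtered = drop_weak_short_matches(rows)
print([row["matched_rule_identifier"] for row in filtered])
```

Requiring both conditions keeps short but high-scoring matches, such as an exact two-word license tag, while discarding the low-scoring two-token hit described here.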