Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attribute source ranges are not normalized correctly #2067

Closed
jhy opened this issue Nov 28, 2023 · 1 comment
Closed

Attribute source ranges are not normalized correctly #2067

jhy opened this issue Nov 28, 2023 · 1 comment
Assignees
Labels
bug Confirmed bug that we should fix fixed
Milestone

Comments

@jhy
Copy link
Owner

jhy commented Nov 28, 2023

As reported in #1933:


Here's my PR on the Rascal project. It works! But... I have a missing Range on the attributes on the top node.

usethesource/rascal#1893

For testing purposes I was parsing the pom.xml file of the rascal project itself:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>org.rascalmpl</groupId>
    <artifactId>rascal</artifactId>
    <version>0.35.0-RC2-SNAPSHOT</version>
    <packaging>jar</packaging>
  ... SNIP!

The project node has an xsi:schemaLocation attribute.

Here's some output of the readXML function that will print the filename only if the attribute sources ranges are -1:

rascal>import IO;
ok
rascal>import lang::xml::IO;
ok
rascal>iprintln(readXML(|cwd:///pom.xml|, trackOrigins=true), lineLimit=-1)
"#document"(
  "project"(
    "modelversion"(
      "4.0.0",
      src=|cwd:///pom.xml|(209,14,<2,4>,<2,18>)),
    "groupid"(
      "org.rascalmpl",
      src=|cwd:///pom.xml|(249,9,<4,4>,<4,13>)),
    "artifactid"(
      "rascal",
      src=|cwd:///pom.xml|(286,12,<5,4>,<5,16>)),
    "version"(
      "0.35.0-RC2-SNAPSHOT",
      src=|cwd:///pom.xml|(322,9,<6,4>,<6,13>)),
    "packaging"(
      "jar",
      src=|cwd:///pom.xml|(365,11,<7,4>,<7,15>)),
    "scm"(
      "developerconnection"(
        "scm:git:ssh://[email protected]/usethesource/rascal.git",
        src=|cwd:///pom.xml|(411,21,<10,8>,<10,29>)),
      "tag"(
        "SNAPSHOT",
        src=|cwd:///pom.xml|(515,5,<11,8>,<11,13>)),
      src=|cwd:///pom.xml|(397,5,<9,4>,<9,9>)),
    "repositories"(
      ... SNIP!
      "dependency"(
        "groupid"(
          "com.ibm.icu",
          src=|cwd:///pom.xml|(18901,9,<441,12>,<441,21>)),
        "artifactid"(
          "icu4j",
          src=|cwd:///pom.xml|(18944,12,<442,12>,<442,24>)),
        "version"(
          "69.1",
          src=|cwd:///pom.xml|(18987,9,<443,12>,<443,21>)),
        src=|cwd:///pom.xml|(18876,12,<440,8>,<440,20>)),
      src=|cwd:///pom.xml|(15755,14,<351,4>,<351,18>)),
    src=|cwd:///pom.xml|(0,204,<1,0>,<1,204>),
    srcs=("schemalocation":|cwd:///pom.xml|),
    schemalocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"),
  src=|cwd:///pom.xml|)

Here we see the srcs map saying: ("schemalocation":|cwd:///pom.xml|), which is caused by the AttributeRange.valueRange() having a begin offset of -1. I tracked this with the Java debugger. Is this supposed to happen? Or is the top-node somehow different?

Originally posted by @jurgenvinju in #1933 (comment)

@jhy
Copy link
Owner Author

jhy commented Nov 28, 2023

The issue is that the attribute source ranges are not being normalized per the settings of the parser. So in the above example, with attribute xsi:schemaLocation with preserve case=off, the range is tracked in the key xsi:schemaLocation, but the actual attribute is normalized to xsi:schemalocation.

https://github.com/usethesource/rascal/blob/16fe28156c7c57305c7c4f4702cba5d9ce93cb58/src/org/rascalmpl/library/lang/xml/IO.java#L92C5-L95

Fix is to normalize the Range keys when normalizing the Attribute names.

@jhy jhy added this to the 1.17.2 milestone Nov 28, 2023
@jhy jhy self-assigned this Nov 28, 2023
@jhy jhy added bug Confirmed bug that we should fix fixed labels Nov 28, 2023
@jhy jhy closed this as completed in fc4b175 Nov 28, 2023
jhy added a commit that referenced this issue Nov 28, 2023
Dedupes and normalizes attribute source range as we go.

Ensures that the tracked range is for the correct attribute.

Further fixes #2067
benkard added a commit to benkard/mulkcms2 that referenced this issue Dec 29, 2023
This MR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [flow-bin](https://github.com/flowtype/flow-bin) ([changelog](https://github.com/facebook/flow/blob/master/Changelog.md)) | devDependencies | minor | [`^0.224.0` -> `^0.225.0`](https://renovatebot.com/diffs/npm/flow-bin/0.224.0/0.225.1) |
| [org.liquibase.ext:liquibase-hibernate5](https://github.com/liquibase/liquibase-hibernate/wiki) ([source](https://github.com/liquibase/liquibase-hibernate)) | build | patch | `4.25.0` -> `4.25.1` |
| [org.liquibase:liquibase-maven-plugin](http://www.liquibase.org/liquibase-maven-plugin) ([source](https://github.com/liquibase/liquibase)) | build | patch | `4.25.0` -> `4.25.1` |
| [org.jsoup:jsoup](https://jsoup.org/) ([source](https://github.com/jhy/jsoup)) | compile | patch | `1.17.1` -> `1.17.2` |
| [io.hypersistence:hypersistence-utils-hibernate-62](https://github.com/vladmihalcea/hypersistence-utils) | compile | minor | `3.6.1` -> `3.7.0` |
| [io.quarkus:quarkus-maven-plugin](https://github.com/quarkusio/quarkus) | build | patch | `3.6.3` -> `3.6.4` |
| [io.quarkus:quarkus-universe-bom](https://github.com/quarkusio/quarkus-platform) | import | patch | `3.6.3` -> `3.6.4` |
| [org.apache.maven.plugins:maven-compiler-plugin](https://maven.apache.org/plugins/) | build | minor | `3.11.0` -> `3.12.1` |

---

### Release Notes

<details>
<summary>flowtype/flow-bin</summary>

### [`v0.225.1`](flow/flow-bin@62a43fb...23ec616)

[Compare Source](flow/flow-bin@62a43fb...23ec616)

### [`v0.225.0`](flow/flow-bin@e6104a1...62a43fb)

[Compare Source](flow/flow-bin@e6104a1...62a43fb)

</details>

<details>
<summary>liquibase/liquibase</summary>

### [`v4.25.1`](https://github.com/liquibase/liquibase/blob/HEAD/changelog.txt#Liquibase-4251-is-a-patch-release)

[Compare Source](liquibase/liquibase@v4.25.0...v4.25.1)

</details>

<details>
<summary>jhy/jsoup</summary>

### [`v1.17.2`](https://github.com/jhy/jsoup/blob/HEAD/CHANGES.md#&#8203;1172-2023-Dec-29)

##### Improvements

-   **Attribute object accessors**: Added `Element.attribute(String)` and `Attributes.attribute(String)` to more simply
    obtain an `Attribute` object. [2069](jhy/jsoup#2069)
-   **Attribute source tracking**: If source tracking is on, and an Attribute's key is changed (
    via `Attribute.setKey(String)`), the source range is now still tracked
    in `Attribute.sourceRange()`. [2070](jhy/jsoup#2070)
-   **Wildcard attribute selector**: Added support for the `[*]` element with any attribute selector. And also restored
    support for selecting by an empty attribute name prefix (`[^]`). [2079](jhy/jsoup#2079)

##### Bug Fixes

-   **Mixed-cased source position**: When tracking the source position of attributes, if the source attribute name was
    mix-cased but the parser was lower-case normalizing attribute names, the source position for that attribute was not
    tracked correctly. [2067](jhy/jsoup#2067)
-   **Source position NPE**: When tracking the source position of a body fragment parse, a null pointer
    exception was thrown. [2068](jhy/jsoup#2068)
-   **Multi-point emoji entity**: A multi-point encoded emoji entity may be incorrectly decoded to the replacement
    character. [2074](jhy/jsoup#2074)
-   **Selector sub-expressions**: (Regression) in a selector like `parent [attr=va], other`, the `, OR` was binding
    to `[attr=va]` instead of `parent [attr=va]`, causing incorrect selections. The fix includes a EvaluatorDebug class
    that generates a sexpr to represent the query, allowing simpler and more thorough query parse
    tests. [2073](jhy/jsoup#2073)
-   **XML CData output**: When generating XML-syntax output from parsed HTML, script nodes containing (pseudo) CData
    sections would have an extraneous CData section added, causing script execution errors. Now, the data content is
    emitted in a HTML/XML/XHTML polyglot format, if the data is not already within a CData
    section. [2078](jhy/jsoup#2078)
-   **Thread safety**: The `:has` evaluator held a non-thread-safe Iterator, and so if an Evaluator object was
    shared across multiple concurrent threads, a NoSuchElement exception may be thrown, and the selected results may be
    incorrect. Now, the iterator object is a thread-local. [2088](jhy/jsoup#2088)

***

Older changes for versions 0.1.1 (2010-Jan-31) through 1.17.1 (2023-Nov-27) may be found in
[change-archive.txt](./change-archive.txt).

</details>

<details>
<summary>vladmihalcea/hypersistence-utils</summary>

### [`v3.7.0`](https://github.com/vladmihalcea/hypersistence-utils/blob/HEAD/changelog.txt#Version-370---December-18-2023)

\================================================================================

Oracle Interval Type does not support negative intervals [#&#8203;682](vladmihalcea/hypersistence-utils#682)

Return original object if target and original are the same when merging [#&#8203;677](vladmihalcea/hypersistence-utils#677)

Add a hypersistence-utils-hibernate-63 module for Hibernate 6.3 [#&#8203;657](vladmihalcea/hypersistence-utils#657)

</details>

<details>
<summary>quarkusio/quarkus</summary>

### [`v3.6.4`](https://github.com/quarkusio/quarkus/releases/tag/3.6.4)

[Compare Source](quarkusio/quarkus@3.6.3...3.6.4)

##### Complete changelog

-   [#&#8203;37808](quarkusio/quarkus#37808) - CLI - Rework how missing commands are detected
-   [#&#8203;37803](quarkusio/quarkus#37803) - Dev mode: add null checks to TimestampSet.isRestartNeeded()
-   [#&#8203;37798](quarkusio/quarkus#37798) - Only update ~/.docker/config.json if it exists
-   [#&#8203;37787](quarkusio/quarkus#37787) - Take priority into account in ConfigurationImpl
-   [#&#8203;37775](quarkusio/quarkus#37775) - Docs: fix typo in rabbitmq reference documentation
-   [#&#8203;37770](quarkusio/quarkus#37770) - Add SequencedCollection to BANNED_INTERFACE_TYPES
-   [#&#8203;37768](quarkusio/quarkus#37768) - Running application build with JDK21 and target Java 17 crash with NoClassDefFoundError: java/util/SequencedCollection
-   [#&#8203;37731](quarkusio/quarkus#37731) - Query logging is being done in io.quarkus.mongodb.panache.common.runtime.MongoOperations
-   [#&#8203;37723](quarkusio/quarkus#37723) - Do not use CSRF cookie as the next token value
-   [#&#8203;37717](quarkusio/quarkus#37717) - Docs: Fix incorrect link reference in Cross-Site Request Forgery Prevention guide
-   [#&#8203;37714](quarkusio/quarkus#37714) - Remove the driver property in the documentation for Cloud SQL
-   [#&#8203;37710](quarkusio/quarkus#37710) - Use NoStackTraceException in metrics
-   [#&#8203;37677](quarkusio/quarkus#37677) - Bump io.quarkus:quarkus-platform-bom-maven-plugin from 0.0.100 to 0.0.101
-   [#&#8203;37654](quarkusio/quarkus#37654) - Make sure dev mode is properly written in doc
-   [#&#8203;36848](quarkusio/quarkus#36848) - CSRF Token with HMAC signature gets double signed

</details>

<details>
<summary>quarkusio/quarkus-platform</summary>

### [`v3.6.4`](quarkusio/quarkus-platform@3.6.3...3.6.4)

[Compare Source](quarkusio/quarkus-platform@3.6.3...3.6.4)

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Enabled.

♻ **Rebasing**: Whenever MR is behind base branch, or you tick the rebase/retry checkbox.

👻 **Immortal**: This MR will be recreated if closed unmerged. Get [config help](https://github.com/renovatebot/renovate/discussions) if that's undesired.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this MR, check this box

---

This MR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNC4yNC4wIiwidXBkYXRlZEluVmVyIjoiMzQuMjQuMCJ9-->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Confirmed bug that we should fix fixed
Projects
None yet
Development

No branches or pull requests

1 participant