sort-triples: Add option to sort triples as numbers not as strings #380

Closed
TobiasNx opened this issue Jul 12, 2021 · 10 comments · Fixed by #409

@TobiasNx
Contributor

If you run count-triples and then sort-triples(by="OBJECT"), the values are sorted as strings. After counting, OBJECT holds the count as a string.

E.g., the following list (formatted with template("${o}\t${s}")):

...
24      metadata.mods.relatedItem.typeOfResource.usage
248     metadata.mods.name.role.roleTerm.value
27      metadata.mods.subject.type
29      metadata.mods.abstract.value
29      metadata.mods.name.affiliation.value
3       metadata.mods.originInfo.dateModified.encoding
3       metadata.mods.originInfo.dateModified.value
30      metadata.mods.relatedItem.abstract.altRepGroup
320     metadata.mods.name.role.roleTerm.authority
320     metadata.mods.name.role.roleTerm.type
322     metadata.mods.name.type
...

It would be great if triples could also be sorted as numbers, not just as strings.
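To illustrate the difference, here is a minimal Java sketch that sorts some of the counts from the list above once as strings and once as integers. The class and method names are made up for this example; this is not Metafacture code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class StringVsNumericSort {

    // Sorts a copy of the values by plain string comparison (character by character).
    static List<String> sortAsStrings(List<String> values) {
        List<String> sorted = new ArrayList<>(values);
        sorted.sort(Comparator.naturalOrder());
        return sorted;
    }

    // Sorts a copy of the values by the integer each string represents.
    static List<String> sortAsNumbers(List<String> values) {
        List<String> sorted = new ArrayList<>(values);
        sorted.sort(Comparator.comparingInt(Integer::parseInt));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> counts = List.of("320", "3", "27", "248", "30", "24", "29", "322");
        // String order: "248" comes before "27" because '4' < '7' at the second character.
        System.out.println(sortAsStrings(counts));
        // Numeric order: 3, 24, 27, 29, 30, 248, 320, 322.
        System.out.println(sortAsNumbers(counts));
    }
}
```

The string variant reproduces the order shown in the list above; the numeric variant is the requested behaviour.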

@dr0i
Member

dr0i commented Sep 13, 2021

Have you tried:

sort-triples(by="OBJECT", order="DECREASING")

@dr0i dr0i assigned TobiasNx and unassigned dr0i Sep 13, 2021
@TobiasNx
Contributor Author

Just tried it. Unfortunately it does not work; same problem. It sorts the values as string literals, not as integers.

24      metadata.mods:mods.mods:relatedItem.mods:typeOfResource.manuscript
24      metadata.mods:mods.mods:relatedItem.mods:typeOfResource.usage
22      metadata.mods:mods.mods:abstract.altFormat
22      metadata.mods:mods.mods:abstract.altRepGroup
22      metadata.mods:mods.mods:abstract.contentType
1629    _id
1629    header.datestamp.value
1629    header.identifier.value
1575    header.status
15      metadata.mods:mods.mods:relatedItem.mods:abstract.altFormat
15      metadata.mods:mods.mods:relatedItem.mods:abstract.altRepGroup
15      metadata.mods:mods.mods:relatedItem.mods:abstract.contentType
14      metadata.mods:mods.mods:subject.mods:topic.authority
14      metadata.mods:mods.mods:subject.mods:topic.authorityURI
14      metadata.mods:mods.mods:subject.mods:topic.valueURI
1       metadata.mods:mods.mods:classification.usage
1       metadata.mods:mods.mods:genre.usage
1       metadata.mods:mods.mods:language.usage
1       metadata.mods:mods.mods:name.mods:nameIdentifier.invalid
1       metadata.mods:mods.mods:name.usage

@dr0i dr0i assigned dr0i and unassigned TobiasNx Sep 13, 2021
@dr0i
Member

dr0i commented Sep 13, 2021

This should already be implemented, see #43, but it does not seem to work. A test is also missing.

@TobiasNx
Contributor Author

TobiasNx commented Sep 14, 2021

The decreasing order itself is working; what is not working is the kind of comparison. The values are sorted in decreasing order as alphanumeric strings, not as integers.

Decreasing as alphanumeric:

Is:

366
34
3
26
2444
2222555
113
19
1

Decreasing as integers:

Should:

2222555
2444
366
113
34
26
19
3
1
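The two orderings can be reproduced with a small Java sketch, using the values from the "Is" list. The names are hypothetical, and the sketch only mirrors the comparison logic, not how sort-triples is implemented.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class DecreasingSort {

    // Decreasing order by string comparison: the behaviour reported as "Is".
    static List<String> decreasingAsStrings(List<String> values) {
        List<String> sorted = new ArrayList<>(values);
        sorted.sort(Comparator.reverseOrder());
        return sorted;
    }

    // Decreasing order by integer value: the behaviour requested as "Should".
    static List<String> decreasingAsNumbers(List<String> values) {
        List<String> sorted = new ArrayList<>(values);
        Comparator<String> byValue = Comparator.comparingInt(Integer::parseInt);
        sorted.sort(byValue.reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<String> values =
                List.of("1", "2444", "19", "366", "3", "2222555", "26", "113", "34");
        System.out.println(decreasingAsStrings(values));
        System.out.println(decreasingAsNumbers(values));
    }
}
```

Both variants reverse the order correctly; they differ only in whether the key is compared as text or as a number.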

@blackwinter
Member

Should be addressed by #409 (sort-triples(by="OBJECT",numeric=true)). Can you confirm?

@blackwinter blackwinter assigned TobiasNx and unassigned dr0i Oct 14, 2021
@TobiasNx
Contributor Author

Unfortunately it does not. Also, I have not seen an option value without quotation marks in Metafacture Flux, not even a boolean such as numeric=true.

Tried it with:
https://raw.githubusercontent.com/TobiasNx/notWorkingFlux/main/sortTripplesNumeric/json-api-structure.flux

Error response:

Exception in thread "main" org.metafacture.flux.FluxParseException: Variable true not assigned.
        at org.metafacture.flux.parser.FlowBuilder.exp(FlowBuilder.java:604)
        at org.metafacture.flux.parser.FlowBuilder.arg(FlowBuilder.java:775)
        at org.metafacture.flux.parser.FlowBuilder.pipe(FlowBuilder.java:718)
        at org.metafacture.flux.parser.FlowBuilder.flowtail(FlowBuilder.java:514)
        at org.metafacture.flux.parser.FlowBuilder.flow(FlowBuilder.java:226)
        at org.metafacture.flux.parser.FlowBuilder.flux(FlowBuilder.java:122)
        at org.metafacture.flux.FluxCompiler.compileFlow(FluxCompiler.java:56)
        at org.metafacture.flux.FluxCompiler.compile(FluxCompiler.java:44)
        at org.metafacture.runner.Flux.main(Flux.java:78)

When using: | sort-triples(By="SUBJECT",numeric="TRUE")

Exception in thread "main" java.lang.NumberFormatException: For input string: "_index"
        at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
        at java.base/java.lang.Integer.parseInt(Integer.java:652)
        at java.base/java.lang.Integer.valueOf(Integer.java:983)
        at java.base/java.util.function.Function.lambda$andThen$1(Function.java:88)
        at org.metafacture.triples.AbstractTripleSort.lambda$createComparator$2(AbstractTripleSort.java:216)
        at java.base/java.util.TimSort.countRunAndMakeAscending(TimSort.java:355)
        at java.base/java.util.TimSort.sort(TimSort.java:234)
        at java.base/java.util.Arrays.sort(Arrays.java:1515)
        at java.base/java.util.ArrayList.sort(ArrayList.java:1750)
        at java.base/java.util.Collections.sort(Collections.java:179)
        at org.metafacture.triples.AbstractTripleSort.onCloseStream(AbstractTripleSort.java:137)
        at org.metafacture.framework.helpers.DefaultSender.closeStream(DefaultSender.java:68)
        at org.metafacture.framework.helpers.DefaultSender.closeStream(DefaultSender.java:70)
        at org.metafacture.framework.helpers.DefaultSender.closeStream(DefaultSender.java:70)
        at org.metafacture.metamorph.Metamorph.closeStream(Metamorph.java:321)
        at org.metafacture.framework.helpers.DefaultSender.closeStream(DefaultSender.java:70)
        at org.metafacture.framework.helpers.DefaultSender.closeStream(DefaultSender.java:70)
        at org.metafacture.framework.helpers.DefaultSender.closeStream(DefaultSender.java:70)
        at org.metafacture.framework.helpers.DefaultSender.closeStream(DefaultSender.java:70)
        at org.metafacture.framework.helpers.DefaultSender.closeStream(DefaultSender.java:70)
        at org.metafacture.flux.parser.Flow.close(Flow.java:122)
        at org.metafacture.flux.parser.FluxProgramm.start(FluxProgramm.java:164)
        at org.metafacture.runner.Flux.main(Flux.java:78)
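The second stack trace shows Integer.valueOf being applied to the sort key through Function.andThen. The following Java sketch shows how a comparator built that way fails as soon as the key is not numeric. The Row class and the numericBy method are hypothetical stand-ins and are not taken from AbstractTripleSort.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;

public class NumericKeyComparator {

    // Hypothetical stand-in for a triple; not Metafacture's Triple class.
    static final class Row {
        private final String subject;
        private final String object;

        Row(String subject, String object) {
            this.subject = subject;
            this.object = object;
        }

        String subject() { return subject; }
        String object() { return object; }
    }

    // Builds a comparator that parses the extracted key as an integer before comparing.
    static Comparator<Row> numericBy(Function<Row, String> key) {
        Function<Row, Integer> numericKey = key.andThen(Integer::valueOf);
        return Comparator.comparing(numericKey);
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row("_index", "12"));
        rows.add(new Row("_id", "3"));

        // The object holds the count, so parsing it as an integer succeeds.
        rows.sort(numericBy(Row::object));

        // The subject holds a field name, so parsing it throws.
        try {
            rows.sort(numericBy(Row::subject));
        } catch (NumberFormatException e) {
            System.out.println("Cannot sort numerically: " + e.getMessage());
        }
    }
}
```

In this sketch the exception depends only on which field is chosen as the sort key, not on the sort order.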

@blackwinter
Member

Also, I have not seen an option value without quotation marks in Metafacture Flux, not even a boolean such as numeric=true.

Um, sorry, I'm not that well versed in Flux ;)

Exception in thread "main" java.lang.NumberFormatException: For input string: "_index"

What does your input look like? Are you sure you're sorting on the right field?

@blackwinter
Member

The counts from count-triples are in the OBJECT, aren't they?

@TobiasNx
Contributor Author

+1
You are right: | sort-triples(By="object",numeric="TRUE") works.

I didn't think about the reordering done by the template command. The error came from the impossible task of sorting field names (letters) as if they were numbers.

Also | sort-triples(By="object",numeric="TRUE",order="DECREASING") works. That is great.

https://github.com/TobiasNx/notWorkingFlux/blob/52440eec4ecde1a3108f785bf6e7a2ec75b6eab6/sortTripplesNumeric/json-api-structure.flux

@TobiasNx TobiasNx assigned blackwinter and unassigned TobiasNx Oct 15, 2021
@blackwinter
Member

Great, thanks.

dr0i added a commit that referenced this issue Oct 15, 2021
Will be used to update the flux-commands.md.

See #380.
@dr0i dr0i closed this as completed in 5a0d393 Oct 15, 2021
@dr0i dr0i mentioned this issue Nov 2, 2021
blackwinter added a commit that referenced this issue Dec 13, 2024
Ignore old value's path in `copy_field` Fix function.