
This issue was moved to a discussion.

Support units in Table Schema #537

Closed
rufuspollock opened this issue Oct 3, 2017 · 22 comments

Comments

@rufuspollock
Contributor

Move the units draft spec http://specs.okfnlabs.org/units/ back to FD specs

/cc @pwalsh

@pwalsh
Member

pwalsh commented Oct 9, 2017

@rufuspollock before we move it here, it would be good to discuss how it is to be implemented. Is the proposal to implement this spec, as is, as part of Table Schema?

@pwalsh pwalsh added this to the v1.1 milestone Oct 9, 2017
@rufuspollock
Contributor Author

@pwalsh open to suggestions. I was thinking of keeping it separate but adding support for referencing it from table schema but not sure what is best.

@Stephen-Gates
Contributor

I've drafted a pattern #607 and started a discussion on the forum https://discuss.okfn.org/t/table-schema-units-pattern/6573

@rufuspollock
Contributor Author

Notes from thread in the PR #607

@Stephen-Gates wrote:

@rufuspollock I put this up to stimulate conversation. I guess we need to resolve if we adopt the suggestion on the forum about using an existing specification UCUM, ISO 80000, BIPM.

@rufuspollock wrote back

@Stephen-Gates have you looked at any of those specs in detail? UCUM seems pretty heavy duty - is there a subset of that we could adopt / summarize?

We really just want something relatively simple that fits the 80/20 rule ...

Re ISO 80000 - is that open? If not that would be an issue for us ...

The BIPM stuff looks good though very physically oriented and fundamental.

/cc @dr-shorthair

@dr-shorthair wrote

UCUM may look heavy duty (though I don't think it is, really). But it provides an approach to build any unit-of-measure-symbol from the atomic elements.

This gets over the problem that any static set will come up short. A static list might look like 80-20, but what do you say to the people that need one of the 20 (which is still a lot of applications!).

@rufuspollock
Contributor Author

@Stephen-Gates (/cc @dr-shorthair) I think our aim is to see if we can extract a subset of UCUM that gives us 80/20 and then say if you want more go to UCUM.

I have to say I think this could / should go in 2 stages:

  1. Move the spec back here (as it is - even if flawed)
  2. Then start upgrading it ... (this can be a new issue ...)

This keeps things clean. wdyt? And if so would that mean we could merge #607 as it is?

@rufuspollock
Contributor Author

@Stephen-Gates any luck here on progressing this?

@Stephen-Gates
Contributor

Sorry, been focussed on a new release of Data Curator. Haven’t forgotten.

@lwinfree
Member

Hi @Stephen-Gates (and also @rufuspollock + @pwalsh)! I wanted to let you know that @mbomhoff from Planet Microbe has been working with data packages for their oceanographic data and has been thinking about what units specs would work best for them. I wanted to tag Matt so he can keep updated on the specs units conversation, and also intro y'all in case you want to connect and discuss what units ideas Matt has. Thanks both 😄

@mbomhoff

mbomhoff commented Aug 20, 2019

@lwinfree Thanks Lilly! Our data packages are in https://github.com/hurwitzlab/planet-microbe-datapackages. For the time being we added a custom property unitRdfType.

@Stephen-Gates
Contributor

@mbomhoff are you ok with the direction the draft PR was taking if we address the comments above?

@Stephen-Gates
Contributor

@rufuspollock do you have any concerns about using UCUM given its licence?

@dr-shorthair

Item 2. in the license is problematic:

Users shall not modify the Licensed Materials and may not distribute modified versions of the UCUM table (regardless of format) or UCUM Specification. Users shall not modify any existing contents, fields, description, or comments of the Licensed Materials, and may not add any new contents to it.

Unfortunately UCUM now appears to be an infrastructure orphan - I've not been able to make contact with Guenther Schadow for a couple of years now. Possibly retired. I'll try again.

@mbomhoff

mbomhoff commented Aug 21, 2019

@Stephen-Gates It looks like the draft spec is capable of describing all of the units that we use in our project, but I think our application falls under the case of using an existing spec. One of the goals of our project is to use ontologies to unify disparate datasets from various sources. To describe a field we supply an Environment Ontology (ENVO, http://environmentontology.org/) purl in the rdfType property and a Unit Ontology (UO, https://github.com/bio-ontology-research-group/unit-ontology) purl in a custom unitRdfType property. If the proposal is adopted we would add the unit property for consistency but most likely keep our existing rdfType and unitRdfType properties. For example, to describe the "depth" field we would use:

rdfType: "http://purl.obolibrary.org/obo/ENVO_3100031",
unitRdfType: "http://www.ontobee.org/ontology/UO?iri=http://purl.obolibrary.org/obo/UO_0000008",
unit: "m"

For us the UO purl provides stronger semantics and some additional info such as aliases (meter, metre) and a text description.
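
As a rough illustration of the snippet above, here is a minimal Python sketch of a full field descriptor carrying all three properties. `rdfType` is a standard Table Schema field property; `unitRdfType` is the custom property described in this comment and `unit` is the proposed one, so neither is part of the published spec. The `type` value and the `describe_unit` helper are assumptions made for the example.

```python
import json

# Field descriptor for "depth". `unitRdfType` (custom) and `unit` (proposed)
# are taken from this thread and are not part of the published Table Schema.
depth_field = {
    "name": "depth",
    "type": "number",  # assumed; the comment does not state the field type
    "rdfType": "http://purl.obolibrary.org/obo/ENVO_3100031",
    "unitRdfType": "http://www.ontobee.org/ontology/UO?iri=http://purl.obolibrary.org/obo/UO_0000008",
    "unit": "m",
}


def describe_unit(field):
    """Return (unit symbol, unit ontology IRI); either may be None if absent."""
    return field.get("unit"), field.get("unitRdfType")


print(json.dumps(depth_field, indent=2))
print(describe_unit(depth_field))
```

A consumer that only understands the short `unit` symbol can ignore `unitRdfType`, while ontology-aware tools can follow the IRI for aliases and descriptions.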

@rufuspollock
Contributor Author

@Stephen-Gates any chance to look at this further. It sounds like we have to steer around UCUM atm.

@dr-shorthair

UCUM only provides the terminal symbols, and a grammar to combine them into any UoM. So it is a mistake to talk about the 80:20 provided by UCUM with respect to some finite set. UCUM probably provides 95%+, but by utilising the grammar.
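
To make the "atomic symbols plus a grammar" idea concrete, here is a toy Python sketch. It is not UCUM and does not implement its specification: the `ATOMS` set, the `.` and `/` operators, the trailing integer exponents, and the left-to-right rule where `/` applies only to the next term are all simplifying assumptions chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical atomic symbols; a real vocabulary would be far larger.
ATOMS = {"m", "s", "kg", "mol", "L"}

# One term: an atomic symbol followed by an optional integer exponent, e.g. "s2".
TERM = re.compile(r"([A-Za-z]+)(-?\d+)?")


def parse_unit(expr):
    """Parse e.g. 'kg.m/s2' into a mapping of atom -> exponent."""
    exponents = Counter()
    op = "."  # operator that precedes the current term
    for part in re.split(r"([./])", expr):
        if part in (".", "/"):
            op = part
            continue
        match = TERM.fullmatch(part)
        if not match or match.group(1) not in ATOMS:
            raise ValueError(f"unknown unit term: {part!r}")
        power = int(match.group(2) or 1)
        # '/' negates the exponent of the term that follows it
        exponents[match.group(1)] += power if op == "." else -power
    return {sym: exp for sym, exp in exponents.items() if exp != 0}


print(parse_unit("kg.m/s2"))
print(parse_unit("mol/L"))
```

The point of the sketch is that a handful of atoms covers an open-ended set of compound units, which a static list of unit strings cannot do.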

Meanwhile, I have now tracked down the owner of UCUM so it's not dead yet.

@Jacob-Barhak

Lilly Winfree directed me to this discussion after I asked her at PyData Austin how you handle units in your project. We have been dealing with unit standardization for over a year and can connect you to some unit specs, at least in the medical domain.
However, you will see that we did a lot of the work already in the following project:
https://clinicalunitmapping.com/
In the About page, you will see many publications that describe details. Currently we already bundle 4 unit specs/standards and have machine learning tools developed to address those - it is easy to add more specs. If you are interested in joining forces, please contact me by email outside this Github thread to see how it is possible. The project is currently not fully open, yet there is a desire by some stakeholders to keep some of its elements open. So I will appreciate offline communication before returning to this open thread and continuing public discussion.
In any case, please note that some open standards contain UCUM, so your fear of the UCUM license was already handled by others successfully, so it may be possible to resolve this issue with some effort. Yet I need to see your issues first and discuss how this can be resolved offline.
Looking forward to more communication.

@rufuspollock
Contributor Author

@Jacob-Barhak this is great info - if you could share your experience and links that would help esp any key pointers. Your tip re UCUM is also very helpful. We will look at https://clinicalunitmapping.com/

@Jacob-Barhak

So @rufuspollock , all documentation associated with the project is available in the about page: https://clinicalunitmapping.com/about
You will find many publications already and some presentations, yet no important code is currently open source. If your problem is small, then you should have enough pointers there to resolve your unit issues - yet if you are interested in a global solution for the units problem, we should schedule a video chat and talk.
Again, despite expressed desire by many stakeholders to make it an open source project, it is currently not fully available to the public. If you have ideas on how to make it available, I am open to suggestions.

@cpina
Contributor

cpina commented Dec 4, 2020

If support for units were added, frictionless-py could output values with their units (maybe optionally) using https://pint.readthedocs.io/en/stable/, provided the units used in the spec are available in the Pint library.
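
A minimal sketch of what that could look like, assuming a field carries a `unit` string. This is not frictionless-py code: `with_units` is a hypothetical helper, and it falls back to plain values when Pint is not installed or does not recognise the unit.

```python
try:
    import pint  # third-party and optional for this sketch

    _ureg = pint.UnitRegistry()
except ImportError:
    pint = None
    _ureg = None


def with_units(values, unit):
    """Attach `unit` to each value when Pint knows it; otherwise return plain values."""
    if _ureg is None or not unit:
        return list(values)
    try:
        parsed = _ureg.Unit(unit)
    except pint.UndefinedUnitError:
        # The unit named in the schema is not available in Pint
        return list(values)
    return [_ureg.Quantity(value, parsed) for value in values]


print(with_units([10.0, 25.5], "m"))
```

Keeping the fallback explicit matches the "maybe optionally" caveat: readers without Pint, or with units Pint does not define, still get the raw values.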

@roll roll changed the title from "Move units spec back to here as draft spec" to "Support units in Table Schema" Apr 12, 2024
@jgunstone

made a comment about units over here, acep-uaf/aetr-web-book-2024#40

but maybe this issue is a more appropriate place so I've copied below:

another good units standard: https://www.qudt.org/doc/DOC_VOCAB-UNITS.html

FYI in case it's useful / interesting:
the https://github.com/annotated-types/annotated-types?tab=readme-ov-file#unit package has an implementation that allows you to save units as metadata only, and its docs include a demonstration of a BYO units package implementation, thus side-stepping any authority on units or on which units packages / standards should be used.
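
A minimal sketch of the "units as metadata only" idea, using just the standard library's `typing.Annotated` rather than the annotated-types package itself. The `Unit` marker, the `Reading` class, and the `units_of` helper are all hypothetical names for this example.

```python
from dataclasses import dataclass
from typing import Annotated, get_type_hints


@dataclass(frozen=True)
class Unit:
    """Hypothetical metadata marker: stores a unit string without interpreting it."""

    symbol: str


class Reading:
    depth: Annotated[float, Unit("m")]
    temperature: Annotated[float, Unit("degC")]
    station: str  # no unit metadata


def units_of(cls):
    """Collect {attribute name: unit symbol} from Annotated type hints."""
    hints = get_type_hints(cls, include_extras=True)
    found = {}
    for name, hint in hints.items():
        for meta in getattr(hint, "__metadata__", ()):
            if isinstance(meta, Unit):
                found[name] = meta.symbol
    return found


print(units_of(Reading))
```

Nothing here validates or converts the unit strings; a downstream tool is free to hand them to whichever units library or standard it prefers.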

@Jacob-Barhak

Are you still looking for solutions for units?

You may want to check advances in clinicalunitmapping.com
https://clinicalunitmapping.com/infer

There is now AI behind this that is pretty good already. This project is still in beta and there is still work to do and its use is limited to demonstrate feasibility, yet it is getting better.

You can find recent publications in:
https://clinicalunitmapping.com/about

What are your unit needs? Why exactly do you need them? Will be happy to talk via video.

@dr-shorthair

Also see https://si-digital-framework.org/ from BIPM who are the authority on SI.

@frictionlessdata frictionlessdata locked and limited conversation to collaborators Oct 21, 2024
@roll roll converted this issue into discussion #1002 Oct 21, 2024
