This issue was moved to a discussion.

JSON-LD context for Data Package and Tabular Data Package #218

Closed
rufuspollock opened this issue Sep 24, 2015 · 18 comments

Comments

@rufuspollock
Contributor

This issue is about creating a valid JSON-LD context for a Data Package / Tabular Data Package.

Previous discussion on related topics in #110

@pwalsh
Member

pwalsh commented Dec 23, 2015

@rgrp

I'd really like to see if we can move forward on this.

I've gone over thread #110, and while it is interesting background information, it does very little to give me a clear picture of how to move forward practically.

If I summarise that thread for myself, it is basically:

  1. rgrp: JSON-LD, RDF and linked data are good, and we do want datapackage.json to be compatible. However, datapackage.json mostly targets data published in CSV and Excel (and the like), and therefore we don't want a MUST in the direction of JSON-LD (> RDF), in the interest of keeping the core spec as minimal as possible.
  2. others: linked data is the future, and adoption in datapackage.json can be automated (no friction for users). datapackage.json MUST be valid JSON-LD to be a thing.

What I find completely lacking there are actual examples and evidence of JSON-LD adoption - the conversation focusses on the spec itself and linked data / semantic web, but not at all on the actual state of data publication and reuse. The only example I could extract from that thread concerns search engines and schema.org. As someone who has been extensively involved in search engine optimisation in the past, I'm not convinced that schema.org is at all a success in terms of the web as a data catalog, due to extremely low adoption (please show me otherwise if I am out of touch there).

My position is that datapackage.json should not MUST on JSON-LD, but that we should definitely focus on documentation, specific real-world examples, and possibly a MAY "spec" for making datapackage.json valid JSON-LD.

I'd be interested in setting up a small group of contributors with a specific interest here to outline the shape of such work. I believe this is about much more than @context and @id as discussed in #110, and at the least we should directly address @type, which relates to the JSON Table Schema spec used in Tabular Data Package.

Is there anyone regularly working with linked data at a practical level who is interested in this?

@jbenet @sballesteros do you still have interest?

@marek-dudas @jindrichmynarz is this something you would be interested in contributing to? It ties in very well with the https://github.com/openbudgets project, where we represent data in both Fiscal Data Package and the OpenBudgets Data Model (RDF), and are exploring ways to transform data between the two formats. As you are both Linked Data experts, and we have a clear use case here around real data published by governments, I'm interested in your thoughts on datapackage.json as JSON-LD.

@jindrichmynarz

I would consider Schema.org a success. According to a recently published paper by Google and Microsoft, Schema.org can be found in 31.3% of web pages (based on a 10M sample from the Google index and WebDataCommons). JSON-LD is used in many widely deployed applications, such as Gmail actions.

However, it seems to me that using JSON-LD for FDP is quite indirect. A more appropriate tool may be the recent CSV2RDF W3C recommendation, which uses JSON-LD in part.

We at the University of Economics in Prague will work on converting FDP to RDF. Whether doing so would require a data package in JSON-LD is unclear.

@pwalsh
Member

pwalsh commented Dec 23, 2015

@jindrichmynarz, thanks for the link to the article.

Schema.org has seen a huge jump in adoption in the last 12 months, that's awesome!

I would like to see a deeper analysis of this that excludes WordPress themes that simply mark up blog posts and other types in WP, which, while very cool, probably skews these figures a bit considering how much of the web runs on WordPress.

@sballesteros

@pwalsh I have lost interest and have decided to stick with the schema.org Dataset class.
Regarding tooling and interop, the W3C CSV on the web effort covers all my needs, and I complement schema:Dataset with what I need from the W3C Metadata Vocabulary for Tabular Data.

In addition to the paper that @jindrichmynarz mentioned, I find things like https://developers.google.com/knowledge-graph/ pretty exciting! I really hope that one day we can easily use mainstream search engines (and their hypermedia APIs) to query things like clinical trials (hence my interest in schema.org).

Hopefully sooner rather than later, the schema.org Dataset class will evolve and adopt a significant part of the W3C Metadata Vocabulary for Tabular Data (or maybe it will be a schema.org extension). Who knows, maybe schema.org potential Actions (see blog post and overview doc) will be used to expose potential data transformation (or sync options) to the user in a nice interoperable hypermedia API.

Anyway, I don't think this is helping the thread, so feel free to delete, but I just wanted to answer the question from @pwalsh regarding loss of interest.

@pwalsh
Member

pwalsh commented Dec 28, 2015

@sballesteros thanks for your response, and, no, there is no need to delete it :) - your post and Jindrich's have prompted me to look more deeply into recent work around schema.org. I'm quite interested in schema.org generally, but for a number of reasons, quite skeptical of mainstream search engines as generic data catalogues in the way you describe.

As an aside, you mentioned clinical trials - are you aware of the large project we are working on in this domain (opentrials.net)? Maybe it is worth syncing on that in another channel?

@pwalsh
Member

pwalsh commented Dec 28, 2015

@jindrichmynarz about your points on JSON-LD for Fiscal Data Package: yes, I see it is indirect, but here we are talking about a general pattern for making datapackage.json JSON-LD compatible, which is a simpler case. The next step after that might be, of course, to convert between Tabular Data Package and CSVW, which, as I see it now, is quite straightforward once we establish the pattern for metadata in JSON-LD, considering the common basis of each.

@jindrichmynarz

@pwalsh: You're right. I assumed the narrower context of Fiscal Data Package. However, here the discussion is about data packages in general. In this context, it surely makes sense to see what steps need to be taken to turn datapackage.json into JSON-LD.

Regarding the mapping from FDP to RDF, @marek-dudas recently started sketching the process here.

@pwalsh
Member

pwalsh commented Dec 28, 2015

So @jindrichmynarz if I could pick your brain on making datapackage.json JSON-LD compatible in the new year that would be great. We have lots of developments happening around datapackage.json and I'm quite keen to get clear alignment.

@ppKrauss

+1 vote to this issue!


For now (not the far future), is there any directive or new Dataprotocols convention for expressing semantics in fields (resources/schema/fields in the tabular-data-package standard)? For example, the W3C's propertyUrl and aboutUrl would be useful for expressing semantics in datasets.


(sorry, can I post this kind of comment here?)

@pwalsh
Member

pwalsh commented Mar 7, 2016

@ppKrauss could you provide an example here?

@ppKrauss

ppKrauss commented Mar 8, 2016

@pwalsh, it is a REST concept based on endpoints that are cool URLs, for use in URI templates. Well-known examples are the so-called URN resolvers.

NOTE: the VAT number of an organization (e.g. www.outlandish.com is in the UK and has the VAT number 102018679) may be resolved by a URL template, but not all countries offer a REST system for VAT resolution. For example, http://ec.europa.eu/taxation_customs/vies/vatRequest.html resolves by XML POST, so it is not valid as a URL template.

@pwalsh
Member

pwalsh commented Mar 8, 2016

@ppKrauss thanks. What I meant was, an example of a datapackage.json or a JSON Table Schema portion that demonstrates your suggestion.

@hbruch

hbruch commented Jan 28, 2021

When republishing inaccessible datasets, I started using JSON-LD/schema.org to make them findable, e.g. via Google Dataset Search. JSON-LD and datapackage.json overlap a lot, though not completely. Data packages lend themselves more to an in-depth description of the data and provide support for easily processing the data (at least tabular data).

It would be nice to create a (basic) data package from JSON-LD or, inversely, to generate JSON-LD from a datapackage.json. This issue seems to have gone stale since 2016. Are there any current plans to either make them more compatible or provide conversion utilities?
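To make the second direction concrete, here is a rough sketch of generating schema.org JSON-LD from a datapackage.json descriptor. It is only an illustration under assumptions: the function name `datapackage_to_schema_org`, the sample descriptor, and the choice of mapped properties (title/name, description, keywords, first license, resources as `DataDownload` entries) are all hypothetical and not an agreed mapping.

```python
import json


def datapackage_to_schema_org(descriptor):
    """Map a few common Data Package properties to a schema.org Dataset.

    Hypothetical mapping for illustration only; it is not defined by
    either specification.
    """
    dataset = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        # Prefer the human-readable title, fall back to the package name.
        "name": descriptor.get("title") or descriptor.get("name"),
    }
    for key in ("description", "keywords"):
        if key in descriptor:
            dataset[key] = descriptor[key]

    # Use the URL (path) of the first license, if one is given.
    licenses = descriptor.get("licenses", [])
    if licenses and "path" in licenses[0]:
        dataset["license"] = licenses[0]["path"]

    # Each resource with a single string path becomes a DataDownload.
    # Relative paths are passed through unchanged here; a real converter
    # would need to resolve them to absolute URLs.
    distribution = []
    for resource in descriptor.get("resources", []):
        path = resource.get("path")
        if not isinstance(path, str):
            continue
        download = {"@type": "DataDownload", "contentUrl": path}
        if "mediatype" in resource:
            download["encodingFormat"] = resource["mediatype"]
        distribution.append(download)
    if distribution:
        dataset["distribution"] = distribution
    return dataset


# Made-up descriptor used only to exercise the sketch.
DESCRIPTOR = {
    "name": "country-labels",
    "title": "Country labels",
    "description": "Country codes with Italian and English labels.",
    "licenses": [
        {
            "name": "CC0-1.0",
            "path": "https://creativecommons.org/publicdomain/zero/1.0/",
        }
    ],
    "resources": [
        {
            "name": "labels",
            "path": "https://countries.example/labels.csv",
            "mediatype": "text/csv",
        }
    ],
}

dataset = datapackage_to_schema_org(DESCRIPTOR)
print(json.dumps(dataset, indent=2))
```

The table schema itself (fields, types, missing values) has no obvious counterpart in this mapping, which is part of the incomplete overlap mentioned above.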

@ioggstream

@hbruch I am interested in generating JSON-LD from datapackage.json too. Currently there are a lot of possible options, and a simple solution could even be based on passing a JSON-LD context in datapackage.json.

@rufuspollock
Contributor Author

@hbruch @ioggstream this is really welcome - we just need someone to step up to make it happen 😄

@ioggstream

ioggstream commented Jul 11, 2022

@rufuspollock Ok, so for now I have a proposal loosely based on this I-D https://datatracker.ietf.org/doc/draft-polli-restapi-ld-keywords/

The general idea I'm working on is this:

Given this CSV

id,label_it,label_en
FRA,Francia,France
ITA,Italia,Italy

I have this DataPackage

    schema:
      fields:
        - { name: id,  type: string }
        - { name: label_it,  type: string }
        - { name: label_en, type: string }
      # Extension keyword to provide a json-ld context  
      x-jsonld-context:  
        "@vocab": https://countries.example/
        skos: http://www.w3.org/2004/02/skos/core#

        id:
          "@type": "@id"
        # Localize labels. Order is relevant.
        label_en:
          "@id": skos:prefLabel
          "@language": en
        label_it:
          "@id": skos:prefLabel
          "@language": it
      missingValues:
        - ""

The CSV can easily be transformed into JSON, and then into JSON-LD by adding the above context (open in JSON-LD Playground):

{
  "@context": {
    "@vocab": "https://countries.example/",
    "skos": "http://www.w3.org/2004/02/skos/core#",

    "id": { "@type": "@id", "@id": "@id" },
    "label_en": { "@id": "skos:prefLabel", "@language": "en" },
    "label_it": { "@id": "skos:prefLabel", "@language": "it" }
  },
  "@graph": [
    { "id": "ITA", "label_it": "Italia", "label_en": "Italy" },
    { "id": "FRA", "label_it": "Francia", "label_en": "France" }
  ]
}

This approach introduces #451 without having to define specific behavior, and delegates all the LD processing to the JSON-LD specifications: this means that if JSON-LD adds new context features, we simply inherit them.
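The CSV-to-JSON-LD step described above could be sketched roughly as follows, using only the Python standard library. This is a minimal illustration, not a reference implementation: `csv_to_jsonld` is a hypothetical helper, `x-jsonld-context` is the extension keyword proposed in this comment rather than part of any published spec, and the sample rows mirror the CSV above. The sketch only attaches the context; expansion into RDF is left to a JSON-LD processor.

```python
import csv
import io
import json

# Sample rows mirroring the CSV in this comment.
CSV_TEXT = """id,label_it,label_en
FRA,Francia,France
ITA,Italia,Italy
"""

# Table schema carrying the proposed (non-standard) x-jsonld-context keyword.
SCHEMA = {
    "fields": [
        {"name": "id", "type": "string"},
        {"name": "label_it", "type": "string"},
        {"name": "label_en", "type": "string"},
    ],
    "x-jsonld-context": {
        "@vocab": "https://countries.example/",
        "skos": "http://www.w3.org/2004/02/skos/core#",
        "id": {"@type": "@id"},
        "label_en": {"@id": "skos:prefLabel", "@language": "en"},
        "label_it": {"@id": "skos:prefLabel", "@language": "it"},
    },
    "missingValues": [""],
}


def csv_to_jsonld(csv_text, schema):
    """Wrap CSV rows in a JSON-LD document using the schema's context.

    Hypothetical helper: cells matching missingValues are dropped, and no
    JSON-LD processing (expansion, compaction) happens here.
    """
    missing = set(schema.get("missingValues", [""]))
    rows = [
        {key: value for key, value in row.items() if value not in missing}
        for row in csv.DictReader(io.StringIO(csv_text))
    ]
    return {"@context": schema["x-jsonld-context"], "@graph": rows}


doc = csv_to_jsonld(CSV_TEXT, SCHEMA)
print(json.dumps(doc, indent=2))
```

Because the context is copied verbatim from the schema, the converter does not need to know anything about JSON-LD keywords, which is the point of delegating the LD processing.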

WDYT?

cc: @mfortini @giorgialodi @hbruch

@rufuspollock
Contributor Author

@ioggstream seems good, and I'm happy to have any concrete proposal to move things forward 😄

@ioggstream

I am drafting this document to better analyse the possible choices
https://docs.google.com/document/d/1ACMG0dbzHt1NSXxeJ2pHf8zFnnbl7pSiZ6X_-uggdQI/edit?usp=drivesdk

@roll roll modified the milestones: Backlog, v2 Apr 14, 2023
@roll roll removed this from the v2 milestone Jan 3, 2024
@roll roll added this to the v2.1 milestone Jun 26, 2024
@frictionlessdata frictionlessdata locked and limited conversation to collaborators Oct 21, 2024
@roll roll converted this issue into discussion #993 Oct 21, 2024
@roll roll removed this from the v2.1 milestone Oct 22, 2024