Providing an existing schema created elsewhere, probably by hand, is a great feature, but one that will probably only be used by power users or data-savvy publishers familiar with validation, standards, etc.
To really engage publishers in describing their data, we need to guide the creation of these schemas based on the actual contents of the file, which is what people are familiar with.
In a nutshell, when uploading or linking to a new file, the user gets a list of the file's fields, each with an option to define its type (a guessed one is provided for them). Additionally, they can provide extra information about the field, like a user-friendly label or a description.
This gets transformed into a Table Schema internally, which is stored in the `schema` field.
This pattern is well established (see e.g. Socrata); the challenge is how to integrate it into the existing workflow for creating a dataset in CKAN.
Obviously we need to read the file somehow to infer the fields and types. There are two options:
Read the file in the browser using tableschema-js and infer a schema that would be used to generate the edit interface. This tool shows what the general interaction would look like (and perhaps we can reuse part of it): https://csv-schema.surge.sh/. We would need to consider browser support for this.
Rework the resource form into a two-step process, with a dedicated endpoint to upload the file first and read its contents. This has the benefit that it will probably be required anyway for the next version of the synchronous validation (TODO ref), but obviously it means more clicks.
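Whichever option we pick, the inference step boils down to sampling rows and guessing a type per column. A minimal sketch of that logic in pure Python (the real implementation would lean on tableschema-js or the Python tableschema library; the cast order and sample size here are illustrative assumptions):

```python
import csv
import io

def infer_field_type(values):
    """Guess a Table Schema type from a column's sampled values."""
    def all_cast(cast):
        try:
            for v in values:
                cast(v)
            return True
        except ValueError:
            return False
    # Try the most specific types first, fall back to string
    if all_cast(int):
        return "integer"
    if all_cast(float):
        return "number"
    return "string"

def infer_schema(csv_text, sample_size=100):
    """Build a minimal Table Schema descriptor from a CSV sample."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    columns = {name: [] for name in header}
    for i, row in enumerate(reader):
        if i >= sample_size:
            break
        for name, value in zip(header, row):
            columns[name].append(value)
    return {
        "fields": [
            {"name": name, "type": infer_field_type(columns[name])}
            for name in header
        ]
    }

sample = "id,price,label\n1,9.99,apple\n2,3.50,pear\n"
print(infer_schema(sample))
# → {'fields': [{'name': 'id', 'type': 'integer'},
#               {'name': 'price', 'type': 'number'},
#               {'name': 'label', 'type': 'string'}]}
```

The guessed types would pre-populate the editor's type dropdowns, with the user free to override them.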
Tasks
Integrate schema editor in current resource form
Infer field names and types. See above for options
Design schema editor component. Basic requirements: define the title, description and type of each field (TODO: prior work?)
Implement schema editor component functionality. Based on the inputs above, create a Table Schema JSON object that gets stored in the `schema` form field.
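The last task above can be sketched as follows; the input structure and key names are assumptions about what the editor form would submit, not an actual CKAN form layout:

```python
import json

def build_table_schema(field_inputs):
    """Turn per-field editor inputs (name, type, title, description)
    into a Table Schema descriptor for the `schema` form field."""
    fields = []
    for item in field_inputs:
        field = {"name": item["name"], "type": item.get("type", "string")}
        # Optional user-friendly metadata entered in the editor
        if item.get("title"):
            field["title"] = item["title"]
        if item.get("description"):
            field["description"] = item["description"]
        fields.append(field)
    return {"fields": fields}

inputs = [
    {"name": "id", "type": "integer", "title": "Record ID"},
    {"name": "price", "type": "number", "description": "Unit price in EUR"},
]
# Serialised JSON is what would be stored in the `schema` form field
print(json.dumps(build_table_schema(inputs)))
```

Storing the serialised descriptor in the existing `schema` field keeps this compatible with the current "paste an existing schema" path.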
Estimate
7 days