Using loops to reduce lines of code #286

Open
ChristophEwertowski opened this issue Jul 25, 2018 · 0 comments

In the lobid-resources repository we transform XML catalog data from a part of Germany into N-Triples, which we then transform into JSON-LD. We display the JSON-LD data in a frontend (German only) and make it available via an API. An example:

Source material in XML

<datafield tag="100" ind1="b" ind2="1">
   <subfield code="p">Barenboim, Daniel</subfield>
   <subfield code="d">1942-</subfield>
   <subfield code="4">cnd</subfield>
   <subfield code="3">Dirigent</subfield>
   <subfield code="4">drt</subfield>
   <subfield code="3">Regisseur</subfield>
   <subfield code="9">(DE-588)118506560</subfield>
</datafield>

Transformation format in simplified N-Triples

<URL of described file> <bibframe:contribution> _:rdfListBlankNode .
_:rdfListBlankNode <rdf:first> _:firstContributionBlankNode .
_:rdfListBlankNode <rdf:rest> _:secondRdfListBlankNode .
_:secondRdfListBlankNode <rdf:first> _:secondContributionBlankNode .
_:secondRdfListBlankNode <rdf:rest> <rdf:nil> .
_:firstContributionBlankNode <bibframe:agent> http://d-nb.info/gnd/118506560 .
_:firstContributionBlankNode <bibframe:role> http://id.loc.gov/vocabulary/relators/cnd .
_:secondContributionBlankNode <bibframe:agent> http://d-nb.info/gnd/118506560 .
_:secondContributionBlankNode <bibframe:role> http://id.loc.gov/vocabulary/relators/drt .
http://d-nb.info/gnd/118506560 <gnd:gndIdentifier> "118506560" .
http://d-nb.info/gnd/118506560 <rdfs:label> "Barenboim, Daniel" .
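
To make the list structure above easier to follow, here is a minimal Python sketch (not Metamorph) that chains contribution nodes into an RDF list. It assumes the standard RDF collection terms `rdf:first`, `rdf:rest` and `rdf:nil`; the function name `rdf_list_triples` and the `_:listN` node labels are made up for illustration.

```python
def rdf_list_triples(subject, predicate, members):
    """Return simplified triples linking subject to an RDF list of members."""
    if not members:
        # An empty list is represented by rdf:nil directly.
        return [(subject, predicate, "rdf:nil")]
    # One list node per member (hypothetical labels).
    nodes = [f"_:list{i}" for i in range(len(members))]
    triples = [(subject, predicate, nodes[0])]
    for i, member in enumerate(members):
        triples.append((nodes[i], "rdf:first", member))
        # The last list node points to rdf:nil instead of another node.
        rest = nodes[i + 1] if i + 1 < len(members) else "rdf:nil"
        triples.append((nodes[i], "rdf:rest", rest))
    return triples


triples = rdf_list_triples(
    "<URL of described file>",
    "bibframe:contribution",
    ["_:firstContributionBlankNode", "_:secondContributionBlankNode"],
)
for s, p, o in triples:
    print(s, p, o, ".")
```

Each additional role adds one more list node and one more contribution node, which is why every role needs its own distinct blank node.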

JSON-LD, used as the frontend and API format

"contribution" : [
   {
      "agent" : {
         "gndIdentifier" : "118506560",
         "id" : "http://d-nb.info/gnd/118506560",
         "label" : "Barenboim, Daniel"
      },
      "role" : {
         "id" : "http://id.loc.gov/vocabulary/relators/cnd",
         "label" : "Conductor" [in German "Dirigent/in", added after transformation from origin XML to RDF/XML]
      }
   },
   {
      "agent" : {
         "gndIdentifier" : "118506560",
         "id" : "http://d-nb.info/gnd/118506560",
         "label" : "Barenboim, Daniel"
      },
      "role" : {
         "id" : "http://id.loc.gov/vocabulary/relators/drt",
         "label" : "Director" [in German: Regie]
      }
   }
]

To create distinct contribution blank nodes I need a way to concatenate the agent id "http://d-nb.info/gnd/118506560" with each role id, yielding "http://d-nb.info/gnd/118506560http://id.loc.gov/vocabulary/relators/cnd" and "http://d-nb.info/gnd/118506560http://id.loc.gov/vocabulary/relators/drt".
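
In plain Python (not Metamorph) the concatenation I am after would look roughly like this. The names `contribution_keys`, `AGENT_ID` and `ROLE_IDS` are only illustrative; the values are taken from the example above.

```python
AGENT_ID = "http://d-nb.info/gnd/118506560"
ROLE_IDS = [
    "http://id.loc.gov/vocabulary/relators/cnd",
    "http://id.loc.gov/vocabulary/relators/drt",
]


def contribution_keys(agent_id, role_ids):
    """Concatenate the agent id with each role id, one key per role."""
    return [agent_id + role_id for role_id in role_ids]


keys = contribution_keys(AGENT_ID, ROLE_IDS)
for key in keys:
    print(key)
```

The point is that the number of keys follows the number of role subfields, without having to handle each position separately.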

To solve this problem I used <concat> on all roles, stored the result in a variable and then applied regular expressions to that variable to create multiple, distinct contribution bnodes. Because this is needed for every fourth field from 100 to 296, it leads to many lines of code (even with <macro>), and I can only cover three role subfields: the first and last relators and one in the middle. Something like this would ease the work a lot:

<combine name="@agentId100" value="${a}">
   <data source="100??.9" name="a">
      <regexp match="\(DE-588\)(.*)" format="http://d-nb.info/gnd/${1}"/>
   </data>
</combine>
<combine name="@relator100" value="${a}" flushWith="100??.[b4]">
   <choose name="a" flushWith="100??.[b4]">
      <data source="100??.[b4]">
         <regexp match="^cnd$" format="http://id.loc.gov/vocabulary/relators/cnd"/>
      </data>
      <data source="100??.[b4]">
         <regexp match="^drt$" format="http://id.loc.gov/vocabulary/relators/drt"/>
      </data>
   </choose>
</combine>

<combine name="@contributionBnode100" value="_:${a}${b}" loop="true">
   <data source="@agentId100" name="a"/>
   <data source="@relator100" name="b"/>
</combine>

Explanation: Create an agent id from a subfield, create one relator for each role subfield of field 100, then combine each relator with the agent id to get one contribution blank node per role.
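
To pin down the behaviour I would expect from the proposed `loop="true"`, here is a rough Python sketch of the semantics. It is not how Metamorph works internally, and `contribution_bnodes`, `RELATORS` and `GND_PATTERN` are hypothetical names; only the XML, the regular expression and the URIs come from the example above.

```python
import re
import xml.etree.ElementTree as ET

DATAFIELD = """
<datafield tag="100" ind1="b" ind2="1">
   <subfield code="p">Barenboim, Daniel</subfield>
   <subfield code="d">1942-</subfield>
   <subfield code="4">cnd</subfield>
   <subfield code="3">Dirigent</subfield>
   <subfield code="4">drt</subfield>
   <subfield code="3">Regisseur</subfield>
   <subfield code="9">(DE-588)118506560</subfield>
</datafield>
"""

GND_PATTERN = re.compile(r"\(DE-588\)(.*)")
# Only the two relators from the example; a real mapping would be longer.
RELATORS = {
    "cnd": "http://id.loc.gov/vocabulary/relators/cnd",
    "drt": "http://id.loc.gov/vocabulary/relators/drt",
}


def contribution_bnodes(datafield_xml):
    """Return one blank node label per role subfield of the datafield."""
    field = ET.fromstring(datafield_xml.strip())
    agent_id = None
    relators = []
    for sub in field.findall("subfield"):
        code = sub.get("code")
        text = (sub.text or "").strip()
        if code == "9":
            match = GND_PATTERN.match(text)
            if match:
                agent_id = "http://d-nb.info/gnd/" + match.group(1)
        elif code in ("b", "4") and text in RELATORS:
            relators.append(RELATORS[text])
    if agent_id is None:
        return []
    # The "loop": the single agent id is paired with every collected relator.
    return [f"_:{agent_id}{relator}" for relator in relators]


bnodes = contribution_bnodes(DATAFIELD)
for bnode in bnodes:
    print(bnode)
```

Note that the agent id in subfield 9 comes after the role subfields, so the pairing can only happen once the whole field has been read. This is why the proposal flushes per field.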

I hope I have expressed myself clearly enough. If you have questions, ask me or @dr0i.

Update: N-Triples is the transformation format, not RDF/XML. It was simply too hot when I wrote this.
