WG Research Data Repository Interoperability Session
Date: Friday 16th September, 13:30 - 15:00
Agenda:
- Introduction
- Presentation of the Case Statement review results
- Short talks on state-of-the-art technologies
- Discussion about suitability and gaps
- Summary of results and next steps
1.
- Good overall review
- Realistic scope for 12-18 months
- Check existing initiatives and solutions!
- Check relevance of candidate technologies for data endeavor!
- Collection of technology contacts:
- Fedora Commons (David Wilcox, dwilcox@duraspace.org)
- DSpace (David Wilcox, dwilcox@duraspace.org)
- KIT Data Manager (Thomas Jejkal, thomas.jejkal@kit.edu)
- Hydra (Rick Johnson, rick.johnson@nd.edu)
- Dash research data portal/Merritt
- Dataverse (Gustavo Durand, gdurand@iq.harvard.edu)
- Invenio (Tim Smith, Tim.Smith@cern.ch (Tech. contact: Tibor.Simko@cern.ch))
- iRODS (Terrell Russell, tgr@renci.org)
- OneData (Matthew Jones, jones@nceas.ucsb.edu)
- FairPort (Luiz Olavo Bonino da Silva Santos, luiz.bonino@dtls.nl)
- EPrints
- Figshare (Dan Valen, dan@figshare.com)
- ICAT (Rolf Krahl, rolf.krahl@helmholtz-berlin.de)
- DARIAH Repository (Stefan E. Funk, funk@sub.uni-goettingen.de)
- da|ra (Brigitte Hausstein, Brigitte.hausstein@gesis.org)
- DataCite (Martin Fenner, martin.fenner@datacite.org)
2.
- Outcome of the COAR Interoperability Project (Thomas Jejkal)
- Focus on publication world, partly applicable to research data
- Good state of the art overview usable for D.1
- Good opportunity for collaboration, especially in the direction of technical interoperability issues
Comments:
- Check overlap regarding common metadata and vocabulary
- Reference implementation highly appreciated
- Refreshing the SWORD Protocol (Dom Fripp)
- Enable SWORD for (research) data
- E.g. POST data by reference
- Long term sustainability plan
- Community maintenance is the best option in terms of long-term guarantees and initial effort
- Alternatives: maintained by JISC (no guarantee) or IETF/NISO (high initial effort)
- Take next generation repositories into account
- ResourceSync and SWORD integration planned
- Comment at https://goo.gl/8E11wf
Comments:
- Reference implementation highly appreciated
- For version 2 available at swordapp.org
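The "POST data by reference" idea from the SWORD talk can be pictured with a small sketch: instead of uploading the file bytes, the client POSTs a short description that tells the repository where to fetch the data. The JSON field names (`files`, `url`, `contentType`, `contentLength`) and the function name are assumptions made for illustration only; they are not taken from the SWORD specification or from the talk.

```python
import json
import urllib.request


def build_by_reference_deposit(collection_url, file_url, content_type, size_bytes):
    """Build (but do not send) a POST request that deposits a file by reference.

    The body layout is hypothetical; a real SWORD endpoint defines its own format.
    """
    body = {
        "files": [
            {
                "url": file_url,              # where the repository should fetch the data
                "contentType": content_type,  # media type of the referenced file
                "contentLength": size_bytes,  # lets the server plan for large transfers
            }
        ]
    }
    return urllib.request.Request(
        collection_url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_by_reference_deposit(
        "https://repo.example.org/collections/demo",    # hypothetical endpoint
        "https://storage.example.org/big-dataset.tar",  # hypothetical data location
        "application/x-tar",
        5_000_000_000,
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

The point of the design is that the request stays a few hundred bytes regardless of dataset size; the repository dereferences the URL on its own schedule.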
- Import/Export Standards for Repository Resources (David Wilcox)
- New server-side APIs not available in 'legacy' instances of repository platforms
- Standalone tool using public interfaces could be used for old and new repository platform instances
- Looking into BagIt as container format
- No information about payload needed
- BagIt Profiles extension to add payload semantics for later import
- Apache Camel for RDF serialization
- See https://wiki.duraspace.org/display/FF/Design+-+Import+-+Export for more details
Comments:
- BagIt also used in DataONE, support for OAI-ORE resources planned?
- What is the effort for creating BagIt profiles?
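To make the BagIt discussion more concrete, the following is a minimal sketch of what a bag looks like on disk: a declaration file, a `data/` payload directory, a checksum manifest, and a `bag-info.txt` that names a BagIt Profile. It uses only the Python standard library rather than any repository tool. The function names, the BagIt version written, and the profile URL are assumptions for illustration; the sketch handles flat file names only.

```python
import hashlib
import tempfile
from pathlib import Path


def write_bag(bag_dir, payload, profile_uri):
    """Write a minimal bag: declaration, payload files, manifest, bag-info."""
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True, exist_ok=True)

    manifest_lines = []
    for name, content in sorted(payload.items()):
        # The payload is stored as opaque bytes; the container needs no
        # knowledge of what the files mean.
        (data_dir / name).write_bytes(content)
        digest = hashlib.sha256(content).hexdigest()
        manifest_lines.append(f"{digest}  data/{name}")

    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    (bag_dir / "manifest-sha256.txt").write_text(
        "\n".join(manifest_lines) + "\n", encoding="utf-8"
    )
    # A profile identifier is how payload semantics can be declared for import.
    (bag_dir / "bag-info.txt").write_text(
        f"BagIt-Profile-Identifier: {profile_uri}\n", encoding="utf-8"
    )


def verify_bag(bag_dir):
    """Return True if every file listed in the manifest matches its checksum."""
    manifest = (bag_dir / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        digest, rel_path = line.split(maxsplit=1)
        actual = hashlib.sha256((bag_dir / rel_path).read_bytes()).hexdigest()
        if actual != digest:
            return False
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        bag = Path(tmp) / "example-bag"
        # Hypothetical profile URL, used only as a placeholder.
        write_bag(bag, {"readings.csv": b"t,value\n0,1.5\n"},
                  "https://example.org/profiles/demo.json")
        print(verify_bag(bag))
```

The manifest covers only checksums and paths, which matches the point that no information about the payload is needed; anything an importer must know about the content would come from the profile referenced in `bag-info.txt`.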
3.
- What about domain specific repositories?
- Outcome of the WG should also be applicable to such repositories.
- Will this WG try to find a consensus between repositories on resolving (persistent) identifiers?
- No, identifier resolution depends on the identifier type.
- Check PIT and Data Fabric output, e.g. to obtain content by identifier in a standardized way.
- Check Metadata IG for common metadata elements
- Think about flexible multi-purpose protocol (like HTTP)
- Additional technologies to investigate/support: HubZero, DataONE API, Globus Publication, Frictionless data, R (as consumer application)
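The remark that identifier resolution depends on the identifier can be illustrated with a small sketch that maps an identifier to the public resolver for its scheme. It covers only DOIs (via doi.org) and Handles (via hdl.handle.net) and passes plain HTTP(S) URLs through unchanged. The `doi:` and `hdl:` prefixes and the function name are assumptions for this example, not an output of the PIT or Data Fabric groups.

```python
def resolver_url(identifier):
    """Return a URL at which the given identifier can be resolved.

    Supported forms (an assumption of this sketch):
      doi:<name>, hdl:<name>, or a plain http(s) URL.
    """
    ident = identifier.strip()
    lowered = ident.lower()
    if lowered.startswith("doi:"):
        return "https://doi.org/" + ident[4:]
    if lowered.startswith("hdl:"):
        return "https://hdl.handle.net/" + ident[4:]
    if lowered.startswith(("http://", "https://")):
        return ident  # already directly resolvable
    raise ValueError(f"unsupported identifier scheme: {identifier!r}")


if __name__ == "__main__":
    # Placeholder identifiers, not references to real datasets.
    for example in ("doi:10.1234/example", "hdl:12345/abc", "https://example.org/x"):
        print(example, "->", resolver_url(example))
```

A repository-independent client would still have to agree on what is returned at the resolved location, which is the part the notes suggest checking against existing PIT and Data Fabric output.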