Minutes from yesterday’s call

  • Creator
    Discussion
  • #122075

    Dear all,
    here are the merged minutes from yesterday’s group call. The next call
    will take place in 2 weeks at the usual timeslot: Tuesday, June 14, 13:00
    GMT.
    Best, Tobias
    Attendees: Frederik, Ulrich, Christopher, Tobias
    Notes:
    * Formal definitions – set theory:
    o potentially use ADT as intermediate step between model and
    implementation
    o Look into Haskell docs, use that to get to a small core def for
    sorted/unsorted, multimembership/unique
    * Models are essentially a set of attributes and rules about them
    (plus operations)
    o Traits are then useful because they group things together and
    make it simpler to explain what a certain formal concept (like
    sorting) implies in terms of practically useful properties and
    methods
    o The clone operation (i.e. cache a snapshot of elements in
    recursive collections) is useful because it may solve the
    complexity issues that arise once we have recursion
    o for citation use case: the copy/clone method (i.e. making
    snapshots) may solve the issue that collections whose deep
    membership changes may not remain citable
    + citation use case is present in many communities, and we
    need to cover their differences. Snapshotting may not be
    possible for everyone.
    + but we may figure out a way to compress the snapshotting
    process so we can reconstruct a snapshot on request.
    o for very dynamic data we may want to state that there are
    policies that must be observed (basic versioning)
    + Christopher Harrison: use case with high volume and lots of
    change every day – not possible to statically snapshot it –
    need a UC description and see where the gaps are wrt the
    formal model we end up with
    * Collection API – what about a member API? Such an API would answer
    e.g. “which collection(s) does this object belong to?” – different
    from scope of current swagger API
    o Might be solvable through PIT + registered property, but not
    sure whether this is the only action of that API
    o Close to a global search, so costly/difficult – might not be
    implementable by all use cases, but perhaps relevant across
    multiple RDA use cases
    o Ulrich: nice if every member of a collection has a pinned PID to
    its parent; but only feasible for static collections inside a
    repository
    + across repositories, you will need a crawler that then
    creates a graph; the query then changes to “which
    subcollection(s) of a given (entry point) collection does
    this object belong to?”, which leads to a two-parameter
    function SubcollectionsMemberBelongsTo(member,collection) – we can
    include this in the hierarchy trait
    + actually, the pointer to parents is a collection in itself
    o Christopher: Subscription model could help with dynamic data as well
    o Frederik: RSS or blog pingback might provide useful ideas and
    infrastructure
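
    The two-parameter query SubcollectionsMemberBelongsTo(member,collection)
    noted above could be sketched roughly as follows. This is only an
    illustration, not part of the group's API: the in-memory dictionary
    standing in for the crawler's graph, the function signature with its
    extra graph argument, and all identifiers in the example data are
    assumptions.

```python
from typing import Dict, List, Set

# Hypothetical stand-in for the graph a crawler would have built:
# collection id -> ids of its direct members (items or other collections).
Graph = Dict[str, List[str]]


def subcollections_member_belongs_to(
    member: str, collection: str, graph: Graph
) -> Set[str]:
    """Return every collection reachable from `collection` (itself included)
    that lists `member` as a direct member."""
    found: Set[str] = set()
    visited: Set[str] = set()
    stack = [collection]
    while stack:
        current = stack.pop()
        if current in visited:
            continue  # guards against cycles in recursive collections
        visited.add(current)
        children = graph.get(current, [])
        if member in children:
            found.add(current)
        # Only ids that are themselves collections are followed further.
        stack.extend(child for child in children if child in graph)
    return found


# Made-up example data, including a deliberate cycle.
graph = {
    "root": ["sub-a", "sub-b", "obj-1"],
    "sub-a": ["obj-2", "sub-c"],
    "sub-b": ["obj-2"],
    "sub-c": ["obj-3", "root"],  # points back to the entry point
}
result = subcollections_member_belongs_to("obj-2", "root", graph)
```

    The visited set matters because recursive collections may contain
    cycles. The cost of the walk grows with the size of the crawled graph,
    which matches the concern above that such a query is close to a global
    search.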

    Tobias Weigel
    Abteilung Datenmanagement
    Deutsches Klimarechenzentrum GmbH (DKRZ)
    Bundesstraße 45 a • 20146 Hamburg • Germany
    Phone: +49 40 460094-104
    Email: ***@***.***
    URL: http://www.dkrz.de
    ORCID: orcid.org/0000-0002-4040-0215
    Geschäftsführer: Prof. Dr. Thomas Ludwig
    Sitz der Gesellschaft: Hamburg
    Amtsgericht Hamburg HRB 39784

  • Author
    Replies
  • #132970

    Dear all,
    … as I wasn’t able to participate in the last video meeting and because
    we want some discussion on the mailing list … and the weekend is coming …
    I have the feeling we try to say too much about the individual items
    inside a collection. From my perspective, *anything* which has an address
    can be an item inside of a collection. But that means it is difficult to
    say anything about the item itself from the collection’s perspective. And
    it is not necessary: there is the PIT API or similar approaches.
    a)
    It is not necessary that an item be capable of storing its parents’
    information, and the creator of a collection may not even have write
    permissions on the items he/she is adding to it. I also don’t know
    any programming language where you can ask an object: “Tell me which
    collections you belong to”. Even in the closed namespace / memory area
    of an application this is not realized, so how should it work on a higher
    / global level?
    b)
    I’m not sure if we really should care about a differentiation between
    dynamic / static collections wrt the items a collection contains. A
    collection itself can be dynamic or static – I agree. But we can’t say
    *anything* about the items inside of a collection. Maybe there are items
    which are defined as a rule like “Give me the last measurement”. I don’t
    see how such an item could tell the collection “Today I’m giving
    back different results than yesterday”. It could also be that such an item
    doesn’t exist anymore – we don’t have a “wayback” channel and yes, I’m
    not talking only about PIDs here. If static just means that the
    collection cannot be changed anymore – that’s fine.
    c)
    I’m not sure if I understood the “trait” thing (and I never heard about
    something like that before): is it about collecting properties/functions
    into a functional group? Do we really need this level of abstraction?
    Best,
    Tom

  • #132965

    Hello Tom,
    I concur that we may allow anything that has an address to be inside a
    collection, however, there may be an important side condition: That
    either the agent adding the item to a collection or the one responsible
    for managing the collection can make a realistic claim about the item’s
    current life cycle state and expected development. There is not a clear
    distinction here, which is probably the cause for many of our problems.
    Do we want to be totally arbitrary regarding the items? Or should the
    benefit of using a collection API rather be that you *can* assume that
    some essential information about item status will be available? I don’t
    think we have a clear take on this yet.
    On your item a) – parts knowing what they are part of – the best example
    for such cases are trees, which you would be unable to traverse otherwise.
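
    To illustrate the tree case: a minimal sketch in which each node keeps a
    reference to its parent, so that the path from any node up to the root
    can be walked. The class and function names are purely illustrative
    assumptions and do not come from the working group's documents.

```python
from typing import List, Optional


class Node:
    """A tree node that knows what it is part of via `parent`."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.parent: Optional["Node"] = None
        self.children: List["Node"] = []

    def add(self, child: "Node") -> "Node":
        # Adding a child also records the back-pointer on the child.
        child.parent = self
        self.children.append(child)
        return child


def ancestors(node: Node) -> List[str]:
    """Walk the parent pointers upward and collect the names on the way."""
    names: List[str] = []
    current = node.parent
    while current is not None:
        names.append(current.name)
        current = current.parent
    return names


root = Node("root")
branch = root.add(Node("branch"))
leaf = branch.add(Node("leaf"))
path = ancestors(leaf)  # names from the nearest parent up to the root
```

    Note that this only works because whoever adds a child can also write
    to it, which is exactly the permission that may be missing for items
    in a collection.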
    On b) – I think you are right and there may be a line here that we do
    not want to cross wrt the “backchannel” from item to collection you
    explained. A collection should specify whether its constituency is
    dynamic or static, but it is probably too difficult to answer this by
    redirecting to individual items and leaving the answer up to them.
    On the traits: It is in principle as you describe, gathering properties
    and methods into flexible “chunks” that can be recombined and their
    recombination may give rise to other special methods. I like the model
    because it at least circumvents some of the issues with multiple
    inheritance. I am currently sticking with it because it is very
    flexible, but I am also not sure if this will be reflected in the API in
    the end. Traits-based programming [1] is probably not the most
    accessible paradigm and I’m not completely sure if this is the right
    description for what’s currently in the document.
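
    As a rough illustration of the idea, here is a sketch using Python mixin
    classes. Python has no native traits, and mixins resolve conflicts
    through the method resolution order rather than explicitly, so this is
    only an approximation. All class and method names are assumptions made
    for the example, not names from the document.

```python
from typing import List


class BaseCollection:
    """Plain collection: ordered by insertion, duplicates allowed."""

    def __init__(self) -> None:
        self.items: List[str] = []

    def add(self, item: str) -> None:
        self.items.append(item)


class UniqueTrait:
    """Chunk of behaviour: each member may appear only once."""

    def add(self, item: str) -> None:
        if item not in self.items:
            super().add(item)


class SortedTrait:
    """Chunk of behaviour: members are kept in sorted order."""

    def add(self, item: str) -> None:
        super().add(item)
        self.items.sort()

    def first(self) -> str:
        return self.items[0]


class SortedUniqueCollection(UniqueTrait, SortedTrait, BaseCollection):
    """Recombining both chunks makes an extra method meaningful."""

    def rank(self, item: str) -> int:
        # Unambiguous only because members are both sorted and unique.
        return self.items.index(item)


c = SortedUniqueCollection()
for name in ["b", "a", "b", "c"]:
    c.add(name)
```

    Here rank() is an example of a method that arises from the combination:
    it would be ambiguous with duplicate members and meaningless without a
    defined order.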
    Best, Tobias
    [1] https://en.wikipedia.org/wiki/Trait_%28computer_programming%29
    ——– Original Message ——–
    Subject: Re: [rda-collection-wg] Minutes from yesterday’s call
    From: ThomasZastrow
    To: TobiasWeigel, RDA Collections WG
    Date: 03 Jun 2016, 15:42

  • #132955

    On 06/06/2016 05:20 AM, TobiasWeigel wrote:
    > Hello Tom,
    >
    > I concur that we may allow anything that has an address to be inside a
    > collection, however, there may be an important side condition: That
    > either the agent adding the item to a collection or the one
    > responsible for managing the collection can make a realistic claim
    > about the item’s current life cycle state and expected development.
    > There is not a clear distinction here, which is probably the cause for
    > many of our problems. Do we want to be totally arbitrary regarding the
    > items? Or should the benefit of using a collection API rather be that
    > you *can* assume that some essential information about item status
    > will be available? I don’t think we have a clear take on this yet.
    >> The latter is what I have been assuming, and without it it would be
    hard for me to justify the value of the collections API to our use cases.
    >
    > On your item a) – parts known what they are part of – the best example
    > for such cases are trees, where you would be unable to traverse
    > otherwise.
    >
    > On b) – I think you are right and there may be a line here that we do
    > not want to cross wrt the “backchannel” from item to collection you
    > explained. A collection should specify whether its constituency is
    > dynamic or static, but it is probably too difficult to answer this by
    > redirecting to individual items and leave the answer up to them.
    >
    > On the traits: It is in principle as you describe, gathering
    > properties and methods into flexible “chunks” that can be recombined
    > and their recombination may give rise to other special methods. I like
    > the model because it at least circumvents some of the issues with
    > multiple inheritance. I am currently sticking with it because it is
    > very flexible, but I am also not sure if this will be reflected in the
    > API at the end. Traits-based programming [1] is probably not the most
    > accessible paradigm and I’m not completely sure if this is the right
    > description for what’s currently in the document.
    >
    >> I think the traits are what I had previously been thinking of as

  • #132951

    Hello Bridget,
    well said – the ability to retrieve essential item status and membership
    information through the collection API is a deciding feature for many
    use case providers, including yours and also ours, actually.
    Best, Tobias
    ——– Original Message ——–
    Subject: Re: [rda-collection-wg] Minutes from yesterday’s call
    From: balmas
    To: ***@***.***-groups.org
    Date: 06 Jun 2016, 15:45
