
Relevancy Ranking Task Force Wiki


    Siri Jodha Khalsa
    Participant

    This is the Main Wiki page for the Relevancy Ranking Task Force

    Co-Leads
    Peter Cotroneo
    Mingfang Wu

    Contributors
    Anita de Waard
    Øystein Godøy
    Jeffrey Grethe
    Beth Huffer
    Siri Jodha Khalsa
    Jens Klump
    Lewis McGibbney
    Jun-ichi Onami
    Craig Willis

The survey instrument for current practices in relevancy ranking systems is available here.

A per-question summary and ongoing analysis of the survey data are available here.

    Goals

    Relevancy ranking is one specific feature of a data search system, yet it is an essential component in delivering what a data seeker is looking for. This task force aims to achieve the following goals:

    1. Help people choose appropriate technologies when implementing or improving search functionality at their repositories.
    2. Provide a means or forum for sharing experiences with relevancy ranking.
    3. Capture the aspirations, successes, and challenges reported by repository managers.
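As a toy illustration of the kind of search functionality the first goal refers to, the sketch below ranks dataset descriptions against a query using a basic TF-IDF score. The corpus and the scoring formula are illustrative assumptions for this page only, not a task force recommendation; production repositories would typically rely on an established engine such as Solr or Elasticsearch.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def rank(query, docs):
    """Score each document against the query with a simple TF-IDF
    weighting and return documents sorted by descending relevance."""
    n = len(docs)
    tokenized = [tokenize(d) for d in docs]
    # Document frequency: number of documents containing each term.
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))
    scores = []
    for i, toks in enumerate(tokenized):
        tf = Counter(toks)
        # Sum tf * idf over query terms present in the document.
        score = sum(
            tf[t] * math.log(1 + n / df[t])
            for t in tokenize(query) if t in tf
        )
        scores.append((score, i))
    return [docs[i] for _, i in sorted(scores, reverse=True)]

# Hypothetical dataset descriptions for illustration only.
datasets = [
    "global sea ice extent monthly climatology",
    "soil moisture measurements australia",
    "sea surface temperature satellite observations",
]
print(rank("sea ice", datasets)[0])
```

Here the description matching both query terms outranks the one matching only "sea", which in turn outranks the unrelated record; real systems layer field boosts, metadata quality signals, and usage statistics on top of such term-based scoring.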

    Progress

    To achieve the above goals, we have carried out the following activities:

    1. Identify the target audiences for the task force's recommendations and outputs:
      • For repositories (primary target) – how to implement, what to implement, and what tools are available in open code repositories.
      • For data producers – relates to choice of metadata standard, etc.  
      • For data searchers – how to formulate queries to get appropriate results ranking.
    2. Identify issues on search ranking from within a repository and evaluation methods.
    3. Identify current practices in relevancy ranking for data search by designing a draft survey questionnaire. It is expected that the aggregated survey results will be used to support the second and third goals.
    4. Explore possible testbeds to address data search challenges. Some possibilities may include:
      • Elsevier can provide AWS EC2 instances for a relevancy test bed. The Elsevier team could probably clone the machines that they used during the recent bioCADDIE Challenge.
      • Discuss with NDS Labs whether they can provide a testbed (NB: Anita can make the connection; discuss in Barcelona?)
      • ANDS can provide a corpus of the Research Data Australia repository.

    Deliverables

    At this stage we have been conducting a scoping study. We hope that after we conduct the survey we will be able to share where the community stands in implementing relevancy ranking, what the benchmarks are, and which common relevancy ranking activities data search implementers would like to initiate and participate in.

    Next Steps

    The planned future activities include:

    • Finalise the survey instrument and conduct the survey.
    • Analyse the survey result to understand current practices on relevancy ranking and prioritise future activities for the group.
    • Experimenting with and evaluating the various factors that affect relevancy ranking takes considerable effort; the task force will therefore collaborate with the data search community (or the search community at large) to explore realistic yet reliable ways for data repositories to carry out such comparison and evaluation tasks.

    Minutes

    The task force has held four meetings so far (roughly every two weeks). Notes from each meeting are available on the task force’s wiki page.
