BSI Hub

BSI Hub is a platform for translating and loading data from multiple systems, each of which uses different terminology, into a single database for collective access. For example, if a community of repositories wants to gather data in a centralized location to promote data sharing, it may run into difficulties because the data sets come from a wide array of legacy systems and are most likely stored in vastly different formats. Instead of forcing participating groups to migrate to a single platform, BSI Hub allows continued use of legacy systems while automatically translating their data into a single, uniform format in a BSI database so that the data can be viewed collectively.

Multiple translations can be created to handle data from different legacy formats. These translations are coded ahead of time by IMS staff using a specific set of requirements agreed upon by all involved parties.
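To make the idea of a translation concrete, the following is a minimal sketch in Python. It is not how IMS implements translations; the field names, codes, and the `translate_row` function are all hypothetical and only illustrate renaming legacy columns and mapping legacy codes onto a shared vocabulary.

```python
# Hypothetical mapping from one legacy system's column names to uniform names.
# Neither the legacy names nor the target names come from BSI documentation.
FIELD_MAP = {
    "SampleID": "Specimen ID",
    "Type": "Material Type",
    "Vol_mL": "Volume",
}

# Hypothetical mapping of legacy codes to the agreed vocabulary, per target field.
VALUE_MAP = {
    "Material Type": {"SER": "Serum", "PLS": "Plasma"},
}


def translate_row(row, field_map=FIELD_MAP, value_map=VALUE_MAP):
    """Return a copy of `row` with fields renamed and coded values translated."""
    translated = {}
    for legacy_name, value in row.items():
        if legacy_name not in field_map:
            # An unmapped field means the file does not match the agreed requirements.
            raise KeyError(f"No mapping defined for legacy field {legacy_name!r}")
        target = field_map[legacy_name]
        # Values without a vocabulary entry pass through unchanged.
        translated[target] = value_map.get(target, {}).get(value, value)
    return translated


legacy_row = {"SampleID": "A-001", "Type": "SER", "Vol_mL": "1.5"}
uniform_row = translate_row(legacy_row)
```

Each legacy system would have its own pair of mappings, which is why one translation is needed per legacy format.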

APIs are available to programmatically process files using a selected BSI Hub translation instead of submitting files via the website user interface.

Users can submit files to BSI Hub that contain only specimen data or that contain specimen data together with subject and/or location data. Subject and location data cannot be submitted to BSI Hub independently of specimen data. BSI Hub accepts .csv and .xlsx files for submission.
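The submission rules above can be checked before a file is ever sent. The sketch below is an illustration only: `check_submission` and its `data_kinds` argument are hypothetical, and treating the file extension as case-insensitive is an assumption rather than documented BSI Hub behavior.

```python
from pathlib import Path

# File types the text above says BSI Hub accepts.
ACCEPTED_EXTENSIONS = {".csv", ".xlsx"}


def check_submission(filename, data_kinds):
    """Return a list of problems; an empty list means the file looks submittable.

    `data_kinds` is a set drawn from {"specimen", "subject", "location"}
    describing what the file contains (a hypothetical convention).
    """
    problems = []
    # Assumption: extensions are compared case-insensitively.
    if Path(filename).suffix.lower() not in ACCEPTED_EXTENSIONS:
        problems.append(f"{filename}: only .csv and .xlsx files are accepted")
    # Subject and location data must accompany specimen data.
    if "specimen" not in data_kinds:
        problems.append(f"{filename}: file must contain specimen data")
    return problems
```

For example, `check_submission("export.csv", {"specimen", "subject"})` returns an empty list, while a file holding only subject data is reported as a problem.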

To process a file on BSI Hub:

  1. Log in to BSI Hub.
    1. If you have access to more than one repository, select the one to which you are submitting data.
  2. Select a value for the Type field. This will be the translation used to systematize the data from the legacy system into vocabulary matching that of BSI.
  3. Using the file explorer, upload the file to be submitted.
  4. If desired, enter Notes about the file upload.
  5. Select Submit.

[Image: BSI_Hub_interface.png]

After submission, files are funneled through Pentaho Data Integration (PDI), where data is translated according to the translation selected at submission. If an error is returned during translation, it is shown to the user on the BSI Hub site under the Processing Error tab of the submission. Once translation is complete, the data is pushed into BSI. In the BSI database, the loaded data is committed via Data Entry batches. An error at this stage is shown to the user on the submission's BSI Errors tab.
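The two stages and where their errors surface can be pictured with the short Python sketch below. It is a conceptual model only, not the PDI or BSI implementation: `process_submission`, `translate`, and `load` are hypothetical stand-ins for the translation step and the Data Entry batch commit.

```python
def process_submission(rows, translate, load):
    """Run the translate stage, then the load stage, and report which tab
    (per the description above) would show an error if one occurs."""
    try:
        # Stage 1: translation, performed by PDI in the real system.
        translated = [translate(row) for row in rows]
    except Exception as exc:
        return {"status": "failed", "tab": "Processing Error", "message": str(exc)}

    try:
        # Stage 2: commit into BSI, done via Data Entry batches in the real system.
        load(translated)
    except Exception as exc:
        return {"status": "failed", "tab": "BSI Errors", "message": str(exc)}

    return {"status": "loaded", "tab": None, "message": ""}
```

The point of the sketch is the ordering: a file that fails translation never reaches BSI, so the BSI Errors tab is only relevant once translation has succeeded.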

Example: A cohort of three biobanks decides to use BSI as the central repository for sharing their collected data with researchers. Each biobank works with IMS to build a translation that transforms the data in its files into a format readable by BSI.

  • Biobank #1 exports a master file containing all records from its legacy system once a week. A technician logs in to BSI Hub and submits the single file.
  • Biobank #2 programmatically exports files from its legacy system to a network drive nightly. It uses BSI Hub APIs to automatically submit new files stored in this network location to BSI Hub for processing each morning. Its staff do not log in to BSI Hub.
  • Biobank #3 exports a file containing all new or updated subject records and a second file containing all new or updated specimens stored with the legacy system. A technician logs in to BSI Hub and submits both files.
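An automated workflow like Biobank #2's could be structured roughly as in the Python sketch below. The BSI Hub API itself is not described here, so the sketch does not call it: `submit_file` is a hypothetical callable that the biobank would implement against the real API, and the JSON state file used to remember already-submitted files is likewise an assumption.

```python
import json
from pathlib import Path

ACCEPTED_EXTENSIONS = {".csv", ".xlsx"}


def submit_new_files(export_dir, state_file, submit_file):
    """Submit each not-yet-submitted .csv/.xlsx file in `export_dir`.

    `submit_file(path)` is a placeholder for the real BSI Hub API call.
    Names of submitted files are recorded in `state_file` so the next
    morning's run skips them. Returns the names submitted in this run.
    """
    export_dir, state_file = Path(export_dir), Path(state_file)
    seen = set(json.loads(state_file.read_text())) if state_file.exists() else set()

    submitted = []
    for path in sorted(export_dir.iterdir()):
        if not path.is_file() or path.suffix.lower() not in ACCEPTED_EXTENSIONS:
            continue
        if path.name in seen:
            continue
        submit_file(path)  # hypothetical: wraps the actual API request
        seen.add(path.name)
        submitted.append(path.name)

    state_file.write_text(json.dumps(sorted(seen)))
    return submitted
```

A scheduler such as cron or Windows Task Scheduler could run this each morning. Note that the sketch tracks files by name only, so a re-exported file with the same name would not be resubmitted.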