Training activity information

Details

Collect a complex dataset and store it in an open-source relational database management system suitable for further genomic analysis

Type

Developmental training activity (DTA)

Evidence requirements

Evidence the activity has been undertaken by the trainee.

Reflection on the activity at one or more time points after the event, including learning from the activity and/or areas of the trainee's practice for development.

An action plan to implement learning and/or to address skills or knowledge gaps identified.

Considerations

  • Data provenance
  • Database normalisation
  • Cleaning and tidying data
  • Data storage approaches
  • Databases
  • Relational algebra
  • SQL
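Several of these considerations (cleaning and tidying data, normalisation, SQL) can be illustrated together. The following is a minimal sketch only, not a prescribed approach: it uses Python's built-in sqlite3 module as a lightweight stand-in for a server-based system such as MariaDB or PostgreSQL, and every table name, column name, and record in it is hypothetical and made up for illustration.

```python
import sqlite3

# Hypothetical raw records (made-up values): sample details are repeated on
# every row and the formatting is inconsistent, as is common in collected data.
RAW_ROWS = [
    {"sample": " S001 ", "tissue": "Blood", "chrom": "chr7", "pos": "1000", "ref": "a", "alt": "T"},
    {"sample": "S001", "tissue": "blood", "chrom": "7", "pos": "2000", "ref": "C", "alt": "T"},
    {"sample": "S002", "tissue": "Saliva", "chrom": "CHR17", "pos": "3000", "ref": "C", "alt": "t"},
]


def tidy(row):
    """Return a cleaned copy of one raw record."""
    chrom = row["chrom"].strip()
    if chrom.lower().startswith("chr"):
        chrom = chrom[3:]  # store chromosome names without the "chr" prefix
    return {
        "sample": row["sample"].strip(),
        "tissue": row["tissue"].strip().lower(),
        "chrom": chrom,
        "pos": int(row["pos"]),
        "ref": row["ref"].strip().upper(),
        "alt": row["alt"].strip().upper(),
    }


def build_database(rows):
    """Create a two-table schema and load the tidied rows into it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        -- Sample details are stored once, not repeated for every variant.
        CREATE TABLE sample (
            sample_id INTEGER PRIMARY KEY,
            name      TEXT NOT NULL UNIQUE,
            tissue    TEXT NOT NULL
        );
        CREATE TABLE variant (
            variant_id INTEGER PRIMARY KEY,
            sample_id  INTEGER NOT NULL REFERENCES sample(sample_id),
            chrom      TEXT NOT NULL,
            pos        INTEGER NOT NULL CHECK (pos > 0),
            ref        TEXT NOT NULL,
            alt        TEXT NOT NULL,
            UNIQUE (sample_id, chrom, pos, ref, alt)
        );
        """
    )
    for row in map(tidy, rows):
        # Insert the sample only if it has not been seen before.
        conn.execute(
            "INSERT OR IGNORE INTO sample (name, tissue) VALUES (?, ?)",
            (row["sample"], row["tissue"]),
        )
        sample_id = conn.execute(
            "SELECT sample_id FROM sample WHERE name = ?", (row["sample"],)
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO variant (sample_id, chrom, pos, ref, alt) VALUES (?, ?, ?, ?, ?)",
            (sample_id, row["chrom"], row["pos"], row["ref"], row["alt"]),
        )
    conn.commit()
    return conn


def variants_for_sample(conn, name):
    """Join the two tables to list the variants recorded for one sample."""
    query = """
        SELECT v.chrom, v.pos, v.ref, v.alt
        FROM variant AS v
        JOIN sample AS s ON s.sample_id = v.sample_id
        WHERE s.name = ?
        ORDER BY v.chrom, v.pos
    """
    return conn.execute(query, (name,)).fetchall()


if __name__ == "__main__":
    connection = build_database(RAW_ROWS)
    print(variants_for_sample(connection, "S001"))
```

Splitting the data into a sample table and a variant table means each sample's details are held in one place, so a correction to a tissue label is made once and cannot drift between rows. Keeping tidy() separate from the loading step leaves a record of exactly how the raw values were changed, which supports data provenance.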

Reflective practice guidance

The guidance below supports reflection at different time points by offering questions to help you reflect on this training activity. The questions are provided for guidance and should not be treated as a mandatory checklist. Trainees are not expected to answer every question listed.

Before action

  • What knowledge of data collection methods and open-source database systems (e.g., MySQL/MariaDB, PostgreSQL) is required?
  • What challenges do you expect in collecting and integrating a complex dataset? How will you learn to populate and manage data within a relational database? What are your current skills in data collection and database management?
  • Will you plan the data collection process and research suitable open-source database systems? Have you discussed the dataset and database choice with your training officer? What potential issues might arise during data collection or storage? How do you feel about the complexity of this task?

In action

  • Where are you sourcing the complex dataset from? What steps are you taking to collect and prepare the data for storage? How are you interacting with the chosen relational database (e.g., using SQL commands or a graphical interface)? What decisions are you making about the database schema and data types?
  • Are you successfully collecting the complete dataset? Are you encountering any issues with data cleaning or transformation before loading? Is the data being stored correctly in the database?
  • If you encounter problems with data integrity or loading, are you adjusting your data preparation steps? Are you troubleshooting SQL errors or database connection issues? Are you considering alternative ways to structure the data within the database?
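When thinking about data integrity and loading problems, it can help to see how database constraints and transactions behave when a batch contains a bad record. The following is a minimal sketch under stated assumptions: it uses Python's built-in sqlite3 module in place of a server-based RDBMS, and the table name reading, its columns, and the function names are hypothetical.

```python
import sqlite3


def make_connection():
    """Create an in-memory database with constraints that guard data integrity."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE reading (
            sample_name TEXT NOT NULL UNIQUE,
            depth       INTEGER NOT NULL CHECK (depth >= 0)
        )
        """
    )
    return conn


def load_batch(conn, rows):
    """Insert every row in the batch or none of them.

    Returns (True, "loaded") on success, or (False, error_message) if a
    constraint is violated.
    """
    try:
        # Used as a context manager, the connection commits if the block
        # succeeds and rolls back if an exception is raised inside it.
        with conn:
            conn.executemany(
                "INSERT INTO reading (sample_name, depth) VALUES (?, ?)", rows
            )
    except sqlite3.IntegrityError as exc:
        return False, str(exc)
    return True, "loaded"


def count_rows(conn):
    """Return the number of rows currently stored in the reading table."""
    return conn.execute("SELECT COUNT(*) FROM reading").fetchone()[0]


if __name__ == "__main__":
    connection = make_connection()
    # The second record breaks the CHECK constraint, so nothing is stored.
    print(load_batch(connection, [("S001", 30), ("S002", -5)]))
    print(count_rows(connection))
```

Rejecting the whole batch keeps the database in a known state: the error message points to the constraint that failed, the source data can be corrected, and the load can be repeated without first removing partial results.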

On action

  • Describe the complex dataset you collected. What steps did you take to store it in an open-source relational database management system?
  • What did you learn about the process of collecting and preparing a complex dataset for storage? What practical skills did you gain in using an open-source relational database management system (RDBMS)? Were there any difficulties encountered in data collection or storage? What did you learn from these challenges? How did your approach during data collection and storage (‘reflect-in-action’) influence the final database? How does the ability to collect and store data in an RDBMS relate to your future work in Clinical Bioinformatics?
  • What aspects of data collection and database management do you want to develop further? How can you apply these skills to manage and analyse genomic data in your routine practice? What are the next steps you will take to improve your skills in data collection and RDBMS usage? What support or resources would be beneficial for further developing your database management skills?

Beyond action

  • Have you subsequently collected and stored other complex datasets? What challenges did you encounter and how did the experience of this training activity help you overcome them?
  • How has your understanding of open-source database systems evolved since this activity? Have you explored other systems?
  • How might your experience in data collection and management contribute to future research projects or the development of data resources within your workplace?

Relevant learning outcomes

#   Outcome
1   Arrange and store data for programmatic analysis.