Module information
Module details
- Title: Applied Statistics, Data Science and Quality in Clinical Bioinformatics
- Type: Specialist
- Module code: S-BG-S3
- Credits: 15
- Phase: 3
- Requirement: Compulsory
Aim of this module
This module will develop trainees’ familiarity with the data types encountered in genomic laboratories, including how to hold and statistically evaluate data, how to use programmatic methods to examine, interrogate and draw conclusions from data, and how to communicate the insight derived from the data to support clinical decision making.
Much of this module is well suited to a major project that covers many of the competencies detailed. An example project is developing and validating a bioinformatics tool or pipeline and then bringing it into service.
Work-based content
Training activities
# | Learning outcome | Training activity | Type |
---|---|---|---|
1 | 1 | Retrieve data from a REST application programming interface (API) and manipulate it to create dataframes in both Python 3 (pandas) and R | DTA |
2 | 2, 3 | Hold genomics data in dataframes and perform statistical analyses using both Python 3 (pandas) and R | DTA |
3 | 1 | Design and describe a relational data model for genomics | DTA |
4 | 1 | Collect a complex dataset and store it in an open-source relational database management system suitable for further genomic analysis | DTA |
5 | 2, 3 | Analyse the variation in a genomics dataset by deriving summary statistics programmatically, and justify the choice of summary statistics | DTA |
6 | 3, 4 | Visualise the variation for multiple aspects of a genomics dataset programmatically using multiple plots | DTA |
7 | 5 | Review and critique an existing quality control process for a diagnostic assay | DTA |
8 | 4, 5 | Determine an appropriate threshold for a quality control metric for genomic data, and explain the quality control metric and threshold to a laboratory colleague who is not a bioinformatician | DTA |
9 | 3, 4, 5 | Plan an implementation for an improvement to an existing quality control process for a diagnostic assay | DTA |
10 | 5 | Review tests which have failed next generation sequencing (NGS) quality control metric thresholds, and identify the reason for failure and downstream consequences | ETA |
11 | 3 | Describe the differences between two genomics datasets by applying statistical methods including several of the following: | DTA |
12 | 4 | Summarise and present the results of a statistical data analysis of genomic data to laboratory colleagues, using appropriate visualisation of the data to support your explanation | DTA |
13 | 3, 4 | Select one widely used and one not yet established metric used for variant interpretation, and explain to Clinical Scientists in Genomics how they are calculated, how to apply them appropriately in clinical use, and their limitations | DTA |
14 | 4, 6 | Present the opportunities and challenges in applying machine learning for genomics | DTA |
15 | 4, 7 | Complete or revise a data protection impact assessment for a data analysis process and make recommendations for action where required | DTA |
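As an illustration of the quality control activities above (determining a threshold for a metric and reviewing runs that fail it), the following is a minimal sketch only. It assumes a hypothetical metric (mean coverage depth per run), made-up values, and a simple rule of median minus three scaled median absolute deviations. The appropriate metric and rule for a real assay would need to be justified from validation data.

```python
import statistics


def mad_lower_threshold(values, k=3.0):
    """Return median - k * scaled MAD as a lower acceptance limit."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    # 1.4826 scales the MAD so it is comparable with a standard deviation
    # when the data are approximately normally distributed.
    return med - k * 1.4826 * mad


def failing_runs(runs, threshold):
    """Return the IDs of runs whose metric falls below the threshold."""
    return [run_id for run_id, value in runs.items() if value < threshold]


# Illustrative (made-up) mean coverage depth for historical runs.
historical = [312.0, 298.5, 305.2, 320.1, 301.7, 310.4, 295.9, 308.8]
threshold = mad_lower_threshold(historical)

# Hypothetical new runs to be checked against the derived threshold.
new_runs = {"run_A": 304.0, "run_B": 142.3, "run_C": 299.1}
failed = failing_runs(new_runs, threshold)

print(f"lower threshold: {threshold:.1f}")
print("failed runs:", failed)
```

A median-based rule is used here because it is less affected by occasional outlying runs than a mean and standard deviation; that choice is itself something to justify when explaining the threshold to a colleague.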
Assessments
Complete 3 Case-Based Discussions
Complete 3 DOPS or OCEs
Direct Observation of Practical Skills Titles
- Select an appropriate statistical test and justify its use to answer a specific question about the data.
- Generate a quality control report for an assay.
- Plot the results of a statistical data analysis.
Observed Clinical Event Titles
- Present the results of a statistical data analysis to clinicians, clinical scientists (non-bioinformatician) or technologists.
- Demonstrate the use of a pathogenicity prediction algorithm to clinicians, clinical scientists (non-bioinformatician) or technologists.
- Explain a quality control report to a non-bioinformatician.
Learning outcomes
# | Learning outcome |
---|---|
1 | Arrange and store data for programmatic analysis. |
2 | Perform programmatic data analysis. |
3 | Apply statistical methods to derive meaningful conclusions from data to support clinical decision making. |
4 | Summarise results of data analysis to stakeholders. |
5 | Appraise laboratory quality control systems. |
6 | Evaluate the potential of emerging methods in data science and the application to Clinical Bioinformatics Genomics. |
7 | Practise in accordance with data protection legislation. |
Clinical experiences
Clinical experiences help you to develop insight into your practice and a greater understanding of your specialty's impact on patient care. Clinical experiences should be included in your training plan and you may be asked to help organise your experiences. Reflections and observations from your experiences may help you to advance your practice and can be used to develop evidence to demonstrate your awareness and appreciation of your specialty.
Activities
- Attend an information governance meeting to understand the application of information governance guidance in a clinical setting.
- Observe how data is entered into a hospital system, such as a patient administration system or electronic health record system, and appreciate the manual and automated aspects.
- Observe the review of a QC report by laboratory staff to gain insight into how quality is maintained and the process for failing samples.
- Appreciate emerging international bioinformatics standards and the impact of their adoption in the NHS.
- Attend a Genetic Counselling appointment where risk and statistics are being explained to a patient.
Academic content (MSc in Clinical Science)
Important information
The academic parts of this module will be detailed and communicated to you by your university. Please contact them if you have questions regarding this module and its assessments. The module titles in your MSc may not be identical to the work-based modules shown in the e-portfolio. Your modules will be aligned, however, to ensure that your academic and work-based learning are complementary.
Learning outcomes
On successful completion of this module the trainee will be able to:
- Demonstrate the application of SQL, R and a high-level programming language to perform data analyses.
- Apply integrative knowledge of fundamental statistical concepts.
- Critically evaluate and select appropriate statistical tests for genomic datasets.
- Design, build, populate and query genomics databases.
- Critically evaluate, select and apply effective data visualisation methods suitable for genomics datasets.
Indicative content
Databases
- Designing and using relational databases
- Common RDBMS, including MySQL/MariaDB and PostgreSQL
- Structured query language (SQL) commands
- Database programmatic access
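The database topics above can be sketched in a few lines. This is an illustrative example only: it uses SQLite through Python's built-in `sqlite3` module as a stand-in for MySQL/MariaDB or PostgreSQL, and the two-table schema, sample labels and variant coordinates are hypothetical.

```python
import sqlite3

# In-memory database, so the example leaves nothing on disk.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical two-table model: one sample has many variant calls.
conn.executescript("""
CREATE TABLE sample (
    sample_id INTEGER PRIMARY KEY,
    label     TEXT NOT NULL UNIQUE
);
CREATE TABLE variant_call (
    call_id   INTEGER PRIMARY KEY,
    sample_id INTEGER NOT NULL REFERENCES sample(sample_id),
    chrom     TEXT NOT NULL,
    pos       INTEGER NOT NULL,
    ref       TEXT NOT NULL,
    alt       TEXT NOT NULL,
    depth     INTEGER NOT NULL
);
""")

# Parameterised inserts avoid building SQL strings by hand.
conn.executemany(
    "INSERT INTO sample (sample_id, label) VALUES (?, ?)",
    [(1, "S001"), (2, "S002")],
)
conn.executemany(
    "INSERT INTO variant_call (sample_id, chrom, pos, ref, alt, depth) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "chr1", 12345, "A", "G", 88),   # made-up coordinates
        (1, "chr2", 67890, "C", "T", 15),
        (2, "chr1", 12345, "A", "G", 102),
    ],
)
conn.commit()

# Join the tables and summarise the calls per sample.
rows = conn.execute("""
    SELECT s.label, COUNT(*) AS n_calls, AVG(v.depth) AS mean_depth
    FROM sample AS s
    JOIN variant_call AS v ON v.sample_id = s.sample_id
    GROUP BY s.label
    ORDER BY s.label
""").fetchall()

for label, n_calls, mean_depth in rows:
    print(label, n_calls, mean_depth)
```

The same SQL would need only minor dialect changes for a server-based RDBMS; the Python access pattern (connect, execute with parameters, fetch) is similar across database drivers.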
Data analysis
- Python 3 and data analysis packages such as NumPy and pandas
- R for data analysis, RStudio and tidyverse
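As a small illustration of preparing data for analysis, the sketch below flattens a nested JSON payload, of the kind a REST API might return, into flat records and derives a per-sample summary. The payload is hard-coded rather than fetched, and its field names (`results`, `metrics`, `depth`, `vaf`) are hypothetical. Only the standard library is used so that the example runs anywhere; the resulting list of dictionaries could equally be passed to `pandas.DataFrame`.

```python
import json
from collections import defaultdict
from statistics import mean

# Hard-coded stand-in for the body of an API response (hypothetical fields).
payload = """
{"results": [
  {"sample": "S001", "gene": "GENE_A", "metrics": {"depth": 88, "vaf": 0.48}},
  {"sample": "S001", "gene": "GENE_B", "metrics": {"depth": 15, "vaf": 0.10}},
  {"sample": "S002", "gene": "GENE_A", "metrics": {"depth": 102, "vaf": 0.52}}
]}
"""


def flatten(record):
    """Lift the nested 'metrics' object into top-level columns."""
    row = {key: value for key, value in record.items() if key != "metrics"}
    row.update(record.get("metrics", {}))
    return row


# One flat dictionary per row, ready for tabular analysis.
records = [flatten(r) for r in json.loads(payload)["results"]]

# Group depth values by sample and summarise each group.
depth_by_sample = defaultdict(list)
for row in records:
    depth_by_sample[row["sample"]].append(row["depth"])
mean_depth = {sample: mean(depths) for sample, depths in depth_by_sample.items()}

print(records[0])
print(mean_depth)
```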
Statistics
- Common statistical concepts in genomics and bioinformatics
- Normal distribution, standard deviation and standard error of the mean
- Sample size and power calculations
- Odds ratios and effect sizes
- Linear and logistic regression
- Correct selection of statistical tests
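Two of the concepts listed above, the standard error of the mean and the odds ratio, can be worked through in a short sketch. The counts and measurements are made up for illustration, and the confidence interval uses the common log-based (Woolf) approximation, which is one of several available methods and is not suitable when any cell of the table is zero.

```python
import math
import statistics


def standard_error_of_mean(values):
    """Sample standard deviation divided by the square root of n."""
    return statistics.stdev(values) / math.sqrt(len(values))


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI for a 2x2 table.

    a = exposed cases,   b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    Uses the log (Woolf) method; all four counts must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Made-up measurements and counts, for illustration only.
sem = standard_error_of_mean([2, 4, 4, 4, 5, 5, 7, 9])
odds_ratio, ci_lower, ci_upper = odds_ratio_with_ci(a=20, b=80, c=10, d=90)

print(f"SEM = {sem:.3f}")
print(f"OR = {odds_ratio:.2f} (95% CI {ci_lower:.2f} to {ci_upper:.2f})")
```

In this made-up table the interval spans 1, so the data would not on their own support a claim of association despite an odds ratio above 2; interpreting the interval alongside the point estimate is part of selecting and reporting statistics correctly.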
Data visualisation
- Plotting data with ggplot2 and matplotlib
Machine learning
- Machine learning principles
- Critical evaluation of machine learning applications
Module assigned to
Specialties
Specialty code | Specialty title |
---|---|
SBI1-1-22 | Clinical Bioinformatics Genomics [2022] |
SBI1-1-23 | Clinical Bioinformatics Genomics [2023] |
SBI1-1-24 | Clinical Bioinformatics Genomics [2024] |