Module information

Module details

Title: Data Management
Type: Specialist
Module code: SBI220
Credits: 10
Requirement: Compulsory

Aim of this module

The aim of this rotation is to provide trainees with an overview of the key elements of the management of public health data and its impact on patients and the public. This will include areas such as data governance and basic systems development. Effective public health action requires an informed understanding of both the data and the topic area, underpinned by a sound scientific interpretation of the evidence. Such evidence must frequently be transformed from raw data into consumable information before it can be used for making decisions, determining policy, and conducting and evaluating public health programmes. High-quality, accurate public health data support the development of public health policy and strategy and the development and introduction of public health programmes, and ultimately improve health outcomes. The aim of this module is to enable the trainee to develop their knowledge and understanding of data management and apply their skills to ensure that data are governed appropriately and managed in accordance with legislation and good practice guidelines.

Work-based content

Competencies

  1. Liaise with an information manager to identify how a database specification should be written. (Learning outcome 1)
  2. Define the user requirements for the relational database, including purpose, scope, structure and security. (Learning outcome 1)
  3. Identify the key elements of the database’s design, including relationships and keys/indexes and techniques to provide quality assurance. (Learning outcome 1)
  4. Produce appropriate documentation to define the system. (Learning outcome 1)
  5. Identify a database containing health or exposure data and extract the relevant fields from multiple tables within the database to answer a specific question using Structured Query Language (SQL). (Learning outcome 2)
  6. Import data into software which allows further data analysis, e.g. R, SQL. (Learning outcome 2)
  7. Run queries to identify quality issues, including coding anomalies, incomplete data and general data inaccuracies. (Learning outcome 2)
  8. Resolve issues identified where possible. (Learning outcome 2)
  9. Identify relevant fields and de-duplicate data. (Learning outcome 2)
  10. Manipulate the data to produce aggregated counts. (Learning outcome 2)
  11. Produce a short report describing and evaluating the quality issues identified. (Learning outcome 3)
  12. Propose recommendations to resolve or mitigate the data quality issues. (Learning outcome 3)
  13. Present findings to colleagues, defend the recommendations and agree an action plan. (Learning outcome 3)

Assessments

Information:

This module has no work-based assessments.

Learning outcomes

  1. Document and design a specification for a relational database for collecting or storing health data, ensuring compliance with security, governance and ethical issues.
  2. Extract, import and manipulate data within a data set.
  3. Draft a report summarising the quality of the data, make recommendations required to improve the data quality and agree an action plan.

Academic content (MSc in Clinical Science)

Important information

The academic parts of this module will be detailed and communicated to you by your university. Please contact them if you have questions regarding this module and its assessments. The module titles in your MSc may not be exactly identical to the work-based modules shown in the e-portfolio. Your modules will be aligned, however, to ensure that your academic and work-based learning are complementary.

Learning outcomes

  1. Describe the key elements and purpose of data sharing/access agreements to protect patients and the public.
  2. Discuss security issues around client server systems and system security in the context of NHS data governance and ethical concerns.
  3. Describe the key elements of databases, including purpose, scope and application.
  4. Identify the key elements and importance of database design.
  5. Describe the principles and steps of data linkage.
  6. Manipulate, manage and quality assure data within a data set.
  7. Discuss the importance of data quality and explain how to mitigate and improve it.
  8. Interrogate an SQL database.
  9. Describe the principles of data warehousing.
  10. Summarise the concept of data mining.

Indicative content

  • Describe the key elements and purpose of data sharing/access agreements to protect patients and the public
    • Patient confidentiality
    • Caldicott/Caldicott Guardians
    • Legal issues, e.g. legislation (Data Protection Act), Information Commissioner’s Office, impact of breaches
    • Data sharing agreements
    • NHS Information Governance Toolkit – overview, how it helps safeguard data
    • Systems security – user access controls, firewalls, encryption (s/w, h/w, database)
    • Pseudonymisation
    • System level security policy
    • Data flow diagrams
    • Risk assessment/Patient identifiable information
    • Disaster recovery/resilience
    • Secure information exchange
    • Freedom of information requirements
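
As an illustration of the pseudonymisation item in the list above, the following minimal Python sketch replaces an identifier with a keyed hash (HMAC-SHA256). It is one possible approach and not a prescribed method: the function name `pseudonymise`, the key and the identifier are all hypothetical, and real key management would follow local information governance policy.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 digest (hex string) of an identifier.

    The same identifier and key always give the same pseudonym, so records
    can still be matched, but the original value is not stored.
    """
    # Normalise formatting so "123 456 7890" and "1234567890" match.
    normalised = identifier.replace(" ", "").upper()
    return hmac.new(secret_key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


# Made-up key and identifier, for illustration only.
key = b"example-key-held-separately-from-the-data"
pseudonym_a = pseudonymise("123 456 7890", key)
pseudonym_b = pseudonymise("1234567890", key)
```

Using a keyed hash instead of a plain hash means that someone without the key cannot rebuild the pseudonyms simply by hashing every possible identifier.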
  • Discuss security issues around client server systems and system security in the context of NHS data governance and ethical concerns
    • Logins and access to databases, server/database roles and in-database roles (access to specific tables, etc.)
    • Database logs
    • Disaster recovery
    • Principles of information governance and awareness of the safe and effective use of health and social care information
    • Recognise and respond appropriately to situations where it is necessary to share information to safeguard service users or the wider public
    • The need to manage records and all other information in accordance with applicable legislation, protocols and guidelines
  • Describe the key elements of databases, including purpose, scope and application
    • What is a database and why are they needed?
    • Kinds of database (relational, document, graph)
    • Products – SQL Server, Oracle, Access
    • What can we do with them? (lookup, analysis, online transaction processing [OLTP], online analytical processing [OLAP], machine learning, stream analytics)
  • Describe the key elements and importance of database design
    • Creating the database – considerations on size, logical structure for data storage
    • Structure – tables, views, joins, primary/foreign keys
    • Field types
    • Nulls and empty strings – how do we deal with missing data?
    • Normalisation (to save space and not repeat data, primary/foreign keys)
    • Indexes
    • Importance of documentation
    • Importance of using standardised coding
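
The design elements listed above (tables, primary/foreign keys, field types, nulls, indexes and standardised codes) can be sketched with Python's built-in `sqlite3` module. SQLite is used here only because it needs no server; the table names, column names and code list are hypothetical, and syntax details differ in products such as SQL Server or Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE patient (
    patient_id INTEGER PRIMARY KEY,                 -- primary key
    sex_code   TEXT NOT NULL
               CHECK (sex_code IN ('M', 'F', 'U')), -- standardised coding
    birth_date TEXT                                 -- NULL allowed: may be unknown
);
CREATE TABLE test_result (
    result_id   INTEGER PRIMARY KEY,
    patient_id  INTEGER NOT NULL
                REFERENCES patient (patient_id),    -- foreign key
    sample_date TEXT NOT NULL,
    organism    TEXT                                -- NULL until identified
);
-- Index to speed up look-ups of results by patient.
CREATE INDEX idx_result_patient ON test_result (patient_id);
""")

conn.execute("INSERT INTO patient VALUES (1, 'F', '1980-05-17')")
conn.execute("INSERT INTO test_result VALUES (10, 1, '2024-01-03', NULL)")

# A result that points at a non-existent patient is rejected.
try:
    conn.execute("INSERT INTO test_result VALUES (11, 99, '2024-01-04', 'E. coli')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Storing patient details once and referring to them by key is the normalisation idea from the list: each result row holds only `patient_id` and does not repeat the patient's sex and date of birth.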
  • Describe the principles and steps of data linkage
    • What it is, why we need to do it and principles
    • Common problems
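
A minimal sketch of deterministic (exact-match) linkage, assuming two hypothetical tables that share a surname and date of birth. It also shows one common problem, inconsistent formatting of the matching fields, handled by trimming and upper-casing before comparison. Probabilistic linkage is not covered here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lab_report (report_id INTEGER PRIMARY KEY, surname TEXT, birth_date TEXT);
CREATE TABLE admission  (admission_id INTEGER PRIMARY KEY, surname TEXT, birth_date TEXT);
INSERT INTO lab_report VALUES
  (1, 'Smith ', '1980-05-17'),
  (2, 'JONES',  '1975-11-02'),
  (3, 'Patel',  '1990-01-30');
INSERT INTO admission VALUES
  (100, 'SMITH', '1980-05-17'),
  (101, 'Jones', '1975-11-02');
""")

# LEFT JOIN keeps every lab report, with NULL where no admission matches.
links = conn.execute("""
SELECT l.report_id, a.admission_id
FROM lab_report AS l
LEFT JOIN admission AS a
  ON UPPER(TRIM(l.surname)) = UPPER(TRIM(a.surname))
 AND l.birth_date = a.birth_date
ORDER BY l.report_id
""").fetchall()

unlinked = [report_id for report_id, admission_id in links if admission_id is None]
```

Listing the unlinked records is a useful by-product: they are the ones to review for missing or misrecorded identifiers.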
  • Manipulate, manage and quality assure data within a data set
    • Variables – types, numeric formats, decimals, date and time, string
    • Getting data into and out of programs
    • Documentation commands – labels
    • Calculations – generate and replace, recoding, checking correctness, missing data
    • Data structure – selecting observations and variables, renaming and reordering, sorting, collapsing data, combining files
    • Data entry – folders, filenames, variable names, error prevention
    • Approaches and methods to ensure reproducibility of analyses and outputs
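
Two of the manipulation tasks above, removing duplicate records and collapsing data to aggregated counts, can be sketched in SQL run from Python. The `notification` table and its contents are made up for illustration, and the definition of a duplicate used here (all three fields identical) is an assumption that would need agreeing for a real data set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE notification (patient_id INTEGER, disease TEXT, report_date TEXT);
INSERT INTO notification VALUES
  (1, 'measles', '2024-01-03'),
  (1, 'measles', '2024-01-03'),   -- exact duplicate of the row above
  (2, 'measles', '2024-01-05'),
  (3, 'mumps',   '2024-01-05'),
  (3, 'mumps',   '2024-01-09');   -- same patient, later report
""")

# De-duplicate on all three fields, then count reports and distinct patients.
counts = conn.execute("""
SELECT disease,
       COUNT(*)                   AS reports,
       COUNT(DISTINCT patient_id) AS patients
FROM (SELECT DISTINCT patient_id, disease, report_date FROM notification)
GROUP BY disease
ORDER BY disease
""").fetchall()
```

Keeping the whole process in a script, with no hand editing of the data, is one simple way to support the reproducibility item in the list.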
  • Data quality
    • Principles: completeness, accuracy, validity, timeliness, consistency
    • Standards
    • Data validation
    • Data dictionaries, standards and coding
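
As a sketch of data validation against a data dictionary, the query below counts missing and invalid values in a hypothetical `case_record` table. The allowed sex codes ('M', 'F', 'U') are an assumed code list, not an official standard, and the date check relies on SQLite's `date()` function returning NULL for strings it cannot parse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE case_record (case_id INTEGER PRIMARY KEY, sex_code TEXT, onset_date TEXT);
INSERT INTO case_record VALUES
  (1, 'F',  '2024-02-01'),
  (2, 'm',  '2024-02-03'),   -- lower case: not in the code list
  (3, NULL, '2024-02-04'),   -- missing sex
  (4, 'M',  ''),             -- empty string used for a missing date
  (5, 'X',  '2024-13-01');   -- unknown code and impossible month
""")

# Each comparison evaluates to 1 or 0, so SUM counts the rows that fail.
summary = conn.execute("""
SELECT
  COUNT(*) AS total,
  SUM(sex_code IS NULL OR TRIM(sex_code) = '') AS sex_missing,
  SUM(sex_code IS NOT NULL AND TRIM(sex_code) <> ''
      AND sex_code NOT IN ('M', 'F', 'U')) AS sex_invalid,
  SUM(onset_date IS NULL OR TRIM(onset_date) = '') AS onset_missing,
  SUM(onset_date IS NOT NULL AND TRIM(onset_date) <> ''
      AND date(onset_date) IS NULL) AS onset_invalid
FROM case_record
""").fetchone()

total, sex_missing, sex_invalid, onset_missing, onset_invalid = summary
completeness_sex = 1 - sex_missing / total  # proportion of rows with sex recorded
```

Missing and invalid values are counted separately because they usually need different fixes: missing values go back to the data provider, whereas invalid codes can often be corrected by recoding.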
  • Interrogate an SQL database
    • Using a dummy data set
    • Importing data
      • Import and Export Wizard
      • SQL insert, update, delete
    • Querying the data
      • SQL Management Studio
      • Select, Where, Group by, Order by, Count
      • Joins
      • Formats and conversion
      • Handling nulls
      • Query efficiency
    • ODBC – Access, Excel, EpiData, R, etc.
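
The query elements listed above (Select, Where, Group by, Order by, Count, joins and null handling) can be combined in a single statement. This is a minimal sketch against a made-up dummy data set, run through Python's `sqlite3` module; all table and column names are hypothetical, and SQL Server syntax may differ in places.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient (patient_id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE test_result (
    result_id INTEGER PRIMARY KEY, patient_id INTEGER,
    sample_date TEXT, organism TEXT);
INSERT INTO patient VALUES (1, 'North'), (2, 'North'), (3, NULL);
INSERT INTO test_result VALUES
  (10, 1, '2024-01-03', 'Salmonella'),
  (11, 2, '2024-01-10', 'Salmonella'),
  (12, 3, '2024-01-12', 'Salmonella'),
  (13, 3, '2023-12-30', 'Salmonella'),   -- before the date cut-off
  (14, 1, '2024-01-15', NULL);           -- organism not yet known
""")

# Count Salmonella results per region from 1 January 2024 onwards.
# COALESCE turns a NULL region into the label 'Unknown'.
rows = conn.execute("""
SELECT COALESCE(p.region, 'Unknown') AS region, COUNT(*) AS n_results
FROM test_result AS r
JOIN patient AS p ON p.patient_id = r.patient_id
WHERE r.organism = ? AND r.sample_date >= ?
GROUP BY COALESCE(p.region, 'Unknown')
ORDER BY n_results DESC, region
""", ("Salmonella", "2024-01-01")).fetchall()
```

Result 14 is excluded because comparing NULL with `'Salmonella'` is neither true nor false, so the row fails the `WHERE` test. Passing values as `?` parameters, not pasting them into the SQL text, avoids quoting errors and SQL injection.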
  • Describe the principles of data warehousing
    • Definition of data warehouse
    • Application
    • Process: data cleaning, data integration and data consolidation
  • Summarise the concept of data mining
    • Use and application of data mining
    • Data mining tools
    • Steps in data mining: change detection, dependency modelling, clustering, classification, regression, summarisation, results validation

Clinical experiences

Important information

Clinical experiential learning is the range of activities trainees may undertake in order to gain the experience and evidence to demonstrate their achievement of module competencies and assessments. The list is not definitive or mandatory, but training officers should ensure, as best training practice, that trainees gain as many of these clinical experiences as possible. They should be included in training plans, and once undertaken they should support the completion of module assessments and competencies within the e-portfolio.

Activities

  • Arrange a face-to-face interview with an information governance officer, having prepared a document describing which regulations and legislation you feel are relevant to data and data management, including the issues that arise when you are using patient identifiable information. Discuss the document and then amend it to reflect any additional information.
  • Shadow a database manager or system administrator to better understand the process of developing databases and their relationships.