Training activity information
Details
Produce a functional genomic sequence data file in an appropriate format and perform quality analysis of the sequence data
Type
Entrustable training activity (ETA)
Evidence requirements
Evidence that the activity has been undertaken by the trainee repeatedly, consistently, and effectively over time, in a range of situations. This may include occasions where the trainee has not successfully achieved the outcome of the activity themselves, for example because it was not appropriate to undertake the task in the circumstances, or because the trainee recognised their own limitations and sought help or advice to ensure the activity reached an appropriate conclusion.
Reflection at multiple timepoints in the trainee's learning journey for this activity.
Considerations
- Relevant file formats, e.g., BCL and FASTQ
- Whole genome sequencing (WGS), whole exome sequencing (WES) and targeted sequencing
- Sequencing approaches, e.g., single-end, paired-end and multiplexed sequencing
- Tools for assessing the quality of the data, and the meaning of the quality metrics, e.g., number of reads or Q30 (the percentage of bases with a Phred quality score of at least 30)
- Storing data in appropriate formats according to local and national standards, taking into consideration patient confidentiality
- Data integrity and patient safety
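To illustrate the quality metrics mentioned in the considerations above, the following is a minimal sketch of how the number of reads and the proportion of bases at or above Q30 could be derived from a FASTQ file. It assumes an uncompressed, four-line-per-record FASTQ with Phred+33 quality encoding; the function name `fastq_quality_summary` and the two example reads are hypothetical. It is intended only to show what the metrics mean, not to replace locally validated quality control tools.

```python
import io


def fastq_quality_summary(handle, phred_offset=33, threshold=30):
    """Count reads and the fraction of bases at or above a quality threshold.

    Assumes four-line FASTQ records and Phred+33 encoding (an assumption;
    check the encoding used by your sequencing platform).
    """
    reads = 0
    bases = 0
    bases_at_or_above = 0
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # tolerate blank lines between records
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, not needed here
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(sequence) != len(quality):
            raise ValueError(f"Malformed FASTQ record near read {reads + 1}")
        reads += 1
        bases += len(quality)
        # Convert each quality character to its Phred score and compare.
        bases_at_or_above += sum(
            1 for ch in quality if ord(ch) - phred_offset >= threshold
        )
    fraction = bases_at_or_above / bases if bases else 0.0
    return {"reads": reads, "bases": bases, "fraction_q30": fraction}


# Two made-up reads: 'I' encodes Q40, '5' encodes Q20, '!' encodes Q0.
example = io.StringIO(
    "@read1\nACGT\n+\nIIII\n"
    "@read2\nACGT\n+\nII5!\n"
)
summary = fastq_quality_summary(example)
print(summary)
```

In this example six of the eight bases have a quality score of at least 30, so the Q30 fraction is 0.75. Whether a given value is acceptable depends on the thresholds defined in local procedures.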
Reflective practice guidance
The guidance below supports reflection at different time points by offering questions to help you reflect on this training activity. The questions are provided for guidance and should not be treated as a mandatory checklist. Trainees are not expected to answer every question listed.
Before action
- What are the intended outcomes of producing a functional genomic sequence data file in the correct format and successfully performing quality analysis on it? What specific data format (e.g., BAM, SAM, FASTQ) and expected quality metrics or thresholds define a ‘functional’ file and successful quality analysis, ensuring you work within your scope of practice?
- What do you already know about working with genomic sequence data files, bioinformatic data formats like FASTQ, BAM, or SAM, and quality analysis tools? What possible challenges might you face during the activity, such as handling large file sizes, unfamiliar software, or troubleshooting data with poor quality? How might you handle these challenges? When would you need to seek advice or help if you encounter significant data issues or are unsure about interpreting quality metrics?
- What specific skills do you want to develop, such as proficiency in using particular quality analysis tools or working with various genomic data formats? What specific insights do you hope to gain about common issues affecting sequencing data quality, or the relationship between raw data and downstream analysis steps?
- Have you reviewed any notes from previous attempts at this or similar tasks regarding challenges or areas for improvement? What important information do you need to consider before embarking on the activity, such as reviewing Standard Operating Procedures (SOPs) related to data handling and quality control, or specific instructions for the dataset you will be using?
In action
- As you are producing the data file or performing quality analysis, make a note of anything that feels surprising or different from what you anticipate. For example, does a specific quality metric (e.g., Q-score, coverage) fall unexpectedly below thresholds, is the file format different from what you prepared for, or do you encounter an unfamiliar error message during software execution? Consider how this experience compares with previous experiences of similar activities, such as working with other data files or running quality control on different datasets. Does it feel more or less familiar, or are there new challenges you have not faced before?
- Identify how any unexpected developments, such as unusually poor data quality or a software crash, impact your immediate actions. Do you immediately try to troubleshoot the error, pause to review documentation, or seek advice from a colleague? Do you adapt or change your approach to producing the file or performing the quality analysis? For instance, do you try a different quality analysis tool, adjust parameters, or change your sequence of investigation? Do you find it difficult to adapt to the unexpected issue? Does it affect your confidence in your ability to assess data quality or produce the required file? Do you feel positive you can reach a successful conclusion?
- Identify how you work within your scope of practice when dealing with the unexpected event. Do you recognise when you might need to seek immediate advice or help, such as when data issues seem insurmountable or are beyond your current understanding of troubleshooting? Identify what you learn as a result of the unexpected development. For example, do you discover a new troubleshooting technique for a specific data quality issue, or a nuance about a particular file format that improves your handling of it?
On action
- Summarise the key steps you took to produce the genomic sequence data file and perform the quality analysis. Describe the specific file format you used and why it was appropriate. Detail the quality control (QC) checks you performed and the results. Were there any specific events, actions, or interactions that felt important during the process?
- What specific learning can you take from producing the data and performing the QC? For example, what strengths did you demonstrate in using tools or interpreting QC metrics? What skills or knowledge gaps were evident regarding file formats or QC standards? How did this experience compare against previous times you have produced or analysed sequence data? Were any previous actions for development in this area achieved? Do you feel your practice has improved? Identify any challenges you experienced, such as issues with data volume, software, or interpreting complex QC reports, and how you reacted to them. Did these challenges affect your ability to deal with the situation? Were you able to overcome them? Was there anything significant about this activity, such as needing to seek advice or clarification on a specific QC metric or file format?
- Identify the specific actions or ‘next steps’ you will take based on this experience to support your learning. What will you do differently next time you produce sequence data or perform quality analysis on it? Has anything changed in terms of what you would do if faced with a similar situation again? Do you need to practise any specific aspects of data production, formatting, or QC further? How will you act on any feedback you received on your data file or QC report?
Beyond action
- Think back on the different times you have performed this specific activity of producing and analysing sequence data quality. Have you revisited your notes or reflections from those previous experiences? What specific points for improvement did you identify regarding producing functional data or performing quality analysis? Have you taken actions to address those identified areas, such as refining scripts, adjusting parameters, or improving documentation? Are you now ready to demonstrate this new, improved learning consistently in your future quality analysis activities? Have you discussed challenges and successes in performing this activity with peers or colleagues? Did their experiences or perspectives change how you view certain aspects of data quality analysis?
- How has the accumulation of experiences in performing quality analysis over multiple sequencing runs or datasets developed your practice? Do you now more readily identify common quality issues or select appropriate metrics? How does your cumulative learning from performing this activity prepare you for assessing sequencing quality or presenting quality control checks? Based on your repeated experiences, do you have a better sense of when data quality issues might be beyond your current scope of practice to fix, requiring escalation or advice?
Relevant learning outcomes
| # | Outcome |
|---|---|
| 1 | Explain the structure of the human genome and the impact of variation on human development, health, and disease. |
| 2 | Evaluate sources of information about variation in the human genome including access, application and clinical impact. |
| 3 | Select appropriate tools for next generation sequencing (NGS) analysis of inherited and acquired disease. |
| 4 | Analyse NGS data in a clinical setting applying appropriate quality control and data validation. |