Training activity information

Details

Hold genomics data in dataframes and perform statistical analyses using both Python 3 (pandas) and R

Type

Developmental training activity (DTA)

Evidence requirements

Evidence the activity has been undertaken by the trainee.

Reflection on the activity at one or more time points after the event, including learning from the activity and/or areas of the trainee's practice for development.

An action plan to implement learning and/or to address skills or knowledge gaps identified.

Considerations

  • The use of series and dataframes for data analyses
  • Quality control
  • The appropriate statistical analysis to answer the question
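The three considerations above can be illustrated with a minimal pandas sketch. Everything in it is an assumption made for illustration: the sample IDs, the column names (mapped_fraction, gene_a_expr), the values, and the 0.8 quality threshold are invented and do not come from any real dataset or recommended standard. It shows one possible workflow only: hold per-sample data in a dataframe, pull a column out as a series, apply a simple quality-control filter, and compute a Welch's t statistic by hand to compare two groups.

```python
import pandas as pd

# Hypothetical per-sample data; all names and values are invented.
samples = pd.DataFrame(
    {
        "sample_id": ["S1", "S2", "S3", "S4", "S5", "S6", "S7", "S8"],
        "group": ["case", "case", "case", "case",
                  "control", "control", "control", "control"],
        "mapped_fraction": [0.95, 0.91, 0.42, 0.93, 0.96, 0.90, 0.94, 0.92],
        "gene_a_expr": [8.1, 7.9, 3.0, 8.4, 6.2, 6.0, 6.5, 5.9],
    }
).set_index("sample_id")

# A single column of a dataframe is a Series, indexed by sample_id here.
expr = samples["gene_a_expr"]

# Quality control: keep samples above an assumed mapping threshold.
MIN_MAPPED_FRACTION = 0.8  # illustrative cut-off, not a recommendation
qc_pass = samples[samples["mapped_fraction"] >= MIN_MAPPED_FRACTION]

# Descriptive statistics per group after QC.
summary = qc_pass.groupby("group")["gene_a_expr"].agg(["count", "mean", "var"])

# Welch's t statistic (unequal variances), computed from the summary table.
case = summary.loc["case"]
control = summary.loc["control"]
standard_error = (
    case["var"] / case["count"] + control["var"] / control["count"]
) ** 0.5
t_stat = (case["mean"] - control["mean"]) / standard_error

print(summary)
print(f"Welch t statistic: {t_stat:.2f}")
```

The sketch stops at the test statistic; a p-value would also need the degrees of freedom and a t distribution, which a statistics library such as SciPy (ttest_ind with equal_var=False) can provide. Whether a t-test is the appropriate analysis depends on the question and on the data, which is the point of the third consideration.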

Reflective practice guidance

The guidance below supports reflection at different time points by offering questions to help you reflect on this training activity. The questions are provided for guidance only and should not be treated as a mandatory checklist. Trainees are not expected to answer every question listed.

Before action

  • What understanding of genomics data and fundamental statistical concepts is required beforehand?
  • What specific statistical analyses do you expect to perform? What insights do you hope to gain regarding the application of Python and R for statistical analysis of genomics data? What are your current strengths and weaknesses in this area?
  • Will you review statistical concepts or practice using relevant functions in pandas and R? Have you discussed the types of statistical analyses with your training officer? What challenges might you encounter in applying statistical methods to genomics data? How confident do you feel about this activity?

In action

  • How are you structuring the genomics data within the dataframes? What statistical analyses are you attempting to perform and why? Which specific functions or methods are you using in Python/R?
  • Are the statistical analyses running without errors? Are the results what you expected based on your understanding of the data? Are you having to modify your code or analytical approach as you proceed?
  • If a particular statistical method is not working as expected, are you considering alternative methods? Are you checking the documentation for the statistical functions to ensure you are using them correctly?

On action

  • Describe the process of holding genomics data in dataframes using Python (pandas) and R. What statistical analyses did you perform?
  • What statistical analysis skills did you develop or improve through this activity? Did you learn new ways to structure or manage genomics data within dataframes? Were there any unexpected difficulties in performing the statistical analyses? What did you learn from these? How did your decisions during the analysis (‘reflect-in-action’) impact the results you obtained? How does performing statistical analyses on genomics data relate to post-programme practice?
  • What areas for continued development in applying statistical methods in Python and R have you identified? How can you apply the data holding and analysis techniques to future genomics projects? What are the next steps you will take to enhance your statistical analysis skills in a bioinformatics context? What resources or support would be beneficial for further developing your skills in this training activity?

Beyond action

  • Have you encountered different types of genomic data since this training activity? How did your understanding of dataframes and statistical analysis in pandas and R help you?
  • Can you identify specific instances where you have applied the statistical analysis skills learned in this training activity to real-world data? What were the outcomes?
  • How will your foundation in programmatic statistical analysis enable you to tackle more advanced statistical challenges in the future?

Relevant learning outcomes

Outcome 2

Perform programmatic data analysis.

Outcome 3

Apply statistical methods to derive meaningful conclusions from data to support clinical decision making.