BMMB554 | Biological data analysis

[Image: abstract artistic visualization of biological data analysis. Image by Bob Harris]

Goals of this class

Genomic data analysis is a pyramid. Consider an RNA-seq experiment: you receive reads from your sequencing machine, map them, and count how many fall within each gene. This primary analysis uses well-established tools and workflow systems, and often requires significant computational infrastructure. Its outputs (count tables for RNA-seq, variant lists for variant calling, or peak coordinates for ChIP-seq) are intermediate datasets. They don’t yet contain biological insights.
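To make the "count how many fall within each gene" step concrete, here is a minimal sketch in plain Python. The gene coordinates, the read alignments, and the function name `count_reads_per_gene` are all made up for illustration, and the rule used here (a read counts only if it lies entirely inside one gene) is a simplifying assumption. Real primary analysis relies on dedicated, well-tested tools rather than code like this.

```python
# Hypothetical gene annotation: gene name -> (chromosome, start, end).
# Coordinates are assumed to be 0-based and half-open.
GENES = {
    "geneA": ("chr1", 100, 200),
    "geneB": ("chr1", 300, 450),
    "geneC": ("chr2", 50, 120),
}

# Hypothetical mapped reads: (chromosome, start, end) of each alignment.
READS = [
    ("chr1", 110, 160),
    ("chr1", 150, 200),
    ("chr1", 310, 360),
    ("chr2", 60, 110),
    ("chr1", 220, 270),  # lies between genes, so it is not counted
]


def count_reads_per_gene(reads, genes):
    """Count the reads that lie entirely within each gene's interval."""
    counts = {name: 0 for name in genes}
    for chrom, start, end in reads:
        for name, (g_chrom, g_start, g_end) in genes.items():
            if chrom == g_chrom and g_start <= start and end <= g_end:
                counts[name] += 1
    return counts


if __name__ == "__main__":
    # Print a tiny count table: one gene per line, tab-separated.
    for gene, n in sorted(count_reads_per_gene(READS, GENES).items()):
        print(f"{gene}\t{n}")
```

The resulting gene-to-count mapping is a toy version of the count table described above: an intermediate dataset that is far smaller than the reads it summarizes, but that does not yet answer a biological question.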

The secondary analysis extracts meaning from these intermediate data. It’s typically done on a case-by-case basis using frameworks like Jupyter, RStudio, or Observable, since the approach depends on your experimental design. This is where programming, data management, and version control skills become critical.

Data analysis as a pyramid. Primary data (sequencing reads) are processed into smaller intermediate datasets, which are then analyzed to produce final insights: the content of your theses, publications, and blog posts. As you climb the pyramid, both standardization (the availability of standard tools and procedures) and computational demands decrease.

In this course we will learn key skills for climbing this pyramid. The course is divided into the following sections:

  1. Foundations | Tools - UNIX shell, Python for data manipulation, Python for visualization, version control with Git/GitHub, and responsible use of agentic AI tools.
  2. Foundations | Computational approaches - algorithms for alignment, mapping, assembly, and composition analysis.
  3. Application | Datatypes and analysis - hands-on primary and secondary analyses in genome assembly, transcriptomics, variation, epigenetics, and community analysis.
  4. Application | Class projects - the class will be divided into groups, and each group will be given a research project.

Lectures

Grading

Component                    Weight
Thursday Quizzes             10%
Section Exams (3 × 16.7%)    50%
Final Project                40%

Thursday Quizzes — Short weekly quizzes covering material from the previous week’s lectures. They are designed to ensure that you read the material assigned as homework.

Section Exams — Three take-home exams, one after each major section: (1) Tools, (2) Computational approaches, (3) Datatypes and analysis.

Final Project — Group research project applying course concepts to a real biological dataset.
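As a rough illustration of how the weights above combine, the following sketch computes an overall score from the three components. It assumes every component is scored on a 0 to 100 scale and that the three section exams count equally (the 3 × 16.7% split); the function name `course_grade` and the example scores are hypothetical, and the sketch is not the official grade calculation.

```python
# Component weights taken from the grading table; they sum to 1.0.
WEIGHTS = {"quizzes": 0.10, "section_exams": 0.50, "final_project": 0.40}


def course_grade(quiz_average, exam_scores, project_score):
    """Combine component scores (assumed 0-100) into a weighted total."""
    if len(exam_scores) != 3:
        raise ValueError("expected exactly three section exam scores")
    # Assumption: the three section exams are weighted equally.
    exam_average = sum(exam_scores) / len(exam_scores)
    return (
        WEIGHTS["quizzes"] * quiz_average
        + WEIGHTS["section_exams"] * exam_average
        + WEIGHTS["final_project"] * project_score
    )


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    total = course_grade(quiz_average=90, exam_scores=[80, 85, 90], project_score=95)
    print(round(total, 1))
```

With these made-up scores, the section exams contribute half of the total, so the exam average moves the result more than either of the other two components.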

Academic integrity

Academic integrity is the pursuit of scholarly activity in an open, honest, and responsible manner. All students should act with personal integrity, respect other students’ dignity, rights, and property, and help create and maintain an environment in which all can succeed through the fruits of their efforts (see Penn State Policy G-9).

Dishonest behavior will not be tolerated. Students facing allegations of academic misconduct may not drop/withdraw from the affected course unless they are cleared of wrongdoing.

Use of AI Tools: Students may use generative AI tools (such as ChatGPT, Claude, GitHub Copilot) for learning and exploring concepts. However, all submitted work must represent your own understanding. If you use AI assistance, you must be able to explain your code and reasoning.

Educational Equity

Penn State takes great pride in fostering a diverse and inclusive environment for students, faculty, and staff. Discrimination or harassment against any person because of age, ancestry, color, disability, gender identity, national origin, race, religious belief, sex, sexual orientation, or veteran status is not tolerated.

Report incidents of bias at https://equity.psu.edu/report-bias.

Disability Accommodations

Penn State welcomes students with disabilities into the University’s educational programs. If you have a disability-related need for reasonable academic adjustments, contact Student Disability Resources at your campus.

Counseling and Mental Health

Students facing personal or academic stressors may benefit from counseling or other support. Contact Counseling and Psychological Services (CAPS) or the Penn State Crisis Line at 877-229-6400 (24/7).


This page complies with Penn State accessibility requirements.