Day 1

“Curiosity often leads to trouble.” — Alice in Wonderland (1951 / 2010)

Pre-reading

Please review the material below if you are unfamiliar with it:

For Loops in Bash

In today’s session, you will be using for loops in your Bash scripts. If you aren’t familiar with them yet, here’s a quick introduction.

Below is a simple example of a for loop:

for file in dir/*; do
    echo "$file"
done

Let’s walk through what this code is doing:

First line (for file in dir/*; do): This tells Bash: “For every entry in the directory dir, do the following actions.” As the loop runs, the variable file takes on the path of each matching entry (for example, dir/sample1.txt), one at a time. Note that dir/* matches subdirectories as well as regular files, and skips hidden entries whose names start with a dot.

Second line (echo "$file"): This is the action we want to perform for each file. In this example, we’re simply printing the file’s path to the screen. The double quotes keep names that contain spaces intact. In today’s session, the action will instead be processing files with Cell Ranger.

Third line (done): This marks the end of the loop body. Once the actions for one file are completed, Bash moves on to the next file and repeats the process until all files have been handled.
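
A slightly more defensive variant of the loop above, offered as a sketch rather than a required pattern. If a glob such as dir/* matches nothing, Bash by default leaves it as the literal string dir/*, so the loop body runs once on a name that does not exist. The example builds a throwaway directory (the directory and file names are made up for illustration), turns on nullglob to avoid that case, and skips anything that is not a regular file.

```bash
#!/usr/bin/env bash
# Hypothetical demo directory; in the workshop you would point at real data.
demo_dir=$(mktemp -d)
touch "$demo_dir/sampleA.fastq.gz" "$demo_dir/sampleB.fastq.gz"
mkdir "$demo_dir/subdir"

# With nullglob, a pattern that matches nothing expands to nothing,
# so the loop body is skipped instead of running once on the literal pattern.
shopt -s nullglob

list_files() {
    local dir=$1 entry
    for entry in "$dir"/*; do
        # Skip anything that is not a regular file (e.g. subdirectories).
        [[ -f $entry ]] || continue
        # Print only the file name, without the leading directory.
        basename "$entry"
    done
}

list_files "$demo_dir"
```

Here the loop prints the two file names and silently skips subdir. Called on an empty directory, list_files prints nothing.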

Data Location

/ourdisk/hpc/iicomicswshp/dont_archive/omics_workshop_2025/mouse_kidney_all_cells_raw_data

Session Objectives

Set up all required R packages locally for the remainder of the workshop
Understand sample preparation workflows for single-cell sequencing using different library kits
Use Cell Ranger for barcode processing and UMI counting to quantify gene expression

Session Materials

Summary

This session introduces participants to the primary steps of processing 10x Genomics single-cell RNA-seq data using the Cell Ranger software suite. We begin by outlining the major stages of single-cell analysis—from handling raw FASTQ files to generating a usable gene-by-cell count matrix—and discuss the computational requirements necessary to run Cell Ranger efficiently on the OSCER high-performance computing system.

Participants learn how to work with genome references (both prebuilt and custom), prepare sequencing data, and execute cellranger count to perform alignment, UMI processing, and quantification. They also learn where to find the essential output files. By the end, attendees understand how to run primary analysis reproducibly in an HPC environment and are prepared for downstream secondary and tertiary analysis in R on subsequent workshop days.
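
The for loop from the pre-reading combines naturally with cellranger count. The sketch below is not the workshop's official script. The placeholder paths, the core and memory values, and the layout (one subdirectory per sample, named after the sample prefix of its FASTQ files) are all assumptions for illustration. By default it only prints each command; set DRY_RUN=0 to actually run Cell Ranger. Recent Cell Ranger releases require an explicit --create-bam=true or --create-bam=false, while older releases may not accept that option, so adjust it to the installed version.

```bash
#!/usr/bin/env bash
# Sketch only: all paths and resource values below are placeholders.
fastq_root=${FASTQ_ROOT:-/path/to/fastqs}           # assumed layout: one subdirectory per sample
transcriptome=${TRANSCRIPTOME:-/path/to/reference}  # prebuilt or custom Cell Ranger reference
DRY_RUN=${DRY_RUN:-1}                               # 1 = print commands without running them

# Skip the loop entirely if no sample directories are found.
shopt -s nullglob

run_count() {
    local sample_dir=$1 sample
    # Assumption: the directory name equals the FASTQ sample prefix.
    sample=$(basename "$sample_dir")
    local cmd=(cellranger count
        --id="${sample}_count"
        --transcriptome="$transcriptome"
        --fastqs="$sample_dir"
        --sample="$sample"
        --create-bam=false
        --localcores=8
        --localmem=64)
    if [[ $DRY_RUN == 1 ]]; then
        printf '%s\n' "${cmd[*]}"
    else
        "${cmd[@]}"
    fi
}

# The trailing slash in the glob restricts matches to directories.
for sample_dir in "$fastq_root"/*/; do
    run_count "${sample_dir%/}"
done
```

Each run writes its results to a directory named after --id. The files most often needed afterwards are in its outs/ subdirectory: web_summary.html, metrics_summary.csv, and the filtered_feature_bc_matrix/ directory holding the gene-by-cell count matrix.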

Other helpful resources:

How single cell sequencing data analysis works