
RADCamp NYC 2023 Part II (Bioinformatics)

Day 1 (PM)

Overview of the afternoon activities:

Intro to Code Ocean

Lead: Sandra (45’)

Code Ocean: The Digital Lab for Computational Scientists

Look at the real data we generated

Lead: Isaac (30’)

Now we will move to the real data

Empirical Data QC

Lead: Isaac (45’)

Form groups for working with the empirical data

Groups will be organized around the 10 sets of samples that obtained sufficient sequencing (>3 million reads total). Each group will have a lead, normally the person the samples belong to, and the group members will work together to run assemblies today and analyse the data tomorrow. The following file indicates the group membership:

RADCamp groups for assembling and analysing the real data
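For groups curious how a read total like the 3 million threshold above could be checked, here is a minimal sketch. It assumes standard four-line FASTQ records (no wrapped sequence lines) and that files are either plain text or gzipped. The function names and the `MIN_READS` constant are hypothetical and are not part of any RADCamp tooling.

```python
import gzip

# Threshold taken from the text above (>3 million reads total).
MIN_READS = 3_000_000


def count_fastq_reads(path):
    """Count reads in one FASTQ file, assuming 4 lines per record."""
    # Pick the opener from the file extension: gzip for .gz, plain otherwise.
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


def total_reads(paths):
    """Sum the read counts over all files belonging to one sample set."""
    return sum(count_fastq_reads(p) for p in paths)


def has_sufficient_sequencing(paths, min_reads=MIN_READS):
    """True if the sample set exceeds the minimum read total."""
    return total_reads(paths) > min_reads
```

Counting lines and dividing by four is the simplest approach. It reads every file once, so it can take a while on full-size raw data.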

Attaching ‘Data Assets’ in Code Ocean

Before we can start QC on the data, we first need to attach the ‘Data Assets’ that contain the raw data files. Data assets work like physical disk drives that you can plug in and unplug, but they are much easier to work with.

NB: The ‘_DemultiplexedData’ asset contains the already-demultiplexed sample files for all groups. Running fastqc on the entire raw data files is slow, so we have cheated a bit and demultiplexed all the data in advance. This lets us run fastqc on the per-sample files, which is faster and easier.
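As a rough illustration of running fastqc on per-sample files rather than the full raw data, the sketch below collects the FASTQ files under an attached asset and builds a single fastqc command for them. The helper names, the `*.fastq.gz` file pattern, and any directory you pass in are assumptions; check where the asset is actually mounted in your capsule. Only the `--outdir` and `--threads` options of fastqc are used.

```python
from pathlib import Path


def find_sample_files(asset_dir, pattern="*.fastq.gz"):
    """Return a sorted list of per-sample FASTQ files under asset_dir.

    The default pattern is an assumption about how the files are named.
    """
    return sorted(str(p) for p in Path(asset_dir).rglob(pattern))


def build_fastqc_command(fastq_files, outdir, threads=2):
    """Build the fastqc command line as a list of arguments."""
    if not fastq_files:
        raise ValueError("no FASTQ files found to run fastqc on")
    return [
        "fastqc",
        "--outdir", str(outdir),    # where the HTML/zip reports are written
        "--threads", str(threads),  # number of files processed in parallel
        *fastq_files,
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. fastqc expects the output directory to exist already, so create it first (for example with `Path(outdir).mkdir(parents=True, exist_ok=True)`).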

Now the groups will work together on the data QC for their specific dataset.

3RAD Data Quality Control (fastqc)

Coffee break (3:30-3:45)

Briefly report back on fastqc results

Empirical assembly