Commit 74bd026

Updates and renames containerization lesson from numbered to named
1 parent afd4e95 commit 74bd026

73 files changed, +103 -108 lines changed

Lesson-02.qmd renamed to Lesson-Contain.qmd

Lines changed: 39 additions & 36 deletions
@@ -2,7 +2,7 @@
22
title: "Biostat 823 - Containerization"
33
author: "Hilmar Lapp"
44
institute: "Duke University, Department of Biostatistics & Bioinformatics"
5-
date: "Sep 14, 2023"
5+
date: "Oct 10, 2024"
66
format:
77
revealjs:
88
slide-number: true
@@ -23,7 +23,7 @@ Reproducibility of computational research faces four major challenges^[Boettiger
2323

2424
## "Dependency Hell"
2525

26-
* Software dependencies have themselves dependencies recursively
26+
* Software dependencies have themselves dependencies, recursively
2727
* Dependencies can often be difficult to install (they may require compilation, manual "tweaks" due to local OS or other differences, etc.)
2828
* Required version may conflict with that required by other software, or may not work with the local OS version, making it impossible to install.
2929
- The likelihood of conflicts is particularly high on shared computing environments.
@@ -42,7 +42,7 @@ Reproducibility of computational research faces four major challenges^[Boettiger
4242
- This can happen anywhere in the dependency chain.
4343
* Dependencies can also become unmaintained or end-of-life
4444
- Can result in removal from package repositories.
45-
- Python 2.x example
45+
- Python 2.x example; CRAN package removal by policy
4646

4747
## Virtual Machine as solution?
4848

@@ -65,10 +65,10 @@ Reproducibility of computational research faces four major challenges^[Boettiger
6565
- 2007 (v1) and 2013--16 (v2): Linux control groups ([cgroups](https://en.wikipedia.org/wiki/Cgroups))
6666
- 2008: [Linux Containers (LXC)](https://en.wikipedia.org/wiki/LXC)
6767
- 2013: [Docker](https://www.docker.com)
68-
- 2015: [Singularity](https://en.wikipedia.org/wiki/Singularity_(software))
68+
- 2015: [Singularity](https://en.wikipedia.org/wiki/Singularity_(software)) and ([since 2021](https://apptainer.org/news/community-announcement-20211130/)) [Apptainer](https://apptainer.org)
6969

7070
::: aside
71-
There are many [OS-level virtualization](https://en.wikipedia.org/wiki/OS-level_virtualization) systems. LXC, Docker, and Singularity are by far the most important ones.
71+
There are many [OS-level virtualization](https://en.wikipedia.org/wiki/OS-level_virtualization) systems. LXC and, especially, *Docker* and *Apptainer/Singularity* are by far the most important ones.
7272
:::
7373

7474
## Properties of containerized processes {.smaller}
@@ -83,8 +83,8 @@ There are many [OS-level virtualization](https://en.wikipedia.org/wiki/OS-level_
8383
![](images/docker-for-mac.png){fig-align="right" width="30%" style="float: right"}
8484
<br/><br/>
8585
* On Windows and macOS, requires a Linux VM
86-
- Part of the Docker installation (uses WSL on Windows; LinuxKit / Hypervisor Framework on macOS)
87-
- Unsupported by Singularity
86+
- Part of the Docker installation (uses [WSL/WSL2](https://learn.microsoft.com/en-us/windows/wsl/about) on Windows; LinuxKit / Hypervisor Framework on macOS)
87+
- Apptainer can [use WSL/WSL2 on Windows](https://apptainer.org/docs/admin/main/installation.html#windows), with access to GPUs; [on macOS](https://apptainer.org/docs/admin/main/installation.html#mac), requires [Lima](https://lima-vm.io) as VM host (no GPU)
8888

8989
::: aside
9090
Figure modified from [Gianluca Quercini, Cloud computing -- Docker Primer](https://gquercini.github.io/courses/cloud-computing/references/docker-primer/)
@@ -98,18 +98,19 @@ Figure modified from [Gianluca Quercini, Cloud computing -- Docker Primer](https
9898
From [ELIXIR containers nextflow: Docker](https://biocorecrg.github.io/ELIXIR_containers_nextflow/docker.html)
9999
:::
100100

101-
## Singularity: Containers for HPC
101+
## Apptainer / Singularity: Containers for HPC {.smaller}
102102

103103
* HPC systems are shared computing environments
104104
- Docker daemon runs as root, processes within container can run as root
105105
- Not permissible on a shared computing environment
106-
* Singularity does not require elevated privileges
106+
* Apptainer does not require elevated privileges
107107
- Launcher run by user, not a daemon run by root
108108
- Processes inside container run as same user as outside
109-
* Singularity containers can be built (bootstrapped) from (many) Docker container images
109+
* Apptainer containers can be built (bootstrapped) from (many) Docker container images
110110
- Most scientific software containers are compatible
111+
- Issues can occur for containers that run services under a privileged user (httpd, database server, etc)
111112

112-
## Singularity architecture vs Docker
113+
## Apptainer / Singularity vs Docker
113114

114115
![](images/singularity_architecture.png)
115116

@@ -162,15 +163,6 @@ CMD ["java", "-jar", "picard.jar"]
162163
- [GitHub Packages](https://ghcr.io) Repository (includes container images)
163164
- Gitlab container registry (gitlab-registry.oit.duke.edu for Duke OIT's Gitlab installation)
164165

165-
## (Note) Container images are layered
166-
167-
* Container file system is a [union mount](https://en.wikipedia.org/wiki/Union_mount)
168-
- [OverlayFS](https://en.wikipedia.org/wiki/OverlayFS) supported by Linux kernel since 2014
169-
- Allows layering image content
170-
- Each command in the definition creates a layer
171-
- Layers are cached for image builds and pulls
172-
* [Best practices for container definition](https://docs.docker.com/develop/develop-images/dockerfile_best-practices/) include controlling layer cache invalidation
173-
174166
## (Note) Multi-stage builds
175167

176168
* Build layers are read-only
@@ -179,39 +171,50 @@ CMD ["java", "-jar", "picard.jar"]
179171
- Multiple container builds in one container definition
180172
- Use to retain build products but not the software environment needed to create them (which can be large)
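
A minimal sketch of such a multi-stage build, written as a shell script that generates the files. The image tags (`gcc:13`, `debian:bookworm-slim`), the stage name `build`, and the `hello.c` program are illustrative assumptions, not part of the lesson. The `docker build` step is left as a comment because it needs a Docker installation.

```shell
# Scratch directory for a hypothetical example project.
workdir="$(mktemp -d)"

# A trivial C program standing in for real software (assumption).
cat > "$workdir/hello.c" <<'EOF'
#include <stdio.h>
int main(void) { puts("hello"); return 0; }
EOF

# Stage 1 ("build") carries the compiler; stage 2 copies only the binary.
cat > "$workdir/Dockerfile" <<'EOF'
FROM gcc:13 AS build
WORKDIR /src
COPY hello.c .
RUN gcc -static -o hello hello.c

FROM debian:bookworm-slim
COPY --from=build /src/hello /usr/local/bin/hello
CMD ["hello"]
EOF

# Count the build stages defined in the Dockerfile.
stages="$(grep -c '^FROM ' "$workdir/Dockerfile")"
echo "stages: $stages"

# To build and run (requires Docker):
#   docker build -t hello-multistage "$workdir"
#   docker run --rm hello-multistage
```

Only the final stage ends up in the resulting image, so the compiler toolchain from the first stage is left behind.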
181173

182-
## (Note) Build docker, run singularity
174+
## (Note) Build docker, run apptainer
183175

184-
* Building Docker images typically more flexible
185-
- No Singularity Desktop version for Windows or macOS (requires Linux VM instead)
186-
- `singularity build` normally requires `sudo` privileges
187-
* Singularity can use (most) Docker images directly
176+
* Building Docker container images typically more flexible
177+
- No Apptainer Desktop version for Windows or macOS (requires Linux VM instead)
178+
- Container build instructions may cause problems with `apptainer build` in unprivileged environment (which uses `--fakeroot` by default)
179+
* Apptainer can use (most) Docker images directly
188180
- Can download and run in one step:
189181
```shell
190-
$ singularity run docker://<docker_url> <cmd>
182+
$ apptainer run docker://<docker_url> <cmd>
191183
```
192-
* Use `--fakeroot` for `singularity build` in a non-privileged environment
193184

194-
## (Note) Mounting data into the container
185+
## (Note) Mounting data into container {.smaller}
195186

196187
Requires bind mount at container runtime (`docker run`):
197188

198-
* `--volume <local-path>:<container-path>` (Docker)
199-
* `--bind <local-path>:<container-path>` (Singularity)
189+
* [Docker](https://docs.docker.com/engine/storage/bind-mounts/):
190+
191+
`--volume <local-path>:<container-path>`
192+
193+
or
194+
195+
`--mount type=bind,source=<local-path>,target=<container-path>`
196+
197+
Using `--mount` generates an error if the source directory (or file) doesn't exist on the host, whereas `--volume` creates it as a directory
198+
199+
* [Apptainer](https://apptainer.org/docs/user/main/bind_paths_and_mounts.html#user-defined-bind-paths):
200+
201+
`--bind <local-path>:<container-path>`
202+
203+
Or use `--mount` (see above).
200204
* Can be used for directories and files
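
The three forms above can be compared side by side in a dry run. This is a sketch only: the host path, container path, and image name are hypothetical placeholders, and the commands are printed rather than executed because Docker or Apptainer may not be installed.

```shell
# Hypothetical paths and image name; replace with your own.
host_dir="$PWD/data"
container_dir="/data"
image="ubuntu:22.04"

# Docker, short form: --volume <local-path>:<container-path>
docker_volume_cmd="docker run --rm --volume ${host_dir}:${container_dir} ${image} ls ${container_dir}"

# Docker, long form: --mount type=bind,source=...,target=...
docker_mount_cmd="docker run --rm --mount type=bind,source=${host_dir},target=${container_dir} ${image} ls ${container_dir}"

# Apptainer: --bind <local-path>:<container-path>, pulling the Docker image directly
apptainer_bind_cmd="apptainer exec --bind ${host_dir}:${container_dir} docker://${image} ls ${container_dir}"

# Dry run: print the commands instead of running them.
echo "$docker_volume_cmd"
echo "$docker_mount_cmd"
echo "$apptainer_bind_cmd"
```

Copying one of the printed lines into a terminal with the respective tool installed would list the contents of the mounted directory from inside the container.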
201-
* Using `--mount` generates an error if target directory (or file) doesn't exist
202205
203206
## Resources (I)
204207
205208
* [Dockerfile reference](https://docs.docker.com/engine/reference/builder/)
206209
* [Docker command line reference](https://docs.docker.com/engine/reference/commandline/cli/)
207-
* [Singularity file reference](https://docs.sylabs.io/guides/latest/user-guide/definition_files.html)
208-
* [Singularity command line reference](https://docs.sylabs.io/guides/latest/user-guide/cli.html)
209-
* [Open Containers Initiative (OCI) standard for annotations](https://github.com/opencontainers/image-spec/blob/main/annotations.md)
210+
* [Apptainer file reference](https://apptainer.org/docs/user/main/definition_files.html)
211+
* [Apptainer command line reference](https://apptainer.org/docs/user/main/cli.html)
212+
* [Open Containers Initiative (OCI) standard for annotations](https://specs.opencontainers.org/image-spec/annotations/)
210213
211214
## Resources (II)
212215
213-
* [Introduction to Docker](https://carpentries-incubator.github.io/docker-introduction/) (Carpentries Incubator lesson)
216+
* [Intro to Docker Workshop](https://imageomics.github.io/docker-workshop/) (Based on Carpentries Incubator lesson)
217+
* [Intro to Singularity Workshop](https://carpentries-incubator.github.io/singularity-introduction/) (Carpentries Incubator lesson)
214218
* [DCC OnDemand](https://dcc-ondemand-01.oit.duke.edu/)
215219
* [Jupyter Docker Stacks](https://jupyter-docker-stacks.readthedocs.io/)
216220
- Customized [Biostat Jupyter Docker container](https://github.com/Duke-GCB/biostat-jupyter)
217-
* [Biostat-823 "everything" GPU container](https://gitlab.oit.duke.edu/owzar001/bios-823-container-gpu/-/blob/main/README.md) (Singularity)
