py-rocket-geospatial-2



What is py-rocket-geospatial-2?


py-rocket-geospatial-2 is a Python–R geospatial Docker image for large-scale earth-science data analysis in JupyterHub environments.

It is designed for users working with large earth-observation datasets, especially cloud-native data, from organizations such as NOAA, NASA, and other public earth-science data providers. The image targets workflows common in cryoscience, oceanography, climate science, and remote sensing. It is optimized for:


py-rocket-geospatial-2 combines three ecosystems:

Python (Pangeo-style big-data stack)

R (Rocker-based geospatial environment)

Desktop tools for earth science

The image also includes Quarto, TeX Live, MyST, and JupyterBook for scientific publishing.


Runtime overview


Using

The image is designed for use in JupyterHubs. In your hub YAML, reference it as ghcr.io/nmfs-opensci/container-images/py-rocket-geospatial-2:latest, although best practice is to pin to a specific tag.

You can also run the image on any computer with Docker installed:

docker pull ghcr.io/nmfs-opensci/container-images/py-rocket-geospatial-2:latest
docker run -it --rm -p 8888:8888 ghcr.io/nmfs-opensci/container-images/py-rocket-geospatial-2:latest
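
To follow the pinning advice locally as well, the reference can be built from a tag instead of latest. This is a minimal sketch: the tag 2026.02.08 is borrowed from the derivative-image example in this README and is only an assumption about what is published, and run_image is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: run the image from a pinned tag rather than :latest.
# TAG is an assumption reused from the derivative-image example in this README;
# check the registry for the tags that are actually published.
IMAGE="ghcr.io/nmfs-opensci/container-images/py-rocket-geospatial-2"
TAG="2026.02.08"
IMAGE_REF="${IMAGE}:${TAG}"

# run_image REF (hypothetical helper): pull the given reference and start it
# on port 8888, using the same flags as the commands above.
run_image() {
  docker pull "$1" && docker run -it --rm -p 8888:8888 "$1"
}

echo "Pinned reference: ${IMAGE_REF}"
# To start the container: run_image "${IMAGE_REF}"
```

Keeping the tag in one variable means an upgrade is a one-line change, and the same reference can be reused in a hub configuration.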

Image structure

See the py-rocket-base documentation for base image design details.


Reproducibility and validation

This repository automatically maintains pinned and validated package lists:

Pinned versions are extracted directly from the built image and validated against the requested package lists to support reproducibility and debugging.
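
The validation idea can be illustrated with a small shell sketch. It is not the repository's validation script. The file formats are assumptions: one requested package name per line, and pinned entries that start with name= (as in the output of conda list --export). The function name missing_packages is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: list requested packages that do not appear in a pinned list.
# Assumed formats (not taken from the repository):
#   requested file: one package name per line, '#' starts a comment line
#   pinned file:    lines beginning with name=version
missing_packages() {
  local requested="$1" pinned="$2" name
  while IFS= read -r name || [ -n "${name}" ]; do
    # Skip blank lines and comment lines.
    case "${name}" in ''|'#'*) continue ;; esac
    # Exact match on the first '='-separated field of the pinned list.
    if ! awk -F= -v n="${name}" '$1 == n { found = 1 } END { exit !found }' "${pinned}"; then
      echo "${name}"
    fi
  done < "${requested}"
}
```

For example, `missing_packages requested.txt pinned.txt` prints one missing name per line and prints nothing when every requested package is pinned. R package lists would need their own parsing.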


CI/CD Workflow

Automated Build and Test Pipeline

The repository uses a streamlined CI/CD workflow that ensures quality before publishing Docker images:

Workflow: Build → Test → Push → Create Release PR (build, test, and push run in one job)

The main build-test-push job executes steps 1 to 4; step 5 runs as a separate job:

  1. Build - Docker image is built and tagged (stays in runner’s Docker cache)
  2. Test Python - Python notebook tests run against the built image
  3. Test Packages - Package validation ensures all specified packages are installed
  4. Push - Image is pushed to GHCR only if tests pass
  5. Create Release PR - A separate job creates a pull request with pinned package versions

Design: The Docker image (~7GB compressed) stays in the build runner’s local Docker cache, avoiding artifact transfer overhead. Only small artifacts (test results, validation reports) are uploaded with 7-day retention.
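
The gating behavior (push only if tests pass) can be sketched in shell. This is an illustration of the idea, not the repository's workflow file. The stage functions are hypothetical placeholders whose bodies you would replace with real commands.

```shell
#!/usr/bin/env bash
# Sketch of the gate: each stage runs only if the previous one succeeded,
# so a failing test stage prevents the push stage from running.
run_gated() {
  local stage
  for stage in "$@"; do
    echo "stage: ${stage}"
    if ! "${stage}"; then
      echo "stage failed: ${stage}; skipping remaining stages" >&2
      return 1
    fi
  done
}

# Hypothetical placeholder stages; replace the bodies with real commands
# (for example a docker build, a notebook test runner, a docker push).
build_image()   { echo "build (placeholder)"; }
test_python()   { echo "python notebook tests (placeholder)"; }
test_packages() { echo "package validation (placeholder)"; }
push_image()    { echo "push (placeholder)"; }

run_gated build_image test_python test_packages push_image
```

Because all stages share one shell session, the built image never has to leave the machine between build and push, which mirrors the single-job design described above.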

Manual Workflow Triggers

You can manually trigger the workflow with options:

Workflow Files

Automatic Triggers

The workflow automatically runs when changes are pushed to main affecting:


Customization and derivative images

To customize py-rocket-geospatial-2

If changes affect core platform behavior, please open an issue in py-rocket-base.

To create derivative images

  1. You can create a derivative image using py-rocket-geospatial-2 as the base. The following Dockerfile adds packages to the conda and R environments:
FROM ghcr.io/nmfs-opensci/container-images/py-rocket-geospatial-2:2026.02.08

USER root

COPY . /tmp/
# A failed install prints a message but does not stop the build.
RUN /pyrocket_scripts/install-conda-packages.sh /tmp/your-environment.yml || echo "install-conda-packages.sh failed" || true
RUN /pyrocket_scripts/install-r-packages.sh /tmp/install.R || echo "install-r-packages.sh failed" || true
RUN rm -rf /tmp/*

USER ${NB_USER}
WORKDIR ${HOME}
  2. You can use the https://github.com/nmfs-opensci/py-rocket-geospatial-2/Dockerfile as a template.

  3. Make your derivative image build automatically in GitHub from your repo:

    • Copy .github/actions/build-and-push/action.yml to the same location in your repo.
    • Copy .github/workflows/build-and-push.yml into your repo and edit the image-name.
    • Set up your repo so that packages can be published from it to your location.
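
One thing to watch with the example Dockerfile in step 1: because each install line ends with || echo "... failed" || true, the build succeeds even when a package install fails. A hedged way to catch this is to save the build log and search it for the failure messages. The sketch below assumes the log was captured to a file; check_build_log and FAIL_PATTERN are hypothetical names, and log formats can differ between Docker builders.

```shell
#!/usr/bin/env bash
# Sketch: detect install failures that the example Dockerfile tolerates.
# FAIL_PATTERN matches a log line that ends with one of the echo messages.
# The echoed RUN instruction itself does not match, because there the message
# is followed by a closing quote and `|| true`.
FAIL_PATTERN='install-(conda|r)-packages?\.sh failed$'

# check_build_log LOGFILE (hypothetical helper): return 1 if the log contains
# a failure message, otherwise report that none was found.
check_build_log() {
  local logfile="$1"
  if grep -Eq "${FAIL_PATTERN}" "${logfile}"; then
    echo "package install failed; see ${logfile}" >&2
    return 1
  fi
  echo "no install failures found in ${logfile}"
}
```

A possible usage, assuming a local image name of my-derivative: run `docker build --progress=plain -t my-derivative . 2>&1 | tee build.log`, then `check_build_log build.log`. Trailing whitespace or carriage returns in the log would prevent a match, so treat a clean result as a hint rather than proof.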

Provenance

This image was originally maintained under
https://github.com/nmfs-opensci/container-images

It now lives in its own dedicated repository as part of the NMFS OpenSci container ecosystem.