author Connor Baker <connor.baker@tweag.io> 2023-11-07 14:35:37 +0000
committer Connor Baker <connor.baker@tweag.io> 2023-12-07 16:45:56 +0000
commit bfaefd0873a91aaffaae4254da5734f2fb311f48
tree 31280e23c51c6646e4676c7efa28d9cba74c6205 /pkgs/development
parent 8e800cedaf24f5ad9717463b809b0beef7677000
cudaPackages: add docs
Diffstat (limited to 'pkgs/development')
-rw-r--r-- pkgs/development/cuda-modules/README.md 32
-rw-r--r-- pkgs/development/cuda-modules/modules/README.md 27
2 files changed, 59 insertions, 0 deletions
diff --git a/pkgs/development/cuda-modules/README.md b/pkgs/development/cuda-modules/README.md
new file mode 100644
index 0000000000000..f4844c46a2c2e
--- /dev/null
+++ b/pkgs/development/cuda-modules/README.md
@@ -0,0 +1,32 @@
+# cuda-modules
+
+> [!NOTE]
+> This document is meant to help CUDA maintainers understand the structure of the CUDA packages in Nixpkgs. It is not meant to be a user-facing document.
+> For a user-facing document, see [the CUDA section of the manual](../../../doc/languages-frameworks/cuda.section.md).
+
+The files in this directory are added (in some way) to the `cudaPackages` package set by [cuda-packages.nix](../../top-level/cuda-packages.nix).
+
+## Top-level files
+
+Top-level Nix files are included in the initial creation of the `cudaPackages` scope. These are typically required for the creation of the finalized `cudaPackages` scope:
+
+- `backend-stdenv.nix`: Standard environment for CUDA packages.
+- `flags.nix`: Flags set by, or consumed by, NVCC in order to build packages.
+- `gpus.nix`: A list of supported NVIDIA GPUs.
+- `nvcc-compatibilities.nix`: NVCC releases and the version range of GCC/Clang they support.
+
+## Top-level directories
+
+- `cuda`: CUDA redistributables! Provides an extension to the `cudaPackages` scope.
+- `cudatoolkit`: Monolithic CUDA Toolkit run-file installer. Provides an extension to the `cudaPackages` scope.
+- `cudnn`: NVIDIA cuDNN library.
+- `cutensor`: NVIDIA cuTENSOR library.
+- `generic-builders`:
+  - Contains a builder `manifest.nix` which operates on the `Manifest` type defined in `modules/generic/manifests`. Most packages are built using this builder.
+  - Contains a builder `multiplex.nix` which leverages the Manifest builder. In short, the Multiplex builder adds multiple versions of a single package to a single instance of the `cudaPackages` package set. It is used primarily for packages like `cudnn` and `cutensor`.
+- `modules`: Nixpkgs modules to check the shape and content of CUDA redistributable and feature manifests. These modules additionally use shims provided by some CUDA packages to allow them to re-use the `genericManifestBuilder`, even if they don't have manifest files of their own. `cudnn` and `tensorrt` are examples of packages which provide such shims. These modules are further described in the [Modules](./modules/README.md) documentation.
+- `nccl`: NVIDIA NCCL library.
+- `nccl-tests`: NVIDIA NCCL tests.
+- `saxpy`: Example CMake project that uses CUDA.
+- `setup-hooks`: Nixpkgs setup hooks for CUDA.
+- `tensorrt`: NVIDIA TensorRT library.
diff --git a/pkgs/development/cuda-modules/modules/README.md b/pkgs/development/cuda-modules/modules/README.md
new file mode 100644
index 0000000000000..31aa343bd9d51
--- /dev/null
+++ b/pkgs/development/cuda-modules/modules/README.md
@@ -0,0 +1,27 @@
+# Modules
+
+Modules as they are used in `modules` exist primarily to check the shape and content of CUDA redistributable and feature manifests. They are ultimately meant to reduce the repetitive nature of repackaging CUDA redistributables.
+
+Most redistributables are built the same way: a manifest lists which packages are available at a location, along with their versions and their hashes. To avoid creating a builder for each and every derivation, modules let us use a single `genericManifestBuilder` to build all redistributables.
+
+## `generic`
+
+The modules in `generic` are reusable components meant to check the shape and content of NVIDIA's CUDA redistributable manifests, our feature manifests (which are derived from NVIDIA's manifests), or hand-crafted Nix expressions describing available packages. They are used by the `genericManifestBuilder` to build CUDA redistributables.
+
+Generally, each package which relies on manifests or Nix release expressions will create an alias to the relevant generic module. For example, the [module for CUDNN](./cudnn/default.nix) aliases the generic module for release expressions, while the [module for CUDA redistributables](./cuda/default.nix) aliases the generic module for manifests.
+
+Alternatively, additional fields or values may need to be configured to account for the particulars of a package. For example, while the release expressions for [CUDNN](./cudnn/releases.nix) and [TensorRT](./tensorrt/releases.nix) are very close, they differ slightly in the fields they have. The [module for CUDNN](./cudnn/default.nix) is able to use the generic module for release expressions as-is, while the [module for TensorRT](./tensorrt/default.nix) must add additional fields to the generic module.
+
+### `manifests`
+
+The modules in `generic/manifests` define the structure of NVIDIA's CUDA redistributable manifests and our feature manifests.
+
+NVIDIA's redistributable manifests are retrieved from their web server, while the feature manifests are produced by [`cuda-redist-find-features`](https://github.com/connorbaker/cuda-redist-find-features).
+
+### `releases`
+
+The modules in `generic/releases` define the structure of our hand-crafted Nix expressions containing information necessary to download and repackage CUDA redistributables. These expressions are created when NVIDIA-provided manifests are unavailable or otherwise unusable. For example, though CUDNN has manifests, a bug in NVIDIA's CI/CD causes manifests for different versions of CUDA to use the same name, which leads to the manifests overwriting each other.
+
+### `types`
+
+The modules in `generic/types` define reusable types used in both `generic/manifests` and `generic/releases`.