TGSAI · tasansal · Apr 29, 2025
diff --git a/docs/conf.py b/docs/conf.py
@@ -17,6 +17,7 @@
     "sphinx.ext.napoleon",
     "sphinx.ext.intersphinx",
     "sphinx.ext.autosummary",
+    "sphinxcontrib.autodoc_pydantic",
     "sphinx.ext.autosectionlabel",
     "sphinx_click",
     "sphinx_copybutton",
@@ -38,6 +39,7 @@
 intersphinx_mapping = {
     "python": ("https://docs.python.org/3", None),
     "numpy": ("https://numpy.org/doc/stable/", None),
+    "pydantic": ("https://docs.pydantic.dev/latest/", None),
     "zarr": ("https://zarr.readthedocs.io/en/stable/", None),
 }
 
@@ -50,6 +52,14 @@
 autoclass_content = "class"
 autosectionlabel_prefix_document = True
 
+autodoc_pydantic_field_list_validators = False
+autodoc_pydantic_field_swap_name_and_alias = True
+autodoc_pydantic_field_show_alias = False
+autodoc_pydantic_model_show_config_summary = False
+autodoc_pydantic_model_show_validator_summary = False
+autodoc_pydantic_model_show_validator_members = False
+autodoc_pydantic_model_show_field_summary = False
+
 html_theme = "furo"
 
 myst_number_code_blocks = ["python"]

diff --git a/docs/data_models/chunk_grids.md b/docs/data_models/chunk_grids.md
@@ -0,0 +1,154 @@
+```{eval-rst}
+:tocdepth: 3
+```
+
+```{currentModule} mdio.schemas.chunk_grid
+
+```
+
+# Chunk Grid Models
+
+```{article-info}
+:author: Altay Sansal
+:date: "{sub-ref}`today`"
+:read-time: "{sub-ref}`wordcount-minutes` min read"
+:class-container: sd-p-0 sd-outline-muted sd-rounded-3 sd-font-weight-light
+```
+
+Variables in the MDIO data model can use different types of chunk grids.
+Chunk grids determine how multi-dimensional arrays are split into chunks, which is essential for managing them efficiently.
+This page covers the four chunk grid models in the MDIO schema: a grid model and a
+chunk shape model for each of the regular and rectilinear layouts.
+
+MDIO implements data models following the guidelines of the Zarr v3 spec and ZEPs:
+
+- [Zarr core specification (version 3)](https://zarr-specs.readthedocs.io/en/latest/v3/core/v3.0.html)
+- [ZEP 1 — Zarr specification version 3](https://zarr.dev/zeps/accepted/ZEP0001.html)
+- [ZEP 3 — Variable chunking](https://zarr.dev/zeps/draft/ZEP0003.html)
+
+## Regular Grid
+
+The regular grid models are designed to represent a rectangular and regularly
+spaced chunk grid.
+
+```{eval-rst}
+.. autosummary::
+   RegularChunkGrid
+   RegularChunkShape
+```
+
+For a 1D array with `size = 31`{l=python}, a chunk size of 7 divides it into 5
+chunks. The first four hold 7 elements each; the last is truncated to 3 to match the array size.
+
+`{ "name": "regular", "configuration": { "chunkShape": [7] } }`{l=json}
+
+Using the above schema, the resulting array chunks will look like this:
+
+```bash
+ ←─ 7 ─→ ←─ 7 ─→ ←─ 7 ─→ ←─ 7 ─→  ↔ 3
+┌───────┬───────┬───────┬───────┬───┐
+└───────┴───────┴───────┴───────┴───┘
+```
+
+For a 2D array with shape `rows, cols = (7, 17)`{l=python}, a chunk shape of `(3, 7)`{l=python}
+divides it into 9 chunks, with the chunks in the last row and last column truncated.
+
+`{ "name": "regular", "configuration": { "chunkShape": [3, 7] } }`{l=json}
+
+Using the above schema, the resulting 2D array chunks will look like below.
+Note that the rows and columns are conceptual and visually not to scale.
+
+```bash
+ ←─ 7 ─→ ←─ 7 ─→  ↔ 3
+┌───────┬───────┬───┐
+│       ╎       ╎   │  ↑
+│       ╎       ╎   │  3
+│       ╎       ╎   │  ↓
+├╶╶╶╶╶╶╶┼╶╶╶╶╶╶╶┼╶╶╶┤
+│       ╎       ╎   │  ↑
+│       ╎       ╎   │  3
+│       ╎       ╎   │  ↓
+├╶╶╶╶╶╶╶┼╶╶╶╶╶╶╶┼╶╶╶┤
+│       ╎       ╎   │  ↕ 1
+└───────┴───────┴───┘
+```
+
+## Rectilinear Grid
+
+The [RectilinearChunkGrid](RectilinearChunkGrid) model extends
+the concept of chunk grids to accommodate rectangular and irregularly spaced chunks.
+This model is useful in data structures where non-uniform chunk sizes are necessary.
+[RectilinearChunkShape](RectilinearChunkShape) specifies the chunk sizes for each
+dimension as a list, allowing for irregular intervals.
+
+```{eval-rst}
+.. autosummary::
+   RectilinearChunkGrid
+   RectilinearChunkShape
+```
+
+:::{note}
+The chunk sizes specified for each dimension in `chunkShape` must sum
+to the size of the respective array dimension.
+:::
+
+For a 1D array with `size = 39`{l=python}, we can divide it into 5 irregularly sized
+chunks.
+
+`{ "name": "rectilinear", "configuration": { "chunkShape": [[10, 7, 5, 7, 10]] } }`{l=json}
+
+Using the above schema, the resulting array chunks will look like this:
+
+```bash
+ ←── 10 ──→ ←─ 7 ─→ ← 5 → ←─ 7 ─→ ←── 10 ──→
+┌──────────┬───────┬─────┬───────┬──────────┐
+└──────────┴───────┴─────┴───────┴──────────┘
+```
+
+For a 2D array with shape `rows, cols = (7, 25)`{l=python}, we can divide it into 12
+rectilinear (rectangular but irregular) chunks. Note that the rows and columns are
+conceptual and visually not to scale.
+
+`{ "name": "rectilinear", "configuration": { "chunkShape": [[3, 1, 3], [10, 5, 7, 3]] } }`{l=json}
+
+```bash
+ ←── 10 ──→ ← 5 → ←─ 7 ─→  ↔ 3
+┌──────────┬─────┬───────┬───┐
+│          ╎     ╎       ╎   │  ↑
+│          ╎     ╎       ╎   │  3
+│          ╎     ╎       ╎   │  ↓
+├╶╶╶╶╶╶╶╶╶╶┼╶╶╶╶╶┼╶╶╶╶╶╶╶┼╶╶╶┤
+│          ╎     ╎       ╎   │  ↕ 1
+├╶╶╶╶╶╶╶╶╶╶┼╶╶╶╶╶┼╶╶╶╶╶╶╶┼╶╶╶┤
+│          ╎     ╎       ╎   │  ↑
+│          ╎     ╎       ╎   │  3
+│          ╎     ╎       ╎   │  ↓
+└──────────┴─────┴───────┴───┘
+```
+
+## Model Reference
+
+:::{dropdown} RegularChunkGrid
+:animate: fade-in-slide-down
+
+```{eval-rst}
+.. autopydantic_model:: RegularChunkGrid
+
+----------
+
+.. autopydantic_model:: RegularChunkShape
+```
+
+:::
+:::{dropdown} RectilinearChunkGrid
+:animate: fade-in-slide-down
+
+```{eval-rst}
+.. autopydantic_model:: RectilinearChunkGrid
+
+----------
+
+.. autopydantic_model:: RectilinearChunkShape
+```
+
+:::
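Note on docs/data_models/chunk_grids.md: the chunk counts and sizes quoted in the examples above can be checked with a few lines of arithmetic. The sketch below is a minimal illustration and is not part of the MDIO API; the helper names `regular_chunk_sizes`, `check_rectilinear`, and `count_chunks` are hypothetical. It assumes that a regular grid truncates the last chunk along each dimension and that a rectilinear grid requires each dimension's chunk sizes to sum to the dimension size, as the page describes.

```python
from math import ceil, prod


def regular_chunk_sizes(dim_size: int, chunk_size: int) -> list[int]:
    """Chunk sizes along one dimension of a regular grid (last chunk may be truncated)."""
    n_chunks = ceil(dim_size / chunk_size)
    return [min(chunk_size, dim_size - i * chunk_size) for i in range(n_chunks)]


def check_rectilinear(shape: tuple[int, ...], chunk_shape: list[list[int]]) -> None:
    """Raise ValueError unless each dimension's chunk sizes sum to the dimension size."""
    if len(shape) != len(chunk_shape):
        raise ValueError("chunk_shape needs one list of sizes per array dimension")
    for axis, (dim_size, sizes) in enumerate(zip(shape, chunk_shape)):
        if sum(sizes) != dim_size:
            raise ValueError(
                f"axis {axis}: chunk sizes sum to {sum(sizes)}, expected {dim_size}"
            )


def count_chunks(per_dim_sizes: list[list[int]]) -> int:
    """Total number of chunks is the product of the chunk counts per dimension."""
    return prod(len(sizes) for sizes in per_dim_sizes)


# Regular grid, 1D: size 31 with chunkShape [7]
print(regular_chunk_sizes(31, 7))

# Regular grid, 2D: shape (7, 17) with chunkShape [3, 7]
regular_2d = [regular_chunk_sizes(n, c) for n, c in zip((7, 17), (3, 7))]
print(regular_2d, count_chunks(regular_2d))

# Rectilinear grid, 2D: shape (7, 25) with chunkShape [[3, 1, 3], [10, 5, 7, 3]]
rectilinear_2d = [[3, 1, 3], [10, 5, 7, 3]]
check_rectilinear((7, 25), rectilinear_2d)
print(count_chunks(rectilinear_2d))
```

Keeping the per-dimension sizes as plain lists means the same `count_chunks` helper works for both grid types, which mirrors how the rectilinear `chunkShape` is written in the JSON examples.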
diff --git a/docs/data_models/compressors.md b/docs/data_models/compressors.md
@@ -0,0 +1,100 @@
+```{eval-rst}
+:tocdepth: 3
+```
+
+```{currentModule} mdio.schemas.compressors
+
+```
+
+# Compressors
+
+```{article-info}
+:author: Altay Sansal
+:date: "{sub-ref}`today`"
+:read-time: "{sub-ref}`wordcount-minutes` min read"
+:class-container: sd-p-0 sd-outline-muted sd-rounded-3 sd-font-weight-light
+```
+
+## Dataset Compression
+
+MDIO relies on [numcodecs] for data compression. We provide default settings for each
+compressor, based on opinionated and limited heuristics across various energy datasets.
+These data models can be used to customize the compression.
+
+[Numcodecs] is a project that provides a convenient interface to different compression
+libraries. We selected the [Blosc] and [ZFP] compressors for lossless and lossy
+compression of energy data.
+
+## Blosc
+
+Blosc is a high-performance compressor optimized for binary data. It combines fast compression
+with a byte-shuffle filter for enhanced efficiency and is particularly effective with
+numerical arrays in multi-threaded environments.
+
+For more details about compression modes, see [Blosc Documentation].
+
+```{eval-rst}
+.. autosummary::
+   Blosc
+```
+
+## ZFP
+
+ZFP is a compression algorithm tailored for floating-point and integer arrays. It offers
+lossy and lossless compression with customizable precision and is well suited to large
+scientific datasets where data fidelity must be balanced against compression ratio.
+
+For more details about compression modes, see [ZFP Documentation].
+
+```{eval-rst}
+.. autosummary::
+   ZFP
+```
+
+[numcodecs]: https://github.com/zarr-developers/numcodecs
+[blosc]: https://github.com/Blosc/c-blosc
+[blosc documentation]: https://www.blosc.org/python-blosc/python-blosc.html
+[zfp]: https://github.com/LLNL/zfp
+[zfp documentation]: https://computing.llnl.gov/projects/zfp
+
+## Model Reference
+
+
+:::{dropdown} Blosc
+:animate: fade-in-slide-down
+
+```{eval-rst}
+.. autopydantic_model:: Blosc
+
+----------
+
+.. autoclass:: BloscAlgorithm()
+    :members:
+    :undoc-members:
+    :member-order: bysource
+
+----------
+
+.. autoclass:: BloscShuffle()
+    :members:
+    :undoc-members:
+    :member-order: bysource
+```
+
+:::
+
+:::{dropdown} ZFP
+:animate: fade-in-slide-down
+
+```{eval-rst}
+.. autopydantic_model:: ZFP
+
+----------
+
+.. autoclass:: ZFPMode()
+    :members:
+    :undoc-members:
+    :member-order: bysource
+```
+
+:::
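Note on docs/data_models/compressors.md: the Blosc description mentions a byte-shuffle filter. The sketch below illustrates the idea only, under stated assumptions: it uses `zlib` from the Python standard library as a stand-in codec (not Blosc or numcodecs), the helpers `byte_shuffle` and `byte_unshuffle` are hypothetical, and the input is a synthetic float32 ramp rather than real energy data. It shows why regrouping the bytes of fixed-width numbers before compression can reduce the compressed size.

```python
import struct
import zlib


def byte_shuffle(raw: bytes, itemsize: int) -> bytes:
    """Regroup bytes: byte 0 of every item first, then byte 1 of every item, and so on."""
    return b"".join(raw[k::itemsize] for k in range(itemsize))


def byte_unshuffle(shuffled: bytes, itemsize: int) -> bytes:
    """Invert byte_shuffle, restoring the original item-by-item byte order."""
    n_items = len(shuffled) // itemsize
    out = bytearray(len(shuffled))
    for k in range(itemsize):
        out[k::itemsize] = shuffled[k * n_items:(k + 1) * n_items]
    return bytes(out)


# Synthetic, smoothly varying float32 samples (little-endian, 4 bytes each).
values = [1000.0 + 0.25 * i for i in range(4096)]
raw = struct.pack(f"<{len(values)}f", *values)

# Compress the same data with and without the shuffle step.
plain = zlib.compress(raw)
shuffled = zlib.compress(byte_shuffle(raw, 4))

# The shuffle is lossless: decompress, unshuffle, and compare to the input.
assert byte_unshuffle(zlib.decompress(shuffled), 4) == raw

print(f"raw={len(raw)} plain={len(plain)} shuffled={len(shuffled)}")
```

On slowly varying data the high-order bytes of neighbouring samples are nearly identical, so grouping them together gives the compressor long repetitive runs. Actual ratios with Blosc depend on the chosen algorithm, level, and shuffle mode.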