cheesechaser.datapool.nozomi

This module provides a data pool implementation for Nozomi datasets.

It extends the functionality of the IncrementIDDataPool class to specifically handle Nozomi datasets stored in a Hugging Face repository. The module defines a constant for the repository name and a class that initializes the data pool with the appropriate repository and revision information.
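The structure described above (a module-level repository constant plus a thin subclass that forwards the revision and token) can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's actual source: `NOZOMI_REPO`, `StandInIncrementIDDataPool`, and `SketchNozomiDataPool` are hypothetical names, and the real IncrementIDDataPool constructor may accept different arguments.

```python
from typing import Optional

# Assumed constant name; the documentation only states that a repository constant exists.
NOZOMI_REPO = 'deepghs/nozomi_standalone_full'


class StandInIncrementIDDataPool:
    """Hypothetical stand-in for IncrementIDDataPool, used only to keep this sketch runnable."""

    def __init__(self, repo_id: str, revision: str = 'main', hf_token: Optional[str] = None):
        self.repo_id = repo_id
        self.revision = revision
        self.hf_token = hf_token


class SketchNozomiDataPool(StandInIncrementIDDataPool):
    """Illustrative subclass: pins the repository, forwards revision and token to the parent."""

    def __init__(self, revision: str = 'main', hf_token: Optional[str] = None):
        super().__init__(NOZOMI_REPO, revision=revision, hf_token=hf_token)


pool = SketchNozomiDataPool(revision='dev')
print(pool.repo_id, pool.revision)
```

The point of this pattern is that callers never need to spell out the repository name; they choose only the revision and, where needed, the token.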

Note

The dataset deepghs/nozomi_standalone_full is gated; you must be granted access to it before using this module.

NozomiDataPool

class cheesechaser.datapool.nozomi.NozomiDataPool(revision: str = 'main', hf_token: str | None = None)[source]

A data pool class specifically designed for Nozomi datasets.

This class inherits from IncrementIDDataPool and initializes it with the Nozomi-specific repository information. It provides a simple way to create a data pool for Nozomi datasets with optional revision specification.

Parameters:
  • revision (str) – The revision of the Nozomi dataset to use, defaults to 'main'

  • hf_token (Optional[str]) – Hugging Face authentication token used to access the gated repository, defaults to None

Usage:
>>> nozomi_pool = NozomiDataPool()  # Uses the 'main' revision
>>> nozomi_pool_dev = NozomiDataPool(revision='dev')  # Uses the 'dev' revision
>>> nozomi_pool_auth = NozomiDataPool(hf_token='your_token_here')  # Uses authentication

__init__(revision: str = 'main', hf_token: str | None = None)[source]

Initialize the NozomiDataPool with the specified revision and optional authentication token.

This method sets up the data pool using the Nozomi-specific repository and the provided revision. It also allows for optional authentication using a Hugging Face token.

Parameters:
  • revision (str) – The revision of the Nozomi dataset to use, defaults to 'main'

  • hf_token (Optional[str]) – Optional Hugging Face authentication token for accessing private or gated repositories, defaults to None
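Because the dataset is gated, it is usually preferable not to hard-code the token. The sketch below reads it from the `HF_TOKEN` environment variable and assembles the two constructor arguments documented above. The helper names `nozomi_pool_kwargs` and `make_nozomi_pool` are hypothetical and not part of cheesechaser; reading the token from `HF_TOKEN` is an assumed convention.

```python
import os
from typing import Dict, Mapping, Optional


def nozomi_pool_kwargs(revision: str = 'main',
                       env: Optional[Mapping[str, str]] = None) -> Dict[str, Optional[str]]:
    """Build keyword arguments for NozomiDataPool (hypothetical helper)."""
    env = os.environ if env is None else env
    # An empty or missing HF_TOKEN is treated as "no token".
    token = env.get('HF_TOKEN') or None
    return {'revision': revision, 'hf_token': token}


def make_nozomi_pool(revision: str = 'main'):
    """Create the pool; requires cheesechaser to be installed and access to the gated dataset."""
    from cheesechaser.datapool.nozomi import NozomiDataPool  # imported lazily
    return NozomiDataPool(**nozomi_pool_kwargs(revision))


# Only the argument-building step is run here; make_nozomi_pool() is left uncalled.
print(nozomi_pool_kwargs('dev', env={'HF_TOKEN': 'your_token_here'}))
```

Keeping the argument construction separate from the pool creation makes the token handling easy to check without network access or an installed copy of the library.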