cheesechaser.datapool.realbooru

This module provides a data pool implementation for the Realbooru dataset.

The RealbooruDataPool class extends IncrementIDDataPool to handle the Realbooru dataset, which is stored in a Hugging Face repository.

Note

The dataset deepghs/realbooru_full is gated. You must be granted access to it on Hugging Face before using this module.
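Because the repository is gated, requests need a token from an account that has been granted access. The following is a minimal sketch of one way to supply it, assuming the token is kept in the HF_TOKEN environment variable. The helper names resolve_hf_token and open_realbooru_pool are hypothetical and not part of cheesechaser.

```python
import os
from typing import Optional


def resolve_hf_token(explicit: Optional[str] = None) -> Optional[str]:
    """Prefer an explicitly passed token, else fall back to the HF_TOKEN env var."""
    if explicit:
        return explicit
    # Returns None when the variable is unset or empty.
    return os.environ.get("HF_TOKEN") or None


def open_realbooru_pool(revision: str = "main", hf_token: Optional[str] = None):
    """Build a RealbooruDataPool with a resolved token (hypothetical helper)."""
    # Imported lazily so resolve_hf_token stays usable without cheesechaser installed.
    from cheesechaser.datapool.realbooru import RealbooruDataPool

    return RealbooruDataPool(revision=revision, hf_token=resolve_hf_token(hf_token))
```

Keeping the token in the environment avoids hard-coding it in scripts; an explicit argument still takes precedence when given.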

RealbooruDataPool

class cheesechaser.datapool.realbooru.RealbooruDataPool(revision: str = 'main', hf_token: str | None = None)

A data pool class for accessing and managing the Realbooru dataset.

This class inherits from IncrementIDDataPool and is specifically designed to work with the Realbooru dataset stored in a Hugging Face repository. It provides an interface to access and manage the data using incremental IDs.

The Realbooru dataset is a large collection of images and associated metadata, commonly used for machine learning tasks in computer vision and image processing.

Parameters:
  • revision (str) – The specific revision of the Realbooru dataset to use. Defaults to 'main'.

  • hf_token (Optional[str]) – Hugging Face authentication token, used to access gated or private repositories. Defaults to None.
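As an illustration of access by incremental ID, the sketch below downloads a contiguous range of IDs into a local directory. It assumes the pool exposes a batch_download_to_directory(resource_ids, dst_dir, max_workers) method, as the cheesechaser README shows for other pools; that method is not documented in this section, and download_id_range is a hypothetical helper name.

```python
from typing import Any


def download_id_range(pool: Any, start: int, stop: int, dst_dir: str,
                      max_workers: int = 4) -> range:
    """Download resources with IDs in [start, stop) from ``pool`` into ``dst_dir``."""
    if start < 0 or stop <= start:
        raise ValueError("expected 0 <= start < stop")
    ids = range(start, stop)
    # Assumption: the pool provides batch_download_to_directory with these keywords.
    pool.batch_download_to_directory(resource_ids=ids, dst_dir=dst_dir,
                                     max_workers=max_workers)
    return ids


# Usage (needs cheesechaser, network access, and approved access to the gated repo):
# from cheesechaser.datapool.realbooru import RealbooruDataPool
# pool = RealbooruDataPool(hf_token="<your token>")
# download_id_range(pool, 1, 101, "./realbooru_sample")
```

Passing the pool in as an argument keeps the helper independent of how the pool was constructed, so the same function can be exercised with a stub object.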

__init__(revision: str = 'main', hf_token: str | None = None)

Initialize the RealbooruDataPool.

This constructor sets up the data pool by specifying the Hugging Face repository and revision for both the data and index. It uses the _REALBOORU_REPO constant as the repository ID for both data and index.

Parameters:
  • revision (str) – The specific revision of the Realbooru dataset to use. Defaults to 'main'. Choosing a specific revision selects that version of the dataset.

  • hf_token (Optional[str]) – Hugging Face authentication token. If provided, it is used to authenticate requests to gated or private repositories.
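The constructor behaviour described above, with one repository and one revision used for both data and index, can be sketched with a stand-in base class. This is an illustration only: SketchIncrementIDPool, SketchRealbooruPool, and their parameter names are assumptions made for the example, not the actual IncrementIDDataPool signature. The repository ID is the one named in the note at the top of this page.

```python
from typing import Optional

# Repository ID taken from the note at the top of this page.
_REALBOORU_REPO = "deepghs/realbooru_full"


class SketchIncrementIDPool:
    """Stand-in for IncrementIDDataPool; these parameter names are assumptions."""

    def __init__(self, data_repo_id: str, data_revision: str,
                 idx_repo_id: str, idx_revision: str,
                 hf_token: Optional[str] = None):
        self.data_repo_id = data_repo_id
        self.data_revision = data_revision
        self.idx_repo_id = idx_repo_id
        self.idx_revision = idx_revision
        self.hf_token = hf_token


class SketchRealbooruPool(SketchIncrementIDPool):
    """Mirrors the described constructor: one repo and one revision for data and index."""

    def __init__(self, revision: str = "main", hf_token: Optional[str] = None):
        super().__init__(
            data_repo_id=_REALBOORU_REPO,
            data_revision=revision,
            idx_repo_id=_REALBOORU_REPO,
            idx_revision=revision,
            hf_token=hf_token,
        )
```

With this shape, callers only choose a revision and a token; the repository wiring stays inside the subclass.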