cheesechaser.datapool.realbooru
This module provides a data pool implementation for the Realbooru dataset.
The RealbooruDataPool class extends the IncrementIDDataPool to specifically handle the Realbooru dataset, which is stored in a Hugging Face repository.
Note
The dataset deepghs/realbooru_full is gated; you must be granted access to it on Hugging Face before using this module.
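Because the dataset is gated, a token usually has to be supplied somehow. The following is a minimal sketch of one way to obtain it, assuming the token is kept in the `HF_TOKEN` environment variable; `resolve_hf_token` is a hypothetical helper for illustration and is not part of cheesechaser.

```python
import os
from typing import Optional


def resolve_hf_token(explicit: Optional[str] = None) -> Optional[str]:
    """Pick a Hugging Face token (hypothetical helper, not part of cheesechaser).

    An explicitly passed token wins; otherwise fall back to the HF_TOKEN
    environment variable. Empty or whitespace-only values count as "no token".
    """
    for candidate in (explicit, os.environ.get("HF_TOKEN")):
        if candidate and candidate.strip():
            return candidate.strip()
    return None
```

The result can then be passed as `hf_token=resolve_hf_token()` when constructing `RealbooruDataPool`. Keeping the token out of source code avoids committing a credential by accident.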
RealbooruDataPool
- class cheesechaser.datapool.realbooru.RealbooruDataPool(revision: str = 'main', hf_token: str | None = None)
A data pool class for accessing and managing the Realbooru dataset.
This class inherits from IncrementIDDataPool and is specifically designed to work with the Realbooru dataset stored in a Hugging Face repository. It provides an interface to access and manage the data using incremental IDs.
The Realbooru dataset is a large collection of images and associated metadata, commonly used for machine learning tasks in computer vision and image processing.
- Parameters:
revision (str) – The specific revision of the Realbooru dataset to use, defaults to 'main'.
hf_token (Optional[str]) – Optional Hugging Face authentication token for accessing private or gated repositories.
- __init__(revision: str = 'main', hf_token: str | None = None)
Initialize the RealbooruDataPool.
This constructor sets up the data pool by specifying the Hugging Face repository and revision for both the data and index. It uses the _REALBOORU_REPO constant as the repository ID for both data and index.
- Parameters:
revision (str) – The specific revision of the Realbooru dataset to use, defaults to 'main'. Pinning a revision keeps results reproducible as the dataset is updated.
hf_token (Optional[str]) – Optional Hugging Face authentication token. If provided, it is used to access private or gated repositories.
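As a rough illustration of the constructor parameters above, the sketch below builds a pool for a given revision and token and downloads a handful of resources. It assumes `RealbooruDataPool` inherits `batch_download_to_directory(resource_ids, dst_dir)` from cheesechaser's base data pool; `download_realbooru_images` is a hypothetical wrapper, and the ID range and directory in the usage comment are placeholders.

```python
from typing import Iterable, Optional


def download_realbooru_images(
    resource_ids: Iterable[int],
    dst_dir: str,
    revision: str = "main",
    hf_token: Optional[str] = None,
) -> None:
    """Download the given resource IDs into dst_dir (hypothetical wrapper)."""
    # Imported lazily so this helper can be defined without cheesechaser installed.
    from cheesechaser.datapool.realbooru import RealbooruDataPool

    # revision selects the dataset snapshot; hf_token grants access to the gated repo.
    pool = RealbooruDataPool(revision=revision, hf_token=hf_token)

    # Assumption: batch_download_to_directory comes from the base data pool class.
    pool.batch_download_to_directory(
        resource_ids=list(resource_ids),
        dst_dir=dst_dir,
    )


# Example call (placeholder IDs and path; needs network access and dataset access):
# download_realbooru_images(range(1, 11), "./realbooru_sample", hf_token="<your token>")
```

Wrapping the call this way keeps the revision and token choices in one place, so switching to a pinned revision later only changes one argument.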