cheesechaser.datapool.sankaku

This module provides data pool classes for accessing Sankaku image data.

It contains two classes:

  1. SankakuDataPool: For accessing the full Sankaku dataset.

  2. SankakuWebpDataPool: For accessing the WebP-formatted Sankaku dataset with 4M pixel images.

Both classes inherit from IncrementIDDataPool and give access to the respective datasets stored in Hugging Face repositories, so Sankaku images can be retrieved by image ID and used directly in projects or research.

Note

The datasets deepghs/sankaku_full and deepghs/sankaku-webp-4Mpixel are gated. You must be granted access to them before using this module.
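Because both repositories are gated, a token usually has to reach the hf_token parameter somehow. The following is a minimal sketch of one way to do that, not part of the module itself: resolve_hf_token and open_full_pool are hypothetical helper names, and reading the HF_TOKEN environment variable as a fallback is an assumption made for this example.

```python
import os
from typing import Mapping, Optional


def resolve_hf_token(explicit: Optional[str] = None,
                     env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the explicit token if given, else the HF_TOKEN variable, else None."""
    if explicit:
        return explicit
    # ASSUMPTION: HF_TOKEN is used as the fallback variable; an empty value counts as unset.
    return env.get("HF_TOKEN") or None


def open_full_pool(hf_token: Optional[str] = None):
    """Build a SankakuDataPool with the resolved token (hypothetical helper)."""
    # Imported lazily so this sketch can be loaded without cheesechaser installed.
    from cheesechaser.datapool.sankaku import SankakuDataPool

    return SankakuDataPool(revision="main", hf_token=resolve_hf_token(hf_token))
```

Calling open_full_pool() only succeeds if the account behind the token has already been granted access to deepghs/sankaku_full.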

SankakuDataPool

class cheesechaser.datapool.sankaku.SankakuDataPool(revision: str = 'main', hf_token: str | None = None)[source]

A data pool class for accessing the full Sankaku dataset.

This class inherits from IncrementIDDataPool and is configured to access the full Sankaku dataset stored in the ‘deepghs/sankaku_full’ repository. It provides methods to retrieve image data based on image IDs.

Parameters:

  • revision (str) – The revision of the dataset to use, defaults to ‘main’.

  • hf_token (Optional[str]) – Hugging Face authentication token, defaults to None.

Note:

This class uses a base level of 4 for file organization, which means the images are stored in a directory structure with 4 levels of subdirectories.

__init__(revision: str = 'main', hf_token: str | None = None)[source]

Initialize the SankakuDataPool.

Parameters:
  • revision (str) – The revision of the dataset to use, defaults to ‘main’.

  • hf_token (Optional[str]) – Hugging Face authentication token, defaults to None.
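As an illustration of how the constructor might be used, the sketch below builds a SankakuDataPool and downloads a handful of images by ID. The helper name download_full_images is hypothetical, and the call to batch_download_to_directory(resource_ids=..., dst_dir=...) is an assumption about the interface inherited from the base data pool; it is not described on this page, so check it against the installed version.

```python
from typing import Any, Iterable, List, Optional


def download_full_images(resource_ids: Iterable[int], dst_dir: str,
                         pool: Optional[Any] = None,
                         hf_token: Optional[str] = None) -> List[int]:
    """Download the given image IDs into dst_dir and return the IDs requested."""
    # Deduplicate and sort so each ID is requested only once.
    ids = sorted(set(int(i) for i in resource_ids))
    if pool is None:
        # Imported lazily so this sketch can be loaded without cheesechaser installed.
        from cheesechaser.datapool.sankaku import SankakuDataPool

        pool = SankakuDataPool(revision="main", hf_token=hf_token)
    # ASSUMPTION: this method and its keyword names come from the base data pool class.
    pool.batch_download_to_directory(resource_ids=ids, dst_dir=dst_dir)
    return ids
```

The optional pool argument allows a stand-in object to be passed for testing, so the helper can be exercised without network access or dataset permissions.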

SankakuWebpDataPool

class cheesechaser.datapool.sankaku.SankakuWebpDataPool(revision: str = 'main', hf_token: str | None = None)[source]

A data pool class for accessing the WebP-formatted Sankaku dataset with 4M pixel images.

This class inherits from IncrementIDDataPool and is configured to access the WebP-formatted Sankaku dataset stored in the ‘deepghs/sankaku-webp-4Mpixel’ repository. It provides methods to retrieve WebP-formatted image data based on image IDs.

Parameters:
  • revision (str) – The revision of the dataset to use, defaults to ‘main’.

  • hf_token (Optional[str]) – Hugging Face authentication token, defaults to None.

Note:

This class uses a base level of 3 for file organization, which means the images are stored in a directory structure with 3 levels of subdirectories. The dataset is gated, so reading it requires a Hugging Face account that has been granted access.

__init__(revision: str = 'main', hf_token: str | None = None)[source]

Initialize the SankakuWebpDataPool.

Parameters:
  • revision (str) – The revision of the dataset to use, defaults to ‘main’.

  • hf_token (Optional[str]) – Hugging Face authentication token, defaults to None.
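Since both classes take the same constructor arguments, switching between the full and WebP datasets can be reduced to a single flag. The sketch below shows one possible way to do this. pool_class_name, open_sankaku_pool and SANKAKU_REPOS are hypothetical names introduced for the example; the repository IDs are the ones given in the class descriptions.

```python
import importlib
from typing import Optional

# Repository behind each class, as stated in the class descriptions.
SANKAKU_REPOS = {
    "SankakuDataPool": "deepghs/sankaku_full",
    "SankakuWebpDataPool": "deepghs/sankaku-webp-4Mpixel",
}


def pool_class_name(prefer_webp: bool) -> str:
    """Pick the class name for the WebP dataset or the full dataset."""
    return "SankakuWebpDataPool" if prefer_webp else "SankakuDataPool"


def open_sankaku_pool(prefer_webp: bool = True, revision: str = "main",
                      hf_token: Optional[str] = None):
    """Instantiate the chosen pool class (hypothetical helper)."""
    # Imported lazily so this sketch can be loaded without cheesechaser installed.
    module = importlib.import_module("cheesechaser.datapool.sankaku")
    pool_cls = getattr(module, pool_class_name(prefer_webp))
    return pool_cls(revision=revision, hf_token=hf_token)
```

For example, open_sankaku_pool(prefer_webp=False, hf_token=token) would return a SankakuDataPool, provided the token's account has access to the corresponding repository.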