cheesechaser.datapool.nhentai
This module provides data pool classes for managing and accessing NHentai manga and image data.
The module includes two main classes:
NHentaiImagesDataPool: A data pool for managing NHentai images.
NHentaiMangaDataPool: A data pool for managing NHentai manga data, including image associations.
These classes provide functionality for retrieving manga information, downloading images, and managing resources from a Hugging Face dataset repository.
Note
The dataset deepghs/nhentai_full is gated; you must be granted access to it before using this module.
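Because the dataset is gated, the pool constructors need a token for an account that has been granted access. The following is a minimal sketch of one way to supply it, assuming the token is stored in the HF_TOKEN environment variable. The helper name resolve_hf_token is hypothetical and is not part of cheesechaser.

```python
import os
from typing import Optional


def resolve_hf_token(env_var: str = "HF_TOKEN") -> Optional[str]:
    """Return the token stored in `env_var`, or None if it is unset or blank.

    Hypothetical helper for illustration; it is not part of cheesechaser.
    """
    token = os.environ.get(env_var, "").strip()
    return token or None


# Intended usage (requires cheesechaser and granted access to the gated dataset):
#   from cheesechaser.datapool.nhentai import NHentaiMangaDataPool
#   manga_pool = NHentaiMangaDataPool(hf_token=resolve_hf_token())
```

Returning None for a missing token matches the documented default of hf_token, so the result can be passed through unchanged.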
NHentaiImagesDataPool
- class cheesechaser.datapool.nhentai.NHentaiImagesDataPool(revision: str = 'main', hf_token: str | None = None)[source]
A data pool class for managing NHentai images.
This class extends the IncrementIDDataPool to provide specific functionality for handling NHentai image data. It allows for efficient retrieval and management of image resources from a Hugging Face dataset repository.
- Parameters:
revision (str) – The revision of the data to use, defaults to ‘main’.
hf_token (Optional[str]) – Hugging Face API token for authentication, defaults to None.
- Usage:
images_pool = NHentaiImagesDataPool(revision='latest')
# Use images_pool to access and manage NHentai images
NHentaiMangaDataPool
- class cheesechaser.datapool.nhentai.NHentaiMangaDataPool(revision: str = 'main', hf_token: str | None = None)[source]
A data pool class for managing NHentai manga data.
This class provides methods for retrieving manga information, downloading associated images, and managing manga resources. It utilizes the NHentaiImagesDataPool for handling image data.
- Parameters:
revision (str) – The revision of the data to use, defaults to ‘main’.
hf_token (Optional[str]) – Hugging Face API token for authentication, defaults to None.
- Usage:
manga_pool = NHentaiMangaDataPool(revision='latest')
# Use manga_pool to access manga information and associated images
- __init__(revision: str = 'main', hf_token: str | None = None)[source]
Initialize the NHentaiMangaDataPool.
- Parameters:
revision (str) – The revision of the data to use, defaults to ‘main’.
hf_token (Optional[str]) – Hugging Face API token for authentication, defaults to None.
- classmethod manga_id_map(revision: str = 'main', local_files_prefer: bool = True, hf_token: str | None = None)[source]
Get a mapping of manga IDs to their associated image IDs.
This method is cached for efficiency and provides a quick lookup for manga-to-image associations.
- Parameters:
revision (str) – The revision of the data to use, defaults to ‘main’.
local_files_prefer (bool) – Whether to prefer local files, defaults to True.
hf_token (Optional[str]) – Hugging Face API token for authentication, defaults to None.
- Returns:
A dictionary mapping manga IDs to lists of image IDs.
- Return type:
dict
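The documented return shape, a dictionary from manga ID to a list of image IDs, can be illustrated with a hand-written stand-in. The IDs below are invented for the example, and the helpers image_ids_for and total_images are hypothetical. With the library installed and access granted, the real mapping would come from NHentaiMangaDataPool.manga_id_map().

```python
from typing import Dict, List

# Stand-in with the documented shape: manga ID -> list of image IDs.
# These IDs are made up for illustration only.
example_id_map: Dict[int, List[int]] = {
    1001: [1, 2, 3],
    1002: [4, 5],
}


def image_ids_for(id_map: Dict[int, List[int]], manga_id: int) -> List[int]:
    """Return a copy of the image IDs for one manga, or an empty list if unknown."""
    return list(id_map.get(manga_id, []))


def total_images(id_map: Dict[int, List[int]]) -> int:
    """Count the image IDs across every manga in the mapping."""
    return sum(len(image_ids) for image_ids in id_map.values())
```

Returning a copy keeps callers from accidentally mutating a mapping that the method is documented to cache.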
- classmethod manga_posts_table(revision: str = 'main', local_files_prefer: bool = True, hf_token: str | None = None)[source]
Retrieve the manga posts table as a pandas DataFrame.
This method is cached for efficiency and provides access to the complete manga post information.
- Parameters:
revision (str) – The revision of the data to use, defaults to ‘main’.
local_files_prefer (bool) – Whether to prefer local files, defaults to True.
hf_token (Optional[str]) – Hugging Face API token for authentication, defaults to None.
- Returns:
A pandas DataFrame containing manga post information.
- Return type:
pandas.DataFrame
- mock_resource(resource_id, resource_info) → AbstractContextManager[Tuple[str, Any]] [source]
Create a mock resource for a given manga.
This method downloads the associated images for a manga and organizes them in a temporary directory. It’s useful for processing or analyzing manga content.
- Parameters:
resource_id (int) – The ID of the manga resource.
resource_info (Any) – Additional information about the resource.
- Yield:
A tuple containing the path to the temporary directory with the images and the resource info.
- Return type:
Tuple[str, Any]
- Raises:
ResourceNotFoundError – If the specified manga resource is not found.
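The context-manager contract described above (yield a temporary directory together with the resource info, and fail for an unknown manga) can be sketched without the library. This is a stand-in, not cheesechaser's implementation: fake_mock_resource, the local ResourceNotFoundError class, the file naming, and the in-memory ID map are all assumptions made for illustration. Empty placeholder files are written where the real method would download images, and removal of the directory on exit is likewise assumed.

```python
import os
import tempfile
from contextlib import contextmanager
from typing import Any, Dict, Iterator, List, Tuple


class ResourceNotFoundError(Exception):
    """Local stand-in for the library's exception of the same name."""


@contextmanager
def fake_mock_resource(
    id_map: Dict[int, List[int]], resource_id: int, resource_info: Any
) -> Iterator[Tuple[str, Any]]:
    """Yield (temporary directory, resource_info) for one manga ID."""
    if resource_id not in id_map:
        raise ResourceNotFoundError(f"Manga {resource_id!r} not found.")
    with tempfile.TemporaryDirectory() as tmp_dir:
        for image_id in id_map[resource_id]:
            # Placeholder files; the real method downloads the images instead.
            with open(os.path.join(tmp_dir, f"{image_id}.bin"), "wb"):
                pass
        yield tmp_dir, resource_info
    # Leaving the inner `with` removes the temporary directory.


# Invented IDs and info, for illustration only.
with fake_mock_resource({1001: [1, 2, 3]}, 1001, {"title": "example"}) as (path, info):
    listed = sorted(os.listdir(path))
```

In this sketch the directory only exists inside the with block, so any processing of the files has to finish before the block ends.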