rllm.datasets.IMDB
- class rllm.datasets.IMDB(cached_dir: str, transform: Callable | None = None, force_reload: bool | None = False)
Bases: Dataset
IMDB is a heterogeneous graph containing three types of entities (movies, actors, and directors), as collected in the paper "MAGNN: Metapath Aggregated Graph Neural Network for Heterogeneous Graph Embedding".
The movies are divided into three classes (action, comedy, drama) according to genre. Movie features are elements of a bag-of-words representation of each movie's plot keywords.
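To make the feature description above concrete, here is a minimal sketch of how a bag-of-words vector can be built from a list of plot keywords. The vocabulary, the keywords, and the helper name `bag_of_words` are hypothetical and are not taken from the dataset or from rllm. The sketch uses counts; whether the real features are counts or binary indicators is not stated here.

```python
from collections import Counter


def bag_of_words(keywords, vocabulary):
    """Return a count vector over `vocabulary` for one movie's keywords."""
    counts = Counter(keywords)
    # Counter returns 0 for words that never occur, so no special-casing is needed.
    return [counts[word] for word in vocabulary]


# Hypothetical toy data, not taken from the real dataset.
vocabulary = ["alien", "heist", "love", "war"]
movie_keywords = ["heist", "love", "heist"]

features = bag_of_words(movie_keywords, vocabulary)
print(features)  # [0, 2, 1, 0]
```

Each position in the vector corresponds to one vocabulary entry, so every movie gets a feature vector of the same length regardless of how many keywords it has.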
- Parameters:
cached_dir (str) – Root directory where dataset should be saved.
transform (callable, optional) – A function/transform that takes in a HeteroGraphData object and returns a transformed version. The data object will be transformed before every access. (default: None)
force_reload (bool, optional) – If set to True, the dataset will be re-processed. (default: False)
Statistics:
Name     movie    actors   directors
nodes    4,278    5,257    2,081
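The constructor arguments above might be used as in the following sketch. It relies only on the documented signature (`cached_dir`, `transform`, `force_reload`). The wrapper `load_imdb` and the `identity` transform are hypothetical names introduced for illustration, and how the returned dataset is indexed or accessed is not covered here.

```python
def load_imdb(cached_dir, force_reload=False):
    """Construct the IMDB dataset using the documented constructor arguments."""
    # Imported lazily so the sketch can be read without rllm installed.
    from rllm.datasets import IMDB

    def identity(data):
        # Hypothetical transform: receives the data object and returns it unchanged.
        return data

    return IMDB(
        cached_dir=cached_dir,
        transform=identity,
        force_reload=force_reload,
    )
```

Calling `load_imdb("./data")` would save the dataset under `./data`; passing `force_reload=True` asks for it to be re-processed instead of reusing the processed files.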
- property processed_filenames
File names in self.processed_dir.
- property raw_filenames
File names in self.raw_dir.