rllm.datasets.TLF2KDataset
- class rllm.datasets.TLF2KDataset(cached_dir: str, force_reload: bool | None = False)
Bases: Dataset
TLF2KDataset is a multi-table relational dataset containing three tables, as collected in the rLLM: Relational Table Learning with LLMs paper.
It contains three tables: artists, user_artists, and user_friends. The artists table holds information about each artist, such as location and genre. The user_artists table records the interactions between users and artists in the format [user, artist, listening_count]. The user_friends table represents bidirectional friendships between users. The default task of this dataset is to predict each artist's genre.
- Parameters:
cached_dir (str) – Root directory where dataset should be saved.
force_reload (bool) – If set to True, the dataset is re-processed.
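As a minimal sketch of how the constructor above might be called, the helper below wraps the documented signature. The helper name `load_tlf2k` and the default `./data` path are hypothetical, and creating the cache directory beforehand is an assumption rather than a documented requirement.

```python
from pathlib import Path


def load_tlf2k(cached_dir: str = "./data", force_reload: bool = False):
    """Build a TLF2KDataset rooted at `cached_dir` (hypothetical helper)."""
    # Imported lazily so this module can be imported without rllm installed.
    from rllm.datasets import TLF2KDataset

    # Assumption: make sure the cache root exists before the dataset uses it.
    Path(cached_dir).mkdir(parents=True, exist_ok=True)

    # Only the two documented constructor parameters are passed.
    return TLF2KDataset(cached_dir=cached_dir, force_reload=force_reload)
```

Calling `load_tlf2k("./data", force_reload=True)` should trigger re-processing; the returned object's `raw_filenames` and `processed_filenames` properties can then be inspected to see which files it uses.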
Statistics:

Table          Rows     Features
artists        9,047    10
user_artists   80,009   3
user_friends   12,717   2
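The relationship between the three tables can be illustrated with a few toy rows in plain Python. This is only a sketch: every value below is made up for illustration, the real tables have more columns, and storing each friendship in both directions is an assumption about how "bidirectional" is encoded.

```python
from collections import defaultdict

# Hypothetical toy rows, not taken from the real dataset.
# artists: artist id -> attributes such as location and genre.
artists = {
    101: {"location": "United Kingdom", "genre": "rock"},
    102: {"location": "United States", "genre": "jazz"},
}
# user_artists rows follow the format [user, artist, listening_count].
user_artists = [(1, 101, 50), (2, 101, 20), (2, 102, 5), (3, 102, 40)]
# user_friends rows: (user, friend). Assumption: both directions are stored.
user_friends = [(1, 2), (2, 1), (2, 3), (3, 2)]


def total_listens_per_artist(rows):
    """Sum listening_count over all users for each artist."""
    totals = defaultdict(int)
    for _user, artist, count in rows:
        totals[artist] += count
    return dict(totals)


def friends_of(user, rows):
    """Return the sorted list of friends recorded for `user`."""
    return sorted(friend for u, friend in rows if u == user)


def is_symmetric(rows):
    """Check that every (a, b) friendship also appears as (b, a)."""
    pairs = set(rows)
    return all((b, a) in pairs for a, b in pairs)
```

Aggregates such as the per-artist listening totals are the kind of signal a model could combine with the artists table when predicting the genre label.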
- property processed_filenames
File names in self.processed_dir.
- property raw_filenames
File names in self.raw_dir.