rllm.datasets
Graph Datasets
Heterogeneous Graph Datasets
DBLP is a heterogeneous graph containing four types of entities, as collected in the MAGNN: Metapath Aggregated Graph Neural Network for Heterogeneous Graph Embedding paper. |
|
IMDB is a heterogeneous graph containing three types of entities, as collected in the MAGNN: Metapath Aggregated Graph Neural Network for Heterogeneous Graph Embedding paper. |
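As a rough illustration of what "heterogeneous" means for the two datasets above, the sketch below stores nodes by type and edges by (source type, relation, target type), then follows a metapath in the sense used by MAGNN. It uses plain Python only and is not the rllm API: the type names, relation names, ids, and helper functions are all hypothetical and chosen for illustration.

```python
# Toy heterogeneous graph (hypothetical data, not loaded from rllm).
# Nodes are counted per type; edges are grouped by a typed triple
# (source type, relation, target type) and stored as (src_id, dst_id) pairs.
num_nodes = {"author": 3, "paper": 2, "venue": 1}
edges = {
    ("author", "writes", "paper"): [(0, 0), (1, 0), (2, 1)],
    ("paper", "published_in", "venue"): [(0, 0), (1, 0)],
}


def check_edges(num_nodes, edges):
    """Return True if every edge endpoint is a valid id for its node type."""
    for (src_type, _, dst_type), pairs in edges.items():
        for src, dst in pairs:
            if not (0 <= src < num_nodes[src_type]):
                return False
            if not (0 <= dst < num_nodes[dst_type]):
                return False
    return True


def neighbors(edge_type, src_id):
    """Return the target ids reachable from src_id over one edge type."""
    return [dst for src, dst in edges[edge_type] if src == src_id]


def metapath_neighbors(start_id, edge_types):
    """Follow a sequence of edge types (a metapath) from one start node."""
    frontier = {start_id}
    for edge_type in edge_types:
        frontier = {dst for src in frontier for dst in neighbors(edge_type, src)}
    return sorted(frontier)


# Metapath author -> paper -> venue.
apv = [("author", "writes", "paper"), ("paper", "published_in", "venue")]
venues_of_author_2 = metapath_neighbors(2, apv)
```

Keying edges by a typed triple keeps each relation separate, which is what distinguishes this layout from the single edge list of a homogeneous graph.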
Homogeneous Graph Datasets
The citation network datasets from the Revisiting Semi-Supervised Learning with Graph Embeddings paper, which include |
|
The citation network datasets, including cora and pubmed, collected in the Harnessing Explanations: LLM-to-LM Interpreter for Enhanced Text-Attributed Graph Representation Learning paper. |
|
Three text-attributed graph datasets: cora from the Automating the Construction of Internet Portals paper, pubmed from the Collective Classification in Network Data paper, and citeseer from the CiteSeer: An Automatic Citation Indexing System paper. |
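For orientation, a citation network of the kind listed above can be pictured as a single node type (papers) and a single edge type (citations). The following is a minimal sketch in plain Python, assuming citations are treated as undirected, which is a common convention for these benchmarks. It does not use the rllm API; the edge list and the helper name `build_adjacency` are hypothetical.

```python
# Toy homogeneous citation graph (hypothetical data, not loaded from rllm).
# Each pair (u, v) means paper u and paper v are linked by a citation.
edge_list = [(0, 1), (1, 2), (2, 3), (0, 2)]
num_nodes = 4


def build_adjacency(num_nodes, edge_list):
    """Build a symmetric adjacency list from an undirected edge list."""
    adjacency = [set() for _ in range(num_nodes)]
    for u, v in edge_list:
        adjacency[u].add(v)
        adjacency[v].add(u)
    # Sort neighbor ids so the result is deterministic.
    return [sorted(neighbor_ids) for neighbor_ids in adjacency]


adjacency = build_adjacency(num_nodes, edge_list)
degrees = [len(neighbor_ids) for neighbor_ids in adjacency]
```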
Table Datasets
Single Table Datasets
The Titanic dataset is a widely used dataset for machine learning and statistical analysis, as featured in the Titanic: Machine Learning from Disaster competition on Kaggle. |
|
The Adult dataset is a classic data mining dataset extracted from the 1994 Census database. |
|
The Bank Marketing dataset is related to direct marketing campaigns of a Portuguese banking institution. |
|
The Churn Modelling dataset is used to predict which customers are likely to leave an organization, based on their attributes. |
Multi-Table Datasets
TACM12KDataset is a multi-table relational dataset containing 4 tables, as collected in the rLLM: Relational Table Learning with LLMs paper. |
|
TLF2KDataset is a multi-table relational dataset containing 3 tables, as collected in the rLLM: Relational Table Learning with LLMs paper. |
|
TML1MDataset is a multi-table relational dataset containing 3 tables, as collected in the rLLM: Relational Table Learning with LLMs paper. |
|
Override methods for RelBench datasets. |
|
A wrapper for the rel-f1 dataset from the RelBench benchmark, introduced in the RelBench: A Benchmark for Deep Learning on Relational Databases paper, which contains Formula 1 racing data with 9 tables and 3 tasks. |
|
An enumeration. |
|
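To make the idea of a multi-table relational dataset more concrete, the sketch below links three small tables through key columns and joins them into flat rows. It is written in plain Python and does not use the rllm or RelBench APIs: the table contents, column names (`user_id`, `movie_id`, `rating`, and so on), and helper functions are hypothetical and only meant to show how foreign keys tie the tables together.

```python
# Toy multi-table relational data (hypothetical, not loaded from rllm).
# Two entity tables are keyed by primary key; the third table references
# both of them through foreign-key columns.
users = {1: {"age": 25}, 2: {"age": 40}}
movies = {10: {"title": "Movie A"}, 11: {"title": "Movie B"}}
ratings = [
    {"user_id": 1, "movie_id": 10, "rating": 5},
    {"user_id": 1, "movie_id": 11, "rating": 3},
    {"user_id": 2, "movie_id": 10, "rating": 4},
]


def join_ratings(ratings, users, movies):
    """Attach user and movie columns to each rating row via its foreign keys."""
    joined = []
    for row in ratings:
        merged = dict(row)
        merged.update(users[row["user_id"]])
        merged.update(movies[row["movie_id"]])
        joined.append(merged)
    return joined


def mean_rating_per_user(ratings):
    """Aggregate the rating table by user, a typical relational feature."""
    totals = {}
    counts = {}
    for row in ratings:
        uid = row["user_id"]
        totals[uid] = totals.get(uid, 0) + row["rating"]
        counts[uid] = counts.get(uid, 0) + 1
    return {uid: totals[uid] / counts[uid] for uid in totals}


joined_rows = join_ratings(ratings, users, movies)
user_means = mean_rating_per_user(ratings)
```

Keeping the tables separate and joining on demand avoids duplicating user and movie attributes in every rating row, which is the main reason such datasets ship as several tables rather than one.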