rllm.datasets.Titanic
- class rllm.datasets.Titanic(cached_dir: str, forced_reload: bool | None = False, transform=None, tokenizer_config=None)
Bases: Dataset
The Titanic dataset is widely used for machine learning and statistical analysis, and is featured in the Titanic: Machine Learning from Disaster competition on Kaggle.
The dataset contains various features related to the passengers aboard the Titanic, and the task is to predict whether a passenger survived.
- Parameters:
cached_dir (str) – Root directory where the dataset should be saved.
forced_reload (bool) – If set to True, the dataset will be re-processed.
Statistics:
    Name    Passengers    Features
    Size    891           12
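As a rough illustration of the constructor parameters above, the following minimal sketch wraps the class in a small helper. The helper name `load_titanic` is hypothetical and not part of rllm; only `cached_dir` and `forced_reload` come from the signature documented here. The import is deferred so the snippet can be loaded even where rllm is not installed.

```python
def load_titanic(cached_dir, forced_reload=False):
    """Build the Titanic dataset (hypothetical helper, not part of rllm).

    cached_dir:    root directory where the dataset should be saved.
    forced_reload: if True, ask the dataset to be re-processed.
    """
    # Deferred import: requires the rllm package at call time only.
    from rllm.datasets import Titanic

    return Titanic(cached_dir=cached_dir, forced_reload=forced_reload)
```

Calling `load_titanic("./data")` would construct the dataset under `./data`, and passing `forced_reload=True` would request re-processing. The path is a placeholder.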
- property processed_filenames
File names in self.processed_dir.
- property raw_filenames
File names in self.raw_dir.
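The two properties above can be inspected together. This is a small sketch that assumes each property returns an iterable of file-name strings, which the reference does not state explicitly. The helper `describe_files` is hypothetical and works on any object exposing these two attributes.

```python
def describe_files(dataset):
    """Collect raw and processed file names from a dataset-like object.

    Assumption: `raw_filenames` and `processed_filenames` are iterables
    of strings (for example, lists). `describe_files` is a hypothetical
    helper, not part of rllm.
    """
    return {
        "raw": list(dataset.raw_filenames),
        "processed": list(dataset.processed_filenames),
    }
```

Given a constructed `Titanic` instance, `describe_files(dataset)` would return a dictionary with one list per directory, which is handy for checking what was downloaded and what was generated.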