rllm.datasets.Adult

class rllm.datasets.Adult(cached_dir: str, forced_reload: bool | None = False)[source]

Bases: Dataset

The Adult dataset is a classic data mining benchmark extracted from the 1994 Census database.

The dataset encompasses a variety of features pertaining to adults and their income. The primary objective is to predict whether an individual’s annual income surpasses $50,000.
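The prediction target described above is binary, so it can be illustrated with a small label encoder. This is a minimal sketch, not part of rllm: `encode_income` is a hypothetical helper, and it assumes the raw label strings of the original UCI distribution (`>50K` and `<=50K`, with a trailing period in the test split). The processed rllm data may represent the target differently.

```python
def encode_income(label: str) -> int:
    """Map a raw income label to 1 (above $50,000) or 0 (at or below).

    Hypothetical helper; assumes UCI-style label strings.
    """
    # Drop surrounding whitespace and the trailing period used in the UCI test split.
    cleaned = label.strip().rstrip(".")
    if cleaned == ">50K":
        return 1
    if cleaned == "<=50K":
        return 0
    raise ValueError(f"unrecognized income label: {label!r}")
```

Raising on unknown labels, rather than defaulting to 0, keeps malformed rows from silently counting as the negative class.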

Parameters:
  • cached_dir (str) – Root directory where the dataset should be saved.

  • forced_reload (bool) – If set to True, the dataset is processed again even if processed files already exist.
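A minimal construction sketch using only the two parameters listed above. The wrapper `load_adult` and the default path `./data` are hypothetical names chosen for illustration. The import path follows the class name `rllm.datasets.Adult`, and the import is deferred so the function can be defined even where rllm is not installed.

```python
def load_adult(cached_dir: str = "./data", forced_reload: bool = False):
    """Construct the Adult dataset under `cached_dir` (hypothetical wrapper)."""
    # Deferred import: rllm is only required when the function is called.
    from rllm.datasets import Adult

    return Adult(cached_dir=cached_dir, forced_reload=forced_reload)
```

Calling `load_adult("./data")` returns the dataset object, whose `raw_filenames` and `processed_filenames` properties can then be inspected. Passing `forced_reload=True` requests reprocessing.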

Statistics:
Name   Individuals  Features
Size   48842        14

download()[source]

Download the dataset to self.raw_dir.

process(num_rows: int | None = None) → None[source]

Process the data and save it to ‘./cached_dir/{dataset}/processed/’.
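The directory layout implied by this path can be sketched with `pathlib`. This is an illustration under assumptions, not rllm's implementation: `dataset_dirs` is a hypothetical helper, the folder name passed as `dataset` is not specified here, and the `raw` subfolder is assumed by analogy with `processed`.

```python
from pathlib import Path


def dataset_dirs(cached_dir: str, dataset: str) -> dict:
    """Return the assumed raw and processed directories for a dataset."""
    root = Path(cached_dir) / dataset
    return {
        "raw": root / "raw",              # assumed location of self.raw_dir
        "processed": root / "processed",  # matches the documented pattern
    }
```

For example, `dataset_dirs("./cached_dir", "adult")["processed"]` gives the path `cached_dir/adult/processed`, where `adult` is a placeholder folder name.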

property processed_filenames

File names in self.processed_dir.

property raw_filenames

File names in self.raw_dir.