rllm.preprocessing.TextEmbedderConfig

class rllm.preprocessing.TextEmbedderConfig(text_embedder: Callable[[list[str]], Tensor], batch_size: int | None = None)

Bases: object

Configuration for text embedding in preprocessing pipelines. It defines the embedding callable and optional mini-batch size used during inference.

Parameters:
  • text_embedder (Callable[[list[str]], Tensor]) – Callable that maps a batch of strings to embeddings.

  • batch_size (int | None) – Mini-batch size for embedding. If None, all samples are embedded in a single call.
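
The effect of batch_size described above can be illustrated with a small, self-contained sketch. It is not rllm's implementation: `toy_embedder` and `embed_in_batches` are hypothetical names introduced only for this example, and plain Python lists stand in for the Tensor that a real embedder would return. It assumes the batch size simply splits the input into consecutive chunks that are each passed to the callable.

```python
from typing import Callable, List, Optional

# Stand-in for one row of an embedding Tensor (assumption for this sketch).
Embedding = List[float]


def toy_embedder(texts: List[str]) -> List[Embedding]:
    """Hypothetical embedder: maps each string to [length, vowel count]."""
    return [
        [float(len(t)), float(sum(c in "aeiou" for c in t.lower()))]
        for t in texts
    ]


def embed_in_batches(
    texts: List[str],
    text_embedder: Callable[[List[str]], List[Embedding]],
    batch_size: Optional[int] = None,
) -> List[Embedding]:
    """Hypothetical helper mirroring the documented batch_size semantics."""
    if batch_size is None:
        # No batch size given: embed all samples in one call.
        return text_embedder(texts)
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer or None")
    embeddings: List[Embedding] = []
    # Otherwise call the embedder once per consecutive mini-batch.
    for start in range(0, len(texts), batch_size):
        embeddings.extend(text_embedder(texts[start:start + batch_size]))
    return embeddings


texts = ["red apple", "green pear", "blue plum"]
all_at_once = embed_in_batches(texts, toy_embedder)
in_batches = embed_in_batches(texts, toy_embedder, batch_size=2)
```

Both calls produce the same embeddings here. The batch size only changes how many strings the callable sees per call, which matters mainly for memory use with large models. With rllm installed, a callable that returns a Tensor would presumably be passed as `TextEmbedderConfig(text_embedder=..., batch_size=2)`, following the signature above.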