rllm.nn.conv.table_conv.ExcelFormerConv

class rllm.nn.conv.table_conv.ExcelFormerConv(conv_dim: int, num_heads: int = 8, head_dim: int = 16, dropout: float = 0.5)

Bases: Module

The ExcelFormerConv Layer introduced in the “ExcelFormer: A neural network surpassing GBDTs on tabular data” paper.

This layer handles tabular data by combining normalization, attention, and a gated linear unit (GLU); in essence, it is a variant of the attention mechanism tailored to tabular data. If metadata is provided, the pre-encoder is used to preprocess the input data before applying the subsequent encoders. The layer normalizes the input, applies semi-permeable attention, and then applies a GLU layer to strengthen representation learning.

Parameters:
  • conv_dim (int) – Input/Output dimensionality.

  • num_heads (int) – Number of attention heads (default: 8).

  • head_dim (int) – Dimensionality of each attention head (default: 16).

  • dropout (float) – Attention module dropout (default: 0.5).

Example

>>> import torch
>>> from rllm.nn.conv.table_conv import ExcelFormerConv
>>> conv = ExcelFormerConv(conv_dim=32, num_heads=8, head_dim=16, dropout=0.1)
>>> x = torch.randn(10, 7, 32)
>>> out = conv(x)
>>> out.shape
torch.Size([10, 7, 32])

forward(x: Tensor) → Tensor

Apply attention block and GLU block with residual connections.

Parameters:
  • x (Tensor) – Input tensor of shape [batch_size, num_cols, conv_dim].

Returns:
  Output tensor with the same shape as the input.

Return type:
  Tensor
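
The forward pass described above (an attention block and a GLU block, each wrapped in a residual connection) can be sketched in plain PyTorch as follows. This is a minimal illustration under stated assumptions, not rllm's actual implementation: the class name SketchExcelFormerBlock and its submodule names are hypothetical, the semi-permeable attention is approximated by a lower-triangular mask that assumes columns are already ordered from most to least informative, and the GLU uses PyTorch's standard sigmoid gate, which may differ from the gate used in the paper.

```python
import torch
from torch import nn
import torch.nn.functional as F


class SketchExcelFormerBlock(nn.Module):
    """Hypothetical stand-in for illustration; not rllm's ExcelFormerConv."""

    def __init__(self, conv_dim: int, num_heads: int = 8,
                 head_dim: int = 16, dropout: float = 0.5):
        super().__init__()
        inner_dim = num_heads * head_dim  # assumption: heads are concatenated
        self.num_heads = num_heads
        self.head_dim = head_dim
        self.norm1 = nn.LayerNorm(conv_dim)
        self.to_qkv = nn.Linear(conv_dim, inner_dim * 3, bias=False)
        self.attn_dropout = nn.Dropout(dropout)
        self.proj = nn.Linear(inner_dim, conv_dim)
        self.norm2 = nn.LayerNorm(conv_dim)
        self.glu_in = nn.Linear(conv_dim, conv_dim * 2)  # value half + gate half

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, n, _ = x.shape  # [batch_size, num_cols, conv_dim]

        # Attention block: normalize, attend, add residual.
        q, k, v = self.to_qkv(self.norm1(x)).chunk(3, dim=-1)
        # [b, n, heads * head_dim] -> [b, heads, n, head_dim]
        q, k, v = (t.view(b, n, self.num_heads, self.head_dim).transpose(1, 2)
                   for t in (q, k, v))
        scores = q @ k.transpose(-2, -1) / self.head_dim ** 0.5  # [b, heads, n, n]
        # Assumed mask: column i may only attend to columns j <= i.
        blocked = torch.triu(
            torch.ones(n, n, dtype=torch.bool, device=x.device), diagonal=1)
        scores = scores.masked_fill(blocked, float("-inf"))
        attn = self.attn_dropout(scores.softmax(dim=-1))
        out = (attn @ v).transpose(1, 2).reshape(b, n, -1)
        x = x + self.proj(out)

        # GLU block: normalize, gate, add residual.
        return x + F.glu(self.glu_in(self.norm2(x)), dim=-1)


block = SketchExcelFormerBlock(conv_dim=32).eval()  # eval() disables dropout
x = torch.randn(10, 7, 32)
print(block(x).shape)  # torch.Size([10, 7, 32])
```

Because both sub-blocks add their result back onto the input, the output keeps the [batch_size, num_cols, conv_dim] shape, which is why such layers can be stacked without reshaping.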