rllm.dataloader.NeighborLoader

class rllm.dataloader.NeighborLoader(data: GraphData, num_neighbors: List[int], seeds: Tensor | None = None, transform: Callable | None = None, replace: bool = False, shuffle: bool = False, batch_size: int = 1, num_workers: int = 0, **kwargs)

Bases: DataLoader

A random neighbor sampler that enables mini-batch training of GNNs on large-scale graphs where full-batch training is not feasible.

Parameters:
  • data (GraphData) – The graph data to be sampled.

  • num_neighbors (List[int]) – The number of neighbors to sample per node, with one entry per layer (hop).

  • seeds (Optional[Tensor]) – The seed nodes from which sampling starts. If None, all nodes are used as seeds. (default: None)

  • transform (Optional[Callable]) – A function that takes a graph and returns a transformed version. The data loader applies it to the graph before returning it. (default: None)

  • replace (bool, optional) – Whether to sample with replacement. (default: False)

  • shuffle (bool, optional) – Whether to shuffle the data at every epoch. (default: False)

  • batch_size (int, optional) – How many samples per batch to load. (default: 1)

  • num_workers (int, optional) – How many subprocesses to use for data loading. (default: 0)

  • **kwargs – Additional keyword arguments to be passed to the torch.utils.data.DataLoader class.
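How seeds, shuffle, and batch_size interact can be sketched in plain Python. This is an illustration only: batch_seeds is a hypothetical helper, not part of rllm, and it uses lists in place of tensors. It assumes the usual DataLoader behavior in which the last batch may be smaller than batch_size.

```python
import random
from typing import List, Optional


def batch_seeds(num_nodes: int,
                seeds: Optional[List[int]] = None,
                batch_size: int = 1,
                shuffle: bool = False,
                rng: Optional[random.Random] = None) -> List[List[int]]:
    """Split seed nodes into mini-batches (hypothetical helper, not rllm API)."""
    # With seeds=None every node acts as a seed, mirroring the documented default.
    order = list(range(num_nodes)) if seeds is None else list(seeds)
    if shuffle:
        # Reorder the seeds; a real loader would do this once per epoch.
        (rng or random).shuffle(order)
    # The last batch is shorter when len(order) is not a multiple of batch_size.
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]


batches = batch_seeds(num_nodes=5, batch_size=2)
print(batches)  # [[0, 1], [2, 3], [4]]
```

Each inner list corresponds to the batch of seed node indices that collate_fn receives.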

collate_fn(batch: List[Tensor]) → Tuple[int, Tensor, List[Tensor]]

Collate function for the NeighborLoader. Samples neighbors for each node in the batch and returns the sampled subgraph.

Parameters:

batch (List[Tensor]) – A list of seed node indices.

Returns:

A tuple of (batch_size, n_id, adjs) where batch_size is the number of seed nodes, n_id contains all sampled node indices, and adjs is a list of sparse adjacency matrices per hop.

Return type:

Tuple[int, Tensor, List[Tensor]]
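The shape of the returned tuple can be illustrated with a small multi-hop sampling sketch. It is not the library's implementation: collate_sketch and the in_nbrs dictionary are hypothetical, plain lists of (source, destination) pairs stand in for sparse adjacency tensors, sampling is without replacement, and the hops are listed in sampling order from the seeds outward, which may differ from the order rllm uses.

```python
import random
from typing import Dict, List, Tuple

Edge = Tuple[int, int]  # (source, destination)


def collate_sketch(batch: List[int],
                   in_nbrs: Dict[int, List[int]],
                   num_neighbors: List[int],
                   rng: random.Random) -> Tuple[int, List[int], List[List[Edge]]]:
    """Illustrative multi-hop sampling; plain lists stand in for tensors."""
    n_id = list(batch)            # seeds first, then newly discovered nodes
    seen = set(n_id)
    adjs: List[List[Edge]] = []
    for k in num_neighbors:       # one entry per hop
        edges: List[Edge] = []
        for dst in list(n_id):    # expand every node collected so far
            nbrs = in_nbrs.get(dst, [])
            # Without replacement: keep at most k distinct in-neighbors.
            for src in rng.sample(nbrs, min(k, len(nbrs))):
                edges.append((src, dst))
                if src not in seen:
                    seen.add(src)
                    n_id.append(src)
        adjs.append(edges)
    return len(batch), n_id, adjs


# Toy graph: in_nbrs[v] lists the nodes with an edge pointing to v.
in_nbrs = {0: [1, 2], 1: [3], 2: [3], 3: []}
batch_size, n_id, adjs = collate_sketch([0], in_nbrs, [2, 2], random.Random(0))
print(batch_size, sorted(n_id), [len(a) for a in adjs])
```

In this toy graph the seed 0 pulls in nodes 1 and 2 at the first hop and node 3 at the second, so n_id ends up covering all four nodes while batch_size stays at 1.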

get_in_neighbors(node: int) → Tensor

Get the in-neighbors of a given node in the graph.

Parameters:

node (int) – The node for which to get the in-neighbors.

Returns:

The indices of in-neighbor nodes.

Return type:

Tensor
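For a directed edge list, the in-neighbors of a node are the sources of all edges that point at it. A minimal sketch of that definition, using a hypothetical in_neighbors function over (source, destination) pairs rather than the library's internal graph representation:

```python
from typing import List, Tuple


def in_neighbors(edges: List[Tuple[int, int]], node: int) -> List[int]:
    """Return the sources of all edges pointing at `node` (edges are (src, dst))."""
    return [src for src, dst in edges if dst == node]


edges = [(0, 2), (1, 2), (2, 3)]
print(in_neighbors(edges, 2))  # [0, 1]
print(in_neighbors(edges, 0))  # []
```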

sample_neighbors_one_layer(seed_nodes: List[int], num_neighbor: int) → Tuple[Tensor, Tensor]

Sample neighbors for a given set of seed nodes in a single layer (hop).

Parameters:
  • seed_nodes (List[int]) – The seed nodes whose neighbors are sampled.

  • num_neighbor (int) – The number of neighbors to sample for each node.

Returns:

A tuple containing the sampled source nodes and destination nodes.

Return type:

Tuple[Tensor, Tensor]
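A single sampling step that yields parallel source and destination lists might look like the sketch below. The function sample_one_layer and the in_nbrs dictionary are hypothetical and use Python lists rather than tensors. How the library handles a node with fewer than num_neighbor in-neighbors is not stated here; the sketch assumes that, without replacement, all available neighbors are kept.

```python
import random
from typing import Dict, List, Optional, Tuple


def sample_one_layer(seed_nodes: List[int],
                     in_nbrs: Dict[int, List[int]],
                     num_neighbor: int,
                     replace: bool = False,
                     rng: Optional[random.Random] = None) -> Tuple[List[int], List[int]]:
    """Sample up to `num_neighbor` in-neighbors per seed (illustrative only)."""
    rng = rng or random.Random()
    src_out: List[int] = []
    dst_out: List[int] = []
    for dst in seed_nodes:
        nbrs = in_nbrs.get(dst, [])
        if not nbrs:
            continue  # isolated seed: contributes no edges
        if replace:
            # With replacement: exactly num_neighbor draws, duplicates allowed.
            picked = rng.choices(nbrs, k=num_neighbor)
        else:
            # Without replacement: assumed to cap at the available neighbors.
            picked = rng.sample(nbrs, min(num_neighbor, len(nbrs)))
        src_out.extend(picked)
        dst_out.extend([dst] * len(picked))
    return src_out, dst_out


in_nbrs = {0: [1, 2, 3], 1: [0]}
src, dst = sample_one_layer([0, 1], in_nbrs, num_neighbor=2, rng=random.Random(0))
print(dst)  # [0, 0, 1]
```

Entry i of the two lists describes one sampled edge from src[i] to dst[i], which is the pairing the return value above refers to.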