rllm.dataloader.NeighborLoader

class rllm.dataloader.NeighborLoader(data: GraphData, num_neighbors: List[int], seeds: Tensor | None = None, transform: Callable | None = None, replace: bool = False, shuffle: bool = False, batch_size: int = 1, num_workers: int = 0, **kwargs)

Bases: DataLoader

The random neighbor sampler, which allows for mini-batch training of GNNs on large-scale graphs where full-batch training is not feasible.

Parameters:

data (GraphData) – The graph data to be sampled.
num_neighbors (List[int]) – The number of neighbors to sample for each node in each layer.
seeds (Optional[Tensor]) – The seed nodes from which neighbor sampling starts. If None, all nodes are used as seeds.
transform (Optional[Callable]) – A function/transform that takes in a graph and returns a transformed version. The data loader will use this function to transform the graph before returning it.
replace (bool, optional) – Whether to sample with replacement. (default: False)
shuffle (bool, optional) – Whether to shuffle the data at every epoch. (default: False)
batch_size (int, optional) – How many samples per batch to load. (default: 1)
num_workers (int, optional) – How many subprocesses to use for data loading. (default: 0)
**kwargs – Additional keyword arguments to be passed to the torch.utils.data.DataLoader class.
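
To get a feel for how num_neighbors and batch_size interact, the sketch below computes a rough upper bound on the number of nodes a single batch can touch. It assumes that each node reached at hop i contributes at most num_neighbors[i] sampled neighbors to the next hop. max_sampled_nodes is a hypothetical helper written for illustration and is not part of rllm.

```python
from typing import List


def max_sampled_nodes(batch_size: int, num_neighbors: List[int]) -> int:
    """Upper bound on nodes in one batch (illustrative, not rllm code).

    Assumption: every node on the current frontier samples at most
    `k` neighbors, where `k` is the fan-out for that layer.
    """
    total = batch_size      # the seed nodes themselves
    frontier = batch_size   # nodes expanded at the current hop
    for k in num_neighbors:
        frontier *= k       # worst case: all sampled neighbors are distinct
        total += frontier
    return total


if __name__ == "__main__":
    # 4 seeds, fan-out 10 at the first hop and 5 at the second
    print(max_sampled_nodes(4, [10, 5]))
```

In practice the real count is usually lower, since sampled neighbors overlap and low-degree nodes have fewer neighbors than the requested fan-out.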

collate_fn(batch: List[Tensor]) → Tuple[int, Tensor, List[Tensor]]

Collate function for the NeighborLoader. Samples neighbors for each node in the batch and returns the sampled subgraph.

Parameters: batch (List[Tensor]) – A list of seed node indices.
Returns: A tuple of (batch_size, n_id, adjs), where batch_size is the number of seed nodes, n_id contains all sampled node indices, and adjs is a list of sparse adjacency matrices, one per hop.
Return type: Tuple[int, Tensor, List[Tensor]]
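
The shape of the returned tuple can be illustrated with a small pure-Python sketch. It is not the rllm implementation: collate_sketch and the in_nbrs dictionary are hypothetical, each hop is represented as a plain list of (source, destination) pairs rather than a sparse tensor, and the choice to expand every node collected so far at each hop is an assumption.

```python
import random
from typing import Dict, List, Optional, Tuple

Edge = Tuple[int, int]  # (source, destination)


def collate_sketch(
    in_nbrs: Dict[int, List[int]],
    batch: List[int],
    num_neighbors: List[int],
    rng: Optional[random.Random] = None,
) -> Tuple[int, List[int], List[List[Edge]]]:
    """Multi-hop sampling returning (batch_size, n_id, adjs). Illustrative only."""
    rng = rng or random.Random()
    n_id = list(batch)          # seeds come first in n_id
    seen = set(n_id)
    adjs: List[List[Edge]] = []
    for k in num_neighbors:
        edges: List[Edge] = []
        # Assumption: each hop samples for every node collected so far.
        for v in list(n_id):
            nbrs = in_nbrs.get(v, [])
            # Sample without replacement, capped by the node's degree.
            for u in rng.sample(nbrs, min(k, len(nbrs))):
                edges.append((u, v))
                if u not in seen:
                    seen.add(u)
                    n_id.append(u)
        adjs.append(edges)
    return len(batch), n_id, adjs


if __name__ == "__main__":
    graph = {0: [1, 2], 1: [3], 2: [3], 3: []}
    print(collate_sketch(graph, [0], [2, 1], rng=random.Random(0)))
```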

get_in_neighbors(node: int) → Tensor

Get the in-neighbors of a given node in the graph.

Parameters: node (int) – The node for which to get the in-neighbors.
Returns: The indices of in-neighbor nodes.
Return type: Tensor
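
As a reminder of what "in-neighbors" means, the sketch below looks them up in a directed edge list: the in-neighbors of a node are the sources of all edges that point to it. The in_neighbors function and the (source, destination) edge-list layout are assumptions made for illustration, not rllm's internal representation.

```python
from typing import List, Tuple


def in_neighbors(edges: List[Tuple[int, int]], node: int) -> List[int]:
    """Return the sources of every edge whose destination is `node`."""
    return [src for src, dst in edges if dst == node]


if __name__ == "__main__":
    # Directed edges as (source, destination) pairs
    edges = [(0, 1), (2, 1), (1, 2)]
    print(in_neighbors(edges, 1))
```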

sample_neighbors_one_layer(seed_nodes: List[int], num_neighbor: int) → Tuple[Tensor, Tensor]

Sample neighbors for a given set of seed nodes.

Parameters:

seed_nodes (List[int]) – The nodes whose neighbors are sampled.
num_neighbor (int) – The number of neighbors to sample for each node.

Returns:

A tuple containing the sampled source nodes and destination nodes.

Return type:

Tuple[Tensor, Tensor]
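
A single sampling layer of this kind can be sketched in plain Python as follows. This is an illustration under assumptions, not the library's code: sample_one_layer and the in_nbrs adjacency dictionary are hypothetical, plain lists stand in for tensors, and the replace flag is assumed to behave like the replace parameter of the class (sampling with or without replacement).

```python
import random
from typing import Dict, List, Optional, Tuple


def sample_one_layer(
    in_nbrs: Dict[int, List[int]],
    seed_nodes: List[int],
    num_neighbor: int,
    replace: bool = False,
    rng: Optional[random.Random] = None,
) -> Tuple[List[int], List[int]]:
    """Sample up to `num_neighbor` neighbors per seed; return (src, dst) lists."""
    rng = rng or random.Random()
    src: List[int] = []
    dst: List[int] = []
    for v in seed_nodes:
        nbrs = in_nbrs.get(v, [])
        if not nbrs:
            continue  # isolated node: nothing to sample
        if replace:
            # With replacement: always num_neighbor draws, duplicates allowed.
            picked = [rng.choice(nbrs) for _ in range(num_neighbor)]
        else:
            # Without replacement: capped by the number of available neighbors.
            picked = rng.sample(nbrs, min(num_neighbor, len(nbrs)))
        src.extend(picked)
        dst.extend([v] * len(picked))
    return src, dst


if __name__ == "__main__":
    graph = {0: [1, 2, 3], 1: [0]}
    print(sample_one_layer(graph, [0, 1], 2, rng=random.Random(0)))
```

The two returned lists are aligned: src[i] is a sampled neighbor and dst[i] is the seed node it was sampled for.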