rllm.dataloader.NeighborLoader
- class rllm.dataloader.NeighborLoader(data: GraphData, num_neighbors: List[int], seeds: Tensor | None = None, transform: Callable | None = None, replace: bool = False, shuffle: bool = False, batch_size: int = 1, num_workers: int = 0, **kwargs)
Bases: DataLoader
The random neighbor sampler, which allows for mini-batch training of GNNs on large-scale graphs where full-batch training is not feasible.
- Parameters:
data (GraphData) – The graph data to be sampled.
num_neighbors (List[int]) – The number of neighbors to sample for each node in each layer.
seeds (Optional[Tensor]) – The nodes to sample from. If None, all nodes will be used.
transform (Optional[Callable]) – A function/transform that takes in a graph and returns a transformed version. The data loader will use this function to transform the graph before returning it.
replace (bool, optional) – Whether to sample with replacement. (default: False)
shuffle (bool, optional) – Whether to shuffle the data at every epoch. (default: False)
batch_size (int, optional) – How many samples per batch to load. (default: 1)
num_workers (int, optional) – How many subprocesses to use for data loading. (default: 0)
**kwargs – Additional keyword arguments to be passed to the torch.utils.data.DataLoader class.
- collate_fn(batch: List[Tensor]) -> Tuple[int, Tensor, List[Tensor]]
Collate function for the NeighborLoader. Samples neighbors for each node in the batch and returns the sampled subgraph.
- Parameters:
batch (List[Tensor]) – A list of seed node indices.
- Returns:
A tuple of (batch_size, n_id, adjs), where batch_size is the number of seed nodes, n_id contains all sampled node indices, and adjs is a list of sparse adjacency matrices, one per hop.
- Return type:
Tuple[int, Tensor, List[Tensor]]
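To make the shape of the returned tuple concrete, the following is a minimal pure-Python sketch of multi-hop sampling. It is not the rllm implementation: the helper name collate_sketch, the in_nbrs dictionary, and the use of plain edge lists in place of sparse adjacency tensors are all assumptions made for illustration, as is the choice to expand only newly sampled nodes at each hop.

```python
import random
from typing import Dict, List, Tuple

# One hop is stored as (source nodes, destination nodes); the real loader
# returns sparse adjacency matrices instead (assumption: plain lists here).
Hop = Tuple[List[int], List[int]]


def collate_sketch(
    in_nbrs: Dict[int, List[int]],
    batch: List[int],
    num_neighbors: List[int],
    rng: random.Random,
) -> Tuple[int, List[int], List[Hop]]:
    """Hypothetical helper: sample one hop per entry of num_neighbors."""
    n_id = list(batch)      # seed nodes come first in the node list
    seen = set(n_id)
    frontier = list(batch)
    adjs: List[Hop] = []
    for fanout in num_neighbors:
        src: List[int] = []
        dst: List[int] = []
        for node in frontier:
            candidates = in_nbrs.get(node, [])
            # Sampling without replacement, capped at the available neighbors.
            for nbr in rng.sample(candidates, min(fanout, len(candidates))):
                src.append(nbr)
                dst.append(node)
                if nbr not in seen:
                    seen.add(nbr)
                    n_id.append(nbr)
        adjs.append((src, dst))
        # Next hop expands the nodes just sampled (deduplicated, order kept).
        frontier = list(dict.fromkeys(src))
    return len(batch), n_id, adjs


# Toy graph: node -> list of its in-neighbors.
toy_in_nbrs = {0: [1, 2], 1: [3], 2: [3, 4]}
batch_size, n_id, adjs = collate_sketch(
    toy_in_nbrs, batch=[0], num_neighbors=[2, 1], rng=random.Random(0)
)
print(batch_size, n_id, adjs)
```

In this sketch, adjs has one entry per element of num_neighbors, and n_id lists each sampled node once with the seeds first. Whether the actual loader orders adjs from the seeds outward or the reverse is not stated above.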
- get_in_neighbors(node: int) -> Tensor
Get the in-neighbors of a given node in the graph.
- Parameters:
node (int) – The node for which to get the in-neighbors.
- Returns:
The indices of in-neighbor nodes.
- Return type:
Tensor
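As an illustration of what "in-neighbors" means, here is a small sketch over a plain edge list. The function in_neighbors and the (source, destination) edge representation are hypothetical stand-ins, not the rllm API, which operates on the loader's GraphData and returns a Tensor.

```python
from typing import List, Tuple


def in_neighbors(edges: List[Tuple[int, int]], node: int) -> List[int]:
    """Return the source of every edge whose destination is `node`."""
    return [src for src, dst in edges if dst == node]


# Hypothetical toy graph: each edge is a (source, destination) pair.
edges = [(0, 2), (1, 2), (2, 3), (3, 0)]
print(in_neighbors(edges, 2))  # [0, 1]
```

Node 2 receives edges from nodes 0 and 1, so those are its in-neighbors; the edge (2, 3) leaves node 2 and is not counted.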
- sample_neighbors_one_layer(seed_nodes: List[int], num_neighbor: int) -> Tuple[Tensor, Tensor]
Sample neighbors for a given set of seed nodes.
- Parameters:
seed_nodes (List[int]) – The nodes to sample neighbors from.
num_neighbor (int) – The number of neighbors to sample for each node.
- Returns:
A tuple containing the sampled source nodes and destination nodes.
- Return type:
Tuple[Tensor, Tensor]
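The sketch below shows one plausible way a single sampling layer could work, including the effect of sampling with or without replacement. It is an illustration in plain Python, not the library's code: sample_one_layer, the in_nbrs dictionary, and the rule of taking every neighbor when a node has fewer than num_neighbor of them are assumptions.

```python
import random
from typing import Dict, List, Optional, Tuple


def sample_one_layer(
    in_nbrs: Dict[int, List[int]],
    seed_nodes: List[int],
    num_neighbor: int,
    replace: bool = False,
    rng: Optional[random.Random] = None,
) -> Tuple[List[int], List[int]]:
    """Hypothetical helper: return (sources, destinations) of sampled edges."""
    rng = rng or random.Random()
    src: List[int] = []
    dst: List[int] = []
    for seed in seed_nodes:
        candidates = in_nbrs.get(seed, [])
        if not candidates:
            continue  # isolated seed: nothing to sample
        if replace:
            # With replacement: always num_neighbor draws, repeats allowed.
            picked = rng.choices(candidates, k=num_neighbor)
        else:
            # Without replacement: at most len(candidates) distinct neighbors.
            picked = rng.sample(candidates, min(num_neighbor, len(candidates)))
        src.extend(picked)
        dst.extend([seed] * len(picked))
    return src, dst


# Toy graph: node -> list of its in-neighbors.
toy = {2: [0, 1, 4], 3: [2]}
print(sample_one_layer(toy, [2, 3], 2, rng=random.Random(0)))
print(sample_one_layer(toy, [2, 3], 2, replace=True, rng=random.Random(0)))
```

Without replacement, node 3 contributes a single edge because it has only one in-neighbor; with replacement, the same neighbor is drawn twice so that every seed yields exactly num_neighbor edges.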