graphium.ipu¶

Code for adapting to run on IPU

Contents

IPU Dataloader
IPU Losses
IPU Metrics
IPU Simple Lightning
IPU Utils
IPU Wrapper
To Dense Batch

IPU Dataloader¶

`graphium.ipu.ipu_dataloader` ¶

Use of this software is subject to the terms and conditions outlined in the LICENSE file. Unauthorized modification, distribution, or use is prohibited. Provided 'as is' without warranties of any kind.

Valence Labs, Recursion Pharmaceuticals and Graphcore Limited are not liable for any damages arising from its use. Refer to the LICENSE file for the full terms and conditions.

`CombinedBatchingCollator` ¶

Collator object that manages the combined batch size defined as:

combined_batch_size = batch_size * device_iterations
                     * replication_factor * gradient_accumulation

This is intended to be used in combination with the poptorch.DataLoader
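
The product above can be checked with a few lines of arithmetic. This is a minimal sketch; the function name `combined_batch_size` and the example values are illustrative assumptions, not part of the graphium API.

```python
def combined_batch_size(batch_size, device_iterations, replication_factor, gradient_accumulation):
    """Number of samples the collator has to assemble for one host-side step."""
    return batch_size * device_iterations * replication_factor * gradient_accumulation


# Example: 8 graphs per mini-batch, 4 device iterations, 2 replicas, 16 accumulation steps
total = combined_batch_size(8, 4, 2, 16)
print(total)  # 1024
```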

`__call__(batch)` ¶

Stack tensors, batch the pyg graphs, and pad each tensor to the same size.

Parameters:

Name	Type	Description	Default
`batch`	`List[Dict[str, Union[Data, Dict[str, Tensor]]]]`	The batch of data, including pyg-graphs `Data` and labels `Dict[str, Tensor]` to be padded	required

Returns:

Name	Type	Description
`out_batch`	`Dict[str, Union[Batch, Dict[str, Tensor], Any]]`	A dictionary where the graphs are batched and the labels or other Tensors are stacked

`__init__(batch_size, max_num_nodes, max_num_edges, dataset_max_nodes_per_graph, dataset_max_edges_per_graph, collate_fn=None)` ¶

Parameters:

Name	Type	Description	Default
`batch_size`	`int`	mini batch size used by the model	required
`max_num_nodes`	`int`	Maximum number of nodes in the batched padded graph	required
`max_num_edges`	`int`	Maximum number of edges in the batched padded graph	required
`dataset_max_nodes_per_graph`	`int`	Maximum number of nodes per graph in the full dataset	required
`dataset_max_edges_per_graph`	`int`	Maximum number of edges per graph in the full dataset	required
`collate_fn`	`Optional[Callable]`	Function used to collate (or batch) the single data or graphs together	`None`

`IPUDataloaderOptions` `dataclass` ¶

This data class stores the arguments necessary to instantiate a model for the Predictor.

Parameters:

Name	Type	Description	Default
`model_class`		pytorch module used to create a model	required
`model_kwargs`		Key-word arguments used to initialize the model from `model_class`.	required

`Pad` ¶

Bases: BaseTransform

Data transform that applies padding to enforce consistent tensor shapes.

`__init__(max_num_nodes, dataset_max_nodes_per_graph, dataset_max_edges_per_graph, max_num_edges=None, node_value=0, edge_value=0)` ¶

Parameters:

Name	Type	Description	Default
`max_num_nodes`	`int`	The maximum number of nodes for the total padded graph	required
`dataset_max_nodes_per_graph`		the maximum number of nodes per graph in the dataset	required
`dataset_max_edges_per_graph`		the maximum number of edges per graph in the dataset	required
`max_num_edges`	`Optional[int]`	The maximum number of edges for the total padded graph	`None`
`node_value`	`float`	Value to add to the node padding	`0`
`edge_value`	`float`	Value to add to the edge padding	`0`

`validate(data)` ¶

Validates that the input graph does not exceed the constraints that:

the number of nodes must be <= max_num_nodes
the number of edges must be <= max_num_edges

Returns:

Type	Description
	Tuple containing the number of nodes and the number of edges
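
The idea of validating a graph against the limits and then padding it to a fixed size can be sketched in plain Python. Everything here is a hypothetical stand-in: the graph is a list of node values plus a list of `(src, dst)` pairs, the function names are invented for illustration, and attaching padding edges as self-loops on the last node is this sketch's own choice, not necessarily what `Pad` does.

```python
def validate(num_nodes, num_edges, max_num_nodes, max_num_edges):
    """Check the two constraints and return the (num_nodes, num_edges) tuple."""
    if num_nodes > max_num_nodes:
        raise ValueError(f"{num_nodes} nodes exceeds max_num_nodes={max_num_nodes}")
    if num_edges > max_num_edges:
        raise ValueError(f"{num_edges} edges exceeds max_num_edges={max_num_edges}")
    return num_nodes, num_edges


def pad_graph(node_values, edges, max_num_nodes, max_num_edges, node_value=0.0):
    """Pad node values and edges so every graph ends up with the same sizes."""
    num_nodes, num_edges = validate(len(node_values), len(edges), max_num_nodes, max_num_edges)
    padded_nodes = node_values + [node_value] * (max_num_nodes - num_nodes)
    # Assumption of this sketch: padding edges are self-loops on the last node slot.
    pad_node = max_num_nodes - 1
    padded_edges = edges + [(pad_node, pad_node)] * (max_num_edges - num_edges)
    return padded_nodes, padded_edges


nodes, edges = pad_graph([1.0, 2.0], [(0, 1)], max_num_nodes=4, max_num_edges=3)
print(nodes)  # [1.0, 2.0, 0.0, 0.0]
print(edges)  # [(0, 1), (3, 3), (3, 3)]
```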

`create_ipu_dataloader(dataset, ipu_dataloader_options, ipu_options=None, batch_size=1, collate_fn=None, num_workers=0, **kwargs)` ¶

Creates a poptorch.DataLoader for graph datasets. Applies the mini-batching method of concatenating multiple graphs into a single graph with multiple disconnected subgraphs. See: https://pytorch-geometric.readthedocs.io/en/2.0.2/notes/batching.html

Parameters:

dataset: The torch_geometric.data.Dataset instance from which to
    load the graph examples for the IPU.
ipu_dataloader_options: The options to initialize the Dataloader for IPU
ipu_options: The poptorch.Options used by the
    poptorch.DataLoader. Will use the default options if not provided.
batch_size: How many graph examples to load in each batch
    (default: 1).
collate_fn: The function used to collate batches
**kwargs (optional): Additional arguments of :class:`poptorch.DataLoader`.

Returns:

Type	Description
`DataLoader`	The dataloader
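
The mini-batching method mentioned above, concatenating several graphs into one graph of disconnected subgraphs, can be illustrated without any library. This is a sketch under stated assumptions: graphs are plain dicts with hypothetical keys `num_nodes` and `edges`, and `batch_graphs` is an invented helper, not the collate function used by graphium or PyG.

```python
def batch_graphs(graphs):
    """Concatenate graphs into a single graph made of disconnected subgraphs."""
    edges, batch_index, offset = [], [], 0
    for graph_id, graph in enumerate(graphs):
        # Shift node indices so each graph occupies its own index range.
        edges.extend((src + offset, dst + offset) for src, dst in graph["edges"])
        # Remember which original graph each node came from.
        batch_index.extend([graph_id] * graph["num_nodes"])
        offset += graph["num_nodes"]
    return {"num_nodes": offset, "edges": edges, "batch": batch_index}


g1 = {"num_nodes": 2, "edges": [(0, 1)]}
g2 = {"num_nodes": 3, "edges": [(0, 1), (1, 2)]}
merged = batch_graphs([g1, g2])
print(merged["edges"])  # [(0, 1), (2, 3), (3, 4)]
print(merged["batch"])  # [0, 0, 1, 1, 1]
```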

IPU Losses¶

`graphium.ipu.ipu_losses` ¶

`BCELossIPU` ¶

Bases: BCELoss

A modified version of the torch.nn.BCELoss that can ignore NaNs by giving them a weight of 0. This allows it to work with compilation and IPUs since it doesn't modify the tensor's shape.
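
To illustrate the weight-of-zero idea, here is a plain-Python sketch on lists rather than tensors. The function name is hypothetical, and dividing by the number of non-NaN targets is an assumption of this sketch; the source only states that NaNs receive a weight of 0.

```python
import math


def bce_ignore_nan(probs, targets):
    """Binary cross-entropy averaged over non-NaN targets, with list lengths unchanged."""
    weights = [0.0 if math.isnan(t) else 1.0 for t in targets]
    # Replace NaN targets by a placeholder so the arithmetic stays finite.
    safe_targets = [0.0 if math.isnan(t) else t for t in targets]
    losses = [
        -w * (t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
        for p, t, w in zip(probs, safe_targets, weights)
    ]
    # Assumption: normalise by the number of valid entries (at least 1).
    return sum(losses) / max(sum(weights), 1.0)


loss = bce_ignore_nan([0.5, 0.9], [1.0, float("nan")])
print(round(loss, 4))  # 0.6931
```

The NaN entry stays in every intermediate list, so no shape changes; it simply contributes nothing to the sum.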

`BCEWithLogitsLossIPU` ¶

Bases: BCEWithLogitsLoss

A modified version of the torch.nn.BCEWithLogitsLoss that can ignore NaNs by giving them a weight of 0. This allows it to work with compilation and IPUs since it doesn't modify the tensor's shape.

`HybridCELossIPU` ¶

Bases: HybridCELoss

`__init__(n_brackets, alpha=0.5)` ¶

Parameters:

Name	Type	Description	Default
`n_brackets`		the number of brackets that will be used to group the regression targets. Expected to have the same size as the number of classes in the transformed regression task.	required

`forward(input, target)` ¶

Parameters:

Name	Type	Description	Default
`input`	`Tensor`	(batch_size x n_classes) tensor of logits predicted for each bracket.	required
`target`	`Tensor`	(batch_size) or (batch_size, 1) tensor of target brackets in {0, 1, ..., self.n_brackets}.	required

`L1LossIPU` ¶

Bases: L1Loss

A modified version of the torch.nn.L1Loss that can ignore NaNs by giving them the same value for both input and target. This allows it to work with compilation and IPUs since it doesn't modify the tensor's shape.

`MSELossIPU` ¶

Bases: MSELoss

A modified version of the torch.nn.MSELoss that can ignore NaNs by giving them the same value for both input and target. This allows it to work with compilation and IPUs since it doesn't modify the tensor's shape.
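
The same-value trick can be sketched on plain lists: wherever the target is NaN, the target is replaced by the prediction so that the squared difference is zero. The function name is hypothetical, and rescaling by the number of valid entries is an assumption of this sketch rather than something stated in the source.

```python
import math


def mse_ignore_nan(preds, targets):
    """Mean squared error over non-NaN targets without dropping any element."""
    is_nan = [math.isnan(t) for t in targets]
    # Give NaN positions the same value in both lists so their error is zero.
    filled = [p if nan else t for p, t, nan in zip(preds, targets, is_nan)]
    squared = [(p - t) ** 2 for p, t in zip(preds, filled)]
    # Assumption: average over valid entries only.
    n_valid = len(is_nan) - sum(is_nan)
    return sum(squared) / max(n_valid, 1)


print(mse_ignore_nan([1.0, 2.0, 3.0], [2.0, float("nan"), 5.0]))  # 2.5
```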

IPU Metrics¶

`graphium.ipu.ipu_metrics` ¶

`NaNTensor` ¶

Bases: Tensor

Class to create and manage a NaN tensor along with its properties.

The goal of the class is to override the regular tensor such that the basic operations (sum, mean, max, etc) ignore the NaNs in the input. It also supports NaNs in integer tensors (as the lowest integer possible).

`get_nans: BoolTensor` `property` ¶

Gets the boolean Tensor containing the location of NaNs. In the case of an integer tensor, this returns where the tensor is equal to its minimal value. In the case of a boolean tensor, this returns a Tensor filled with False.

`__lt__(other)` ¶

Workaround that allows the code to work with r2_score, which requires the size to be > 2. Since self.size now returns a Tensor instead of a value, we check that all elements are > 2.

`__torch_function__(func, types, args=(), kwargs=None)` `classmethod` ¶

This `__torch_function__` implementation wraps subclasses such that methods called on subclasses return a subclass instance instead of a torch.Tensor instance.

One corollary to this is that you need coverage for torch.Tensor methods if implementing `__torch_function__` for subclasses.

It makes calls such as torch.sum() behave the same way as NaNTensor.sum().

We recommend always calling super().__torch_function__ as the base case when doing the above.

While not mandatory, we recommend making `__torch_function__` a classmethod.

`argsort(dim=-1, descending=False)` ¶

Return the indices that sort the tensor, while putting all the NaNs to the end of the sorting.
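
A plain-Python sketch of the ordering described above, with NaNs always placed last regardless of direction. The helper name `argsort_nan_last` is hypothetical and operates on a flat list, not on a tensor dimension.

```python
import math


def argsort_nan_last(values, descending=False):
    """Indices that sort `values`, with all NaN positions moved to the end."""
    valid = [i for i, v in enumerate(values) if not math.isnan(v)]
    nans = [i for i, v in enumerate(values) if math.isnan(v)]
    valid.sort(key=lambda i: values[i], reverse=descending)
    return valid + nans


data = [3.0, float("nan"), 1.0, 2.0]
print(argsort_nan_last(data))                   # [2, 3, 0, 1]
print(argsort_nan_last(data, descending=True))  # [0, 3, 2, 1]
```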

`max(*args, **kwargs)` ¶

Returns the max value of a tensor without NaNs

`mean(*args, **kwargs)` ¶

Overloads the traditional mean to ignore the NaNs

`min(*args, **kwargs)` ¶

Returns the min value of a tensor without NaNs

`numel()` ¶

Returns the number of non-NaN elements.

`size(dim)` ¶

Instead of returning the size, return the number of non-NaN elements in a specific dimension. Useful for the r2_score metric.

`sum(*args, **kwargs)` ¶

Overloads the traditional sum to ignore the NaNs
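
The NaN-ignoring reductions can be pictured with masking on plain lists: NaN entries are replaced by a neutral value instead of being removed, so the container keeps its length. The function names below are hypothetical and are only meant to mirror the behaviour described for `sum`, `numel` and `mean`.

```python
import math


def nan_sum(values):
    """Sum where NaN entries contribute 0 instead of being dropped."""
    return sum(0.0 if math.isnan(v) else v for v in values)


def nan_numel(values):
    """Number of non-NaN elements."""
    return sum(0 if math.isnan(v) else 1 for v in values)


def nan_mean(values):
    """Mean over the non-NaN elements only."""
    return nan_sum(values) / nan_numel(values)


data = [1.0, float("nan"), 3.0]
print(nan_sum(data))    # 4.0
print(nan_numel(data))  # 2
print(nan_mean(data))   # 2.0
```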

`accuracy_ipu(preds, target, average='micro', mdmc_average='global', threshold=0.5, top_k=None, subset_accuracy=False, num_classes=None, multiclass=None, ignore_index=None)` ¶

A modified version of the torchmetrics.functional.accuracy that can ignore NaNs by giving them the same value for both preds and target. This allows it to work with compilation and IPUs since it doesn't modify the tensor's shape.

Parameters:

Name	Type	Description	Default
`preds`	`Tensor`	Predictions from model (probabilities, logits or labels)	required
`target`	`Tensor`	Ground truth labels	required
`average`	`Optional[str]`	Defines the reduction that is applied. Should be one of the following: `'micro'` [default]: Calculate the metric globally, across all samples and classes. `'macro'`: Calculate the metric for each class separately, and average the metrics across classes (with equal weights for each class). `'weighted'`: Calculate the metric for each class separately, and average the metrics across classes, weighting each class by its support (`tp + fn`). `'none'` or `None`: Calculate the metric for each class separately, and return the metric for every class. `'samples'`: Calculate the metric for each sample, and average the metrics across samples (with equal weights for each sample). .. note:: What is considered a sample in the multi-dimensional multi-class case depends on the value of `mdmc_average`. .. note:: If `'none'` and a given class doesn't occur in the `preds` or `target`, the value for the class will be `nan`.	`'micro'`
`mdmc_average`	`Optional[str]`	Defines how averaging is done for multi-dimensional multi-class inputs (on top of the `average` parameter). Should be one of the following: `None` [default]: Should be left unchanged if your data is not multi-dimensional multi-class. `'samplewise'`: In this case, the statistics are computed separately for each sample on the `N` axis, and then averaged over samples. The computation for each sample is done by treating the flattened extra axes `...` (see :ref:`pages/classification:input types`) as the `N` dimension within the sample, and computing the metric for the sample based on that. `'global'`: In this case the `N` and `...` dimensions of the inputs (see :ref:`pages/classification:input types`) are flattened into a new `N_X` sample axis, i.e. the inputs are treated as if they were `(N_X, C)`. From here on the `average` parameter applies as usual.	`'global'`
`num_classes`	`Optional[int]`	Number of classes. Necessary for `'macro'`, `'weighted'` and `None` average methods.	`None`
`threshold`	`float`	Threshold for transforming probability or logit predictions to binary (0,1) predictions, in the case of binary or multi-label inputs. Default value of 0.5 corresponds to input being probabilities.	`0.5`
`top_k`	`Optional[int]`	Number of the highest probability or logit score predictions considered finding the correct label, relevant only for (multi-dimensional) multi-class inputs. The default value (`None`) will be interpreted as 1 for these inputs. Should be left at default (`None`) for all other types of inputs.	`None`
`multiclass`	`Optional[bool]`	Used only in certain special cases, where you want to treat inputs as a different type than what they appear to be. See the parameter's :ref:`documentation section <pages/classification:using the multiclass parameter>` for a more detailed explanation and examples.	`None`
`ignore_index`	`Optional[int]`	Integer specifying a target class to ignore. If given, this class index does not contribute to the returned score, regardless of reduction method. If an index is ignored, and `average=None` or `'none'`, the score for the ignored class will be returned as `nan`.	`None`
`subset_accuracy`	`bool`	Whether to compute subset accuracy for multi-label and multi-dimensional multi-class inputs (has no effect for other input types). For multi-label inputs, if the parameter is set to `True`, then all labels for each sample must be correctly predicted for the sample to count as correct. If it is set to `False`, then all labels are counted separately - this is equivalent to flattening inputs beforehand (i.e. `preds = preds.flatten()` and same for `target`). For multi-dimensional multi-class inputs, if the parameter is set to `True`, then all sub-samples (on the extra axis) must be correct for the sample to be counted as correct. If it is set to `False`, then all sub-samples are counted separately - this is equivalent, in the case of label predictions, to flattening the inputs beforehand (i.e. `preds = preds.flatten()` and same for `target`). Note that the `top_k` parameter still applies in both cases, if set.	`False`

Raises:

Type	Description
`ValueError`	If `top_k` parameter is set for `multi-label` inputs.
`ValueError`	If `average` is none of `"micro"`, `"macro"`, `"weighted"`, `"samples"`, `"none"`, `None`.
`ValueError`	If `mdmc_average` is not one of `None`, `"samplewise"`, `"global"`.
`ValueError`	If `average` is set but `num_classes` is not provided.
`ValueError`	If `num_classes` is set and `ignore_index` is not in the range `[0, num_classes)`.
`ValueError`	If `top_k` is not an `integer` larger than `0`.

`auroc_ipu(preds, target, num_classes=None, task=None, pos_label=None, average='macro', max_fpr=None, sample_weights=None)` ¶

A modified version of the torchmetrics.functional.auroc that can ignore NaNs by giving them the same value for both preds and target. This allows it to work with compilation and IPUs since it doesn't modify the tensor's shape.

`average_precision_ipu(preds, target, num_classes=None, task=None, ignore_index=None, pos_label=None, average='macro', sample_weights=None)` ¶

A modified version of the torchmetrics.functional.average_precision that can ignore NaNs by giving them the same value for both preds and target. This allows it to work with compilation and IPUs since it doesn't modify the tensor's shape.

`f1_score_ipu(preds, target, beta=1.0, average='micro', mdmc_average=None, ignore_index=None, num_classes=None, threshold=0.5, top_k=None, multiclass=None)` ¶

A modified version of the torchmetrics.functional.classification.f_beta._fbeta_compute that can ignore NaNs by giving them the same value for both preds and target. Used to calculate the f1_score on IPU with beta parameter equal to 1.0. This allows it to work with compilation and IPUs since it doesn't modify the tensor's shape.

Computes f_beta metric from stat scores: true positives, false positives, true negatives, false negatives.

Parameters:

Name	Type	Description	Default
`tp`		True positives	required
`fp`		False positives	required
`tn`		True negatives	required
`fn`		False negatives	required
`beta`	`float`	The parameter `beta` (which determines the weight of recall in the combined score)	`1.0`
`ignore_index`	`Optional[int]`	Integer specifying a target class to ignore. If given, this class index does not contribute to the returned score, regardless of reduction method	`None`
`average`	`Optional[str]`	Defines the reduction that is applied	`'micro'`
`mdmc_average`	`Optional[str]`	Defines how averaging is done for multi-dimensional multi-class inputs (on top of the `average` parameter)	`None`
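
For reference, the standard F-beta formula on raw counts is F_beta = (1 + beta^2) * tp / ((1 + beta^2) * tp + beta^2 * fn + fp). The sketch below evaluates it for scalar counts; the function name is hypothetical, averaging modes are not covered, and returning 0.0 on an empty denominator is an assumption of this sketch.

```python
def fbeta_from_counts(tp, fp, fn, beta=1.0):
    """F-beta score from true positives, false positives and false negatives."""
    beta2 = beta ** 2
    denominator = (1 + beta2) * tp + beta2 * fn + fp
    # Assumption: define the score as 0.0 when there is nothing to count.
    return (1 + beta2) * tp / denominator if denominator > 0 else 0.0


print(round(fbeta_from_counts(tp=6, fp=2, fn=4), 3))  # 0.667
print(fbeta_from_counts(tp=6, fp=2, fn=4, beta=2.0))  # 0.625
```

With `beta=1.0` this reduces to the F1 score, the harmonic mean of precision (0.75 here) and recall (0.6 here).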

`fbeta_score_ipu(preds, target, beta=1.0, average='micro', mdmc_average=None, ignore_index=None, num_classes=None, threshold=0.5, top_k=None, multiclass=None)` ¶

A modified version of the torchmetrics.functional.classification.f_beta._fbeta_compute that can ignore NaNs by giving them the same value for both preds and target. This allows it to work with compilation and IPUs since it doesn't modify the tensor's shape.

Parameters:

Name	Type	Description	Default
`preds`	`Tensor`	Predictions from model (probabilities, logits or labels)	required
`target`	`Tensor`	Ground truth labels	required
`average`	`Optional[str]`	Defines the reduction that is applied. Should be one of the following: `'micro'` [default]: Calculate the metric globally, across all samples and classes. `'macro'`: Calculate the metric for each class separately, and average the metrics across classes (with equal weights for each class). `'weighted'`: Calculate the metric for each class separately, and average the metrics across classes, weighting each class by its support (`tp + fn`). `'none'` or `None`: Calculate the metric for each class separately, and return the metric for every class. `'samples'`: Calculate the metric for each sample, and average the metrics across samples (with equal weights for each sample). .. note:: What is considered a sample in the multi-dimensional multi-class case depends on the value of `mdmc_average`. .. note:: If `'none'` and a given class doesn't occur in the `preds` or `target`, the value for the class will be `nan`.	`'micro'`
`mdmc_average`	`Optional[str]`	Defines how averaging is done for multi-dimensional multi-class inputs (on top of the `average` parameter). Should be one of the following: `None` [default]: Should be left unchanged if your data is not multi-dimensional multi-class. `'samplewise'`: In this case, the statistics are computed separately for each sample on the `N` axis, and then averaged over samples. The computation for each sample is done by treating the flattened extra axes `...` (see :ref:`pages/classification:input types`) as the `N` dimension within the sample, and computing the metric for the sample based on that. `'global'`: In this case the `N` and `...` dimensions of the inputs (see :ref:`pages/classification:input types`) are flattened into a new `N_X` sample axis, i.e. the inputs are treated as if they were `(N_X, C)`. From here on the `average` parameter applies as usual.	`None`
`num_classes`	`Optional[int]`	Number of classes. Necessary for `'macro'`, `'weighted'` and `None` average methods.	`None`
`threshold`	`float`	Threshold for transforming probability or logit predictions to binary (0,1) predictions, in the case of binary or multi-label inputs. Default value of 0.5 corresponds to input being probabilities.	`0.5`
`top_k`	`Optional[int]`	Number of the highest probability or logit score predictions considered finding the correct label, relevant only for (multi-dimensional) multi-class inputs. The default value (`None`) will be interpreted as 1 for these inputs. Should be left at default (`None`) for all other types of inputs.	`None`
`multiclass`	`Optional[bool]`	Used only in certain special cases, where you want to treat inputs as a different type than what they appear to be. See the parameter's :ref:`documentation section <pages/classification:using the multiclass parameter>` for a more detailed explanation and examples.	`None`
`ignore_index`	`Optional[int]`	Integer specifying a target class to ignore. If given, this class index does not contribute to the returned score, regardless of reduction method. If an index is ignored, and `average=None` or `'none'`, the score for the ignored class will be returned as `nan`.	`None`
`subset_accuracy`		Whether to compute subset accuracy for multi-label and multi-dimensional multi-class inputs (has no effect for other input types). For multi-label inputs, if the parameter is set to `True`, then all labels for each sample must be correctly predicted for the sample to count as correct. If it is set to `False`, then all labels are counted separately - this is equivalent to flattening inputs beforehand (i.e. `preds = preds.flatten()` and same for `target`). For multi-dimensional multi-class inputs, if the parameter is set to `True`, then all sub-samples (on the extra axis) must be correct for the sample to be counted as correct. If it is set to `False`, then all sub-samples are counted separately - this is equivalent, in the case of label predictions, to flattening the inputs beforehand (i.e. `preds = preds.flatten()` and same for `target`). Note that the `top_k` parameter still applies in both cases, if set.	required

Raises:

Type	Description
`ValueError`	If `top_k` parameter is set for `multi-label` inputs.
`ValueError`	If `average` is none of `"micro"`, `"macro"`, `"weighted"`, `"samples"`, `"none"`, `None`.
`ValueError`	If `mdmc_average` is not one of `None`, `"samplewise"`, `"global"`.
`ValueError`	If `average` is set but `num_classes` is not provided.
`ValueError`	If `num_classes` is set and `ignore_index` is not in the range `[0, num_classes)`.
`ValueError`	If `top_k` is not an `integer` larger than `0`.

`get_confusion_matrix(preds, target, average='micro', mdmc_average='global', threshold=0.5, top_k=None, subset_accuracy=False, num_classes=None, multiclass=None, ignore_index=None)` ¶

Calculates the confusion matrix according to the specified average method.

Parameters:

Name	Type	Description	Default
`preds`	`Tensor`	Predictions from model (probabilities, logits or labels)	required
`target`	`Tensor`	Ground truth labels	required
`average`	`Optional[str]`	Defines the reduction that is applied. Should be one of the following: `'micro'` [default]: Calculate the metric globally, across all samples and classes. `'macro'`: Calculate the metric for each class separately, and average the metrics across classes (with equal weights for each class). `'weighted'`: Calculate the metric for each class separately, and average the metrics across classes, weighting each class by its support (`tp + fn`). `'none'` or `None`: Calculate the metric for each class separately, and return the metric for every class. `'samples'`: Calculate the metric for each sample, and average the metrics across samples (with equal weights for each sample). .. note:: What is considered a sample in the multi-dimensional multi-class case depends on the value of `mdmc_average`. .. note:: If `'none'` and a given class doesn't occur in the `preds` or `target`, the value for the class will be `nan`.	`'micro'`
`mdmc_average`	`Optional[str]`	Defines how averaging is done for multi-dimensional multi-class inputs (on top of the `average` parameter). Should be one of the following: `None` [default]: Should be left unchanged if your data is not multi-dimensional multi-class. `'samplewise'`: In this case, the statistics are computed separately for each sample on the `N` axis, and then averaged over samples. The computation for each sample is done by treating the flattened extra axes `...` (see :ref:`pages/classification:input types`) as the `N` dimension within the sample, and computing the metric for the sample based on that. `'global'`: In this case the `N` and `...` dimensions of the inputs (see :ref:`pages/classification:input types`) are flattened into a new `N_X` sample axis, i.e. the inputs are treated as if they were `(N_X, C)`. From here on the `average` parameter applies as usual.	`'global'`
`num_classes`	`Optional[int]`	Number of classes. Necessary for `'macro'`, `'weighted'` and `None` average methods.	`None`
`threshold`	`float`	Threshold for transforming probability or logit predictions to binary (0,1) predictions, in the case of binary or multi-label inputs. Default value of 0.5 corresponds to input being probabilities.	`0.5`
`top_k`	`Optional[int]`	Number of the highest probability or logit score predictions considered finding the correct label, relevant only for (multi-dimensional) multi-class inputs. The default value (`None`) will be interpreted as 1 for these inputs. Should be left at default (`None`) for all other types of inputs.	`None`
`multiclass`	`Optional[bool]`	Used only in certain special cases, where you want to treat inputs as a different type than what they appear to be. See the parameter's :ref:`documentation section <pages/classification:using the multiclass parameter>` for a more detailed explanation and examples.	`None`
`ignore_index`	`Optional[int]`	Integer specifying a target class to ignore. If given, this class index does not contribute to the returned score, regardless of reduction method. If an index is ignored, and `average=None`	`None`

`mean_absolute_error_ipu(preds, target)` ¶

Computes mean absolute error.

Handles NaNs without reshaping tensors in order to work on IPU.

Parameters:

Name	Type	Description	Default
`preds`	`Tensor`	estimated labels	required
`target`	`Tensor`	ground truth labels	required

Returns:

Tensor with MAE

`mean_squared_error_ipu(preds, target, squared)` ¶

Computes mean squared error.

Handles NaNs without reshaping tensors in order to work on IPU.

Parameters:

Name	Type	Description	Default
`preds`	`Tensor`	estimated labels	required
`target`	`Tensor`	ground truth labels	required
`squared`	`bool`	returns RMSE value if set to False	required

Returns:

Tensor with MSE

`pearson_ipu(preds, target)` ¶

Computes pearson correlation coefficient.

Handles NaNs in the target without reshaping tensors in order to work on IPU.

Parameters:

Name	Type	Description	Default
`preds`		estimated scores	required
`target`		ground truth scores	required
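
One way to picture NaN handling for a correlation coefficient is to weight every term by a 0/1 validity mask, so that entries with a NaN target drop out of the means, the covariance and the variances while the lists keep their length. This is a plain-Python sketch with a hypothetical function name; it is not the graphium implementation.

```python
import math


def pearson_ignore_nan(preds, targets):
    """Pearson correlation computed over positions whose target is not NaN."""
    valid = [0.0 if math.isnan(t) else 1.0 for t in targets]
    safe_targets = [0.0 if math.isnan(t) else t for t in targets]
    n = sum(valid)
    mean_p = sum(w * p for w, p in zip(valid, preds)) / n
    mean_t = sum(w * t for w, t in zip(valid, safe_targets)) / n
    cov = sum(w * (p - mean_p) * (t - mean_t) for w, p, t in zip(valid, preds, safe_targets))
    var_p = sum(w * (p - mean_p) ** 2 for w, p in zip(valid, preds))
    var_t = sum(w * (t - mean_t) ** 2 for w, t in zip(valid, safe_targets))
    return cov / math.sqrt(var_p * var_t)


# The last prediction is ignored because its target is NaN.
r = pearson_ignore_nan([1.0, 2.0, 3.0, 100.0], [2.0, 4.0, 6.0, float("nan")])
print(r)  # 1.0
```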

`precision_ipu(preds, target, average='micro', mdmc_average=None, ignore_index=None, num_classes=None, threshold=0.5, top_k=None, multiclass=None)` ¶

A modified version of the torchmetrics.functional.precision that can ignore NaNs by giving them the same value for both preds and target. This allows it to work with compilation and IPUs since it doesn't modify the tensor's shape.

`r2_score_ipu(preds, target, *args, **kwargs)` ¶

Computes the r2 score, also known as the coefficient of determination:

.. math:: R^2 = 1 - \frac{SS_{res}}{SS_{tot}}

where :math:`SS_{res}=\sum_i (y_i - f(x_i))^2` is the sum of residual squares, and :math:`SS_{tot}=\sum_i (y_i - \bar{y})^2` is the total sum of squares. Can also calculate the adjusted r2 score given by

.. math:: R^2_{adj} = 1 - \frac{(1-R^2)(n-1)}{n-k-1}

where the parameter :math:`k` (the number of independent regressors) should be provided as the `adjusted` argument. Handles NaNs without reshaping tensors in order to work on IPU.

Parameters:

Name	Description	Default
`preds`	estimated labels	required
`target`	ground truth labels	required
`adjusted`	number of independent regressors for calculating adjusted r2 score.	required
`multioutput`	Defines aggregation in the case of multiple output scores. Can be one of the following strings: `'raw_values'` returns full set of scores `'uniform_average'` scores are uniformly averaged `'variance_weighted'` scores are weighted by their individual variances	required
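
The two formulas can be evaluated directly on a small example. The sketch below works on plain lists for a single output and leaves out NaN handling and the `multioutput` modes; the function name `r2_from_lists` is hypothetical.

```python
def r2_from_lists(preds, targets, adjusted=0):
    """R2 score, optionally adjusted for `adjusted` independent regressors."""
    n = len(targets)
    mean_target = sum(targets) / n
    ss_res = sum((t - p) ** 2 for p, t in zip(preds, targets))
    ss_tot = sum((t - mean_target) ** 2 for t in targets)
    r2 = 1.0 - ss_res / ss_tot
    if adjusted:
        # Adjusted score: penalise by the number of regressors k = adjusted.
        r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - adjusted - 1)
    return r2


targets = [1.0, 2.0, 3.0, 4.0, 5.0]
preds = [1.1, 1.9, 3.2, 3.8, 5.0]
print(round(r2_from_lists(preds, targets), 4))              # 0.99
print(round(r2_from_lists(preds, targets, adjusted=1), 4))  # 0.9867
```

Here the residual sum of squares is 0.10 and the total sum of squares is 10, giving 1 - 0.01 = 0.99; with n = 5 and k = 1 the adjusted value is 1 - 0.01 * 4 / 3.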

`recall_ipu(preds, target, average='micro', mdmc_average=None, ignore_index=None, num_classes=None, threshold=0.5, top_k=None, multiclass=None)` ¶

A modified version of the torchmetrics.functional.recall that can ignore NaNs by giving them the same value for both preds and target. This allows it to work with compilation and IPUs since it doesn't modify the tensor's shape.

`spearman_ipu(preds, target)` ¶

Computes spearman rank correlation coefficient.

Handles NaNs in the target without reshaping tensors in order to work on IPU.

Parameters:

Name	Type	Description	Default
`preds`		estimated scores	required
`target`		ground truth scores	required

IPU Simple Lightning¶

`graphium.ipu.ipu_simple_lightning` ¶

IPU Utils¶

`graphium.ipu.ipu_utils` ¶

`import_poptorch(raise_error=True)` ¶

Imports poptorch and returns it. It is wrapped in a function to avoid breaking the code on non-IPU devices where poptorch is not installed.

Parameters:

Name	Type	Description	Default
`raise_error`		Whether to raise an error if poptorch is unavailable. If `False`, return `None`	`True`

Returns:

Type	Description
`Optional[ModuleType]`	The poptorch module
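
The optional-import pattern described here can be sketched generically with `importlib`. The helper `import_optional` is a hypothetical stand-in that takes any module name; it is not the body of `import_poptorch`.

```python
import importlib


def import_optional(module_name, raise_error=True):
    """Return the imported module, or None if it is missing and raise_error is False."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        if raise_error:
            raise
        return None


# A module that exists is returned; a missing one yields None when errors are suppressed.
json_module = import_optional("json")
missing = import_optional("surely_missing_module_xyz", raise_error=False)
print(json_module.__name__)  # json
print(missing)               # None
```

Deferring the import to call time means that merely importing the surrounding package does not fail on machines without the optional dependency.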

`ipu_options_list_to_file(ipu_opts)` ¶

Creates a temporary file from a list of IPU configs so that it can be read by poptorch.Options.loadFromFile.

Parameters:

Name	Type	Description	Default
`ipu_opts`	`Optional[List[str]]`	The list configurations for the IPU, written as a list of strings to make use of `poptorch.Options.loadFromFile`	required

Returns: tmp_file: The temporary file of ipu configs
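
Writing a list of option strings to a temporary file takes only the standard library. In this sketch the helper name is hypothetical, it returns the file path rather than a file object, and the two option strings are illustrative examples assumed to follow the one-call-per-line style read by `poptorch.Options.loadFromFile`.

```python
import tempfile


def options_list_to_file(option_lines):
    """Write one option per line to a temporary file and return its path."""
    # delete=False keeps the file on disk after closing so it can be read later.
    tmp = tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False)
    tmp.write("\n".join(option_lines))
    tmp.close()
    return tmp.name


path = options_list_to_file(["deviceIterations(4)", "replicationFactor(2)"])
with open(path) as handle:
    print(handle.read().splitlines())  # ['deviceIterations(4)', 'replicationFactor(2)']
```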

`is_running_on_ipu()` ¶

Returns whether the current module is running on IPU. Needs to be used in the forward or backward pass.

`load_ipu_options(ipu_opts, seed=None, model_name=None, gradient_accumulation=None, precision=None, ipu_inference_opts=None)` ¶

Load the IPU options from the config file.

Parameters:

Name	Type	Description	Default
`ipu_opts`		The list of configurations for the IPU, written as a list of strings to make use of `poptorch.Options.loadFromFile`: a temporary config file is written and then read back. See `Options.loadFromFile`, the tutorial for IPU options (https://github.com/graphcore/tutorials/tree/sdk-release-2.6/tutorials/pytorch/efficient_data_loading) and the full documentation for IPU options (https://docs.graphcore.ai/projects/poptorch-user-guide/en/latest/reference.html?highlight=options#poptorch.Options).	required
`seed`	`Optional[int]`	random seed for the IPU	`None`
`model_name`	`Optional[str]`	Name of the model, to be used for ipu profiling	`None`
`ipu_inference_opts`	`Optional[List[str]]`	optional IPU configuration overrides for inference. If this is provided, options in this file override those in `ipu_file` for inference.	`None`

Terms used by the IPU options:

minibatch size: The number of samples processed by one simple fwd/bwd pass.
    = # of samples in a minibatch
device iterations: A device iteration corresponds to one iteration of the training loop executed on the IPU, starting with data-loading and ending with a weight update. In this simple case, when we set n deviceIterations, the host will prepare n mini-batches in an infeed queue so the IPU can perform efficiently n iterations.
    = # of minibatches to be processed at a time
    = # of training / backward pass in this call
gradient accumulation factor: After each backward pass the gradients are accumulated together for K mini-batches. Set K in the argument.
    = # of minibatches to accumulate gradients from
replication factor: Replication describes the process of running multiple instances of the same model simultaneously on different IPUs to achieve data parallelism. If the model requires N IPUs and the replication factor is M, N x M IPUs will be necessary.
    = # of times the model is copied to speed up computation, each replica of the model is sent a different subset of the dataset
global batch size: In a single device iteration, many mini-batches may be processed and the resulting gradients accumulated. We call this total number of samples processed for one optimiser step the global batch size.
    = total number of samples processed for one optimiser step
    = (minibatch size x Gradient accumulation factor) x Number of replicas

Returns:

training_opts: IPU options for the training set.

inference_opts: IPU options for inference.
    It differs from the `training_opts` by enforcing `gradientAccumulation` to 1
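
The batch-size terminology used in the `ipu_cfg` description can be checked with a few lines of arithmetic. The sketch below is only an illustration: `BatchConfig` and its methods are hypothetical names, not part of graphium or poptorch. It assumes the global batch size formula given above and the combined batch size formula given for `CombinedBatchingCollator`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BatchConfig:
    """Hypothetical container for the four batching knobs; not a graphium class."""

    minibatch_size: int
    device_iterations: int = 1
    gradient_accumulation: int = 1
    replication_factor: int = 1

    def global_batch_size(self) -> int:
        # Samples contributing to one optimiser step:
        # (minibatch size x gradient accumulation factor) x number of replicas
        return self.minibatch_size * self.gradient_accumulation * self.replication_factor

    def combined_batch_size(self) -> int:
        # Samples prepared by the host per call; device iterations are counted too
        return self.global_batch_size() * self.device_iterations

    def num_ipus(self, ipus_per_model: int) -> int:
        # A model needing N IPUs with replication factor M needs N x M IPUs
        return ipus_per_model * self.replication_factor


cfg = BatchConfig(
    minibatch_size=8,
    device_iterations=4,
    gradient_accumulation=2,
    replication_factor=3,
)
print(cfg.global_batch_size())         # 8 * 2 * 3 = 48
print(cfg.combined_batch_size())       # 48 * 4 = 192
print(cfg.num_ipus(ipus_per_model=2))  # 2 * 3 = 6
```

Since `inference_opts` enforces `gradientAccumulation` to 1, the same arithmetic with `gradient_accumulation=1` gives the corresponding sizes for inference.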

IPU Wrapper¶

`graphium.ipu.ipu_wrapper` ¶

`PredictorModuleIPU` ¶

Bases: PredictorModule

This class wraps around the PredictorModule to make it work with IPU and the IPUPluginGraphium.

`convert_from_fp16(data)` ¶

Converts tensors from FP16 to FP32. Useful for converting the output data of the IPU program.

`get_num_graphs(data)` ¶

IPU-specific method to compute the number of graphs in a Batch, taking into account gradient accumulation, multiple IPUs, and multiple device iterations. Essential for estimating throughput in graphs/s.

`PyGArgsParser` ¶

Bases: ICustomArgParser

This class is responsible for converting a PyG Batch to and from a tuple of tensors. This allows a PyG Batch to be used as input to IPU programs. Copied from the poppyg repo; in the future, it should be imported from that repo directly.

`reconstruct(original_structure, tensor_iterator)` ¶

Create a new instance with the same class type as the original_structure. This new instance will be initialized with tensors from the provided iterator and uses the same sorted keys from the yieldTensors() implementation.

`sortedTensorKeys(struct)` `staticmethod` ¶

Find all the keys that map to a tensor value in struct. The keys are returned in sorted order.

`yieldTensors(struct)` ¶

Yields every `torch.Tensor` in `struct`, in sorted key order.
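
The flatten-and-rebuild contract described by `sortedTensorKeys`, `yieldTensors`, and `reconstruct` can be sketched without poptorch or PyG. Everything in the block below is a hypothetical stand-in (`Struct` for a PyG Batch, `FakeTensor` for `torch.Tensor`, and the three free functions for the parser methods); it only illustrates the idea that both directions rely on the same sorted key order, and it is not the library's implementation.

```python
from typing import Any, Dict, Iterator, List


class FakeTensor(list):
    """Hypothetical stand-in for torch.Tensor so the sketch runs without torch."""


class Struct:
    """Hypothetical stand-in for a PyG Batch: a bag of named attributes."""

    def __init__(self, **fields: Any) -> None:
        self.fields: Dict[str, Any] = dict(fields)


def sorted_tensor_keys(struct: Struct) -> List[str]:
    # Keep only the keys whose value is a "tensor", returned in sorted order
    return sorted(k for k, v in struct.fields.items() if isinstance(v, FakeTensor))


def yield_tensors(struct: Struct) -> Iterator[FakeTensor]:
    # Flatten: emit the tensor values following the sorted key order
    for key in sorted_tensor_keys(struct):
        yield struct.fields[key]


def reconstruct(original: Struct, tensor_iterator: Iterator[FakeTensor]) -> Struct:
    # Rebuild an instance of the same class, filling the tensor fields from the
    # iterator in the same sorted key order used by yield_tensors.
    # Non-tensor fields are not restored in this simplified sketch.
    rebuilt = type(original)()
    for key in sorted_tensor_keys(original):
        rebuilt.fields[key] = next(tensor_iterator)
    return rebuilt


batch = Struct(x=FakeTensor([1.0, 2.0]), edge_index=FakeTensor([0, 1]), name="mol")
flat = tuple(yield_tensors(batch))  # the "tuple of tensors" view of the batch
restored = reconstruct(batch, iter(flat))

print(sorted_tensor_keys(batch))  # ['edge_index', 'x']
print(restored.fields)
```

Because both functions iterate over the same sorted keys, the order in which tensors are emitted is enough to put each one back under the right attribute name.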

To Dense Batch¶

`graphium.ipu.to_dense_batch` ¶

`to_dense_batch(x, batch=None, fill_value=0.0, max_num_nodes_per_graph=None, batch_size=None, drop_nodes_last_graph=False)` ¶

Given a sparse batch of node features :math:`\mathbf{X} \in \mathbb{R}^{(N_1 + \ldots + N_B) \times F}` (with :math:`N_i` indicating the number of nodes in graph :math:`i`), creates a dense node feature tensor :math:`\mathbf{X} \in \mathbb{R}^{B \times N_{\max} \times F}` (with :math:`N_{\max} = \max_i^B N_i`). In addition, a mask of shape :math:`\mathbf{M} \in \{ 0, 1 \}^{B \times N_{\max}}` is returned, holding information about the existence of fake-nodes in the dense representation.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	Node feature matrix :math:`\mathbf{X} \in \mathbb{R}^{(N_1 + \ldots + N_B) \times F}`.	required
`batch`	`Optional[Tensor]`	Batch vector :math:`\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N`, which assigns each node to a specific example. Must be ordered. (default: :obj:`None`)	`None`
`fill_value`	`float`	The value for invalid entries in the resulting dense output tensor. (default: :obj:`0`)	`0.0`
`max_num_nodes_per_graph`	`Optional[int]`	The size of the output node dimension. (default: :obj:`None`)	`None`
`batch_size`	`Optional[int]`	The batch size. (default: :obj:`None`)	`None`
`drop_nodes_last_graph`		Whether to drop the nodes of the last graph that exceed `max_num_nodes_per_graph`. Useful when the last graph is padding.	`False`

:rtype: (:class:`Tensor`, :class:`BoolTensor`)
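
To make the shapes concrete, here is a minimal pure-Python sketch of the sparse-to-dense conversion described above, using nested lists instead of tensors. The function name `dense_batch_sketch` is hypothetical, and the sketch covers only `x`, `batch`, `fill_value`, and `max_num_nodes_per_graph`. Dropping nodes that do not fit in the node dimension is a simplification assumed here, not a statement about how the library handles overflow.

```python
from typing import List, Optional, Tuple


def dense_batch_sketch(
    x: List[List[float]],
    batch: List[int],
    fill_value: float = 0.0,
    max_num_nodes_per_graph: Optional[int] = None,
) -> Tuple[List[List[List[float]]], List[List[bool]]]:
    """Illustrative only: pad each graph's nodes to a common size and build a mask."""
    num_graphs = max(batch) + 1                       # B
    num_features = len(x[0])                          # F
    counts = [batch.count(g) for g in range(num_graphs)]  # N_1, ..., N_B
    n_max = max_num_nodes_per_graph if max_num_nodes_per_graph is not None else max(counts)

    # Start from a B x N_max x F block filled with fill_value and an all-False mask
    dense = [[[fill_value] * num_features for _ in range(n_max)] for _ in range(num_graphs)]
    mask = [[False] * n_max for _ in range(num_graphs)]

    next_slot = [0] * num_graphs
    for row, g in zip(x, batch):
        slot = next_slot[g]
        if slot < n_max:  # assumption of this sketch: nodes that do not fit are dropped
            dense[g][slot] = list(row)
            mask[g][slot] = True
        next_slot[g] += 1
    return dense, mask


# Two graphs: graph 0 has two nodes, graph 1 has one node, F = 2
x = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
batch = [0, 0, 1]
dense, mask = dense_batch_sketch(x, batch)
print(dense)  # [[[1.0, 1.0], [2.0, 2.0]], [[3.0, 3.0], [0.0, 0.0]]]
print(mask)   # [[True, True], [True, False]]
```

The `False` entry in the mask marks the fake node that was added to pad graph 1 up to :math:`N_{\max} = 2`.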

`to_packed_dense_batch(x, pack_from_node_idx, pack_attn_mask, fill_value=0.0, max_num_nodes_per_pack=None)` ¶

Given a sparse batch of node features :math:`\mathbf{X} \in \mathbb{R}^{(N_1 + \ldots + N_B) \times F}` (with :math:`N_i` indicating the number of nodes in graph :math:`i`), creates a dense node feature tensor :math:`\mathbf{X} \in \mathbb{R}^{B \times N_{\max} \times F}` (with :math:`N_{\max} = \max_i^B N_i`). In addition, a mask of shape :math:`\mathbf{M} \in \{ 0, 1 \}^{B \times N_{\max}}` is returned, holding information about the existence of fake-nodes in the dense representation.

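TODO: Update docstring. The description above and the parameter table below are carried over from `to_dense_batch` and do not match this function's signature.

Parameters: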

Name	Type	Description	Default
`x`	`Tensor`	Node feature matrix :math:`\mathbf{X} \in \mathbb{R}^{(N_1 + \ldots + N_B) \times F}`.	required
`batch`		Batch vector :math:`\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N`, which assigns each node to a specific example. Must be ordered. (default: :obj:`None`)	required
`fill_value`	`float`	The value for invalid entries in the resulting dense output tensor. (default: :obj:`0`)	`0.0`
`max_num_nodes_per_graph`		The size of the output node dimension. (default: :obj:`None`)	required
`batch_size`		The batch size. (default: :obj:`None`)	required
`drop_nodes_last_graph`		Whether to drop the nodes of the last graphs that exceed the `max_num_nodes_per_graph`. Useful when the last graph is a padding.	required

:rtype: (:class:`Tensor`, :class:`BoolTensor`)

`to_sparse_batch(x, mask_idx)` ¶

Inverse of `to_dense_batch`.
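
As a rough illustration of the reverse direction, the sketch below recovers the sparse node list from a dense, padded batch. It assumes that `mask_idx` holds the flat indices of the real nodes in the dense tensor once its first two dimensions are flattened; that interpretation is an assumption made for this example, and `sparse_batch_sketch` is a hypothetical name rather than the library function.

```python
from typing import List


def sparse_batch_sketch(dense: List[List[List[float]]], mask_idx: List[int]) -> List[List[float]]:
    """Illustrative only: flatten B x N_max x F to (B * N_max) x F, then keep real nodes."""
    flat = [row for graph in dense for row in graph]
    return [flat[i] for i in mask_idx]


# Dense batch with two graphs padded to two nodes each; the last row is padding
dense = [[[1.0, 1.0], [2.0, 2.0]], [[3.0, 3.0], [0.0, 0.0]]]
mask = [[True, True], [True, False]]

# Assumed meaning of mask_idx: positions of the True entries in the flattened mask
flat_mask = [keep for graph_mask in mask for keep in graph_mask]
mask_idx = [i for i, keep in enumerate(flat_mask) if keep]

sparse = sparse_batch_sketch(dense, mask_idx)
print(mask_idx)  # [0, 1, 2]
print(sparse)    # [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
```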

`to_sparse_batch_from_packed(x, pack_from_node_idx)` ¶

Inverse of `to_packed_dense_batch`.
