
graphium.nn.architectures

High level architectures in the library

Global Architectures


graphium.nn.architectures.global_architectures


Copyright (c) 2023 Valence Labs, Recursion Pharmaceuticals and Graphcore Limited.

Use of this software is subject to the terms and conditions outlined in the LICENSE file. Unauthorized modification, distribution, or use is prohibited. Provided 'as is' without warranties of any kind.

Valence Labs, Recursion Pharmaceuticals and Graphcore Limited are not liable for any damages arising from its use. Refer to the LICENSE file for the full terms and conditions.


EnsembleFeedForwardNN

Bases: FeedForwardNN

__init__(in_dim, out_dim, hidden_dims, num_ensemble, reduction, subset_in_dim=1.0, depth=None, activation='relu', last_activation='none', dropout=0.0, last_dropout=0.0, normalization='none', first_normalization='none', last_normalization='none', residual_type='none', residual_skip_steps=1, name='LNN', layer_type='ens-fc', layer_kwargs=None, last_layer_is_readout=False)

An ensemble of flexible neural networks, with variable hidden dimensions, support for multiple layer types, and support for different residual connections.

Parameters:

in_dim:
    Input feature dimensions of the layer

out_dim:
    Output feature dimensions of the layer

hidden_dims:
    Either an integer specifying all the hidden dimensions,
    or a list of dimensions in the hidden layers.
    Be careful, the "simple" residual type only supports
    hidden dimensions of the same value.

num_ensemble:
    Number of MLPs that run in parallel.

reduction:
    Reduction to use at the end of the MLP. Choices:

    - "none" or `None`: No reduction
    - "mean": Mean reduction
    - "sum": Sum reduction
    - "max": Max reduction
    - "min": Min reduction
    - "median": Median reduction
    - `Callable`: Any callable function. Must take `dim` as a keyword argument.

subset_in_dim:
    If float, ratio of the subset of the ensemble to use. Must be between 0 and 1.
    If int, number of elements to subset from in_dim.
    If `None`, the subset_in_dim is set to `1.0`.
    A different subset is used for each ensemble.
    Only valid if the input shape is `[B, Din]`.

depth:
    If `hidden_dims` is an integer, `depth` is 1 + the number of
    hidden layers to use.
    If `hidden_dims` is a list, then
    `depth` must be `None` or equal to `len(hidden_dims) + 1`

activation:
    activation function to use in the hidden layers.

last_activation:
    activation function to use in the last layer.

dropout:
    The ratio of units to dropout. Must be between 0 and 1

last_dropout:
    The ratio of units to dropout for the last_layer. Must be between 0 and 1

normalization:
    Normalization to use. Choices:

    - "none" or `None`: No normalization
    - "batch_norm": Batch normalization
    - "layer_norm": Layer normalization
    - `Callable`: Any callable function

first_normalization:
    Whether to use batch normalization **before** the first layer

last_normalization:
    Whether to use batch normalization in the last layer

residual_type:
    - "none": No residual connection
    - "simple": Residual connection similar to the ResNet architecture.
      See class `ResidualConnectionSimple`
    - "weighted": Residual connection similar to the Resnet architecture,
      but with weights applied before the summation. See class `ResidualConnectionWeighted`
    - "concat": Residual connection where the residual is concatenated instead
      of being added.
    - "densenet": Residual connection where the residual of all previous layers
      are concatenated. This leads to a strong increase in the number of parameters
      if there are multiple hidden layers.

residual_skip_steps:
    The number of steps to skip between each residual connection.
    If `1`, all the layers are connected. If `2`, half of the
    layers are connected.

name:
    Name attributed to the current network, for display and printing
    purposes.

layer_type:
    The type of layers to use in the network.
    Either "ens-fc" as the `EnsembleFCLayer`, or a class representing the `nn.Module`
    to use.

layer_kwargs:
    The arguments to be used in the initialization of the layer provided by `layer_type`

last_layer_is_readout: Whether the last layer should be treated as a readout layer.
    Allows using the `mup.MuReadout` from the muTransfer method (https://github.com/microsoft/mup)
__repr__()

Controls how the class is printed

forward(h)

Subset the input dimensions for each MLP, forward the ensemble of MLPs on the input features, then reduce the output if a reduction is specified.

Parameters:

h: `torch.Tensor[B, Din]` or `torch.Tensor[..., 1, B, Din]` or `torch.Tensor[..., L, B, Din]`:

    Input feature tensor, before the MLP.
    `Din` is the number of input features, `B` is the batch size, and `L` is the number of ensembles.

Returns:

`torch.Tensor[..., L, B, Dout]` or `torch.Tensor[..., B, Dout]`:

    Output feature tensor, after the MLP.
    `Dout` is the number of output features, `B` is the batch size, and `L` is the number of ensembles.
    `L` is removed if a reduction is specified.
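
As a rough usage sketch (the import path and argument values below are assumptions based on this page, not taken from the library's examples):

```python
import torch
from graphium.nn.architectures import EnsembleFeedForwardNN

# 3 MLPs run in parallel on the same [B, Din] input; "mean" averages their
# outputs, which removes the ensemble axis L from the result.
ensemble_mlp = EnsembleFeedForwardNN(
    in_dim=16,
    out_dim=4,
    hidden_dims=[32, 32],   # two hidden layers of width 32
    num_ensemble=3,
    reduction="mean",
    subset_in_dim=0.5,      # each MLP sees a different half of the 16 input features
)

h = torch.randn(8, 16)      # [B, Din]
out = ensemble_mlp(h)       # [B, Dout] == [8, 4] after the mean reduction
```

A callable reduction is also possible as long as it accepts `dim` as a keyword argument, for example `reduction=lambda x, dim: torch.logsumexp(x, dim=dim)`.
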
get_init_kwargs()

Get a dictionary that can be used to instantiate a new object with identical parameters.

FeedForwardGraph

Bases: FeedForwardNN

__init__(in_dim, out_dim, hidden_dims, layer_type, depth=None, activation='relu', last_activation='none', dropout=0.0, last_dropout=0.0, normalization='none', first_normalization='none', last_normalization='none', residual_type='none', residual_skip_steps=1, in_dim_edges=0, hidden_dims_edges=[], out_dim_edges=None, name='GNN', layer_kwargs=None, virtual_node='none', use_virtual_edges=False, last_layer_is_readout=False)

A flexible neural network architecture, with variable hidden dimensions, support for multiple layer types, and support for different residual connections.

This class is meant to work with different graph neural networks layers. Any layer must inherit from graphium.nn.base_graph_layer.BaseGraphStructure or graphium.nn.base_graph_layer.BaseGraphLayer.

Parameters:

in_dim:
    Input feature dimensions of the layer

out_dim:
    Output feature dimensions of the layer

hidden_dims:
    List of dimensions in the hidden layers.
    Be careful, the "simple" residual type only supports
    hidden dimensions of the same value.

layer_type:
    The type of layers to use in the network.
    A class that inherits from `graphium.nn.base_graph_layer.BaseGraphStructure`,
    or one of the following strings:

    - "pyg:gin": GINConvPyg
    - "pyg:gine": GINEConvPyg
    - "pyg:gated-gcn": GatedGCNPyg
    - "pyg:pna-msgpass": PNAMessagePassingPyg

depth:
    If `hidden_dims` is an integer, `depth` is 1 + the number of
    hidden layers to use. If `hidden_dims` is a `list`, `depth` must
    be `None`.

activation:
    activation function to use in the hidden layers.

last_activation:
    activation function to use in the last layer.

dropout:
    The ratio of units to dropout. Must be between 0 and 1

last_dropout:
    The ratio of units to dropout for the last layer. Must be between 0 and 1

normalization:
    Normalization to use. Choices:

    - "none" or `None`: No normalization
    - "batch_norm": Batch normalization
    - "layer_norm": Layer normalization
    - `Callable`: Any callable function

first_normalization:
    Whether to use batch normalization **before** the first layer

last_normalization:
    Whether to use batch normalization in the last layer

residual_type:
    - "none": No residual connection
    - "simple": Residual connection similar to the ResNet architecture.
      See class `ResidualConnectionSimple`
    - "weighted": Residual connection similar to the Resnet architecture,
      but with weights applied before the summation. See class `ResidualConnectionWeighted`
    - "concat": Residual connection where the residual is concatenated instead
      of being added.
    - "densenet": Residual connection where the residual of all previous layers
      are concatenated. This leads to a strong increase in the number of parameters
      if there are multiple hidden layers.

residual_skip_steps:
    The number of steps to skip between each residual connection.
    If `1`, all the layers are connected. If `2`, half of the
    layers are connected.

in_dim_edges:
    Input edge-feature dimensions of the network. Keep at 0 if not using
    edge features, or if the layer doesn't support edges.

hidden_dims_edges:
    Hidden dimensions for the edges. Most models don't support it, so it
    should only be used for those that do, e.g. `GatedGCNLayer`

out_dim_edges:
    Output edge-feature dimensions of the network. Keep at 0 if not using
    edge features, or if the layer doesn't support edges. Defaults to the
    last value of hidden_dims_edges.

name:
    Name attributed to the current network, for display and printing
    purposes.

layer_kwargs:
    The arguments to be used in the initialization of the layer provided by `layer_type`

virtual_node:
    A string associated to the type of virtual node to use,
    either `None`, "none", "mean", "sum", "max", "logsum".
    See `graphium.nn.pooling_pyg.VirtualNode`.

    The virtual node will not use any residual connection if `residual_type`
    is "none". Otherwise, it will use a simple ResNet like residual
    connection.

use_virtual_edges:
    Whether the virtual node should use the edge features.

last_layer_is_readout: Whether the last layer should be treated as a readout layer.
    Allows using the `mup.MuReadout` from the muTransfer method (https://github.com/microsoft/mup)
__repr__()

Controls how the class is printed

forward(g)

Apply the full graph neural network on the input graph and node features.

Parameters:

g:
    pyg Batch graph on which the convolution is done with the keys:

    - `"feat"`: torch.Tensor[..., N, Din]
      Node feature tensor, before convolution.
      `N` is the number of nodes, `Din` is the input features

    - `"edge_feat"` (torch.Tensor[..., N, Ein]):
      Edge feature tensor, before convolution.
      `N` is the number of nodes, `Ein` is the input edge features

Returns:

`torch.Tensor[..., M, Dout]` or `torch.Tensor[..., N, Dout]`:
    Node or graph feature tensor, after the network.
    `N` is the number of nodes, `M` is the number of graphs,
    `Dout` is the output dimension ``self.out_dim``
    If the `self.pooling` is [`None`], then it returns node features and the output dimension is `N`,
    otherwise it returns graph features and the output dimension is `M`
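
A minimal forward-pass sketch (the import path, the hand-built `Batch`, and the argument values are assumptions; in practice graphs usually come from graphium's featurization pipeline):

```python
import torch
from torch_geometric.data import Data, Batch
from graphium.nn.architectures import FeedForwardGraph

# A small GIN-based graph network on 16-dim node features.
gnn = FeedForwardGraph(
    in_dim=16,
    out_dim=8,
    hidden_dims=[32, 32],
    layer_type="pyg:gin",
)

# Minimal pyg Batch carrying the "feat" key expected by forward().
data = Data(
    feat=torch.randn(5, 16),
    edge_index=torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]]),
)
g = Batch.from_data_list([data])
out = gnn(g)   # node features [..., N, Dout] when no pooling is configured
```
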
get_init_kwargs()

Get a dictionary that can be used to instantiate a new object with identical parameters.

get_nested_key(d, target_key)

Get the value associated with a key in a nested dictionary.

Parameters:

d:
    The dictionary to search in

target_key:
    The key to search for

Returns:

The value associated with the key if found, `None` otherwise
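
A minimal sketch of the intended behavior (assuming, as the description suggests, a recursive search for the first matching key; the example dictionary is illustrative):

```python
from graphium.nn.architectures.global_architectures import get_nested_key

config = {"gnn": {"layer_kwargs": {"num_heads": 4}}}

print(get_nested_key(config, "num_heads"))   # 4
print(get_nested_key(config, "missing"))     # None
```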

make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)

Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layer widths divided by a given factor (2 by default).

Parameters:

divide_factor:
    Factor by which to divide the width.

factor_in_dim:
    Whether to factor the input dimension for the nodes

Returns:

kwargs: `Dict[str, Any]`
    Dictionary of parameters to be used to instantiate the base model, with the widths divided by the factor

FeedForwardNN

Bases: Module, MupMixin

cache_readouts: bool property

Whether the readout cache is enabled

__init__(in_dim, out_dim, hidden_dims, depth=None, activation='relu', last_activation='none', dropout=0.0, last_dropout=0.0, normalization='none', first_normalization='none', last_normalization='none', residual_type='none', residual_skip_steps=1, name='LNN', layer_type='fc', layer_kwargs=None, last_layer_is_readout=False)

A flexible neural network architecture, with variable hidden dimensions, support for multiple layer types, and support for different residual connections.

Parameters:

in_dim:
    Input feature dimensions of the layer

out_dim:
    Output feature dimensions of the layer

hidden_dims:
    Either an integer specifying all the hidden dimensions,
    or a list of dimensions in the hidden layers.
    Be careful, the "simple" residual type only supports
    hidden dimensions of the same value.

depth:
    If `hidden_dims` is an integer, `depth` is 1 + the number of
    hidden layers to use.
    If `hidden_dims` is a list, then
    `depth` must be `None` or equal to `len(hidden_dims) + 1`

activation:
    activation function to use in the hidden layers.

last_activation:
    activation function to use in the last layer.

dropout:
    The ratio of units to dropout. Must be between 0 and 1

last_dropout:
    The ratio of units to dropout for the last_layer. Must be between 0 and 1

normalization:
    Normalization to use. Choices:

    - "none" or `None`: No normalization
    - "batch_norm": Batch normalization
    - "layer_norm": Layer normalization
    - `Callable`: Any callable function

first_normalization:
    Whether to use batch normalization **before** the first layer

last_normalization:
    Whether to use batch normalization in the last layer

residual_type:
    - "none": No residual connection
    - "simple": Residual connection similar to the ResNet architecture.
      See class `ResidualConnectionSimple`
    - "weighted": Residual connection similar to the Resnet architecture,
      but with weights applied before the summation. See class `ResidualConnectionWeighted`
    - "concat": Residual connection where the residual is concatenated instead
      of being added.
    - "densenet": Residual connection where the residual of all previous layers
      are concatenated. This leads to a strong increase in the number of parameters
      if there are multiple hidden layers.

residual_skip_steps:
    The number of steps to skip between each residual connection.
    If `1`, all the layers are connected. If `2`, half of the
    layers are connected.

name:
    Name attributed to the current network, for display and printing
    purposes.

layer_type:
    The type of layers to use in the network.
    Either "fc" as the `FCLayer`, or a class representing the `nn.Module`
    to use.

layer_kwargs:
    The arguments to be used in the initialization of the layer provided by `layer_type`

last_layer_is_readout: Whether the last layer should be treated as a readout layer.
    Allows using the `mup.MuReadout` from the muTransfer method (https://github.com/microsoft/mup)
__repr__()

Controls how the class is printed

add_layers(layers)

Add layers to the end of the model.

drop_layers(depth)

Remove the last layers of the model.

forward(h)

Apply the neural network on the input features.

Parameters:

h: `torch.Tensor[..., Din]`:
    Input feature tensor, before the network.
    `Din` is the number of input features

Returns:

`torch.Tensor[..., Dout]`:
    Output feature tensor, after the network.
    `Dout` is the number of output features
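
A quick usage sketch (the import path and argument values are assumptions based on this page):

```python
import torch
from graphium.nn.architectures import FeedForwardNN

# An MLP with two hidden layers of width 32 (depth = 1 + number of hidden layers
# when hidden_dims is an integer), layer norm, and simple residual connections.
mlp = FeedForwardNN(
    in_dim=16,
    out_dim=4,
    hidden_dims=32,
    depth=3,
    normalization="layer_norm",
    residual_type="simple",
)

h = torch.randn(8, 16)   # [..., Din]
out = mlp(h)             # [..., Dout] == [8, 4]
```
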
get_init_kwargs()

Get a dictionary that can be used to instantiate a new object with identical parameters.

make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)

Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layer widths divided by a given factor (2 by default).

Parameters:

divide_factor:
    Factor by which to divide the width.

factor_in_dim:
    Whether to factor the input dimension
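
A hedged sketch of how such base kwargs are typically consumed with the `mup` package (the surrounding training setup is omitted, and the exact workflow in graphium may differ):

```python
from mup import set_base_shapes

# Regular model, then a narrower "base" model built from the scaled-down kwargs.
model = FeedForwardNN(in_dim=16, out_dim=4, hidden_dims=[64, 64], last_layer_is_readout=True)
base_model = FeedForwardNN(**model.make_mup_base_kwargs(divide_factor=2.0))

# Registering the base shapes enables muP-style initialization and LR scaling.
set_base_shapes(model, base_model)
```
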

FullGraphMultiTaskNetwork

Bases: Module, MupMixin

in_dim: int property

Returns the input dimension of the network

in_dim_edges: int property

Returns the input edge dimension of the network

out_dim: int property

Returns the output dimension of the network

out_dim_edges: int property

Returns the output dimension of the edges of the network.

__init__(gnn_kwargs, pre_nn_kwargs=None, pre_nn_edges_kwargs=None, pe_encoders_kwargs=None, task_heads_kwargs=None, graph_output_nn_kwargs=None, accelerator_kwargs=None, num_inference_to_average=1, last_layer_is_readout=False, name='FullGNN')

Class implementing a full graph neural network architecture, including the pre-processing MLPs and the post-processing MLPs.

Parameters:

gnn_kwargs:
    keyword arguments to use for the initialization of the main GNN
    network, using the class `FeedForwardGraph`.
    It must respect the following criteria:

    - gnn_kwargs["in_dim"] must be equal to pre_nn_kwargs["out_dim"]
    - gnn_kwargs["out_dim"] must be equal to graph_output_nn_kwargs["in_dim"]

    (a small sketch illustrating these constraints follows this parameter list)

pe_encoders_kwargs:
    keyword arguments to use for the initialization of all positional encoding encoders.
    See the class `EncoderManager` for more details.

pre_nn_kwargs:
    keyword arguments to use for the initialization of the pre-processing
    MLP network of the node features before the GNN, using the class `FeedForwardNN`.
    If `None`, there won't be a pre-processing MLP.

pre_nn_edges_kwargs:
    keyword arguments to use for the initialization of the pre-processing
    MLP network of the edge features before the GNN, using the class `FeedForwardNN`.
    If `None`, there won't be a pre-processing MLP.

task_heads_kwargs:
    This argument is a list of dictionaries containing the arguments for the task heads. Each dictionary is used to
    initialize a task-specific MLP.

graph_output_nn_kwargs:
    This argument is a list of dictionaries corresponding to the arguments for a FeedForwardNN.
    Each dict of arguments is used to initialize a shared MLP.

accelerator_kwargs:
    keyword arguments specific to the accelerator being used,
    e.g. pipeline split points

num_inference_to_average:
    Number of inferences to average at val/test time. This is used to avoid the noise introduced
    by positional encodings with sign-flips. In case no such encoding is given,
    this parameter is ignored.
    NOTE: The inference time will be slowed down proportionally to this parameter.

last_layer_is_readout: Whether the last layer should be treated as a readout layer.
    Allows using the `mup.MuReadout` from the muTransfer method (https://github.com/microsoft/mup)

name:
    Name attributed to the current network, for display and printing
    purposes.
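
To illustrate the dimension-matching constraints listed under `gnn_kwargs` (the values are illustrative only; real configurations carry many more options, and the output-NN and task-head dictionaries follow the library's config schema):

```python
pre_nn_kwargs = {"in_dim": 64, "out_dim": 128, "hidden_dims": [128, 128]}

gnn_kwargs = {
    "in_dim": 128,            # must equal pre_nn_kwargs["out_dim"]
    "out_dim": 96,            # must equal graph_output_nn_kwargs["in_dim"]
    "hidden_dims": [96, 96],
    "layer_type": "pyg:gin",
}
```
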
__repr__()

Controls how the class is printed

create_module_map(level='layers')

Create a mapping from each (sub)module name to its corresponding `nn.ModuleList()` (when possible). Used for finetuning when (partially) loading or freezing specific modules of the pretrained model.

Parameters:

level: `Union[Literal['layers'], Literal['module']]`
    Whether to map to the module object or to the layers of the module object. Defaults to `'layers'`.
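
A hedged sketch of how the map can be used to freeze everything but the task heads during finetuning (the module names are assumptions):

```python
module_map = network.create_module_map(level="module")

for name, module in module_map.items():
    if not name.startswith("task_heads"):       # illustrative naming convention
        for param in module.parameters():
            param.requires_grad = False
```
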
forward(g)

Apply the pre-processing neural network, the graph neural network, and the post-processing neural network on the graph features.

Parameters:

g:
    pyg Batch graph on which the convolution is done.
    Must contain the following elements:

    - Node key `"feat"`: `torch.Tensor[..., N, Din]`.
      Input node feature tensor, before the network.
      `N` is the number of nodes, `Din` is the input features dimension ``self.pre_nn.in_dim``

    - Edge key `"edge_feat"`: `torch.Tensor[..., N, Ein]` **Optional**.
      The edge features to use. It will be ignored if the
      model doesn't support edge features or if
      `self.in_dim_edges==0`.

    - Other keys related to positional encodings `"pos_enc_feats_sign_flip"`,
      `"pos_enc_feats_no_flip"`.

Returns:

`torch.Tensor[..., M, Dout]` or `torch.Tensor[..., N, Dout]`:
    Node or graph feature tensor, after the network.
    `N` is the number of nodes, `M` is the number of graphs,
    `Dout` is the output dimension ``self.graph_output_nn.out_dim``
    If the `self.gnn.pooling` is [`None`], then it returns node features and the output dimension is `N`,
    otherwise it returns graph features and the output dimension is `M`
make_mup_base_kwargs(divide_factor=2.0)

Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layer widths divided by a given factor (2 by default).

Parameters:

divide_factor:
    Factor by which to divide the width.

Returns:

`Dict[str, Any]`:
    Dictionary with the kwargs to create the base model.

set_max_num_nodes_edges_per_graph(max_nodes, max_edges)

Set the maximum number of nodes and edges for all gnn layers and encoder layers

Parameters:

max_nodes: `Optional[int]`
    Maximum number of nodes in the dataset. This will be useful for certain architectures, but ignored by others.

max_edges: `Optional[int]`
    Maximum number of edges in the dataset. This will be useful for certain architectures, but ignored by others.

GraphOutputNN

Bases: Module, MupMixin

concat_last_layers: Optional[Iterable[int]] property writable

Property to control the output of `self.forward`. If set to a list of integers, the forward function will concatenate the outputs of different layers.

If set to None, the output of the last layer is returned.

NOTE: The indexes are inverted. 0 is the last layer, 1 is the second last, etc.
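
For example (a hedged sketch; `graph_output_nn` stands for an instance of this class and `g` for a pyg Batch):

```python
# Concatenate the outputs of the last and second-to-last layers along the feature axis.
graph_output_nn.concat_last_layers = [0, 1]
h = graph_output_nn(g)   # feature dimension grows accordingly
```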

out_dim: int property

Returns the output dimension of the network

__init__(in_dim, in_dim_edges, task_level, graph_output_nn_kwargs)

Parameters:

in_dim: `int`
    Input feature dimensions of the layer

in_dim_edges: `int`
    Input edge feature dimensions of the layer

task_level: `str`
    "graph", "node", "edge", or "nodepair", depending on whether it is a graph-, node-, edge-, or nodepair-level task

graph_output_nn_kwargs: `Dict[str, Any]`
    keyword arguments to use for the initialization of the post-processing MLP network after the GNN, using the class `FeedForwardNN`.
compute_nodepairs(node_feats, batch, max_num_nodes=None, fill_value=float('nan'), batch_size=None, drop_nodes_last_graph=False)

Vectorized implementation of the nodepair-level task.

Parameters:

node_feats:
    Node features

batch:
    Batch vector

max_num_nodes:
    The maximum number of nodes per graph

fill_value:
    The value for invalid entries in the resulting dense output tensor. (default: `NaN`)

batch_size:
    The batch size. (default: `None`)

drop_nodes_last_graph:
    Whether to drop the nodes of the last graphs that exceed the `max_num_nodes_per_graph`. Useful when the last graph is a padding.

Returns:

result:
    Concatenated node features of shape `[B, max_num_nodes, 2 * h]`, where `B` is the number of graphs, `max_num_nodes` is the chosen maximum number of nodes, and `h` is the feature dimension.
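
A shape-only sketch (the variable names are illustrative):

```python
# node_feats: [total_nodes, h] node features; g.batch: [total_nodes] graph index
node_pairs = graph_output_nn.compute_nodepairs(
    node_feats=node_feats,
    batch=g.batch,
    max_num_nodes=32,
)
# node_pairs: [B, 32, 2 * h], padded with NaN where a graph has fewer than 32 nodes
```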

drop_graph_output_nn_layers(num_layers_to_drop)

Remove the last layers of the model. Useful for Transfer Learning.

Parameters:

num_layers_to_drop:
    The number of layers to drop from the `self.graph_output_nn` network.

extend_graph_output_nn_layers(layers)

Add layers at the end of the model. Useful for Transfer Learning.

Parameters:

layers:
    A ModuleList of all the layers to extend

forward(g)

Parameters:

g: `Batch`
    pyg Batch graph

Returns:

h:
    Output features after applying the `graph_output_nn`

make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)

Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layer widths divided by a given factor (2 by default).

Parameters:

divide_factor:
    Factor by which to divide the width.

factor_in_dim:
    Whether to factor the input dimension

Returns:

`Dict[str, Any]`:
    Dictionary with the kwargs to create the base model.

set_max_num_nodes_edges_per_graph(max_nodes, max_edges)

Set the maximum number of nodes and edges for all gnn layers and encoder layers

Parameters:

max_nodes: `Optional[int]`
    Maximum number of nodes in the dataset. This will be useful for certain architectures, but ignored by others.

max_edges: `Optional[int]`
    Maximum number of edges in the dataset. This will be useful for certain architectures, but ignored by others.

TaskHeads

Bases: Module, MupMixin

out_dim: Dict[str, int] property

Returns the output dimension of each task head

__init__(in_dim, in_dim_edges, task_heads_kwargs, graph_output_nn_kwargs, last_layer_is_readout=True)

Class that groups all multi-task output heads together to provide the task-specific outputs.

Parameters:

in_dim:
    Input feature dimensions of the layer

in_dim_edges:
    Input edge feature dimensions of the layer
last_layer_is_readout: Whether the last layer should be treated as a readout layer.
    Allows using the `mup.MuReadout` from the muTransfer method
task_heads_kwargs:
    This argument is a list of dictionaries corresponding to the arguments for a FeedForwardNN.
    Each dict of arguments is used to initialize a task-specific MLP.
graph_output_nn_kwargs:
    keyword arguments to use for the initialization of the post-processing
    MLP network after the GNN, using the class `FeedForwardNN`.
__repr__()

Returns a string representation of the task heads

forward(g)

Forward function of the task heads.

Parameters:

g:
    pyg Batch graph

Returns:

task_head_outputs:
    A dictionary `Dict[task_name, Tensor]` with the output of each task head.

make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)

Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layer widths divided by a given factor (2 by default).

Parameters:

divide_factor:
    Factor by which to divide the width.

factor_in_dim:
    Whether to factor the input dimension

Returns:

kwargs: `Dict[str, Any]`
    Dictionary of arguments to be used to initialize the base model

set_max_num_nodes_edges_per_graph(max_nodes, max_edges)

Set the maximum number of nodes and edges for all gnn layers and encoder layers

Parameters:

max_nodes: `Optional[int]`
    Maximum number of nodes in the dataset. This will be useful for certain architectures, but ignored by others.

max_edges: `Optional[int]`
    Maximum number of edges in the dataset. This will be useful for certain architectures, but ignored by others.

PyG Architectures


graphium.nn.architectures.pyg_architectures




FeedForwardPyg

Encoder Manager


graphium.nn.architectures.encoder_manager




EncoderManager

Bases: Module

in_dims: Iterable[int] property

Returns the input dimensions for all pe-encoders

Returns:

in_dims: `Iterable[int]`
    the input dimensions for all pe-encoders

input_keys: Iterable[str] property

Returns the input keys for all pe-encoders

Returns:

input_keys: `Iterable[str]`
    the input keys for all pe-encoders

out_dim: int property

Returns the output dimension of the pooled embedding from all the pe encoders

Returns:

out_dim: `int`
    the output dimension of the pooled embedding from all the pe encoders

__init__(pe_encoders_kwargs=None, max_num_nodes_per_graph=None, name='encoder_manager')

Class that runs multiple encoders in parallel and concatenates / pools their outputs.

Parameters:

pe_encoders_kwargs:
    keyword arguments to use for the initialization of all positional encoding encoders.
    The available encoders are listed in `PE_ENCODERS_DICT`: "la_encoder" (tested), "mlp_encoder" (not tested), "signnet_encoder" (not tested).

name:
    Name attributed to the current network, for display and printing
    purposes.
forward(g)

Forward pass of the pe encoders and pooling.

Parameters:

g: `Batch`
    pyg Batch on which the convolution is done. Must contain the following elements:

    - Node key `"feat"`: `torch.Tensor[..., N, Din]`.
      Input node feature tensor, before the network.
      `N` is the number of nodes, `Din` is the input features dimension `self.pre_nn.in_dim`

    - Edge key `"edge_feat"`: `torch.Tensor[..., N, Ein]` **Optional**.
      The edge features to use. It will be ignored if the
      model doesn't support edge features or if
      `self.in_dim_edges==0`.

    - Other keys related to positional encodings `"pos_enc_feats_sign_flip"`,
      `"pos_enc_feats_no_flip"`.

Returns:

g: `Batch`
    pyg Batch with the positional encodings added to the graph

forward_positional_encoding(g)

Forward pass for the positional encodings (PE), with each PE having its own encoder defined in `self.pe_encoders`. All the positional encodings with the same keys are pooled together using `self.pe_pooling`.

Parameters:

g: `Batch`
    pyg Batch containing the node positional encodings

Returns:

pe_node_pooled: `Dict[str, Tensor]`
    The positional / structural encodings after going through their encoders, pooled together according to their keys.

forward_simple_pooling(h, pooling, dim)

Apply sum, mean, or max pooling on a Tensor.

Parameters:

h:
    the Tensor to pool

pooling:
    string specifying the pooling method ("sum", "mean", or "max")

dim:
    the dimension to pool over

Returns:

pooled: `Tensor`
    the pooled Tensor
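
In spirit (a hedged sketch; `encoder_manager` stands for an instance of this class):

```python
pooled = encoder_manager.forward_simple_pooling(h, pooling="mean", dim=0)
# comparable to h.mean(dim=0); "sum" and "max" behave analogously
```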

make_mup_base_kwargs(divide_factor=2.0)

Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layer widths divided by a given factor (2 by default).

Parameters:

divide_factor:
    Factor by which to divide the width.

Returns:

pe_kw: `Dict[str, Any]`
    the model kwargs where the dimensions are divided by the factor