
graphium.nn.architectures

High level architectures in the library

Global Architectures


graphium.nn.architectures.global_architectures


Copyright (c) 2023 Valence Labs, Recursion Pharmaceuticals and Graphcore Limited.

Use of this software is subject to the terms and conditions outlined in the LICENSE file. Unauthorized modification, distribution, or use is prohibited. Provided 'as is' without warranties of any kind.

Valence Labs, Recursion Pharmaceuticals and Graphcore Limited are not liable for any damages arising from its use. Refer to the LICENSE file for the full terms and conditions.


EnsembleFeedForwardNN

Bases: FeedForwardNN

__init__(in_dim, out_dim, hidden_dims, num_ensemble, reduction, subset_in_dim=1.0, depth=None, activation='relu', last_activation='none', dropout=0.0, last_dropout=0.0, normalization='none', first_normalization='none', last_normalization='none', residual_type='none', residual_skip_steps=1, name='LNN', layer_type='ens-fc', layer_kwargs=None, last_layer_is_readout=False)

An ensemble of flexible neural networks, with variable hidden dimensions, support for multiple layer types, and support for different residual connections.

Parameters:

in_dim:
    Input feature dimensions of the layer

out_dim:
    Output feature dimensions of the layer

hidden_dims:
    Either an integer specifying all the hidden dimensions,
    or a list of dimensions in the hidden layers.
    Be careful, the "simple" residual type only supports
    hidden dimensions of the same value.

num_ensemble:
    Number of MLPs that run in parallel.

reduction:
    Reduction to use at the end of the MLP. Choices:

    - "none" or `None`: No reduction
    - "mean": Mean reduction
    - "sum": Sum reduction
    - "max": Max reduction
    - "min": Min reduction
    - "median": Median reduction
    - `Callable`: Any callable function. Must take `dim` as a keyword argument.

subset_in_dim:
    If float, ratio of the subset of the ensemble to use. Must be between 0 and 1.
    If int, number of elements to subset from in_dim.
    If `None`, the subset_in_dim is set to `1.0`.
    A different subset is used for each ensemble.
    Only valid if the input shape is `[B, Din]`.

depth:
    If `hidden_dims` is an integer, `depth` is 1 + the number of
    hidden layers to use.
    If `hidden_dims` is a list, then
    `depth` must be `None` or equal to `len(hidden_dims) + 1`

activation:
    activation function to use in the hidden layers.

last_activation:
    activation function to use in the last layer.

dropout:
    The ratio of units to dropout. Must be between 0 and 1

last_dropout:
    The ratio of units to dropout for the last_layer. Must be between 0 and 1

normalization:
    Normalization to use. Choices:

    - "none" or `None`: No normalization
    - "batch_norm": Batch normalization
    - "layer_norm": Layer normalization
    - `Callable`: Any callable function

first_normalization:
    Whether to use batch normalization **before** the first layer

last_normalization:
    Whether to use batch normalization in the last layer

residual_type:
    - "none": No residual connection
    - "simple": Residual connection similar to the ResNet architecture.
      See class `ResidualConnectionSimple`
    - "weighted": Residual connection similar to the Resnet architecture,
      but with weights applied before the summation. See class `ResidualConnectionWeighted`
    - "concat": Residual connection where the residual is concatenated instead
      of being added.
    - "densenet": Residual connection where the residual of all previous layers
      are concatenated. This leads to a strong increase in the number of parameters
      if there are multiple hidden layers.

residual_skip_steps:
    The number of steps to skip between each residual connection.
    If `1`, all the layers are connected. If `2`, half of the
    layers are connected.

name:
    Name attributed to the current network, for display and printing
    purposes.

layer_type:
    The type of layers to use in the network.
    Either "ens-fc" as the `EnsembleFCLayer`, or a class representing the `nn.Module`
    to use.

layer_kwargs:
    The arguments to be used in the initialization of the layer provided by `layer_type`

last_layer_is_readout: Whether the last layer should be treated as a readout layer.
    Allows using the `mup.MuReadout` from the muTransfer method (https://github.com/microsoft/mup)
__repr__()

Controls how the class is printed

forward(h)

Subset the input dimensions for each MLP, forward the ensemble of MLPs on the input features, then reduce the output if a reduction is specified.

Parameters:

h: `torch.Tensor[B, Din]` or `torch.Tensor[..., 1, B, Din]` or `torch.Tensor[..., L, B, Din]`:

    Input feature tensor, before the MLP.
    `Din` is the number of input features, `B` is the batch size, and `L` is the number of ensembles.

Returns:

`torch.Tensor[..., L, B, Dout]` or `torch.Tensor[..., B, Dout]`:

    Output feature tensor, after the MLP.
    `Dout` is the number of output features, `B` is the batch size, and `L` is the number of ensembles.
    `L` is removed if a reduction is specified.
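
As a rough usage sketch (the import path and argument values below are assumptions based on this page, not taken from the library's examples):

```python
import torch
from graphium.nn.architectures import EnsembleFeedForwardNN

# 3 MLPs run in parallel on the same [B, Din] input; "mean" averages their
# outputs, which removes the ensemble axis L from the result.
ensemble_mlp = EnsembleFeedForwardNN(
    in_dim=16,
    out_dim=4,
    hidden_dims=[32, 32],   # two hidden layers of width 32
    num_ensemble=3,
    reduction="mean",
    subset_in_dim=0.5,      # each MLP sees a different half of the 16 input features
)

h = torch.randn(8, 16)      # [B, Din]
out = ensemble_mlp(h)       # [B, Dout] == [8, 4] after the mean reduction
```

A callable reduction is also possible as long as it accepts `dim` as a keyword argument, for example `reduction=lambda x, dim: torch.logsumexp(x, dim=dim)`.
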
get_init_kwargs()

Get a dictionary that can be used to instantiate a new object with identical parameters.

FeedForwardGraph

Bases: FeedForwardNN

__init__(in_dim, out_dim, hidden_dims, layer_type, depth=None, activation='relu', last_activation='none', dropout=0.0, last_dropout=0.0, normalization='none', first_normalization='none', last_normalization='none', residual_type='none', residual_skip_steps=1, in_dim_edges=0, hidden_dims_edges=[], out_dim_edges=None, name='GNN', layer_kwargs=None, virtual_node='none', use_virtual_edges=False, last_layer_is_readout=False)

A flexible neural network architecture, with variable hidden dimensions, support for multiple layer types, and support for different residual connections.

This class is meant to work with different graph neural networks layers. Any layer must inherit from graphium.nn.base_graph_layer.BaseGraphStructure or graphium.nn.base_graph_layer.BaseGraphLayer.

Parameters:

in_dim:
    Input feature dimensions of the layer

out_dim:
    Output feature dimensions of the layer

hidden_dims:
    List of dimensions in the hidden layers.
    Be careful, the "simple" residual type only supports
    hidden dimensions of the same value.

layer_type:
    The type of layers to use in the network.
    A class that inherits from `graphium.nn.base_graph_layer.BaseGraphStructure`,
    or one of the following strings:

    - "pyg:gin": GINConvPyg
    - "pyg:gine": GINEConvPyg
    - "pyg:gated-gcn": GatedGCNPyg
    - "pyg:pna-msgpass": PNAMessagePassingPyg

depth:
    If `hidden_dims` is an integer, `depth` is 1 + the number of
    hidden layers to use. If `hidden_dims` is a `list`, `depth` must
    be `None`.

activation:
    activation function to use in the hidden layers.

last_activation:
    activation function to use in the last layer.

dropout:
    The ratio of units to dropout. Must be between 0 and 1

last_dropout:
    The ratio of units to dropout for the last layer. Must be between 0 and 1

normalization:
    Normalization to use. Choices:

    - "none" or `None`: No normalization
    - "batch_norm": Batch normalization
    - "layer_norm": Layer normalization
    - `Callable`: Any callable function

first_normalization:
    Whether to use batch normalization **before** the first layer

last_normalization:
    Whether to use batch normalization in the last layer

residual_type:
    - "none": No residual connection
    - "simple": Residual connection similar to the ResNet architecture.
      See class `ResidualConnectionSimple`
    - "weighted": Residual connection similar to the Resnet architecture,
      but with weights applied before the summation. See class `ResidualConnectionWeighted`
    - "concat": Residual connection where the residual is concatenated instead
      of being added.
    - "densenet": Residual connection where the residual of all previous layers
      are concatenated. This leads to a strong increase in the number of parameters
      if there are multiple hidden layers.

residual_skip_steps:
    The number of steps to skip between each residual connection.
    If `1`, all the layers are connected. If `2`, half of the
    layers are connected.

in_dim_edges:
    Input edge-feature dimensions of the network. Keep at 0 if not using
    edge features, or if the layer doesn't support edges.

hidden_dims_edges:
    Hidden dimensions for the edges. Most models don't support it, so it
    should only be used for those that do, e.g. `GatedGCNLayer`

out_dim_edges:
    Output edge-feature dimensions of the network. Keep at 0 if not using
    edge features, or if the layer doesn't support edges. Defaults to the
    last value of hidden_dims_edges.

name:
    Name attributed to the current network, for display and printing
    purposes.

layer_kwargs:
    The arguments to be used in the initialization of the layer provided by `layer_type`

virtual_node:
    A string associated to the type of virtual node to use,
    either `None`, "none", "mean", "sum", "max", "logsum".
    See `graphium.nn.pooling_pyg.VirtualNode`.

    The virtual node will not use any residual connection if `residual_type`
    is "none". Otherwise, it will use a simple ResNet like residual
    connection.

use_virtual_edges:
    Whether the virtual node should use the edge features.

last_layer_is_readout: Whether the last layer should be treated as a readout layer.
    Allows using the `mup.MuReadout` from the muTransfer method (https://github.com/microsoft/mup)
__repr__()

Controls how the class is printed

forward(g)

Apply the full graph neural network on the input graph and node features.

Parameters:

g:
    pyg Batch graph on which the convolution is done with the keys:

    - `"feat"`: torch.Tensor[..., N, Din]
      Node feature tensor, before convolution.
      `N` is the number of nodes, `Din` is the input features

    - `"edge_feat"` (torch.Tensor[..., N, Ein]):
      Edge feature tensor, before convolution.
      `N` is the number of nodes, `Ein` is the input edge features

Returns:

`torch.Tensor[..., M, Dout]` or `torch.Tensor[..., N, Dout]`:
    Node or graph feature tensor, after the network.
    `N` is the number of nodes, `M` is the number of graphs,
    `Dout` is the output dimension ``self.out_dim``
    If the `self.pooling` is [`None`], then it returns node features and the output dimension is `N`,
    otherwise it returns graph features and the output dimension is `M`
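
A minimal forward-pass sketch (the import path, the hand-built `Batch`, and the argument values are assumptions; in practice graphs usually come from graphium's featurization pipeline):

```python
import torch
from torch_geometric.data import Data, Batch
from graphium.nn.architectures import FeedForwardGraph

# A small GIN-based graph network on 16-dim node features.
gnn = FeedForwardGraph(
    in_dim=16,
    out_dim=8,
    hidden_dims=[32, 32],
    layer_type="pyg:gin",
)

# Minimal pyg Batch carrying the "feat" key expected by forward().
data = Data(
    feat=torch.randn(5, 16),
    edge_index=torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]]),
)
g = Batch.from_data_list([data])
out = gnn(g)   # node features [..., N, Dout] when no pooling is configured
```
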
get_init_kwargs()

Get a dictionary that can be used to instantiate a new object with identical parameters.

get_nested_key(d, target_key)

Get the value associated with a key in a nested dictionary.

Parameters:

d:
    The dictionary to search in

target_key:
    The key to search for

Returns:

The value associated with the key if found, `None` otherwise
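
A minimal sketch of the intended behavior (assuming, as the description suggests, a recursive search for the first matching key; the example dictionary is illustrative):

```python
from graphium.nn.architectures.global_architectures import get_nested_key

config = {"gnn": {"layer_kwargs": {"num_heads": 4}}}

print(get_nested_key(config, "num_heads"))   # 4
print(get_nested_key(config, "missing"))     # None
```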

make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)

Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layer widths divided by a given factor (2 by default).

Parameters:

divide_factor:
    Factor by which to divide the width.

factor_in_dim:
    Whether to factor the input dimension for the nodes

Returns:

kwargs: `Dict[str, Any]`
    Dictionary of parameters to be used to instantiate the base model, with the widths divided by the factor

FeedForwardNN

Bases: Module, MupMixin

cache_readouts: bool property

Whether the readout cache is enabled

__init__(in_dim, out_dim, hidden_dims, depth=None, activation='relu', last_activation='none', dropout=0.0, last_dropout=0.0, normalization='none', first_normalization='none', last_normalization='none', residual_type='none', residual_skip_steps=1, name='LNN', layer_type='fc', layer_kwargs=None, last_layer_is_readout=False)

A flexible neural network architecture, with variable hidden dimensions, support for multiple layer types, and support for different residual connections.

Parameters:

in_dim:
    Input feature dimensions of the layer

out_dim:
    Output feature dimensions of the layer

hidden_dims:
    Either an integer specifying all the hidden dimensions,
    or a list of dimensions in the hidden layers.
    Be careful, the "simple" residual type only supports
    hidden dimensions of the same value.

depth:
    If `hidden_dims` is an integer, `depth` is 1 + the number of
    hidden layers to use.
    If `hidden_dims` is a list, then
    `depth` must be `None` or equal to `len(hidden_dims) + 1`

activation:
    activation function to use in the hidden layers.

last_activation:
    activation function to use in the last layer.

dropout:
    The ratio of units to dropout. Must be between 0 and 1

last_dropout:
    The ratio of units to dropout for the last_layer. Must be between 0 and 1

normalization:
    Normalization to use. Choices:

    - "none" or `None`: No normalization
    - "batch_norm": Batch normalization
    - "layer_norm": Layer normalization
    - `Callable`: Any callable function

first_normalization:
    Whether to use batch normalization **before** the first layer

last_normalization:
    Whether to use batch normalization in the last layer

residual_type:
    - "none": No residual connection
    - "simple": Residual connection similar to the ResNet architecture.
      See class `ResidualConnectionSimple`
    - "weighted": Residual connection similar to the Resnet architecture,
      but with weights applied before the summation. See class `ResidualConnectionWeighted`
    - "concat": Residual connection where the residual is concatenated instead
      of being added.
    - "densenet": Residual connection where the residual of all previous layers
      are concatenated. This leads to a strong increase in the number of parameters
      if there are multiple hidden layers.

residual_skip_steps:
    The number of steps to skip between each residual connection.
    If `1`, all the layers are connected. If `2`, half of the
    layers are connected.

name:
    Name attributed to the current network, for display and printing
    purposes.

layer_type:
    The type of layers to use in the network.
    Either "fc" as the `FCLayer`, or a class representing the `nn.Module`
    to use.

layer_kwargs:
    The arguments to be used in the initialization of the layer provided by `layer_type`

last_layer_is_readout: Whether the last layer should be treated as a readout layer.
    Allows using the `mup.MuReadout` from the muTransfer method (https://github.com/microsoft/mup)
__repr__()

Controls how the class is printed

add_layers(layers)

Add layers to the end of the model.

drop_layers(depth)

Remove the last layers of the model.

forward(h)

Apply the neural network on the input features.

Parameters:

h: `torch.Tensor[..., Din]`:
    Input feature tensor, before the network.
    `Din` is the number of input features

Returns:

`torch.Tensor[..., Dout]`:
    Output feature tensor, after the network.
    `Dout` is the number of output features
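
A quick usage sketch (the import path and argument values are assumptions based on this page):

```python
import torch
from graphium.nn.architectures import FeedForwardNN

# An MLP with two hidden layers of width 32 (depth = 1 + number of hidden layers
# when hidden_dims is an integer), layer norm, and simple residual connections.
mlp = FeedForwardNN(
    in_dim=16,
    out_dim=4,
    hidden_dims=32,
    depth=3,
    normalization="layer_norm",
    residual_type="simple",
)

h = torch.randn(8, 16)   # [..., Din]
out = mlp(h)             # [..., Dout] == [8, 4]
```
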
get_init_kwargs()

Get a dictionary that can be used to instantiate a new object with identical parameters.

make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)

Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layer widths divided by a given factor (2 by default).

Parameters:

divide_factor:
    Factor by which to divide the width.

factor_in_dim:
    Whether to factor the input dimension
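
A hedged sketch of how such base kwargs are typically consumed with the `mup` package (the surrounding training setup is omitted, and the exact workflow in graphium may differ):

```python
from mup import set_base_shapes

# Regular model, then a narrower "base" model built from the scaled-down kwargs.
model = FeedForwardNN(in_dim=16, out_dim=4, hidden_dims=[64, 64], last_layer_is_readout=True)
base_model = FeedForwardNN(**model.make_mup_base_kwargs(divide_factor=2.0))

# Registering the base shapes enables muP-style initialization and LR scaling.
set_base_shapes(model, base_model)
```
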

FullGraphMultiTaskNetwork

Bases: Module, MupMixin

in_dim: int property

Returns the input dimension of the network

in_dim_edges: int property

Returns the input edge dimension of the network

out_dim: int property

Returns the output dimension of the network

out_dim_edges: int property

Returns the output dimension of the edges of the network.

__init__(gnn_kwargs, pre_nn_kwargs=None, pre_nn_edges_kwargs=None, pe_encoders_kwargs=None, task_heads_kwargs=None, graph_output_nn_kwargs=None, accelerator_kwargs=None, num_inference_to_average=1, last_layer_is_readout=False, name='FullGNN')

Class implementing a full graph neural network architecture, including the pre-processing MLPs and the post-processing MLPs.

Parameters:

gnn_kwargs:
    keyword arguments to use for the initialization of the main GNN
    network, using the class `FeedForwardGraph`.
    It must respect the following criteria:

    - gnn_kwargs["in_dim"] must be equal to pre_nn_kwargs["out_dim"]
    - gnn_kwargs["out_dim"] must be equal to graph_output_nn_kwargs["in_dim"]

    (a small sketch illustrating these constraints follows this parameter list)

pe_encoders_kwargs:
    keyword arguments to use for the initialization of all positional encoding encoders.
    See the class `EncoderManager` for more details.

pre_nn_kwargs:
    keyword arguments to use for the initialization of the pre-processing
    MLP network of the node features before the GNN, using the class `FeedForwardNN`.
    If `None`, there won't be a pre-processing MLP.

pre_nn_edges_kwargs:
    keyword arguments to use for the initialization of the pre-processing
    MLP network of the edge features before the GNN, using the class `FeedForwardNN`.
    If `None`, there won't be a pre-processing MLP.

task_heads_kwargs:
    This argument is a list of dictionaries containing the arguments for the task heads. Each dictionary is used to
    initialize a task-specific MLP.

graph_output_nn_kwargs:
    This argument is a list of dictionaries corresponding to the arguments for a FeedForwardNN.
    Each dict of arguments is used to initialize a shared MLP.

accelerator_kwargs:
    keyword arguments specific to the accelerator being used,
    e.g. pipeline split points

num_inference_to_average:
    Number of inferences to average at val/test time. This is used to avoid the noise introduced
    by positional encodings with sign-flips. In case no such encoding is given,
    this parameter is ignored.
    NOTE: The inference time will be slowed down proportionally to this parameter.

last_layer_is_readout: Whether the last layer should be treated as a readout layer.
    Allows using the `mup.MuReadout` from the muTransfer method (https://github.com/microsoft/mup)

name:
    Name attributed to the current network, for display and printing
    purposes.
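
To illustrate the dimension-matching constraints listed under `gnn_kwargs` (the values are illustrative only; real configurations carry many more options, and the output-NN and task-head dictionaries follow the library's config schema):

```python
pre_nn_kwargs = {"in_dim": 64, "out_dim": 128, "hidden_dims": [128, 128]}

gnn_kwargs = {
    "in_dim": 128,            # must equal pre_nn_kwargs["out_dim"]
    "out_dim": 96,            # must equal graph_output_nn_kwargs["in_dim"]
    "hidden_dims": [96, 96],
    "layer_type": "pyg:gin",
}
```
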
__repr__()

Controls how the class is printed

create_module_map(level='layers')

Create a mapping from each (sub)module name to its corresponding `nn.ModuleList()` (when possible). Used for finetuning when (partially) loading or freezing specific modules of the pretrained model.

Parameters:

level: `Union[Literal['layers'], Literal['module']]`
    Whether to map to the module object or to the layers of the module object. Defaults to `'layers'`.
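
A hedged sketch of how the map can be used to freeze everything but the task heads during finetuning (the module names are assumptions):

```python
module_map = network.create_module_map(level="module")

for name, module in module_map.items():
    if not name.startswith("task_heads"):       # illustrative naming convention
        for param in module.parameters():
            param.requires_grad = False
```
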
forward(g)

Apply the pre-processing neural network, the graph neural network, and the post-processing neural network on the graph features.

Parameters:

g:
    pyg Batch graph on which the convolution is done.
    Must contain the following elements:

    - Node key `"feat"`: `torch.Tensor[..., N, Din]`.
      Input node feature tensor, before the network.
      `N` is the number of nodes, `Din` is the input features dimension ``self.pre_nn.in_dim``

    - Edge key `"edge_feat"`: `torch.Tensor[..., N, Ein]` **Optional**.
      The edge features to use. It will be ignored if the
      model doesn't support edge features or if
      `self.in_dim_edges==0`.

    - Other keys related to positional encodings `"pos_enc_feats_sign_flip"`,
      `"pos_enc_feats_no_flip"`.

Returns:

`torch.Tensor[..., M, Dout]` or `torch.Tensor[..., N, Dout]`:
    Node or graph feature tensor, after the network.
    `N` is the number of nodes, `M` is the number of graphs,
    `Dout` is the output dimension ``self.graph_output_nn.out_dim``
    If the `self.gnn.pooling` is [`None`], then it returns node features and the output dimension is `N`,
    otherwise it returns graph features and the output dimension is `M`
make_mup_base_kwargs(divide_factor=2.0)

Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layer widths divided by a given factor (2 by default).

Parameters:

divide_factor:
    Factor by which to divide the width.

Returns:

`Dict[str, Any]`:
    Dictionary with the kwargs to create the base model.

set_max_num_nodes_edges_per_graph(max_nodes, max_edges)

Set the maximum number of nodes and edges for all gnn layers and encoder layers

Parameters:

max_nodes: `Optional[int]`
    Maximum number of nodes in the dataset. This will be useful for certain architectures, but ignored by others.

max_edges: `Optional[int]`
    Maximum number of edges in the dataset. This will be useful for certain architectures, but ignored by others.

GraphOutputNN

Bases: Module, MupMixin

concat_last_layers: Optional[Iterable[int]] property writable

Property to control the output of `self.forward`. If set to a list of integers, the forward function will concatenate the outputs of different layers.

If set to None, the output of the last layer is returned.

NOTE: The indexes are inverted. 0 is the last layer, 1 is the second last, etc.
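
For example (a hedged sketch; `graph_output_nn` stands for an instance of this class and `g` for a pyg Batch):

```python
# Concatenate the outputs of the last and second-to-last layers along the feature axis.
graph_output_nn.concat_last_layers = [0, 1]
h = graph_output_nn(g)   # feature dimension grows accordingly
```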

out_dim: int property

Returns the output dimension of the network

__init__(in_dim, in_dim_edges, task_level, graph_output_nn_kwargs)

Parameters:

in_dim: `int`
    Input feature dimensions of the layer

in_dim_edges: `int`
    Input edge feature dimensions of the layer

task_level: `str`
    "graph", "node", "edge", or "nodepair", depending on whether it is a graph-, node-, edge-, or nodepair-level task

graph_output_nn_kwargs: `Dict[str, Any]`
    keyword arguments to use for the initialization of the post-processing MLP network after the GNN, using the class `FeedForwardNN`.
compute_nodepairs(node_feats, batch, max_num_nodes=None, fill_value=float('nan'), batch_size=None, drop_nodes_last_graph=False)

Vectorized implementation of the nodepair-level task.

Parameters:

node_feats:
    Node features

batch:
    Batch vector

max_num_nodes:
    The maximum number of nodes per graph

fill_value:
    The value for invalid entries in the resulting dense output tensor. (default: `NaN`)

batch_size:
    The batch size. (default: `None`)

drop_nodes_last_graph:
    Whether to drop the nodes of the last graphs that exceed the `max_num_nodes_per_graph`. Useful when the last graph is a padding.

Returns:

result:
    Concatenated node features of shape `[B, max_num_nodes, 2 * h]`, where `B` is the number of graphs, `max_num_nodes` is the chosen maximum number of nodes, and `h` is the feature dimension.
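
A shape-only sketch (the variable names are illustrative):

```python
# node_feats: [total_nodes, h] node features; g.batch: [total_nodes] graph index
node_pairs = graph_output_nn.compute_nodepairs(
    node_feats=node_feats,
    batch=g.batch,
    max_num_nodes=32,
)
# node_pairs: [B, 32, 2 * h], padded with NaN where a graph has fewer than 32 nodes
```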

drop_graph_output_nn_layers(num_layers_to_drop)

Remove the last layers of the model. Useful for Transfer Learning.

Parameters:

num_layers_to_drop:
    The number of layers to drop from the `self.graph_output_nn` network.

extend_graph_output_nn_layers(layers)

Add layers at the end of the model. Useful for Transfer Learning.

Parameters:

layers:
    A ModuleList of all the layers to extend

forward(g)

Parameters:

g: `Batch`
    pyg Batch graph

Returns:

h:
    Output features after applying the `graph_output_nn`

make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)

Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layer widths divided by a given factor (2 by default).

Parameters:

divide_factor:
    Factor by which to divide the width.

factor_in_dim:
    Whether to factor the input dimension

Returns:

`Dict[str, Any]`:
    Dictionary with the kwargs to create the base model.

set_max_num_nodes_edges_per_graph(max_nodes, max_edges)

Set the maximum number of nodes and edges for all gnn layers and encoder layers

Parameters:

max_nodes: `Optional[int]`
    Maximum number of nodes in the dataset. This will be useful for certain architectures, but ignored by others.

max_edges: `Optional[int]`
    Maximum number of edges in the dataset. This will be useful for certain architectures, but ignored by others.

TaskHeads

Bases: Module, MupMixin

out_dim: Dict[str, int] property

Returns the output dimension of each task head

__init__(in_dim, in_dim_edges, task_heads_kwargs, graph_output_nn_kwargs, last_layer_is_readout=True)

Class that groups all multi-task output heads together to provide the task-specific outputs.

Parameters:

in_dim:
    Input feature dimensions of the layer

in_dim_edges:
    Input edge feature dimensions of the layer
last_layer_is_readout: Whether the last layer should be treated as a readout layer.
    Allows using the `mup.MuReadout` from the muTransfer method
task_heads_kwargs:
    This argument is a list of dictionaries corresponding to the arguments for a FeedForwardNN.
    Each dict of arguments is used to initialize a task-specific MLP.
graph_output_nn_kwargs:
    keyword arguments to use for the initialization of the post-processing
    MLP network after the GNN, using the class `FeedForwardNN`.
__repr__()

Returns a string representation of the task heads

forward(g)

Forward function of the task heads.

Parameters:

g:
    pyg Batch graph

Returns:

task_head_outputs:
    A dictionary `Dict[task_name, Tensor]` with the output of each task head.

make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)

Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layer widths divided by a given factor (2 by default).

Parameters:

divide_factor:
    Factor by which to divide the width.

factor_in_dim:
    Whether to factor the input dimension

Returns:

kwargs: `Dict[str, Any]`
    Dictionary of arguments to be used to initialize the base model

set_max_num_nodes_edges_per_graph(max_nodes, max_edges)

Set the maximum number of nodes and edges for all gnn layers and encoder layers

Parameters:

max_nodes: `Optional[int]`
    Maximum number of nodes in the dataset. This will be useful for certain architectures, but ignored by others.

max_edges: `Optional[int]`
    Maximum number of edges in the dataset. This will be useful for certain architectures, but ignored by others.

PyG Architectures


graphium.nn.architectures.pyg_architectures




FeedForwardPyg

Encoder Manager


graphium.nn.architectures.encoder_manager




EncoderManager

Bases: Module

in_dims: Iterable[int] property

Returns the input dimensions for all pe-encoders

Returns:

in_dims: `Iterable[int]`
    the input dimensions for all pe-encoders

input_keys: Iterable[str] property

Returns the input keys for all pe-encoders

Returns:

input_keys: `Iterable[str]`
    the input keys for all pe-encoders

out_dim: int property

Returns the output dimension of the pooled embedding from all the pe encoders

Returns:

out_dim: `int`
    the output dimension of the pooled embedding from all the pe encoders

__init__(pe_encoders_kwargs=None, max_num_nodes_per_graph=None, name='encoder_manager')

Class that runs multiple encoders in parallel and concatenates / pools their outputs.

Parameters:

pe_encoders_kwargs:
    keyword arguments to use for the initialization of all positional encoding encoders.
    The available encoders are listed in `PE_ENCODERS_DICT`: "la_encoder" (tested), "mlp_encoder" (not tested), "signnet_encoder" (not tested).

name:
    Name attributed to the current network, for display and printing
    purposes.
forward(g)

Forward pass of the pe encoders and pooling.

Parameters:

g: `Batch`
    pyg Batch on which the convolution is done. Must contain the following elements:

    - Node key `"feat"`: `torch.Tensor[..., N, Din]`.
      Input node feature tensor, before the network.
      `N` is the number of nodes, `Din` is the input features dimension `self.pre_nn.in_dim`

    - Edge key `"edge_feat"`: `torch.Tensor[..., N, Ein]` **Optional**.
      The edge features to use. It will be ignored if the
      model doesn't support edge features or if
      `self.in_dim_edges==0`.

    - Other keys related to positional encodings `"pos_enc_feats_sign_flip"`,
      `"pos_enc_feats_no_flip"`.

Returns:

g: `Batch`
    pyg Batch with the positional encodings added to the graph

forward_positional_encoding(g)

Forward pass for the positional encodings (PE), with each PE having its own encoder defined in `self.pe_encoders`. All the positional encodings with the same keys are pooled together using `self.pe_pooling`.

Parameters:

g: `Batch`
    pyg Batch containing the node positional encodings

Returns:

pe_node_pooled: `Dict[str, Tensor]`
    The positional / structural encodings after going through their encoders, pooled together according to their keys.

forward_simple_pooling(h, pooling, dim)

Apply sum, mean, or max pooling on a Tensor.

Parameters:

h:
    the Tensor to pool

pooling:
    string specifying the pooling method ("sum", "mean", or "max")

dim:
    the dimension to pool over

Returns:

pooled: `Tensor`
    the pooled Tensor
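
In spirit (a hedged sketch; `encoder_manager` stands for an instance of this class):

```python
pooled = encoder_manager.forward_simple_pooling(h, pooling="mean", dim=0)
# comparable to h.mean(dim=0); "sum" and "max" behave analogously
```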

make_mup_base_kwargs(divide_factor=2.0)

Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layer widths divided by a given factor (2 by default).

Parameters:

divide_factor:
    Factor by which to divide the width.

Returns:

pe_kw: `Dict[str, Any]`
    the model kwargs where the dimensions are divided by the factor