graphium.nn.architectures¶
High-level architectures in the library
Global Architectures¶
graphium.nn.architectures.global_architectures
¶
Copyright (c) 2023 Valence Labs, Recursion Pharmaceuticals and Graphcore Limited.
Use of this software is subject to the terms and conditions outlined in the LICENSE file. Unauthorized modification, distribution, or use is prohibited. Provided 'as is' without warranties of any kind.
Valence Labs, Recursion Pharmaceuticals and Graphcore Limited are not liable for any damages arising from its use. Refer to the LICENSE file for the full terms and conditions.
EnsembleFeedForwardNN
¶
Bases: FeedForwardNN
__init__(in_dim, out_dim, hidden_dims, num_ensemble, reduction, subset_in_dim=1.0, depth=None, activation='relu', last_activation='none', dropout=0.0, last_dropout=0.0, normalization='none', first_normalization='none', last_normalization='none', residual_type='none', residual_skip_steps=1, name='LNN', layer_type='ens-fc', layer_kwargs=None, last_layer_is_readout=False)
¶
An ensemble of flexible neural network architectures, with variable hidden dimensions, support for multiple layer types, and support for different residual connections.
Parameters:
in_dim:
Input feature dimensions of the layer
out_dim:
Output feature dimensions of the layer
hidden_dims:
Either an integer specifying all the hidden dimensions,
or a list of dimensions in the hidden layers.
Be careful, the "simple" residual type only supports
hidden dimensions of the same value.
num_ensemble:
Number of MLPs that run in parallel.
reduction:
Reduction to use at the end of the MLP. Choices:
- "none" or `None`: No reduction
- "mean": Mean reduction
- "sum": Sum reduction
- "max": Max reduction
- "min": Min reduction
- "median": Median reduction
- `Callable`: Any callable function. Must take `dim` as a keyword argument.
subset_in_dim:
If float, ratio of the subset of the ensemble to use. Must be between 0 and 1.
If int, number of elements to subset from in_dim.
If `None`, the subset_in_dim is set to `1.0`.
A different subset is used for each ensemble.
Only valid if the input shape is `[B, Din]`.
depth:
If `hidden_dims` is an integer, `depth` is 1 + the number of
hidden layers to use.
If `hidden_dims` is a list, then
`depth` must be `None` or equal to `len(hidden_dims) + 1`
activation:
activation function to use in the hidden layers.
last_activation:
activation function to use in the last layer.
dropout:
The ratio of units to dropout. Must be between 0 and 1
last_dropout:
The ratio of units to dropout for the last_layer. Must be between 0 and 1
normalization:
Normalization to use. Choices:
- "none" or `None`: No normalization
- "batch_norm": Batch normalization
- "layer_norm": Layer normalization
- `Callable`: Any callable function
first_normalization:
Whether to use batch normalization **before** the first layer
last_normalization:
Whether to use batch normalization in the last layer
residual_type:
- "none": No residual connection
- "simple": Residual connection similar to the ResNet architecture.
See class `ResidualConnectionSimple`
- "weighted": Residual connection similar to the Resnet architecture,
but with weights applied before the summation. See class `ResidualConnectionWeighted`
- "concat": Residual connection where the residual is concatenated instead
of being added.
- "densenet": Residual connection where the residual of all previous layers
are concatenated. This leads to a strong increase in the number of parameters
if there are multiple hidden layers.
residual_skip_steps:
The number of steps to skip between each residual connection.
If `1`, all the layers are connected. If `2`, half of the
layers are connected.
name:
Name attributed to the current network, for display and printing
purposes.
layer_type:
The type of layers to use in the network.
Either "ens-fc" as the `EnsembleFCLayer`, or a class representing the `nn.Module`
to use.
layer_kwargs:
The arguments to be used in the initialization of the layer provided by `layer_type`
last_layer_is_readout: Whether the last layer should be treated as a readout layer.
Allows using the `mup.MuReadout` from the muTransfer method: https://github.com/microsoft/mup
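For illustration, here is a minimal, hedged sketch of constructing such an ensemble. The import path and the constructor arguments follow the documentation above; the dimensions and the choice of `reduction` are arbitrary, and whether this runs as-is depends on your graphium installation.

```python
from graphium.nn.architectures.global_architectures import EnsembleFeedForwardNN

# Sketch: an ensemble of 4 parallel MLPs, each mapping 16 -> 32 -> 32 -> 8.
ens_mlp = EnsembleFeedForwardNN(
    in_dim=16,
    out_dim=8,
    hidden_dims=[32, 32],    # two hidden layers; "simple" residuals need equal widths
    num_ensemble=4,          # 4 MLPs run in parallel
    reduction="mean",        # average the 4 outputs; "none" keeps the ensemble dimension
    subset_in_dim=0.75,      # each MLP sees a different 75% subset of the 16 input features
    residual_type="simple",
)
```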
__repr__()
¶
Controls how the class is printed
forward(h)
¶
Subset the hidden dimension for each MLP, forward the ensemble MLP on the input features, then reduce the output if specified.
Parameters:
h: `torch.Tensor[B, Din]` or `torch.Tensor[..., 1, B, Din]` or `torch.Tensor[..., L, B, Din]`:
Input feature tensor, before the MLP.
`Din` is the number of input features, `B` is the batch size, and `L` is the number of ensembles.
Returns:
`torch.Tensor[..., L, B, Dout]` or `torch.Tensor[..., B, Dout]`:
Output feature tensor, after the MLP.
`Dout` is the number of output features, `B` is the batch size, and `L` is the number of ensembles.
`L` is removed if a reduction is specified.
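Continuing the sketch above, the shapes below follow the contract documented for `forward`; they are expectations, not verified outputs.

```python
import torch

h = torch.randn(10, 16)   # [B, Din] input, shared across the ensemble members

out = ens_mlp(h)          # with reduction="mean", the ensemble dimension L is removed
print(out.shape)          # expected: torch.Size([10, 8]), i.e. [B, Dout]

# With reduction="none" (or None), the output would instead keep the ensemble
# dimension, i.e. an expected shape of [L, B, Dout] = [4, 10, 8].
```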
get_init_kwargs()
¶
Get a dictionary that can be used to instantiate a new object with identical parameters.
FeedForwardGraph
¶
Bases: FeedForwardNN
__init__(in_dim, out_dim, hidden_dims, layer_type, depth=None, activation='relu', last_activation='none', dropout=0.0, last_dropout=0.0, normalization='none', first_normalization='none', last_normalization='none', residual_type='none', residual_skip_steps=1, in_dim_edges=0, hidden_dims_edges=[], out_dim_edges=None, name='GNN', layer_kwargs=None, virtual_node='none', use_virtual_edges=False, last_layer_is_readout=False)
¶
A flexible neural network architecture, with variable hidden dimensions, support for multiple layer types, and support for different residual connections.
This class is meant to work with different graph neural network
layers. Any layer must inherit from graphium.nn.base_graph_layer.BaseGraphStructure
or graphium.nn.base_graph_layer.BaseGraphLayer.
Parameters:
in_dim:
Input feature dimensions of the layer
out_dim:
Output feature dimensions of the layer
hidden_dims:
List of dimensions in the hidden layers.
Be careful, the "simple" residual type only supports
hidden dimensions of the same value.
layer_type:
Type of layer to use. Can be a string or nn.Module.
depth:
If `hidden_dims` is an integer, `depth` is 1 + the number of
hidden layers to use. If `hidden_dims` is a `list`, `depth` must
be `None`.
activation:
activation function to use in the hidden layers.
last_activation:
activation function to use in the last layer.
dropout:
The ratio of units to dropout. Must be between 0 and 1
last_dropout:
The ratio of units to dropout for the last layer. Must be between 0 and 1
normalization:
Normalization to use. Choices:
- "none" or `None`: No normalization
- "batch_norm": Batch normalization
- "layer_norm": Layer normalization
- `Callable`: Any callable function
first_normalization:
Whether to use batch normalization **before** the first layer
last_normalization:
Whether to use batch normalization in the last layer
residual_type:
- "none": No residual connection
- "simple": Residual connection similar to the ResNet architecture.
See class `ResidualConnectionSimple`
- "weighted": Residual connection similar to the Resnet architecture,
but with weights applied before the summation. See class `ResidualConnectionWeighted`
- "concat": Residual connection where the residual is concatenated instead
of being added.
- "densenet": Residual connection where the residual of all previous layers
are concatenated. This leads to a strong increase in the number of parameters
if there are multiple hidden layers.
residual_skip_steps:
The number of steps to skip between each residual connection.
If `1`, all the layers are connected. If `2`, half of the
layers are connected.
in_dim_edges:
Input edge-feature dimensions of the network. Keep at 0 if not using
edge features, or if the layer doesn't support edges.
hidden_dims_edges:
Hidden dimensions for the edges. Most models don't support it, so it
should only be used for those that do, i.e. `GatedGCNLayer`
out_dim_edges:
Output edge-feature dimensions of the network. Keep at 0 if not using
edge features, or if the layer doesn't support edges. Defaults to the
last value of hidden_dims_edges.
name:
Name attributed to the current network, for display and printing
purposes.
layer_type:
The type of layers to use in the network.
A class that inherits from `graphium.nn.base_graph_layer.BaseGraphStructure`,
or one of the following strings
- "pyg:gin": GINConvPyg
- "pyg:gine": GINEConvPyg
- "pyg:gated-gcn": GatedGCNPyg
- "pyg:pna-msgpass": PNAMessagePassingPyg
layer_kwargs:
The arguments to be used in the initialization of the layer provided by `layer_type`
virtual_node:
A string associated to the type of virtual node to use,
either `None`, "none", "mean", "sum", "max", "logsum".
See `graphium.nn.pooling_pyg.VirtualNode`.
The virtual node will not use any residual connection if `residual_type`
is "none". Otherwise, it will use a simple ResNet like residual
connection.
use_virtual_edges:
A bool flag used to select if the virtual node should use the edges or not
last_layer_is_readout: Whether the last layer should be treated as a readout layer.
Allows using the `mup.MuReadout` from the muTransfer method: https://github.com/microsoft/mup
__repr__()
¶
Controls how the class is printed
forward(g)
¶
Apply the full graph neural network on the input graph and node features.
Parameters:
g:
pyg Batch graph on which the convolution is done with the keys:
- `"feat"`: torch.Tensor[..., N, Din]
Node feature tensor, before convolution.
`N` is the number of nodes, `Din` is the input features
- `"edge_feat"` (torch.Tensor[..., N, Ein]):
Edge feature tensor, before convolution.
`N` is the number of nodes, `Ein` is the input edge features
Returns:
`torch.Tensor[..., M, Dout]` or `torch.Tensor[..., N, Dout]`:
Node or graph feature tensor, after the network.
`N` is the number of nodes, `M` is the number of graphs,
`Dout` is the output dimension ``self.out_dim``
If the `self.pooling` is [`None`], then it returns node features and the output dimension is `N`,
otherwise it returns graph features and the output dimension is `M`
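Below is a minimal, hedged sketch of building and calling such a graph network with PyTorch Geometric. The `"pyg:gin"` layer string and the `"feat"` node key come from the documentation above; the graph sizes, feature dimensions, and the `toy_graph` helper are illustrative assumptions.

```python
import torch
from torch_geometric.data import Batch, Data
from graphium.nn.architectures.global_architectures import FeedForwardGraph

# Sketch: a GIN-style graph network mapping 12 node features down to 4.
graph_nn = FeedForwardGraph(
    in_dim=12,
    out_dim=4,
    hidden_dims=[24, 24, 24],
    layer_type="pyg:gin",    # GINConvPyg, one of the documented layer_type strings
)

def toy_graph(num_nodes: int) -> Data:
    # Node features live under the "feat" key expected by forward(g).
    edge_index = torch.randint(0, num_nodes, (2, 2 * num_nodes))
    return Data(feat=torch.randn(num_nodes, 12), edge_index=edge_index)

g = Batch.from_data_list([toy_graph(5), toy_graph(7)])
out = graph_nn(g)            # no pooling configured, so node-level output is expected:
                             # shape [N, Dout] = [12, 4]
```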
get_init_kwargs()
¶
Get a dictionary that can be used to instantiate a new object with identical parameters.
get_nested_key(d, target_key)
¶
Get the value associated with a key in a nested dictionary.
Parameters:
d: The dictionary to search in
target_key: The key to search for
Returns:
The value associated with the key if found, `None` otherwise
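For illustration only, here is a hypothetical standalone re-implementation of the behaviour described above (this is not the library's code): it returns the value of the first matching key found anywhere in a nested dictionary, or `None`.

```python
from typing import Any, Optional

def get_nested_key(d: dict, target_key: str) -> Optional[Any]:
    # Hypothetical sketch: depth-first search through nested dictionaries.
    if target_key in d:
        return d[target_key]
    for value in d.values():
        if isinstance(value, dict):
            found = get_nested_key(value, target_key)
            if found is not None:
                return found
    return None

# get_nested_key({"gnn": {"layer_kwargs": {"aggr": "mean"}}}, "aggr")  -> "mean"
```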
make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)
¶
Create a 'base' model to be used by the mup or muTransfer scaling of the model.
The base model is usually identical to the regular model, but with the
layer widths divided by a given factor (2 by default).
Parameters:
divide_factor: Factor by which to divide the width.
factor_in_dim: Whether to factor the input dimension for the nodes
Returns:
Name | Type | Description
---|---|---
`kwargs` | `Dict[str, Any]` | Dictionary of parameters to be used to instantiate the base model, with widths divided by the factor
FeedForwardNN
¶
Bases: Module, MupMixin
cache_readouts: bool
property
¶
Whether the readout cache is enabled
__init__(in_dim, out_dim, hidden_dims, depth=None, activation='relu', last_activation='none', dropout=0.0, last_dropout=0.0, normalization='none', first_normalization='none', last_normalization='none', residual_type='none', residual_skip_steps=1, name='LNN', layer_type='fc', layer_kwargs=None, last_layer_is_readout=False)
¶
A flexible neural network architecture, with variable hidden dimensions, support for multiple layer types, and support for different residual connections.
Parameters:
in_dim:
Input feature dimensions of the layer
out_dim:
Output feature dimensions of the layer
hidden_dims:
Either an integer specifying all the hidden dimensions,
or a list of dimensions in the hidden layers.
Be careful, the "simple" residual type only supports
hidden dimensions of the same value.
depth:
If `hidden_dims` is an integer, `depth` is 1 + the number of
hidden layers to use.
If `hidden_dims` is a list, then
`depth` must be `None` or equal to `len(hidden_dims) + 1`
activation:
activation function to use in the hidden layers.
last_activation:
activation function to use in the last layer.
dropout:
The ratio of units to dropout. Must be between 0 and 1
last_dropout:
The ratio of units to dropout for the last_layer. Must be between 0 and 1
normalization:
Normalization to use. Choices:
- "none" or `None`: No normalization
- "batch_norm": Batch normalization
- "layer_norm": Layer normalization
- `Callable`: Any callable function
first_normalization:
Whether to use batch normalization **before** the first layer
last_normalization:
Whether to use batch normalization in the last layer
residual_type:
- "none": No residual connection
- "simple": Residual connection similar to the ResNet architecture.
See class `ResidualConnectionSimple`
- "weighted": Residual connection similar to the Resnet architecture,
but with weights applied before the summation. See class `ResidualConnectionWeighted`
- "concat": Residual connection where the residual is concatenated instead
of being added.
- "densenet": Residual connection where the residual of all previous layers
are concatenated. This leads to a strong increase in the number of parameters
if there are multiple hidden layers.
residual_skip_steps:
The number of steps to skip between each residual connection.
If `1`, all the layers are connected. If `2`, half of the
layers are connected.
name:
Name attributed to the current network, for display and printing
purposes.
layer_type:
The type of layers to use in the network.
Either "fc" as the `FCLayer`, or a class representing the `nn.Module`
to use.
layer_kwargs:
The arguments to be used in the initialization of the layer provided by `layer_type`
last_layer_is_readout: Whether the last layer should be treated as a readout layer.
Allows using the `mup.MuReadout` from the muTransfer method: https://github.com/microsoft/mup
__repr__()
¶
Controls how the class is printed
add_layers(layers)
¶
Add layers to the end of the model.
drop_layers(depth)
¶
Remove the last layers of the model part.
forward(h)
¶
Apply the neural network on the input features.
Parameters:
h: `torch.Tensor[..., Din]`:
Input feature tensor, before the network.
`Din` is the number of input features
Returns:
`torch.Tensor[..., Dout]`:
Output feature tensor, after the network.
`Dout` is the number of output features
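A minimal sketch of constructing and calling the MLP documented above; the dimensions and hyper-parameters are arbitrary choices, not recommended defaults.

```python
import torch
from graphium.nn.architectures.global_architectures import FeedForwardNN

# Sketch: depth=3 with an integer hidden_dims means 2 hidden layers of width 64
# followed by the output layer (depth = 1 + number of hidden layers).
mlp = FeedForwardNN(
    in_dim=32,
    out_dim=10,
    hidden_dims=64,
    depth=3,
    activation="relu",
    normalization="layer_norm",
    dropout=0.1,
)

h = torch.randn(128, 32)   # [..., Din]
out = mlp(h)               # expected shape [..., Dout] = [128, 10]
```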
get_init_kwargs()
¶
Get a dictionary that can be used to instantiate a new object with identical parameters.
make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)
¶
Create a 'base' model to be used by the mup or muTransfer scaling of the model.
The base model is usually identical to the regular model, but with the
layer widths divided by a given factor (2 by default).
Parameters:
divide_factor: Factor by which to divide the width.
factor_in_dim: Whether to factor the input dimension
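A hedged sketch of how this method is typically combined with the `mup` package from microsoft/mup: build the smaller base model from the returned kwargs, then register the base shapes. Whether this exact flow matches graphium's own training scripts is an assumption.

```python
import mup
from graphium.nn.architectures.global_architectures import FeedForwardNN

model = FeedForwardNN(
    in_dim=32, out_dim=10, hidden_dims=64, depth=3, last_layer_is_readout=True
)

# Narrower "base" model built from the kwargs returned by make_mup_base_kwargs.
base_kwargs = model.make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)
base_model = FeedForwardNN(**base_kwargs)

# Register base shapes so that mup can rescale initializations and learning rates.
mup.set_base_shapes(model, base_model)

# Training would then use a mup-aware optimizer, e.g. mup.MuAdam(model.parameters(), lr=1e-3).
```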
FullGraphMultiTaskNetwork
¶
Bases: Module, MupMixin
in_dim: int
property
¶
Returns the input dimension of the network
in_dim_edges: int
property
¶
Returns the input edge dimension of the network
out_dim: int
property
¶
Returns the output dimension of the network
out_dim_edges: int
property
¶
Returns the output dimension of the edges of the network.
__init__(gnn_kwargs, pre_nn_kwargs=None, pre_nn_edges_kwargs=None, pe_encoders_kwargs=None, task_heads_kwargs=None, graph_output_nn_kwargs=None, accelerator_kwargs=None, num_inference_to_average=1, last_layer_is_readout=False, name='FullGNN')
¶
Class that implements a full graph neural network architecture, including the pre-processing MLP and the post-processing MLP.
Parameters:
gnn_kwargs:
key-word arguments to use for the initialization of the pre-processing
GNN network using the class `FeedForwardGraph`.
It must respect the following criteria:
- gnn_kwargs["in_dim"] must be equal to pre_nn_kwargs["out_dim"]
- gnn_kwargs["out_dim"] must be equal to graph_output_nn_kwargs["in_dim"]
pe_encoders_kwargs:
key-word arguments to use for the initialization of all positional encoding encoders.
See the class `EncoderManager` for more details.
pre_nn_kwargs:
key-word arguments to use for the initialization of the pre-processing
MLP network of the node features before the GNN, using the class `FeedForwardNN`.
If `None`, there won't be a pre-processing MLP.
pre_nn_edges_kwargs:
key-word arguments to use for the initialization of the pre-processing
MLP network of the edge features before the GNN, using the class `FeedForwardNN`.
If `None`, there won't be a pre-processing MLP.
task_heads_kwargs:
This argument is a list of dictionaries containing the arguments for task heads. Each argument is used to
initialize a task-specific MLP.
graph_output_nn_kwargs:
This argument is a list of dictionaries corresponding to the arguments for a FeedForwardNN.
Each dict of arguments is used to initialize a shared MLP.
accelerator_kwargs:
key-word arguments specific to the accelerator being used,
e.g. pipeline split points
num_inference_to_average:
Number of inferences to average at val/test time. This is used to avoid the noise introduced
by positional encodings with sign-flips. In case no such encoding is given,
this parameter is ignored.
NOTE: The inference time will be slowed down proportionally to this parameter.
last_layer_is_readout: Whether the last layer should be treated as a readout layer.
Allows using the `mup.MuReadout` from the muTransfer method: https://github.com/microsoft/mup
name:
Name attributed to the current network, for display and printing
purposes.
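A hedged sketch of the dimension constraints listed above for `gnn_kwargs` and its neighbours. The key sets are simplified (real configurations also define task heads, positional-encoder kwargs, and per-task-level output NNs), and `"pyg:gin"` is just one of the documented layer types.

```python
# Pre-processing MLP on node features: its out_dim must feed the GNN's in_dim.
pre_nn_kwargs = dict(in_dim=64, out_dim=96, hidden_dims=[96])

# GNN trunk: in_dim matches the pre-NN's out_dim; out_dim feeds the output NN.
gnn_kwargs = dict(in_dim=96, out_dim=128, hidden_dims=[128, 128], layer_type="pyg:gin")

# Post-processing side (simplified to a single dict for illustration).
graph_output_nn_kwargs = dict(in_dim=128, out_dim=32, hidden_dims=[64])

# The constraints documented above:
assert gnn_kwargs["in_dim"] == pre_nn_kwargs["out_dim"]
assert gnn_kwargs["out_dim"] == graph_output_nn_kwargs["in_dim"]

# net = FullGraphMultiTaskNetwork(gnn_kwargs=gnn_kwargs, pre_nn_kwargs=pre_nn_kwargs, ...)
```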
__repr__()
¶
Controls how the class is printed
create_module_map(level='layers')
¶
Create a mapping between each (sub)module name and its corresponding nn.ModuleList() (where possible). Used for fine-tuning when (partially) loading or freezing specific modules of the pretrained model.
Parameters:
Name | Type | Description | Default
---|---|---|---
`level` | `Union[Literal['layers'], Literal['module']]` | Whether to map to the module object or to the layers of the module object | `'layers'`
forward(g)
¶
Apply the pre-processing neural network, the graph neural network, and the post-processing neural network on the graph features.
Parameters:
g:
pyg Batch graph on which the convolution is done.
Must contain the following elements:
- Node key `"feat"`: `torch.Tensor[..., N, Din]`.
Input node feature tensor, before the network.
`N` is the number of nodes, `Din` is the input features dimension ``self.pre_nn.in_dim``
- Edge key `"edge_feat"`: `torch.Tensor[..., N, Ein]` **Optional**.
The edge features to use. It will be ignored if the
model doesn't supporte edge features or if
`self.in_dim_edges==0`.
- Other keys related to positional encodings `"pos_enc_feats_sign_flip"`,
`"pos_enc_feats_no_flip"`.
Returns:
`torch.Tensor[..., M, Dout]` or `torch.Tensor[..., N, Dout]`:
Node or graph feature tensor, after the network.
`N` is the number of nodes, `M` is the number of graphs,
`Dout` is the output dimension ``self.graph_output_nn.out_dim``
If the `self.gnn.pooling` is [`None`], then it returns node features and the output dimension is `N`,
otherwise it returns graph features and the output dimension is `M`
make_mup_base_kwargs(divide_factor=2.0)
¶
Create a 'base' model to be used by the mup or muTransfer scaling of the model.
The base model is usually identical to the regular model, but with the
layer widths divided by a given factor (2 by default).
Parameters:
divide_factor: Factor by which to divide the width.
Returns:
Type | Description
---|---
`Dict[str, Any]` | Dictionary with the kwargs to create the base model.
set_max_num_nodes_edges_per_graph(max_nodes, max_edges)
¶
Set the maximum number of nodes and edges for all gnn layers and encoder layers
Parameters:
Name | Type | Description | Default
---|---|---|---
`max_nodes` | `Optional[int]` | Maximum number of nodes in the dataset. This will be useful for certain architectures, but ignored by others. | required
`max_edges` | `Optional[int]` | Maximum number of edges in the dataset. This will be useful for certain architectures, but ignored by others. | required
GraphOutputNN
¶
Bases: Module, MupMixin
concat_last_layers: Optional[Iterable[int]]
property
writable
¶
Property to control the output of `self.forward`.
If set to a list of integers, the `forward` function will
concatenate the output of different layers.
If set to `None`, the output of the last layer is returned.
NOTE: The indexes are inverted. 0 is the last layer, 1 is the second last, etc.
out_dim: int
property
¶
Returns the output dimension of the network
__init__(in_dim, in_dim_edges, task_level, graph_output_nn_kwargs)
¶
Parameters:
Name | Type | Description | Default
---|---|---|---
`in_dim` | `int` | Input feature dimensions of the layer | required
`in_dim_edges` | `int` | Input edge feature dimensions of the layer | required
`task_level` | `str` | "graph", "node", "edge", or "nodepair", depending on whether it is a graph-, node-, edge-, or nodepair-level task | required
`graph_output_nn_kwargs` | `Dict[str, Any]` | Key-word arguments to use for the initialization of the post-processing MLP network after the GNN, using the class `FeedForwardNN` | required
compute_nodepairs(node_feats, batch, max_num_nodes=None, fill_value=float('nan'), batch_size=None, drop_nodes_last_graph=False)
¶
Vectorized implementation of the nodepair-level task.
Parameters:
node_feats: Node features
batch: Batch vector
max_num_nodes: The maximum number of nodes per graph
fill_value: The value for invalid entries in the resulting dense output tensor (default: `NaN`)
batch_size: The batch size (default: `None`)
drop_nodes_last_graph: Whether to drop the nodes of the last graph that exceed `max_num_nodes_per_graph`. Useful when the last graph is a padding.
Returns:
result: Concatenated node features of shape `[B, max_num_nodes, 2 * h]`, where `B` is the number of graphs, `max_num_nodes` is the chosen maximum number of nodes, and `h` is the feature dimension
drop_graph_output_nn_layers(num_layers_to_drop)
¶
Remove the last layers of the model. Useful for Transfer Learning.
Parameters:
num_layers_to_drop: The number of layers to drop from the `self.graph_output_nn` network.
extend_graph_output_nn_layers(layers)
¶
Add layers at the end of the model. Useful for Transfer Learning.
Parameters:
layers: A `ModuleList` of all the layers to extend
forward(g)
¶
Parameters:
Name | Type | Description | Default
---|---|---|---
`g` | `Batch` | pyg Batch graph | required
Returns:
h: Output features after applying `graph_output_nn`
make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)
¶
Create a 'base' model to be used by the mup or muTransfer scaling of the model.
The base model is usually identical to the regular model, but with the
layer widths divided by a given factor (2 by default).
Parameters:
divide_factor: Factor by which to divide the width.
factor_in_dim: Whether to factor the input dimension
Returns:
Type | Description
---|---
`Dict[str, Any]` | Dictionary with the kwargs to create the base model.
set_max_num_nodes_edges_per_graph(max_nodes, max_edges)
¶
Set the maximum number of nodes and edges for all gnn layers and encoder layers
Parameters:
Name | Type | Description | Default
---|---|---|---
`max_nodes` | `Optional[int]` | Maximum number of nodes in the dataset. This will be useful for certain architectures, but ignored by others. | required
`max_edges` | `Optional[int]` | Maximum number of edges in the dataset. This will be useful for certain architectures, but ignored by others. | required
TaskHeads
¶
Bases: Module, MupMixin
out_dim: Dict[str, int]
property
¶
Returns the output dimension of each task head
__init__(in_dim, in_dim_edges, task_heads_kwargs, graph_output_nn_kwargs, last_layer_is_readout=True)
¶
Class that groups all multi-task output heads together to provide the task-specific outputs.
Parameters:
in_dim:
Input feature dimensions of the layer
in_dim_edges:
Input edge feature dimensions of the layer
last_layer_is_readout: Whether the last layer should be treated as a readout layer.
Allows using the `mup.MuReadout` from the muTransfer method
task_heads_kwargs:
This argument is a list of dictionaries corresponding to the arguments for a FeedForwardNN.
Each dict of arguments is used to initialize a task-specific MLP.
graph_output_nn_kwargs:
key-word arguments to use for the initialization of the post-processing
MLP network after the GNN, using the class `FeedForwardNN`.
__repr__()
¶
Returns a string representation of the task heads
forward(g)
¶
Forward function of the task heads.
Parameters:
g: pyg Batch graph
Returns:
task_head_outputs: A dictionary `Dict[task_name, Tensor]`
make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)
¶
Create a 'base' model to be used by the mup or muTransfer scaling of the model.
The base model is usually identical to the regular model, but with the
layer widths divided by a given factor (2 by default).
Parameters:
divide_factor: Factor by which to divide the width.
factor_in_dim: Whether to factor the input dimension
Returns:
Name | Type | Description
---|---|---
`kwargs` | `Dict[str, Any]` | Dictionary of arguments to be used to initialize the base model
set_max_num_nodes_edges_per_graph(max_nodes, max_edges)
¶
Set the maximum number of nodes and edges for all gnn layers and encoder layers
Parameters:
Name | Type | Description | Default
---|---|---|---
`max_nodes` | `Optional[int]` | Maximum number of nodes in the dataset. This will be useful for certain architectures, but ignored by others. | required
`max_edges` | `Optional[int]` | Maximum number of edges in the dataset. This will be useful for certain architectures, but ignored by others. | required
PyG Architectures¶
graphium.nn.architectures.pyg_architectures
¶
FeedForwardPyg
¶
Bases: FeedForwardGraph
Encoder Manager¶
graphium.nn.architectures.encoder_manager
¶
EncoderManager
¶
Bases: Module
in_dims: Iterable[int]
property
¶
Returns the input dimensions for all pe-encoders
Returns:
Name | Type | Description
---|---|---
`in_dims` | `Iterable[int]` | The input dimensions for all PE encoders
input_keys: Iterable[str]
property
¶
Returns the input keys for all pe-encoders
Returns:
Name | Type | Description
---|---|---
`input_keys` | `Iterable[str]` | The input keys for all PE encoders
out_dim: int
property
¶
Returns the output dimension of the pooled embedding from all the pe encoders
Returns:
Name | Type | Description
---|---|---
`out_dim` | `int` | The output dimension of the pooled embedding from all the PE encoders
__init__(pe_encoders_kwargs=None, max_num_nodes_per_graph=None, name='encoder_manager')
¶
Class that runs multiple encoders in parallel and concatenates / pools their outputs.
Parameters:
pe_encoders_kwargs:
key-word arguments to use for the initialization of all positional encoding encoders.
The encoders are selected from the class `PE_ENCODERS_DICT`: "la_encoder" (tested), "mlp_encoder" (not tested), "signnet_encoder" (not tested)
name:
Name attributed to the current network, for display and printing
purposes.
forward(g)
¶
Forward pass of the PE encoders and pooling.
Parameters:
Name | Type | Description | Default
---|---|---|---
`g` | `Batch` | pyg Batch on which the convolution is done. Must contain the positional-encoding keys expected by the encoders (e.g. `"pos_enc_feats_sign_flip"`, `"pos_enc_feats_no_flip"`). | required
Returns:
Name | Type | Description
---|---|---
`g` | `Batch` | pyg Batch with the positional encodings added to the graph
forward_positional_encoding(g)
¶
Forward pass for the positional encodings (PE),
with each PE having its own encoder defined in `self.pe_encoders`.
All the positional encodings with the same keys are pooled together
using `self.pe_pooling`.
Parameters:
Name | Type | Description | Default
---|---|---|---
`g` | `Batch` | pyg Batch containing the node positional encodings | required
Returns:
Name | Type | Description
---|---|---
`pe_node_pooled` | `Dict[str, Tensor]` | The positional / structural encodings after going through the encoders, pooled together according to their keys
forward_simple_pooling(h, pooling, dim)
¶
Apply sum, mean, or max pooling on a Tensor.
Parameters:
h: The Tensor to pool
pooling: String specifying the pooling method ("sum", "mean", or "max")
dim: The dimension to pool over
Returns:
Name | Type | Description
---|---|---
`pooled` | `Tensor` | The pooled Tensor
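A conceptual sketch in plain torch (not the method itself) of what the three pooling modes do over a chosen dimension:

```python
import torch

h = torch.randn(4, 10, 32)         # e.g. stacked encodings to pool over dim 0

pooled_sum = h.sum(dim=0)          # "sum"  -> shape [10, 32]
pooled_mean = h.mean(dim=0)        # "mean" -> shape [10, 32]
pooled_max = h.max(dim=0).values   # "max"  -> shape [10, 32]
```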
make_mup_base_kwargs(divide_factor=2.0)
¶
Create a 'base' model to be used by the mup or muTransfer scaling of the model.
The base model is usually identical to the regular model, but with the
layer widths divided by a given factor (2 by default).
Parameters:
divide_factor: Factor by which to divide the width.
Returns:
Name | Type | Description
---|---|---
`pe_kw` | `Dict[str, Any]` | The model kwargs where the dimensions are divided by the factor