graphium.nn.encoders

Implementations of positional encoders in the library

Base Encoder


graphium.nn.encoders.base_encoder

BaseEncoder

Bases: Module, MupMixin

__init__(input_keys, output_keys, in_dim, out_dim, num_layers, activation='relu', first_normalization=None, use_input_keys_prefix=True)

Base class for all positional and structural encoders.

Parameters:

- input_keys: The keys from the graph to use as input
- output_keys: The keys to return as output encodings
- in_dim: The input dimension for the encoder
- out_dim: The output dimension of the encodings
- num_layers: The number of layers of the encoder
- activation: The activation function to use
- first_normalization: The normalization to use before the first layer
- use_input_keys_prefix: Whether to use the key_prefix argument in the forward method. This is useful when the encodings are categorized by the function get_all_positional_encoding
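For orientation, here is a hedged sketch of a minimal subclass. The pass-through logic and the `self.output_keys` attribute are assumptions for illustration, not the library's code:

```python
from typing import List, Optional

from torch_geometric.data import Batch

from graphium.nn.encoders.base_encoder import BaseEncoder


class PassThroughEncoder(BaseEncoder):
    """Toy encoder that returns selected graph tensors unchanged."""

    def parse_input_keys(self, input_keys: List[str]) -> List[str]:
        # A real encoder would validate here that each key is supported.
        return input_keys

    def parse_output_keys(self, output_keys: List[str]) -> List[str]:
        return output_keys

    def forward(self, graph: Batch, key_prefix: Optional[str] = None) -> dict:
        # Resolve the input keys, honoring the optional prefix.
        input_keys = self.parse_input_keys_with_prefix(key_prefix)
        # One output tensor per output key (assumes the base class stores
        # `self.output_keys`, as its constructor arguments suggest).
        return {out: graph[key] for key, out in zip(input_keys, self.output_keys)}
```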

forward(graph, key_prefix=None) abstractmethod

Forward pass of the encoder on a graph. This is a method to be implemented by the child class.

Parameters:

- graph: The input pyg Batch
- key_prefix: The prefix to use for the input keys

make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)

Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layer widths divided by a given factor (2 by default). A rough sketch of this scaling follows the parameter table.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| divide_factor | float | Factor by which to divide the width. | 2.0 |
| factor_in_dim | bool | Whether to factor the input dimension | False |

Returns: A dictionary with the base model arguments.
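The sketch promised above: a hedged illustration of the width-division idea behind the returned kwargs. The function name and dictionary contents are assumptions for illustration, not graphium's actual implementation:

```python
# Hedged sketch: muP base-model kwargs are typically the same arguments
# with the hidden widths divided by `divide_factor`.
def sketch_mup_base_kwargs(kwargs: dict, divide_factor: float = 2.0,
                           factor_in_dim: bool = False) -> dict:
    base = dict(kwargs)
    base["out_dim"] = round(base["out_dim"] / divide_factor)
    if factor_in_dim:
        base["in_dim"] = round(base["in_dim"] / divide_factor)
    return base

print(sketch_mup_base_kwargs({"in_dim": 64, "out_dim": 128, "num_layers": 2}))
# -> {'in_dim': 64, 'out_dim': 64, 'num_layers': 2}
```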

parse_input_keys(input_keys) abstractmethod

Parse the input_keys argument. This is a method to be implemented by the child class.

Parameters:

- input_keys: The input keys to parse

parse_input_keys_with_prefix(key_prefix)

Parse the input_keys argument, given a certain prefix. If the prefix is None, it is ignored.

parse_output_keys(output_keys) abstractmethod

Parse the output_keys argument. This is a method to be implemented by the child class.

Parameters:

- output_keys: The output keys to parse

Gaussian Kernel Positional Encoder


graphium.nn.encoders.gaussian_kernel_pos_encoder

GaussianKernelPosEncoder

Bases: BaseEncoder

__init__(input_keys, output_keys, in_dim, out_dim, embed_dim, num_layers, max_num_nodes_per_graph=None, activation='gelu', first_normalization='none', use_input_keys_prefix=True, num_heads=1)

Configurable Gaussian kernel-based positional encoding node and edge encoder, useful for encoding 3D conformation positions. An illustrative instantiation follows the parameter table.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| input_keys | List[str] | The keys from the pyg graph to use as input | required |
| output_keys | List[str] | The keys to return corresponding to the output encodings | required |
| in_dim | int | The input dimension for the encoder | required |
| out_dim | int | The output dimension of the encodings | required |
| embed_dim | int | The dimension of the embedding | required |
| num_layers | int | The number of layers of the encoder | required |
| max_num_nodes_per_graph | Optional[int] | The maximum number of nodes per graph | None |
| activation | Union[str, Callable] | The activation function to use | 'gelu' |
| first_normalization | | The normalization to use before the first layer | 'none' |
| use_input_keys_prefix | bool | Whether to use the key_prefix argument in the forward method | True |
| num_heads | int | The number of heads to use for the multi-head attention | 1 |
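As a hedged example, an instantiation for 3D conformer positions might look like this; the input key name is an assumption about how the positional encodings were precomputed, not a key the library guarantees:

```python
from graphium.nn.encoders.gaussian_kernel_pos_encoder import GaussianKernelPosEncoder

# Hypothetical setup: encode 3D positions stored under "positions_3d"
# (assumed key name) into node- and edge-level encodings.
encoder = GaussianKernelPosEncoder(
    input_keys=["positions_3d"],
    output_keys=["feat", "edge_feat"],
    in_dim=3,                    # x, y, z coordinates
    out_dim=32,
    embed_dim=32,
    num_layers=2,
    max_num_nodes_per_graph=64,  # upper bound on nodes per graph
    num_heads=4,
)
```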
forward(batch, key_prefix=None)

Forward function of the GaussianKernelPosEncoder class.

Parameters:

- batch: The batch of pyg graphs
- key_prefix: The prefix to use for the input keys

Returns: A dictionary of the output encodings with keys specified by output_keys.

make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)

Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layer widths divided by a given factor (2 by default).

Parameters:

- divide_factor: Factor by which to divide the width.
- factor_in_dim: Whether to factor the input dimension

Returns: A dictionary of the base model kwargs.

parse_input_keys(input_keys)

Parse the input_keys.

Parameters:

- input_keys: The input keys to parse

Returns: The parsed input keys.

parse_output_keys(output_keys)

Parse the output_keys.

Parameters:

- output_keys: The output keys to parse

Returns: The parsed output keys.

Laplacian Positional Encoder


graphium.nn.encoders.laplace_pos_encoder

LapPENodeEncoder

Bases: BaseEncoder

__init__(input_keys, output_keys, in_dim, hidden_dim, out_dim, num_layers, activation='relu', model_type='DeepSet', num_layers_post=1, dropout=0.0, first_normalization=None, use_input_keys_prefix=True, **model_kwargs)

Laplace Positional Embedding node encoder. LapPE of size dim_pe will get appended to each node feature vector. An illustrative instantiation follows the parameter table.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| input_keys | List[str] | List of input keys to use from the data object. | required |
| output_keys | List[str] | List of output keys to add to the data object. | required |
| in_dim | | Size of Laplace PE embedding. Only used by the MLP model | required |
| hidden_dim | int | Size of hidden layer | required |
| out_dim | int | Size of final node embedding | required |
| num_layers | int | Number of layers in the MLP | required |
| activation | Optional[Union[str, Callable]] | Activation function to use. | 'relu' |
| model_type | str | 'Transformer', 'DeepSet', or 'MLP' | 'DeepSet' |
| num_layers_post | | Number of layers to apply after pooling | 1 |
| dropout | | Dropout rate | 0.0 |
| first_normalization | | Normalization to apply to the first layer. | None |
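As a hedged example: the eigenvector/eigenvalue key names below are assumptions, matching however the Laplacian decomposition was precomputed for the dataset:

```python
from graphium.nn.encoders.laplace_pos_encoder import LapPENodeEncoder

# Hypothetical setup: pool the first 8 precomputed Laplacian eigenvectors
# into a 32-dimensional node embedding with a DeepSet model.
encoder = LapPENodeEncoder(
    input_keys=["laplacian_eigvec", "laplacian_eigval"],  # assumed key names
    output_keys=["feat"],
    in_dim=8,              # number of eigenvectors kept per node
    hidden_dim=64,
    out_dim=32,
    num_layers=2,
    model_type="DeepSet",  # or "Transformer" / "MLP"
    num_layers_post=1,
    dropout=0.1,
)
```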
forward(batch, key_prefix=None)

Forward pass of the encoder.

Parameters:

- batch: pyg Batch of graphs
- key_prefix: Prefix to use for the input and output keys.

Returns: Output dictionary with keys as specified in output_keys and their output embeddings.

make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)

Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layer widths divided by a given factor (2 by default).

Parameters:

- divide_factor: Factor by which to divide the width.
- factor_in_dim: Whether to factor the input dimension

Returns: Dictionary of kwargs to be used to create the base model.

parse_input_keys(input_keys)

Parse the input keys and make sure they are supported for this encoder.

Parameters:

- input_keys: List of input keys to use from the data object.

Returns: List of parsed input keys.

parse_output_keys(output_keys)

Parse the output keys.

Parameters:

- output_keys: List of output keys to add to the data object.

Returns: List of parsed output keys.

MLP Encoder


graphium.nn.encoders.mlp_encoder

CatMLPEncoder

Bases: BaseEncoder

__init__(input_keys, output_keys, in_dim, hidden_dim, out_dim, num_layers, activation='relu', dropout=0.0, normalization='none', first_normalization='none', use_input_keys_prefix=True)

Configurable kernel-based positional encoding node/edge-level encoder. Concatenates the list of input (node or edge) features along the feature dimension; see the sketch after the parameter table.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| input_keys | List[str] | List of input keys; inputs are concatenated along the feature dimension and passed through the MLP | required |
| output_keys | str | List of output keys to add to the pyg batch graph | required |
| in_dim | | Input dimension of the MLP encoder; sum of the input dimensions of the inputs | required |
| hidden_dim | | Hidden dimension of the MLP encoder | required |
| out_dim | | Output dimension of the MLP encoder | required |
| num_layers | | Number of layers of the MLP encoder | required |
| activation | | Activation function to use | 'relu' |
| dropout | | Dropout to use | 0.0 |
| normalization | | Normalization to use | 'none' |
| first_normalization | | Normalization to use before the first layer | 'none' |
| use_input_keys_prefix | bool | Whether to use the key_prefix argument | True |
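The sketch promised above: why in_dim must equal the sum of the input dimensions. The concatenation step works like this (plain PyTorch with made-up feature names, not the library's internals):

```python
import torch

# Two hypothetical node-level inputs on a 10-node graph.
rwse = torch.randn(10, 16)   # e.g. a 16-dim random-walk encoding
degree = torch.randn(10, 1)  # e.g. a 1-dim degree feature

# CatMLPEncoder concatenates inputs along the feature dimension
# before the MLP, so in_dim must be 16 + 1 = 17 here.
concatenated = torch.cat([rwse, degree], dim=-1)
assert concatenated.shape == (10, 17)
```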
forward(batch, key_prefix=None)

Forward function of the MLP encoder.

Parameters:

- batch: pyg batch graph
- key_prefix: Prefix to use for the input keys

Returns: Dictionary of output embeddings with keys specified by output_keys.

make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)

Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layer widths divided by a given factor (2 by default).

Parameters:

- divide_factor: Factor by which to divide the width.
- factor_in_dim: Whether to factor the input dimension

Returns: Dictionary of kwargs to use for the base model.

parse_input_keys(input_keys)

Parse the input_keys.

Parameters:

- input_keys: List of input keys to use from the pyg batch graph

Returns: The parsed input_keys.

parse_output_keys(output_keys)

Parse the output_keys.

Parameters:

- output_keys: List of output keys to add to the pyg batch graph

Returns: The parsed output_keys.

MLPEncoder

Bases: BaseEncoder

__init__(input_keys, output_keys, in_dim, hidden_dim, out_dim, num_layers, activation='relu', dropout=0.0, normalization='none', first_normalization='none', use_input_keys_prefix=True)

Configurable kernel-based positional encoding node/edge-level encoder. An illustrative instantiation follows the parameter table.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| input_keys | List[str] | List of input keys to use from the pyg batch graph | required |
| output_keys | str | List of output keys to add to the pyg batch graph | required |
| in_dim | | Input dimension of the MLP encoder | required |
| hidden_dim | | Hidden dimension of the MLP encoder | required |
| out_dim | | Output dimension of the MLP encoder | required |
| num_layers | | Number of layers of the MLP encoder | required |
| activation | | Activation function to use | 'relu' |
| dropout | | Dropout to use | 0.0 |
| normalization | | Normalization to use | 'none' |
| first_normalization | | Normalization to use before the first layer | 'none' |
| use_input_keys_prefix | bool | Whether to use the key_prefix argument | True |
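As a hedged example, where the input key and its dimension are assumptions about the precomputed features:

```python
from graphium.nn.encoders.mlp_encoder import MLPEncoder

# Hypothetical setup: re-embed a precomputed 16-dim random-walk
# encoding stored under "rwse" (assumed key name).
encoder = MLPEncoder(
    input_keys=["rwse"],
    output_keys=["feat"],
    in_dim=16,          # must match the stored tensor's feature dim
    hidden_dim=64,
    out_dim=32,
    num_layers=2,
    activation="relu",
    dropout=0.1,
)
```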
forward(batch, key_prefix=None)

Forward function of the MLP encoder.

Parameters:

- batch: pyg batch graph
- key_prefix: Prefix to use for the input keys

Returns: Dictionary of output embeddings with keys specified by output_keys.

make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)

Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layer widths divided by a given factor (2 by default).

Parameters:

- divide_factor: Factor by which to divide the width.
- factor_in_dim: Whether to factor the input dimension

Returns: Dictionary of kwargs to use for the base model.

parse_input_keys(input_keys)

Parse the input_keys.

Parameters:

- input_keys: List of input keys to use from the pyg batch graph

Returns: The parsed input_keys.

parse_output_keys(output_keys)

Parse the output_keys.

Parameters:

- output_keys: List of output keys to add to the pyg batch graph

Returns: The parsed output_keys.

SignNet Positional Encoder


graphium.nn.encoders.signnet_pos_encoder

SignNet (https://arxiv.org/abs/2202.13013), based on https://github.com/cptq/SignNet-BasisNet.

GINDeepSigns

Bases: Module

Sign-invariant neural network with MLP aggregation:

$$f(v_1, \ldots, v_k) = \rho\big(\mathrm{enc}(v_1) + \mathrm{enc}(-v_1), \ldots, \mathrm{enc}(v_k) + \mathrm{enc}(-v_k)\big)$$

MaskedGINDeepSigns

Bases: Module

Sign-invariant neural network with sum pooling and DeepSet:

$$f(v_1, \ldots, v_k) = \rho\big(\mathrm{enc}(v_1) + \mathrm{enc}(-v_1), \ldots, \mathrm{enc}(v_k) + \mathrm{enc}(-v_k)\big)$$

SignNetNodeEncoder

Bases: BaseEncoder

SignNet Positional Embedding node encoder (https://arxiv.org/abs/2202.13013, https://github.com/cptq/SignNet-BasisNet).

Uses a precomputed Laplacian eigendecomposition, but instead of eigenvector sign flipping + DeepSet/Transformer, computes the PE as

$$\mathrm{SignNetPE}(v_1, \ldots, v_k) = \rho\big(\big[\phi(v_i) + \phi(-v_i)\big]_{i=1}^{k}\big)$$

where $\phi$ is a GIN network applied to the first $k$ non-trivial eigenvectors, and $\rho$ is an MLP if $k$ is a constant; if all eigenvectors are used, then $\rho$ is a DeepSet with sum pooling.
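The sum $\phi(v_i) + \phi(-v_i)$ is what makes the encoding invariant to the arbitrary sign of each eigenvector. A minimal sketch of that property, with an ordinary MLP standing in for the GIN $\phi$:

```python
import torch
import torch.nn as nn

# Any network phi works; the invariance comes from the symmetric sum.
phi = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 8))

v = torch.randn(5, 4)           # a batch of vectors whose sign is arbitrary
enc = phi(v) + phi(-v)          # encoding of v
enc_flipped = phi(-v) + phi(v)  # encoding after a global sign flip
assert torch.allclose(enc, enc_flipped)  # identical by construction
```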

SignNetPE of size dim_pe will get appended to each node feature vector.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| dim_emb | | Size of final node embedding | required |
make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)

Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layer widths divided by a given factor (2 by default).

TODO: Update this. It is broken

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| divide_factor | float | Factor by which to divide the width. | 2.0 |
| factor_in_dim | bool | Whether to factor the input dimension | False |

SimpleGIN

Bases: Module

__init__(in_dim, hidden_dim, out_dim, num_layers, normalization='none', dropout=0.5, activation='relu')

Not supported yet.