graphium.nn.encoders

Implementations of positional encoders in the library

Base Encoder


graphium.nn.encoders.base_encoder

BaseEncoder

Bases: Module, MupMixin

__init__(input_keys, output_keys, in_dim, out_dim, num_layers, activation='relu', first_normalization=None, use_input_keys_prefix=True)

Base class for all positional and structural encoders.

Parameters:

- input_keys: The keys from the graph to use as input
- output_keys: The keys to return as output encodings
- in_dim: The input dimension for the encoder
- out_dim: The output dimension of the encodings
- num_layers: The number of layers of the encoder
- activation: The activation function to use
- first_normalization: The normalization to use before the first layer
- use_input_keys_prefix: Whether to use the key_prefix argument in the forward method. This is useful when the encodings are categorized by the function get_all_positional_encoding
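For orientation, here is a hedged sketch of a minimal subclass. The pass-through logic and the `self.output_keys` attribute are assumptions for illustration, not the library's code:

```python
from typing import List, Optional

from torch_geometric.data import Batch

from graphium.nn.encoders.base_encoder import BaseEncoder


class PassThroughEncoder(BaseEncoder):
    """Toy encoder that returns selected graph tensors unchanged."""

    def parse_input_keys(self, input_keys: List[str]) -> List[str]:
        # A real encoder would validate here that each key is supported.
        return input_keys

    def parse_output_keys(self, output_keys: List[str]) -> List[str]:
        return output_keys

    def forward(self, graph: Batch, key_prefix: Optional[str] = None) -> dict:
        # Resolve the input keys, honoring the optional prefix.
        input_keys = self.parse_input_keys_with_prefix(key_prefix)
        # One output tensor per output key (assumes the base class stores
        # `self.output_keys`, as its constructor arguments suggest).
        return {out: graph[key] for key, out in zip(input_keys, self.output_keys)}
```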

forward(graph, key_prefix=None) abstractmethod

Forward pass of the encoder on a graph. This is a method to be implemented by the child class.

Parameters:

- graph: The input pyg Batch
- key_prefix: The prefix to use for the input keys

make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)

Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layer widths divided by a given factor (2 by default). A rough sketch of this scaling follows the parameter table.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| divide_factor | float | Factor by which to divide the width. | 2.0 |
| factor_in_dim | bool | Whether to factor the input dimension | False |

Returns: A dictionary with the base model arguments.
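The sketch promised above: a hedged illustration of the width-division idea behind the returned kwargs. The function name and dictionary contents are assumptions for illustration, not graphium's actual implementation:

```python
# Hedged sketch: muP base-model kwargs are typically the same arguments
# with the hidden widths divided by `divide_factor`.
def sketch_mup_base_kwargs(kwargs: dict, divide_factor: float = 2.0,
                           factor_in_dim: bool = False) -> dict:
    base = dict(kwargs)
    base["out_dim"] = round(base["out_dim"] / divide_factor)
    if factor_in_dim:
        base["in_dim"] = round(base["in_dim"] / divide_factor)
    return base

print(sketch_mup_base_kwargs({"in_dim": 64, "out_dim": 128, "num_layers": 2}))
# -> {'in_dim': 64, 'out_dim': 64, 'num_layers': 2}
```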

parse_input_keys(input_keys) abstractmethod

Parse the input_keys argument. This is a method to be implemented by the child class.

Parameters:

- input_keys: The input keys to parse

parse_input_keys_with_prefix(key_prefix)

Parse the input_keys argument, given a certain prefix. If the prefix is None, it is ignored.

parse_output_keys(output_keys) abstractmethod

Parse the output_keys argument. This is a method to be implemented by the child class.

Parameters:

- output_keys: The output keys to parse

Gaussian Kernel Positional Encoder


graphium.nn.encoders.gaussian_kernel_pos_encoder

GaussianKernelPosEncoder

Bases: BaseEncoder

__init__(input_keys, output_keys, in_dim, out_dim, embed_dim, num_layers, max_num_nodes_per_graph=None, activation='gelu', first_normalization='none', use_input_keys_prefix=True, num_heads=1)

Configurable Gaussian kernel-based positional encoding node and edge encoder, useful for encoding 3D conformation positions. An illustrative instantiation follows the parameter table.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| input_keys | List[str] | The keys from the pyg graph to use as input | required |
| output_keys | List[str] | The keys to return corresponding to the output encodings | required |
| in_dim | int | The input dimension for the encoder | required |
| out_dim | int | The output dimension of the encodings | required |
| embed_dim | int | The dimension of the embedding | required |
| num_layers | int | The number of layers of the encoder | required |
| max_num_nodes_per_graph | Optional[int] | The maximum number of nodes per graph | None |
| activation | Union[str, Callable] | The activation function to use | 'gelu' |
| first_normalization | | The normalization to use before the first layer | 'none' |
| use_input_keys_prefix | bool | Whether to use the key_prefix argument in the forward method | True |
| num_heads | int | The number of heads to use for the multi-head attention | 1 |
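As a hedged example, an instantiation for 3D conformer positions might look like this; the input key name is an assumption about how the positional encodings were precomputed, not a key the library guarantees:

```python
from graphium.nn.encoders.gaussian_kernel_pos_encoder import GaussianKernelPosEncoder

# Hypothetical setup: encode 3D positions stored under "positions_3d"
# (assumed key name) into node- and edge-level encodings.
encoder = GaussianKernelPosEncoder(
    input_keys=["positions_3d"],
    output_keys=["feat", "edge_feat"],
    in_dim=3,                    # x, y, z coordinates
    out_dim=32,
    embed_dim=32,
    num_layers=2,
    max_num_nodes_per_graph=64,  # upper bound on nodes per graph
    num_heads=4,
)
```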
forward(batch, key_prefix=None)

Forward function of the GaussianKernelPosEncoder class.

Parameters:

- batch: The batch of pyg graphs
- key_prefix: The prefix to use for the input keys

Returns: A dictionary of the output encodings with keys specified by output_keys.

make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)

Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layer widths divided by a given factor (2 by default).

Parameters:

- divide_factor: Factor by which to divide the width.
- factor_in_dim: Whether to factor the input dimension

Returns: A dictionary of the base model kwargs.

parse_input_keys(input_keys)

Parse the input_keys.

Parameters:

- input_keys: The input keys to parse

Returns: The parsed input keys.

parse_output_keys(output_keys)

Parse the output_keys.

Parameters:

- output_keys: The output keys to parse

Returns: The parsed output keys.

Laplacian Positional Encoder


graphium.nn.encoders.laplace_pos_encoder

LapPENodeEncoder

Bases: BaseEncoder

__init__(input_keys, output_keys, in_dim, hidden_dim, out_dim, num_layers, activation='relu', model_type='DeepSet', num_layers_post=1, dropout=0.0, first_normalization=None, use_input_keys_prefix=True, **model_kwargs)

Laplace Positional Embedding node encoder. LapPE of size dim_pe will get appended to each node feature vector. An illustrative instantiation follows the parameter table.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| input_keys | List[str] | List of input keys to use from the data object. | required |
| output_keys | List[str] | List of output keys to add to the data object. | required |
| in_dim | | Size of Laplace PE embedding. Only used by the MLP model | required |
| hidden_dim | int | Size of hidden layer | required |
| out_dim | int | Size of final node embedding | required |
| num_layers | int | Number of layers in the MLP | required |
| activation | Optional[Union[str, Callable]] | Activation function to use. | 'relu' |
| model_type | str | 'Transformer', 'DeepSet', or 'MLP' | 'DeepSet' |
| num_layers_post | | Number of layers to apply after pooling | 1 |
| dropout | | Dropout rate | 0.0 |
| first_normalization | | Normalization to apply to the first layer. | None |
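As a hedged example: the eigenvector/eigenvalue key names below are assumptions, matching however the Laplacian decomposition was precomputed for the dataset:

```python
from graphium.nn.encoders.laplace_pos_encoder import LapPENodeEncoder

# Hypothetical setup: pool the first 8 precomputed Laplacian eigenvectors
# into a 32-dimensional node embedding with a DeepSet model.
encoder = LapPENodeEncoder(
    input_keys=["laplacian_eigvec", "laplacian_eigval"],  # assumed key names
    output_keys=["feat"],
    in_dim=8,              # number of eigenvectors kept per node
    hidden_dim=64,
    out_dim=32,
    num_layers=2,
    model_type="DeepSet",  # or "Transformer" / "MLP"
    num_layers_post=1,
    dropout=0.1,
)
```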
forward(batch, key_prefix=None)

Forward pass of the encoder.

Parameters:

- batch: pyg Batch of graphs
- key_prefix: Prefix to use for the input and output keys.

Returns: Output dictionary with keys as specified in output_keys and their output embeddings.

make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)

Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layer widths divided by a given factor (2 by default).

Parameters:

- divide_factor: Factor by which to divide the width.
- factor_in_dim: Whether to factor the input dimension

Returns: Dictionary of kwargs to be used to create the base model.

parse_input_keys(input_keys)

Parse the input keys and make sure they are supported for this encoder.

Parameters:

- input_keys: List of input keys to use from the data object.

Returns: List of parsed input keys.

parse_output_keys(output_keys)

Parse the output keys.

Parameters:

- output_keys: List of output keys to add to the data object.

Returns: List of parsed output keys.

MLP Encoder


graphium.nn.encoders.mlp_encoder

CatMLPEncoder

Bases: BaseEncoder

__init__(input_keys, output_keys, in_dim, hidden_dim, out_dim, num_layers, activation='relu', dropout=0.0, normalization='none', first_normalization='none', use_input_keys_prefix=True)

Configurable kernel-based positional encoding node/edge-level encoder. Concatenates the list of input (node or edge) features along the feature dimension; see the sketch after the parameter table.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| input_keys | List[str] | List of input keys; inputs are concatenated along the feature dimension and passed through the MLP | required |
| output_keys | str | List of output keys to add to the pyg batch graph | required |
| in_dim | | Input dimension of the MLP encoder; sum of the input dimensions of the inputs | required |
| hidden_dim | | Hidden dimension of the MLP encoder | required |
| out_dim | | Output dimension of the MLP encoder | required |
| num_layers | | Number of layers of the MLP encoder | required |
| activation | | Activation function to use | 'relu' |
| dropout | | Dropout to use | 0.0 |
| normalization | | Normalization to use | 'none' |
| first_normalization | | Normalization to use before the first layer | 'none' |
| use_input_keys_prefix | bool | Whether to use the key_prefix argument | True |
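The sketch promised above: why in_dim must equal the sum of the input dimensions. The concatenation step works like this (plain PyTorch with made-up feature names, not the library's internals):

```python
import torch

# Two hypothetical node-level inputs on a 10-node graph.
rwse = torch.randn(10, 16)   # e.g. a 16-dim random-walk encoding
degree = torch.randn(10, 1)  # e.g. a 1-dim degree feature

# CatMLPEncoder concatenates inputs along the feature dimension
# before the MLP, so in_dim must be 16 + 1 = 17 here.
concatenated = torch.cat([rwse, degree], dim=-1)
assert concatenated.shape == (10, 17)
```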
forward(batch, key_prefix=None)

Forward function of the MLP encoder.

Parameters:

- batch: pyg batch graph
- key_prefix: Prefix to use for the input keys

Returns: Dictionary of output embeddings with keys specified by output_keys.

make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)

Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layer widths divided by a given factor (2 by default).

Parameters:

- divide_factor: Factor by which to divide the width.
- factor_in_dim: Whether to factor the input dimension

Returns: Dictionary of kwargs to use for the base model.

parse_input_keys(input_keys)

Parse the input_keys.

Parameters:

- input_keys: List of input keys to use from the pyg batch graph

Returns: The parsed input_keys.

parse_output_keys(output_keys)

Parse the output_keys.

Parameters:

- output_keys: List of output keys to add to the pyg batch graph

Returns: The parsed output_keys.

MLPEncoder

Bases: BaseEncoder

__init__(input_keys, output_keys, in_dim, hidden_dim, out_dim, num_layers, activation='relu', dropout=0.0, normalization='none', first_normalization='none', use_input_keys_prefix=True)

Configurable kernel-based positional encoding node/edge-level encoder. An illustrative instantiation follows the parameter table.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| input_keys | List[str] | List of input keys to use from the pyg batch graph | required |
| output_keys | str | List of output keys to add to the pyg batch graph | required |
| in_dim | | Input dimension of the MLP encoder | required |
| hidden_dim | | Hidden dimension of the MLP encoder | required |
| out_dim | | Output dimension of the MLP encoder | required |
| num_layers | | Number of layers of the MLP encoder | required |
| activation | | Activation function to use | 'relu' |
| dropout | | Dropout to use | 0.0 |
| normalization | | Normalization to use | 'none' |
| first_normalization | | Normalization to use before the first layer | 'none' |
| use_input_keys_prefix | bool | Whether to use the key_prefix argument | True |
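As a hedged example, where the input key and its dimension are assumptions about the precomputed features:

```python
from graphium.nn.encoders.mlp_encoder import MLPEncoder

# Hypothetical setup: re-embed a precomputed 16-dim random-walk
# encoding stored under "rwse" (assumed key name).
encoder = MLPEncoder(
    input_keys=["rwse"],
    output_keys=["feat"],
    in_dim=16,          # must match the stored tensor's feature dim
    hidden_dim=64,
    out_dim=32,
    num_layers=2,
    activation="relu",
    dropout=0.1,
)
```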
forward(batch, key_prefix=None)

Forward function of the MLP encoder.

Parameters:

- batch: pyg batch graph
- key_prefix: Prefix to use for the input keys

Returns: Dictionary of output embeddings with keys specified by output_keys.

make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)

Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layer widths divided by a given factor (2 by default).

Parameters:

- divide_factor: Factor by which to divide the width.
- factor_in_dim: Whether to factor the input dimension

Returns: Dictionary of kwargs to use for the base model.

parse_input_keys(input_keys)

Parse the input_keys.

Parameters:

- input_keys: List of input keys to use from the pyg batch graph

Returns: The parsed input_keys.

parse_output_keys(output_keys)

Parse the output_keys.

Parameters:

- output_keys: List of output keys to add to the pyg batch graph

Returns: The parsed output_keys.

SignNet Positional Encoder


graphium.nn.encoders.signnet_pos_encoder

SignNet (https://arxiv.org/abs/2202.13013), based on https://github.com/cptq/SignNet-BasisNet.

GINDeepSigns

Bases: Module

Sign-invariant neural network with MLP aggregation:

$$f(v_1, \ldots, v_k) = \rho\big(\mathrm{enc}(v_1) + \mathrm{enc}(-v_1), \ldots, \mathrm{enc}(v_k) + \mathrm{enc}(-v_k)\big)$$

MaskedGINDeepSigns

Bases: Module

Sign-invariant neural network with sum pooling and DeepSet:

$$f(v_1, \ldots, v_k) = \rho\big(\mathrm{enc}(v_1) + \mathrm{enc}(-v_1), \ldots, \mathrm{enc}(v_k) + \mathrm{enc}(-v_k)\big)$$

SignNetNodeEncoder

Bases: BaseEncoder

SignNet Positional Embedding node encoder (https://arxiv.org/abs/2202.13013, https://github.com/cptq/SignNet-BasisNet).

Uses a precomputed Laplacian eigendecomposition, but instead of eigenvector sign flipping + DeepSet/Transformer, computes the PE as

$$\mathrm{SignNetPE}(v_1, \ldots, v_k) = \rho\big(\big[\phi(v_i) + \phi(-v_i)\big]_{i=1}^{k}\big)$$

where $\phi$ is a GIN network applied to the first $k$ non-trivial eigenvectors, and $\rho$ is an MLP if $k$ is a constant; if all eigenvectors are used, then $\rho$ is a DeepSet with sum pooling.
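The sum $\phi(v_i) + \phi(-v_i)$ is what makes the encoding invariant to the arbitrary sign of each eigenvector. A minimal sketch of that property, with an ordinary MLP standing in for the GIN $\phi$:

```python
import torch
import torch.nn as nn

# Any network phi works; the invariance comes from the symmetric sum.
phi = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 8))

v = torch.randn(5, 4)           # a batch of vectors whose sign is arbitrary
enc = phi(v) + phi(-v)          # encoding of v
enc_flipped = phi(-v) + phi(v)  # encoding after a global sign flip
assert torch.allclose(enc, enc_flipped)  # identical by construction
```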

SignNetPE of size dim_pe will get appended to each node feature vector.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| dim_emb | | Size of final node embedding | required |
make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)

Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layer widths divided by a given factor (2 by default).

TODO: Update this. It is broken

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| divide_factor | float | Factor by which to divide the width. | 2.0 |
| factor_in_dim | bool | Whether to factor the input dimension | False |

SimpleGIN

Bases: Module

__init__(in_dim, hidden_dim, out_dim, num_layers, normalization='none', dropout=0.5, activation='relu')

Not supported yet.