graphium.nn.encoders¶
Implementations of positional encoders in the library
Base Encoder¶
graphium.nn.encoders.base_encoder¶
BaseEncoder¶
Bases: Module, MupMixin
__init__(input_keys, output_keys, in_dim, out_dim, num_layers, activation='relu', first_normalization=None, use_input_keys_prefix=True)¶
Base class for all positional and structural encoders.
Initialize the encoder with the following arguments:
Parameters:
input_keys: The keys from the graph to use as input
output_keys: The keys to return as output encodings
in_dim: The input dimension for the encoder
out_dim: The output dimension of the encodings
num_layers: The number of layers of the encoder
activation: The activation function to use
first_normalization: The normalization to use before the first layer
use_input_keys_prefix: Whether to use the key_prefix argument in the forward method. This is useful when the encodings are categorized by the function get_all_positional_encoding
forward(graph, key_prefix=None) abstractmethod¶
Forward pass of the encoder on a graph. This is a method to be implemented by the child class.
Parameters:
graph: The input pyg Batch
key_prefix: The prefix to use for the input keys
make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)¶
Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layers' width divided by a given factor (2 by default).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
divide_factor | float | Factor by which to divide the width. | 2.0 |
factor_in_dim | bool | Whether to factor the input dimension | False |
Returns: A dictionary with the base model arguments
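For example, a hedged sketch of how these kwargs might be used to build the base model for muP scaling (`encoder` is assumed to be any instantiated subclass; re-instantiating from the returned kwargs is one possible use, not the only one):

```python
# Sketch: build a width-halved "base" model for muP/muTransfer scaling.
# `encoder` is assumed to be an already-constructed BaseEncoder subclass.
base_kwargs = encoder.make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)
base_encoder = type(encoder)(**base_kwargs)  # same class, halved layer widths
```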
parse_input_keys(input_keys) abstractmethod¶
Parse the input_keys argument. This is a method to be implemented by the child class.
Parameters:
input_keys: The input keys to parse
parse_input_keys_with_prefix(key_prefix)¶
Parse the input_keys argument, given a certain prefix. If the prefix is None, it is ignored.
parse_output_keys(output_keys) abstractmethod¶
Parse the output_keys argument. This is a method to be implemented by the child class.
Parameters:
output_keys: The output keys to parse
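Since the class is abstract, a subclass must implement forward, parse_input_keys, and parse_output_keys. Below is a minimal sketch of such a subclass; the class name and the pass-through logic are illustrative only, and it is assumed that the constructor stores the parsed output keys as self.output_keys:

```python
from typing import List, Optional

from torch_geometric.data import Batch

from graphium.nn.encoders.base_encoder import BaseEncoder


class IdentityPosEncoder(BaseEncoder):
    """Hypothetical encoder that passes its input features through unchanged."""

    def parse_input_keys(self, input_keys: List[str]) -> List[str]:
        # A real encoder would validate that the keys are supported here.
        return input_keys

    def parse_output_keys(self, output_keys: List[str]) -> List[str]:
        return output_keys

    def forward(self, graph: Batch, key_prefix: Optional[str] = None):
        input_keys = self.parse_input_keys_with_prefix(key_prefix)
        # One encoding per output key; here the input is simply passed through
        # (a real encoder would apply its layers instead).
        return {out: graph[key] for out, key in zip(self.output_keys, input_keys)}
```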
Gaussian Kernel Positional Encoder¶
graphium.nn.encoders.gaussian_kernel_pos_encoder¶
GaussianKernelPosEncoder¶
Bases: BaseEncoder
__init__(input_keys, output_keys, in_dim, out_dim, embed_dim, num_layers, max_num_nodes_per_graph=None, activation='gelu', first_normalization='none', use_input_keys_prefix=True, num_heads=1)¶
Configurable Gaussian kernel-based Positional Encoding node and edge encoder. Useful for encoding 3D conformation positions.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_keys | List[str] | The keys from the pyg graph to use as input | required |
output_keys | List[str] | The keys to return corresponding to the output encodings | required |
in_dim | int | The input dimension for the encoder | required |
out_dim | int | The output dimension of the encodings | required |
embed_dim | int | The dimension of the embedding | required |
num_layers | int | The number of layers of the encoder | required |
max_num_nodes_per_graph | Optional[int] | The maximum number of nodes per graph | None |
activation | Union[str, Callable] | The activation function to use | 'gelu' |
first_normalization | | The normalization to use before the first layer | 'none' |
use_input_keys_prefix | bool | Whether to use the key_prefix argument in the forward method | True |
num_heads | int | The number of heads to use for the multi-head attention | 1 |
forward(batch, key_prefix=None)¶
Forward function of the GaussianKernelPosEncoder class.
Parameters:
batch: The batch of pyg graphs
key_prefix: The prefix to use for the input keys
Returns:
A dictionary of the output encodings with keys specified by output_keys
make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)¶
Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layers' width divided by a given factor (2 by default).
Parameters:
divide_factor: Factor by which to divide the width.
factor_in_dim: Whether to factor the input dimension.
Returns: A dictionary of the base model kwargs
parse_input_keys(input_keys)¶
Parse the input_keys.
Parameters:
input_keys: The input keys to parse
Returns:
The parsed input keys
parse_output_keys(output_keys)¶
Parse the output_keys.
Parameters:
output_keys: The output keys to parse
Returns:
The parsed output keys
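A minimal usage sketch follows; the key names, dimensions, and the `batch` object are assumptions for illustration, and the input keys must match keys actually present on your pyg Batch:

```python
from graphium.nn.encoders.gaussian_kernel_pos_encoder import GaussianKernelPosEncoder

# Sketch: encode 3D positions into node- and edge-level encodings.
# "positions_3d", "node_enc" and "edge_enc" are hypothetical key names.
encoder = GaussianKernelPosEncoder(
    input_keys=["positions_3d"],           # xyz coordinates stored on the Batch
    output_keys=["node_enc", "edge_enc"],
    in_dim=3,
    out_dim=32,
    embed_dim=32,
    num_layers=2,
    max_num_nodes_per_graph=64,
    num_heads=1,
)
encodings = encoder(batch)  # dict keyed by `output_keys`
```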
Laplacian Positional Encoder¶
graphium.nn.encoders.laplace_pos_encoder¶
LapPENodeEncoder¶
Bases: BaseEncoder
__init__(input_keys, output_keys, in_dim, hidden_dim, out_dim, num_layers, activation='relu', model_type='DeepSet', num_layers_post=1, dropout=0.0, first_normalization=None, use_input_keys_prefix=True, **model_kwargs)¶
Laplace Positional Embedding node encoder. LapPE of size dim_pe will get appended to each node feature vector.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_keys | List[str] | List of input keys to use from the data object. | required |
output_keys | List[str] | List of output keys to add to the data object. | required |
in_dim | | Size of the Laplace PE embedding. Only used by the MLP model | required |
hidden_dim | int | Size of hidden layer | required |
out_dim | int | Size of final node embedding | required |
num_layers | int | Number of layers in the MLP | required |
activation | Optional[Union[str, Callable]] | Activation function to use. | 'relu' |
model_type | str | 'Transformer', 'DeepSet', or 'MLP' | 'DeepSet' |
num_layers_post | | Number of layers to apply after pooling | 1 |
dropout | | Dropout rate | 0.0 |
first_normalization | | Normalization to apply to the first layer. | None |
forward(batch, key_prefix=None)¶
Forward pass of the encoder.
Parameters:
batch: pyg Batch of graphs
key_prefix: Prefix to use for the input and output keys.
Returns:
Output dictionary with keys as specified in output_keys and their output embeddings.
make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)¶
Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layers' width divided by a given factor (2 by default).
Parameters:
divide_factor: Factor by which to divide the width.
factor_in_dim: Whether to factor the input dimension.
Returns: Dictionary of kwargs to be used to create the base model.
parse_input_keys(input_keys)¶
Parse the input keys and make sure they are supported for this encoder.
Parameters:
input_keys: List of input keys to use from the data object.
Returns:
List of parsed input keys
parse_output_keys(output_keys)¶
Parse the output keys.
Parameters:
output_keys: List of output keys to add to the data object.
Returns:
List of parsed output keys
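A minimal usage sketch follows; the input key names (which must match the precomputed eigendecomposition keys stored on the Batch), the dimensions, and the `batch` object are assumptions for illustration:

```python
from graphium.nn.encoders.laplace_pos_encoder import LapPENodeEncoder

# Sketch: encode precomputed Laplacian eigenvectors/eigenvalues into node
# embeddings with a DeepSet model.
encoder = LapPENodeEncoder(
    input_keys=["laplacian_eigvec", "laplacian_eigval"],  # assumed key names
    output_keys=["feat"],   # the PE gets appended to the node features
    in_dim=8,               # size of the Laplace PE
    hidden_dim=64,
    out_dim=32,
    num_layers=2,
    model_type="DeepSet",   # or 'Transformer' / 'MLP'
    num_layers_post=1,
    dropout=0.1,
)
out = encoder(batch)  # {"feat": node_embeddings}
```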
MLP Encoder¶
graphium.nn.encoders.mlp_encoder¶
CatMLPEncoder¶
Bases: BaseEncoder
__init__(input_keys, output_keys, in_dim, hidden_dim, out_dim, num_layers, activation='relu', dropout=0.0, normalization='none', first_normalization='none', use_input_keys_prefix=True)¶
Configurable kernel-based Positional Encoding node/edge-level encoder. Concatenates the list of input (node or edge) features along the feature dimension.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_keys | List[str] | List of input keys; inputs are concatenated in the feature dimension and passed through the MLP | required |
output_keys | str | List of output keys to add to the pyg batch graph | required |
in_dim | | Input dimension of the MLP encoder; sum of the input dimensions of the inputs | required |
hidden_dim | | Hidden dimension of the MLP encoder | required |
out_dim | | Output dimension of the MLP encoder | required |
num_layers | | Number of layers of the MLP encoder | required |
activation | | Activation function to use | 'relu' |
dropout | | Dropout to use | 0.0 |
normalization | | Normalization to use | 'none' |
first_normalization | | Normalization to use before the first layer | 'none' |
use_input_keys_prefix | bool | Whether to use the key_prefix argument in the forward method | True |
forward(batch, key_prefix=None)¶
Forward function of the MLP encoder.
Parameters:
batch: pyg batch graph
key_prefix: Prefix to use for the input keys
Returns:
output: Dictionary of output embeddings with keys specified by output_keys
make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)¶
Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layers' width divided by a given factor (2 by default).
Parameters:
divide_factor: Factor by which to divide the width.
factor_in_dim: Whether to factor the input dimension.
Returns: base_kwargs: Dictionary of kwargs to use for the base model
parse_input_keys(input_keys)¶
Parse the input_keys.
Parameters:
input_keys: List of input keys to use from pyg batch graph
Returns:
parsed input_keys
parse_output_keys(output_keys)¶
Parse the output_keys.
Parameters:
output_keys: List of output keys to add to the pyg batch graph
Returns:
parsed output_keys
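A minimal usage sketch follows; the key names, dimensions, and the `batch` object are assumptions for illustration:

```python
from graphium.nn.encoders.mlp_encoder import CatMLPEncoder

# Sketch: concatenate two positional features along the feature dimension
# before the MLP; `in_dim` must equal the sum of their dimensions.
encoder = CatMLPEncoder(
    input_keys=["rw_pos_enc", "laplacian_eigvec"],  # hypothetical keys, e.g. dims 16 and 8
    output_keys=["feat"],
    in_dim=24,       # 16 + 8
    hidden_dim=64,
    out_dim=32,
    num_layers=2,
)
out = encoder(batch)  # dict of embeddings keyed by `output_keys`
```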
MLPEncoder¶
Bases: BaseEncoder
__init__(input_keys, output_keys, in_dim, hidden_dim, out_dim, num_layers, activation='relu', dropout=0.0, normalization='none', first_normalization='none', use_input_keys_prefix=True)¶
Configurable kernel-based Positional Encoding node/edge-level encoder.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_keys | List[str] | List of input keys to use from the pyg batch graph | required |
output_keys | str | List of output keys to add to the pyg batch graph | required |
in_dim | | Input dimension of the MLP encoder | required |
hidden_dim | | Hidden dimension of the MLP encoder | required |
out_dim | | Output dimension of the MLP encoder | required |
num_layers | | Number of layers of the MLP encoder | required |
activation | | Activation function to use | 'relu' |
dropout | | Dropout to use | 0.0 |
normalization | | Normalization to use | 'none' |
first_normalization | | Normalization to use before the first layer | 'none' |
use_input_keys_prefix | bool | Whether to use the key_prefix argument in the forward method | True |
forward(batch, key_prefix=None)¶
Forward function of the MLP encoder.
Parameters:
batch: pyg batch graph
key_prefix: Prefix to use for the input keys
Returns:
output: Dictionary of output embeddings with keys specified by output_keys
make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)¶
Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layers' width divided by a given factor (2 by default).
Parameters:
divide_factor: Factor by which to divide the width.
factor_in_dim: Whether to factor the input dimension.
Returns: base_kwargs: Dictionary of kwargs to use for the base model
parse_input_keys(input_keys)¶
Parse the input_keys.
Parameters:
input_keys: List of input keys to use from pyg batch graph
Returns:
parsed input_keys
parse_output_keys(output_keys)¶
Parse the output_keys.
Parameters:
output_keys: List of output keys to add to the pyg batch graph
Returns:
parsed output_keys
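A minimal usage sketch follows; the key names, dimensions, and the `batch` object are assumptions for illustration:

```python
from graphium.nn.encoders.mlp_encoder import MLPEncoder

# Sketch: a plain MLP over a single precomputed positional feature.
encoder = MLPEncoder(
    input_keys=["rw_pos_enc"],  # hypothetical random-walk PE key
    output_keys=["feat"],
    in_dim=16,
    hidden_dim=64,
    out_dim=32,
    num_layers=2,
)
out = encoder(batch)  # dict of embeddings keyed by `output_keys`
```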
SignNet Positional Encoder¶
graphium.nn.encoders.signnet_pos_encoder¶
SignNet (https://arxiv.org/abs/2202.13013), based on https://github.com/cptq/SignNet-BasisNet
GINDeepSigns¶
Bases: Module
Sign-invariant neural network with MLP aggregation. f(v1, ..., vk) = rho(enc(v1) + enc(-v1), ..., enc(vk) + enc(-vk))
MaskedGINDeepSigns¶
Bases: Module
Sign-invariant neural network with sum pooling and DeepSet. f(v1, ..., vk) = rho(enc(v1) + enc(-v1), ..., enc(vk) + enc(-vk))
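The sign invariance of this construction, where each term enc(v_i) + enc(-v_i) is unchanged when v_i flips sign, can be checked numerically. A minimal sketch with a linear layer standing in for the enc network:

```python
import torch

# Stand-in for the encoder applied to each eigenvector (a GIN in SignNet).
enc = torch.nn.Linear(8, 16)

def f(v: torch.Tensor) -> torch.Tensor:
    # Summing enc(v) and enc(-v) makes the output invariant to sign flips.
    return enc(v) + enc(-v)

v = torch.randn(8)
torch.testing.assert_close(f(v), f(-v))  # identical output for v and -v
```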
SignNetNodeEncoder¶
Bases: BaseEncoder
SignNet Positional Embedding node encoder. https://arxiv.org/abs/2202.13013 https://github.com/cptq/SignNet-BasisNet
Uses a precomputed Laplacian eigendecomposition, but instead of eigenvector sign flipping + DeepSet/Transformer, computes the PE as
$$\mathrm{SignNetPE}(v_1, \ldots, v_k) = \rho\big(\big[\phi(v_i) + \phi(-v_i)\big]_{i=1}^{k}\big)$$
where $\phi$ is a GIN network applied to the first $k$ non-trivial eigenvectors, and $\rho$ is an MLP if $k$ is a constant; if all eigenvectors are used, then $\rho$ is a DeepSet with sum pooling.
SignNetPE of size dim_pe will get appended to each node feature vector.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dim_emb | | Size of final node embedding | required |
make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)¶
Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layers' width divided by a given factor (2 by default).
TODO: Update this. It is broken¶
Parameters:
Name | Type | Description | Default |
---|---|---|---|
divide_factor | float | Factor by which to divide the width. | 2.0 |
factor_in_dim | bool | Whether to factor the input dimension | False |