graphium.nn.encoders
Implementations of positional encoders in the library.
Base Encoder
          graphium.nn.encoders.base_encoder
          BaseEncoder
            Bases: Module, MupMixin
          __init__(input_keys, output_keys, in_dim, out_dim, num_layers, activation='relu', first_normalization=None, use_input_keys_prefix=True)
  Base class for all positional and structural encoders.
Initialize the encoder with the following arguments:
Parameters:
    input_keys: The keys from the graph to use as input
    output_keys: The keys to return as output encodings
    in_dim: The input dimension for the encoder
    out_dim: The output dimension of the encodings
    num_layers: The number of layers of the encoder
    activation: The activation function to use
    first_normalization: The normalization to use before the first layer
    use_input_keys_prefix: Whether to use the key_prefix argument in the forward method. This is useful when the encodings are categorized by the function get_all_positional_encoding
          forward(graph, key_prefix=None)
      abstractmethod
  Forward pass of the encoder on a graph. This is a method to be implemented by the child class.
Parameters:
    graph: The input pyg Batch
          make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)
  Create a 'base' model to be used by the mup or muTransfer scaling of the model.
The base model is usually identical to the regular model, but with the
layer widths divided by a given factor (2 by default).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| divide_factor | float | Factor by which to divide the width. | 2.0 |
| factor_in_dim | bool | Whether to factor the input dimension | False |
Returns: A dictionary with the base model arguments
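As a rough illustration of the width scaling described above, the sketch below divides the width-like arguments of an encoder by `divide_factor`. The helper name `sketch_mup_base_kwargs`, the choice of `out_dim` and `in_dim` as the only scaled keys, and the use of `round` are assumptions made for this example, not the library's implementation.

```python
def sketch_mup_base_kwargs(kwargs, divide_factor=2.0, factor_in_dim=False):
    """Return a copy of `kwargs` with width-like entries divided by `divide_factor`.

    Hypothetical helper: which keys are scaled and how they are rounded
    are assumptions for illustration only.
    """
    base = dict(kwargs)  # shallow copy so the caller's dict is left untouched
    base["out_dim"] = round(kwargs["out_dim"] / divide_factor)
    if factor_in_dim:
        # The input width is only shrunk when explicitly requested
        base["in_dim"] = round(kwargs["in_dim"] / divide_factor)
    return base


full = {"in_dim": 16, "out_dim": 64, "num_layers": 2}
base = sketch_mup_base_kwargs(full, divide_factor=2.0)
base_with_in_dim = sketch_mup_base_kwargs(full, divide_factor=4.0, factor_in_dim=True)
```

With `factor_in_dim=False` the input dimension is kept, which fits encoders whose input width is fixed by the data rather than by the model size.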
          parse_input_keys(input_keys)
      abstractmethod
  Parse the input_keys argument. This is a method to be implemented by the child class.
Parameters:
    input_keys: The input keys to parse
          parse_input_keys_with_prefix(key_prefix)
  Parse the input_keys argument, given a certain prefix.
If the prefix is None, it is ignored.
          parse_output_keys(output_keys)
      abstractmethod
  Parse the output_keys argument. This is a method to be implemented by the child class.
Parameters:
    output_keys: The output keys to parse
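The contract described above (three abstract methods plus prefix-aware key parsing) can be sketched in plain Python. This is a stand-in, not the library's `BaseEncoder`: the classes `SketchEncoder` and `IdentityEncoder`, the key names, the use of a dict in place of a pyg Batch, and the rule of joining prefix and key with `/` are all assumptions made for this example.

```python
from abc import ABC, abstractmethod


class SketchEncoder(ABC):
    """Stand-in for the encoder contract; not the library's BaseEncoder."""

    def __init__(self, input_keys, output_keys, use_input_keys_prefix=True):
        self.input_keys = self.parse_input_keys(input_keys)
        self.output_keys = self.parse_output_keys(output_keys)
        self.use_input_keys_prefix = use_input_keys_prefix

    def parse_input_keys_with_prefix(self, key_prefix):
        # A None prefix is ignored, as is any prefix when the flag is off
        if key_prefix is None or not self.use_input_keys_prefix:
            return list(self.input_keys)
        # Assumption: prefix and key are joined with "/"
        return [f"{key_prefix}/{key}" for key in self.input_keys]

    @abstractmethod
    def parse_input_keys(self, input_keys):
        ...

    @abstractmethod
    def parse_output_keys(self, output_keys):
        ...

    @abstractmethod
    def forward(self, graph, key_prefix=None):
        ...


class IdentityEncoder(SketchEncoder):
    """Toy child class: copies each input entry to its output key."""

    def parse_input_keys(self, input_keys):
        return list(input_keys)

    def parse_output_keys(self, output_keys):
        return list(output_keys)

    def forward(self, graph, key_prefix=None):
        keys = self.parse_input_keys_with_prefix(key_prefix)
        return {out: graph[key] for key, out in zip(keys, self.output_keys)}


encoder = IdentityEncoder(["eigvecs"], ["feat"])
graph = {"eigvecs": [0.1, -0.2], "node/eigvecs": [0.3, 0.4]}  # dict stands in for a pyg Batch
plain = encoder.forward(graph)
prefixed = encoder.forward(graph, key_prefix="node")
```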
Gaussian Kernel Positional Encoder
          graphium.nn.encoders.gaussian_kernel_pos_encoder
          GaussianKernelPosEncoder
            Bases: BaseEncoder
          __init__(input_keys, output_keys, in_dim, out_dim, embed_dim, num_layers, max_num_nodes_per_graph=None, activation='gelu', first_normalization='none', use_input_keys_prefix=True, num_heads=1)
  Configurable Gaussian kernel-based Positional Encoding node and edge encoder. Useful for encoding 3D conformation positions.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_keys | List[str] | The keys from the pyg graph to use as input | required |
| output_keys | List[str] | The keys to return corresponding to the output encodings | required |
| in_dim | int | The input dimension for the encoder | required |
| out_dim | int | The output dimension of the encodings | required |
| embed_dim | int | The dimension of the embedding | required |
| num_layers | int | The number of layers of the encoder | required |
| max_num_nodes_per_graph | Optional[int] | The maximum number of nodes per graph | None |
| activation | Union[str, Callable] | The activation function to use | 'gelu' |
| first_normalization | | The normalization to use before the first layer | 'none' |
| use_input_keys_prefix | bool | Whether to use the key_prefix argument in the forward method | True |
| num_heads | int | The number of heads to use for the multi-head attention | 1 |
          forward(batch, key_prefix=None)
  Forward function of the GaussianKernelPosEncoder class.
Parameters:
    batch: The batch of pyg graphs
    key_prefix: The prefix to use for the input keys
Returns:
    A dictionary of the output encodings with keys specified by output_keys
          make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)
  Create a 'base' model to be used by the mup or muTransfer scaling of the model.
The base model is usually identical to the regular model, but with the
layer widths divided by a given factor (2 by default).
Parameters:
    divide_factor: Factor by which to divide the width.
    factor_in_dim: Whether to factor the input dimension
Returns:
    A dictionary of the base model kwargs
          parse_input_keys(input_keys)
  Parse the input_keys.
Parameters:
    input_keys: The input keys to parse
Returns:
    The parsed input keys
          parse_output_keys(output_keys)
  Parse the output_keys.
Parameters:
    output_keys: The output keys to parse
Returns:
    The parsed output keys
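To make the idea of a Gaussian kernel over 3D positions concrete, here is a minimal sketch that expands each pairwise Euclidean distance into a vector of Gaussian basis values. The kernel form (a normal density with fixed means and standard deviations), the function names, and the sample coordinates are assumptions for illustration. This page does not specify how GaussianKernelPosEncoder parameterizes its kernels, and in practice the means and widths are typically learned.

```python
import math


def gaussian_basis(distance, means, stds):
    """Evaluate one Gaussian density per (mean, std) pair at `distance`."""
    return [
        math.exp(-0.5 * ((distance - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
        for mu, sigma in zip(means, stds)
    ]


def pairwise_distances(positions):
    """Euclidean distance between every pair of 3D positions."""
    return [[math.dist(p, q) for q in positions] for p in positions]


# Three atoms of a hypothetical conformer
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
dists = pairwise_distances(positions)

# One feature vector per (node, node) pair, with three kernels centred at 0, 1 and 2
means, stds = [0.0, 1.0, 2.0], [1.0, 1.0, 1.0]
features = [[gaussian_basis(d, means, stds) for d in row] for row in dists]
```

Each pair of nodes ends up with one value per kernel, and the kernel whose mean is closest to the actual distance responds most strongly.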
Laplacian Positional Encoder
          graphium.nn.encoders.laplace_pos_encoder
          LapPENodeEncoder
            Bases: BaseEncoder
          __init__(input_keys, output_keys, in_dim, hidden_dim, out_dim, num_layers, activation='relu', model_type='DeepSet', num_layers_post=1, dropout=0.0, first_normalization=None, use_input_keys_prefix=True, **model_kwargs)
  Laplace Positional Embedding node encoder. LapPE of size dim_pe will get appended to each node feature vector.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_keys | List[str] | List of input keys to use from the data object. | required |
| output_keys | List[str] | List of output keys to add to the data object. | required |
| in_dim | | Size of Laplace PE embedding. Only used by the MLP model | required |
| hidden_dim | int | Size of hidden layer | required |
| out_dim | int | Size of final node embedding | required |
| num_layers | int | Number of layers in the MLP | required |
| activation | Optional[Union[str, Callable]] | Activation function to use. | 'relu' |
| model_type | str | 'Transformer' or 'DeepSet' or 'MLP' | 'DeepSet' |
| num_layers_post | | Number of layers to apply after pooling | 1 |
| dropout | | Dropout rate | 0.0 |
| first_normalization | | Normalization to apply to the first layer. | None |
          forward(batch, key_prefix=None)
  Forward pass of the encoder.
Parameters:
    batch: pyg Batches of graphs
    key_prefix: Prefix to use for the input and output keys.
Returns:
    output dictionary with keys as specified in output_keys and their output embeddings.
          make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)
  Create a 'base' model to be used by the mup or muTransfer scaling of the model.
The base model is usually identical to the regular model, but with the
layer widths divided by a given factor (2 by default).
Parameters:
    divide_factor: Factor by which to divide the width.
    factor_in_dim: Whether to factor the input dimension
Returns:
    Dictionary of kwargs to be used to create the base model.
          parse_input_keys(input_keys)
  Parse the input keys and make sure they are supported for this encoder.
Parameters:
    input_keys: List of input keys to use from the data object.
Returns:
    List of parsed input keys
          parse_output_keys(output_keys)
  Parse the output keys.
Parameters:
    output_keys: List of output keys to add to the data object.
Returns:
    List of parsed output keys
MLP Encoder
          graphium.nn.encoders.mlp_encoder
          CatMLPEncoder
            Bases: BaseEncoder
          __init__(input_keys, output_keys, in_dim, hidden_dim, out_dim, num_layers, activation='relu', dropout=0.0, normalization='none', first_normalization='none', use_input_keys_prefix=True)
  Configurable kernel-based Positional Encoding node/edge-level encoder. Concatenates the list of input (node or edge) features in the feature dimension.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_keys | List[str] | List of input keys; inputs are concatenated in the feature dimension and passed through the MLP | required |
| output_keys | str | List of output keys to add to the pyg batch graph | required |
| in_dim | | Input dimension of the MLP encoder; sum of the input dimensions of the inputs | required |
| hidden_dim | | Hidden dimension of the MLP encoder | required |
| out_dim | | Output dimension of the MLP encoder | required |
| num_layers | | Number of layers of the MLP encoder | required |
| activation | | Activation function to use | 'relu' |
| dropout | | Dropout to use | 0.0 |
| normalization | | Normalization to use | 'none' |
| first_normalization | | Normalization to use before the first layer | 'none' |
| use_input_keys_prefix | bool | Whether to use the key_prefix argument in the forward method | True |
          forward(batch, key_prefix=None)
  Forward function of the MLP encoder.
Parameters:
    batch: pyg batch graph
    key_prefix: Prefix to use for the input keys
Returns:
    output: Dictionary of output embeddings with keys specified by output_keys
          make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)
  Create a 'base' model to be used by the mup or muTransfer scaling of the model.
The base model is usually identical to the regular model, but with the
layer widths divided by a given factor (2 by default).
Parameters:
    divide_factor: Factor by which to divide the width.
    factor_in_dim: Whether to factor the input dimension
Returns:
    base_kwargs: Dictionary of kwargs to use for the base model
          parse_input_keys(input_keys)
  Parse the input_keys.
Parameters:
    input_keys: List of input keys to use from pyg batch graph
Returns:
    parsed input_keys
          parse_output_keys(output_keys)
  Parse the output_keys.
Parameters:
    output_keys: List of output keys to add to the pyg batch graph
Returns:
    parsed output_keys
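Because CatMLPEncoder concatenates its inputs along the feature dimension, `in_dim` has to equal the sum of the individual input widths. The sketch below shows that bookkeeping with nested lists standing in for tensors. The helper `concat_features` and the keys `pe_a` and `pe_b` are hypothetical names chosen for this example.

```python
def concat_features(batch, input_keys):
    """Concatenate per-node feature rows from several keys along the feature axis."""
    columns = [batch[key] for key in input_keys]
    num_nodes = len(columns[0])
    if any(len(col) != num_nodes for col in columns):
        raise ValueError("all inputs must have the same number of rows")
    # For each node, join its feature row from every input into one longer row
    return [sum((col[i] for col in columns), []) for i in range(num_nodes)]


batch = {
    "pe_a": [[1.0, 2.0], [3.0, 4.0]],  # 2 nodes, 2 features each
    "pe_b": [[5.0], [6.0]],            # 2 nodes, 1 feature each
}
merged = concat_features(batch, ["pe_a", "pe_b"])
in_dim = len(merged[0])  # width the MLP would receive: 2 + 1
```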
          MLPEncoder
            Bases: BaseEncoder
          __init__(input_keys, output_keys, in_dim, hidden_dim, out_dim, num_layers, activation='relu', dropout=0.0, normalization='none', first_normalization='none', use_input_keys_prefix=True)
  Configurable kernel-based Positional Encoding node/edge-level encoder.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_keys | List[str] | List of input keys to use from pyg batch graph | required |
| output_keys | str | List of output keys to add to the pyg batch graph | required |
| in_dim | | Input dimension of the MLP encoder | required |
| hidden_dim | | Hidden dimension of the MLP encoder | required |
| out_dim | | Output dimension of the MLP encoder | required |
| num_layers | | Number of layers of the MLP encoder | required |
| activation | | Activation function to use | 'relu' |
| dropout | | Dropout to use | 0.0 |
| normalization | | Normalization to use | 'none' |
| first_normalization | | Normalization to use before the first layer | 'none' |
| use_input_keys_prefix | bool | Whether to use the key_prefix argument in the forward method | True |
          forward(batch, key_prefix=None)
  Forward function of the MLP encoder.
Parameters:
    batch: pyg batch graph
    key_prefix: Prefix to use for the input keys
Returns:
    output: Dictionary of output embeddings with keys specified by output_keys
          make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)
  Create a 'base' model to be used by the mup or muTransfer scaling of the model.
The base model is usually identical to the regular model, but with the
layer widths divided by a given factor (2 by default).
Parameters:
    divide_factor: Factor by which to divide the width.
    factor_in_dim: Whether to factor the input dimension
Returns:
    base_kwargs: Dictionary of kwargs to use for the base model
          parse_input_keys(input_keys)
  Parse the input_keys.
Parameters:
    input_keys: List of input keys to use from pyg batch graph
Returns:
    parsed input_keys
          parse_output_keys(output_keys)
  Parse the output_keys.
Parameters:
    output_keys: List of output keys to add to the pyg batch graph
Returns:
    parsed output_keys
SignNet Positional Encoder
          graphium.nn.encoders.signnet_pos_encoder
  SignNet https://arxiv.org/abs/2202.13013 based on https://github.com/cptq/SignNet-BasisNet
          GINDeepSigns
            Bases: Module
Sign invariant neural network with MLP aggregation. f(v1, ..., vk) = rho(enc(v1) + enc(-v1), ..., enc(vk) + enc(-vk))
          MaskedGINDeepSigns
            Bases: Module
Sign invariant neural network with sum pooling and DeepSet. f(v1, ..., vk) = rho(enc(v1) + enc(-v1), ..., enc(vk) + enc(-vk))
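The formula f(v1, ..., vk) = rho(enc(v1) + enc(-v1), ..., enc(vk) + enc(-vk)) is sign invariant because each term enc(v) + enc(-v) is unchanged when v is replaced by -v. The toy sketch below checks this numerically. The functions `phi` and `rho` are simple stand-ins chosen for this example (an elementwise nonlinearity and a per-node sum), not the GIN and MLP/DeepSet networks the library uses.

```python
import math


def phi(v):
    """Toy elementwise encoder; deliberately not an even function."""
    return [math.tanh(2.0 * x + 0.5) for x in v]


def rho(encoded):
    """Toy aggregator: sums the per-eigenvector encodings node by node."""
    return [sum(values) for values in zip(*encoded)]


def sign_invariant(eigvecs):
    """f(v1..vk) = rho(phi(v1) + phi(-v1), ..., phi(vk) + phi(-vk))."""
    symmetrized = []
    for v in eigvecs:
        neg = [-x for x in v]
        symmetrized.append([a + b for a, b in zip(phi(v), phi(neg))])
    return rho(symmetrized)


# Two made-up "eigenvectors" over three nodes
eigvecs = [[0.5, -0.5, 0.0], [0.1, 0.2, -0.3]]
flipped = [[-x for x in eigvecs[0]], eigvecs[1]]  # flip the sign of v1 only

out = sign_invariant(eigvecs)
out_flipped = sign_invariant(flipped)
```

`phi` alone gives different values for v1 and -v1, yet `out` and `out_flipped` agree, which is the property that makes the encoding robust to the arbitrary sign of Laplacian eigenvectors.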
          SignNetNodeEncoder
            Bases: BaseEncoder
SignNet Positional Embedding node encoder. https://arxiv.org/abs/2202.13013 https://github.com/cptq/SignNet-BasisNet
Uses a precomputed Laplacian eigen-decomposition, but instead of eigenvector sign flipping + DeepSet/Transformer, computes the PE as:
SignNetPE(v_1, ..., v_k) = \rho([\phi(v_i) + \phi(-v_i)]_{i=1}^{k})
where \phi is a GIN network applied to the first k non-trivial eigenvectors, and \rho is an MLP if k is a constant; if all eigenvectors are used, \rho is a DeepSet with sum-pooling.
SignNetPE of size dim_pe will get appended to each node feature vector.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dim_emb | | Size of final node embedding | required |
          make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)
  Create a 'base' model to be used by the mup or muTransfer scaling of the model.
The base model is usually identical to the regular model, but with the
layer widths divided by a given factor (2 by default).
TODO: Update this. It is broken
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| divide_factor | float | Factor by which to divide the width. | 2.0 |
| factor_in_dim | bool | Whether to factor the input dimension | False |