# graphium.nn.architectures

High-level architectures in the library.
## Global Architectures

### graphium.nn.architectures.global_architectures
#### FeedForwardGraph

Bases: `FeedForwardNN`

##### `__init__(in_dim, out_dim, hidden_dims, layer_type, depth=None, activation='relu', last_activation='none', dropout=0.0, last_dropout=0.0, normalization='none', first_normalization='none', last_normalization='none', residual_type='none', residual_skip_steps=1, in_dim_edges=0, hidden_dims_edges=[], name='GNN', layer_kwargs=None, virtual_node='none', use_virtual_edges=False, last_layer_is_readout=False)`
A flexible neural network architecture supporting variable hidden dimensions, multiple layer types, and different residual connections.

This class is meant to work with different graph neural network layers. Any layer must inherit from `graphium.nn.base_graph_layer.BaseGraphStructure` or `graphium.nn.base_graph_layer.BaseGraphLayer`.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `in_dim` | `int` | Input feature dimensions of the layer | *required* |
| `out_dim` | `int` | Output feature dimensions of the layer | *required* |
| `hidden_dims` | `Union[List[int], int]` | List of dimensions in the hidden layers. Be careful, the "simple" residual type only supports hidden dimensions of the same value. | *required* |
| `layer_type` | `Union[str, nn.Module]` | The type of layers to use in the network. Either a string, or an `nn.Module` class that inherits from `graphium.nn.base_graph_layer.BaseGraphStructure`. | *required* |
| `depth` | `Optional[int]` | Number of layers to use when `hidden_dims` is an integer; should be `None` when `hidden_dims` is a list. | `None` |
| `activation` | `Union[str, Callable]` | Activation function to use in the hidden layers. | `'relu'` |
| `last_activation` | `Union[str, Callable]` | Activation function to use in the last layer. | `'none'` |
| `dropout` | `float` | The ratio of units to drop out. Must be between 0 and 1. | `0.0` |
| `last_dropout` | `float` | The ratio of units to drop out in the last layer. Must be between 0 and 1. | `0.0` |
| `normalization` | `Union[str, Callable]` | Normalization to use. Choices: "none", "batch_norm", "layer_norm", or a `Callable`. | `'none'` |
| `first_normalization` | `Union[str, Callable]` | Normalization to use before the first layer. | `'none'` |
| `last_normalization` | `Union[str, Callable]` | Normalization to use in the last layer. | `'none'` |
| `residual_type` | `str` | Type of residual connection to use. Choices: "none", "simple", "weighted", "concat", "densenet". | `'none'` |
| `residual_skip_steps` | `int` | The number of steps to skip between each residual connection. If `1`, all layers are connected; if `2`, every other layer is connected. | `1` |
| `in_dim_edges` | `int` | Input edge-feature dimensions of the network. Keep at 0 if not using edge features, or if the layer doesn't support edges. | `0` |
| `hidden_dims_edges` | `List[int]` | Hidden dimensions for the edges. Most models don't support it, so it should only be used with layers that do. | `[]` |
| `name` | `str` | Name attributed to the current network, for display and printing purposes. | `'GNN'` |
| `layer_kwargs` | `Optional[Dict]` | The arguments to be used in the initialization of the layer provided by `layer_type`. | `None` |
| `virtual_node` | `str` | A string specifying the type of virtual node to use. The virtual node will not use any residual connection if `residual_type` is "none"; otherwise it uses a simple ResNet-like residual connection. | `'none'` |
| `use_virtual_edges` | `bool` | A bool flag used to select whether the virtual node should use the edges or not. | `False` |
| `last_layer_is_readout` | `bool` | Whether the last layer should be treated as a readout layer, allowing the use of `mup.MuReadout` from the muTransfer method. | `False` |
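As a minimal usage sketch (assuming Graphium is installed and that `"pyg:gcn"` is a valid layer-type string in your version; any class inheriting from `BaseGraphStructure` can be passed instead), a two-hidden-layer graph network could look like:

```python
from graphium.nn.architectures.global_architectures import FeedForwardGraph

gnn = FeedForwardGraph(
    in_dim=64,                # node feature dimension
    out_dim=32,               # output node feature dimension
    hidden_dims=[128, 128],   # two hidden layers; depth is inferred from the list
    layer_type="pyg:gcn",     # assumed layer-type string; a BaseGraphStructure subclass also works
    activation="relu",
    normalization="batch_norm",
    residual_type="simple",   # "simple" requires equal hidden dims
)
print(gnn)                    # __repr__ summarizes the architecture
```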
##### `__repr__()`

Controls how the class is printed.
##### `forward(g)`

Apply the full graph neural network on the input graph and node features.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `g` | `Batch` | pyg Batch graph on which the convolution is done, carrying the node features (and edge features when used) under its keys. | *required* |

Returns:

| Type | Description |
|---|---|
| `torch.Tensor` | The output node features after applying the full network. |
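Continuing the sketch above (assuming the node features are stored under the `"feat"` key, as elsewhere in Graphium; treat the key name as an assumption for your version):

```python
import torch
from torch_geometric.data import Data, Batch

# Two hypothetical toy graphs; node features stored under the "feat" key.
g1 = Data(feat=torch.randn(3, 64), edge_index=torch.tensor([[0, 1, 2], [1, 2, 0]]))
g2 = Data(feat=torch.randn(2, 64), edge_index=torch.tensor([[0, 1], [1, 0]]))
batch = Batch.from_data_list([g1, g2])

out = gnn(batch)   # torch.Tensor of node features, last dim == out_dim
```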
##### `get_init_kwargs()`

Get a dictionary that can be used to instantiate a new object with identical parameters.
##### `make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)`

Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layer widths divided by a given factor (2 by default).

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `divide_factor` | `float` | Factor by which to divide the width. | `2.0` |
| `factor_in_dim` | `bool` | Whether to also factor the input dimension for the nodes. | `False` |

Returns:

| Name | Type | Description |
|---|---|---|
| `kwargs` | `Dict[str, Any]` | Dictionary of parameters to be used to instantiate the base model, with widths divided by the factor. |
#### FeedForwardNN

Bases: `nn.Module`, `MupMixin`

##### `__init__(in_dim, out_dim, hidden_dims, depth=None, activation='relu', last_activation='none', dropout=0.0, last_dropout=0.0, normalization='none', first_normalization='none', last_normalization='none', residual_type='none', residual_skip_steps=1, name='LNN', layer_type='fc', layer_kwargs=None, last_layer_is_readout=False)`
A flexible neural network architecture supporting variable hidden dimensions, multiple layer types, and different residual connections.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `in_dim` | `int` | Input feature dimensions of the layer | *required* |
| `out_dim` | `int` | Output feature dimensions of the layer | *required* |
| `hidden_dims` | `Union[List[int], int]` | Either an integer specifying all the hidden dimensions, or a list of dimensions in the hidden layers. Be careful, the "simple" residual type only supports hidden dimensions of the same value. | *required* |
| `depth` | `Optional[int]` | Number of layers to use when `hidden_dims` is an integer; should be `None` when `hidden_dims` is a list. | `None` |
| `activation` | `Union[str, Callable]` | Activation function to use in the hidden layers. | `'relu'` |
| `last_activation` | `Union[str, Callable]` | Activation function to use in the last layer. | `'none'` |
| `dropout` | `float` | The ratio of units to drop out. Must be between 0 and 1. | `0.0` |
| `last_dropout` | `float` | The ratio of units to drop out in the last layer. Must be between 0 and 1. | `0.0` |
| `normalization` | `Union[str, Callable]` | Normalization to use. Choices: "none", "batch_norm", "layer_norm", or a `Callable`. | `'none'` |
| `first_normalization` | `Union[str, Callable]` | Normalization to use before the first layer. | `'none'` |
| `last_normalization` | `Union[str, Callable]` | Normalization to use in the last layer. | `'none'` |
| `residual_type` | `str` | Type of residual connection to use. Choices: "none", "simple", "weighted", "concat", "densenet". | `'none'` |
| `residual_skip_steps` | `int` | The number of steps to skip between each residual connection. If `1`, all layers are connected; if `2`, every other layer is connected. | `1` |
| `name` | `str` | Name attributed to the current network, for display and printing purposes. | `'LNN'` |
| `layer_type` | `Union[str, nn.Module]` | The type of layers to use in the network. Either `"fc"` for the default fully-connected layer, or an `nn.Module` class to use. | `'fc'` |
| `layer_kwargs` | `Optional[Dict]` | The arguments to be used in the initialization of the layer provided by `layer_type`. | `None` |
| `last_layer_is_readout` | `bool` | Whether the last layer should be treated as a readout layer, allowing the use of `mup.MuReadout` from the muTransfer method. | `False` |
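A minimal runnable sketch of a plain MLP built with this class (assuming Graphium is installed; all dimensions are illustrative):

```python
import torch
from graphium.nn.architectures.global_architectures import FeedForwardNN

mlp = FeedForwardNN(
    in_dim=16,
    out_dim=2,
    hidden_dims=[32, 32],      # two hidden layers of width 32
    activation="relu",
    normalization="layer_norm",
)
h = torch.randn(8, 16)         # batch of 8 feature vectors
out = mlp(h)                   # forward pass -> shape [8, 2]
```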
##### `__repr__()`

Controls how the class is printed.
##### `forward(h)`

Apply the neural network on the input features.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `h` | `torch.Tensor` | Input features of the network. | *required* |

Returns:

| Type | Description |
|---|---|
| `torch.Tensor` | Output features after applying all the layers. |
##### `get_init_kwargs()`

Get a dictionary that can be used to instantiate a new object with identical parameters.
##### `make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)`

Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layer widths divided by a given factor (2 by default).

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `divide_factor` | `float` | Factor by which to divide the width. | `2.0` |
| `factor_in_dim` | `bool` | Whether to also factor the input dimension. | `False` |
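As a sketch of how these kwargs might plug into the `mup` package (reusing the `mlp` from the example above; `mup.set_base_shapes` is the standard muP entry point, but check the muTransfer documentation for the full recipe):

```python
import mup

# Build a 'base' copy with every width divided by 2, then register the
# base shapes so muP can rescale initializations and learning rates.
base_kwargs = mlp.make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)
base_mlp = FeedForwardNN(**base_kwargs)
mup.set_base_shapes(mlp, base_mlp)
```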
#### FullGraphMultiTaskNetwork

Bases: `nn.Module`, `MupMixin`

##### `in_dim: int` *(property)*

Returns the input dimension of the network.

##### `in_dim_edges: int` *(property)*

Returns the input edge dimension of the network.

##### `out_dim: int` *(property)*

Returns the output dimension of the network.

##### `out_dim_edges: int` *(property)*

Returns the output dimension of the edges of the network.
##### `__init__(gnn_kwargs, pre_nn_kwargs=None, pre_nn_edges_kwargs=None, pe_encoders_kwargs=None, task_heads_kwargs=None, graph_output_nn_kwargs=None, accelerator_kwargs=None, num_inference_to_average=1, last_layer_is_readout=False, name='FullGNN')`

Class implementing a full graph neural network architecture, including the pre-processing MLP and the post-processing MLP.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `gnn_kwargs` | `Dict[str, Any]` | Key-word arguments to use for the initialization of the GNN network, using the class `FeedForwardGraph`. | *required* |
| `pe_encoders_kwargs` | `Optional[Dict[str, Any]]` | Key-word arguments to use for the initialization of all positional-encoding encoders. See the `EncoderManager` class. | `None` |
| `pre_nn_kwargs` | `Optional[Dict[str, Any]]` | Key-word arguments to use for the initialization of the pre-processing MLP network applied to the node features before the GNN, using the class `FeedForwardNN`. | `None` |
| `pre_nn_edges_kwargs` | `Optional[Dict[str, Any]]` | Key-word arguments to use for the initialization of the pre-processing MLP network applied to the edge features before the GNN, using the class `FeedForwardNN`. | `None` |
| `task_heads_kwargs` | `Optional[Dict[str, Any]]` | A collection of dictionaries containing the arguments for the task heads. Each entry is used to initialize a task-specific MLP. | `None` |
| `graph_output_nn_kwargs` | `Optional[Dict[str, Any]]` | A collection of dictionaries corresponding to the arguments of a `FeedForwardNN`. Each entry is used to initialize a shared MLP. | `None` |
| `accelerator_kwargs` | `Optional[Dict[str, Any]]` | Key-word arguments specific to the accelerator being used, e.g. pipeline split points. | `None` |
| `num_inference_to_average` | `int` | Number of inferences to average at val/test time. This is used to avoid the noise introduced by positional encodings with sign-flips. If no such encoding is given, this parameter is ignored. NOTE: The inference time is slowed down proportionally to this parameter. | `1` |
| `last_layer_is_readout` | `bool` | Whether the last layer should be treated as a readout layer, allowing the use of `mup.MuReadout` from the muTransfer method. | `False` |
| `name` | `str` | Name attributed to the current network, for display and printing purposes. | `'FullGNN'` |
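A heavily simplified construction sketch, showing only the nested-kwargs pattern (the exact required keys vary by version and are assumptions here; task heads, output NNs, and encoders are omitted for brevity):

```python
from graphium.nn.architectures.global_architectures import FullGraphMultiTaskNetwork

net = FullGraphMultiTaskNetwork(
    pre_nn_kwargs=dict(in_dim=64, out_dim=96, hidden_dims=[96]),  # node MLP before the GNN
    gnn_kwargs=dict(
        in_dim=96,              # must match the pre-NN output dimension
        out_dim=32,
        hidden_dims=[128, 128],
        layer_type="pyg:gcn",   # assumed layer-type string
    ),
)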
##### `__repr__()`

Controls how the class is printed.
##### `forward(g)`

Apply the pre-processing neural network, the graph neural network, and the post-processing neural network on the graph features.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `g` | `Batch` | pyg Batch graph on which the convolution is done. Must contain the node features, plus the edge features and positional encodings when used by the network. | *required* |

Returns:

| Type | Description |
|---|---|
| `Tensor` | The network output after the pre-processing, GNN, and post-processing networks. |
##### `make_mup_base_kwargs(divide_factor=2.0)`

Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layer widths divided by a given factor (2 by default).

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `divide_factor` | `float` | Factor by which to divide the width. | `2.0` |

Returns:

| Type | Description |
|---|---|
| `Dict[str, Any]` | Dictionary with the kwargs to create the base model. |
##### `set_max_num_nodes_edges_per_graph(max_nodes, max_edges)`

Set the maximum number of nodes and edges for all GNN layers and encoder layers.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `max_nodes` | `Optional[int]` | Maximum number of nodes in the dataset. Useful for certain architectures, ignored by others. | *required* |
| `max_edges` | `Optional[int]` | Maximum number of edges in the dataset. Useful for certain architectures, ignored by others. | *required* |
#### GraphOutputNN

Bases: `nn.Module`, `MupMixin`

##### `concat_last_layers: Optional[Iterable[int]]` *(writable property)*

Property to control the output of `self.forward`. If set to a list of integers, the `forward` function will concatenate the output of the corresponding layers. If set to `None`, the output of the last layer is returned. NOTE: The indices are inverted: 0 is the last layer, 1 is the second to last, etc.
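For instance, a hypothetical fingerprint-extraction snippet (with `graph_output_nn` an initialized `GraphOutputNN` and `g` a pyg Batch; both names are placeholders):

```python
# Concatenate the outputs of the last two layers (0 = last, 1 = second to last).
graph_output_nn.concat_last_layers = [0, 1]
h_concat = graph_output_nn(g)              # concatenated features of both layers
graph_output_nn.concat_last_layers = None  # restore the default (last layer only)
```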
##### `out_dim: int` *(property)*

Returns the output dimension of the network.
##### `__init__(in_dim, in_dim_edges, task_level, graph_output_nn_kwargs)`

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `in_dim` | `int` | Input feature dimensions of the layer | *required* |
| `in_dim_edges` | `int` | Input edge-feature dimensions of the layer | *required* |
| `task_level` | `str` | The task level: "graph", "node", "edge", or "nodepair". | *required* |
| `graph_output_nn_kwargs` | `Dict[str, Any]` | Key-word arguments to use for the initialization of the post-processing MLP network after the GNN, using the class `FeedForwardNN`. | *required* |
##### `compute_nodepairs(node_feats, batch, max_num_nodes=None, fill_value=float('nan'), batch_size=None, drop_nodes_last_graph=False)`

Vectorized implementation of the nodepair-level task.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `node_feats` | `torch.Tensor` | Node features | *required* |
| `batch` | `torch.Tensor` | Batch vector | *required* |
| `max_num_nodes` | `int` | The maximum number of nodes per graph | `None` |
| `fill_value` | `float` | The value for invalid entries in the resulting dense output tensor. | `float('nan')` |
| `batch_size` | `int` | The batch size. | `None` |
| `drop_nodes_last_graph` | `bool` | Whether to drop the nodes of the last graph that exceed `max_num_nodes`. | `False` |

Returns:

| Name | Type | Description |
|---|---|---|
| `result` | `torch.Tensor` | Concatenated node features of shape `[B, max_num_nodes, 2 * h]`, where `B` is the number of graphs, `max_num_nodes` is the chosen maximum number of nodes, and `h` is the feature dimension. |
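A hypothetical call sketch (with `graph_output_nn` an initialized `GraphOutputNN`; shapes follow the Returns table above):

```python
import torch

node_feats = torch.randn(5, 8)          # 5 nodes, feature dim h = 8
batch = torch.tensor([0, 0, 0, 1, 1])   # graph assignment: 3 + 2 nodes
result = graph_output_nn.compute_nodepairs(node_feats, batch, max_num_nodes=3)
print(result.shape)                     # [2, 3, 16] -> B=2, max_num_nodes=3, 2*h=16
```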
##### `drop_graph_output_nn_layers(num_layers_to_drop)`

Remove the last layers of the model. Useful for transfer learning.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `num_layers_to_drop` | `int` | The number of layers to drop from the end of the model. | *required* |
##### `extend_graph_output_nn_layers(layers)`

Add layers at the end of the model. Useful for transfer learning.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `layers` | `nn.ModuleList` | A `ModuleList` of all the layers to append to the model. | *required* |
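A hypothetical transfer-learning sketch combining the two methods above (the dimensions and the use of plain `nn.Linear` layers are assumptions; any `nn.Module`s with matching dimensions should fit the documented signature):

```python
import torch.nn as nn

hidden_dim, new_out_dim = 128, 4   # assumed dimensions for illustration

# Replace the last layer of the post-processing MLP with a fresh head.
graph_output_nn.drop_graph_output_nn_layers(num_layers_to_drop=1)
graph_output_nn.extend_graph_output_nn_layers(
    nn.ModuleList([nn.Linear(hidden_dim, new_out_dim)])
)
```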
##### `forward(g)`

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `g` | `Batch` | pyg Batch graph | *required* |

Returns:

| Name | Type | Description |
|---|---|---|
| `h` | `torch.Tensor` | Output features after applying `graph_output_nn` |
##### `make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)`

Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layer widths divided by a given factor (2 by default).

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `divide_factor` | `float` | Factor by which to divide the width. | `2.0` |
| `factor_in_dim` | `bool` | Whether to also factor the input dimension. | `False` |

Returns:

| Type | Description |
|---|---|
| `Dict[str, Any]` | Dictionary with the kwargs to create the base model. |
##### `set_max_num_nodes_edges_per_graph(max_nodes, max_edges)`

Set the maximum number of nodes and edges for all GNN layers and encoder layers.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `max_nodes` | `Optional[int]` | Maximum number of nodes in the dataset. Useful for certain architectures, ignored by others. | *required* |
| `max_edges` | `Optional[int]` | Maximum number of edges in the dataset. Useful for certain architectures, ignored by others. | *required* |
#### TaskHeads

Bases: `nn.Module`, `MupMixin`

##### `out_dim: Dict[str, int]` *(property)*

Returns the output dimension of each task head.

##### `__init__(in_dim, in_dim_edges, task_heads_kwargs, graph_output_nn_kwargs, last_layer_is_readout=True)`

Class that groups all multi-task output heads together to provide the task-specific outputs.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `in_dim` | `int` | Input feature dimensions of the layer | *required* |
| `in_dim_edges` | `int` | Input edge-feature dimensions of the layer | *required* |
| `task_heads_kwargs` | `Dict[str, Any]` | A collection of dictionaries corresponding to the arguments of a `FeedForwardNN`. Each entry is used to initialize a task-specific MLP. | *required* |
| `graph_output_nn_kwargs` | `Dict[str, Any]` | Key-word arguments to use for the initialization of the post-processing MLP network after the GNN, using the class `FeedForwardNN`. | *required* |
| `last_layer_is_readout` | `bool` | Whether the last layer should be treated as a readout layer, allowing the use of `mup.MuReadout` from the muTransfer method. | `True` |
##### `__repr__()`

Returns a string representation of the task heads.
##### `forward(g)`

Forward function of the task heads.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `g` | `Batch` | pyg Batch graph | *required* |

Returns:

| Name | Type | Description |
|---|---|---|
| `task_head_outputs` | `Dict[str, torch.Tensor]` | A dictionary mapping each task name to its output tensor. |
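Consuming the per-task outputs might look like this sketch (`task_heads` being an initialized `TaskHeads` and `g` a pyg Batch; both names are placeholders):

```python
preds = task_heads(g)   # Dict[str, torch.Tensor], one entry per task
for task_name, pred in preds.items():
    print(task_name, tuple(pred.shape))
```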
##### `make_mup_base_kwargs(divide_factor=2.0, factor_in_dim=False)`

Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layer widths divided by a given factor (2 by default).

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `divide_factor` | `float` | Factor by which to divide the width. | `2.0` |
| `factor_in_dim` | `bool` | Whether to also factor the input dimension. | `False` |

Returns:

| Name | Type | Description |
|---|---|---|
| `kwargs` | `Dict[str, Any]` | Dictionary of arguments to be used to initialize the base model. |
##### `set_max_num_nodes_edges_per_graph(max_nodes, max_edges)`

Set the maximum number of nodes and edges for all GNN layers and encoder layers.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `max_nodes` | `Optional[int]` | Maximum number of nodes in the dataset. Useful for certain architectures, ignored by others. | *required* |
| `max_edges` | `Optional[int]` | Maximum number of edges in the dataset. Useful for certain architectures, ignored by others. | *required* |
## PyG Architectures

### graphium.nn.architectures.pyg_architectures

#### FeedForwardPyg

Bases: `FeedForwardGraph`
## Encoder Manager

### graphium.nn.architectures.encoder_manager

#### EncoderManager

Bases: `nn.Module`

##### `in_dims: Iterable[int]` *(property)*

Returns the input dimensions for all PE encoders.

Returns:

| Name | Type | Description |
|---|---|---|
| `in_dims` | `Iterable[int]` | The input dimensions for all PE encoders |

##### `input_keys: Iterable[str]` *(property)*

Returns the input keys for all PE encoders.

Returns:

| Name | Type | Description |
|---|---|---|
| `input_keys` | `Iterable[str]` | The input keys for all PE encoders |

##### `out_dim: int` *(property)*

Returns the output dimension of the pooled embedding from all the PE encoders.

Returns:

| Name | Type | Description |
|---|---|---|
| `out_dim` | `int` | The output dimension of the pooled embedding from all the PE encoders |
##### `__init__(pe_encoders_kwargs=None, max_num_nodes_per_graph=None, name='encoder_manager')`

Class that runs multiple encoders in parallel and concatenates / pools their outputs.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `pe_encoders_kwargs` | `Optional[Dict[str, Any]]` | Key-word arguments to use for the initialization of all positional-encoding encoders. The available encoder classes are listed in `PE_ENCODERS_DICT`: "la_encoder" (tested), "mlp_encoder" (not tested), "signnet_encoder" (not tested). | `None` |
| `name` | `str` | Name attributed to the current network, for display and printing purposes. | `'encoder_manager'` |
##### `forward(g)`

Forward pass of the PE encoders and pooling.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `g` | `Batch` | pyg Batch on which the convolution is done. Must contain the positional-encoding inputs expected by the encoders. | *required* |

Returns:

| Name | Type | Description |
|---|---|---|
| `g` | `Batch` | pyg Batch with the positional encodings added to the graph |
##### `forward_positional_encoding(g)`

Forward pass for the positional encodings (PE), with each PE having its own encoder defined in `self.pe_encoders`. All the positional encodings with the same keys are pooled together using `self.pe_pooling`.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `g` | `Batch` | pyg Batch containing the node positional encodings | *required* |

Returns:

| Name | Type | Description |
|---|---|---|
| `pe_node_pooled` | `Dict[str, Tensor]` | The positional / structural encodings after going through their encoders, pooled together according to their keys. |
##### `forward_simple_pooling(h, pooling, dim)`

Apply sum, mean, or max pooling on a Tensor.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `h` | `Tensor` | The Tensor to pool | *required* |
| `pooling` | `str` | String specifying the pooling method ("sum", "mean", or "max") | *required* |
| `dim` | `int` | The dimension to pool over | *required* |

Returns:

| Name | Type | Description |
|---|---|---|
| `pooled` | `Tensor` | The pooled Tensor |
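The three pooling modes reduce one dimension of the tensor; in plain PyTorch the equivalent operations are:

```python
import torch

h = torch.randn(4, 3, 16)     # e.g. 4 nodes, 3 encodings per node, 16 features
summed = h.sum(dim=1)         # "sum" pooling over dim=1 -> shape [4, 16]
mean = h.mean(dim=1)          # "mean" pooling           -> shape [4, 16]
maxed = h.max(dim=1).values   # "max" pooling            -> shape [4, 16]
```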
##### `make_mup_base_kwargs(divide_factor=2.0)`

Create a 'base' model to be used by the mup or muTransfer scaling of the model. The base model is usually identical to the regular model, but with the layer widths divided by a given factor (2 by default).

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `divide_factor` | `float` | Factor by which to divide the width. | `2.0` |

Returns:

| Name | Type | Description |
|---|---|---|
| `pe_kw` | `Dict[str, Any]` | The model kwargs where the dimensions are divided by the factor |