Add FeaturePropagation as a transform #5387

Merged 24 commits on Sep 13, 2022
Changes from all commits
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -5,6 +5,7 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).

## [2.2.0] - 2022-MM-DD
### Added
- Added `FeaturePropagation` transform ([#5387](https://github.com/pyg-team/pytorch_geometric/pull/5387))
- Added `PositionalEncoding` ([#5381](https://github.com/pyg-team/pytorch_geometric/pull/5381))
- Consolidated sampler routines behind `torch_geometric.sampler`, enabling ease of extensibility in the future ([#5312](https://github.com/pyg-team/pytorch_geometric/pull/5312), [#5365](https://github.com/pyg-team/pytorch_geometric/pull/5365), [#5402](https://github.com/pyg-team/pytorch_geometric/pull/5402), [#5404](https://github.com/pyg-team/pytorch_geometric/pull/5404), [#5418](https://github.com/pyg-team/pytorch_geometric/pull/5418))
- Added `pyg-lib` neighbor sampling ([#5384](https://github.com/pyg-team/pytorch_geometric/pull/5384), [#5388](https://github.com/pyg-team/pytorch_geometric/pull/5388))
1 change: 1 addition & 0 deletions README.md
@@ -326,6 +326,7 @@ They follow an extensible design: It is easy to apply these operators and graph
* **[CorrectAndSmooth](https://pytorch-geometric.readthedocs.io/en/latest/modules/nn.html#torch_geometric.nn.models.CorrectAndSmooth)** from Huang *et al.*: [Combining Label Propagation And Simple Models Out-performs Graph Neural Networks](https://arxiv.org/abs/2010.13993) (CoRR 2020) [[**Example**](https://github.com/pyg-team/pytorch_geometric/blob/master/examples/correct_and_smooth.py)]
* **[Gini](https://pytorch-geometric.readthedocs.io/en/latest/modules/nn.html#torch_geometric.nn.functional.gini)** and **[BRO](https://pytorch-geometric.readthedocs.io/en/latest/modules/nn.html#torch_geometric.nn.functional.bro)** regularization from Henderson *et al.*: [Improving Molecular Graph Neural Network Explainability with Orthonormalization and Induced Sparsity](https://arxiv.org/abs/2105.04854) (ICML 2021)
* **[RootedEgoNets](https://pytorch-geometric.readthedocs.io/en/latest/modules/nn.html#torch_geometric.transforms.RootedEgoNets)** and **[RootedRWSubgraph](https://pytorch-geometric.readthedocs.io/en/latest/modules/nn.html#torch_geometric.transforms.RootedRWSubgraph)** from Zhao *et al.*: [From Stars to Subgraphs: Uplifting Any GNN with Local Structure Awareness](https://arxiv.org/abs/2110.03753) (ICLR 2022)
* **[FeaturePropagation](https://pytorch-geometric.readthedocs.io/en/latest/modules/nn.html#torch_geometric.transforms.FeaturePropagation)** from Rossi *et al.*: [On the Unreasonable Effectiveness of Feature Propagation in Learning on Graphs with Missing Node Features](https://arxiv.org/abs/2111.12128) (CoRR 2021)
</details>

**Scalable GNNs:**
30 changes: 30 additions & 0 deletions test/transforms/test_feature_propagation.py
@@ -0,0 +1,30 @@
import torch

from torch_geometric.data import Data
from torch_geometric.transforms import FeaturePropagation, ToSparseTensor


def test_feature_propagation():
    # Node features with two missing (NaN) entries:
    x = torch.randn(6, 4)
    x[0, 1] = float('nan')
    x[2, 3] = float('nan')
    missing_mask = torch.isnan(x)
edge_index = torch.tensor([[0, 1, 0, 4, 1, 4, 2, 3, 3, 5],
[1, 0, 4, 0, 4, 1, 3, 2, 5, 3]])

transform = FeaturePropagation(missing_mask)
assert str(transform) == ('FeaturePropagation(missing_features=8.3%, '
'num_iterations=40)')

data1 = Data(x=x, edge_index=edge_index)
assert torch.isnan(data1.x).sum() == 2
data1 = FeaturePropagation(missing_mask)(data1)
assert torch.isnan(data1.x).sum() == 0
assert data1.x.size() == x.size()

data2 = Data(x=x, edge_index=edge_index)
assert torch.isnan(data2.x).sum() == 2
data2 = ToSparseTensor()(data2)
data2 = transform(data2)
assert torch.isnan(data2.x).sum() == 0
assert torch.allclose(data1.x, data2.x)
1 change: 1 addition & 0 deletions torch_geometric/graphgym/config_store.py
@@ -11,6 +11,7 @@

MAPPING = {
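    # Types that cannot be expressed in a structured config are mapped to `Any`: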
torch.nn.Module: Any,
torch.Tensor: Any,
}


2 changes: 2 additions & 0 deletions torch_geometric/transforms/__init__.py
@@ -50,6 +50,7 @@
from .largest_connected_components import LargestConnectedComponents
from .virtual_node import VirtualNode
from .add_positional_encoding import AddLaplacianEigenvectorPE, AddRandomWalkPE
from .feature_propagation import FeaturePropagation

__all__ = [
'BaseTransform',
@@ -106,6 +107,7 @@
'VirtualNode',
'AddLaplacianEigenvectorPE',
'AddRandomWalkPE',
'FeaturePropagation',
]

classes = __all__
81 changes: 81 additions & 0 deletions torch_geometric/transforms/feature_propagation.py
@@ -0,0 +1,81 @@
from torch import Tensor
from torch_sparse import SparseTensor

from torch_geometric.data import Data
from torch_geometric.data.datapipes import functional_transform
from torch_geometric.nn.conv.gcn_conv import gcn_norm
from torch_geometric.transforms import BaseTransform


@functional_transform('feature_propagation')
class FeaturePropagation(BaseTransform):
r"""The feature propagation operator from the `"On the Unreasonable
    Effectiveness of Feature Propagation in Learning on Graphs with Missing
Node Features" <https://arxiv.org/abs/2111.12128>`_ paper
(functional name: :obj:`feature_propagation`)

.. math::
\mathbf{X}^{(0)} &= (1 - \mathbf{M}) \cdot \mathbf{X}

\mathbf{X}^{(\ell + 1)} &= \mathbf{X}^{(0)} + \mathbf{M} \cdot
(\mathbf{D}^{-1/2} \mathbf{A} \mathbf{D}^{-1/2} \mathbf{X}^{(\ell)})

    where missing node features are inferred from known features via propagation.

Example:

    .. code-block:: python

        import torch
        from torch_geometric.transforms import FeaturePropagation

transform = FeaturePropagation(missing_mask=torch.isnan(data.x))
data = transform(data)

Args:
missing_mask (torch.Tensor): Mask matrix
:math:`\mathbf{M} \in {\{ 0, 1 \}}^{N\times F}` indicating missing
node features.
num_iterations (int, optional): The number of propagations.
(default: :obj:`40`)
"""
def __init__(self, missing_mask: Tensor, num_iterations: int = 40):
self.missing_mask = missing_mask
self.num_iterations = num_iterations

def __call__(self, data: Data) -> Data:
assert 'edge_index' in data or 'adj_t' in data
assert data.x.size() == self.missing_mask.size()

missing_mask = self.missing_mask.to(data.x.device)
known_mask = ~missing_mask

        if 'edge_index' in data:
            # Prefer explicit edge weights; otherwise fall back to edge_attr:
            edge_weight = data.edge_attr
            if 'edge_weight' in data:
                edge_weight = data.edge_weight
            edge_index = data.edge_index
            # Build the transposed sparse adjacency matrix from edge_index:
            adj_t = SparseTensor(row=edge_index[1], col=edge_index[0],
                                 value=edge_weight,
                                 sparse_sizes=data.size()[::-1],
                                 is_sorted=False, trust_data=True)
        else:
            adj_t = data.adj_t

        # Symmetrically normalize: D^{-1/2} * A * D^{-1/2}, without self-loops:
        adj_t = gcn_norm(adj_t, add_self_loops=False)

        x = data.x.clone()
        x[missing_mask] = 0.  # Start with zeros at the missing entries.

        out = x
        for _ in range(self.num_iterations):
            out = adj_t @ out  # Diffuse features to neighboring nodes.
            out[known_mask] = x[known_mask]  # Reset known features.
        data.x = out

return data

def __repr__(self) -> str:
na_values = int(self.missing_mask.sum()) / self.missing_mask.numel()
return (f'{self.__class__.__name__}('
f'missing_features={100 * na_values:.1f}%, '
f'num_iterations={self.num_iterations})')
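
To make the update rule in the docstring concrete, here is a minimal, self-contained sketch of the same iteration in plain PyTorch with dense tensors. It is an illustration written for this description, not code from the PR: the toy graph, the feature values, and the `A_hat` name are made up for the example.

import torch

# Toy graph: 4 nodes on a path (0-1, 1-2, 2-3), edges stored in both directions.
edge_index = torch.tensor([[0, 1, 1, 2, 2, 3],
                           [1, 0, 2, 1, 3, 2]])
x = torch.tensor([[1.0], [float('nan')], [float('nan')], [3.0]])
missing_mask = torch.isnan(x)
known_mask = ~missing_mask

# Dense adjacency and symmetric normalization D^{-1/2} A D^{-1/2}:
A = torch.zeros(4, 4)
A[edge_index[1], edge_index[0]] = 1.0
deg_inv_sqrt = A.sum(dim=1).pow(-0.5)
A_hat = deg_inv_sqrt.view(-1, 1) * A * deg_inv_sqrt.view(1, -1)

# X^(0): zero out the missing entries, then repeatedly propagate and
# reset the known entries, which matches X^(l+1) = X^(0) + M * (A_hat @ X^(l)):
out = x.clone()
out[missing_mask] = 0.0
for _ in range(40):
    out = A_hat @ out
    out[known_mask] = x[known_mask]

print(out)  # The two NaN entries are now filled from their known neighbors.

Resetting the known rows after every step is what implements the `X^(0) + M * (...)` term, since `X^(0)` is zero exactly at the missing entries; the transform does the same thing with a normalized `SparseTensor` instead of a dense matrix.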