I pulled out one of the attention matrices and inspected it, and I found that all the edges incident to a given node have almost the same attention value, roughly 1/n, where n is the number of edges at that node. This suggests that GAT has not learned edge-specific importance and is behaving much like GCN.
The figure below shows the attention values for the edges adjacent to some nodes in the Cora dataset (the same holds for Citeseer).
Why does this happen?
The effect here usually occurs on homophilous datasets (of which Cora and Citeseer are definite cases). By definition, in such datasets, the edges often merely indicate that classes should be shared, so most of the performance can be recovered by something that resembles simple averaging. It is therefore in GATs' interest to learn a distribution that is close to uniform.
This effect does not happen on PPI.
Check out this blog post from the Deep Graph Library team, which explores this effect in detail:
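If you want to reproduce the check yourself, the attention coefficients can be pulled out and compared against the uniform 1/deg baseline along these lines. This is a minimal sketch using PyTorch Geometric's GATConv rather than this repository's implementation, and the untrained layer and variable names are illustrative assumptions, not something from this thread:

```python
import torch
from torch_geometric.datasets import Planetoid
from torch_geometric.nn import GATConv
from torch_geometric.utils import degree

dataset = Planetoid(root='data/Cora', name='Cora')
data = dataset[0]

# Single GAT layer with 8 heads; in practice you would load trained weights here.
conv = GATConv(dataset.num_features, 8, heads=8)

# return_attention_weights=True returns (edge_index, alpha),
# where alpha has shape [num_edges, num_heads] and edge_index
# includes the self-loops added by GATConv.
out, (edge_index, alpha) = conv(data.x, data.edge_index,
                                return_attention_weights=True)

# Uniform baseline: attention is softmax-normalized over each target node's
# incoming edges, so the uniform distribution assigns 1 / in-degree per edge.
deg = degree(edge_index[1], num_nodes=data.num_nodes)
uniform = 1.0 / deg[edge_index[1]]

# Average absolute gap between the learned (head-averaged) attention
# and the uniform distribution; a value near zero matches the observation above.
gap = (alpha.mean(dim=1) - uniform).abs().mean()
print(f'mean |alpha - 1/deg| = {gap:.4f}')
```

On homophilous graphs such as Cora you would expect this gap to stay small even after training, whereas on PPI it should be noticeably larger.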