# An Introduction to Graph Neural Networks (Basics and DeepWalk)

Graph Neural Networks (GNNs) have been gaining popularity in a variety of domains, including social networks, knowledge graphs, recommender systems, and even life sciences. The ability of GNNs to model the dependencies between nodes in a graph has enabled breakthroughs in graph analysis research. This article introduces the basics of Graph Neural Networks and two more advanced algorithms, DeepWalk and GraphSage.

### Graph:

Before we dive into GNNs, let’s first define what a *graph* is. In computer science, a graph is a data structure consisting of two components: vertices and edges. A graph G is described by its set of vertices V and its set of edges E.

Edges can be either directed or undirected, depending on whether there are directional dependencies between vertices.
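As a minimal sketch, a graph G = (V, E) can be stored as an adjacency list, with the directed/undirected distinction handled at construction time (the function name `build_graph` is ours, not from any library):

```python
from collections import defaultdict

def build_graph(edges, directed=False):
    """Build an adjacency-list representation of a graph G = (V, E)."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        if not directed:
            adj[v].add(u)  # an undirected edge goes both ways
    return adj

# A small undirected cycle with 4 vertices and 4 edges
g = build_graph([(0, 1), (1, 2), (2, 3), (3, 0)])
print(sorted(g[0]))  # neighbors of vertex 0 -> [1, 3]
```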

# Graph Neural Network

Graph Neural Networks are a type of neural network that operates directly on the graph structure. A typical application of GNNs is node classification: we want to predict the labels of nodes for which no ground truth is available.

In the node classification problem setup, each node v is characterized by its feature x_v and associated with a ground-truth label t_v. Given a partially labeled graph G, the goal is to leverage the labeled nodes to predict the labels of the unlabeled ones. The network learns to represent each node with a d-dimensional vector (state) h_v that contains information about its neighborhood.

h_v = f(x_v, x_co[v], h_ne[v], x_ne[v])

(equation from https://arxiv.org/pdf/1812.08434)

where *x_co[v]* denotes the features of the edges connecting to *v*, *h_ne[v]* denotes the embeddings of the neighboring nodes of *v*, and *x_ne[v]* denotes the features of the neighboring nodes of *v*. The function *f* is the transition function that projects these inputs into a d-dimensional space. Since we are looking for a unique solution for *h_v*, we can apply the Banach fixed point theorem and rewrite the equation as an iterative update. This operation is often referred to as message passing or neighborhood aggregation.
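The iterative update can be sketched as follows. This is a toy stand-in, not a trained GNN: the transition function here is a fixed random projection over a node's own features concatenated with the mean of its neighbors' states (real GNNs learn these weights), but it shows the synchronous message-passing loop:

```python
import numpy as np

def message_passing(features, adj, d=4, iterations=10, seed=0):
    """Iteratively update node states h_v from node features and
    neighbor states, mimicking the transition function f.

    Toy stand-in: each new state is a fixed random projection of
    [own features, mean of neighbor states]; real GNNs learn W.
    """
    rng = np.random.default_rng(seed)
    n, in_dim = features.shape
    W = rng.normal(scale=0.1, size=(in_dim + d, d))  # fixed "transition" weights
    h = np.zeros((n, d))                             # initial states h_v
    for _ in range(iterations):
        h_new = np.zeros_like(h)
        for v in range(n):
            nbrs = adj[v]
            h_ne = h[nbrs].mean(axis=0) if nbrs else np.zeros(d)
            msg = np.concatenate([features[v], h_ne])
            h_new[v] = np.tanh(msg @ W)
        h = h_new  # synchronous update of all node states
    return h

# 3-node path graph: 0 - 1 - 2, with one-hot node features
adj = {0: [1], 1: [0, 2], 2: [1]}
h = message_passing(np.eye(3), adj)
print(h.shape)  # (3, 4): one d-dimensional state per node
```

With small weights and a tanh nonlinearity the update behaves like a contraction, so the states settle toward a fixed point, which is the intuition behind the fixed point formulation above.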

The output of the GNN is computed by passing both the state h_v and the feature x_v to an output function g.

o_v = g(h_v, x_v)

(equation from https://arxiv.org/pdf/1812.08434)

However, this original proposal of GNN has three main limitations, as pointed out in [this paper](https://arxiv.org/pdf/1812.08434):

- If the assumption of a fixed point is relaxed, a multi-layer perceptron can be used to learn a more stable representation, removing the iterative update process. This is because, in the original proposal, the same parameters of the transition function f are used in every iteration, whereas different parameters in different MLP layers allow for hierarchical feature extraction.
- It cannot process edge information (e.g., different edges in a knowledge graph may indicate different relationships between nodes).
- The fixed point can discourage the diversification of node distributions and may therefore not be suitable for learning node representations.

To address the above problems, several variants of GNN have been proposed. However, they are not covered here, as they are not the focus of this article.

# DeepWalk

DeepWalk is the first algorithm to propose learning node embeddings in an unsupervised manner. Its training process closely resembles that of word embedding.

Two steps are required to complete the algorithm:

- Perform random walks on the nodes of the graph to generate node sequences
- Run skip-gram to learn the embedding of each node, based on the node sequences generated in step 1
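Step 1 can be sketched in a few lines of plain Python (the function name `random_walks` and the walk parameters are ours; the resulting sequences would then be fed to a skip-gram model, e.g. gensim's Word2Vec, as "sentences" of node ids):

```python
import random

def random_walks(adj, walk_length=8, walks_per_node=4, seed=42):
    """Step 1 of DeepWalk: generate truncated random walks from every node.

    At each step, the next node is chosen uniformly at random among the
    neighbors of the current node; a walk stops early at a dead end.
    """
    rng = random.Random(seed)
    walks = []
    for start in adj:
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                nbrs = adj[walk[-1]]
                if not nbrs:
                    break  # dead end: no outgoing edges
                walk.append(rng.choice(nbrs))
            walks.append(walk)
    return walks

# Tiny undirected graph as an adjacency list
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
walks = random_walks(adj)
print(len(walks), len(walks[0]))  # 16 walks, each of length 8
```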

At each time step of a random walk, the next node is sampled uniformly from the neighbors of the previous node. Computing the softmax over the output is expensive when the graph has many nodes, since we must compute e^{x_k} for every element k in order to normalize; the paper therefore uses hierarchical softmax to address this problem.

Hierarchical softmax addresses the problem with a binary tree. All the leaves of this tree (v1, v2, …, v8 in the figure from the paper) represent the vertices of the graph. At each inner node, a binary classifier decides which path to take. The probability of a given vertex v_k is computed by multiplying the probabilities of each sub-path along the route from the root node to the leaf v_k. Since the probabilities of each node’s children sum to 1, the property that the probabilities of all vertices sum to 1 still holds in hierarchical softmax, and the cost of computing the probability for one vertex is reduced to O(log|V|).
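The path-probability computation can be sketched as follows. The inner-node scores here are arbitrary placeholders (a real model would learn one classifier per inner node); the point is that P(leaf) is a product of per-branch sigmoid probabilities, and the probabilities over all leaves still sum to one:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def leaf_probability(scores, path):
    """Probability of reaching one leaf in a hierarchical-softmax tree.

    `path` lists, for each inner node on the root-to-leaf route, that
    node's id and which branch (0 = left, 1 = right) leads to the leaf.
    P(leaf) is the product of the per-branch probabilities.
    """
    p = 1.0
    for node, branch in path:
        p_right = sigmoid(scores[node])
        p *= p_right if branch == 1 else (1.0 - p_right)
    return p

# A depth-3 tree over 8 leaves (v1..v8); 7 inner nodes with placeholder scores.
# Inner-node indexing: root = 0, depth-1 nodes 1..2, depth-2 nodes 3..6.
scores = {n: 0.3 * n for n in range(7)}
total = 0.0
for leaf in range(8):
    bits = [(leaf >> (2 - d)) & 1 for d in range(3)]  # branch taken at each depth
    path = [(0, bits[0]),
            (1 + bits[0], bits[1]),
            (3 + 2 * bits[0] + bits[1], bits[2])]
    total += leaf_probability(scores, path)
print(round(total, 10))  # 1.0 -- leaf probabilities sum to one
```

Each leaf needs only 3 = log2(8) classifier evaluations instead of a normalization over all 8 vertices, which is where the O(log|V|) cost comes from.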

(figure from https://arxiv.org/pdf/1812.08434)

After training with DeepWalk, the model is able to represent each node well. In the figure, different colors indicate different labels in the input graph. In the output graph (the embedding in 2 dimensions), we can see that nodes with the same labels are clustered together, while nodes with different labels are properly separated.

DeepWalk’s main problem is its lack of generalization ability: whenever a new node is added to the graph, the model must be re-trained in order to represent it (i.e., DeepWalk is **transductive**). Such a GNN is therefore not suitable for dynamic graphs, where the nodes change constantly.
