Networks: Fundamentals

Nodes and Edges

A graph is a mathematical structure used to model pairwise relations between objects. It consists of nodes (also called vertices) and edges (also called links) that connect pairs of nodes (Newman 2018).

On this page, we will explore the basic elements of a graph using the networkx library in Python (Hagberg, Schult, and Swart 2008). We will cover nodes and their attributes, visualization and layouts, and how to create a graph from an edge list.

First, let’s import the necessary library:

import networkx as nx

If you get a ModuleNotFoundError for networkx, you may need to install it first.

If you are working on Google Colab, you can run:

!pip install networkx

If you are working in a local Python environment, use conda or run:

pip install networkx

And now we can initialize an empty graph:

G = nx.Graph()

print(G)
Graph with 0 nodes and 0 edges

Our variable G is now an empty undirected graph object. We can add nodes and edges to it, which we will see in the next sections.
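
As a quick check, the sketch below inspects a freshly created graph. It relies on the standard networkx methods number_of_nodes(), number_of_edges(), and is_directed(); the variable name G simply mirrors the text.

```python
import networkx as nx

# Create an empty undirected graph, as in the text
G = nx.Graph()

# Count nodes and edges; both are zero for a new graph
print(G.number_of_nodes())  # 0
print(G.number_of_edges())  # 0

# nx.Graph is undirected; nx.DiGraph is the directed counterpart
print(G.is_directed())  # False
```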

Nodes (Vertices)

Nodes represent the entities in a graph. They can be anything: people in a social network, airports in a flight network, or pages on the web.

In order to add nodes to our graph, we can use the add_node(<id>) method. The <id> can be any hashable Python object. We can see the list of nodes in the graph using the nodes() method.

# Add three nodes to the graph
G.add_node("Spain")
G.add_node("Portugal")
G.add_node("France")

# Show the nodes in the graph
print(G.nodes())
['Spain', 'Portugal', 'France']

In Python, a hashable object is an object that has a hash value that remains constant during its lifetime. This means that the object can be used as a key in a dictionary or as an element in a set. Examples of hashable objects include integers, strings, and tuples (as long as they contain only hashable types). Lists and dictionaries are not hashable because they are mutable (their contents can change).
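
To make the distinction concrete, here is a minimal sketch that adds a string, an integer, and a tuple as node IDs, and then uses Python's built-in hash() to show that a list is rejected. The node values are arbitrary illustrations, not part of the running example.

```python
import networkx as nx

G = nx.Graph()

# Strings, integers, and tuples of hashable items all work as node IDs
G.add_node("Spain")
G.add_node(42)
G.add_node(("Madrid", "Spain"))
print(G.number_of_nodes())  # 3

# A list is mutable, so Python refuses to hash it
try:
    hash(["Madrid", "Spain"])
    list_is_hashable = True
except TypeError:
    list_is_hashable = False
print(list_is_hashable)  # False
```

Because a list cannot be hashed, it cannot serve as a node ID; converting it to a tuple first is the usual workaround.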

Node Attributes

Each node can have attributes that provide additional information about it. For example, in a social network, a node might represent a person and have attributes like name, age, or location.

NetworkX allows us to store attributes on nodes. Think of G.nodes as a dictionary where the keys are the node IDs and the values are dictionaries of attributes. We can add attributes to a node by accessing it through G.nodes[<id>] and assigning values to the attributes.

For example, we can add a “population” attribute to our country nodes:

# Add population attribute to the nodes
G.nodes["Spain"]["population"] = 47_000_000
G.nodes["Portugal"]["population"] = 10_000_000
G.nodes["France"]["population"] = 67_000_000

# Show the nodes with their attributes
population = nx.get_node_attributes(G, 'population')
for node, pop in population.items():
    print(f"{node}: {pop} inhabitants")
Spain: 47000000 inhabitants
Portugal: 10000000 inhabitants
France: 67000000 inhabitants

Using get_node_attributes(G, 'population'), we can retrieve the population attribute for all nodes in the graph as a dictionary.
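
The dictionary view of node attributes can be inspected directly. The sketch below rebuilds a two-node version of the example so that it runs on its own, and shows three ways of reading the same data: a single node's attribute dictionary, one attribute across all nodes, and (node, attributes) pairs via G.nodes(data=True).

```python
import networkx as nx

G = nx.Graph()
G.add_node("Spain")
G.add_node("Portugal")
G.nodes["Spain"]["population"] = 47_000_000
G.nodes["Portugal"]["population"] = 10_000_000

# Each node's attributes behave like a plain dictionary
print(G.nodes["Spain"])  # {'population': 47000000}

# One attribute for all nodes, as {node: value}
population = nx.get_node_attributes(G, "population")
print(population)  # {'Spain': 47000000, 'Portugal': 10000000}

# Iterate over (node, attribute_dict) pairs
for node, attrs in G.nodes(data=True):
    print(node, attrs)
```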

If you are including a new node and want to add attributes at the same time, you can use the add_node() method with keyword arguments. For example:

# Add a new node with attributes
G.add_node("Italy", population=60_000_000)

# Show the nodes with their attributes
population = nx.get_node_attributes(G, 'population')
for node, pop in population.items():
    print(f"{node}: {pop} inhabitants")
Spain: 47000000 inhabitants
Portugal: 10000000 inhabitants
France: 67000000 inhabitants
Italy: 60000000 inhabitants
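
When several nodes come with attributes, one option is add_nodes_from(), which accepts (node, attribute_dict) pairs. The following is a minimal sketch that rebuilds the four countries in a single call, reusing the rounded population figures from the text.

```python
import networkx as nx

G = nx.Graph()

# Each entry is (node_id, attribute_dict); values are the rounded figures used above
G.add_nodes_from([
    ("Spain", {"population": 47_000_000}),
    ("Portugal", {"population": 10_000_000}),
    ("France", {"population": 67_000_000}),
    ("Italy", {"population": 60_000_000}),
])

population = nx.get_node_attributes(G, "population")
print(population["Italy"])  # 60000000
```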

Visualization

Printing the graph object gives us a summary of its structure, but it doesn’t show us the actual connections. To visualize the graph, we can use the draw() function from networkx, which uses Matplotlib to display the graph.

import matplotlib.pyplot as plt

# Draw the graph
nx.draw(
    G,
    with_labels=True,  # show node labels (IDs)
    node_color='lightblue',  # color of the nodes (vertices)
    edge_color='gray',  # color of the edges (links)
    node_size=2000,  # size of the nodes (vertices)
    font_size=12  # size of the labels (IDs)
    )
plt.show()
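
If you want to save the figure instead of showing it in a window (for example, when running a script on a server), one possible approach is to draw onto an explicit Matplotlib Axes and call savefig(). This is a sketch under stated assumptions: the "Agg" backend choice, the figure size, and the output file name countries.png are illustrative, not part of the original example.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: render to files, no window
import matplotlib.pyplot as plt
import networkx as nx

G = nx.Graph()
G.add_nodes_from(["Spain", "Portugal", "France", "Italy"])

# Draw onto an explicit Axes so the figure can be saved or combined with other plots
fig, ax = plt.subplots(figsize=(5, 5))
nx.draw(G, ax=ax, with_labels=True, node_color="lightblue", node_size=2000, font_size=12)

# Hypothetical output path inside a temporary directory
out_path = os.path.join(tempfile.mkdtemp(), "countries.png")
fig.savefig(out_path, dpi=150)
plt.close(fig)
print(os.path.exists(out_path))  # True
```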

Layouts

The draw() function has a pos parameter that allows us to specify the layout of the graph. A layout is a way to position the nodes in the graph for visualization. networkx provides several built-in layouts, such as spring_layout, circular_layout, and shell_layout.

Circular layout arranges the nodes in a circle. You can control the distance between the nodes using the scale parameter: higher values place the nodes farther apart.

# Use the circular layout for visualization
pos = nx.circular_layout(G, scale=2)  # scale controls the distance between the nodes
nx.draw(
    G,
    pos=pos,  # specify the layout
    with_labels=True,
    node_color='lightblue',
    edge_color='gray',
    node_size=2000,
    font_size=12
    )
plt.show()
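
It can help to look at what a layout function actually returns: a dictionary that maps each node to an array of coordinates. The sketch below computes a circular layout for four nodes and measures each node's distance from the center; with the default center at the origin, these distances should all be close to the value of scale.

```python
import numpy as np
import networkx as nx

G = nx.Graph()
G.add_nodes_from(["Spain", "Portugal", "France", "Italy"])

# A layout is a dictionary mapping each node to an (x, y) coordinate array
pos = nx.circular_layout(G, scale=2)

# Distance of each node from the center (the origin by default)
radii = {node: float(np.linalg.norm(xy)) for node, xy in pos.items()}
for node, r in radii.items():
    print(f"{node}: {r:.2f}")
```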

Spring layout uses a force-directed algorithm (Fruchterman-Reingold) that treats edges as springs pulling connected nodes together while all nodes repel each other, which tends to spread the nodes evenly. You can control the spacing with the k parameter, which sets the optimal distance between nodes: higher values move the nodes farther apart. You can also set the maximum number of iterations of the algorithm using the iterations parameter.

# Use the spring layout for visualization
pos = nx.spring_layout(G, k=0.5, iterations=20)
nx.draw(
    G,
    pos=pos,  # specify the layout
    with_labels=True,
    node_color='lightblue',
    edge_color='gray',
    node_size=2000,
    font_size=12
    )
plt.show()
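
Spring layout starts from random positions, so two runs usually produce different drawings. If you need a reproducible figure, spring_layout() accepts a seed parameter. The sketch below uses a hypothetical four-node toy graph (not the country graph) and checks that the same seed yields the same positions.

```python
import numpy as np
import networkx as nx

# Hypothetical toy graph, used only to illustrate the seed parameter
G = nx.Graph()
G.add_edges_from([("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")])

# The same seed gives the same starting positions, hence the same layout
pos_a = nx.spring_layout(G, k=0.5, iterations=20, seed=42)
pos_b = nx.spring_layout(G, k=0.5, iterations=20, seed=42)

same = all(np.allclose(pos_a[node], pos_b[node]) for node in G.nodes())
print(same)  # True
```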

Shell layout arranges the nodes in concentric circles. You can specify which nodes belong to which circle using the nlist parameter, which is a list of lists of nodes.

nlist = [["Spain", "Portugal", "France", "Italy"], ["Netherlands", "Germany", "USA", "Canada"]]
pos = nx.shell_layout(G, nlist=nlist)
nx.draw(
    G,
    pos=pos,  # specify the layout
    with_labels=True,
    node_color='lightblue',
    edge_color='gray',
    node_size=2000,
    font_size=12
    )
plt.show()
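
To see how nlist maps to concentric circles, the following sketch builds a graph containing all eight countries, computes the shell layout, and compares distances from the center. The variable names inner and outer are illustrative; the first list in nlist becomes the innermost shell.

```python
import numpy as np
import networkx as nx

inner = ["Spain", "Portugal", "France", "Italy"]
outer = ["Netherlands", "Germany", "USA", "Canada"]

G = nx.Graph()
G.add_nodes_from(inner + outer)

pos = nx.shell_layout(G, nlist=[inner, outer])

# Distance from the center for the nodes of each shell
inner_radii = [float(np.linalg.norm(pos[n])) for n in inner]
outer_radii = [float(np.linalg.norm(pos[n])) for n in outer]
print(max(inner_radii) < min(outer_radii))  # True
```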

Visualizing Node Attributes

We can also visualize the attributes of nodes and edges by using different colors or sizes. For example, we can make the size of each node proportional to its population attribute:

# Get the population attribute for each node
population = nx.get_node_attributes(G, 'population')
# Node sizes proportional to population (in millions)
node_sizes = [population[node] / 1_000_000 for node in G.nodes()]  # scale down for visualization

pos = nx.circular_layout(G)

nx.draw(
    G,
    pos=pos,
    with_labels=True,
    node_color='lightblue',
    edge_color='gray',
    node_size=node_sizes,  # size of the nodes (vertices) proportional to population
    font_size=12,
    )
plt.show()

If a node has no population attribute, the population dictionary will have no entry for it, and looking it up with population[node] will raise a KeyError. To avoid this, we can use the get() method of the dictionary, which lets us specify a default value to return when the key is not found. For example:

# Add an edge; add_edge creates any endpoint that is not in the graph yet,
# and those new nodes have no population attribute
G.add_edge("Denmark", "Germany", distance=400)
# Get the population attribute for the nodes that have one
population = nx.get_node_attributes(G, 'population')
# Node sizes proportional to population, using 0 as default if not found
node_sizes = [population.get(node, 0) / 1_000_000 for node in G.nodes()]  # scale down for visualization

pos = nx.circular_layout(G)

nx.draw(
    G,
    pos=pos,
    with_labels=True,
    node_color='lightblue',
    edge_color='gray',
    node_size=node_sizes,  # size of the nodes (vertices) proportional to population
    font_size=12,
    )
plt.show()
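
With a default of 0, a node without the attribute gets size 0, so its marker is effectively invisible. One possible refinement, sketched below, is to enforce a minimum size. The scaling factor (30 points per million inhabitants) and the floor MIN_SIZE are hypothetical choices for illustration, not values from the original example.

```python
import networkx as nx

G = nx.Graph()
G.add_node("Spain", population=47_000_000)
G.add_node("Denmark")  # no population attribute

population = nx.get_node_attributes(G, "population")
print("Denmark" in population)  # False: nodes without the attribute are left out

# Hypothetical scaling: 30 points per million inhabitants, with a floor of 300
MIN_SIZE = 300
node_sizes = [
    max(population.get(node, 0) / 1_000_000 * 30, MIN_SIZE)
    for node in G.nodes()
]
print(node_sizes)  # [1410.0, 300]
```

The resulting list can be passed to nx.draw() through node_size, exactly as in the examples above.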

Exercise: Add a new attribute to the nodes, called “visited”, which is a boolean that indicates whether you have visited that country or not. Then, visualize the graph by coloring the nodes differently based on whether you have visited them or not: use blue for visited countries and red for unvisited countries.

# Add the "visited" attribute to the nodes.
# set_node_attributes skips IDs that are not in the graph, whereas
# G.nodes[<id>]["visited"] = ... raises a KeyError for a missing node.
visited_flags = {
    "Spain": True,
    "Portugal": True,
    "France": True,
    "Italy": True,
    "USA": False,
    "Canada": True,
}
nx.set_node_attributes(G, visited_flags, name="visited")

# Get the "visited" attribute for each node
visited = nx.get_node_attributes(G, 'visited')
# Define node colors based on the "visited" attribute
node_colors = ['blue' if visited.get(node, False) else 'red' for node in G.nodes()]

pos = nx.circular_layout(G)

# Draw the graph with node colors based on the "visited" attribute
nx.draw(
    G,
    pos=pos,
    with_labels=True,
    node_color=node_colors,  # color of the nodes based on "visited" attribute
    edge_color='gray',
    node_size=2000,
    font_size=12,
    )
plt.show()
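
The solution above colors nodes without a "visited" attribute red, the same as unvisited ones. If you prefer to tell "not visited" apart from "unknown", a small lookup table is one way to do it. This is a sketch; the palette dictionary and the gray color for missing values are assumptions made for illustration.

```python
import networkx as nx

G = nx.Graph()
G.add_node("Spain", visited=True)
G.add_node("USA", visited=False)
G.add_node("Denmark")  # no "visited" attribute

visited = nx.get_node_attributes(G, "visited")

# Hypothetical palette: None stands for "attribute missing"
palette = {True: "blue", False: "red", None: "gray"}
node_colors = [palette[visited.get(node)] for node in G.nodes()]
print(node_colors)  # ['blue', 'red', 'gray']
```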

Visualizing Edge Attributes

We can also visualize edge attributes by showing them as labels on the edges. For example, we can show the distance attribute on the edges:

# Get the distance attribute for each edge
distance = nx.get_edge_attributes(G, 'distance')
# Draw the graph
pos = nx.circular_layout(G)
nx.draw(
    G,
    pos=pos,
    with_labels=True,
    node_color='lightblue',
    edge_color='gray',
    node_size=2000,
    font_size=12,
    )
# Draw edge labels for the distance attribute
nx.draw_networkx_edge_labels(G, pos, edge_labels=distance)
plt.show()
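
The edge_labels argument expects a dictionary keyed by (node, node) tuples, which is exactly what get_edge_attributes() returns. The sketch below prints that dictionary and builds formatted labels from it. The Germany-France value is illustrative rather than real data, and the "km" unit is an assumption, since the text does not state one.

```python
import networkx as nx

G = nx.Graph()
G.add_edge("Denmark", "Germany", distance=400)  # value used earlier in the text
G.add_edge("Germany", "France", distance=800)   # illustrative value, not real data

# Edge attributes come back as {(u, v): value}
distance = nx.get_edge_attributes(G, "distance")
print(distance)  # {('Denmark', 'Germany'): 400, ('Germany', 'France'): 800}

# Formatted labels; the unit is an assumption
edge_labels = {edge: f"{d} km" for edge, d in distance.items()}
print(edge_labels[("Denmark", "Germany")])  # 400 km
```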

Creating a Graph from an Edge List

In practice, we often have data in the form of an edge list, which is a list of pairs of nodes that are connected by edges. We can create a graph directly from an edge list using the from_edgelist() function. For example:

# Define our edge list (actors that have worked together in movies)
edge_list = [
    ("Antonio Banderas", "Brad Pitt"),  # Interview with the Vampire (1994)
    ("Antonio Banderas", "Javier Bardem"),  # Automata (2014)
    ("Antonio Banderas", "Penelope Cruz"),  # Dolor y Gloria (2019)
    ("Antonio Banderas", "Tom Holland"),  # Uncharted (2022)
    ("Brad Pitt", "Javier Bardem"),  # F1 (2025)
    ("Javier Bardem", "Timothée Chalamet"),  # Dune (2021)
    ("Timothée Chalamet", "Zendaya"),  # Dune (2021)
    ("Tom Holland", "Zendaya"),  # Spider-Man: No Way Home (2021)
]

# Create a graph from the edge list
G_actors = nx.from_edgelist(edge_list)
# Compute the layout
pos = nx.spring_layout(G_actors, k=0.15, iterations=20)
# k is the optimal distance between nodes (a positive float); larger values spread the nodes apart
# iterations is the maximum number of iterations of the force-directed algorithm
# defaults: k=None (which uses 1/sqrt(n), where n is the number of nodes) and iterations=50

nx.draw(
    G_actors,
    pos=pos,
    with_labels=True,
    node_color='lightgreen',
    edge_color='gray',
    node_size=2000,
    font_size=12
    )
plt.show()

Exercise: In the code above, I included the movies in the comments next to the edges. Can you create a graph where the edges are labeled with the movie titles?
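
One possible starting point for this exercise, sketched with a subset of the edge list: add_edges_from() accepts (node, node, attribute_dict) triples, so the movie title can be stored as an edge attribute. The attribute name "movie" and the variable names are illustrative choices.

```python
import networkx as nx

# (actor, actor, attributes) triples; a subset of the edge list above
edges_with_movies = [
    ("Antonio Banderas", "Brad Pitt", {"movie": "Interview with the Vampire (1994)"}),
    ("Antonio Banderas", "Tom Holland", {"movie": "Uncharted (2022)"}),
    ("Tom Holland", "Zendaya", {"movie": "Spider-Man: No Way Home (2021)"}),
]

G_movies = nx.Graph()
G_movies.add_edges_from(edges_with_movies)

# {(actor, actor): title}, ready to be used as edge labels
movie_labels = nx.get_edge_attributes(G_movies, "movie")
for (actor_a, actor_b), title in movie_labels.items():
    print(f"{actor_a} - {actor_b}: {title}")
```

Passing movie_labels as the edge_labels argument of nx.draw_networkx_edge_labels(), together with a layout computed for G_movies, would then draw the titles on the edges.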

What’s Next?

On the next page, Network Connectivity, we will learn how to analyze the structure of a graph by looking at its connectivity. We will learn about degree, path lengths, and connected components.

References

Hagberg, Aric A., Daniel A. Schult, and Pieter J. Swart. 2008. “Exploring Network Structure, Dynamics, and Function Using NetworkX.” In Proceedings of the 7th Python in Science Conference (SciPy 2008), 11–15.
Newman, Mark. 2018. Networks. 2nd ed. Oxford University Press.