Chapter2 - The Basics - Graphs and Trees(1)
Book - Python Algorithms by Magnus Lie Hetland
Dict and Set
Hashing
Hashing involves computing seemingly random integer value from an arbitrary object.
This value can be used as an index into an array.
This mechanism is used in dictionaries, which are implemented using so-called hash tables.
Sets are implemented using the same mechanism.
The important thing is that the hash value can be constructed in essentially constant time.
Itβs constant with respect to the hash table size but linear as a function of the size of the object being hashed.
In short, dict and set are having \(O(n)\) time complextity
Adjacency List and the Like
Adjacency Sets
A Straightforward Adjacency Set Representation
a, b, c, d, e, f, g, h = range(8)
N = [
{b, c, d, e, f}, # a
{c, e}, # b
{d}, # c
{e}, # d
{f}, # e
{c, g, h}, # f
{f, h}, # g
{f, g} # h
]
N[v] is a set of vβs neighbors.
>>> b in N[a]Β Β # Neighborhood membership
True
>>> len(N[f]) # Degree
3
Adjacency Lists
Overheadλ₯Ό λ μ€μ΄λ λ°©λ²μ€ νλλ
Adjacency sets λμ adjacency listsλ₯Ό μ¬μ©νλ κ²μ΄λ€.
Overhead
In computer science, overhead is any combination of excess or indirect computation time, memory, bandwidth, or other resources that are required to perform a specific task. It is a special case of engineering overhead. Overhead can be a deciding factor in software design, with regard to structure, error correction, and feature inclusion. Examples of computing overhead may be found in functional programming[citation needed], data transfer, and data structures.
μ€λ²ν€λ(overhead)λ μ΄λ€ μ²λ¦¬λ₯Ό νκΈ° μν΄ λ€μ΄κ°λ κ°μ μ μΈ μ²λ¦¬ μκ° Β· λ©λͺ¨λ¦¬ λ±μ λ§νλ€.
https://en.wikipedia.org/wiki/Overhead_(computing)
https://ko.wikipedia.org/wiki/%EC%98%A4%EB%B2%84%ED%97%A4%EB%93%9C
a, b, c, d, e, f, g, h = range(8)
N = [
[b, c, d, e, f], # a
[c, e], # b
[d], # c
[e], # d
[f], # e
[c, g, h], # f
[f, h], # g
[f, g] # h
]
Adjacency Dicts with Edge Weights
a, b, c, d, e, f, g, h = range(8)
N = [
{b:2, c:1, d:3, e:9, f:4}, # a
{c:4, e:3}, # b
{d:8}, # c
{e:7}, # d
{f:5}, # e
{c:2, g:2, h:2}, # f
{f:1, h:6}, # g
{f:9, g:8} # h
]
>>> b in N[a]Β Β # Neighborhood membership
True
>>> len(N[f]) # Degree
3
>>> N[a][b] # Edge weight for (a, b)
2
A more flexible approach is to use a dict as main structure allowing us to use arbitrary, hashable, node labels.
N = {
'a': set('bcdef'),
'b': set('ce'),
'c': set('d'),
'd': set('e'),
'e': set('f'),
'f': set('cgh'),
'g': set('fh'),
'h': set('fg')
}
If you drop the set in above dict, itβs adjacency strings.
Adjacency Matrices
a, b, c, d, e, f, g, h = range(8) # a b c d e f g h
N = [
[0,1,1,1,1,1,0,0], # a
[0,0,1,0,1,0,0,0], # b
[0,0,0,1,0,0,0,0], # c
[0,0,0,0,1,0,0,0], # d
[0,0,0,0,0,1,0,0], # e
[0,0,1,0,0,0,1,1], # f
[0,0,0,0,0,1,0,1], # g
[0,0,0,0,0,1,1,0]
] # h
Instead of checking whether b is in N[a], you would check whether the matric cell N[a][b] is βTrueβ.
>>> N[a][b] # Neighborhood membership
1
>>> sum(N[f]) # Degree
3
Advantages of Adjacency Matrices:
- Not allowing self-loops, the diagonal is all False.
- For an undirected graph, the adjacency matrix will be symmetric.
Expending adjacency matrices to store weights, instead of βTrueβ, simply store weights. For an edge (u,v), let N[u][v] be the weight w(u,v), instead of βTrueβ. For practical reasons, we let nonexistent edges get an infinite weight, as βinf = float(βinfβ)β
# A Weight Matrix with Infinite Weight for Missing Edges
a, b, c, d, e, f, g, h = range(8)
inf = float('inf') # a b c d e f g h
W = [
[ 0, 2, 1, 3, 9, 4, inf, inf], # a
[inf, 0, 4, inf, 3, inf, inf, inf], # b
[inf, inf, 0, 8, inf, inf, inf, inf], # c
[inf, inf, inf, 0, 7, inf, inf, inf], # d
[inf, inf, inf, inf, 0, 5, inf, inf], # e
[inf, inf, 2, inf, inf, 0, 2, 2], # f
[inf, inf, inf, inf, inf, 1, 0, 6], # g
[inf, inf, inf, inf, inf, 9, 8, 0]
] # h
>>> W[a][b] < inf # Neighborhood membership
True
>>> W[c][e] < inf # Neighborhood membership
False
>>> sum(1 for w in W[a] if w < inf) - 1Β Β # Degree # exclude diagonal by -1
5
Numpy Array for Adjacency Matrix
N = [[0]*3 for i in range(3)]
print("Python list: \n",N)
import numpy as np
N = np.zeros([3,3])
print("Numpy array: \n",N)
# Python list:
# [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
# Numpy array:
# [[0. 0. 0.]
# [0. 0. 0.]
# [0. 0. 0.]]
For a relatively sparse graph, Using a sparse matrix in scipy.sparse module is recommended.
Leave a comment