Summing weights of edges that are k steps from a subset of nodes in igraph
I have a complex, directed graph with two-way movements between vertices (see below for a dummy example). I am trying to generate an output that gives the sum of edge weights directed to a specific set of target vertices (in the example below, vertex "22", colored purple in the figure) and to those target vertices' neighbors. I want to determine this for the k1 (colored blue) and k2 (colored green) neighbors of the target vertex.
In other words, I am trying to determine, for each vertex, the sum of all "out" edge values that are directed towards the target vertices & subsequently the sum of all edge values directed towards k1 neighbours of the target vertex.
The network I have is huge (905,352 edges & 141,861 vertices), so I was hoping to solve the problem with igraph functions as I assume that is the fastest approach, but perhaps I am wrong.
library(igraph)
library(ggraph) # needed for the plot below
library(dplyr)  # needed for the %>% pipelines below
# create sample data for reproducible example
from <- c(1,2,3,3,4,4,4,4,5,6,6,7,8,8,9,9,10,10,11,11,12,12,13,13,13,13,13,13,13,14,15,15)
to <- c(13,4,7,11,2,6,11,22,4,4,14,13,13,22,13,22,13,22,3,22,5,22,1,7,8,9,10,22,15,6,13,22)
set.seed(22)
weight <-sample(2:200,length(to))
#create dataframe & convert to igraph
graph_df <- data.frame(from,to,weight)
graph <- graph_from_data_frame(graph_df)
#distance to target vertex "22"
dist <- distances(graph,v="22",mode="in",weights=NA)
ggraph(graph, layout = "graphopt") +
  geom_edge_link(arrow = arrow(length = unit(3, 'mm')),
                 end_cap = circle(3, 'mm'),
                 aes(width = weight), alpha = 0.8) +
  scale_edge_width(range = c(0.1, 2)) +
  geom_node_point(aes(color = factor(-dist), size = factor(-dist))) +
  labs(edge_width = "size movement") +
  theme_graph()
The desired output would be:
vertex   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  22
k1       0   0   0 129   0   0   0  63  66 115 111 162  86   0  92   0
k2     138  89  45 102  68 177  17 187  32  94   0   0 482   0 118   0
total  138  89 120 416  68 294  17 250  98 209 161 184 658 152 210   0
where
k1 = sum of edge weights per vertex on edges directed to the target vertex
k2 = sum of edge weights per vertex on edges directed to k1 neighbors of the target vertex
total = sum of all outgoing edge weights per vertex (i.e. the weighted out strength)
I have tried using the distances() function with weights, which gives the correct sum for k1 neighbours, but not for k2 or beyond.
distances(graph,v="22",mode="in")
#result of distances
   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  22
 224 218 156 129 197 306 103  63  66 115 111 162  86 458  92   0
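To illustrate why the weighted distances() result stops matching after k1, here is a minimal sketch on a four-edge subset of the example. The edge weights are read off the tables in this post (an assumption that they match the seeded sample() output), and the names edges, g and d are only for illustration. With weights, distances() returns the cost of the cheapest whole path into "22", so a k2 vertex gets the sum along the path rather than the weight of its edge into a k1 neighbor.

```r
library(igraph)

# Four-edge subset of the example; weights read off the tables in the question
edges <- data.frame(
  from   = c(1, 13, 8, 8),
  to     = c(13, 22, 22, 13),
  weight = c(138, 86, 63, 187)
)
g <- graph_from_data_frame(edges)

# Weighted distances are shortest-path costs into "22" (one row, one column per vertex)
d <- distances(g, v = "22", mode = "in")

d[1, "1"]  # 224 = 138 + 86: the whole path 1 -> 13 -> 22, not just the edge 1 -> 13
d[1, "8"]  # 63: the direct edge 8 -> 22; the edge 8 -> 13 (187) is never counted
```

So vertex 1 is reported as 224 where the desired k2 value is 138, and vertex 8 never contributes its 187 towards vertex 13.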
I have had some luck with dplyr on the edge list data frame (see below), but I assume there are faster ways to approach this:
# data frame of k1 neighbors & summed weight
k1 <- graph_df %>%
  mutate(k1 = ifelse(to == "22", weight, NA)) %>%
  group_by(from) %>%
  summarise(total = sum(weight, na.rm = TRUE),
            k1 = sum(k1, na.rm = TRUE))
# data frame of k2 neighbors & summed weight
k2 <- graph_df %>%
  mutate(k2 = ifelse(to %in% k1$from[k1$k1 > 0], weight, NA)) %>%
  group_by(from) %>%
  summarise(k2 = sum(k2, na.rm = TRUE))
# join
out <- left_join(k1, k2, by = "from") %>% rename(vertex = from)
# A tibble: 15 × 4
vertex total k1 k2
<dbl> <int> <int> <int>
1 1 138 0 138
2 2 89 0 89
3 3 120 0 45
4 4 416 129 102
5 5 68 0 68
6 6 294 0 177
7 7 17 0 17
8 8 250 63 187
9 9 98 66 32
10 10 209 115 94
11 11 161 111 0
12 12 184 162 0
13 13 658 86 482
14 14 152 0 0
15 15 210 92 118
Solution 1:
Perhaps you can try this
graph_df %>%
  group_by(from) %>%
  summarise(total = sum(weight)) %>%
  full_join(
    graph_df %>%
      filter(to %in% 22) %>%
      group_by(from) %>%
      summarise(K1 = sum(weight)) %>%
      full_join(
        graph_df %>%
          # as_ids() gives vertex names; a bare vertex sequence compares internal ids,
          # which only happen to equal the names in this example
          filter(to %in% as_ids(neighbors(graph, "22", mode = "in"))) %>%
          group_by(from) %>%
          summarise(K2 = sum(weight)),
        by = "from"
      ),
    by = "from"
  ) %>%
  arrange(from) %>%
  replace(is.na(.), 0) %>%
  rename(vertex = from)
which gives
vertex total K1 K2
<dbl> <int> <int> <int>
1 1 138 0 138
2 2 89 0 89
3 3 120 0 45
4 4 416 129 102
5 5 68 0 68
6 6 294 0 177
7 7 17 0 17
8 8 250 63 187
9 9 98 66 32
10 10 209 115 94
11 11 161 111 0
12 12 184 162 0
13 13 658 86 482
14 14 152 0 0
15 15 210 92 118
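If more than two levels are needed, the same idea can be sketched level by level on the igraph object itself. This is only a sketch under stated assumptions: weight_into() is a hypothetical helper, not an igraph function, and the nine edges are a subset of the example with weights read off the tables above. Vertex 13's other out-edges are left out, so its total and k2 differ from the full result. Each pass sums, per vertex, the weights of out-edges whose head lies in the current frontier, then moves the frontier one step back along the incoming edges.

```r
library(igraph)

# Subset of the example; weights read off the tables above (vertex 13 is incomplete)
edges <- data.frame(
  from   = c(1, 3, 3, 7, 8, 8, 11, 11, 13),
  to     = c(13, 7, 11, 13, 13, 22, 3, 22, 22),
  weight = c(138, 75, 45, 17, 187, 63, 50, 111, 86)
)
g <- graph_from_data_frame(edges)

ep    <- ends(g, E(g), names = FALSE)  # column 1 = tail id, column 2 = head id
w     <- E(g)$weight
tails <- factor(ep[, 1], levels = seq_len(vcount(g)))

# Hypothetical helper: per vertex, sum the weights of out-edges whose head is in `heads`
weight_into <- function(heads) {
  s <- tapply(ifelse(ep[, 2] %in% heads, w, 0), tails, sum)
  s[is.na(s)] <- 0  # vertices without any out-edge
  as.vector(s)
}

res <- data.frame(vertex = V(g)$name,
                  total  = unname(strength(g, mode = "out")))

frontier <- match("22", V(g)$name)  # internal id of the target vertex
for (k in 1:2) {
  res[[paste0("k", k)]] <- weight_into(frontier)
  # next frontier: every vertex with an edge into the current one
  frontier <- unique(ep[ep[, 2] %in% frontier, 1])
}
res
```

Raising the upper bound of the loop adds k3, k4 and so on. Whether this is faster than the dplyr version on a graph with roughly 900,000 edges would need to be benchmarked, since both do a single grouped sum over the edge list per level.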