Number of edges in a graph with n vertices and k connected components

Let $m$ be the number of edges, $n$ the number of vertices and $k$ the number of connected components of a graph G.

Prove that:

$m$ $\leq$ $\frac{(n-k+1)*(n-k)}{2}$

Thanks!


Solution 1:

The maximum number of edges is clearly achieved when all the components are complete.

Moreover the maximum number of edges is achieved when all of the components except one have one vertex. The proof is by contradiction. Suppose the maximum is achieved in another case. Then there exist two components with more than one vertex say the number of vertices are $n$ and $m$ . Pick the one with the less vertices suppose it is $m$ vertices. Take one of it vertices and delete it. removing $m-1$ edges. now add a new vertex to the component with $n$ vertices and join it to all its vertices, adding $n$ edges. This graph has more edges, contradicting the maximality of the graph.

Hence the maximum is achieved when only one of the components has more than one vertex. How many vertices does this graph have? the big component has $n-k+1$ vertices and is the only one with edges. So it has $\frac{(n-k+1)(n-k)}{2}$ edges.

Solution 2:

Below is the proof replicated from the book by Narsingh Deo, which I myself do not completely realize, but putting it here for reference and also in hope that someone will help me understand it completely. Things in red are what I am not able to understand.

Proof

Let the number of vertices in each of the $k$ components of a graph G be $n_1,n_2,...,n_k$. Thus we have,

$$n_1+n_2+...+n_k=n$$ $$n_i\geq 1$$

The proof of the theorem is based on the inequality $$\sum^k_{i=1}n_i^2\leq n^2 -(k-1)(2n-k)$$

This inequality can be proved as follows.

$$\sum_{i=1}^k(n_i-1)=n-k$$ Squaring both side, $$\left(\sum_{i=1}^k(n_i-1)\right)^2=n^2+k^2-2nk$$ or $$\color{red}{\sum_{i=1}^k(n_i^2-2n_i)+k+\text{nonnegative cross terms}= n^2+k^2-2nk}$$

because $(n_i-1)\geq 0$, for all $i$.

Therefore, $$\color{red}{\sum_{i=1}^kn_i^2\leq n^2+k^2-2nk-k+2n=n^2-(k-1)(2n-k)}$$

Thus the required inequality is proved.

Now the maximum number of edges in $i^{th}$ component of G (which is simple connected graph) is $\frac{1}{2}n_i(n_i-1)$. Therefore, the maximum number of edges in $G$ is

$$\frac{1}{2}\sum^k_{i=1}(n_i-1)n_i=\frac{1}{2}\left( \sum_{i=1}^kn_i^2 \right) - \frac{n}{2}$$ $$\leq \frac{1}{2} \left( n^2-(k-1)(2n-k) \right) - \frac{n}{2}$$ $$=\frac{1}{2}(n-k)(n-k+1)$$

Solution 3:

I've answered the OP's specific question as to how the book's proof makes sense. I haven't given the complete proof in my answer. I have just explained the steps marked in red, in @Mahesha999's answer. Thus, this is just an elaborate extension of @Mahesha999's answer.


A more detail look into the algebraic proof.

Assuming $n_1 + n_2 + ... + n_k = n$ and $n_i \geq 1$, the proof from the book uses the following algebraic identity to solve the problem:

$$\sum^k_{i=1}n_i^2\leq n^2 -(k-1)(2n-k) \;\;\;\;\;...(1)$$

At a first glance, what happens internally might not seem apparent. The proof for the above identity follows from expanding the following expression.

$$\left(\sum_{i=1}^k(n_i-1)\right)^2=n^2+k^2-2nk \;\;\;\;\;...(2)$$

The choice of using the term $(n_i - 1)$ follows directly as $n_i \geq 1$ or $n_i - 1 \geq 0$. This is a maximization problem, thus, the problem must either be solved by maximizing a positive term (or trying to equate a part of it to zero) or by minimizing a negative term. Note that $n$ is assumed to be a constant, but we are free to vary the distribution of the number of vertices in each of the components in the graph; thus we are free to vary the values taken by $n_1, n_2, ..., n_k$ as long as their sum remains equal to $n$. (2) can be written as,

$$\sum_{i=1}^k(n_i^2-2n_i)+k+\sum_{i, j \in [1, k], i \neq j}((n_i - 1)(n_j-1))= n^2+k^2-2nk \;\;\;\;\;...(3)$$

The positive terms that are neglected are, $$\sum_{i, j \in [1, k], i \neq j}((n_i - 1)(n_j-1))\;\;\;\;\;...(4)$$

I was reading the same book and I had the same problem. Fortunately, I was able to understand it in the following way. If simply removing the positive terms was enough, then it is possible to write,

$$\sum_{i=1}^kn_i^2 < n^2-(k-1)(2n-k)$$

But claiming

$$\sum_{i=1}^kn_i^2 \leq n^2-(k-1)(2n-k)$$

Requires us to have ways for convincing ourselves that the value of $\sum_{i=1}^kn_i^2$ can become equal to $n^2-(k-1)(2n-k)$ for some values of $n_i$.

The RHS in (3) fully involves constants. Thus, its value is bound to remain static. Is it possible to vary the values of $n_i$, as long as its sum equals $n$. Hence to maximize the value of the term $\sum_{i=1}^kn_i^2$ (which is our ultimate goal), we must minimize the value of the term (4), all the while ensuring that the sum $\sum n_i$ equals $n$. Doing this will maximize $\sum_{i=1}^kn_i^2$ because, the RHS does not change as $n$ and $k$ are fixed; thus, out of the two terms present in the LHS, reducing the value of (4) must increase the value of the term $\sum_{i=1}^kn_i^2$. Maximizing the term $\sum_{i=1}^kn_i^2$ eventually causes the summation $\frac{1}{2}\sum^k_{i = 1}(n_i (n_i-1))$ to be maximized leading us to the result.

Thus we must just show that (4) can be equated to $0$, with the value of the summation $\sum(n_i)$ still being equal to $n$.

$$\sum_{i = 1}^k \sum_{j = i + 1}^k (n_i - 1)(n_j-1) = 0, \sum_{i = 1}^k n_i = n ...(5)$$

For a constant $ 1 \leq c \leq k $, let's assign $n_c = n- k$ and for all values of $i$, with $i \neq c$, assign $n_i = 1$. As every term $(n_i - 1)$ in (4) has every other term $(n_j - 1)$ (with $i \neq j$ ) as a coefficient. Thus all terms reduce to zero. This it has been established that (4) can take the value zero. But the RHS remains the same; hence to compensate for the loss in magnitude, the term $\sum_{i=1}^kn_i^2$ get maximized.

Hence we have shown the validity of (5). Thus, we can write (3) as,

$$\sum_{i=1}^k(n_i^2-2n_i)+k+\sum_{i, j \in [1, k], i \neq j}((n_i - 1)(n_j-1))= n^2+k^2-2nk$$

or,

$$\sum_{i=1}^k(n_i^2-2n_i)+k \leq n^2+k^2-2nk \;\;\;\;\;...(6)$$