Why does Gauss-Jordan elimination work when finding the inverse matrix?
You have three types of what are called elementary matrices, corresponding to the three row operations: swapping two rows, scaling a row, and adding a multiple of one row to another. Left-multiplying a matrix by an elementary matrix performs that operation; for example, for a 3x3 matrix, the elementary matrix $$\pmatrix{1&0&0\\5&1&0\\0&0&1}$$ adds 5 times the first row to the second (can you figure out what the other two types look like?).

If a matrix $A$ is invertible, there is a sequence of row operations that reduces it to the identity matrix, which means there is some set of elementary matrices such that $$E_nE_{n-1}\dots E_2E_1A=I.$$ Right-multiplying both sides by $A^{-1}$ (which exists, since $A$ is invertible) gives $$E_nE_{n-1}\dots E_2E_1AA^{-1}=IA^{-1},$$ that is, $$E_nE_{n-1}\dots E_2E_1I=A^{-1}.$$ So the same sequence of row operations that reduces $A$ to the identity matrix turns the identity matrix into $A^{-1}$. If we start with $(A|I)$ and reduce the left side to the identity matrix, we therefore end up with $(I|A^{-1})$, which is exactly the algorithm.
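Here is a minimal sketch of the augmented-matrix procedure in Python/NumPy; the function name `gauss_jordan_inverse`, the partial-pivoting choice, and the test matrix are illustrative assumptions of mine, not part of the answer above. Each step is one of the three elementary row operations, applied to the whole augmented matrix $(A|I)$:

```python
import numpy as np

def gauss_jordan_inverse(A):
    """Invert A by row-reducing the augmented matrix (A | I) to (I | A^{-1})."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    aug = np.hstack([A, np.eye(n)])              # form (A | I)

    for col in range(n):
        # Row swap: move the largest available pivot into place (partial pivoting).
        pivot = col + np.argmax(np.abs(aug[col:, col]))
        if np.isclose(aug[pivot, col], 0.0):
            raise ValueError("matrix is not invertible")
        aug[[col, pivot]] = aug[[pivot, col]]

        # Scaling: make the pivot entry equal to 1.
        aug[col] /= aug[col, col]

        # Row additions: clear every other entry in this column.
        for row in range(n):
            if row != col:
                aug[row] -= aug[row, col] * aug[col]

    return aug[:, n:]                            # left half is now I, right half is A^{-1}


A = np.array([[2., 1., 1.],
              [1., 3., 2.],
              [1., 0., 0.]])
A_inv = gauss_jordan_inverse(A)
print(np.allclose(A_inv @ A, np.eye(3)))         # True
```

Because every row operation is applied to the full augmented matrix, the right half records the cumulative effect of all the elementary matrices, which by the argument above is exactly $A^{-1}$.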
You want to find a matrix $B$ such that $BA = I$. A matrix can be written as a product of elementary matrices iff it is invertible, so we try to obtain $B$ by left-multiplying $A$ by elementary matrices until it becomes $I$. All we then have to do is keep track of the product of those elementary matrices; but that is exactly what we are doing when we left-multiply $I$ by the same elementary matrices in the same order (see the sketch below). This is what is happening with the augmented matrix.
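To make "keep track of the product" concrete, here is a small sketch that builds each elementary matrix explicitly and accumulates their product; the function name and the no-pivoting simplification (it assumes nonzero pivots appear on the diagonal) are assumptions for illustration only:

```python
import numpy as np

def inverse_via_elementary_matrices(A):
    """Reduce A to I with explicit elementary matrices, accumulating their product B = E_k ... E_1."""
    A = np.asarray(A, dtype=float).copy()
    n = A.shape[0]
    B = np.eye(n)                    # plays the role of the right half of (A | I)

    for col in range(n):
        # Elementary matrix that scales row `col` so its pivot becomes 1
        # (no pivoting here, so a nonzero diagonal pivot is assumed).
        E = np.eye(n)
        E[col, col] = 1.0 / A[col, col]
        A, B = E @ A, E @ B

        for row in range(n):
            if row != col:
                # Elementary matrix that subtracts A[row, col] times the pivot row from row `row`.
                E = np.eye(n)
                E[row, col] = -A[row, col]
                A, B = E @ A, E @ B

    return B                         # A has been reduced to I, so B equals A^{-1}


A = np.array([[4., 7.],
              [2., 6.]])
B = inverse_via_elementary_matrices(A)
print(np.allclose(B @ A, np.eye(2)))             # True: B is the product of the elementary matrices
```

The accumulated matrix `B` is exactly what the right half of the augmented matrix $(A|I)$ holds at each step of the elimination.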