How did Chern picture the first Chern number?
Solution 1:
Chern classes (which are more general than Chern numbers and are needed to construct them) are particular cases of characteristic classes. Generally speaking, characteristic classes are ways to associate to each vector bundle (complex or real) certain classes in the cohomology of the base. Roughly speaking, they measure how non-trivial the bundle is (one should be rather cautious about this interpretation: for example, the vanishing of all characteristic classes does not always imply that the bundle is trivial).
Chern classes are characteristic classes for complex vector bundles (and complex line bundles can always be regarded as principal $U(1)$-bundles).
The classical (and amazing) introduction to the theory is J. Milnor & J. D. Stasheff's book "Characteristic Classes".
I believe characteristic classes were first introduced by Stiefel and Whitney in the mid-1930s; they were probably studying vector fields on manifolds. I'm not sure who introduced Chern classes and when, or who developed the general theory of characteristic classes via classifying maps, but I suspect that in some form they were already known to Chern's teacher Élie Cartan and/or his co-author André Weil. Chern's contribution to this theory is a purely differential-geometric description of Chern classes in terms of the curvature form of a connection, so that Chern numbers become integrals of expressions in the curvature. This approach is now sometimes called Chern-Weil theory.
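To make the curvature description concrete, here is a sketch of the standard formula for a complex line bundle. The notation is introduced here for illustration: $L$ is a complex line bundle over a manifold $M$, $\nabla$ is any connection on $L$, and $F_\nabla$ is its curvature 2-form. The overall sign depends on the convention one picks.

$$c_1(L) = \left[\frac{i}{2\pi} F_\nabla\right] \in H^2(M;\mathbb{R})$$

If $M = \Sigma$ is a closed oriented surface, the first Chern number is the integral of this form:

$$\int_\Sigma \frac{i}{2\pi} F_\nabla \in \mathbb{Z}$$

The cohomology class, and hence the integer, does not depend on the choice of $\nabla$: changing the connection changes $F_\nabla$ by an exact form, whose integral over a closed surface vanishes by Stokes' theorem.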
Answering your second question: you can find it in almost any basic book on algebraic geometry, differential geometry, or algebraic topology (possibly with different points of view depending on the subject). I'm sure the already mentioned book by Milnor & Stasheff contains it.
Indeed, the construction of Chern classes is rather simple, in spite of the fact that one should be familiar (at least at the level of basic properties) with sheaves and their cohomology.
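One common sheaf-theoretic construction of the first Chern class, which may be the one meant here, goes through the exponential sequence. The notation is an assumption introduced for illustration: $X$ is a complex manifold, $\mathcal{O}_X$ is its sheaf of holomorphic functions, and $\mathcal{O}_X^*$ is the sheaf of nowhere-vanishing holomorphic functions.

$$0 \to \mathbb{Z} \to \mathcal{O}_X \xrightarrow{\ \exp(2\pi i\,\cdot)\ } \mathcal{O}_X^* \to 0$$

Isomorphism classes of holomorphic line bundles correspond to elements of $H^1(X, \mathcal{O}_X^*)$, and the connecting homomorphism of the long exact cohomology sequence gives, up to a sign convention,

$$c_1 \colon H^1(X, \mathcal{O}_X^*) \to H^2(X, \mathbb{Z}).$$

For smooth complex line bundles, the same argument works with the sheaves of smooth complex-valued functions in place of $\mathcal{O}_X$ and $\mathcal{O}_X^*$.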