Solution 1:

In this sentence, "to cross the road" is an objective predicative complement.

Such sentences take the general form:

SUBJECT + VERB + OBJECT + COMPLEMENT

The object here is clearly "him".

Side note

As far as I'm aware, the "objective complement" is a construct rather peculiar to the English language. Other European languages (I can vouch for French and Spanish at least) require a separate clause, something akin to "I saw that he crossed the road." (That is also perfectly grammatical English, but less natural.)

Solution 2:

A verb (cross) cannot be the object.

Break the statement into smaller pieces and the true object becomes more obvious:

I saw him

'I' is the subject, 'saw' is the verb, and 'him' is the accusative object.

he crossed the road

The crossing is a separate activity from the seeing.