PHP to clean up pasted Microsoft input
HTML Purifier will create standards-compliant markup and filter out many possible attacks (such as XSS).
For faster cleanups that don't require XSS filtering, I use the PECL extension Tidy, which is a binding for the Tidy HTML utility.
If those don't help you, I suggest you switch to FCKeditor, which has this feature built in.
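As a rough sketch of the Tidy route, something like the following might work, assuming the tidy extension is installed. The helper name clean_with_tidy is made up for illustration, and the option set is only one plausible choice (word-2000, drop-proprietary-attributes and show-body-only are standard Tidy options). As noted above, this does not filter XSS.

```php
<?php
// Hypothetical helper: repairs pasted markup with the tidy extension.
// Returns null when the extension is missing or the repair fails.
function clean_with_tidy(string $html): ?string
{
    if (!function_exists('tidy_repair_string')) {
        return null; // tidy extension not available
    }

    $config = [
        'word-2000'                   => true, // strip Word 2000 surplus markup
        'drop-proprietary-attributes' => true, // drop vendor-specific attributes
        'show-body-only'              => true, // return a fragment, not a full document
    ];

    $result = tidy_repair_string($html, $config, 'utf8');

    return $result === false ? null : $result;
}

$cleaned = clean_with_tidy('<p class="MsoNormal">Pasted <b>text</p>');
```

The exact output depends on the Tidy version in use, so it is worth checking the result against real pasted content before relying on it.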
In my case, this worked just fine:
$text = strip_tags($text, '<p><a><em><span>');
Rather than trying to pull out stuff you don't want, such as embedded Word XML, you can just specify your allowed tags.
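One thing to keep in mind with this approach: strip_tags() removes disallowed tags but leaves the attributes on allowed tags untouched, so Word's class and style attributes survive. The sketch below illustrates this with a made-up sample string, followed by a deliberately crude regex that drops attributes from a few tags. The regex is an assumption for illustration only; it would break on attribute values containing ">" and is not a substitute for a real sanitizer such as HTML Purifier.

```php
<?php
// Made-up sample of Word-style pasted markup.
$pasted = '<p class="MsoNormal" style="margin:0">Hello <o:p></o:p>'
        . '<b>bold</b> <em>world</em></p>';

// Keep only the allowed tags; <o:p> and <b> are removed, their text is kept.
$clean = strip_tags($pasted, '<p><a><em><span>');

// Attributes on allowed tags are still present at this point,
// e.g. class="MsoNormal" on the <p>.

// Crude follow-up: drop all attributes from <p>, <em> and <span>.
// <a> is left alone here so that href is not lost.
$plain = preg_replace('/<(p|em|span)\b[^>]*>/i', '<$1>', $clean);

echo $plain, PHP_EOL;
```

If the allowed list includes tags like a or span, consider whether attributes such as href, style or event handlers need separate handling, since strip_tags() alone will pass them through.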