In this thread (posted about a year ago) there is a discussion of problems that can come with running Word in a non-interactive session. The (quite strong) advice given there is not to do so. In one post it is stated "The Office APIs all assume you are running Office in an interactive session on a desktop, with a monitor, keyboard and mouse and, most importantly, a message pump." I'm not sure what that is. (I've been programming in C# for only about a year; my other programming experience has primarily been with ColdFusion.)

Update:

My program runs through a large number of RTF files to extract two pieces of information used to construct a medical report number. Rather than try and figure out how the formatting instructions in RTF work, I decided to just open them in Word and pull the text out from there (without actually starting the GUI). Occasionally, the program hiccuped in the middle of processing one file, and left a Word thread open attached to that document (I still have to figure out how to shut that one down). When I re-ran the program, of course I got a notification that there was a thread using that file, and did I want to open a read-only copy? When I said Yes, the Word GUI suddenly popped up from nowhere and started processing the files. I was wondering why that happened; but it looks like maybe once the dialog box popped up the message pump started pushing the main GUI to Windows as well?


Solution 1:

A message loop is a small piece of code that exists in any native Windows program. It roughly looks like this:

MSG msg;
while (GetMessage(&msg, NULL, 0, 0))
{ 
   TranslateMessage(&msg); 
   DispatchMessage(&msg); 
} 

The GetMessage() Win32 API retrieves a message from Windows. Your program typically spends 99.9% of its time there, waiting for Windows to tell it something interesting happened. TranslateMessage() is a helper function that translates keyboard messages. DispatchMessage() ensures that the window procedure is called with the message.

Every GUI enabled .NET program has a message loop, it is started by Application.Run().

The relevance of a message loop to Office is related to COM. Office programs are COM-enabled programs, that's how the Microsoft.Office.Interop classes work. COM takes care of threading on behalf of a COM coclass, it ensures that calls made on a COM interface are always made from the correct thread. Most COM classes have a registry key in the registry that declares their ThreadingModel, by far the most common ones (including Office) use "Apartment". Which means that the only safe way to call an interface method is by making the call from the same thread that created the class object. Or to put it another way: by far most COM classes are not thread-safe.

Every COM enabled thread belongs to a COM apartment. There are two kinds, Single Threaded Apartments (STA) and a Multi Thread Apartment (MTA). An apartment threaded COM class must be created on an STA thread. You can see this back in .NET programs, the entry point of the UI thread of a Windows Forms or WPF program has the [STAThread] attribute. The apartment model for other threads is set by the Thread.SetApartmentState() method.

Large parts of Windows plumbing won't work correctly if the UI thread is not STA. Notably Drag+Drop, the clipboard, Windows dialogs like OpenFileDialog, controls like WebBrowser, UI Automation apps like screen readers. And many COM servers, like Office.

A hard requirement for an STA thread is that it should never block and must pump a message loop. The message loop is important because that's what COM uses to marshal an interface method call from one thread to another. Although .NET makes marshaling calls easy (Control.BeginInvoke or Dispatcher.BeginInvoke for example), it is actually a very tricky thing to do. The thread that executes the call must be in a well-known state. You can't just arbitrarily interrupt a thread and force it to make a method call, that would cause horrible re-entrancy problems. A thread should be "idle", not busy executing any code that is mutating the state of the program.

Perhaps you can see where that leads: yes, when a program is executing the message loop, it is idle. The actual marshaling takes place through a hidden window that COM creates, it uses PostMessage to have the window procedure of that window execute code. On the STA thread. The message loop ensures that this code runs.

Solution 2:

The "message pump" is a core part of any Windows program that is responsible for dispatching windowing messages to the various parts of the application. This is the core of Win32 UI programming. Because of its ubiquity, many applications use the message pump to pass messages between different modules, which is why Office applications will break if they are run without any UI.

Wikipedia has a basic description.

Solution 3:

John is talking about how the Windows system (and other window based systems - X Window, original Mac OS....) implement asynchronous user interfaces using events via a message system.

Behind the scenes for each application there is a messaging system where each window can send events to other windows or event listeners -- this is implemented by adding a message to the message queue. There is a main loop which always runs looking at this message queue and then dispatching the messages (or events) to the listeners.

The Wikipedia article Message loop in Microsoft Windows shows example code of a basic Windows program -- and as you can see at the most basic level a Windows program is just the "message pump".

So, to pull it all together. The reason a windows program designed to support a UI can't act as a service is because it needs the message loop running all the time to enable UI support. If you implement it as a service as described, it won't be able to process the internal asynchronous event handling.

Solution 4:

In COM, a message pump serialises and de-serialises messages sent between apartments. An apartment is a mini process in which COM components can be run. Apartments come in single threaded and free threaded modes. Single threaded apartments are mainly a legacy system for applications of COM components that don't support multi-threading. They were typically used with Visual BASIC (as this did not support multi-threaded code) and legacy applications.

I guess that the message pump requirement for Word stems from either the COM API or parts of the application not being thread safe. Bear in mind that the .NET threading and garbage collection models don't play nicely with COM out of the box. COM has a very simplistic garbage collection mechanism and threading model that requires you to do things the COM way. Using the standard Office PIAs still requires you to explicitly shut down COM object references, so you need to keep track of every COM handle created. The PIAs will also create stuff behind the scenes if you're not careful.

.NET-COM integration is a whole topic all by itself, and there are even books written on the subject. Even using COM APIs for Office from an interactive desktop application requires you to jump through hoops and make sure that references are explicitly released.

Office can be assumed to be thread-unsafe, so you will need a separate instance of Word, Excel or other Office applications for each thread. You would have to incur the starting overhead or maintain a thread pool. A thread pool would have to be meticulously tested to make sure all COM references were correctly released. Even starting and shutting down instances requires you to make sure all references are released correctly. Failure to dot your i's and cross your t's here will result in large numbers of dead COM objects and even whole running instances of Word being leaked.

Solution 5:

Wikipedia suggests it means the program's main Event Loop.