How does Drupal work? [closed]

Could someone provide an architectural overview of the Drupal 7 control flow? Perhaps in the form of a flowchart showing how a page gets generated. What additional resources would you suggest consulting to learn how Drupal works?


Solution 1:

Drupal can be confusing on this front, partially because it has a relatively deep function stack. Although it's procedural PHP, its architecture is purely event/listener driven, and there's no simple "flow" in the main PHP script for you to look through. I recently did a presentation on this very subject, and the slides are posted on SlideShare, but a quick high-level summary may be useful.

  • Drupal's index.php file functions as a front controller. All pages are piped through it, and the "actual" url/path the user requested is passed to index.php as a parameter.
  • Drupal's path router system (MenuAPI) is used to match the requested path to a given plugin module. That plugin module is responsible for building the "primary content" of the page.
  • Once the primary page content is built, index.php calls theme('page', $content), which hands off the content to Drupal's theming/skinning system. There, it's wrapped in sidebars/headers/widgets/etc.
  • The rendered page is then handed back to Apache and it gets sent back to the user's browser.
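The four steps above can be sketched in a few lines of plain PHP. This is only an illustration, not Drupal's code: the route table, sketch_theme_page() and sketch_front_controller() are made-up names, and the real index.php delegates to the menu and theme systems instead of doing this work itself.

```php
<?php
// Hypothetical route table: path => callback that builds the primary content.
$routes = array(
  'news/recent' => function () {
    return 'Recent news items';
  },
);

// Stand-in for the theme layer: wraps the primary content in page chrome.
function sketch_theme_page($content) {
  return '<html><body><div id="header">My site</div>'
    . '<div id="content">' . $content . '</div></body></html>';
}

// Stand-in for index.php: every request lands here with the path as a parameter.
function sketch_front_controller(array $routes, $path) {
  if (!isset($routes[$path])) {
    // No module claimed this path.
    return sketch_theme_page('404: Page not found');
  }
  // Let the owning callback build the primary content, then theme it.
  $content = call_user_func($routes[$path]);
  return sketch_theme_page($content);
}

echo sketch_front_controller($routes, 'news/recent'), "\n";
```

The only point of the sketch is the shape: one entry script, a lookup from path to callback, and a final wrapping step before the output goes back to the web server.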

During that entire process, Drupal and third-party plugin modules are firing off events and listening for them so they can respond. Drupal calls this the 'hook' system, and it's implemented using function naming conventions. The 'blog' module, for example, can intercept 'user'-related events by implementing a function named blog_user(). In Drupal parlance, that's called hook_user().

It's a bit clunky, but due to a PHP quirk (it keeps an internal hashtable of all loaded functions), it allows Drupal to quickly check for listeners just by iterating over a list of installed plugins. For each plugin it can call function_exists() on the appropriately named pattern, and call the function if it exists. ("I'm firing the 'login' event. Does 'mymodule_login' function exist? I'll call it. Does 'yourmodule_login' exist? No? How about 'nextmodule_login'?" etc.) Again, a touch clunky but it works pretty well.
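A minimal sketch of that lookup, assuming a hard-coded module list. The module names and sketch_invoke_all() are invented for the example; Drupal's own dispatcher lives in includes/module.inc and does more than this.

```php
<?php
// Two hypothetical modules. Only 'mymodule' listens for the 'login' event.
function mymodule_login($name) {
  return "mymodule saw $name log in";
}
function nextmodule_cron() {
  return 'nextmodule ran cron';
}

// Call {module}_{hook}() for every module that defines it and collect results.
function sketch_invoke_all(array $modules, $hook) {
  // Anything after the first two arguments is passed on to the listeners.
  $args = array_slice(func_get_args(), 2);
  $results = array();
  foreach ($modules as $module) {
    $function = $module . '_' . $hook;
    if (function_exists($function)) {
      $results[] = call_user_func_array($function, $args);
    }
  }
  return $results;
}

$modules = array('mymodule', 'yourmodule', 'nextmodule');
print_r(sketch_invoke_all($modules, 'login', 'alice'));
```

Modules that don't define the function are simply skipped, which is why a listener can be added or removed without touching the code that fires the event.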

Everything that happens in Drupal happens because of one of those events being fired. The MenuAPI only knows about what urls/paths are handled by different plugin modules because it fires the 'menu' event (hook_menu) and gathers up all the metadata plugin modules respond with. ("I'll take care of the url 'news/recent', and here's the function to call when that page needs to be built...") Content only gets saved because Drupal's FormAPI is responsible for building a page, and fires the 'a form was submitted' event for a module to respond to. Hourly maintenance happens because hook_cron() is triggered, and any module with mymodulename_cron() as a function name will have its function called.
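To make the 'menu' event concrete, here is a rough sketch of a module claiming a path and of the registry being assembled from the answers. The 'news' module, its callback and sketch_build_router() are hypothetical; the 'title' and 'page callback' keys follow the shape hook_menu() implementations return, but the real menu system also stores and caches the result rather than rebuilding it per request.

```php
<?php
// Hypothetical 'news' module answering the 'menu' event.
function news_menu() {
  return array(
    'news/recent' => array(
      'title' => 'Recent news',
      'page callback' => 'news_recent_page',
    ),
  );
}

// The function named above; it builds the primary content for that path.
function news_recent_page() {
  return 'Latest headlines';
}

// Fire the 'menu' event and merge every module's answer into one registry.
function sketch_build_router(array $modules) {
  $router = array();
  foreach ($modules as $module) {
    $function = $module . '_menu';
    if (function_exists($function)) {
      $router = array_merge($router, call_user_func($function));
    }
  }
  return $router;
}

// 'blog' has no blog_menu() in this sketch, so it contributes nothing.
$router = sketch_build_router(array('news', 'blog'));
echo call_user_func($router['news/recent']['page callback']), "\n";
```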

Everything else is ultimately just details -- important details, but variations on that theme. index.php is the controller, the menu system determines what the "current page" is, and lots of events get fired in the process of building that page. Plugin modules can hook into those events and change the workflow/supply additional information/etc. That's also part of the reason so many Drupal resources focus on making modules. Without modules, Drupal doesn't actually DO anything other than say, 'Someone asked for a page! Does it exist? No? OK, I'll serve up a 404.'

Solution 2:

Drupal Page Serving Mechanism

To understand how Drupal works, you need to understand Drupal's page serving mechanism.

In short, all calls/urls/requests are served by index.php, which loads Drupal by including various include files and modules and then calls the appropriate function, defined in a module, to serve the request/url.

Here is an extract from the book Pro Drupal Development that explains Drupal's bootstrap process:

The Bootstrap Process

Drupal bootstraps itself on every request by going through a series of bootstrap phases. These phases are defined in bootstrap.inc and proceed as described in the following sections.

Initialize Configuration

This phase populates Drupal’s internal configuration array and establishes the base URL ($base_url) of the site. The settings.php file is parsed via include_once(), and any variable or string overrides established there are applied. See the “Variable Overrides” and “String Overrides” sections of the file sites/default/default.settings.php for details.

Early Page Cache

In situations requiring a high level of scalability, a caching system may need to be invoked before a database connection is even attempted. The early page cache phase lets you include (with include()) a PHP file containing a function called page_cache_fastpath(), which takes over and returns content to the browser. The early page cache is enabled by setting the page_cache_fastpath variable to TRUE, and the file to be included is defined by setting the cache_inc variable to the file’s path. See the chapter on caching for an example.

Initialize Database

During the database phase, the type of database is determined, and an initial connection is made that will be used for database queries.

Hostname/IP-Based Access Control

Drupal allows the banning of hosts on a per-hostname/IP address basis. In the access control phase, a quick check is made to see if the request is coming from a banned host; if so, access is denied.

Initialize Session Handling

Drupal takes advantage of PHP’s built-in session handling but overrides some of the handlers with its own to implement database-backed session handling. Sessions are initialized or reestablished in the session phase. The global $user object representing the current user is also initialized here, though for efficiency not all properties are available (they are added by an explicit call to the user_load() function when needed).

Late Page Cache

In the late page cache phase, Drupal loads enough supporting code to determine whether or not to serve a page from the page cache. This includes merging settings from the database into the array that was created during the initialize configuration phase and loading or parsing module code. If the session indicates that the request was issued by an anonymous user and page caching is enabled, the page is returned from the cache and execution stops.

Language Determination

At the language determination phase, Drupal’s multilingual support is initialized and a decision is made as to which language will be used to serve the current page based on site and user settings. Drupal supports several alternatives for determining language support, such as path prefix and domain-level language negotiation.

Path

At the path phase, code that handles paths and path aliasing is loaded. This phase enables human-readable URLs to be resolved and handles internal Drupal path caching and lookups.

Full

This phase completes the bootstrap process by loading a library of common functions, theme support, and support for callback mapping, file handling, Unicode, PHP image toolkits, form creation and processing, mail handling, automatically sortable tables, and result set paging. Drupal’s custom error handler is set, and all enabled modules are loaded. Finally, Drupal fires the init hook, so that modules have an opportunity to be notified before official processing of the request begins.
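The phases above run strictly in order, and an early phase can end the request before the later ones start. A toy sketch of that sequencing follows; the phase labels mirror the headings above, while sketch_bootstrap() and the handler array are invented for illustration and are not the code in bootstrap.inc.

```php
<?php
// Phase labels taken from the headings above, in execution order.
$phases = array(
  'configuration', 'early page cache', 'database', 'access',
  'session', 'late page cache', 'language', 'path', 'full',
);

// Run phases in order up to and including $target. A handler that returns
// TRUE signals that the request was fully served, so no further phase runs.
function sketch_bootstrap(array $phases, $target, array $handlers = array()) {
  $ran = array();
  foreach ($phases as $phase) {
    $ran[] = $phase;
    $handled = isset($handlers[$phase]) ? call_user_func($handlers[$phase]) : FALSE;
    if ($handled === TRUE || $phase === $target) {
      break;
    }
  }
  return $ran;
}

// Assumed scenario: an anonymous request with a cached copy of the page.
$handlers = array(
  'late page cache' => function () {
    return TRUE;
  },
);
print_r(sketch_bootstrap($phases, 'full', $handlers));
```

Stopping at a named target phase is included because a script that only needs, say, the database can ask for a partial bootstrap instead of the full one.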

Once Drupal has completed bootstrapping, all components of the framework are available. It is time to take the browser’s request and hand it off to the PHP function that will handle it. The mapping between URLs and functions that handle them is accomplished using a callback registry that takes care of both URL mapping and access control. Modules register their callbacks using the menu hook (for more details, see Chapter 4).

When Drupal has determined that there exists a callback to which the URL of the browser request successfully maps and that the user has permission to access that callback, control is handed to the callback function.
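As a rough illustration of a registry that handles both URL mapping and access control, consider the sketch below. The registry layout, permission strings, callbacks and sketch_dispatch() are all assumptions made for the example, not the structure Drupal actually stores.

```php
<?php
// Hypothetical registry: each path names a callback and a required permission.
$registry = array(
  'node/3' => array(
    'callback' => 'sketch_node_view',
    'permission' => 'access content',
  ),
  'admin' => array(
    'callback' => 'sketch_admin_page',
    'permission' => 'administer site',
  ),
);

function sketch_node_view() {
  return 'Node 3 body';
}

function sketch_admin_page() {
  return 'Admin dashboard';
}

// Return the callback's output, or an HTTP-style status code when the path
// is unknown (404) or the user lacks the required permission (403).
function sketch_dispatch(array $registry, $path, array $user_permissions) {
  if (!isset($registry[$path])) {
    return 404;
  }
  $item = $registry[$path];
  if (!in_array($item['permission'], $user_permissions, TRUE)) {
    return 403;
  }
  return call_user_func($item['callback']);
}

var_dump(sketch_dispatch($registry, 'admin', array('access content')));
```

Checking access in the dispatcher rather than in each callback means a callback never runs for a user who is not allowed to see its output.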

Processing a Request

The callback function does whatever work is required to process and accumulate data needed to fulfill the request. For example, if a request for content such as http://example.com/?q=node/3 is received, the URL is mapped to the function node_page_view() in node.module. Further processing will retrieve the data for that node from the database and put it into a data structure. Then, it’s time for theming.

Theming the Data

Theming involves transforming the data that has been retrieved, manipulated, or created into HTML (or XML or other output format). Drupal will use the theme the administrator has selected to give the web page the correct look and feel. The resulting output is then sent to the web browser (or other HTTP client).
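One way to picture how the selected theme changes the output is a lookup that prefers a function supplied by the active theme and falls back to a default renderer. Everything here is hypothetical (the 'headline' element, the 'mytheme' theme, sketch_theme()); it only mirrors the naming-convention idea, not Drupal's theme registry.

```php
<?php
// Default renderer for a hypothetical 'headline' element.
function sketch_default_headline(array $vars) {
  return '<h2>' . htmlspecialchars($vars['text']) . '</h2>';
}

// Override supplied by a hypothetical theme named 'mytheme'.
function mytheme_headline(array $vars) {
  return '<h1 class="fancy">' . htmlspecialchars($vars['text']) . '</h1>';
}

// Prefer the active theme's function; fall back to the default renderer.
function sketch_theme($active_theme, $hook, array $vars) {
  $override = $active_theme . '_' . $hook;
  $function = function_exists($override) ? $override : 'sketch_default_' . $hook;
  return call_user_func($function, $vars);
}

echo sketch_theme('mytheme', 'headline', array('text' => 'Hello')), "\n";
echo sketch_theme('plaintheme', 'headline', array('text' => 'Hello')), "\n";
```

The data passed in is the same in both calls; only the markup differs, which is the separation the theming step is meant to provide.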

Solution 3:

Eaton's answer provides a good overview. (I'm new here so I can't mod him up, thus the comment.)

The brutal "aha" moment for me was realizing everything happens through index.php, and then through the waterfall of modules (core first, then by site). To extend core functionality don't rewrite it. Instead copy the module into /sites/all/modules/ or /sites/[yoursite]/modules and extend THAT, or create a new module in those places. Same for themes. Module directories can contain display code as well, in the form of tpl, css etc.

If you're used to stricter MVC type frameworks like Rails, Django etc. all this gets a little confusing. Modules can mix in a lot of display code, and if you're looking at someone else's modules or templates you'll eventually wind up walking backwards through the stack. That's the beauty/pain of working in PHP.

Ironically, "just build an app" might be the worst way to learn this. Drupal does so much out of the box that's simply obscure until you figure out the control flow. There's nothing in a tpl file that tells you where a function with a fun name like l() comes from, for example.

Solution 4:

It depends on how deep an understanding you're looking for. If you have a good knowledge of PHP, I would suggest reading through the code itself, starting with index.php, then going on to includes/bootstrap.inc, and then some of the other scripts in that directory.

The key include files:

  • menu.inc is very important to understanding how the overall system works, as it handles a lot of the implicit mapping of URLs to content.
  • common.inc has most of the otherwise-mysterious functions that form the basis of the API.
  • module.inc handles the hook invocations that Eaton mentioned.
  • form.inc deals with form display, submission and processing.
  • theme.inc handles presentation.

There's also some key functionality in the modules/ directory; in particular, modules/node/node.module forms the basis of the node system, which is in general what's used to encapsulate site content.

The code is, in general, very well-commented and clear. The use of Doxygen markup within the commenting means that the code effectively is the canonical documentation.

It also helps to do this using an editor that can quickly jump to the definition of a function. Using vim in combination with ctags works for me; you do have to tell ctags to index .inc, .module, etc. files as PHP files.

Solution 5:

This (for Drupal 6) and this (for Drupal 7) are pretty good architectural overviews of Drupal. If you want more detail, I would start by writing something; most of the documentation is good. Trying to learn it at a high level of detail without something concrete to achieve will be much more difficult than trying something out.