Extending setuptools extension to use CMake in setup.py?

I'm writing a Python extension that links a C++ library and I'm using cmake to help with the build process. This means that right now, the only way I know how to bundle it, I have to first compile them with cmake before I can run setup.py bdist_wheel. There must be a better way.

I was wondering if it's possible (or anybody has tried) to invoke CMake as part of the setup.py ext_modules build process? I'm guessing there is a way to create a subclass of something but I'm not sure where to look.

I'm using CMake because it gives me so much more control for building c and c++ libraries extensions with complex build steps exactly as I want it. Plus, I can easily build Python extensions directly with cmake with the PYTHON_ADD_MODULE() command in the findPythonLibs.cmake. I just wish this was all one step.


What you basically need to do is to override the build_ext command class in your setup.py and register it in the command classes. In your custom impl of build_ext, configure and call cmake to configure and then build the extension modules. Unfortunately, the official docs are rather laconic about how to implement custom distutils commands (see Extending Distutils); I find it much more helpful to study the commands code directly. For example, here is the source code for the build_ext command.

Example project

I have prepared a simple project consisting out of a single C extension foo and a python module spam.eggs:

so-42585210/
├── spam
│   ├── __init__.py  # empty
│   ├── eggs.py
│   ├── foo.c
│   └── foo.h
├── CMakeLists.txt
└── setup.py

Files for testing the setup

These are just some simple stubs I wrote to test the setup script.

spam/eggs.py (only for testing the library calls):

from ctypes import cdll
import pathlib


def wrap_bar():
    foo = cdll.LoadLibrary(str(pathlib.Path(__file__).with_name('libfoo.dylib')))
    return foo.bar()

spam/foo.c:

#include "foo.h"

int bar() {
    return 42;
}

spam/foo.h:

#ifndef __FOO_H__
#define __FOO_H__

int bar();

#endif

CMakeLists.txt:

cmake_minimum_required(VERSION 3.10.1)
project(spam)
set(src "spam")
set(foo_src "spam/foo.c")
add_library(foo SHARED ${foo_src})

Setup script

This is where the magic happens. Of course, there is a lot of room for improvements - you could pass additional options to CMakeExtension class if you need to (for more info on the extensions, see Building C and C++ Extensions), make the CMake options configurable via setup.cfg by overriding methods initialize_options and finalize_options etc.

import os
import pathlib

from setuptools import setup, Extension
from setuptools.command.build_ext import build_ext as build_ext_orig


class CMakeExtension(Extension):

    def __init__(self, name):
        # don't invoke the original build_ext for this special extension
        super().__init__(name, sources=[])


class build_ext(build_ext_orig):

    def run(self):
        for ext in self.extensions:
            self.build_cmake(ext)
        super().run()

    def build_cmake(self, ext):
        cwd = pathlib.Path().absolute()

        # these dirs will be created in build_py, so if you don't have
        # any python sources to bundle, the dirs will be missing
        build_temp = pathlib.Path(self.build_temp)
        build_temp.mkdir(parents=True, exist_ok=True)
        extdir = pathlib.Path(self.get_ext_fullpath(ext.name))
        extdir.mkdir(parents=True, exist_ok=True)

        # example of cmake args
        config = 'Debug' if self.debug else 'Release'
        cmake_args = [
            '-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=' + str(extdir.parent.absolute()),
            '-DCMAKE_BUILD_TYPE=' + config
        ]

        # example of build args
        build_args = [
            '--config', config,
            '--', '-j4'
        ]

        os.chdir(str(build_temp))
        self.spawn(['cmake', str(cwd)] + cmake_args)
        if not self.dry_run:
            self.spawn(['cmake', '--build', '.'] + build_args)
        # Troubleshooting: if fail on line above then delete all possible 
        # temporary CMake files including "CMakeCache.txt" in top level dir.
        os.chdir(str(cwd))


setup(
    name='spam',
    version='0.1',
    packages=['spam'],
    ext_modules=[CMakeExtension('spam/foo')],
    cmdclass={
        'build_ext': build_ext,
    }
)

Testing

Build the project's wheel, install it. Test the library is installed:

$ pip show -f spam
Name: spam
Version: 0.1
Summary: UNKNOWN
Home-page: UNKNOWN
Author: UNKNOWN
Author-email: UNKNOWN
License: UNKNOWN
Location: /Users/hoefling/.virtualenvs/stackoverflow/lib/python3.6/site-packages
Requires: 
Files:
  spam-0.1.dist-info/DESCRIPTION.rst
  spam-0.1.dist-info/INSTALLER
  spam-0.1.dist-info/METADATA
  spam-0.1.dist-info/RECORD
  spam-0.1.dist-info/WHEEL
  spam-0.1.dist-info/metadata.json
  spam-0.1.dist-info/top_level.txt
  spam/__init__.py
  spam/__pycache__/__init__.cpython-36.pyc
  spam/__pycache__/eggs.cpython-36.pyc
  spam/eggs.py
  spam/libfoo.dylib

Run the wrapper function from spam.eggs module:

$ python -c "from spam import eggs; print(eggs.wrap_bar())"
42

I would like to add my own answer to this, as a sort of addendum to what hoefling described.

Thanks, hoefling, as your answer helped get me on track towards writing a setup script in much the same manner for my own repository.

Preamble

The primary motivation for writing this answer is trying to "glue together" the missing pieces. The OP does not state the nature of the C/ C++ Python module being developed; I'd like to make it clear up front that the below steps are for a C/ C++ cmake build chain that creates multiple .dll/ .so files as well as a precompiled *.pyd/so file in addition to some generic .py files that need to be placed in the scripts directory.

All of these files come to fruition directly after the cmake build command is run... fun. There is no recommendation for building a setup.py this way.

Because setup.py implies that your scripts are going to be some part of your package/ library and that .dll files that need to be built must be declared through the libraries portion, with sources and include dirs listed, there is no intuitive way to tell setuptools that the libraries, scripts and data files that are resultant of one call to cmake -b that occured in build_ext should all go in their own respective places. Worse still if you want to have this module be tracked by setuptools and fully uninstallable, meaning users can uninstall it and have every trace wiped off their system, if so desired.

The module that I was writing a setup.py for is bpy, the .pyd/ .so equivalent of building blender as a python module as described here:

https://wiki.blender.org/wiki//User:Ideasman42/BlenderAsPyModule (better instructions but now dead link) http://www.gizmoplex.com/wordpress/compile-blender-as-python-module/ (possibly worse instructions but seems to be online still)

You can check out my repository on github here:

https://github.com/TylerGubala/blenderpy

That is my motivation behind writing this answer, and hopefully will help anyone else trying to accomplish something similar rather than throwing away their cmake build chain or, worse yet, having to maintain two separate build environments. I apologize if it is off topic.

So what do I do to accomplish this?

  1. Extend the setuptools.Extension class with a class of my own, which does not contain entries for the sources or libs properties

  2. Extend the setuptools.commands.build_ext.build_ext class with a class of my own, which has a custom method which performs my necessary build steps (git, svn, cmake, cmake --build)

  3. Extend the distutils.command.install_data.install_data class (yuck, distutils... however there doesn't seem to be a setuputils equivalent) with a class of my own, to mark the built binary libraries during setuptools' record creation (installed-files.txt) such that

    • The libraries will be recorded and will be uninstalled with pip uninstall package_name

    • The command py setup.py bdist_wheel will work natively as well, and can be used to provide precompiled versions of your source code

  4. Extend the setuptools.command.install_lib.install_lib class with a class of my own, which will ensure that the built libraries are moved from their resultant build folder into the folder that setuptools expects them in (on Windows it will put the .dll files in a bin/Release folder and not where setuptools expects it)

  5. Extend the setuptools.command.install_scripts.install_scripts class with a class of my own such that the scripts files are copied to the correct directory (Blender expects the 2.79 or whatever directory to be in the scripts location)

  6. After the build steps are performed, copy those files into a known directory that setuptools will copy into the site-packages directory of my environment. At this point the remaining setuptools and distutils classes can take over writing the installed-files.txt record and will be fully removable!

Sample

Here is a sample, more or less from my repository, but trimmed for clarity of the more specific things (you can always head over to the repo and look at it for yourself)

from distutils.command.install_data import install_data
from setuptools import find_packages, setup, Extension
from setuptools.command.build_ext import build_ext
from setuptools.command.install_lib import install_lib
from setuptools.command.install_scripts import install_scripts
import struct

BITS = struct.calcsize("P") * 8
PACKAGE_NAME = "example"

class CMakeExtension(Extension):
    """
    An extension to run the cmake build

    This simply overrides the base extension class so that setuptools
    doesn't try to build your sources for you
    """

    def __init__(self, name, sources=[]):

        super().__init__(name = name, sources = sources)

class InstallCMakeLibsData(install_data):
    """
    Just a wrapper to get the install data into the egg-info

    Listing the installed files in the egg-info guarantees that
    all of the package files will be uninstalled when the user
    uninstalls your package through pip
    """

    def run(self):
        """
        Outfiles are the libraries that were built using cmake
        """

        # There seems to be no other way to do this; I tried listing the
        # libraries during the execution of the InstallCMakeLibs.run() but
        # setuptools never tracked them, seems like setuptools wants to
        # track the libraries through package data more than anything...
        # help would be appriciated

        self.outfiles = self.distribution.data_files

class InstallCMakeLibs(install_lib):
    """
    Get the libraries from the parent distribution, use those as the outfiles

    Skip building anything; everything is already built, forward libraries to
    the installation step
    """

    def run(self):
        """
        Copy libraries from the bin directory and place them as appropriate
        """

        self.announce("Moving library files", level=3)

        # We have already built the libraries in the previous build_ext step

        self.skip_build = True

        bin_dir = self.distribution.bin_dir

        # Depending on the files that are generated from your cmake
        # build chain, you may need to change the below code, such that
        # your files are moved to the appropriate location when the installation
        # is run

        libs = [os.path.join(bin_dir, _lib) for _lib in 
                os.listdir(bin_dir) if 
                os.path.isfile(os.path.join(bin_dir, _lib)) and 
                os.path.splitext(_lib)[1] in [".dll", ".so"]
                and not (_lib.startswith("python") or _lib.startswith(PACKAGE_NAME))]

        for lib in libs:

            shutil.move(lib, os.path.join(self.build_dir,
                                          os.path.basename(lib)))

        # Mark the libs for installation, adding them to 
        # distribution.data_files seems to ensure that setuptools' record 
        # writer appends them to installed-files.txt in the package's egg-info
        #
        # Also tried adding the libraries to the distribution.libraries list, 
        # but that never seemed to add them to the installed-files.txt in the 
        # egg-info, and the online recommendation seems to be adding libraries 
        # into eager_resources in the call to setup(), which I think puts them 
        # in data_files anyways. 
        # 
        # What is the best way?

        # These are the additional installation files that should be
        # included in the package, but are resultant of the cmake build
        # step; depending on the files that are generated from your cmake
        # build chain, you may need to modify the below code

        self.distribution.data_files = [os.path.join(self.install_dir, 
                                                     os.path.basename(lib))
                                        for lib in libs]

        # Must be forced to run after adding the libs to data_files

        self.distribution.run_command("install_data")

        super().run()

class InstallCMakeScripts(install_scripts):
    """
    Install the scripts in the build dir
    """

    def run(self):
        """
        Copy the required directory to the build directory and super().run()
        """

        self.announce("Moving scripts files", level=3)

        # Scripts were already built in a previous step

        self.skip_build = True

        bin_dir = self.distribution.bin_dir

        scripts_dirs = [os.path.join(bin_dir, _dir) for _dir in
                        os.listdir(bin_dir) if
                        os.path.isdir(os.path.join(bin_dir, _dir))]

        for scripts_dir in scripts_dirs:

            shutil.move(scripts_dir,
                        os.path.join(self.build_dir,
                                     os.path.basename(scripts_dir)))

        # Mark the scripts for installation, adding them to 
        # distribution.scripts seems to ensure that the setuptools' record 
        # writer appends them to installed-files.txt in the package's egg-info

        self.distribution.scripts = scripts_dirs

        super().run()

class BuildCMakeExt(build_ext):
    """
    Builds using cmake instead of the python setuptools implicit build
    """

    def run(self):
        """
        Perform build_cmake before doing the 'normal' stuff
        """

        for extension in self.extensions:

            if extension.name == 'example_extension':

                self.build_cmake(extension)

        super().run()

    def build_cmake(self, extension: Extension):
        """
        The steps required to build the extension
        """

        self.announce("Preparing the build environment", level=3)

        build_dir = pathlib.Path(self.build_temp)

        extension_path = pathlib.Path(self.get_ext_fullpath(extension.name))

        os.makedirs(build_dir, exist_ok=True)
        os.makedirs(extension_path.parent.absolute(), exist_ok=True)

        # Now that the necessary directories are created, build

        self.announce("Configuring cmake project", level=3)

        # Change your cmake arguments below as necessary
        # Below is just an example set of arguments for building Blender as a Python module

        self.spawn(['cmake', '-H'+SOURCE_DIR, '-B'+self.build_temp,
                    '-DWITH_PLAYER=OFF', '-DWITH_PYTHON_INSTALL=OFF',
                    '-DWITH_PYTHON_MODULE=ON',
                    f"-DCMAKE_GENERATOR_PLATFORM=x"
                    f"{'86' if BITS == 32 else '64'}"])

        self.announce("Building binaries", level=3)

        self.spawn(["cmake", "--build", self.build_temp, "--target", "INSTALL",
                    "--config", "Release"])

        # Build finished, now copy the files into the copy directory
        # The copy directory is the parent directory of the extension (.pyd)

        self.announce("Moving built python module", level=3)

        bin_dir = os.path.join(build_dir, 'bin', 'Release')
        self.distribution.bin_dir = bin_dir

        pyd_path = [os.path.join(bin_dir, _pyd) for _pyd in
                    os.listdir(bin_dir) if
                    os.path.isfile(os.path.join(bin_dir, _pyd)) and
                    os.path.splitext(_pyd)[0].startswith(PACKAGE_NAME) and
                    os.path.splitext(_pyd)[1] in [".pyd", ".so"]][0]

        shutil.move(pyd_path, extension_path)

        # After build_ext is run, the following commands will run:
        # 
        # install_lib
        # install_scripts
        # 
        # These commands are subclassed above to avoid pitfalls that
        # setuptools tries to impose when installing these, as it usually
        # wants to build those libs and scripts as well or move them to a
        # different place. See comments above for additional information

setup(name='my_package',
      version='1.0.0a0',
      packages=find_packages(),
      ext_modules=[CMakeExtension(name="example_extension")],
      description='An example cmake extension module',
      long_description=open("./README.md", 'r').read(),
      long_description_content_type="text/markdown",
      keywords="test, cmake, extension",
      classifiers=["Intended Audience :: Developers",
                   "License :: OSI Approved :: "
                   "GNU Lesser General Public License v3 (LGPLv3)",
                   "Natural Language :: English",
                   "Programming Language :: C",
                   "Programming Language :: C++",
                   "Programming Language :: Python",
                   "Programming Language :: Python :: 3.6",
                   "Programming Language :: Python :: Implementation :: CPython"],
      license='GPL-3.0',
      cmdclass={
          'build_ext': BuildCMakeExt,
          'install_data': InstallCMakeLibsData,
          'install_lib': InstallCMakeLibs,
          'install_scripts': InstallCMakeScripts
          }
    )

Once the setup.py has been authored this way, building the python module is as simple as running py setup.py, which will run the build and produce the outfiles.

It is recommended that you produce a wheel for users over slow internet or who do not want to build from sources. To do that, you will want to install the wheel package (py -m pip install wheel) and produce a wheel distribution by performing py setup.py bdist_wheel, and then upload it using twine like any other package.