Compiling kernel version >= 2.6.34 on CentOS 5: RAID set "ddf1_foo" was not activated?

I want to mount a Ceph FS on some CentOS 5 servers. ceph-fuse failed with the errors below:

# ceph-fuse --no-fuse-big-writes -m 192.168.2.15:6789 /mnt/ceph/
ceph-fuse[7528]: starting ceph client
ceph-fuse[7528]: starting fuse
fuse: unknown option `atomic_o_trunc'
2013-04-04 13:51:21.128506 2b82d6e9e8f0 -1 fuse_lowlevel_new failed
ceph-fuse[7528]: fuse finished with error 33
ceph-fuse[7526]: mount failed: (33) Numerical argument out of domain

Google pointed to this, but since CentOS 5.x ships with kernel 2.6.18, I decided to compile a newer kernel that supports Ceph.

  • My first attempt was with kernel-lt 3.0.71 from ELRepo
  • The second was 2.6.34.14 from kernel.org

The .config was copied from the running kernel with 2 additional settings:

CONFIG_SYSFS_DEPRECATED_V2=y
CONFIG_CEPH_FS=m

But both of them give me the following error:

[screenshot of the boot error messages]

The first warning goes away if I extract the kernel image, edit the init script, and remove these 2 lines:

echo "Loading dm-region-hash.ko module"
insmod /lib/dm-region-hash.ko 

http://funky-dennis.livejournal.com/3290.html

How about the second error:

device-mapper: table: 253:0: mirror: Error creating mirror dirty log
RAID set "ddf1_bar" was not activated

  • 2.6.18's init script: http://fpaste.org/byZ3/
  • 2.6.34.14's init script: http://fpaste.org/8COr/

They are mostly the same except that the following modules are not loaded into the newer kernel:

echo "Loading dm-mem-cache.ko module"
insmod /lib/dm-mem-cache.ko 
echo "Loading dm-message.ko module"
insmod /lib/dm-message.ko 
echo "Loading dm-raid45.ko module"
insmod /lib/dm-raid45.ko 

Is this the reason for the error RAID set "ddf1_foo" was not activated?


UPDATE Thu Apr 4 21:40:32 ICT 2013

http://alistairphipps.com/wiki/index.php?title=Notes#LVM

A strange error message similar to "mirror log: unrecognised sync argument to mirror log: 2", "table: mirror: Error creating mirror dirty log" means you have mismatched kernel device mapper and userspace tools versions: probably your kernel is too recent for your version of the lvm tools. Install the latest device mapper and lvm2 from sources, and it should work.

I've tried to compile the latest version of LVM2:

# /usr/sbin/lvm version
  LVM version:     2.02.98(2) (2012-10-15)
  Library version: 1.02.67-RHEL5 (2011-10-14)
  Driver version:  4.11.6

but nothing changed.


UPDATE Sat Apr 6 18:51:31 ICT 2013

/lib/modules/2.6.18-274.el5/kernel/drivers/md/

|-- dm-crypt.ko
|-- dm-emc.ko
|-- dm-hp-sw.ko
|-- dm-log.ko
|-- dm-mem-cache.ko
|-- dm-message.ko
|-- dm-mirror.ko
|-- dm-mod.ko
|-- dm-multipath.ko
|-- dm-raid45.ko
|-- dm-rdac.ko
|-- dm-region_hash.ko
|-- dm-round-robin.ko
|-- dm-snapshot.ko
|-- dm-zero.ko
|-- faulty.ko
|-- linear.ko
|-- multipath.ko
|-- raid0.ko
|-- raid1.ko
|-- raid10.ko
|-- raid456.ko
`-- xor.ko

/lib/modules/2.6.34.14/kernel/drivers/md/

|-- dm-crypt.ko
|-- dm-log.ko
|-- dm-mirror.ko
|-- dm-mod.ko
|-- dm-multipath.ko
|-- dm-region-hash.ko
|-- dm-round-robin.ko
|-- dm-snapshot.ko
|-- dm-zero.ko
|-- faulty.ko
|-- linear.ko
|-- multipath.ko
|-- raid0.ko
|-- raid1.ko
|-- raid10.ko
|-- raid456.ko
`-- raid6_pq.ko

UPDATE Wed Apr 10 11:22:54 ICT 2013

Searching the source tree, I found this:

# grep -lr 'Error creating mirror dirty log' /usr/src/linux-2.6.34.14
/usr/src/linux-2.6.34.14/drivers/md/dm-raid1.c

dm-raid1.c:

static struct dm_dirty_log *create_dirty_log(struct dm_target *ti,
                         unsigned argc, char **argv,
                         unsigned *args_used)
{
    unsigned param_count;
    struct dm_dirty_log *dl;

    if (argc < 2) {
        ti->error = "Insufficient mirror log arguments";
        return NULL;
    }

    if (sscanf(argv[1], "%u", &param_count) != 1) {
        ti->error = "Invalid mirror log argument count";
        return NULL;
    }

    *args_used = 2 + param_count;

    if (argc < *args_used) {
        ti->error = "Insufficient mirror log arguments";
        return NULL;
    }

    dl = dm_dirty_log_create(argv[0], ti, mirror_flush, param_count,
                 argv + 2);
    if (!dl) {
        ti->error = "Error creating mirror dirty log";
        return NULL;
    }

    return dl;
}
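
The argument bookkeeping in create_dirty_log can be tried out in userspace. The following is only a sketch under assumptions: struct log_args and split_log_args are hypothetical names that do not exist in the kernel, and the error strings appear only as comments. It shows that argv[0] is the log type, argv[1] is the parameter count, and everything from argv + 2 onward is what the log constructor receives.

```c
#include <assert.h>
#include <stdio.h>

/* Result of splitting a mirror target's parameters, following the
 * bookkeeping in create_dirty_log() (userspace model only). */
struct log_args {
    const char *type_name;   /* argv[0], e.g. "core" or "disk" */
    unsigned param_count;    /* argv[1], number of log parameters */
    unsigned args_used;      /* 2 + param_count */
    char **params;           /* argv + 2, handed to the log constructor */
};

/* Returns 0 on success, -1 if the arguments are malformed. */
int split_log_args(unsigned argc, char **argv, struct log_args *out)
{
    assert(out != NULL);

    if (argc < 2)
        return -1;               /* "Insufficient mirror log arguments" */

    if (sscanf(argv[1], "%u", &out->param_count) != 1)
        return -1;               /* "Invalid mirror log argument count" */

    out->args_used = 2 + out->param_count;
    if (argc < out->args_used)
        return -1;               /* "Insufficient mirror log arguments" */

    out->type_name = argv[0];
    out->params = argv + 2;
    return 0;
}
```

For a made-up parameter list such as core 2 1024 nosync 2 /dev/sda 0 /dev/sdb 0, this yields param_count = 2 and args_used = 4, so the constructor would see only 1024 and nosync.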

dm-log.c:

struct dm_dirty_log *dm_dirty_log_create(const char *type_name,
            struct dm_target *ti,
            int (*flush_callback_fn)(struct dm_target *ti),
            unsigned int argc, char **argv)
{
    struct dm_dirty_log_type *type;
    struct dm_dirty_log *log;

    log = kmalloc(sizeof(*log), GFP_KERNEL);
    if (!log)
        return NULL;

    type = get_type(type_name);
    if (!type) {
        kfree(log);
        return NULL;
    }

    log->flush_callback_fn = flush_callback_fn;
    log->type = type;
    if (type->ctr(log, ti, argc, argv)) {
        kfree(log);
        put_type(type);
        return NULL;
    }

    return log;
}
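
Note that dm_dirty_log_create returns NULL for three different reasons (allocation failure, unknown log type, constructor failure), and create_dirty_log reports all of them with the same message. A minimal userspace model, assuming a one-entry type table and a constructor that accepts one or two parameters, makes the three paths distinguishable. Every name here (create_status, log_type, demo_core_ctr, find_type, log_create) is hypothetical and chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Why creation failed; the kernel function collapses all of these to NULL. */
enum create_status { CREATE_OK, CREATE_NOMEM, CREATE_NO_TYPE, CREATE_CTR_FAILED };

struct log_type {
    const char *name;
    int (*ctr)(unsigned argc, char **argv);   /* 0 on success */
};

struct dirty_log {
    const struct log_type *type;
};

/* Assumed constructor: accepts one or two parameters, rejects anything else. */
int demo_core_ctr(unsigned argc, char **argv)
{
    (void)argv;
    return (argc < 1 || argc > 2) ? -1 : 0;
}

/* Stand-in for the list of registered log types. */
static const struct log_type registry[] = {
    { "core", demo_core_ctr },
};

const struct log_type *find_type(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(registry) / sizeof(registry[0]); i++)
        if (strcmp(registry[i].name, name) == 0)
            return &registry[i];
    return NULL;
}

struct dirty_log *log_create(const char *type_name, unsigned argc,
                             char **argv, enum create_status *status)
{
    struct dirty_log *log;
    const struct log_type *type;

    assert(status != NULL);

    log = malloc(sizeof(*log));
    if (!log) {
        *status = CREATE_NOMEM;
        return NULL;
    }

    type = find_type(type_name);
    if (!type) {                      /* e.g. the providing module is missing */
        free(log);
        *status = CREATE_NO_TYPE;
        return NULL;
    }

    log->type = type;
    if (type->ctr(argc, argv)) {      /* constructor rejected the parameters */
        free(log);
        *status = CREATE_CTR_FAILED;
        return NULL;
    }

    *status = CREATE_OK;
    return log;
}
```

In this model, asking for an unregistered type and passing a registered type too many parameters both return NULL, but the status value tells them apart.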

Why are you using a ddf format raid array in the first place? You appear to be trying to activate it with dmraid, which hasn't seen any development for several years and is more or less deprecated. mdadm is much better supported, and recent versions do support the ddf format, though its native format is preferred.

Make sure you have loaded the dm-log module.


Thanks to my friend's help, the problem is solved.

On the first attempt, he commented out the line ti->error = "Error creating mirror dirty log"; in dm-raid1.c and inserted some debugging lines into dm-log.c to determine what caused the above error:

    log = kmalloc(sizeof(*log), GFP_KERNEL);
    if (!log) {
        ti->error = "kmalloc error";
        return NULL;
    }

    type = get_type(type_name);
    if (!type) {
        kfree(log);
        ti->error = "get_type error";
        return NULL;
    }

    log->flush_callback_fn = flush_callback_fn;
    log->type = type;
    if (type->ctr(log, ti, argc, argv)) {
        kfree(log);
        put_type(type);
        ti->error = "ctr error";
        return NULL;
    }

After recompiling the kernel, we got:

[screenshot of the resulting error message]

On the second attempt, he wanted to get the value of type_name:

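if (type->ctr(log, ti, argc, argv)) {
    kfree(log);
    put_type(type);
    /* Debug only: copy type_name into a heap buffer so it shows up as the error text. */
    char* typeN = kmalloc(1000, GFP_KERNEL);
    char* pTypeN = typeN;
    const char* ptype_name = type_name;
    while (*ptype_name != '\0') {
        *pTypeN = *ptype_name;
        ++pTypeN;
        ++ptype_name;
    }
    *pTypeN = '\0';   /* kmalloc does not zero the buffer, so terminate it explicitly */
    ti->error = typeN;
    return NULL;
}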

[screenshot of the resulting error message]

We continued tracing into core_ctr and create_log_context using the same method:

static int create_log_context(struct dm_dirty_log *log, struct dm_target *ti,
                  unsigned int argc, char **argv,
                  struct dm_dev *dev)
{
    enum sync sync = DEFAULTSYNC;

    struct log_c *lc;
    uint32_t region_size;
    unsigned int region_count;
    size_t bitset_size, buf_size;
    int r;

    if (argc < 1 || argc > 2) {
        DMWARN("wrong number of arguments to dirty region log");
        ti->error = "argc < 1 or > 2";
        return -EINVAL;
    }

    if (argc > 1) {
        if (!strcmp(argv[1], "sync"))
            sync = FORCESYNC;
        else if (!strcmp(argv[1], "nosync"))
            sync = NOSYNC;
        else {
            DMWARN("unrecognised sync argument to "
                   "dirty region log: %s", argv[1]);
            ti->error = "unrecognised sync argument to";
            return -EINVAL;
        }
    }

[screenshot of the resulting error message]

if (argc < 1 || argc > 2) {
    DMWARN("wrong number of arguments to dirty region log");
    char* argcStr = kmalloc(1000, GFP_KERNEL);
    char* pArgc = argcStr;
    unsigned int temp = argc;
    do {
        *pArgc = temp % 10;
        ++pArgc;
        temp = temp / 10;
    } while (temp > 0);
    *pArgc = ' ';
    ++pArgc;
    //copy argv;
    int i = 0;
    for (i; i < argc; ++i) {
        char* pArgv = argv[i];
        while (*pArgv != '\0') {
            *pArgc = *pArgv;
            ++pArgc;
            ++pArgv;
        }
        *pArgc = ' ';
        ++pArgc;
    }
    *pArgc = '\0';
    ti->error = argcStr;
    return -EINVAL;
}

[screenshot of the resulting error message, containing a black heart symbol]

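Notice the black heart symbol: its character code is 3. The loop stores the raw digit value (temp % 10) instead of the character '0' + digit, so argc = 3 shows up on the console as the character with code 3.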
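
For comparison, here is a hedged userspace sketch of how the same debug string could be built with snprintf, so that the count prints as readable digits in the correct order and the buffer cannot overflow. describe_args is a hypothetical helper, not something from the kernel source, and this is not the code that was actually run.

```c
#include <assert.h>
#include <stdio.h>

/* Writes "argc arg0 arg1 ..." into buf, always NUL-terminated.
 * Returns the number of characters stored, excluding the terminator. */
size_t describe_args(char *buf, size_t size, unsigned argc, char **argv)
{
    size_t used;
    unsigned i;
    int n;

    assert(buf != NULL && size > 0);

    /* "%u" emits digit characters ('3' is code 0x33), most significant
     * digit first, rather than the raw value 3. */
    n = snprintf(buf, size, "%u", argc);
    if (n < 0) {
        buf[0] = '\0';
        return 0;
    }
    used = (size_t)n < size ? (size_t)n : size - 1;

    for (i = 0; i < argc && used < size - 1; i++) {
        n = snprintf(buf + used, size - used, " %s", argv[i]);
        if (n < 0)
            break;
        /* snprintf reports the untruncated length; clamp to what fits. */
        used += (size_t)n < size - used ? (size_t)n : size - used - 1;
    }
    return used;
}
```

With three arguments 1024, nosync and block_on_error (illustrative values), the buffer would start with the character 3 followed by a space and the arguments.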

I don't know why the author is mixing up core_ctr with disk_ctr. The type_name is core but the number of arguments is 3, so he trimmed the last argument (block_on_error) by inserting the following into the dm_dirty_log_create function:

struct dm_dirty_log *dm_dirty_log_create(const char *type_name,
            struct dm_target *ti,
            int (*flush_callback_fn)(struct dm_target *ti),
            unsigned int argc, char **argv)
{
    struct dm_dirty_log_type *type;
    struct dm_dirty_log *log;

    log = kmalloc(sizeof(*log), GFP_KERNEL);
    if (!log) {
        ti->error = "kmalloc error";
        return NULL;
    }

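    const char* core = "core";
    const char* pCore = core;
    int is_core = 1;

    const char* ptype_name = type_name;
    /* Compare type_name against "core"; stop at the first mismatch so
     * pCore never runs past the end of the literal. */
    while (*ptype_name != '\0') {
        if (*pCore != *ptype_name) {
            is_core = 0;
            break;
        }
        ++pCore;
        ++ptype_name;
    }

    /* Both strings ended together, so type_name is exactly "core". */
    if (is_core && *pCore == *ptype_name && argc == 3) {
        --argc;
    }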
    type = get_type(type_name);
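
The hand-written comparison loop could probably be expressed more compactly with strcmp, which dm-log.c already uses for the sync arguments. The following is a userspace sketch of the same rule, not the code that was actually compiled into the kernel; trim_core_log_args is a hypothetical name.

```c
#include <assert.h>
#include <string.h>

/* Returns the argument count to pass on to the log constructor. */
unsigned trim_core_log_args(const char *type_name, unsigned argc)
{
    assert(type_name != NULL);

    /* Only the exact type name "core" with exactly three parameters
     * is adjusted; everything else passes through unchanged. */
    if (strcmp(type_name, "core") == 0 && argc == 3)
        return argc - 1;
    return argc;
}
```

Either way, this is a workaround: the third argument (block_on_error) is silently dropped for the core log rather than handled.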

Let's see what happens:

# uname -r
2.6.34.14

# dmraid -s
*** Group superset .ddf1_disks
--> Active Subset
name   : ddf1_VCBOOT
size   : 489971712
stride : 128
type   : mirror
status : ok
subsets: 0
devs   : 2
spares : 0

# modprobe ceph

# lsmod | grep ceph
ceph                  176676  0 

# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/ddf1_VCBOOTp3
                      219G   17G  191G   8% /
/dev/mapper/ddf1_VCBOOTp1
                       99M   64M   30M  69% /boot
tmpfs                  48G   16M   48G   1% /dev/shm
192.168.2.13:6789,192.168.2.14:6789,192.168.2.15:6789:/
                       72T   28T   45T  39% /mnt/ceph