How can I mirror a yum repository but download only the newest versions of each package?
I would like to mirror the following Yum/RPM repositories at http://yum.puppetlabs.com/ :
- http://yum.puppetlabs.com/el/6/products/
- http://yum.puppetlabs.com/el/6/dependencies/
- http://yum.puppetlabs.com/el/5/products
- http://yum.puppetlabs.com/el/5/dependencies/
The Puppet repository contains every Puppet product ever released and is quite large at about 8GB. I only need to mirror the newest versions of the files.
I have tried to mirror the repository using reposync --newest-only:
reposync --config=puppetlabs.repo.el6 --repoid=puppetlabs-products --repoid=puppetlabs-deps --newest-only --download_path=el/6 --quiet --downloadcomps
and this downloads the newest packages, which is what I need. However, reposync doesn't automatically create the regular directory structure (x86_64, noarch, SRPMS, etc.) and doesn't create the repository metadata (repodata/repomd.xml). As a result, my yum clients get errors like this:
[root@web1 ~]# yum --quiet install puppet
http://mirrors.example.org/pub/puppet/el/6/puppetlabs-deps/x86_64/repodata/repomd.xml: [Errno 14] PYCURL ERROR 22 - "The requested URL returned error: 404 Not Found"
Trying other mirror.
Error: Cannot retrieve repository metadata (repomd.xml) for repository: puppetlabs-deps. Please verify its path and try again
[root@web1 ~]#
Is there a way to programmatically mirror only the new files from a Yum repo and follow the standard repository directory structure?
Solution 1:
reposync is the only reliable way to do this. You will need to write a small bash script that uses the reposync --arch (-a) option to download each architecture into a separate folder, and then runs createrepo to generate the metadata.
Here is a small script that I use (it runs on Ubuntu, but that doesn't matter; you get the idea):
cat sync-repos
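#!/bin/bash
# reposync flags: -n newest packages only, -c yum config file, -p download path,
# -d delete local packages no longer present in the repo, -r repo id to sync.
# createrepo then generates repodata/ (including repomd.xml) in each repo directory.

# CentOS 6 repos; -g adds the comps.xml group file to the base repo metadata
reposync -n -c /etc/yum/yum.conf -p /repos/centos6 -d -r base -r updates -r extras -r centosplus -r contrib
createrepo -g /repos/centos6/base/repodata/comps.xml /repos/centos6/base
createrepo /repos/centos6/updates
createrepo /repos/centos6/extras
createrepo /repos/centos6/centosplus

# Third-party repos, each synced into its own directory and indexed
reposync -n -c /etc/yum/yum.conf -p /repos -d -r vmware -r home_xtreemfs
createrepo /repos/vmware
createrepo /repos/home_xtreemfs
reposync -n -c /etc/yum/yum.conf -p /repos/vz -d -r openvz-utils -r openvz-kernel-rhel6
createrepo /repos/vz/openvz-utils
createrepo /repos/vz/openvz-kernel-rhel6
reposync -n -c /etc/yum/yum.conf -p /repos/nginx -d -r nginx-stable -r nginx-mainline
createrepo /repos/nginx/nginx-stable
createrepo /repos/nginx/nginx-mainline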
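Applied to the repositories from the question, the same reposync-then-createrepo approach might look like the sketch below. The config file name and repo ids are taken from the question; the DEST and DRY_RUN variables and the run helper are assumptions made for illustration. By default it only prints the commands it would run; set DRY_RUN=0 to execute them.

```shell
#!/bin/bash
# Sketch: mirror only the newest packages of each repo, then build metadata.
# CONFIG and the repo ids come from the question; DEST, DRY_RUN and run()
# are hypothetical names introduced for this example.
set -eu

CONFIG="puppetlabs.repo.el6"
DEST="${DEST:-el/6}"
REPOS="puppetlabs-products puppetlabs-deps"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = execute them

# Print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

mirror_repos() {
    for repo in $REPOS; do
        # reposync places the packages under $DEST/<repoid>
        run reposync --config="$CONFIG" --repoid="$repo" --newest-only \
            --download_path="$DEST" --quiet
        # createrepo writes repodata/repomd.xml, the file yum clients request
        run createrepo "$DEST/$repo"
    done
}

mirror_repos
```

Because createrepo writes repodata/ directly into each repo directory, the baseurl on the clients has to point at that directory in this layout (for example .../el/6/puppetlabs-deps/ rather than .../el/6/puppetlabs-deps/x86_64/).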
Solution 2:
You can do this with pulp and the yum rpm distributor plugin.
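When configuring a new repo, set the retain_old_count parameter to limit how many versions of each RPM are kept. The documentation describes it as follows:
retain_old_count
Count indicating how many old rpm versions to retain; by default it will download all versions available.
Since the count refers to old versions, it is worth verifying on your Pulp release whether a value of 0 or 1 leaves only the newest package.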
So something along the lines of:
$ pulp-admin rpm repo create \
--repo-id=rhel6-puppet-products \
--relative-url=rhel6-puppet-products \
--feed=http://yum.puppetlabs.com/el/6/products/ \
--retain-old-count 1
$ pulp-admin rpm repo sync run \
--repo-id=rhel6-puppet-products
This should achieve what you want. There is a quick start guide which should give you an idea of how Pulp works, in case you have not tried it before.
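To cover all four repositories from the question, the two pulp-admin commands could be wrapped in a small function, roughly as sketched here. The pulp-admin options and the --retain-old-count value are copied from the commands above; the repo ids other than rhel6-puppet-products, the mirror_feed function, and the DRY_RUN switch are hypothetical names chosen for this example. By default the script only prints the commands; set DRY_RUN=0 to run them.

```shell
#!/bin/bash
# Sketch: create and sync one Pulp repo per upstream feed.
# Repo ids, mirror_feed() and DRY_RUN are assumptions for illustration.
set -eu

DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = execute them

# Print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# mirror_feed <repo-id> <feed-url>
mirror_feed() {
    # The repo id doubles as the relative URL, as in the answer's example.
    run pulp-admin rpm repo create --repo-id="$1" --relative-url="$1" \
        --feed="$2" --retain-old-count 1
    run pulp-admin rpm repo sync run --repo-id="$1"
}

mirror_feed rhel6-puppet-products http://yum.puppetlabs.com/el/6/products/
mirror_feed rhel6-puppet-deps     http://yum.puppetlabs.com/el/6/dependencies/
mirror_feed rhel5-puppet-products http://yum.puppetlabs.com/el/5/products/
mirror_feed rhel5-puppet-deps     http://yum.puppetlabs.com/el/5/dependencies/
```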