What is the smartest way to handle robots.txt in Express?
I'm currently working on an application built with Express (Node.js), and I want to know the smartest way to serve a different robots.txt depending on the environment (development, production).
This is what I have right now but I'm not convinced by the solution, I think it is dirty:
app.get '/robots.txt', (req, res) ->
  res.set 'Content-Type', 'text/plain'
  if app.settings.env == 'production'
    res.send 'User-agent: *\nDisallow: /signin\nDisallow: /signup\nDisallow: /signout\nSitemap: /sitemap.xml'
  else
    res.send 'User-agent: *\nDisallow: /'
(NB: it is CoffeeScript)
There should be a better way. How would you do it?
Thank you.
Use a middleware function. This way robots.txt will be handled before any session middleware, cookieParser, and so on:
app.use('/robots.txt', function (req, res, next) {
  res.type('text/plain');
  res.send("User-agent: *\nDisallow: /");
});
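If you still need the production/development split from the question, the same middleware can pick its body from the environment. A minimal sketch, reusing the rules from the question and Express's app.get('env') (which falls back to 'development' when NODE_ENV is unset):

// Choose the rules once at startup, based on the environment.
var robotsBody = app.get('env') === 'production'
  ? 'User-agent: *\nDisallow: /signin\nDisallow: /signup\nDisallow: /signout\nSitemap: /sitemap.xml'
  : 'User-agent: *\nDisallow: /';

app.use('/robots.txt', function (req, res) {
  res.type('text/plain');
  res.send(robotsBody);
});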
With Express 4, app.get is now handled in the order it appears, so you can just use that:
app.get('/robots.txt', function (req, res) {
  res.type('text/plain');
  res.send("User-agent: *\nDisallow: /");
});
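The same idea also covers the environment question. One option (a sketch; the robotsByEnv name is my own, not part of the answer above) is to keep one rule set per environment in a plain object and look it up in the route:

// Hypothetical lookup table: one robots.txt body per environment.
var robotsByEnv = {
  production: 'User-agent: *\nDisallow: /signin\nDisallow: /signup\nDisallow: /signout\nSitemap: /sitemap.xml',
  development: 'User-agent: *\nDisallow: /'
};

app.get('/robots.txt', function (req, res) {
  res.type('text/plain');
  // Fall back to the restrictive development rules for unknown environments.
  res.send(robotsByEnv[app.get('env')] || robotsByEnv.development);
});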
1. Create robots.txt with the following content:
User-agent: *
Disallow: # your rules here
2. Add it to the public/ directory.
3. If not already present in your code, add:
app.use(express.static('public'))
Your robots.txt will be available to any crawler at http://yoursite.com/robots.txt.
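If you need different rules per environment with the static-file approach, one option (a sketch; the config/ directory and file names are assumptions, not something this answer prescribes) is to keep one file per environment and serve the matching one explicitly instead of copying it into public/:

var path = require('path');

app.get('/robots.txt', function (req, res) {
  // e.g. config/robots.production.txt or config/robots.development.txt
  var file = 'robots.' + app.get('env') + '.txt';
  res.sendFile(path.join(__dirname, 'config', file));
});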