Web.config modification to block search engines from crawling PDFs
I'm trying to stop web crawlers from indexing PDF files on a website. I know how to do this with an .htaccess file, but not with a web.config file. This snippet will stop crawlers from indexing the whole site, correct? What do I need in order to block only PDFs from being indexed? Is it possible?
<httpProtocol>
  <customHeaders>
    <add name="X-Robots-Tag" value="noindex" />
  </customHeaders>
</httpProtocol>
Solution 1:
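Yes, a customHeaders entry is added to every response, so that snippet applies to the whole site. To set the header only for PDFs, you can use an outbound rule with the IIS URL Rewrite Module, which must be installed on the server: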
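<system.webServer>
  <rewrite>
    <outboundRules>
      <rule name="X-Robots-Tag: noindex to .pdf">
        <!-- Target the X-Robots-Tag response header (dashes are written as underscores) -->
        <match serverVariable="RESPONSE_X_Robots_Tag" pattern=".*" />
        <conditions>
          <!-- Apply only when the requested file name ends in .pdf -->
          <add input="{REQUEST_FILENAME}" pattern="\.pdf$" />
        </conditions>
        <!-- Set the header value to noindex -->
        <action type="Rewrite" value="noindex" />
      </rule>
    </outboundRules>
  </rewrite>
</system.webServer>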
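If some PDFs are served from URLs that do not end in .pdf (for example through a download handler), a possible variant is to key the rule on the response content type instead of the file name. The following is only a sketch under that assumption: the precondition name IsPdf is made up for illustration, and it relies on the server actually sending a Content-Type of application/pdf for those responses.

```xml
<system.webServer>
  <rewrite>
    <outboundRules>
      <!-- The rule only runs when the IsPdf precondition (hypothetical name) passes -->
      <rule name="X-Robots-Tag: noindex for PDF responses" preCondition="IsPdf">
        <match serverVariable="RESPONSE_X_Robots_Tag" pattern=".*" />
        <action type="Rewrite" value="noindex" />
      </rule>
      <preConditions>
        <preCondition name="IsPdf">
          <!-- Assumes PDFs are served with Content-Type: application/pdf -->
          <add input="{RESPONSE_CONTENT_TYPE}" pattern="^application/pdf" />
        </preCondition>
      </preConditions>
    </outboundRules>
  </rewrite>
</system.webServer>
```

Either way, it is worth requesting a PDF and a regular page and inspecting the response headers to confirm that X-Robots-Tag: noindex appears only on the PDF.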