Next.js explosion of crawlable files

We deployed a Next.js app to production, and once it started getting traffic we saw an explosion in crawlable URLs, both in Audisto and in Google Search Console. Many of these are not actual pages but rather files like:

_next/static/chunks/415-346ff9ef0019b4be.js or _next/data/lPuQfW-esUIlgkYvK94Ur/de/product/dishwasher-tabs-80-loads.json

Should we block those paths in robots.txt, or what is the best practice to keep them from being crawled? Blocking them via robots.txt seems quite excessive/radical.


Solution 1:

Same problem here... Don't know if it's related to a Next.js update (using 12.0.8 stable) or a Google Search Console update.

  1. The easy fix would be to deny access to /_next/data/ in robots.txt, but that is a radical approach (see the sketch at the end of this answer).

  2. Does it hurt SEO? With several weeks of hindsight, I am less worried about a degradation in SEO ranking, and I don't think it will have a negative impact on crawl budget.

The requests are categorized (Googlebot type) as "Page resource load". According to https://support.google.com/webmasters/answer/9679690?hl=en, these requests are secondary fetches:

Page resource load: A secondary fetch for resources used by your page. When Google crawls the page, it fetches important linked resources such as images or CSS files, in order to render the page before trying to index it. This is the user agent that makes these resource requests.


In the absence of any visible impact, I have not modified the robots.txt rules.
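
For reference, a minimal sketch of what the "radical" fix from point 1 could look like, assuming a static robots.txt served from the project's public/ directory (the path and comments are illustrative, not something I actually applied):

    # public/robots.txt -- hypothetical rule, not applied in my setup
    User-agent: *
    # Block the per-page JSON payloads Next.js fetches for client-side navigation
    Disallow: /_next/data/
    # Note: do NOT disallow /_next/static/ -- Googlebot fetches those JS/CSS
    # chunks ("Page resource load") to render the page before indexing it

Keep in mind that a Disallow rule only stops Googlebot from requesting those URLs; Search Console may still report them under "Blocked by robots.txt" in the coverage reports.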