Why is the facebookexternalhit crawler DDoSing our server?

Our IP 81.31.37.22 is being bombarded with a huge number of requests from the facebookexternalhit bot. Does it respect any limits? Google and Microsoft are scraping with a fraction of the intensity compared to Facebook.

Michal
Asked about 6 months ago
Selected answer

DDoSing :) Check whether those requests actually originate from a registered FB network.

June 7 at 5:05 AM
Lars
Michal

Yes, they all originate from the FB network. With IPv6 it is visible at first glance, for example: '2a03:2880:21ff:c::face:b00c'. We are trying to rate limit, but then the page previews stop working when posts are shared on FB.
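
For anyone who wants to automate the origin check suggested above, here is a minimal Python sketch. The two prefixes are assumptions used for illustration only, not an authoritative list; the current prefixes announced by Facebook's AS32934 should be pulled from a routing registry instead of being hard-coded. The function name is hypothetical.

```python
import ipaddress

# ASSUMPTION: illustrative prefixes only. Replace with the current list of
# routes announced by AS32934, fetched from a routing registry.
ASSUMED_FB_NETWORKS = [
    ipaddress.ip_network("2a03:2880::/32"),
    ipaddress.ip_network("173.252.64.0/18"),
]


def is_from_assumed_fb_network(remote_addr: str) -> bool:
    """Return True if remote_addr falls inside one of the assumed prefixes."""
    try:
        ip = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Not a parseable IP address, so it cannot be verified.
        return False
    # Compare only against networks of the same IP version.
    return any(ip.version == net.version and ip in net
               for net in ASSUMED_FB_NETWORKS)
```

A request that carries the facebookexternalhit user agent but fails this check is probably spoofed and can be blocked without affecting link previews.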

June 7 at 5:25 AM
Lars

The OG cache most likely only needs to be set once, so it's safe to block any requests after the initial request to a specific URL.
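
A rough sketch of that idea in Python: let the crawler fetch each URL once per time window and refuse repeats. The class name, the one-day window, and the in-memory dictionary are all assumptions for illustration. With thousands of domains the dictionary would need eviction or an external store, and whether one fetch is really enough for the OG cache is only the guess stated above.

```python
import time


class FirstFetchGate:
    """Allow one crawler fetch per URL within a time window (hypothetical helper)."""

    def __init__(self, ttl_seconds=86400.0, clock=time.monotonic):
        self._ttl = ttl_seconds      # assumed window: one day
        self._clock = clock          # injectable so the gate can be tested
        self._last_served = {}       # url -> time of the last allowed fetch

    def allow(self, url):
        now = self._clock()
        last = self._last_served.get(url)
        if last is not None and now - last < self._ttl:
            # Already served within the window: refuse the repeat fetch.
            return False
        self._last_served[url] = now
        return True
```

Refused requests would be answered with an error status instead of the full page, so only the first fetch per window costs a full render.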

June 7 at 6:07 AM
Michal

We have no problem with multiple requests for a specific URL. Our application runs thousands of domains, and FB seems to scrape all of them in full. The problem is the intensity.

June 7 at 6:48 AM
Lars

It could be bad actors abusing the Sharing Debugger or the API that allows requesting a cache refresh. However, FB won't tell you, so rate limiting the user agent is your best/only option.
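
One way to rate limit by user agent is a token bucket, sketched below in Python. The class and function names, the rate, and the burst size are assumptions to be tuned, and the sketch has no locking for multi-threaded servers. A generous burst lets occasional preview fetches through while capping sustained crawling, which may reduce the broken-preview problem mentioned earlier. How the crawler reacts to a 429 response is not something this thread confirms.

```python
import time

CRAWLER_TOKEN = "facebookexternalhit"


class TokenBucket:
    """Simple token bucket (hypothetical helper, not thread-safe)."""

    def __init__(self, rate_per_second, burst, clock=time.monotonic):
        self._rate = rate_per_second
        self._capacity = float(burst)
        self._tokens = float(burst)   # start with a full bucket
        self._clock = clock
        self._updated = clock()

    def try_acquire(self):
        now = self._clock()
        elapsed = now - self._updated
        self._updated = now
        # Refill in proportion to elapsed time, capped at the burst size.
        self._tokens = min(self._capacity, self._tokens + elapsed * self._rate)
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False


def status_for_request(user_agent, bucket):
    """Return the HTTP status to send: 429 only for over-limit crawler hits."""
    if CRAWLER_TOKEN not in user_agent.lower():
        return 200  # regular visitors are never throttled here
    return 200 if bucket.try_acquire() else 429
```

In a real deployment the same check would sit in the web server or a middleware layer, with one shared bucket (or one per domain) for all crawler traffic.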

June 7 at 7:01 AM