ScrapeNinja

Freemium
By Anthony | Updated 4 days ago | Data
Popularity

9.9 / 10

Latency

2,898ms

Service Level

98%

Health Check

N/A


Scraping from Site Suddenly Blocked by Cloudflare

garethjersey-BstGK_Gfae9
2 years ago

Hi,
The main site I scrape now has the majority of its requests blocked by Cloudflare. This worked fine from the time I signed up for the API subscription, but I now run into problems whenever I query the site at short intervals.
Roughly 20% of requests get through and return the page successfully.

However, the majority of requests will return:
{"info":{},"body":""}

I am using PHP with cURL and the following code:

	$curl = curl_init();
	curl_setopt_array($curl, [
		CURLOPT_URL => "https://scrapeninja.p.rapidapi.com/scrape",
		CURLOPT_RETURNTRANSFER => true,
		CURLOPT_FOLLOWLOCATION => true,
		CURLOPT_ENCODING => "",
		CURLOPT_MAXREDIRS => 10,
		CURLOPT_TIMEOUT => 30,
		CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
		CURLOPT_CUSTOMREQUEST => "POST",
		// Build the JSON body with json_encode() so that $url is escaped correctly.
		CURLOPT_POSTFIELDS => json_encode([
			"url" => $url,
			"headers" => ["X-Header: some-random-header"],
		]),
		CURLOPT_HTTPHEADER => [
			"content-type: application/json",
			"x-rapidapi-host: scrapeninja.p.rapidapi.com",
			"x-rapidapi-key: ........"
		],
	]);

Any ideas?

URL:
https://www.oddschecker.com/horse-racing

Thanks
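
For readers hitting the same symptom, the empty payload quoted in this post can be detected before any HTML parsing is attempted. The following is only a minimal sketch: `isEmptyScrape` is a hypothetical helper name, not part of the ScrapeNinja API, and it assumes that a missing or blank `body` field always means the scrape failed.

```php
<?php
// Hypothetical helper (not part of ScrapeNinja): returns true when a raw
// /scrape response carries no page content, as in {"info":{},"body":""}.
function isEmptyScrape(string $rawJson): bool
{
    $data = json_decode($rawJson, true);
    if (!is_array($data)) {
        return true; // not a JSON object: treat it as a failed scrape
    }
    $body = $data['body'] ?? '';
    // Assumption: a non-string or whitespace-only body means "blocked".
    return !is_string($body) || trim($body) === '';
}

var_dump(isEmptyScrape('{"info":{},"body":""}'));                // bool(true)
var_dump(isEmptyScrape('{"info":{},"body":"<html>ok</html>"}')); // bool(false)
```

In practice the string returned by `curl_exec($curl)` would be passed to this check, and only non-empty results handed on to the parser.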

jdeboysere commented 2 years ago

Hi @restyler

The problem with Cloudflare is back (or maybe it's impossible to bypass?). Can you check this in PHP? When I run this code locally using LocalWP the request is blocked, but the same request does work from RapidAPI.

	$curl = curl_init();
	curl_setopt_array($curl, [
		CURLOPT_URL => "https://scrapeninja.p.rapidapi.com/scrape",
		CURLOPT_RETURNTRANSFER => true,
		CURLOPT_FOLLOWLOCATION => true,
		CURLOPT_ENCODING => "",
		CURLOPT_MAXREDIRS => 10,
		CURLOPT_TIMEOUT => 30,
		CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
		CURLOPT_CUSTOMREQUEST => "POST",
		CURLOPT_POSTFIELDS => "{\r
			\"url\": \"https://www.instant-gaming.com/fr/\"\r
		}",
		CURLOPT_HTTPHEADER => [
			"X-RapidAPI-Host: scrapeninja.p.rapidapi.com",
			"X-RapidAPI-Key: xxx",
			"content-type: application/json"
		],
		// Note: disabling TLS verification is insecure; keep it enabled outside local testing.
		CURLOPT_SSL_VERIFYHOST => false,
		CURLOPT_SSL_VERIFYPEER => false,
	]);

garethjersey-BstGK_Gfae9 commented 2 years ago

Looks to be working as usual now! Thanks for the quick help.

restyler commented 2 years ago

Hey, this was a temporary hiccup due to proxy degradation. It should be good now; let me know.
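
Since the failures in this thread turned out to be transient, a client-side retry with exponential backoff is one way to ride out similar hiccups. This is a rough sketch only: `fetchWithRetry`, its parameters, and the delay values are all hypothetical, and the demo uses a stand-in fetcher rather than a real ScrapeNinja call. Arrow functions require PHP 7.4 or later.

```php
<?php
// Hypothetical helper: calls $fetch until $isGood accepts the result or
// $maxAttempts is reached, sleeping with exponential backoff in between.
function fetchWithRetry(callable $fetch, callable $isGood, int $maxAttempts = 4, int $baseDelayMs = 500): ?string
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $result = $fetch();
        if (is_string($result) && $isGood($result)) {
            return $result;
        }
        if ($attempt < $maxAttempts) {
            // Delay doubles each time: base, 2x base, 4x base, ...
            usleep($baseDelayMs * (2 ** ($attempt - 1)) * 1000);
        }
    }
    return null; // every attempt failed
}

// Demo with a stand-in fetcher that returns an empty body twice, then a page.
$calls = 0;
$fakeFetch = function () use (&$calls) {
    $calls++;
    return $calls < 3
        ? '{"info":{},"body":""}'
        : '{"info":{},"body":"<html>ok</html>"}';
};
// Accept a response only when its "body" field is a non-blank string.
$hasBody = fn(string $raw): bool =>
    trim((string) (json_decode($raw, true)['body'] ?? '')) !== '';

$page = fetchWithRetry($fakeFetch, $hasBody, 4, 10);
echo $calls, "\n"; // 3
```

With a real request, `$fetch` could wrap `curl_exec($curl)` on a handle configured as in the posts above. Capping the number of attempts matters here because every call to the API counts against the subscription quota.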
