Hey there, I seem to be stuck on an issue after having done as much research and Googling as I possibly can.
When using cURL to crawl pages, I’m seeing more and more 403 Forbidden errors; most pages now return 403 Forbidden.
If I cURL an http:// URL that redirects to an https:// URL, I get the 403 error. If I cURL the https:// URL directly, I just get a blank page.
I have followed the instructions on your site about placing the cacert.pem file in my root directory. (However, I didn’t get the file from your site, as your link is broken.)
Here is my function for cURL. Wondered if you could let me know what I am doing wrong, or how I can fix this, in clear detail (please)…
function get_page_now($url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HTTPHEADER, array("User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.15) Gecko/20080623 Firefox/2.0.0.15"));
    curl_setopt($ch, CURLOPT_NOBODY, false);
    curl_setopt($ch, CURLOPT_CAINFO, getcwd() . "\\cacert.pem");
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_AUTOREFERER, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 20);
    curl_setopt($ch, CURLOPT_REFERER, $_SERVER['REMOTE_ADDR']);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    $result = curl_exec($ch);
    curl_close($ch);
    return $result;
}
I have tried removing the CURLOPT_SSL_VERIFYPEER option too and still no luck.
I have also tried replacing the CURLOPT_HTTPHEADER line with this line…
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.15) Gecko/20080623 Firefox/2.0.0.15");
But still no joy.
If I need to add a cookie setopt, could you please let me know exactly how to do that, i.e. the exact lines of code that I’d need to add?
I look forward to receiving some help. Much appreciated!
Best regards
Andy