PHP's cURL functions are relatively efficient for crawling pages and support concurrent transfers through the curl_multi_* functions, whereas file_get_contents() is somewhat less efficient. Note that the curl extension must be enabled before cURL can be used.
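Since everything that follows depends on the curl extension, it can help to check for it up front. The snippet below is a minimal sketch; curl_available() is a hypothetical helper name, not part of PHP or of the original code.

```php
<?php
// Hypothetical helper: reports whether the cURL extension can be used.
function curl_available(): bool {
    return extension_loaded('curl') && function_exists('curl_init');
}

if (!curl_available()) {
    // How the extension is enabled depends on the installation
    // (for example a php.ini entry or a distribution package).
    echo "The curl extension is not enabled.", PHP_EOL;
}
```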
Hands-on code
First, look at the login part of the code:
// Simulated login
function login_post($url, $cookie, $post) {
    $curl = curl_init(); // Initialize a cURL handle
    curl_setopt($curl, CURLOPT_URL, $url); // URL the login form is submitted to
    curl_setopt($curl, CURLOPT_HEADER, 0); // Do not include response headers in the output
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 0); // 0: curl_exec() prints the response directly instead of returning it
    curl_setopt($curl, CURLOPT_COOKIEJAR, $cookie); // Save received cookies to the specified file
    curl_setopt($curl, CURLOPT_POST, 1); // Send the request as a POST
    curl_setopt($curl, CURLOPT_POSTFIELDS, http_build_query($post)); // POST data to submit
    curl_exec($curl); // Execute the request
    curl_close($curl); // Close the cURL handle and free system resources
}
The login_post() function creates a handle with curl_init(), then uses curl_setopt() to set the relevant options: the URL to submit to, the file in which cookies are saved, the POST data (user name, password, and so on), and whether the response is returned or printed. It then runs the request with curl_exec() and finally calls curl_close() to release resources. Note that PHP's built-in http_build_query() converts an array into a URL-encoded query string.
Next, if the login succeeds, we need to fetch the content of a page that is only available after logging in.
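To make the http_build_query() step concrete, here is a small sketch of the conversion. The field names and values are made up for illustration; a real login form defines its own field names.

```php
<?php
// Hypothetical login form fields; real names depend on the target site.
$post = [
    'username' => 'alice',
    'password' => 'p@ss word',
];

// http_build_query() URL-encodes each value and joins the pairs with "&".
$body = http_build_query($post);

echo $body, PHP_EOL; // username=alice&password=p%40ss+word
```

The resulting string is the kind of value that login_post() passes to CURLOPT_POSTFIELDS.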
// Fetch data after a successful login
function get_content($url, $cookie) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // 1: curl_exec() returns the response as a string
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie); // Read the cookies saved at login
    $rs = curl_exec($ch); // Execute the request and fetch the page content
    curl_close($ch);
    return $rs;
}
The get_content() function follows the same pattern: initialize curl, set the relevant options, execute the request, and release resources. Here CURLOPT_RETURNTRANSFER is set to 1 so that curl_exec() returns the response as a string instead of printing it, while CURLOPT_COOKIEFILE reads the cookies that were saved during login. Finally, the function returns the page content.
Our ultimate goal is to obtain the information behind the simulated login, that is, the useful content that can only be retrieved normally after a successful login.
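The two steps can be chained through a shared cookie file. The sketch below is one possible arrangement, not the original author's code: login_post_checked(), get_content_checked(), and fetch_after_login() are hypothetical names, and the first two are compact variations of login_post() and get_content() that return the response so that a failed request can be detected.

```php
<?php
// Variation of login_post(): returns the response, or false if the request failed.
function login_post_checked(string $url, string $cookie, array $post) {
    $curl = curl_init();
    curl_setopt_array($curl, [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,    // return the response instead of printing it
        CURLOPT_COOKIEJAR      => $cookie, // cookies are written here when the handle is released
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => http_build_query($post),
        CURLOPT_CONNECTTIMEOUT => 5,       // assumed timeout in seconds
    ]);
    $rs = curl_exec($curl);
    curl_close($curl);
    return $rs;
}

// Variation of get_content(): same options as in the text, plus a timeout.
function get_content_checked(string $url, string $cookie) {
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_COOKIEFILE     => $cookie, // read the cookies saved at login
        CURLOPT_CONNECTTIMEOUT => 5,
    ]);
    $rs = curl_exec($ch);
    curl_close($ch);
    return $rs;
}

// Logs in, fetches the protected page, and returns its content (null on failure).
function fetch_after_login(string $loginUrl, string $pageUrl, array $post): ?string {
    // Temporary file that carries the session cookies between the two requests.
    $cookie = tempnam(sys_get_temp_dir(), 'cookie');
    try {
        if (login_post_checked($loginUrl, $cookie, $post) === false) {
            return null; // the login request itself failed
        }
        $page = get_content_checked($pageUrl, $cookie);
        return $page === false ? null : $page;
    } finally {
        unlink($cookie); // remove the cookie file once both requests are done
    }
}
```

A call might look like fetch_after_login('https://example.com/login', 'https://example.com/member', ['username' => 'alice', 'password' => 'secret']), where the URLs and field names are placeholders. Keep in mind that curl_exec() only reports transport-level failures; whether the credentials were accepted still has to be judged from the returned page.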
Usage summary
1. Initialize curl;
2. Use curl_setopt() to set the target URL and other options;
3. Call curl_exec() to execute the request;
4. After execution, close curl;
5. Output the data.
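The five steps can be sketched as a single helper that also reports errors, which the functions above do not do. This is only an illustration under stated assumptions: curl_fetch() is a hypothetical name, and the 5-second connection timeout is an arbitrary choice.

```php
<?php
// Hypothetical helper that walks through the five steps and reports errors.
function curl_fetch(string $url, array $extra = []): array {
    // 1. Initialize curl.
    $ch = curl_init();

    // 2. Set the target URL and other options (entries in $extra take precedence).
    curl_setopt_array($ch, $extra + [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_CONNECTTIMEOUT => 5,
    ]);

    // 3. Execute the request; curl_exec() returns false on failure.
    $body = curl_exec($ch);
    // Collect the error message while the handle is still open.
    $error = $body === false ? curl_error($ch) : null;

    // 4. Close curl.
    curl_close($ch);

    // 5. Hand the data back so the caller can output it.
    return ['body' => $body === false ? null : $body, 'error' => $error];
}
```

A caller would then check the 'error' entry before echoing 'body'.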