Extract info from a webpage





mk12327
I am trying to periodically extract information from a webpage and store it in a database so that I can analyse the collected data. I did some reading on this and found that one possible solution is PHP's cURL library.

In fact, I found this resource during my research: http://archive.devnewz.com/devnewz-3-20041221UsingPHPCURLLibrarytoScrapetheInternet.html. Although I followed the resource closely, the code doesn't seem to work: when I run the PHP script, no information is extracted. Is something missing from the code provided, or is it just my lack of experience with cURL?

I was also wondering whether I need cron jobs to retrieve the page periodically. Thanks in advance!
rvec
Let me guess: you tried the following code without putting a class around it? ($this only exists inside a class method, so it fails in a standalone function.)
Quote:
Code:
function Grabber($url)
{
$this->content="";
$ch = curl_init ();
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_URL, $url);
curl_setopt ($ch, CURLOPT_TIMEOUT, 60);
$this->content = curl_exec ($ch);
curl_close ($ch);
}


Try this instead:
Code:
function Grabber($url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the page as a string instead of printing it
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_TIMEOUT, 60);       // give up after 60 seconds
    $content = curl_exec($ch);                   // false on failure
    curl_close($ch);
    return $content;
}

$website = Grabber('http://www.frihost.com');

var_dump($website);
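
The Grabber function above returns whatever curl_exec gives back, which is false when the request fails, so a failed fetch is easy to miss. Below is a minimal sketch of a variant that reports failures. The name fetchPage, the 10-second connect timeout, following redirects and treating HTTP 400+ as a failure are all my own assumptions, not something from the linked article.

```php
<?php
// Hypothetical helper (the name fetchPage is illustrative, not from the thread).
// Returns the page body as a string, or null if the request failed.
function fetchPage($url, $timeout = 60)
{
    $ch = curl_init();
    curl_setopt_array($ch, array(
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,     // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,     // assumption: redirects should be followed
        CURLOPT_CONNECTTIMEOUT => 10,       // assumption: 10 s to establish the connection
        CURLOPT_TIMEOUT        => $timeout, // overall limit for the whole transfer
    ));

    $body = curl_exec($ch);
    if ($body === false) {
        // Transport-level failure: DNS, refused connection, timeout, ...
        error_log('cURL error ' . curl_errno($ch) . ': ' . curl_error($ch));
        return null;
    }

    // Assumption: any 4xx/5xx response counts as a failed fetch.
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    if ($status >= 400) {
        error_log('HTTP status ' . $status . ' for ' . $url);
        return null;
    }

    return $body;
}
```

Call it as fetchPage('http://www.frihost.com') and check the result for null before trying to parse it.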
 


Edit: yes, cron is the best way to do that.
mk12327
Thanks for the reply. But that was not the only problem I had. After changing the code, I ran it again on my local computer and got the following error message:
Quote:
Error in my_thread_global_end(): 1 threads didn't exit

I looked for alternatives and came across DOMDocument's loadHTMLFile and saveHTML methods in PHP. I used these two methods, but the error persisted.

Further research showed that it is a bug in the libmysql.dll file shipped with certain PHP releases (those after 5.2.1 and before 5.2.5).

I checked my local PHP version and found that I was running 5.2.3, which means I was stuck with that bug and my code was not at fault. I then updated PHP to the latest version, 5.2.6, and it is now working fine.
mk12327
Here's an update...

I spent the whole night on this program and finally got it done. In the end I settled on the loadHTMLFile and saveHTML methods in PHP. I was able to pick up the methods and properties quickly (I had no prior knowledge of DOMDocument or cURL); what took the most time was breaking down the retrieved HTML page to the level where I could isolate the information I really wanted.
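
To illustrate the "breaking down" step, here is a rough sketch of isolating values from a parsed page with DOMDocument and DOMXPath. The markup, the id "prices" and the class names are made up for the example (the real page is not shown in this thread), and loadHTML is used on an inline string so the snippet runs without network access; loadHTMLFile would be used the same way on a URL.

```php
<?php
// Stand-in markup: purely hypothetical, the real page structure is unknown.
$html = '<html><body><table id="prices">'
      . '<tr><td class="name">Widget</td><td class="price">9.50</td></tr>'
      . '<tr><td class="name">Gadget</td><td class="price">12.00</td></tr>'
      . '</table></body></html>';

// Real-world HTML is rarely valid, so collect parser warnings
// internally instead of having them printed.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// XPath lets you jump straight to the nodes of interest
// instead of walking the tree level by level.
$xpath = new DOMXPath($doc);
$rows  = array();

foreach ($xpath->query('//table[@id="prices"]//tr') as $tr) {
    // Relative queries, evaluated with the current row as context node.
    $name  = $xpath->query('td[@class="name"]', $tr)->item(0);
    $price = $xpath->query('td[@class="price"]', $tr)->item(0);

    if ($name !== null && $price !== null) {
        $rows[] = array(
            'name'  => trim($name->textContent),
            'price' => (float) $price->textContent,
        );
    }
}

print_r($rows);
```

Each entry in $rows is then ready to be written to the database with whatever insert code you already have.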

As far as I know, the PHP script is working without any bugs. I tested and changed the code a lot on my local computer until it worked correctly before uploading it to Frihost to test. Currently, when I go to the URL of my PHP script, it accurately retrieves the information I want and stores it in the MySQL database. But I do not want to run the script manually, so I guess the next step is to set up a cron job to run the PHP script regularly. But I have no knowledge of cron jobs :(

Any help will be very much appreciated... Thanks in advance!
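
For anyone with the same question: a cron job is one line in a crontab that gives a schedule and a command to run. The sketch below shows an example entry as a comment, plus a small guard so that a slow run does not overlap with the next one. The paths, the 30-minute schedule and the name runExclusive are placeholders of mine; how you edit the crontab (control panel or crontab -e) depends on the host. It assumes PHP 5.5 or later because of the closure and the finally block.

```php
<?php
// Example crontab entry (paths and schedule are placeholders):
//   */30 * * * * /usr/bin/php /home/youruser/scrape.php >> /home/youruser/scrape.log 2>&1
// The five fields are: minute, hour, day of month, month, day of week.
// "*/30" in the minute field means "every 30 minutes".

// Hypothetical helper: runs $job only if no other run holds the lock file.
// Returns true if the job ran, false if it was skipped.
function runExclusive($lockFile, $job)
{
    $fp = fopen($lockFile, 'c');   // 'c': create if missing, never truncate
    if ($fp === false) {
        return false;              // lock file could not be opened
    }
    if (!flock($fp, LOCK_EX | LOCK_NB)) {
        fclose($fp);
        return false;              // a previous run is still in progress
    }
    try {
        call_user_func($job);
    } finally {
        flock($fp, LOCK_UN);
        fclose($fp);
    }
    return true;
}

$ran = runExclusive(sys_get_temp_dir() . '/scrape.lock', function () {
    // Placeholder: fetch the page, parse it and insert into MySQL here.
    echo "scrape run at " . date('c') . "\n";
});
```

Redirecting output to a log file, as in the example entry, makes it much easier to see why a scheduled run failed.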