

Web Page Scraper





Possum
Hi

Could PHP do this?


I have a list of 153 web pages that change daily. I want to search them every day for the appearance of about 60 different words. When one of those words appears, I want the word and the page address to be added to an email that is sent to me.

That would be just one Email with a list of the pages and words.


I think this is what they call a scraper... I hope I can do this. Do you think it can be done?


thank you...
Marcuzzo
It can definitely be done.
You can use the function file_get_contents for this.
Just use this function to get the page contents (read the HTML code), and then all you need to do is parse it.

EDIT: many websites have an RSS feed; I would try to use that to check for changes.
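
For what it's worth, here is a minimal sketch of the fetching step with file_get_contents. The helper name fetch_page and the 10-second timeout are my own assumptions, just for illustration:

```php
<?php
// Hypothetical helper: return a page's HTML, or null if the request fails.
function fetch_page(string $url, int $timeoutSeconds = 10): ?string
{
    // A stream context lets us set a timeout, so one slow site
    // does not stall the whole daily run.
    $context = stream_context_create([
        'http' => ['timeout' => $timeoutSeconds],
    ]);

    // @ hides the PHP warning; we check the return value instead.
    $html = @file_get_contents($url, false, $context);

    return $html === false ? null : $html;
}
```

You would call it once per address in your list and skip any page that comes back as null. Fetching http:// addresses this way needs allow_url_fopen to be enabled on the host.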
vat322
Yes, it can be done. But since not all web pages can be scraped using file_get_contents(), I suggest you use cURL, which does the job well.
And to search for the words, use strpos().
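
A rough sketch of the search step and the email body, assuming you already have each page's HTML as a string. The names find_words and build_report are made up for this example, and I used stripos() (the case-insensitive sibling of strpos()) on the assumption that "Word" and "word" should both count:

```php
<?php
// Hypothetical helper: return the words from $words that appear in the page.
function find_words(string $html, array $words): array
{
    // Drop the HTML tags so we only search the text content.
    $text = strip_tags($html);
    $found = [];
    foreach ($words as $word) {
        // stripos() returns 0 for a match at the very start, so compare
        // against false with !== rather than using it as a boolean.
        if (stripos($text, $word) !== false) {
            $found[] = $word;
        }
    }
    return $found;
}

// Hypothetical helper: turn [url => [words]] into one plain-text email body.
function build_report(array $hitsByUrl): string
{
    $lines = [];
    foreach ($hitsByUrl as $url => $words) {
        if ($words !== []) {
            $lines[] = $url . ': ' . implode(', ', $words);
        }
    }
    return implode("\n", $lines);
}
```

The resulting string could then be passed to PHP's mail() as the message body, assuming the host is set up to send mail. Note that this matches substrings too, so "cat" would also be found inside "concatenate".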
Possum
Quote:
you can use the function file_get_contents for this.


Yes I understand this.. Cool

As usual there is another way.. So much learning..

Quote:
I suggest you use cURL


thx :)
Lolswagg2019
That's not bad :)

Thanks.
jmraker
The reason curl is better for downloading websites in "bot programs" is that:
. You can use it to pretend to be a web browser by sending a browser's user agent string
. It can post data to forms
. It can set and remember cookies for you
. It has those and many more options
http://php.net/manual/en/function.curl-setopt.php
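
Here is a small sketch of what that can look like. The function names, the 15-second timeout and the cookie file path are assumptions of mine; the CURLOPT_* constants are the real ones from the curl-setopt page above:

```php
<?php
// Hypothetical helper: build the options for a "browser-like" request.
function build_curl_options(string $userAgent, string $cookieFile): array
{
    return [
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,        // follow redirects
        CURLOPT_USERAGENT      => $userAgent,  // the user agent string to send
        CURLOPT_COOKIEJAR      => $cookieFile, // where cookies get saved
        CURLOPT_COOKIEFILE     => $cookieFile, // where saved cookies are read from
        CURLOPT_TIMEOUT        => 15,          // give up after 15 seconds
    ];
}

// Hypothetical helper: fetch a URL with the given options, or null on failure.
function fetch_with_curl(string $url, array $options): ?string
{
    $ch = curl_init($url);
    if ($ch === false) {
        return null;
    }
    curl_setopt_array($ch, $options);
    $body = curl_exec($ch);

    // curl_exec() returns false on failure.
    return is_string($body) ? $body : null;
}
```

Usage would be something like fetch_with_curl($url, build_curl_options('Mozilla/5.0', '/tmp/cookies.txt')), where both arguments are placeholders. Posting to a form would add CURLOPT_POST and CURLOPT_POSTFIELDS to the same array.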

To parse the pages, it's easiest for me to use regular expression matching on the whole page source.
Since the 153 sites are probably vastly different, it would take a while to get all of them working.
http://us1.php.net/manual/en/function.preg-match.php
http://us1.php.net/manual/en/function.preg-match-all.php
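
As a rough example of the regular expression approach, this sketch counts whole-word, case-insensitive matches. The function name count_whole_words is made up, and whole-word matching is my assumption about what is wanted:

```php
<?php
// Hypothetical helper: count whole-word matches of each word in the text.
function count_whole_words(string $text, array $words): array
{
    $counts = [];
    foreach ($words as $word) {
        // preg_quote() escapes any regex special characters in the word;
        // \b marks a word boundary and the i flag ignores case.
        $pattern = '/\b' . preg_quote($word, '/') . '\b/i';
        $n = preg_match_all($pattern, $text);
        if ($n > 0) {
            $counts[$word] = $n;
        }
    }
    return $counts;
}
```

Unlike a plain substring search, this would not report "cat" for a page that only contains "concatenate".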
johans
I have not tried it, but I think that would be a wonderful challenge.
Possum
As a learner of PHP, I think file_get_contents is a great function to start to really bite into. You sort of learn your way out to the rest of PHP from here...

If I could somehow write or use a search engine around it, that would be fun...
Hogwarts
Maybe you should do Udacity's CS101 course, Possum.

One of the projects is building a search engine :)
Possum
Quote:
Maybe you should do Udacity's CS101 course, Possum


Hey!.. I am a member and I will.. cool.. thx