

Search Bots How?





binsmyth
Has anyone tried to make their own search bot? I want to know the principle behind search bots, what would be the best place to start, and which programming language would be good for writing one.
leontius
binsmyth wrote:
Has anyone tried to make their own search bot? I want to know the principle behind search bots, what would be the best place to start, and which programming language would be good for writing one.


Are you sure you want to do that? It requires an enormous amount of resources (processing power and disk space) and a very complicated system. Basically, a search bot downloads a page from a website, indexes it (puts it in a database), looks for links in that page, visits and downloads all of them, and so on. Then, when a person wants to search, the system refers to the database to see which indexed pages are relevant to the search terms.
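
The loop described above (download, index, follow links, then answer searches from the index) can be sketched in a few lines of Python. This is only a toy illustration, not how any real search engine is built: the names `crawl`, `search`, `LinkAndTextParser` and `PAGES` are made up for this example, the "database" is an in-memory dictionary, and the `fetch` argument is an assumed callback that returns a page's HTML (or `None`), so the sketch runs without touching the network.

```python
from collections import defaultdict, deque
from html.parser import HTMLParser
from urllib.parse import urljoin
import re


class LinkAndTextParser(HTMLParser):
    """Collects the href of every <a> tag and all text data on a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        # Note: this also picks up script/style text; fine for a toy example.
        self.text_parts.append(data)


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl; returns an inverted index {word: set of URLs}."""
    index = defaultdict(set)
    seen = {start_url}
    queue = deque([start_url])
    pages_done = 0
    while queue and pages_done < max_pages:
        url = queue.popleft()
        html = fetch(url)          # hypothetical callback: URL -> HTML or None
        if html is None:
            continue
        pages_done += 1
        parser = LinkAndTextParser()
        parser.feed(html)
        # "Index" step: remember which pages contain which words.
        text = " ".join(parser.text_parts).lower()
        for word in re.findall(r"[a-z0-9]+", text):
            index[word].add(url)
        # "Follow links" step: queue every link not seen before.
        for href in parser.links:
            absolute = urljoin(url, href)
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return index


def search(index, query):
    """Returns the set of URLs that contain every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# A tiny fake "web" so the example needs no network access.
PAGES = {
    "http://example.test/": '<a href="/a.html">A</a> <a href="b.html">B</a> home page',
    "http://example.test/a.html": "<p>erlang crawler notes</p>",
    "http://example.test/b.html": '<p>python crawler notes</p><a href="/">home</a>',
}

index = crawl("http://example.test/", PAGES.get)
print(sorted(search(index, "crawler notes")))
```

To try it on real pages you would pass a `fetch` function that actually downloads the URL (for example one built on `urllib.request.urlopen`), and a polite bot would also need to honor robots.txt and limit its request rate. Everything that makes real search engines hard (storage, ranking, scale) is left out here.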
Peterssidan
As leontius said, it requires a lot of resources, at least if you want to compete with the big ones. But not every search engine has to be the best, and everyone has to start somewhere.
I guess there are many languages you can use; if it's not too serious, you can pick a language that you are familiar with. The best language is probably something like Erlang, but I'm sure many other languages will handle it.
The algorithms are probably the most important part. It's probably very useful to be a math genius, to be able to make all kinds of estimates. I'm sure there are papers available on various search engine topics; the problem is finding them. They will probably be written at a high level, so it's important that you understand them, because I don't think there is a step-by-step tutorial on how to build a search engine like there is for games.
leontius
Haha, I realized I haven't said anything about the language :) Basically, I think any language will be OK, because most of your bottlenecks will be in database and network access anyway, so the speed of the language's compiler or implementation won't matter much. Use the language you're most comfortable with. Then, as Peterssidan said, be a math (and statistics, and preferably also computer science) expert, or else you will hardly be able to compete with the others!

By the way, a good start would be to explore the source code of various open source search engines, for example http://xapian.org/ and http://lucene.apache.org/solr/
Peterssidan
I found this paper: http://infolab.stanford.edu/~backrub/google.html

I haven't read it through, but it looks like a good start.
About the programming language:
Brin & Page wrote:
Most of Google is implemented in C or C++ for efficiency and can run in either Solaris or Linux.