

Splitting a large site over more than one server





imagefree
How do large sites spread over more than one server (machine)? Take Google as an example. Google.com handles a huge amount of traffic every day, surely beyond the capacity of one server. So my question is: how do large sites (like google.com) split themselves across servers?

Currently I have Apache, MySQL and memcache on one server. I want to test and learn how to run and access MySQL and memcache on more than one server. I also want the whole site to be reachable at just example.com/ while its frontend is hosted on more than one machine.

Please share such splitting techniques, software and best practices. I searched Google and Wikipedia but didn't find any useful resources for beginners. I am using Windows.
Fire Boar
It's a technique called clustering. Here are a few pointers:

- Don't try it if you're a beginner.
- Don't use Windows.

Okay, so here's a basic setup. Suppose you have a standard setup: PHP and MySQL. Suppose also that you have an arbitrary number of servers. One technique is to have the MySQL database on one server, then to have a gateway receiving the HTTP requests. The gateway passes the request on to a random HTTP server on the network (one of many, each identically configured except for local IP address) which will then decide what to do. If it can retrieve the page from a cache, it will do so and pass it back to the gateway to be served. Otherwise, it will have to process the page, a task which will either be performed directly, or passed to another dedicated PHP server.
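To make the gateway idea concrete, here is a minimal in-process sketch in Python. It is only an illustration of the flow described above, not a real proxy: the `Backend` and `Gateway` classes, the server names and the `render` placeholder are all made up for the example, and a real setup would forward HTTP requests over the network instead of calling methods.

```python
import random


class Backend:
    """Hypothetical stand-in for one of the identically configured HTTP servers."""

    def __init__(self, name):
        self.name = name
        self.cache = {}  # path -> page body already produced by this server

    def render(self, path):
        # Placeholder for the real work (running PHP, querying MySQL).
        return f"<html>page for {path}</html>"

    def handle(self, path):
        # Serve from the local cache when possible, otherwise build the page.
        if path in self.cache:
            return self.cache[path], "cache"
        body = self.render(path)
        self.cache[path] = body
        return body, "rendered"


class Gateway:
    """Receives every request and hands it to a randomly chosen backend."""

    def __init__(self, backends, rng=None):
        self.backends = list(backends)
        self.rng = rng or random.Random()

    def handle(self, path):
        backend = self.rng.choice(self.backends)
        body, source = backend.handle(path)
        return backend.name, source, body


if __name__ == "__main__":
    gateway = Gateway([Backend("web1"), Backend("web2"), Backend("web3")])
    for _ in range(5):
        name, source, _body = gateway.handle("/index.php")
        print(name, source)
```

Note that each backend keeps its own cache in this sketch, so the same page may be rendered once per server. A shared cache (memcache on its own machine, for instance) would avoid that, at the cost of one more network hop.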

You can cluster MySQL requests as well, but it's more complicated because you also have to synchronize UPDATE and INSERT queries across the servers. Google (and other high-volume websites) use their own ultra-optimized server setups.
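One common way to sidestep part of the synchronization problem (an assumption on my part, not necessarily the setup meant above) is to send every write to a single primary MySQL server, let MySQL replication copy the changes to read-only replicas, and spread only the SELECTs over those replicas. The sketch below shows just the routing decision; `QueryRouter` and the host names are hypothetical, and it does not open any database connections.

```python
import itertools

# Statements that change data and therefore must go to the primary.
WRITE_VERBS = {"INSERT", "UPDATE", "DELETE", "REPLACE"}


class QueryRouter:
    """Decides which (hypothetical) MySQL host should run a statement."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Rotate through the replicas so reads are spread evenly.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def pick_server(self, sql):
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb in WRITE_VERBS or self._replicas is None:
            return self.primary
        return next(self._replicas)


if __name__ == "__main__":
    router = QueryRouter("db-primary", ["db-replica1", "db-replica2"])
    for statement in (
        "SELECT * FROM posts",
        "UPDATE posts SET views = views + 1 WHERE id = 7",
        "SELECT * FROM users",
    ):
        print(router.pick_server(statement), "<-", statement)
```

The catch is that replicas lag slightly behind the primary, so a page that reads its own write immediately afterwards may need to read from the primary too.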
imagefree
Fire Boar wrote:
It's a technique called clustering. Here are a few pointers:

- Don't use Windows.



Why not? I do not need a secure or perfect solution; I just want to do it for learning.

Also, I need a step-by-step guide to doing this. Additional information is already available on the web and Wikipedia.
badai
Because you need to buy Windows Compute Cluster Server. Can you afford it?

When the time comes, you won't need a step-by-step guide. You just configure it. They will give you a checklist, and you just fill in the blanks (IP, services, protocol, port).

And you can't do this alone. At least dozens of people from a few companies will take part.

Google does not use clustering; they use a distributed file system.

If you have kids under 12, this is something for them:
http://communication.howstuffworks.com/google-file-system.htm

AftershockVibe
Quote:
Why not? I do not need a secure or perfect solution; I just want to do it for learning.

Also, I need a step-by-step guide to doing this. Additional information is already available on the web and Wikipedia.


If it's for learning purposes, then why not learn to do it properly? There is a good reason for the "don't use Windows" comment.

Typically, only large companies with lots of data and users require clustering. This means that Microsoft can charge them for it and there are special versions of Microsoft software to do just this. Of course, Microsoft also comes in and helps set this up too, as it's a large undertaking.

Of course, Linux is (as always) the geek's playground. Ignoring its use by enterprises for a second, people do Linux clustering just because they can, and you can do it for free (without resorting to copied software).

This is why, if you google for "Windows cluster", you get a lot of useless information plus one result that is specific to SQL Server, but if you google for "Linux cluster" the top hit is a step-by-step guide divided into simple chapters!
thnn
One such way is to use a load-balancing proxy, such as haproxy. All requests are sent via this. You can configure it to select the servers that serve webpages either randomly or round-robin. This way only the box hosting haproxy needs to run Linux. The others can be running Windows, Mac, Linux, anything you like.

Haproxy also lets you filter on hostnames, which means you can supply a different list of servers for each hostname. I have a Windows VPS and a Linux VPS, and I use haproxy to transparently proxy requests for my different hostnames through to the correct server.
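The selection logic described in this post can be sketched in a few lines of Python. This is not haproxy configuration and not how haproxy is implemented; it only mimics the two ideas (a separate server list per hostname, round-robin within each list). The `HostRouter` class, the hostnames and the addresses are invented for the example.

```python
import itertools


class HostRouter:
    """Picks a backend for a request based on its Host header."""

    def __init__(self, pools):
        # pools maps a hostname to the list of servers allowed to answer for it.
        self._cycles = {
            host.lower(): itertools.cycle(servers)
            for host, servers in pools.items()
        }

    def pick(self, host_header):
        # Ignore an optional ":port" suffix and any difference in letter case.
        host = host_header.split(":")[0].lower()
        cycle = self._cycles.get(host)
        if cycle is None:
            return None  # unknown hostname: no backend configured
        return next(cycle)  # round-robin within this hostname's pool


if __name__ == "__main__":
    router = HostRouter({
        "www.example.com": ["10.0.0.11:80", "10.0.0.12:80"],  # e.g. Linux boxes
        "app.example.com": ["10.0.0.21:80"],                   # e.g. a Windows box
    })
    for host in ("www.example.com", "www.example.com", "app.example.com:80"):
        print(host, "->", router.pick(host))
```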
rvec
Most bigger sites use something like what thnn said: some load balancers on round robin, behind them some cache/proxy servers like squid or haproxy, and behind those a web server like Apache, a static file server like lighttpd and a database server like MySQL.
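As a rough illustration of why the static file server is a separate tier, the proxy layer can decide where a request goes just by looking at the path. The sketch below is an assumption about how such a rule might look, not a description of any particular site: `pick_tier` and the extension list are made up, and real proxies express this in their own configuration rather than in Python.

```python
import posixpath

# File types that can be served straight from disk without running any code.
STATIC_EXTENSIONS = {".css", ".js", ".png", ".jpg", ".gif", ".ico"}


def pick_tier(path):
    """Return which (hypothetical) server tier should answer a request path."""
    # Drop any query string before looking at the file extension.
    clean_path = path.split("?", 1)[0]
    extension = posixpath.splitext(clean_path)[1].lower()
    if extension in STATIC_EXTENSIONS:
        return "static"   # e.g. a lighttpd box serving files from disk
    return "dynamic"      # e.g. Apache running the application code


if __name__ == "__main__":
    for path in ("/style/main.css", "/index.php?id=3", "/img/logo.PNG", "/"):
        print(path, "->", pick_tier(path))
```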

Of course you can't just put any website on that configuration; you have to think about scalability when designing the app.

Another way is to use clouds, which are even better and more scalable, but a lot harder to write software for. This is also the way Google, Amazon (they provide cloud-based hosting), Yahoo and all the other big sites do it.
Fire Boar
rvec wrote:
Another way is to use clouds, which are even better and more scalable, but a lot harder to write software for. This is also the way Google, Amazon (they provide cloud-based hosting), Yahoo and all the other big sites do it.


Just a point here. "Cloud" is basically "unlimited hardware, provided you can pay the rent". I wouldn't say the cloud is any harder to write software for: at the end of the day it's a whole bunch of machines on a network (virtual ones, in the case of cloud computing). And I hardly think the big names that provide cloud computing services actually use it themselves. The whole point of cloud computing is to outsource your servers, and Google has and needs so many servers for itself and its services that outsourcing would frankly be ludicrous.
toasterintheoven
Google the term "web farm". If you can't afford to try something like that, you can just use one server and try web gardening, which involves running multiple server instances on it.
© 2005-2011 Frihost, forums powered by phpBB.