

Get page source in javascript





Stubru Freak
I want to get the -real- source of the page I'm currently viewing.

I want this because I like making JavaScript bookmarklets, and I'm making one that automatically checks the current document with the W3C validator using source upload (so it also works on pages that require authentication or that live on a local computer).

Thanks,
Frederik Vanderstraeten
Aredon
Two words: HTTP Request.

Something quick I wrote up that works in Firefox and IE 6.0:
Code:

javascript:(function(){var x=window.XMLHttpRequest?new XMLHttpRequest():new ActiveXObject("Msxml2.XMLHTTP");x.open("GET",location.pathname+location.search,true);x.onreadystatechange=function(){if(x.readyState==4){alert(x.responseText)}};x.send(null)})();



You might also want to give this a read if you want a cross-browser XMLHttpRequest object.
http://jibbering.com/2002/4/httprequest.html

A tiny note: don't let the name XMLHttpRequest fool you into thinking it only works with XML; it works with HTML as well.
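
For readers who find the one-line bookmarklet hard to follow, here is a minimal sketch of the same idea written out as a function. The names `getPageSource`, `createRequest` and `onDone` are made up for this illustration and are not part of any library; `createRequest` is expected to return an XMLHttpRequest-like object, which also makes it easy to swap in the ActiveX variant on old IE.

```javascript
// Fetch the source of `url` and hand it to `onDone(status, text)`.
// `createRequest` is a hypothetical factory returning an XMLHttpRequest-like object.
function getPageSource(url, createRequest, onDone) {
  var xhr = createRequest();
  xhr.open("GET", url, true); // true = asynchronous
  xhr.onreadystatechange = function () {
    // readyState 4 means the whole response has arrived
    if (xhr.readyState === 4) {
      onDone(xhr.status, xhr.responseText);
    }
  };
  xhr.send(null);
}

// Possible browser usage (not executed here):
// getPageSource(location.pathname + location.search,
//   function () { return new XMLHttpRequest(); },
//   function (status, text) { alert(text); });
```

Passing `location.pathname + location.search` rather than only the path keeps the query string, which matters on pages like `page.php?id=3`.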
Stubru Freak
So that's the only way?
OK
Aredon
You could use innerHTML and document.doctype to construct a run-time source of the page. However, if any JavaScript on the page writes to or edits the document (e.g. onclick="this.value='clicked'"), the run-time source will no longer match what the server sent. Suppose your element initially read:
Code:

<input type="button" onclick="this.value='clicked'" value="not clicked">

After clicking the above button, the run-time source would read:
Code:

<input type="button" onclick="this.value='clicked'" value="clicked">

Sure you can avoid clicking it, but there is bound to be some auto-running code or document.writes on the page.

You can even go as far as outerHTML (which is IE only) to retrieve the run-time source of the page, but we would run into the same problem.

The only logical solution is to use an HTTP Request, whether client-side or server-side.
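
To make the innerHTML plus document.doctype approach concrete, here is a rough sketch of how a run-time source could be assembled. `serializeDoctype` and `runtimeSource` are hypothetical helper names; the sketch only assumes the standard DOM properties `doctype.name`, `doctype.publicId`, `doctype.systemId` and `documentElement.innerHTML`/`outerHTML`. As explained above, the result reflects the page after scripts have run, not the original source.

```javascript
// Rebuild a <!DOCTYPE ...> line from a DocumentType-like object (or null).
function serializeDoctype(doctype) {
  if (!doctype) return "";
  var s = "<!DOCTYPE " + doctype.name;
  if (doctype.publicId) {
    s += ' PUBLIC "' + doctype.publicId + '"';
  } else if (doctype.systemId) {
    s += " SYSTEM";
  }
  if (doctype.systemId) {
    s += ' "' + doctype.systemId + '"';
  }
  return s + ">\n";
}

// Combine the doctype with the current (run-time) markup of the page.
function runtimeSource(doc) {
  var root = doc.documentElement;
  // Without outerHTML we fall back to innerHTML, which loses attributes on <html>.
  var html = root.outerHTML || "<html>" + root.innerHTML + "</html>";
  return serializeDoctype(doc.doctype) + html;
}

// Possible browser usage (not executed here): alert(runtimeSource(document));
```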
Stubru Freak
But the problem with that would still be that some pages that require log-in, cookie data, and more, are unavailable.
Aredon
Stubru Freak wrote:
But the problem with that would still be that some pages that require log-in, cookie data, and more, are unavailable.

Precisely! That is all the more reason to switch to HTTP Requests. Thus, the only logical solution is to use an HTTP Request, whether client-side or server-side.

If you don't know how to use HTTP Requests, maybe it is time you broadened your horizons.
HTTP Requests DO allow you to use cookies: a client-side request to the same site sends the browser's cookies along automatically, and a server-side request can set the Cookie header itself. They also let you simulate a form submission for user logins, choose between a GET and a POST request, and do everything else needed to make the server think the request came from an actual browser with a user clicking around. The only thing you cannot spoof in an HTTP Request is your IP address, which will reflect your actual address, although you can still get around that with proxies and the like.
It just might take a pretty big bookmarklet to simulate the user's browser in every way. Note that all of this can be done either server-side or client-side. Most people do this type of thing server-side; however, since you want a bookmarklet, you can also do it client-side. A heads-up: bookmarklets DO have a length limit, so if yours is going to be lengthy, you will have to stash a file somewhere for it to load, or do everything server-side and create a bookmarklet that runs the current page through your server. Another option is to split the bookmarklet into segments that define variables one at a time, with a final bookmarklet that reads all the variables.
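
The "stash a file somewhere" option can be sketched as a tiny loader bookmarklet that only injects a script tag, so the real code can be as long as it needs to be. `makeLoaderBookmarklet` is a made-up helper name and the URL in the usage line is a placeholder, not a real file.

```javascript
// Build a short bookmarklet that loads the real code from `scriptUrl`.
// The long logic then lives in that external file, outside the bookmarklet length limit.
function makeLoaderBookmarklet(scriptUrl) {
  var body =
    "(function(){" +
    "var s=document.createElement('script');" +
    "s.src=" + JSON.stringify(scriptUrl) + ";" + // JSON.stringify quotes and escapes the URL
    "document.body.appendChild(s);" +
    "})();";
  return "javascript:" + body;
}

// Example with a placeholder URL:
// makeLoaderBookmarklet("http://example.com/validate.js")
```

The loaded script runs in the context of the current page, so it has the same access to `location` and the same-site cookies as code typed directly into the bookmarklet.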


Why not settle with the following in any case?
Code:

javascript:location.href="http://validator.w3.org/check?uri="+encodeURIComponent(location.href)
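
One detail worth spelling out when passing the page address as a query parameter: `encodeURI` leaves characters such as `?`, `&` and `=` untouched, so a page URL with its own query string would be cut apart by the validator, while `encodeURIComponent` escapes them. A small sketch, where `validatorUrl` is a hypothetical helper name and the example.com address is a placeholder:

```javascript
// Build the validator link for a given page address.
function validatorUrl(pageUrl) {
  // encodeURIComponent escapes ":", "/", "?", "&" and "=", keeping pageUrl as one parameter value
  return "http://validator.w3.org/check?uri=" + encodeURIComponent(pageUrl);
}

var page = "http://example.com/a?b=1&c=2";
console.log(encodeURI(page));    // http://example.com/a?b=1&c=2 (unchanged, "&c=2" would become a separate parameter)
console.log(validatorUrl(page)); // http://validator.w3.org/check?uri=http%3A%2F%2Fexample.com%2Fa%3Fb%3D1%26c%3D2
```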


Just an idea: you might be able to skip the whole cookies and HTTP header step if you can find the cached file on the user's machine and get the user's permission to open it. The only problem is that the URL of the cached file is always a mystery. If you find a way to detect its location dynamically, I'm all ears.
Stubru Freak
Thanks.

I know how to use HTTP Requests. That's how I'm planning to contact the validator, so that it only shows a congratulations alert when your page is valid instead of loading the results. This will give faster results when working with a lot of pages. (In fact, it will only seem faster, but perceived speed is more important than real speed.)
But to connect to another website I need to make it an extension anyway. So maybe that opens up new possibilities, like intercepting and saving the page code before the user gets it, or (better) getting the View Source content somehow.

I'm using this code at the moment:
Code:
javascript:var html = confirm("Do you want to do an HTML check?");var css = confirm("Do you want to do a CSS check?");if(html){open("http://validator.w3.org/check?uri=" + encodeURIComponent(location.href));}if(css){open("http://jigsaw.w3.org/css-validator/validator?uri=" + encodeURIComponent(location.href));}void(0);
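
The plan described in this post (send the page source to the validator and only show an alert with the verdict) could look roughly like the sketch below. It rests on assumptions that should be checked against the validator's own API documentation: that the markup validator accepts direct input in a `fragment` form field, that `output=soap12` selects a machine-readable reply, and that the reply carries `X-W3C-Validator-Status` and `X-W3C-Validator-Errors` headers. The function names are made up for this example.

```javascript
// Build a form-encoded POST body carrying the page source.
// ASSUMPTION: the validator reads direct input from a "fragment" field.
function buildValidatorPostBody(source) {
  return "fragment=" + encodeURIComponent(source) + "&output=soap12";
}

// Turn the (assumed) response headers into a short message for alert().
// statusHeader: value of X-W3C-Validator-Status, errorCountHeader: X-W3C-Validator-Errors.
function summarizeValidation(statusHeader, errorCountHeader) {
  if (statusHeader === "Valid") {
    return "Congratulations, this page is valid!";
  }
  if (statusHeader === "Invalid") {
    return "Invalid: " + (errorCountHeader || "?") + " error(s).";
  }
  return "The validator could not check this page.";
}
```

Because the request goes to another site, a plain bookmarklet would run into the browser's same-origin restriction, which is why an extension is the natural place for the actual POST.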
Aredon
It seems you are forced to use HTTP Requests, and you now have to pick whether to program this client-side or server-side.

Which one to choose depends on a few things:
1. Your server-side and client-side programming skills
2. Speed: running on the client avoids an extra trip through your server (point to JavaScript)
3. Cross-browser compatibility (point to PHP)
4. The type of users that will be using your code (point to PHP)
5. Whether you will keep this as a bookmarklet or turn it into an actual page
Stubru Freak
I'm gonna make this a Firefox extension. It doesn't seem like any server-side code is needed here.
Aredon
Firefox extension, wow, why didn't I think of that? Good luck!