
An HTML to Frihost Forum Markup Converter




Sometimes I re-use a blog posting, or anything else written somewhere on the web and thus available as HTML code, for a forum or blog posting here. This posting, for example, I also published on my WordPress blog. That way I can re-use my writing, but I have to translate the HTML into the markup language used here on Frihost, which is rather minimalistic: it only supports links, images, code blocks and a few text formatting options such as bold and italic.
Nevertheless, I wrote a little Python program that does the job for me. It uses the HTMLParser module, whose HTMLParser class makes it quite easy to process HTML by handling three major events: the occurrence of a start tag, the occurrence of an end tag, and the occurrence of content in between. Here is my usage of that class so far. As one can see, I handle the HTML tags "a" (links), "img" (images) and "listing" (code blocks), text formatting tags such as "b", "strong", "i", "em" and "u", as well as line breaks and paragraphs. All other HTML tags are ignored and thus stripped.
Code:

from html.parser import HTMLParser  # Python 2: from HTMLParser import HTMLParser

# options_no_img and options_no_tables are module-level flags set elsewhere in the script.

class MyHTMLParser(HTMLParser):
    def __init__(self, url):
        HTMLParser.__init__(self)
        self.text = ""
        self.ignore = False   # True while inside a table that is being skipped
    def handle_starttag(self, tag, attrs):
        self.entry = {}
        if not self.ignore:
            # Check that the tag has attributes before looking at the first one
            if tag == "a" and attrs and attrs[0][0] == 'href':
                self.text += "[url="+attrs[0][1]+"]"
                self.ignore = False
            elif tag == "img" and not options_no_img:
                for attr in attrs:
                    if attr[0] == "src":
                        src = attr[1]
                        self.text += "[img]"+src+"[/img]"
            elif tag in ("p", "br"):
                self.text += "<br>"
                self.ignore = False
            elif tag in ("b", "strong"):
                self.text += "[b]"
            elif tag in ("i", "em"):
                self.text += "[i]"
            elif tag == "u":
                self.text += "[u]"
            elif tag == "listing":
                self.text += "[code]"
            elif tag == "table" and options_no_tables:
                self.ignore = True

    def handle_endtag(self, tag):
        if tag == "a":
            if not self.ignore: self.text += "[/url]"
        elif tag == "table":
            self.ignore = False
        elif tag == "listing":
            self.text += "[/code]"
        elif tag in ("b", "strong"):
            self.text += "[/b]"
        elif tag in ("i", "em"):
            self.text += "[/i]"
        elif tag == "u":
            self.text += "[/u]"
    def handle_data(self, data):
        if not self.ignore:
            # Skip text chunks that start with a newline (whitespace between tags)
            if not data.startswith("\n"):
                self.text += data
    def close(self):
        HTMLParser.close(self)
        return self.text
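
To show how such a parser is driven from start to finish, here is a minimal, self-contained sketch for Python 3. It is not my actual script: the names MarkupSketch, PAIRED and convert are made up for this example, it emits a plain newline instead of "<br>" for paragraphs and line breaks, and it leaves out the table and image options.

```python
from html.parser import HTMLParser

# Hypothetical lookup table for this sketch: HTML tag -> (opening, closing) markup
PAIRED = {
    "b": ("[b]", "[/b]"), "strong": ("[b]", "[/b]"),
    "i": ("[i]", "[/i]"), "em": ("[i]", "[/i]"),
    "u": ("[u]", "[/u]"),
    "listing": ("[code]", "[/code]"),
}

class MarkupSketch(HTMLParser):
    def __init__(self):
        super().__init__()
        self.parts = []       # converted fragments, joined at the end
        self.open_links = []  # one flag per open <a>: did it produce a [url=...]?

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            href = attrs.get("href")
            self.open_links.append(href is not None)
            if href is not None:
                self.parts.append("[url=" + href + "]")
        elif tag == "img" and attrs.get("src"):
            self.parts.append("[img]" + attrs["src"] + "[/img]")
        elif tag in ("p", "br"):
            self.parts.append("\n")  # assumption: newline instead of "<br>"
        elif tag in PAIRED:
            self.parts.append(PAIRED[tag][0])

    def handle_endtag(self, tag):
        if tag == "a":
            # Only close links that were actually opened with an href
            if self.open_links and self.open_links.pop():
                self.parts.append("[/url]")
        elif tag in PAIRED:
            self.parts.append(PAIRED[tag][1])

    def handle_data(self, data):
        self.parts.append(data)

def convert(html):
    """Feed the HTML to the parser and return the collected markup."""
    parser = MarkupSketch()
    parser.feed(html)
    parser.close()  # flushes any text still buffered by the parser
    return "".join(parser.parts)

print(convert('<p>This is <strong>strong</strong> and a '
              '<a href="http://www.amagard.frihost.org/">link</a>.</p>'))
```

Looking up the href and src attributes by name rather than by position means the order of attributes in the source HTML does not matter, and an "a" tag without an href does not leave a stray [/url] behind.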

Once I had the Python code written I wanted to make it available as a web application, so I had to turn it into a little server-side (CGI) script. This is not rocket science: I just have to ensure that my little Python program prints "Content-Type: text/html" followed by a blank line as the first two lines of its output, as described here. In addition I use the cgitb module for better error reporting and the cgi module to retrieve the posted data, as I described here a while ago.
My frontend is written in JavaScript with jQuery and basically provides an input field at the top for pasting in HTML code, a "Go!" button, and an output field at the bottom where the converted markup appears.
The Ajax post request that calls my little Python-based server looks like this:
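
The shape of such a response can be sketched in a few lines. This is only an illustration under some assumptions: the function name cgi_response and the converter callback are made up, and instead of the cgi module the sketch parses the posted form data with urllib.parse.parse_qs, because the cgi module was removed from the standard library in Python 3.13.

```python
from urllib.parse import parse_qs

def cgi_response(post_body, convert):
    """Build the complete CGI output for a URL-encoded POST body.

    post_body is the raw form data (e.g. "input=..."); convert is any
    function that turns the posted HTML into forum markup.
    """
    fields = parse_qs(post_body, keep_blank_values=True)
    html_in = fields.get("input", [""])[0]
    # Header line first, then a blank line, then the actual content
    return "Content-Type: text/html\n\n" + convert(html_in)

# Stand-in converter for the demo; a real script would call the HTML parser here
print(cgi_response("input=%3Cb%3Ehi%3C%2Fb%3E", str.upper))
```

In a real CGI script the POST body would be read from standard input and the result written to standard output; keeping the logic in a plain function makes it easy to try out without a web server.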
Code:
 
$.post("cgi-bin/HTML2FrihostMarkup.py",
    {"input": inp},
    function(text) {
        $('div#output').html("[b]" + text + "[/b]");
        $('div#output').fadeIn('slow');
    },
    "text");

"inp" is the content of my input field, which is handed over to the server. The element addressed as 'div#output' is the output field where the result computed and returned by my server appears.
You can try out my tool with this test data if you want:
Code:

<p>This is <strong>strong</strong>.</p>
<p>This is <b>bold</b>.</p>
<p>This is <em>italic</em>.</p>
<p>This is also <i>italic</i>.</p>
<p>This is <u>underlined</u>.</p>
<p><listing>This is code</listing></p>
<p>Here we have a <a href="http://www.amagard.frihost.org/">link</a>.</p>
<p>Here we have an image: <img alt="Star" src="http://messenger.msn.com/MMM2006-04-19_17.00/Resource/emoticons/star.gif"></p>

As a matter of fact, this little tool now allows me to write my postings with any HTML editor I like and convert them to Frihost's markup language later on. My preferred WYSIWYG HTML editor, which I also use for writing blog postings, is Windows Live Writer; I used it to compose this posting as well.


