Complicated regular expression (for preg_replace)

Fire Boar
Okay, I'm looking for a way to change the syntax of all the links in a document. Specifically, I want external links to work normally, and internal links to change to the following:

Before: forumdisplay.php?f=1
After: ../index.php?option=com_connector&cid=1&Itemid=3&file=forumdisplay&b_f=1

Yes, I know it looks really confusing, but basically the "after" URL always starts with "index.php?option=com_connector&cid=1&Itemid=3". Then the query variable "file" is the name of the .php file to use (without the extension), and any variables passed to it (in this case "f") have b_ prepended ("f" becomes "b_f", for example). What I want to do is make this change using preg_replace. The trouble is, I've had little to no experience with regular expressions. Could someone help me please? Thanks.

By the way, if you're familiar with the strings, you'll probably have guessed that my purpose in doing this is bridging vBulletin with Joomla.
Try this:


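function change($subject) {
  // prefix every parameter name with b_
  $result = preg_replace('/(\\w+)=([^&]*)/', 'b_$1=$2', $subject);
  // swap "name.php" (and the "?" after it, if any) for the connector URL
  $result = preg_replace('/(\\w+)\\.php\\??/', '../index.php?option=com_connector&cid=1&Itemid=3&file=$1&', $result);
  return rtrim($result, '&'); // no parameters: drop the trailing "&"
}
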
I think it does what you want. For example:


echo change("forumdisplay.php?f=1&txt='hoi'&debug=_#on");
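
As a cross-check on the rule stated in the question, the same mapping for a single URL can be sketched without regular expressions. This is only a sketch: the function name rewriteInternalUrl is hypothetical and does not come from this thread, and it assumes the input is already known to be an internal link (it does not look for http:// or www.).

```php
<?php
// Hypothetical helper, not part of the code posted in this thread.
// Assumes $url is an internal link such as "forumdisplay.php?f=1".
function rewriteInternalUrl($url) {
    $base = '../index.php?option=com_connector&cid=1&Itemid=3';

    // Split "file.php?query" into the path and the optional query string.
    $parts = explode('?', $url, 2);
    $path  = $parts[0];
    $query = isset($parts[1]) ? $parts[1] : '';

    // Only .php targets are rewritten; anything else is returned unchanged.
    if (substr($path, -4) !== '.php') {
        return $url;
    }

    // The file name without its extension becomes the "file" variable.
    $new = $base . '&file=' . substr($path, 0, -4);

    // Every parameter gets b_ prepended to its name.
    $pairs = array();
    foreach (explode('&', $query) as $pair) {
        if ($pair !== '') {
            $pairs[] = 'b_' . $pair;
        }
    }
    if (count($pairs) > 0) {
        $new .= '&' . implode('&', $pairs);
    }
    return $new;
}

echo rewriteInternalUrl('forumdisplay.php?f=1'), "\n";
// prints: ../index.php?option=com_connector&cid=1&Itemid=3&file=forumdisplay&b_f=1
```

The printed line matches the "After" example from the question, which makes it a handy reference when checking the output of a regular-expression version.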



Fire Boar
Thanks, that's nearly what I'm after, but not quite. What I'm looking for is something to change all the links in a page, not just a single link passed in as input. Yours is great for individual links, but when a whole page is passed through it, it converts everything with an = in it, which is somewhat messy.

That's great so far though. :)
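
To illustrate the problem described above, here is a minimal sketch that applies the parameter pattern from the earlier reply to a piece of markup instead of a bare URL. The sample HTML is made up for this illustration.

```php
<?php
// Made-up markup for illustration; the pattern is the parameter
// pattern from the earlier reply.
$html = '<a href="forumdisplay.php?f=1" class="nav">Forum</a>';

// The pattern has no notion of where a link starts or ends, so the
// first "name=" it finds is the href attribute itself.
$out = preg_replace('/(\w+)=([^&]*)/', 'b_$1=$2', $html);

echo $out, "\n";
// prints: <a b_href="forumdisplay.php?f=1" class="nav">Forum</a>
```

The attribute name gets the b_ prefix and the real parameter f is never reached, which is why the pattern has to be limited to the URL inside each link.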
Do all the internal links in your documents start with a filename or a path followed by a filename and all external links with http://? If not, can you give a sample document?
Fire Boar
That's correct. None of the internal links start with http://.
Well, that certainly made things more complicated :) But it's a nice exercise. This is what I made of it:


  // processLinks($source)
  // rewrites all internal links and returns the new HTML source
  function processLinks($source) {
    return preg_replace_callback('/<a[^>]+?href\\s*=\\s*["\']?([^\'" >]+)[^>]*>/si', "processLink", $source);
  }

  // processLink($matchResult)
  // Callback. Do not call directly
  function processLink($matchResult) {
    // split the tag into: everything up to the url, the url, the rest
    preg_match('/(.*href\\s*=\\s*["\']?)([^\'" >]+)(.*)/si', $matchResult[0], $result);
    // internal link to php-file?
    if (preg_match('/^(?!\\s*(http:|www\.)).*\.php\b/is', $result[2])) {
      // yes. rewrite
      $new = "../index.php?option=com_connector&cid=1&Itemid=3";
      // get filename and parameters
      preg_match('/(.*)\.php(?:\?(.*))?/s', $result[2], $link);
      $new .= "&file=".$link[1];
      // rewrite parameters (only if the link has a query string)
      if (isset($link[2]) && $link[2] != "") {
          // parameters with a value: name=value
          $pars = preg_replace('/(\\w+)=([^&]*)/', 'b_$1=$2', $link[2]);
          // parameters without a value: name
          $pars = preg_replace('/(^|&)([^&=]+)(?=&|$)/', '$1b_$2', $pars);
          $new .= "&".$pars;
      }
      return $result[1].$new.$result[3];
    } else {
      // external link or not a php-file: leave the tag untouched
      return $matchResult[0];
    }
  }

  // Below testing only
  function test($str) {
    echo "<B>Before:</B><BR>";
    echo htmlspecialchars($str);
    echo "<BR><B>After:</B><BR>";
    echo htmlspecialchars(processLinks($str));
    echo "<BR><BR>";
  }

  test("<A href = ''>External link</a>");
  test("<a href =>External link</a>");
  test("<A hREF=\"  \">test</a>");
  test("<a href = forumdisplay.php?f=1 id='myid'>test2</a>");
  test("<a HREF = '/extra/print.php?str=this+is+a+test:%20&oke=1&yes' target='_blank'>test3</a>");
  // And all at once
  test("<A href = ''>External link</a><a href =>External link</a><A hREF=\"  \">test</a><a href = forumdisplay.php?f=1 id='myid'>test2</a><a HREF = '/extra/print.php?str=this+is+a+test:%20&oke=1&yes' target='_blank'>test3</a>");

processLinks($source) rewrites all links in $source which:
1. are internal (don't start with 'www.' or 'http:')
2. are links to a '.php' file
3. are given as the href attribute of an <A> tag.

All other links and/or urls are left alone.
No doubt it needs some fine-tuning and/or has some bugs, so test it to see if it works and let me know what needs to be changed. Maybe other urls also need to be rewritten?
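
The parameter rewriting inside processLink is done in two passes, and it may help to trace them in isolation. The sketch below uses the query string from the test3 link above; the variable names $query and $step1 are only for this illustration.

```php
<?php
// Query string taken from the test3 link in the post above.
$query = 'str=this+is+a+test:%20&oke=1&yes';

// Pass 1: parameters of the form name=value get b_ prepended to the name.
$step1 = preg_replace('/(\w+)=([^&]*)/', 'b_$1=$2', $query);
echo $step1, "\n";
// prints: b_str=this+is+a+test:%20&b_oke=1&yes

// Pass 2: parameters without "=" (here "yes") are picked up separately.
// The lookahead (?=&|$) only accepts a name that runs up to the next "&"
// or the end of the string, so names already handled in pass 1 are skipped.
$pars = preg_replace('/(^|&)([^&=]+)(?=&|$)/', '$1b_$2', $step1);
echo $pars, "\n";
// prints: b_str=this+is+a+test:%20&b_oke=1&b_yes
```

Doing this in two passes keeps each pattern simple: the first never has to deal with valueless parameters, and the second cannot touch anything that is followed by "=".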