

a regex help





Mgccl
Code:
preg_match('/<div style=\"visibility:hidden\">(.+?)<\/div>/',$a,$captured_elements);


If I want to make this multi-line capable, what can I do?
MrBlueSky
Here you go. Also case-insensitive.

Code:

preg_match('/<div\s+style="visibility:hidden">(.+?)<\/div>/mi',$a,$captured_elements);
Mgccl
MrBlueSky wrote:
Here you go. Also case-insensitive.

Code:

preg_match('/<div\s+style="visibility:hidden">(.+?)<\/div>/mi',$a,$captured_elements);


I don't think it works...
gorn
Code:

preg_match('/<div\s+style="visibility:hidden">(.+?)<\/div>/si',$a,$captured_elements);


(Change the m to an s)

The single line/multiline thing is confusing. But in this case you want all the text to be treated as if it's on a single line (ie \n is just a normal character).
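
A minimal sketch of the difference, assuming a made-up two-line input in $a (the sample string and the variable names $with_m, $with_s, $m1 and $m2 are only for illustration). The m modifier only changes where ^ and $ match, while the s modifier lets the dot match newlines too:

```php
<?php
// Sample input (invented for illustration): the div content spans two lines.
$a = "<div style=\"visibility:hidden\">first line\nsecond line</div>";

// /m (PCRE_MULTILINE) only changes where ^ and $ match; "." still stops at "\n".
$with_m = preg_match('/<div\s+style="visibility:hidden">(.+?)<\/div>/mi', $a, $m1);

// /s (PCRE_DOTALL) lets "." match "\n" as well, so the capture can cross lines.
$with_s = preg_match('/<div\s+style="visibility:hidden">(.+?)<\/div>/si', $a, $m2);

var_dump($with_m);   // int(0): no match, the dot cannot cross the newline
var_dump($with_s);   // int(1): match
var_dump($m2[1]);    // the captured text, including the newline
```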
bladesage
Well, for one, you place a lot of unescaped special characters in there...

The symbols <>=: are special pattern characters with special meanings that alter the pattern. You could, for instance, use preg_quote($str). More to the point, here is something you can use.
Code:
preg_match('/\<div\s+style\="visibility\:hidden"\>(.+?)\<\/div\>/si', $a, $captured_elements);


If that helps, then good. I mean, I've had tons of problems with regular expressions (and PCRE in general), only to find out that I didn't escape everything that needed to be. It's the little things that really drive us nuts.
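
For comparison, a small sketch that runs the pattern with and without the extra backslashes against the same input. The sample string and the variable names are invented for the example; in PCRE a backslash before a non-alphanumeric character just means that literal character, so both forms should behave the same:

```php
<?php
// Sample input, invented for this example.
$a = "<div style=\"visibility:hidden\">secret\ntext</div>";

// Pattern without extra escaping.
$plain = '/<div\s+style="visibility:hidden">(.+?)<\/div>/si';
// Same pattern with <, >, = and : backslash-escaped.
$escaped = '/\<div\s+style\="visibility\:hidden"\>(.+?)\<\/div\>/si';

preg_match($plain, $a, $m_plain);
preg_match($escaped, $a, $m_escaped);

// Both patterns produce identical match arrays.
var_dump($m_plain === $m_escaped);  // bool(true)
```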
MrBlueSky
Mgccl wrote:


I don't think it works...


My mistake. What gorn says:

gorn wrote:
Code:

preg_match('/<div\s+style="visibility:hidden">(.+?)<\/div>/si',$a,$captured_elements);


(Change the m to an s)

The single line/multiline thing is confusing. But in this case you want all the text to be treated as if it's on a single line (ie \n is just a normal character).


bladesage wrote:
Well, for one, you place a lot of unescaped special characters in there...

The symbols <>=: are special pattern characters with special meanings that alter the pattern.


There are no characters in his pattern that need to be escaped; if anything, he escaped too much. <>=: are not special characters in regular expressions.
bladesage
MrBlueSky wrote:
There are no characters in his pattern that need to be escaped; if anything, he escaped too much. <>=: are not special characters in regular expressions.


php.net wrote:
The special regular expression characters are: . \ + * ? [ ^ ] $ ( ) { } = ! < > | :


I trust the people at the official PHP website myself. And I did have problems before by not escaping the above symbols. So, all things considered, I'd say that these are special characters.
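
For reference, a short sketch of what preg_quote() does with these characters when building a pattern from literal text. The sample strings are made up, and passing '/' as the second argument assumes the pattern uses / as its delimiter:

```php
<?php
// Literal text we want to find inside a larger string (invented example).
$literal = '<div style="visibility:hidden">';

// preg_quote() backslash-escapes every character on its list; the second
// argument additionally escapes the pattern delimiter.
$quoted = preg_quote($literal, '/');
echo $quoted, "\n";   // \<div style\="visibility\:hidden"\>

// The quoted text can then be dropped into a larger pattern.
$pattern = '/' . $quoted . '(.+?)' . preg_quote('</div>', '/') . '/si';
$a = "<div style=\"visibility:hidden\">one\ntwo</div>";
preg_match($pattern, $a, $captured_elements);
var_dump($captured_elements[1]);   // the text between the tags
```

This is mostly useful when the literal part comes from a variable and may contain real metacharacters such as . or (.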
gorn
Since the preg functions use Perl-compatible regular expressions, the Perl manual is relevant here:

perldoc perlre
Code:

       "(?<=pattern)"
                 A zero-width positive look-behind assertion.  For example,
                 "/(?<=\t)\w+/" matches a word that follows a tab, without
                 including the tab in $&.  Works only for fixed-width
                 look-behind.
       "(?<!pattern)"
                 A zero-width negative look-behind assertion.  For example
                 "/(?<!bar)foo/" matches any occurrence of "foo" that does not
                 follow "bar".  Works only for fixed-width look-behind.

    ...

       "(?>pattern)"
                 WARNING: This extended regular expression feature is
                 considered highly experimental, and may be changed or deleted
                 without notice.

                 An "independent" subexpression, one which matches the
                 substring that a standalone "pattern" would match if anchored at
                 the given position, and it matches nothing other than this
                 substring.  This construct is useful for optimizations of
                 what would otherwise be "eternal" matches, because it will
                 not backtrack (see "Backtracking").  It may also be useful in
                 places where the "grab all you can, and do not give anything
                 back" semantic is desirable.
    ...

       "(?:pattern)"
       "(?imsx-imsx:pattern)"
                 This is for clustering, not capturing; it groups
                 subexpressions like "()", but doesn't make backreferences as "()"
                 does.  So

                     @fields = split(/\b(?:a|b|c)\b/)

                 is like

                     @fields = split(/\b(a|b|c)\b/)


That said, Perl's own examples don't always escape <>=:. In particular, < and > are very commonly left unescaped, since they are used for HTML/XML parsing and don't normally appear in the constructs above, so they are interpreted as literal characters.
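
A rough PHP sketch of the same point, using made-up sample strings and variable names: outside of "(?" these characters are plain literals, and only directly after "(?" do they select one of the constructs from the manual excerpt.

```php
<?php
// Outside of "(?", the characters < = : ! are literals.
$literal_ok = preg_match('/a<b=c:d!/', 'a<b=c:d!');
var_dump($literal_ok);   // int(1)

// Right after "(?" they select a construct instead.
// (?<=\t) is a look-behind: match a word that follows a tab.
preg_match('/(?<=\t)\w+/', "key\tvalue", $m);
var_dump($m[0]);         // string(5) "value"

// (?:...) groups without capturing, so only the full match is returned.
preg_match('/\b(?:a|b|c)\b/', 'x b y', $g);
var_dump(count($g));     // int(1)
```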
MrBlueSky
bladesage wrote:
php.net wrote:
The special regular expression characters are: . \ + * ? [ ^ ] $ ( ) { } = ! < > | :


I trust the people at the official PHP website myself. And I did have problems before by not escaping the above symbols. So, all things considered, I'd say that these are special characters.


Programming Perl wrote:

For all their power and expressivity, patterns in Perl recognize the same 12 traditional metacharacters (the Dirty Dozen, as it were) found in many other regular expression packages:

\ | ( ) [ { ^ $ * + ? .



php.net wrote:


Meta-characters

The power of regular expressions comes from the ability to include alternatives and repetitions in the pattern. These are encoded in the pattern by the use of meta-characters, which do not stand for themselves but instead are interpreted in some special way.

There are two different sets of meta-characters: those that are recognized anywhere in the pattern except within square brackets, and those that are recognized in square brackets. Outside square brackets, the meta-characters are as follows:

\    general escape character with several uses
^    assert start of subject (or line, in multiline mode)
$    assert end of subject (or line, in multiline mode)
.    match any character except newline (by default)
[    start character class definition
]    end character class definition
|    start of alternative branch
(    start subpattern
)    end subpattern
?    extends the meaning of (, also 0 or 1 quantifier, also quantifier minimizer
*    0 or more quantifier
+    1 or more quantifier
{    start min/max quantifier
}    end min/max quantifier



As you can see, both the Perl documentation and php.net (http://www.php.net/manual/en/reference.pcre.pattern.syntax.php) agree that characters like <>:=" have no special meaning of their own in regular expressions. You don't need to escape them, except when they are part of character sequences that do have a special meaning, such as the ones mentioned by gorn above. If you want to search for all HTML tags you can use:

Code:

/<.*?>/


without escaping the < and >. Try it. You shouldn't put too much trust in documentation, because documentation is easily misunderstood.
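
A quick way to try it, as a sketch with made-up sample markup (the variable names are only for illustration). The greedy variant is included to show why the ? after .* matters:

```php
<?php
// Sample markup, invented for the example.
$html = '<p>Hello <b>world</b></p>';

// Lazy quantifier: each match stops at the first ">".
preg_match_all('/<.*?>/', $html, $lazy);
print_r($lazy[0]);     // <p>, <b>, </b>, </p>

// Greedy quantifier for comparison: one match running to the last ">".
preg_match_all('/<.*>/', $html, $greedy);
print_r($greedy[0]);   // the whole string as a single match
```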
Mgccl
gorn wrote:
Code:

preg_match('/<div\s+style="visibility:hidden">(.+?)<\/div>/si',$a,$captured_elements);


(Change the m to an s)

The single line/multiline thing is confusing. But in this case you want all the text to be treated as if it's on a single line (ie \n is just a normal character).

Oh, thanks, it works now.
© 2005-2011 Frihost, forums powered by phpBB.