I'm developing a new keyword analysis tool that uses regular expression patterns to query (spider) pieces of web pages. It works just fine with title tags and meta tags, but I run into trouble with values that are not always followed directly by ">, like alt text.
I tried: regExp.Pattern = "alt=""(.*)?"""
This only works if the alt text is followed directly by ">. When more attributes follow, like:
...alt="text text text" style="padding-top:2px;">
I would get:
...text text text" style="padding-top:2px;
I only want the alt text. I also tried:
regExp.Pattern = "\balt=""(.*)?""\b"
This gave the same result, just without the "padding-top:2px; part at the end.
Why won't the pattern stop after the second quotation mark?
If anyone can help, I would really appreciate it. I've read a lot on the web about regular expressions, but nothing has helped. Thanks!
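
For anyone who wants to reproduce this outside VBScript, here is a minimal Python sketch of the behaviour described above. It assumes the cause is that `.*` is greedy, so it runs to the last quotation mark on the line rather than the first one after `alt="`. The `<img` prefix on the sample string is made up for illustration; the rest is the fragment from this post. The two alternatives shown, a negated character class and a lazy quantifier, are common ways to stop at the first closing quote, not necessarily the only ones.

```python
import re

# Sample markup: the attribute fragment from the post, with a hypothetical <img prefix.
html = '<img alt="text text text" style="padding-top:2px;">'

# Original pattern: (.*) is greedy, so it grabs as much as it can and only
# backtracks to the LAST quotation mark on the line.
greedy = re.search(r'alt="(.*)?"', html).group(1)

# Alternative 1: match anything except a quotation mark, so the capture
# cannot run past the closing quote of the alt value.
negated = re.search(r'alt="([^"]*)"', html).group(1)

# Alternative 2: lazy quantifier, which stops at the FIRST closing quote.
lazy = re.search(r'alt="(.*?)"', html).group(1)

print(greedy)
print(negated)
print(lazy)
```

In VBScript string syntax, where each literal quotation mark is doubled, the negated-class version would be written as regExp.Pattern = "alt=""([^""]*)""". The captured text should then be available through the match's SubMatches collection.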