Extract image src from a string
I'm trying to match all the image elements in a string.
This is my regex:
html.match(/<img[^>]+src="http([^">]+)/g);
This works, but I want to extract the src of all the images. So when I execute the regular expression on this string:
<img src="http://static2.ccn.com/ccs/2013/02/img_example.jpg" />
I want it to return:
"http://static2.ccn.com/ccs/2013/02/img_example.jpg"
Solution 1:
You need to use a capture group, (), to extract the URLs. If you want to match globally (the g flag), i.e. more than once, while using capture groups, you need to call exec in a loop, because match ignores capture groups when matching globally.
For example:
var m,
    urls = [],
    str = '<img src="http://site.org/one.jpg" />\n <img src="http://site.org/two.jpg" />',
    rex = /<img[^>]+src="?([^"\s]+)"?\s*\/>/g;

// exec returns one match per call and null once there are no more matches
while ( m = rex.exec( str ) ) {
    urls.push( m[1] ); // m[1] is the first capture group, i.e. the URL
}

console.log( urls );
// [ "http://site.org/one.jpg", "http://site.org/two.jpg" ]
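If your environment supports ES2020, String.prototype.matchAll may be a loop-free alternative, because unlike match it keeps the capture groups of a global regex. A minimal sketch reusing the pattern from this answer; the helper name extractSrcs is hypothetical, and the pattern still assumes self-closing tags with src as the last attribute:

```javascript
// Hypothetical helper: collects the first capture group of every <img> match.
function extractSrcs(html) {
  // Same pattern as in the answer above; matchAll requires the g flag.
  var rex = /<img[^>]+src="?([^"\s]+)"?\s*\/>/g;
  return Array.from(html.matchAll(rex), function (m) {
    return m[1]; // m[1] is the captured URL
  });
}

var sample = '<img src="http://site.org/one.jpg" />\n <img src="http://site.org/two.jpg" />';
console.log(extractSrcs(sample));
// [ "http://site.org/one.jpg", "http://site.org/two.jpg" ]
```

When nothing matches, this returns an empty array rather than null, which saves a null check at the call site.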
Solution 2:
var myRegex = /<img[^>]+src="(http:\/\/[^">]+)"/g;
var test = '<img src="http://static2.ccn.com/ccs/2013/02/CC_1935770_challenge_accepted_pack_x3_indivisible.jpg" />';
var result = myRegex.exec(test);
// result[1] is the captured URL; result is null if nothing matched
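Since exec returns a single match array (or null) with the URL at index 1, and a regex with the g flag resumes from its lastIndex on each call, collecting every URL takes a loop. A rough sketch under some assumptions: the helper name collectUrls is hypothetical, widening the scheme to https? is an addition that is not part of the pattern above, and the example.com URLs are placeholders:

```javascript
// Hypothetical helper built on the pattern above; "https?" is an added assumption.
function collectUrls(html) {
  var re = /<img[^>]+src="(https?:\/\/[^">]+)"/g;
  var urls = [];
  var match;
  // exec returns null once no further match is found, which ends the loop.
  while ((match = re.exec(html)) !== null) {
    urls.push(match[1]); // index 1 holds the captured URL
  }
  return urls;
}

console.log(collectUrls('<img src="https://example.com/a.png" /><img alt="b" src="http://example.com/b.png">'));
// [ "https://example.com/a.png", "http://example.com/b.png" ]
```

Because the regex is created inside the function, lastIndex starts at 0 on every call, so repeated calls do not interfere with each other.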
Solution 3:
As Mathletics mentioned in a comment, there are other, more straightforward ways to retrieve the src attribute from your <img> tags, such as retrieving a reference to the DOM node via id, name, class, etc. and then using that reference to extract the information you need. If you need to do this for all of your <img> elements, you can do something like this:
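var imageTags = document.getElementsByTagName("img"); // Returns a live HTMLCollection of <img> DOM nodes
var sources = [];
// Use an index loop: for...in would also visit properties such as length and item
for (var i = 0; i < imageTags.length; i++) {
    sources.push(imageTags[i].src);
}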
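The same idea can be written against any array-like collection, which makes it easy to try outside a browser. A rough sketch, where collectSources is a hypothetical helper and the document check is only there so the snippet also runs where no DOM exists:

```javascript
// Hypothetical helper: maps an array-like of elements (e.g. an HTMLCollection)
// to the value of each element's src property.
function collectSources(nodes) {
  return Array.from(nodes, function (node) {
    return node.src;
  });
}

// In a browser, pass the collection of <img> elements on the page.
if (typeof document !== "undefined") {
  console.log(collectSources(document.getElementsByTagName("img")));
}
```

Note that in a browser the src property yields the resolved absolute URL, while getAttribute("src") returns the attribute exactly as written in the markup.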
However, if you have some restriction forcing you to use regex, then the other answers provided will work just fine.