
I'm writing a Chrome extension, and I need to split a string that contains only text and img tags so that every element of the resulting array is either a single letter or an img tag, for example "a", "b", "c", "<img.../>", "d". I've found a way to do this, str.split(/(<img.*?>|)/), but some elements of the resulting array are empty and I don't know why. Are there any other suitable regexes?

Thank you very much for your help.

  • Can you show your code? Commented Jul 16, 2013 at 11:46
  • You could filter out the empty elements. Commented Jul 16, 2013 at 11:46
  • Do you have two image tags next to each other? That will put a blank entry between them. Commented Jul 16, 2013 at 11:48
  • Sample code: "<img>fdsf<img>dasaadda".split(/(<img.*?>|)/). As you see, no blank entries. I know I could filter out the empty elements, but I'm curious whether there are other ways. Commented Jul 16, 2013 at 11:53
  • There are empty results because your regex has a null alternation. Commented Jul 16, 2013 at 12:03

2 Answers

1

The reason you get empty elements is the same reason you get <img...> in your results. When you use capturing parentheses in a split pattern, the result will contain the captures in the places where the delimiters were found. Since you have (<img.*?>|), you match (and capture) an empty string whenever the second alternative is used. Unfortunately, moving the alternation outside the group, as in (<img.*?>)|, doesn't help on its own, because the group then doesn't participate in those matches and you get undefined instead of empty strings. However, you can easily filter those out:

str.split(/(<img[^>]*>)|/).filter(function(el) { return el !== undefined; });

This will still get you empty elements at the beginning and the end of the string as well as between adjacent <img> tags, though. So splitting <img><img> would result in

["", "<img>", "", "<img>", ""]

If you don't want that, the filter function becomes even simpler:

str.split(/(<img[^>]*>)|/).filter(function(el) { return el; });
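
As a quick check of the filter approach above, here is a minimal sketch that wraps it in a function and runs it on a sample string. The helper name splitLettersAndImgs and the sample input are made up for illustration; they are not from the question.

```javascript
// Hypothetical helper name, used only for this illustration.
function splitLettersAndImgs(str) {
    // The capture group keeps each <img> tag in the result; the empty
    // alternative after "|" splits the remaining text between characters.
    return str.split(/(<img[^>]*>)|/).filter(function (el) {
        // Drops undefined (the empty alternative matched, so the group
        // did not participate) and "" (string edges, adjacent tags).
        return el;
    });
}

// Sample input (an assumption): letters, one tag with attributes,
// and two adjacent tags at the end.
var parts = splitLettersAndImgs('abc<img src="x.png" />d<img><img>');
console.log(parts);
// ["a", "b", "c", "<img src=\"x.png\" />", "d", "<img>", "<img>"]
```

Note that this filter drops every falsy element, which is fine here because the only falsy values split can produce are undefined and the empty string.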

1 Comment

Thank you, it works. Well, it seems that there is no other solution except using filter so I'll accept your answer.
1

You can use exec instead of split to obtain the separated elements:

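var str = 'abc<img src="jkhjhk" />d';
// Match a whole <img> tag first, otherwise a single letter.
// Characters that are neither (digits, spaces, ...) are skipped.
var myRe = /<img[^>]*>|[a-z]/gi;
var match;
var res = [];

// With the g flag, each exec call resumes where the previous match ended.
while ((match = myRe.exec(str)) !== null) {
    res.push(match[0]);
}
console.log(res);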
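
If the loop feels heavy, the same pattern should also work with String.prototype.match, which returns all matches at once when the regex has the g flag. This is only a sketch of an equivalent form, not part of the answer above; the variable name tokens is made up.

```javascript
var str = 'abc<img src="jkhjhk" />d';

// match() with a global regex returns an array of all matches,
// or null when nothing matches, hence the "|| []" fallback.
var tokens = str.match(/<img[^>]*>|[a-z]/gi) || [];

console.log(tokens);
// ["a", "b", "c", "<img src=\"jkhjhk\" />", "d"]
```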

1 Comment

Thank you for your answer. Of course I can use exec, but I wanted to solve this task using split.
