Translate BBCode into HTML

Question

BBCode is a markup language commonly used in web forum software in the 2000s and 2010s. Your task is to write a program or function that translates BBCode to HTML according to the following spec. (BBCode implementations vary wildly in the real world, but for this challenge it is defined as follows.)

Tags

[b]text[/b] → <strong>text</strong>
[i]text[/i] → <em>text</em>
[u]text[/u] → <u>text</u>
[s]text[/s] → <s>text</s>
[code]text[/code] → <code>text</code> (contents are not translated)
[url]link[/url] → <a href="link">link</a>
[url=link]text[/url] → <a href="link">text</a>
[img]link[/img] → <img src="link">
[color=value]text[/color] → <span style="color:value">text</span>
[size=value]text[/size] → <span style="font-size:value">text</span>
[quote]text[/quote] → <blockquote>text</blockquote>
[quote=name]text[/quote] → <blockquote><cite>name</cite>text</blockquote>

Details

BBCode tags are case insensitive ([b][/b], [B][/b], and [B][/B] are all valid), but HTML output tags must be lowercase.
Nested tags are valid (see the Nesting section for further details): [b][i]text[/i][/b] → <strong><em>text</em></strong>
Unmatched, malformed, or unknown tags are left as literal text: [b]foo → [b]foo, [b ]foo[/b] → [b ]foo[/b], [/b] → [/b], [bar]foo[/bar] → [bar]foo[/bar]
All attributes are valid; don't worry about URI validation, color/size name validation, anti-XSS, etc.
Empty tags are valid: [b][/b] → <strong></strong>

Input attributes and input text inside (valid) tags (except for [code]) will not contain [, ], =, or ".

Nesting

Tags close in LIFO order (like a stack).

Inputs will always have properly nested tags.
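
As a point of reference for the rules above, here is a minimal, non-golfed Python sketch of one way to implement them. It is not taken from any of the answers. The names (bbcode_to_html, render, can_open, code_end) are made up for illustration, and the handling of cases the spec leaves open (for example, treating [color] without a value or [b=x] as literal text) is an assumption.

```python
import re

# One tag: optional "/", a name, and an optional "=attribute".
TAG = re.compile(r"\[(/?)(\w+)(?:=([^\[\]]+))?\]")
CODE = re.compile(r"\[(/?)code\]", re.I)
SIMPLE = {"b": "strong", "i": "em", "u": "u", "s": "s"}


def can_open(name, attr):
    """Assumed attribute rules: url/quote optional, color/size required, others none."""
    if name in ("url", "quote"):
        return True
    if name in ("color", "size"):
        return attr is not None
    return (name in SIMPLE or name == "img") and attr is None


def code_end(s, start):
    """Find the [/code] that closes a block whose contents begin at start."""
    depth = 1
    for m in CODE.finditer(s, start):
        depth += -1 if m[1] else 1
        if depth == 0:
            return m
    return None


def render(name, attr, inner):
    """Build the HTML for one matched pair of tags."""
    if name in SIMPLE:
        return f"<{SIMPLE[name]}>{inner}</{SIMPLE[name]}>"
    if name == "url":
        return f'<a href="{attr or inner}">{inner}</a>'
    if name == "img":
        return f'<img src="{inner}">'
    if name == "quote":
        cite = f"<cite>{attr}</cite>" if attr else ""
        return f"<blockquote>{cite}{inner}</blockquote>"
    prop = "color" if name == "color" else "font-size"
    return f'<span style="{prop}:{attr}">{inner}</span>'


def bbcode_to_html(s):
    out = []    # output fragments; an opening tag stays literal until it is closed
    stack = []  # open tags as (name, attribute, index of the tag's fragment in out)
    pos = 0
    while (m := TAG.search(s, pos)):
        out.append(s[pos:m.start()])
        pos = m.end()
        closing, name, attr = m[1], m[2].lower(), m[3]
        if name == "code" and not closing and attr is None:
            end = code_end(s, pos)
            if end:  # copy the block verbatim and skip past it
                out.append(f"<code>{s[pos:end.start()]}</code>")
                pos = end.end()
                continue
        elif closing and attr is None:
            # nearest open tag with the same name (LIFO)
            k = next((i for i in reversed(range(len(stack)))
                      if stack[i][0] == name), None)
            if k is not None:
                _, open_attr, i = stack[k]
                inner = "".join(out[i + 1:])
                del out[i:], stack[k:]
                out.append(render(name, open_attr, inner))
                continue
        elif not closing and can_open(name, attr):
            stack.append((name, attr, len(out)))
        out.append(m[0])  # literal for now (or for good, if never closed)
    out.append(s[pos:])
    return "".join(out)
```

Because an opening tag is kept as literal text until its closing tag arrives, unclosed and unknown tags fall through unchanged without any special casing.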

Test cases

Input: [b]bold[/b]
Output: <strong>bold</strong>

Input: [B]BOLD[/B]
Output: <strong>BOLD</strong>

Input: [b][i]bold italic[/i][/b]
Output: <strong><em>bold italic</em></strong>

Input: [url]http://example.com[/url]
Output: <a href="http://example.com">http://example.com</a>

Input: [url=http://example.com]click here[/url]
Output: <a href="http://example.com">click here</a>

Input: [img]http://example.com/image.png[/img]
Output: <img src="http://example.com/image.png">

Input: [color=red]red text[/color]
Output: <span style="color:red">red text</span>

Input: [size=20px]big text[/size]
Output: <span style="font-size:20px">big text</span>

Input: [quote]someone said this[/quote]
Output: <blockquote>someone said this</blockquote>

Input: [quote=John]someone said this[/quote]
Output: <blockquote><cite>John</cite>someone said this</blockquote>

Input: [b]unclosed tag
Output: [b]unclosed tag

Input: unopened tag[/b]
Output: unopened tag[/b]

Input: [unknown]text[/unknown]
Output: [unknown]text[/unknown]

Input: [b]nested [i]tags[/i] work[/b]
Output: <strong>nested <em>tags</em> work</strong>

Input: [url=http://test.com][b]bold link[/b][/url]
Output: <a href="http://test.com"><strong>bold link</strong></a>

Input: plain text with no tags
Output: plain text with no tags

Input: [code]<script>alert('hi')</script>[/code]
Output: <code><script>alert('hi')</script></code>

Input: [CoLoR=blue]case test[/color]
Output: <span style="color:blue">case test</span>

Input: [code][b]not bold[/b][/code]
Output: <code>[b]not bold[/b]</code>

Input: [code][url=http://test.com]link[/url][/code]
Output: <code>[url=http://test.com]link[/url]</code>

Input: [b][code]tags[/code] outside[/b]
Output: <strong><code>tags</code> outside</strong>

Input: [b][i][u]triple nested[/u][/i][/b]
Output: <strong><em><u>triple nested</u></em></strong>

Input: [color=red][b]colored bold[/b][/color]
Output: <span style="color:red"><strong>colored bold</strong></span>

Input: [quote=Alice][b]bold quote[/b][/quote]
Output: <blockquote><cite>Alice</cite><strong>bold quote</strong></blockquote>

Input: [url=http://test.com][color=blue]styled link[/color][/url]
Output: <a href="http://test.com"><span style="color:blue">styled link</span></a>

Input: [code][code]nested code[/code][/code]
Output: <code>[code]nested code[/code]</code>

Input: [u]foo[/u]
Output: <u>foo</u>

Input: plaintext
Output: plaintext

Input: [code]left[/code][code]right[/code]
Output: <code>left</code><code>right</code>

This is code-golf. Standard loopholes are forbidden.

HTML has <b> and <i> tags and they are semantically different from <strong> and <em>. Not requesting to change rules, but why do you define the mapping that way?
– Explorer09, Commented Jan 29 at 4:05
@Explorer09 www.bbcode.org does translate [b] into <strong> and [i] into <em>. And so does Markdown on this very site with **this** and *this*.
– Arnauld, Commented Jan 29 at 4:51
"text will never contain [, ]" seems to contradict "malformed, or unknown tags are left as literal text".
– Arnauld, Commented Jan 29 at 4:56
@qarz <b> and <strong> have different semantics in HTML5 and are not interchangeable. <b> is a general marker for drawing attention or for keywords. <strong> is for marking importance in text/speech. The **this** in Markdown translates better to <b> and not <strong>. Likewise for <i> (general marker for loanwords, scientific terms and titles) and <em> (stress or emphasis in speech). See also: FAQ from WHATWG
– Explorer09, Commented 2 days ago
@Explorer09 i don't think it's very relevant to be pedantic about semantic html when we're talking about software that is nowadays largely considered obsolete, used for informal conversation by non-technical users.
– Themoonisacheese, Commented 2 days ago

mastaH · Accepted Answer · 2026-01-30 12:38:12Z

Perl 5, 485 bytes

undef$/;$_=<>;s/\[code]((?:(?R)|.)*?)\[\/code]/push@a,$1;"\0"/geis;1while s{\[(b|i|u|img|url|color|size|quote)(?:=([^]]+))?](.*?)\[/\1]}{($n,$x,$y)=(lc$1,$2,$3);$n=~/^u/?$n=~/l/?"<a href=\"".($x||$y)."\">$y</a>":"<u>$y</u>":$n=~/^i/?$n=~/m/?"<img src=\"$y\">":"<em>$y</em>":$n=~/^b/?"<strong>$y</strong>":$n=~/^q/?"<blockquote>".($x?"<cite>$x</cite>":"")."$y</blockquote>":"<span style=\"".($n=~/c/?"color":"font-size").":$x\">$y</span>"}gie;s/\0/"<code>".shift(@a)."<\/code>"/ge;print

Detailed explanation

undef$/;$_=<>;

Undefines the input record separator $/ so that $_=<>; reads the entire input (newlines and all) into the default variable $_ at once.

s/\[code]((?:(?R)|.)*?)\[\/code]/push@a,$1;"\0"/geis;

Matches each [code] block (the (?R) recursion lets a block contain nested [code] blocks) and replaces the whole block with a null byte \0, saving its contents in the array @a for later.
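
The placeholder idea can be sketched in Python as follows. This is an illustration rather than a port: Python's re module has no (?R) recursion, so this version only handles [code] blocks that are not nested, and the function name stash_code is made up.

```python
import re


def stash_code(text):
    """Replace each (non-nested) [code] block with a NUL byte and save its contents."""
    saved = []

    def stash(match):
        saved.append(match.group(1))  # keep the raw contents for later
        return "\0"                   # placeholder that no tag regex will touch

    stripped = re.sub(r"\[code\](.*?)\[/code\]", stash, text,
                      flags=re.I | re.S)
    return stripped, saved
```

After this step the remaining substitutions can run on the stripped text without ever seeing what was inside the code blocks.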

1 while s{\[(b|i|u|img|url|color|size|quote)(?:=([^]]+))?](.*?)\[/\1]}{
  ($n,$x,$y)=(lc$1,$2,$3);
  $n=~/^u/?$n=~/l/?"<a href=\"".($x||$y)."\">$y</a>":"<u>$y</u>":
  $n=~/^i/?$n=~/m/?"<img src=\"$y\">":"<em>$y</em>":
  $n=~/^b/?"<strong>$y</strong>":
  $n=~/^q/?"<blockquote>".($x?"<cite>$x</cite>":"")."$y</blockquote>":
  "<span style=\"".($n=~/c/?"color":"font-size").":$x\">$y</span>"
}gie;

s{...}{...}gie replaces every matching pair of tags in the string. The 1 while loop repeats the substitution until nothing matches any more, which is how tags nested inside an already converted pair get converted.

$n is the tag name (lowercased)
$x is the tag attribute, if present
$y is the text within the tag

Inside the replacement, a chain of nested ternaries (acting as if/else) picks the output for the tag:

Starts with u? -> u or url
Starts with i? -> img or i
Starts with b? -> b
Starts with q? -> quote (adds <cite>...</cite> only if an author was defined)
Else: it must be color or size. These are merged, since both output a <span>; whether the name contains c decides between color: and font-size:.
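
The repeat-until-stable loop can be sketched in Python as below, reduced to the b, i and u tags to keep it short. The names PAIR, HTML and convert_simple are made up for this illustration; re.subn is used because it returns the number of replacements, which plays the role of Perl's 1 while.

```python
import re

# An opening tag, lazily matched contents, and the closing tag of the same name.
PAIR = re.compile(r"\[(b|i|u)\](.*?)\[/\1\]", re.I | re.S)
HTML = {"b": "strong", "i": "em", "u": "u"}


def convert_simple(text):
    def repl(m):
        tag = HTML[m.group(1).lower()]
        return f"<{tag}>{m.group(2)}</{tag}>"

    count = 1
    while count:  # stop once a full pass replaces nothing
        text, count = PAIR.subn(repl, text)
    return text
```

On [b][i]x[/i][/b] the first pass converts only the outer pair, because the inner tags are consumed as part of its contents; the second pass then converts the inner pair.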

s/\0/"<code>".shift(@a)."<\/code>"/ge;
print

s/\0/.../ge finds every null-byte placeholder created in the first step and replaces it with the next saved block from @a, wrapped in <code>...</code>. shift(@a) hands the blocks back in the order they were saved (FIFO).
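
A Python sketch of this restore step, assuming the text contains one NUL placeholder per saved block. The function name restore_code is hypothetical; a deque stands in for the Perl array so that popleft mirrors shift.

```python
from collections import deque


def restore_code(text, saved):
    """Put saved code blocks back in place of the NUL placeholders, first in, first out."""
    queue = deque(saved)
    pieces = text.split("\0")
    out = [pieces[0]]
    for piece in pieces[1:]:
        # each split point corresponds to exactly one placeholder
        out.append(f"<code>{queue.popleft()}</code>")
        out.append(piece)
    return "".join(out)
```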

Themoonisacheese · Accepted Answer · 2026-01-30 12:53:30Z

Retina, 675 bytes

i`\[code](.*)\[/code]
==$1==
i`(?<!==.*)\[b](.*)\[/b](?!.*==)
<strong>$1</strong>
i`(?<!==.*)\[i](.*)\[/i](?!.*==)
<em>$1</em>
i`(?<!==.*)\[(.)](.*)\[/\1](?!.*==)
<$1>$2</$1>
i`(?<!==.*)\[color=(.*)](.*)\[/color](?!.*==)
<span style="color:$1">$2</span>
i`(?<!==.*)\[size=(.*)](.*)\[/size](?!.*==)
<span style="font-size:$1">$2</span>
i`(?<!==.*)\[url](.*)\[/url](?!.*==)
[url=$1]$1[/url]
i`(?<!==.*)\[url=(.*)](.*)\[/url](?!.*==)
<a href="$1">$2</a>
i`(?<!==.*)\[img](.*)\[/img](?!.*==)
<img src="$1">
i`(?<!==.*)\[quote=(.*)](.*)\[/quote](?!.*==)
[quote]<cite>$1</cite>$2[/quote]
i`(?<!==.*)\[quote](.*)\[/quote](?!.*==)
<blockquote>$1</blockquote>
==(.*)==
<code>$1</code>

-20 bytes: I discovered by accident that ] matches that character and it doesn't need to be escaped when it doesn't close a character class

+0 byte: fixing a typo, saved another ]

a big mess but by the first 5 minutes i already wanted to be done writing regex.

Explanation

All expressions are case-insensitive.
First, I replace all code tags with ==, which is guaranteed not to appear elsewhere.
I use negative look(ahead|behind) to assert that the things I'm matching aren't enclosed in == in all subsequent expressions, properly escaping whatever is in code tags (this is shorter than checking for the full [code] tag every time).
I replace b and i tags with their respective HTML.
I replace the remaining 1-letter tags with the same tag as HTML.
I replace color and size tags (an optimization here could allow first replacing size with font-size, then matching and replacing both, but I couldn't find a way where that's shorter).
I replace urls without an = with ones that have one.
I replace all urls properly.
I replace all images.
I replace quotes with an = with ones without, which just contain the literal <cite> tag.
I replace quote tags properly.
Finally, I replace my custom == tags with their contents in a <code> tag.
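
The "normalize first, then convert" trick used for url tags can be sketched in Python like this. It is only an illustration of the two-step idea: the function name convert_urls is made up, the character classes are tighter than the (.*) used in the answer, and the == guard for code blocks is left out.

```python
import re


def convert_urls(text):
    # Step 1: give a bare [url]x[/url] an explicit parameter, [url=x]x[/url].
    text = re.sub(r"\[url\]([^\[\]]*)\[/url\]", r"[url=\1]\1[/url]",
                  text, flags=re.I)
    # Step 2: a single rule now covers every url tag.
    return re.sub(r"\[url=([^\[\]]*)\](.*?)\[/url\]", r'<a href="\1">\2</a>',
                  text, flags=re.I | re.S)
```

The same pattern applies to quote tags, where the parameter is moved into a literal <cite> element before the plain [quote] rule runs.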

@Seggan to be quite honest it wasn't fun, it wasn't rewarding and the documentation sucked. I do not recommend it.
– Themoonisacheese, Commented 2 days ago
there might be a typo for the closing <blockquote> as it doesn't have the / character
– mastaH, Commented yesterday
For a start, here's an obvious 19 byte saving: Try it online! (input removed due to comment length limitations). Using Retina 0.8.2 here just to prove you're not using Retina 1's power.
– Neil, Commented yesterday

Arnauld · Accepted Answer · 2026-01-31 17:20:25Z

JavaScript (ES12), 478 bytes

s=>(a=s.split(/(\[.+?\])/)).map(S=(s,i)=>i&1?([,C,t,,p]=/.(\/)?(\w+)(=(.+))?./.exec(s),n=`url|color|size|b|i|quote|img|u|s|code`.split`|`.indexOf(t=t.toLowerCase()),C)?!([j,p]=S[t]?.pop()||[],T=`|a href="0"||21:0"||2font-1:0"|strong||em||block1|block1><cite>0</cite|1 src="0"||u||s||1`.split`|`[n*2|!!p||1]?.replace(/\d/g,n=>[p||a[j+1],t,'span style="'][n]),c-=c&&n>8)*j&&T?a[a[j]=`<${T}>`,i]=n-6?`</${/\w+/.exec(T)}>`:a[j+1]='':0:(c+=n>8,S[t]||=[]).push([i,p]):0,c=0)&&a.join``

Method

We split the input string on pseudo-legal BBCode tags, placing the tags at odd positions and the remaining parts at even positions. For instance:

[b][i]foo[/i][/b] → ["","[b]","","[i]","foo","[/i]","","[/b]",""]
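
Python's re.split behaves the same way as the JavaScript split when the pattern contains a capturing group, so the layout can be reproduced with this small sketch (split_tags is a made-up name):

```python
import re


def split_tags(text):
    """Split on pseudo-legal tags; tags land at odd indices, plain text at even ones."""
    return re.split(r"(\[.+?\])", text)


parts = split_tags("[b][i]foo[/i][/b]")
tags = parts[1::2]   # odd positions: the bracketed tags
texts = parts[0::2]  # even positions: everything in between
```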

We then iterate over the resulting array, modifying it in-place whenever a valid pair of opening and closing tags is found.

There is one stack per tag type in S, storing the position of each opening tag and its BBCode parameter, if any.
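
A rough Python sketch of the one-stack-per-tag idea, working on the split array described above. The name pair_by_tag and the returned tuple layout are assumptions made for this example; the golfed answer rewrites the array in place instead of returning pairs.

```python
import re
from collections import defaultdict


def pair_by_tag(parts):
    """Return (open_index, close_index, parameter) for every matched pair of tags."""
    stacks = defaultdict(list)  # tag name -> stack of (position, parameter)
    pairs = []
    for i in range(1, len(parts), 2):  # tags sit at odd positions
        m = re.fullmatch(r"\[(/?)(\w+)(?:=(.+))?\]", parts[i])
        if not m:
            continue
        closing, name, param = m.group(1), m.group(2).lower(), m.group(3)
        if not closing:
            stacks[name].append((i, param))
        elif stacks[name]:
            j, p = stacks[name].pop()  # most recent opener of the same name
            pairs.append((j, i, p))
    return pairs
```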

The counter c is used to keep track of the nesting depth of code blocks, allowing us to disable HTML conversion inside them.
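
The role of the counter can be illustrated with the following Python sketch, which only marks which tags are eligible for conversion. The function name mark_convertible is hypothetical, and the bookkeeping is simplified compared with the golfed code.

```python
import re


def mark_convertible(text):
    """Return (tag, convertible) pairs; tags nested inside a [code] block are not convertible."""
    depth = 0  # how many [code] blocks are currently open
    marked = []
    for tag in re.split(r"(\[.+?\])", text)[1::2]:
        name = tag.strip("[]").lower()
        if name == "/code" and depth:
            depth -= 1  # leave the block before judging the closing tag
        marked.append((tag, depth == 0))
        if name == "code":
            depth += 1  # enter the block after judging the opening tag
    return marked
```

Only the outermost [code] and [/code] are marked convertible; everything between them, including nested [code] tags, is left as literal text.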

We use two lookup tables:

one with 10 entries to identify the BBCode tags
one with 20 entries for the corresponding HTML tags, with and without a BBCode parameter

Commented

s =>
// split the input string on pseudo-legal BBCode tags '[…]'
(a = s.split(/(\[.+?\])/))
// for each part s at index i
.map(S = (s, i) =>
  i & 1 ?
    // if this is a tag
    (
      // C = 'closing tag' flag, t = tag name, p = optional parameter
      [, C, t,, p] = /.(\/)?(\w+)(=(.+))?./.exec(s),
      // force t to lowercase and get n = internal BBCode tag ID
      n =
      // 0   1     2    3 4 5     6   7 8 9
        `url|color|size|b|i|quote|img|u|s|code`
        .split`|`
        .indexOf(t = t.toLowerCase()),
      C
    ) ?
      // if this is a closing tag
      !(
        // attempt to retrieve j = position of the opening tag
        // and p = parameter of the opening tag
        [j, p] = S[t]?.pop() || [],
        // T = HTML tag determined by n and the presence of a parameter
        // (NB: an 'url' tag is always forced to entry #1)
        T =
        //  1           3      5          6       8
          `|a href="0"||21:0"||2font-1:0"|strong||em||` +
        // 10     11                   12         14 16 18
          `block1|block1><cite>0</cite|1 src="0"||u||s||1`
          .split`|`
          [n * 2 | !!p || 1]
          // unpack: 0 → p or a[j + 1], 1 → t, 2 → 'span style="'
          ?.replace(/\d/g, n => [ p || a[j + 1], t, 'span style="'][n]),
        // decrement c if it's greater than 0 and the tag is 'code'
        c -= c && n > 8
      ) * j && T ?
        // if c is 0 (not inside a code block) and both j and T are defined,
        // replace the opening tag with T
        a[a[j] = `<${T}>`, i] =
          n - 6 ?
            // if this is not an 'img' tag, update the closing tag
            // using the name extracted from T
            `</${/\w+/.exec(T)}>`
          :
            // otherwise, clear both a[j + 1] and the closing tag
            a[j + 1] = ''
      :
        // invalid tag: do nothing
        0
    :
      // opening tag: push [i, p] onto this tag's stack
      // and increment c if the tag is 'code'
      (c += n > 8, S[t] ||= []).push([i, p])
  :
    // this is not a tag: do nothing
    0,
  // c = code block counter, initialized to 0
  c = 0
)
// end of map(), return a[] joined
&& a.join``

qarz · Accepted Answer · 2026-01-29 18:44:13Z

Python 3.12, ̶1̶4̶9̶1̶ ̶ 1484 bytes

J=len
import re
def A(s):
 U='color';T='img';S='url';R='span';Q='strong';L='code';Z='blockquote';G=[];C='';I=F=0;K={};D=list(re.finditer('\\[(/?)(\\w+)(?:=([^\\]]+))?\\]',s,re.I))
 for(B,E)in enumerate(D):
  if E[1]:continue
  A=E[2].lower();H=B+1;O=A==L
  while H<J(D):
   if D[H][2].lower()==A:
    if A==L:O=O+1-2*bool(D[H][1])
    if A==L and O==0 or A!=L and D[H][1]:K[B]=H;break
   H+=1
 while F<J(D):
  E=D[F];C+=s[I:E.start()];I=E.end();A=E[2].lower();M=E[3];P=E[1]
  if G and G[-1][0]==L:
   if P and A==L and F==K[G[-1][1]]:C+='</code>';G.pop()
   else:C+=E[0]
  elif P:
   if G and G[-1][0]==A:C+=f"</{dict(b=Q,i='em',quote=Z,url='a',color=R,size=R).get(A,A)}>";G.pop()
   else:C+=E[0]
  elif A in'biuscode':
   if F in K:C+=f"<{dict(b=Q,i='em').get(A,A)}>";G+=[(A,F)]
   else:C+=E[0]
  elif'quote'==A:
   if F in K:C+=f'<{Z}><cite>{M}</cite>'if M else f'<{Z}>';G+=[(A,F)]
   else:C+=E[0]
  elif A==S:
   if M:
    if F in K:C+=f'<a href="{M}">';G+=[(A,F)]
    else:C+=E[0]
   else:
    B=F+1
    while B<J(D)and not(D[B][1]and D[B][2].lower()==S):B+=1
    if B<J(D):N=s[I:D[B].start()];C+=f'<a href="{N}">{N}</a>';I=D[B].end();F=B
    else:C+=E[0]
  elif A==T:
   B=F+1
   while B<J(D)and not(D[B][1]and D[B][2].lower()==T):B+=1
   if B<J(D):N=s[I:D[B].start()];C+=f'<img src="{N}">';I=D[B].end();F=B
   else:C+=E[0]
  elif A in'colorsize':
   H=U if A==U else'font-size'
   if F in K:C+=f'<span style="{H}:{M}">';G+=[(A,F)]
   else:C+=E[0]
  else:C+=E[0]
  F+=1
 return C+s[I:]

Few trivial bytesaves possible, for example elif A=='quote' can be changed to elif'quote'==A, if A==L and O==0 or A!=L and D[H][1] can become D[H][1]and(A==L)*(O==0)*(A!=L), etc.
– CrSb0001, Commented 2 days ago
You might want to check this question for more general tips, maybe see if you can incorporate some of those in the answer
– CrSb0001, Commented 2 days ago

qarz · Accepted Answer · 2026-01-31 20:37:08Z

C, 1493 bytes

#include<stdio.h>
#include<string.h>
#define P printf
char I[99999],*T[]={"b","i","u","s","code","url","img","color","size","quote"},A[999][256],t[256],a[256],*O[]={"strong","em","u","s","code"};int S[999],R[999],s,o[99999],c[99999],L,z,e,n,Y,x,d,i,k;int q(char*u,char*v){for(;*u&&*v;u++,v++)if((*u|32)!=(*v|32))return 0;return!*u&&!*v;}int y(char*m){for(k=0;k<10;k++)if(q(m,T[k]))return k;return-1;}int p(int r){int k=r+1;z=0;*a=0;if(I[k]==47)z=1,k++;n=0;while(I[k]&&I[k]-93&&I[k]-61&&I[k]-91)t[n++]=I[k++];t[n]=0;if(I[k]==61&&!z){k++;n=0;while(I[k]&&I[k]-93&&I[k]-91)a[n++]=I[k++];a[n]=0;}return I[k]==93?e=k+1:0;}int main(){L=fread(I,1,99999,stdin);for(i=0;i<L;i++)o[i]=c[i]=-1;for(i=0;i<L;i++)if(I[i]==91&&p(i)){Y=y(t);if(x){if(Y==4&&z&&!--d){x=0;if(s&&S[s-1]==4)s--,o[R[s]]=4,c[i]=4;}else if(Y==4&&!z)d++;continue;}if(~Y){if(z){if(s&&S[s-1]==Y)s--,o[R[s]]=Y,strcpy(A[R[s]],A[s]),c[i]=Y;}else{if((Y==7|Y==8)&&!*a)continue;S[s]=Y;strcpy(A[s],a);R[s++]=i;if(Y==4)x=1,d=1;}}}for(i=0;i<L;){if(~o[i]){Y=o[i];char*v=A[i];p(i);if(Y<5)P("<%s>",O[Y]);else if(Y<7){if(Y<6&&*v)P("<a href=\"%s\">",v);else{n=0;for(;e<L&&c[e]-Y;)t[n++]=I[e++];t[n]=0;Y<6?P("<a href=\"%s\">%s</a>",t,t):P("<img src=\"%s\">",t);p(e);i=e;goto N;}}else if(Y<9)P("<span style=\"%s:%s\">",Y<8?"color":"font-size",v);else{P("<blockquote>");if(*v)P("<cite>%s</cite>",v);}i=e;}else if(~c[i]){Y=c[i];p(i);Y<4?P("</%s>",O[Y]):Y<5?P("</code>"):Y<6?P("</a>"):Y<7?0:Y<9?P("</span>"):P("</blockquote>");i=e;}else putchar(I[i++]);N:;}}
