In a recent project, I desperately needed an RTF to HTML converter written in PHP. Googling around turned up some matches, but I could not get them to work properly. Also, one of them called passthru() to use an RTF2HTML executable, which is something I didn’t want. I was looking for an RTF2HTML converter written purely in PHP.

Since I couldn’t find anything ready-made, I sat down and coded one up myself. It’s short, and it works, implementing the subset of RTF tags that you’ll need in HTML and ignoring the rest. As it turns out, the RTF format isn’t that complicated when you really look at it, but it isn’t something you code a parser for in 15 minutes either.

How to use it

Include the file rtf.php somewhere in your project. Then do this:

If you’d like to see what the parser read, then call this:

To convert the parser’s parse tree to HTML, call this:
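
Put together, the three steps might look like the sketch below. The `RtfHtml` class, its `Format()` method and `dump()` are named in the comments on this post; the class name `RtfReader`, the `root` property and the file location are assumptions, so check them against your copy of rtf.php.

```php
<?php
// Sketch only: RtfReader and ->root are assumed names; verify them in rtf.php.
$library = __DIR__ . '/rtf.php';            // hypothetical location of the file
$rtf = '{\rtf1\ansi Hello, {\b world}!}';   // sample input; could also come from file_get_contents()
$html = null;

if (is_file($library)) {
    require_once $library;

    // Step 1: parse the RTF string into a tree.
    $reader = new RtfReader();
    $reader->Parse($rtf);

    // Step 2 (optional): uncomment to print what the parser read.
    // $reader->root->dump();

    // Step 3: convert the parse tree to HTML.
    $formatter = new RtfHtml();
    $html = $formatter->Format($reader->root);
    echo $html;
}
```

The `is_file()` guard is only there so the sketch does nothing when rtf.php is not present next to the script.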

Update 3 Sep ’14:

  • Fixed bug: underlining would start but never end. Now it does.
  • Feature request: images are now filtered out of the output.
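
The underline fix comes down to an asymmetry discussed in the comments: bold is switched off with `\b0`, while the RTF samples posted there end underlining with `\ulnone`. A minimal sketch of tracking both, using a hypothetical helper `applyControlWord()` that is not taken from rtf.php:

```php
<?php
// Hypothetical helper, not part of rtf.php: updates a formatting state array
// for one RTF control word and its optional numeric parameter.
function applyControlWord(array $state, string $word, ?int $param = null): array
{
    switch ($word) {
        case 'b':       // \b turns bold on, \b0 turns it off
            $state['bold'] = ($param !== 0);
            break;
        case 'i':       // \i and \i0 work the same way for italics
            $state['italic'] = ($param !== 0);
            break;
        case 'ul':      // \ul turns underline on; \ul0 is treated as "off" here as well
            $state['underline'] = ($param !== 0);
            break;
        case 'ulnone':  // \ulnone is what ends underlining in the samples in the comments
            $state['underline'] = false;
            break;
    }
    return $state;
}

$state = ['bold' => false, 'italic' => false, 'underline' => false];
$state = applyControlWord($state, 'ul');       // underline on
$underlinedDuring = $state['underline'];
$state = applyControlWord($state, 'ulnone');   // underline off again
$underlinedAfter = $state['underline'];
```

Handling `ulnone` as its own case is the point: a formatter that only looks for a `0` parameter on `ul` never sees the underline end.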

Update 13 Aug ’15:

This project is now on GitHub.

The code



44 Responses to “A working RTF to HTML converter in PHP”
  1. Eugene Valeriano says:

    You Save my Life dude.. Thanks!

  2. Anne says:

    Very useful script, thanks! Just one question. This code doesn’t filter embedded images, so the output may contain large text strings. Might it be useful to add an image filter or something? :)

    • alex says:

      That’s a good idea. The principle of this script is simplicity, so I would want to filter out images rather than including them because deciphering them would be expensive for the server. I’ll add an image filter when I update the code.

    • alex says:

      I’ve updated the code. Images are now filtered out.

  3. Pratiksha says:

    Nice Article ..

    It help me a lot

    Thank you.

  4. Klaus Delacroix says:

    Hi, found your module, which really helped me a lot.

    I have found 2 problems, though:

    1. The underline of underlined words is not terminated after the word, but is in effect until the end of the text.

    2. Colours are not supported

    Any idea on how I could teach the module to behave correctly also in these cases?

    • alex says:

      Actually looking at the code I don’t see why underlining doesn’t get turned off. You could make a simple RTF test file with one word underlined, and see what the dump() method says. There’s probably a typo somewhere.

      For colours I would have to go back and look at the RTF spec again. For my project needs I didn’t require underlining or colours, so this was never tested.

      • alex says:

        Actually, Sebastien posted below about the same underline issue. It turns out that RTF’s \ul tag gets closed with \ulnone, not \ul0. I thought it would work like the bold \b tag, which is closed with \b0, but no. This would have to be changed in the code.

    • alex says:

      I’ve updated the code. Underlining now terminates correctly.

  5. Sergio Gabriel says:

    Thanks a lot! I have a little problem: the class doesn’t convert the “tab” control word to “&nbsp;”. How can I do this?

  6. Sergio Gabriel says:

    In the formatControlWord method of the RtfHtml class, I put this: if($word->word == "tab") $this->output .= "&nbsp;"; and it works!

    • alex says:

      Nice work. In order for this code to be more robust, the HTML formatter should be made into a separate class, so you can more easily extend it as you have done. Also, you could have plaintext and XML formatters. But oh well, this was meant to be simple and solve only the RTF to HTML problem.
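
The tab fix above can be sketched in isolation. `controlWordToHtml()` is a hypothetical stand-alone helper, not the real `formatControlWord` method, and the `\line` entry is an extra illustration rather than something rtf.php is known to handle:

```php
<?php
// Hypothetical stand-alone helper; in rtf.php the equivalent logic would sit
// inside the RtfHtml class's formatControlWord method.
function controlWordToHtml(string $word): string
{
    // Control words that map directly to a fixed piece of HTML output.
    $map = [
        'tab'  => '&nbsp;', // same replacement as in the comment above
        'line' => '<br>',   // \line is RTF's forced line break
    ];
    return $map[$word] ?? ''; // unknown control words produce no output
}

$tabHtml = controlWordToHtml('tab');
$unknownHtml = controlWordToHtml('viewkind');
```

Keeping the mapping in one array makes it easy to add further control words without touching the rest of the formatter.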

  7. Chris says:

    This is really A W E S O M E !
    Thank you! :)

  8. Sebastien says:

    I’m trying out this script for a project and everything is underlined. Any clue why?

    RTF:

    {\rtf1\ansi\ansicpg1252\deff0\nouicompat{\fonttbl{\f0\fnil\fcharset0 Tahoma;}{\f1\fnil Tahoma;}{\f2\fnil\fcharset2 Symbol;}}
    {\colortbl ;\red0\green0\blue0;}
    {\*\generator Riched20 6.3.9600}\viewkind4\uc1
    \pard\cf1\ul\b\f0\fs22\lang3084 Maintenance des transporteurs\ulnone\b0\par
    Ajouter une nouvel onglet ‘Param\'e8tres EDI’ dans laquelle on retrouvera :\par

    \pard{\pntext\f2\'B7\tab}{\*\pn\pnlvlblt\pnf2\pnindent0{\pntxtb\'B7}}\fi-200\li200 Dans le haut : \f1\par

    \pard{\pntext\f2\'B7\tab}{\*\pn\pnlvlblt\pnf2\pnindent0{\pntxtb\'B7}}\fi-200\li520\f0 Un titre ‘EDI – Achats’\f1\par
    {\pntext\f2\'B7\tab}\f0 Nom du transporteur \'e0 exporter\f1\par
    {\pntext\f2\'B7\tab}\f0 No de compte d\'e9faut\f1\par


    \pard{\pntext\f2\'B7\tab}{\*\pn\pnlvlblt\pnf2\pnindent0{\pntxtb\'B7}}\fi-200\li200\f0 Une grille\f1\par

    \pard{\pntext\f2\'B7\tab}{\*\pn\pnlvlblt\pnf2\pnindent0{\pntxtb\'B7}}\fi-200\li520\f0 No division\f1\par
    {\pntext\f2\'B7\tab}\f0 Nom de la division (en affichage)\f1\par
    {\pntext\f2\'B7\tab}\f0 No de de compte \'e0 utiliser pour cette division\f1\par

    \ul\f0 Note\ulnone\par
    Le no de compte qui sera utilis\'e9 en priorit\'e9 sera :\par

    1) Selon la division de la commande\par
    2) Selon la division m\'e8re de la commande\par
    3) No de compte d\'e9faut\par


    \pard\ul\b\fs22 Module ‘Bon d’achats’\par
    Dans l’onglet ‘Ent\'eate’ / Sous onglet ‘Termes’, ajouter une section pour ‘EDI’ dans laquelle on pourra saisir un code du transporteur exig\'e9.\par
    \ul\b\fs22 Module ‘R\'e9quisition’\par
    Dans la fen\'eatre ‘Cr\'e9ation des bons d’achat’, onglet ‘Bon d’achats” / Section ‘Termes’ de la grille des BA’s \'e0 \'e9mettre, ajouter une colonne ‘Trp-EDI’ repr\'e9sentant le code du transporteur exig\'e9.\par
    \ul\b\f0\fs22 Module ‘EDI’\par

    \pard{\pntext\f2\'B7\tab}{\*\pn\pnlvlblt\pnf2\pnindent0{\pntxtb\'B7}}\fi-200\li200 Cr\'e9er une nouvelle proc\'e9dure ‘EDI_SP_Export_BonAchEnt_Trp’ qui retournera les infos requises pour construire le segment ‘PO_TRANSPORTEUR’.\par
    {\pntext\f2\'B7\tab}Ajouter cette proc\'e9dure aux proc\'e9dures disponibles dans le catalogue EDI.\par

    \ul\b Important\ulnone\b0 \par
    La d\'e9finition du segment / colonnes EDI et l’ajout du segment \'e0 l’enveloppe n’est pas incluse dans cet estim\'e9. Normalement, vous pouvea effectuer cette t\'e2che. Si toutefois, vous aviez besoin d’assistance, un formateur pourra intervenir en appliquant son temps contre votre banque d’heures.\par

    thx for your time.

  9. Anon says:

    Sensational: there’s only a couple of missing features that I feel would expand it to be able to convert rtf from the majority of basic rtf editors:

    Font (face and colour)
    Alignment (left, center, right)
    List items (bullets and numbering)
    Super / subscript

    Love it – keep up the good work!

  10. Serge says:

    Very nice Alexander !

    A question: when I have a text like “Poëzie” it converts to (shortened):

    Which gives as result : Po ë zie

    Any idea about a quick solution?

    Greetings & thanks Serge

  11. Nick says:

    Everything works, except no “br” or “p” line breaks.
    So my output is one run-on line of text.
    There’s lots of “span”s, but adding a “br” after each didn’t work, because it wrapped a pair of “span”s around the “bullet point” symbol. My adding a “br” after “/span” therefore also put a “br” after the bullet, which put the bullet point text on a new line.

    Provided web hosting services, and website development, in English and Spanish.

    Managed on-going work schedules.
    Is there supposed to be some css that goes with this?
    Is there something wrong with my RTF file?

  12. Doug says:

    Finally a PHP RTF to HTML class that works. Everything else I’ve found is horribly written, buggy, throws errors, doesn’t work, or all of the above.

    Thank you… a million times, thank you. I wasn’t looking forward to writing my own.

  13. Peter says:

    Hi there
    I just stumbled upon your Code in search of a working RTF to HTML Converter. I encounter some problems with your class and I wonder if you could help me.
    When my RTF String is something like this:

    {\rtf1\ansi\ansicpg1252\deff0{\fonttbl{\f0\fnil\fcharset0 MS Sans Serif;}}
    {\colortbl ;\red0\green0\blue0;}
    \viewkind4\uc1\pard\cf1\lang2055\b\f0\fs16 Some Bold Text\b0 , some more text.\par
    \b\i Temp. 22\'b0C is best.\b0\i0\fs20\par

    I get the following result:

    tf1fonttblf0fnilfcharset0 MS Sans Serif;u000biewkind4f0fs16 Some Bold Text, some more text.Temp. 22°C is best.fs20

    If I pull the exact same string from my database I get the following error:

    “errMsg”:”Uninitialized string offset: 407″, “lineNo”:”140″

    And if I use a String with almost no formating like this one:

    “this is some example text \r\n\r\n with two line breaks”

    I get this error message:

    “errMsg”:”array_push() expects parameter 1 to be array, null given”, “lineNo”:”308″

    The Line Numbers should be the same as in the original code. Do you have any Idea how to fix this?
    Thanks in advance.

    Regards Peter

  14. Strablet says:

    Great job. I have spent days looking through other variations and this is the best one I’ve found so far.

    It works right out of the box, with some exceptions, which I think I can manage.

    I live in Asia (Taiwan) and so some special characters get used.

    However, because your code is so well organized, I’m sure I can add a few things here and there to get what I need.

    I love the concept of making two passes at it. That totally rocks. First, strip out the stuff we don’t need. Second, convert what’s left to HTML.

    Thanks for the great work.

  15. Strablet says:

    Just discovered something you might find interesting. When I open Wordpad and re-save the RTF file to itself (it cleans up a lot of leftovers from MS Word), it also inserts three blank spaces at the end of the file, after the last curly bracket }, and then your software creates a PHP error message:

    PHP Warning: array_push() expects parameter 1 to be array, null given

    The lines in reference are these:

    while(!$terminate && $this->pos < $this->len);

    $rtftext = new RtfText();
    $rtftext->text = $text;
    array_push($this->group->children, $rtftext);

    You can find them in ParseText();

    My guess is that the parser isn’t at the end of the file, so $this->pos < $this->len is not true, but the last curly bracket has been passed, and so !$terminate is not true either. Or something like that. My logic may be a little fuzzy here.

    Will find a solution to this on my own, if I can, and post it here. If I delete the last three spaces after the last curly bracket in the RTF file, everything runs fine.
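
One possible workaround for the trailing-space problem described above is to cut off everything after the final closing brace before handing the string to the parser. A minimal sketch, using a hypothetical helper `trimAfterLastBrace()` that is not part of rtf.php:

```php
<?php
// Hypothetical helper, not part of rtf.php: drops anything that follows the
// final closing brace, such as spaces or newlines appended by an editor.
function trimAfterLastBrace(string $rtf): string
{
    $last = strrpos($rtf, '}');
    if ($last === false) {
        return $rtf;                    // no brace at all: leave the input untouched
    }
    return substr($rtf, 0, $last + 1);  // keep everything up to and including the brace
}

// Three spaces and a newline after the closing brace, as in the report above.
$cleaned = trimAfterLastBrace("{\\rtf1\\ansi Hello}   \n");
```

This assumes the last `}` in the input really is the one that closes the document; it does not check whether the braces are balanced.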

  16. Tim says:

    could you maybe put this code into github. Maybe we all can help to improve the code or implement new features?

  17. Tim says:

    this code is shit. I read it through and tried to run it but it does nothing. The worst thing is that there is no documentation or hints to run it. (Yes i read the hints at the comment section at the beginning but it does not work, there is no output from this code).

  18. Calfa says:

    Hi, thanks. It’s very good script.
    Is possible to include support for colored text?

  19. This code is working very well. Thanks.
    But now I need the reverse: converting HTML to RTF.
    Do you have a solution?

  20. JS says:

    Bro! Thank you a lot for this stuff!


  21. Rockberto says:

    Great work. Is it also possible to extract tables? For example:

    {\rtf1\ansi\deff0
    \intbl cell 1\cell
    \intbl cell 2\cell
    \intbl cell 3\cell
    This will give:
    | cell 1 | cell 2 | cell 3 |
    A row is delimited with \trowd … \row
    Each cell ends with \cell
    \cellx determines the right side of the corresponding cell in twips.

    Thank you in advance.

  22. iman says:

    Thanks very much …
    but when I use RTF in another language, Persian or Arabic for example, the content is shown as ÇÓÊ …
    rtf content: سلام این یک تست است
    output: ÓáÇã Ç◊ä ◊� ÊÓÊ ÇÓÊ
    note: I used a meta tag in the header of the output file:

    but it doesn’t work!!!
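
Output like this typically means bytes from one Windows code page are being displayed as another: the first word of the garbled output, ÓáÇã, is the byte sequence D3 E1 C7 E3 read as Windows-1252, and the same bytes read as Windows-1256 spell سلام. RTF names its code page with `\ansicpg`, and `\'xx` escapes are bytes in that code page. A sketch of decoding one escape to UTF-8 with PHP's `iconv()`, using a hypothetical helper `hexEscapeToUtf8()`; whether rtf.php does anything similar is not stated in the post:

```php
<?php
// Hypothetical helper, not part of rtf.php: converts the two hex digits of an
// RTF \'xx escape from the document's Windows code page to UTF-8.
function hexEscapeToUtf8(string $hex, int $codePage = 1252): string
{
    $byte = chr(hexdec($hex));                        // e.g. "e9" becomes the single byte 0xE9
    $utf8 = iconv('CP' . $codePage, 'UTF-8', $byte);  // needs the iconv extension
    return $utf8 === false ? '' : $utf8;              // empty string if conversion fails
}

$eAcute = hexEscapeToUtf8('e9');        // "é" when the document uses \ansicpg1252

// For an Arabic-script document the code page would be passed explicitly,
// e.g. hexEscapeToUtf8('d3', 1256); support for CP1256 depends on the local
// iconv implementation, so that call is left as a comment here.
```

The meta tag alone cannot fix the output, because the bytes have to be converted to the encoding the page declares before they are written out.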

  23. komal says:

    You are a legend, thank you very much.

  24. Oladipo says:

    Thanks so much for this code.

    I have figured a way to sort out colored text, how do I send code. Also, to fix some errors with bold text and some other bugs here and there.

    How do I get updates across to you.

    Is the code on GitHub?

  25. David Garcia says:

    Hey there, I’m trying to run it, but I get an error on line 383 of file rtf-html-php.php: “Fatal error: Call to a member function GetType() on null in rtf-html-php.php on line 383”. Any idea why the error happens?

  26. Chuck says:

    Worked like a bliss! Thanks very much for a great tool. Simple yet perfect!

  27. Lukas Gómez says:

    Thank you!! Have been looking for weeks now a working RTF -> HTML class… This one is the ONLY ONE that worked!

  28. Samriti says:

    I’ve used your code and it worked like a charm in PHP 5, but my problem is that I have a project which was built in PHP 4. When I use this code in PHP 4, it does not return anything. I’ve removed all the access specifiers in order to make it work in PHP 4. Can you help me out?

  29. Rajesh says:

    When I try to run it , I get the same error that David Garcia gets.

    Fatal error: Call to a member function GetType() on a non-object in rtf-html-php.php on line 383

    What am I doing wrong ?

    • Rajesh says:

      I got it to work. Sorry that was a haphazard comment without doing all the necessary and complete groundwork. My setup fetches rtf documents from a firebird database and then parses to display html. The library works beautifully. Great job. Thank you very much.

  30. Sondre says:

    Thank you so much.

    This tool was perfect, and just what i need.

    My only problem is the same as Nick wrote in 2014. I can not see that Linebreaks (BR) works?
    So all the text is in one line.

    Anyone else here with this issue?

  31. Peter says:

    Hi Alex,
    This looks like what I need. However, I’m not a PHP programmer and I was wondering if you would be so kind as to put up a demo page where we could paste RTF code into a textarea and display the results. This way we can see if this will work for us.
    FYI the RTF data I have is in an XML document, and there could be many hundreds of entries. I use JavaScript to convert the RTF to plain text, which works perfectly, but I also need to convert it to HTML.

    I do understand that this is a big ask.
    Regards, Peter

  32. Adrian says:

    The parser returns true and the tree looks good, but the HTML output (after Format) is an empty string: String 0 “”. Any idea what’s going on?

