This article was posted by Independent Software, a website and database application development company based in Maputo, Mozambique. Our website offers regular write-ups on technical and design issues, ranging from details at code level to 3D Studio Max rendering. Read more about Independent Software's philosophy, or get in touch with Independent Software.

In a recent project, I desperately needed an RTF to HTML converter written in PHP. Googling around turned up some matches, but I could not get them to work properly. Also, one of them called passthru() to run an RTF2HTML executable, which is something I didn’t want. I was looking for an RTF2HTML converter written purely in PHP.

Since I couldn’t find anything ready-made, I sat down and coded one up myself. It’s short, and it works, implementing the subset of RTF tags that you’ll need in HTML and ignoring the rest. As it turns out, the RTF format isn’t that complicated when you really look at it, but it isn’t something you code a parser for in 15 minutes either.

How to use it

Include the file rtf.php somewhere in your project. Then do this:

If you’d like to see what the parser read, then call this:

To convert the parser’s parse tree to HTML, call this:
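
Judging from a snippet posted by a reader in the comments below, the calls look roughly like the following. This is a sketch under that assumption: the class names RtfReader and RtfHtml, the Parse() and Format() methods and the root property are taken from that comment, and Parse() is assumed to return a truthy value on success. The wrapper function rtfToHtml() is hypothetical and returns null when the library has not been included. The dump() method mentioned in the comments is left out because its exact call is not shown anywhere on this page.

```php
<?php
// Assumption: rtf.php has been included and defines RtfReader and RtfHtml,
// as used in a reader's snippet in the comments. rtfToHtml() is a
// hypothetical convenience wrapper, not part of the converter.
function rtfToHtml(string $rtf): ?string
{
    if (!class_exists('RtfReader') || !class_exists('RtfHtml')) {
        return null; // library not loaded
    }

    $reader = new RtfReader();
    if (!$reader->Parse($rtf)) {
        return null; // parser reported a failure
    }

    $formatter = new RtfHtml();
    return $formatter->Format($reader->root);
}

// include_once 'rtf.php';   // uncomment once the file is in your project
// Single quotes keep the backslashes of the RTF literal intact.
$html = rtfToHtml('{\rtf1\ansi Hello, \b world\b0 !}');
```

Without the include, $html is simply null, so the sketch can be dropped into a project before the library is wired up.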

Update 3 Sep ’14:

  • Fixed bug: underlining would start but never end. Now it does.
  • Feature request: images are now filtered out of the output.

Update 13 Aug ’15:

This project is now on Github:

The code


52 Responses to “A working RTF to HTML converter in PHP”
  1. Eugene Valeriano says:

    You Save my Life dude.. Thanks!

  2. Anne says:

    Very useful script, thanks! Just one question. This code doesn’t filter embedded images, so the output may contain large text strings. Might it be useful to add an image filter or something? 🙂

    • alex says:

      That’s a good idea. The principle of this script is simplicity, so I would want to filter out images rather than including them because deciphering them would be expensive for the server. I’ll add an image filter when I update the code.

    • alex says:

      I’ve updated the code. Images are now filtered out.

  3. Pratiksha says:

    Nice Article ..

    It help me a lot

    Thank you.

  4. Klaus Delacroix says:

    Hi, I found your module, which really helped me a lot.

    I have found 2 problems, though:

    1. The underline of underlined words is not terminated after the word, but is in effect until the end of the text.

    2. Colours are not supported

    Any idea on how I could teach the module to behave correctly also in these cases?

    • alex says:

      Actually looking at the code I don’t see why underlining doesn’t get turned off. You could make a simple RTF test file with one word underlined, and see what the dump() method says. There’s probably a typo somewhere.

      For colours I would have to go back and look at the RTF spec again. For my project needs I didn’t require underlining or colours, so this was never tested.

      • alex says:

        Actually, Sebastien posted below about the same underline issue. It turns out that RTF’s ul tag gets closed with ulnone, not ul0. I thought it would work like the bold b tag, which is closed with b0, but no. This would have to be changed in the code.
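
The difference between the two kinds of control word can be sketched as a small state update. This is an illustration only, not the converter's actual code: the function applyControlWord() and the state array are hypothetical. It treats \b and \i as switches with an optional 0 parameter, and accepts both \ulnone (which is what the RTF sample posted in the comments uses) and \ul0 as ways to stop underlining.

```php
<?php
// Hypothetical helper, not taken from the converter's source.
// $state holds the current character formatting; $param is the numeric
// parameter of the control word, or null when none was given.
function applyControlWord(array $state, string $word, ?int $param = null): array
{
    switch ($word) {
        case 'b':
            $state['bold'] = ($param !== 0);      // \b on, \b0 off
            break;
        case 'i':
            $state['italic'] = ($param !== 0);    // \i on, \i0 off
            break;
        case 'ul':
            $state['underline'] = ($param !== 0); // \ul on, \ul0 off
            break;
        case 'ulnone':
            $state['underline'] = false;          // stops all underlining
            break;
    }
    return $state;
}

$state = ['bold' => false, 'italic' => false, 'underline' => false];
$state = applyControlWord($state, 'ul');      // underline starts
$state = applyControlWord($state, 'ulnone');  // underline ends
```

Unknown control words fall through the switch and leave the state untouched, which matches the article's approach of ignoring whatever is not needed in HTML.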

    • alex says:

      I’ve updated the code. Underlining now terminates correctly.

  5. Sergio Gabriel says:

    Thanks a lot! I have a small problem: the class doesn’t convert the “tab” control word to “&nbsp;”. How can I do this?

  6. Sergio Gabriel says:

    In the formatControlWord method of the RtfHtml class, I added this: if($word->word == "tab") $this->output .= "&nbsp;"; and it works!

    • alex says:

      Nice work. In order for this code to be more robust, the HTML formatter should be made into a separate class, so you can more easily extend it as you have done. Also, you could have plaintext and XML formatters. But oh well, this was meant to be simple and solve only the RTF to HTML problem.

  7. Chris says:

    This is really A W E S O M E !
    Thank you! 🙂

  8. Sebastien says:

    I’m trying out this script for a project and everything is underlined. Any clue why?

    RTF:

    {rtf1ansiansicpg1252deff0nouicompat{fonttbl{f0fnilfcharset0 Tahoma;}{f1fnil Tahoma;}{f2fnilfcharset2 Symbol;}}
    {colortbl ;red0green0blue0;}
    {*generator Riched20 6.3.9600}viewkind4uc1
    pardcf1ulbf0fs22lang3084 Maintenance des transporteursulnoneb0par
    Ajouter une nouvel onglet ‘Param’e8tres EDI’ dans laquelle on retrouvera :par

    pard{pntextf2’B7tab}{*pnpnlvlbltpnf2pnindent0{pntxtb’B7}}fi-200li200 Dans le haut : f1par

    pard{pntextf2’B7tab}{*pnpnlvlbltpnf2pnindent0{pntxtb’B7}}fi-200li520f0 Un titre ‘EDI – Achats’f1par
    {pntextf2’B7tab}f0 Nom du transporteur ‘e0 exporterf1par
    {pntextf2’B7tab}f0 No de compte d’e9fautf1par


    pard{pntextf2’B7tab}{*pnpnlvlbltpnf2pnindent0{pntxtb’B7}}fi-200li200f0 Une grillef1par

    pard{pntextf2’B7tab}{*pnpnlvlbltpnf2pnindent0{pntxtb’B7}}fi-200li520f0 No divisionf1par
    {pntextf2’B7tab}f0 Nom de la division (en affichage)f1par
    {pntextf2’B7tab}f0 No de de compte ‘e0 utiliser pour cette divisionf1par

    ulf0 Noteulnonepar
    Le no de compte qui sera utilis’e9 en priorit’e9 sera :par

    1) Selon la division de la commandepar
    2) Selon la division m’e8re de la commandepar
    3) No de compte d’e9fautpar


    pardulbfs22 Module ‘Bon d’achats’par
    Dans l’onglet ‘Ent’eate’ / Sous onglet ‘Termes’, ajouter une section pour ‘EDI’ dans laquelle on pourra saisir un code du transporteur exig’e9.par
    ulbfs22 Module ‘R’e9quisition’par
    Dans la fen’eatre ‘Cr’e9ation des bons d’achat’, onglet ‘Bon d’achats” / Section ‘Termes’ de la grille des BA’s ‘e0 ‘e9mettre, ajouter une colonne ‘Trp-EDI’ repr’e9sentant le code du transporteur exig’e9.par
    ulbf0fs22 Module ‘EDI’par

    pard{pntextf2’B7tab}{*pnpnlvlbltpnf2pnindent0{pntxtb’B7}}fi-200li200 Cr’e9er une nouvelle proc’e9dure ‘EDI_SP_Export_BonAchEnt_Trp’ qui retournera les infos requises pour construire le segment ‘PO_TRANSPORTEUR’.par
    {pntextf2’B7tab}Ajouter cette proc’e9dure aux proc’e9dures disponibles dans le catalogue EDI.par

    ulb Importantulnoneb0 par
    La d’e9finition du segment / colonnes EDI et l’ajout du segment ‘e0 l’enveloppe n’est pas incluse dans cet estim’e9. Normalement, vous pouvea effectuer cette t’e2che. Si toutefois, vous aviez besoin d’assistance, un formateur pourra intervenir en appliquant son temps contre votre banque d’heures.par

    thx for your time.

  9. Anon says:

    Sensational: there’s only a couple of missing features that I feel would expand it to be able to convert rtf from the majority of basic rtf editors:

    Font (face and colour)
    Alignment (left, center, right)
    List items (bullets and numbering)
    Super / subscript

    Love it – keep up the good work!

  10. Serge says:

    Very nice Alexander !

    A question: when I have a text like “Poëzie” it converts to (shortened):

    Which gives as result : Po ë zie

    Any idea about a quick solution?

    Greetings & thanks Serge

  11. Nick says:

    Everything works, except there are no “br” or “p” line breaks.
    So my output is one run-on line of text.
    There’s lots of “span”s, but adding a “br” after each didn’t work, because it wrapped a pair of “span”s around the “bullet point” symbol. My adding a “br” after “/span” therefore also put a “br” after the bullet, which put the bullet point text on a new line.

    Provided web hosting services, and website development, in English and Spanish.

    Managed on-going work schedules.
    Is there supposed to be some css that goes with this?
    Is there something wrong with my RTF file?

  12. Doug says:

    Finally a PHP RTF to HTML class that works. Everything else I’ve found is horribly written, buggy, throws errors, doesn’t work, or all of the above.

    Thank you… a million times, thank you. I wasn’t looking forward to writing my own.

  13. Peter says:

    Hi there
    I just stumbled upon your Code in search of a working RTF to HTML Converter. I encounter some problems with your class and I wonder if you could help me.
    When my RTF String is something like this:

    {rtf1ansiansicpg1252deff0{fonttbl{f0fnilfcharset0 MS Sans Serif;}}
    {colortbl ;red0green0blue0;}
    viewkind4uc1pardcf1lang2055bf0fs16 Some Bold Textb0 , some more text.par
    bi Temp. 22’b0C is best.b0i0fs20par

    I get the following result:

    tf1fonttblf0fnilfcharset0 MS Sans Serif;u000biewkind4f0fs16 Some Bold Text, some more text.Temp. 22°C is best.fs20

    If I pull the exact same string from my database I get the following error:

    “errMsg”:”Uninitialized string offset: 407″, “lineNo”:”140″

    And if I use a String with almost no formating like this one:

    “this is some example text rnrn with two line breaks”

    I get this error message:

    “errMsg”:”array_push() expects parameter 1 to be array, null given”, “lineNo”:”308″

    The line numbers should be the same as in the original code. Do you have any idea how to fix this?
    Thanks in advance.

    Regards Peter
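
A possible explanation for the first result above, offered as a guess rather than a confirmed diagnosis: the mangled output begins with tf1 and shows u000b where \viewkind4 should be. That pattern is what you get when an RTF literal is written inside a PHP double-quoted string, because PHP turns \r, \v and \f into control characters before the parser ever sees them. The sketch below uses a made-up RTF fragment to show how double-quoted, single-quoted and nowdoc literals differ.

```php
<?php
// Illustrative RTF fragment (not from the article); only the quoting differs.
$doubleQuoted = "{\rtf1\viewkind4 Hi}";   // "\r" and "\v" become control characters
$singleQuoted = '{\rtf1\viewkind4 Hi}';   // backslashes are kept literally

// A nowdoc keeps every character as typed, which suits longer RTF literals.
$nowdoc = <<<'RTF'
{\rtf1\viewkind4 Hi}
RTF;

var_dump(strpos($doubleQuoted, '\rtf1'));  // bool(false): the control word is gone
var_dump(strpos($singleQuoted, '\rtf1'));  // int(1)
var_dump($singleQuoted === $nowdoc);       // bool(true)
```

Strings read from a file or a database are not affected by this, since escape sequences are only interpreted in PHP source code.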

  14. Strablet says:

    Great job. I have spent days looking through other variations and this is the best one I’ve found so far.

    It works right out of the box, with some exceptions, which I think I can manage.

    I live in Asia (Taiwan) and so some special characters get used.

    However, because your code is so well organized, I’m sure I can add a few things here and there to get what I need.

    I love the concept of making two passes at it. That totally rocks. First, strip out the stuff we don’t need. Second, convert what’s left to HTML.

    Thanks for the great work.

  15. Strablet says:

    Just discovered something you might find interesting. When I open the RTF file in WordPad and re-save it (which cleans up a lot of leftovers from MS Word), WordPad also inserts three blank spaces at the end of the file, after the last curly bracket }, and then your software produces a PHP error message:

    PHP Warning: array_push() expects parameter 1 to be array, null given

    The lines in reference are these:

    while(!$terminate && $this->pos < $this->len);

    $rtftext = new RtfText();
    $rtftext->text = $text;
    array_push($this->group->children, $rtftext);

    You can find them in ParseText();

    My guess is that the parser isn’t at the end of the file, so $this->pos < $this->len is not true, but the last curly bracket has been passed, and so !$terminate is not true either. Or something like that. My logic may be a little fuzzy here.

    Will find a solution to this on my own, if I can, and post it here. If I delete the last three spaces after the last curly bracket in the RTF file, everything runs fine.
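
Until the parser itself copes with this, one workaround is to clean up the string before parsing. The helper below is a hypothetical sketch, not part of the converter: it trims surrounding whitespace and rejects input that does not look like a single RTF group, which would also catch plain text being passed in by mistake.

```php
<?php
// Hypothetical pre-check, run before handing a string to the parser.
// Returns the cleaned-up RTF, or null when the input does not look like RTF.
function prepareRtf(string $input): ?string
{
    // Drops leading/trailing spaces, tabs, line breaks and NUL bytes.
    $rtf = trim($input);

    // A well-formed document is one group: it starts with "{\rtf"
    // (five characters) and ends with the matching "}".
    if (strncmp($rtf, '{\rtf', 5) !== 0 || substr($rtf, -1) !== '}') {
        return null;
    }
    return $rtf;
}

var_dump(prepareRtf('{\rtf1 Hi}   '));              // string(10) "{\rtf1 Hi}"
var_dump(prepareRtf('this is some example text'));  // NULL
```

The check is deliberately shallow: it does not verify that the braces inside the document are balanced.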

  16. Tim says:

    could you maybe put this code into github. Maybe we all can help to improve the code or implement new features?

  17. Tim says:

    this code is shit. I read it through and tried to run it but it does nothing. The worst thing is that there is no documentation or hints to run it. (Yes i read the hints at the comment section at the beginning but it does not work, there is no output from this code).

  18. Calfa says:

    Hi, thanks. It’s very good script.
    Is possible to include support for colored text?

  19. This code is working very well. Thanks.
    But now I need the reverse: converting HTML to RTF.
    Do you have the solution ?

  20. JS says:

    Bro! Thank you a lot for this stuff!


  21. Rockberto says:

    Great work. Is it also possible to extract tables? For example:

    { rtf1 ansi deff0
    intbl cell 1 cell
    intbl 2 cell cell
    intbl 3 cell cell
    This will give :
    | cell 1 | cell 2 | 3 cell |
    A row is delimited by \trowd … \row.
    Each cell ends with \cell.
    \cellx determines the right edge of the corresponding cell, in twips.

    Thank you in advance.

  22. iman says:

    Thanks very much …
    but when I use RTF in another language, Persian or Arabic for example, the content is shown as ÇÓÊ …
    rtf content: سلام این یک تست است
    out put: ÓáÇã Ç◊ä ◊� ÊÓÊ ÇÓÊ
    Note: I used a meta tag in the header of the output file:

    but it doesn’t work!
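
One likely cause, assuming the document stores non-ASCII characters as \'hh escapes in a Windows code page (the sample headers posted in the comments declare ansicpg1252): those bytes have to be converted to UTF-8 before they go into a UTF-8 page, otherwise they show up as unrelated Latin characters. The helper below is a hypothetical sketch, not the converter's own code. It assumes a single-byte code page and the iconv extension, and the default code page is an assumption you would replace with whatever the RTF header declares.

```php
<?php
// Hypothetical helper: decode RTF \'hh escapes as bytes of a given code page
// and return UTF-8 that is safe to embed in HTML.
function decodeRtfHexEscapes(string $rtfText, string $codePage = 'CP1252'): string
{
    $decoded = preg_replace_callback(
        "/\\\\'([0-9a-fA-F]{2})/",           // a backslash, an apostrophe, two hex digits
        function (array $m) use ($codePage): string {
            // iconv() returns false (with a notice) for bytes the code page
            // does not define; the notice is suppressed and "?" is used instead.
            $utf8 = @iconv($codePage, 'UTF-8', chr(hexdec($m[1])));
            return $utf8 === false ? '?' : htmlspecialchars($utf8, ENT_QUOTES, 'UTF-8');
        },
        $rtfText
    );
    return $decoded ?? $rtfText;             // fall back to the input on a regex error
}

echo decodeRtfHexEscapes("No de compte d\\'e9faut"), "\n";
```

For Persian or Arabic text the same call would take a different code page name, for example 'CP1256', provided the local iconv build supports it. Multi-byte code pages would need consecutive escapes to be collected and converted together, which this sketch does not do.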

  23. komal says:

    You are a legend, thank you very much.

  24. Oladipo says:

    Thanks so much for this code.

    I have figured a way to sort out colored text, how do I send code. Also, to fix some errors with bold text and some other bugs here and there.

    How do I get updates across to you.

    Is the code on GitHub?

  25. David Garcia says:

    Hey there, I’m trying to run it, but I get an error on line 383 of the file rtf-html-php.php: “Fatal error: Call to a member function GetType() on null in rtf-html-php.php on line 383”. Any idea why this error happens?

  26. Chuck says:

    Worked like a bliss! Thanks very much for a great tool. Simple yet perfect!

  27. Lukas Gómez says:

    Thank you!! Have been looking for weeks now a working RTF -> HTML class… This one is the ONLY ONE that worked!

  28. Samriti says:

    I’ve used your code and it worked like a charm in PHP 5, but my problem is that I have a project which was built in PHP 4. When I use this code in PHP 4, it does not return anything. I’ve removed all the access specifiers in order to make it work in PHP 4. Can you help me out?

  29. Rajesh says:

    When I try to run it , I get the same error that David Garcia gets.

    Fatal error: Call to a member function GetType() on a non-object in rtf-html-php.php on line 383

    What am I doing wrong ?

    • Rajesh says:

      I got it to work. Sorry that was a haphazard comment without doing all the necessary and complete groundwork. My setup fetches rtf documents from a firebird database and then parses to display html. The library works beautifully. Great job. Thank you very much.

  30. Sondre says:

    Thank you so much.

    This tool was perfect, and just what i need.

    My only problem is the same one Nick wrote about in 2014: I cannot see that line breaks (BR) work.
    So all the text is in one line.

    Anyone else here with this issue?

  31. Peter says:

    Hi Alex,
    This looks like what I need. However, I’m not a PHP programmer, and I was wondering if you would be so kind as to set up a demo page where we could paste RTF code into a textarea and display the results. That way we can see if this will work for us.
    FYI, the RTF data I have is in an XML document, and there could be many hundreds of entries. I use JavaScript to convert the RTF to plain text, which works perfectly; however, I also need to convert it to HTML.

    I do understand that this is a big ask.
    Regards, Peter

  32. Adrian says:

    The parser returns true and the tree looks good, but the HTML output (after Format) is an empty string: string(0) "". Any idea what’s going on?


  33. Julien says:


    Parser returns Call to a member function GetType() on a non-object in …
    Any idea whats going on?



  34. Jonathan Hyams says:

    This is my RTF. The conversion doesn’t work at all:

    tf1fbidisansiansicpg1252deff0{fonttbl{f0fswissfprq2fcharset0 Arial;}{f1fromanfprq2fcharset0 Times New Roman;}{f2fnil Tahoma;}}
    {colortbl ;
    viewkind4uc1pardltrparlang1033f0fs20 Each photograph is signed and numbered by the artist.f1fs24par
    f0fs20 This photograph is available in the following two sizes:f1fs24par
    f0fs20 30 x 40 cm (Edition of 20)par
    80 x 100 cm (Edition of 5) lang2057par
    lang2057f0 Fran’e7ois Truffaut (1932 – 1984) was one of the founders of the French new wave, and remains an icon of international cinema. In a career lasting just over a quarter of a century, he was screenwriter, director, producer and actor in over twenty-five films, including the iconic i Les quatre cents coupsi0 (i The 400 Blowsi0 ), 1959, i Jules et Jimi0 (i Jules and Jimi0 ), 1962, i La peau doucei0 (i Soft Skini0 ), 1964 and his only feature in English and in colour, the 1965 adaptation of Ray Bradbury
    quote s classic sci-fi novel i Fahrenheit 451i0 . Truffaut has become, along with Jean-Luc Godard, one of new wave
    quote s most historically dominant figures, partly due to his fierce individualism and naturalistic aesthetic. As Truffaut
    quote s autobiography i Truffaut by Truffauti0 states, he believed: lquote We have to film other things, in another spirit. We
    quote ve got to get out of the over-expensive studios…Sunshine costs less than Klieg lights and generators. We should do our shooting in the streets and even in real apartments. Instead of, like Clouzot, spreading artificial dirt over the sets.
    quote As Cauchetier recalls, lquote Godard and Truffaut were both writers for i Cahiers du Cinemai0 , and had a great bond. It was the triumph of the new wave, and both reinvented cinema. Truffaut with i Les quatre cents coupsi0 (i The 400 Blowsi0 ) and Godard with i’c0 bout de souffle i0 (i Breathlessi0 )i . i0 It was only much later that they became competitors and adversaries.

  35. Geek says:

    Hey, that method is great. I tried it and it works well, but I get some errors such as “Notice: Uninitialized string offset: 4414 in C:” and “Warning: array_push() expects parameter 1 to be array, null given in C:”. Also, before the converted file is printed, I get lots of “WORD” and “TEXT” written in bold. What do they mean, and how can I solve these problems? I went through the code and the array is there, so I wonder what causes the uninitialized string offset when the parameter it refers to is an array. I am not a very advanced PHP programmer, so I just copied the code, and that is where I find the errors. Please help.

    • alex says:

      It’s best to copy the code off this web page and not from the Github repository. The code on this page works, but only for well-formatted RTF strings. What’s being worked on on Github is making the code deal with RTF strings that are not well-formatted, but so far that’s buggy. Contributions made by some people (which were great & well meant) have introduced some instability and the code must be reviewed (when I get time to do it). HTH.

  36. Pierre-Luc Mc Neil says:

    I figured a fix for the line breaks after searching how the line breaks are working in an RTF file and in the code.
    From what I understood, line breaks in the RTF file are represented by “\ ” (so slash followed by a space) or at least in the output generated by file_get_contents(). In the code it would be represented as a control symbol. The thing is I could not figure out how to write the condition to know if it was this symbol. So I did a workaround, which is probably not really clean, but it worked out well for me.

    I modified the function ParseControlSymbol() by adding an else branch after the if($symbol == '\'') check:

    if($symbol == '\'') {
    $parameter = $this->char;
    echo $parameter . $this->char;
    $parameter = hexdec($parameter . $this->char);
    } else {
    // 13 is the decimal code for carriage return
    $parameter = 13;
    }

    and then I modified the FormatControlSymbol($symbol) function by adding an if statement:

    if ($symbol->parameter == 13) {
    //$this->output .= htmlentities(chr($symbol->parameter));
    //$this->output .= "&#13"; // HTML Entity (decimal)
    //$this->output .= "&#xd"; // HTML Entity (hex)
    $this->output .= "<br>";
    }

    Since the other if in the function used htmlentities(), I tried to do the same for the carriage return, but it did not work.
    My two other attempts also failed, but simply appending "<br>" worked.

  37. Alex says:

    First of all: it’s a really fantastic tool and works great.
    When I run the parser a second time inside a while loop, the whole script breaks after the first text has been loaded.
    There are two text fields in a database, and when both fields contain text, it stops as described above.
    When only one field is filled, everything works fine.
    See the code below:

    if (!empty (trim($row['Memoextern']))) {
    include "../parser/rtf-html-php.php";
    $reader = new RtfReader();
    $rtf = $row['Memoextern'];
    $result = $reader->Parse($rtf);
    $formatter = new RtfHtml();
    $formatierterpostextextern = $formatter->Format($reader->root);
    echo '
    Kopf Text
    ' . $formatierterpostextextern . '
    if (!empty (trim($row['Memointern']))) {
    include "../parser/rtf-html-php.php";
    $reader = new RtfReader();
    $rtf = $row['Memointern'];
    $result = $reader->Parse($rtf);
    $formatter = new RtfHtml();
    $formatierterpostextintern = $formatter->Format($reader->root);
    echo '
    Interner Text
    ' . $formatierterpostextintern . '

    Is it possible to run the code twice in a row?
    Maybe someone has a solution for my problem.
    Thanks and best regards,
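
One likely cause, assuming rtf-html-php.php declares its classes at the top level: the snippet above includes the same file twice with include, and the second include tries to declare the classes again, which PHP treats as a fatal error. Using require_once (or include_once) loads the file only the first time. The sketch below demonstrates this with a throwaway file and a made-up DemoParser class standing in for the real library.

```php
<?php
// Stand-in for the library: a temporary file that declares one class.
// DemoParser is hypothetical and only exists for this demonstration.
$libFile = tempnam(sys_get_temp_dir(), 'lib');
file_put_contents(
    $libFile,
    '<?php class DemoParser { public function parse($s) { return strtoupper($s); } }'
);

$results = [];
foreach (['first field', 'second field'] as $field) {
    // Loaded on the first pass, skipped on the second. A plain include here
    // would try to declare DemoParser twice and stop the script.
    require_once $libFile;

    $parser = new DemoParser();      // a fresh object per field is fine
    $results[] = $parser->parse($field);
}

unlink($libFile);
print_r($results);
```

In the original snippet this would mean moving a single require_once "../parser/rtf-html-php.php"; to the top of the script and creating a new RtfReader and RtfHtml for each field.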

  38. Tahir says:

    I have an RTF file with images.
    Can you please explain how I can extract them and save them somewhere, or just get a link to each image?
