Convert Word Range to HTML in C#


I'm trying to convert a Word range to HTML. I know how to convert a whole Word document, but how can I convert a range of a Word document?

The code that converts the full Word document looks like this:

    private string GetHtmlFromRange(Range range)
    {
        XElement html;

        byte[] byteArray = File.ReadAllBytes(@"c:\test.docx");
        using (MemoryStream memoryStream = new MemoryStream())
        {
            memoryStream.Write(byteArray, 0, byteArray.Length);
            using (WordprocessingDocument doc = WordprocessingDocument.Open(memoryStream, true))
            {
                HtmlConverterSettings settings = new HtmlConverterSettings()
                {
                    PageTitle = "my page title"
                };
                html = HtmlConverter.ConvertToHtml(doc, settings);

                File.WriteAllText("test.html", html.ToStringNewLineOnAttributes());
            }
        }

        return html.ToStringNewLineOnAttributes();
    }

I had a similar problem. If you are using HtmlConverter from PowerTools for Open XML, you cannot perform this conversion directly in the MemoryStream. To convert a range, you first need to parse the original document and create a new document containing the desired range, or specify the paragraph objects to include in the new document instead of a range. In either case, the conversion happens after the new document has been defined. That is because the object model doesn't use ranges; character ranges are a property of the rendered document.

So, your options are either a) first parse the rendered document, working with the desired range (using DocumentBuilder methods), or b) parse the converted HTML and select the elements corresponding to the desired range using HtmlAgilityPack.
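Option a) could look roughly like the sketch below. It assumes the OpenXmlPowerTools NuGet package, and that your version exposes `DocumentBuilder.BuildDocument(List<Source>)` and the `HtmlConverter.ConvertToHtml(WmlDocument, HtmlConverterSettings)` overload. The class and method names are hypothetical. Note that `start` and `count` here are indexes into the top-level body elements (paragraphs, tables), not Word character positions, so you still have to map your range onto paragraphs yourself.

```csharp
using System.Collections.Generic;
using System.Xml.Linq;
using OpenXmlPowerTools;

public static class RangeToHtmlSketch
{
    // Hypothetical helper: converts `count` top-level body elements of the
    // document, beginning at element index `start`, to an HTML string.
    public static string ConvertBodyElements(string docxPath, int start, int count)
    {
        var original = new WmlDocument(docxPath);

        // Build a new in-memory document that contains only the selected elements.
        var sources = new List<Source>
        {
            new Source(original, start, count, true) // true = keep section properties
        };
        WmlDocument excerpt = DocumentBuilder.BuildDocument(sources);

        // Convert the excerpt exactly as the full document would be converted.
        var settings = new HtmlConverterSettings { PageTitle = "my page title" };
        XElement html = HtmlConverter.ConvertToHtml(excerpt, settings);
        return html.ToString();
    }
}
```

A call such as `RangeToHtmlSketch.ConvertBodyElements(@"c:\test.docx", 2, 3)` would then convert the third through fifth body elements, assuming the document has at least that many.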

For my solution, I realized that every use case required the user to have MS Office installed, so I used Microsoft.Office.Interop.Word:

1) Define the range you want to select (e.g., from position 5 to position 100, including non-printing characters),

        var doc = Globals.ThisAddIn.Application.ActiveDocument;
        object start = 5;
        object end = 100;
        var originalText = doc.ActiveWindow.Selection;

2) Copy the range to a new document

        var newDocument = new Word.Document();
        newDocument.Range().FormattedText = doc.Range(start, end).FormattedText;

3) Save the new document

        object nullParameter = System.Reflection.Missing.Value;
        object outputFileName = @"d:\converted.html";
        object fileFormat = Word.WdSaveFormat.wdFormatFilteredHTML;
        newDocument.SaveAs(ref outputFileName, ref fileFormat);
        newDocument.Close(ref nullParameter, ref nullParameter, ref nullParameter);

4) Use System.IO to access the output file, do whatever you want with the contents, and delete it when done.
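Step 4 might be sketched as below. The class and method names are hypothetical, and the sketch assumes the saved file can be decoded by `File.ReadAllText` with its default encoding detection; Word may write a different encoding (declared in the file's meta tag), in which case you would pass an explicit `Encoding`.

```csharp
using System.IO;

public static class ConvertedHtmlReader
{
    // Hypothetical helper: returns the contents of the saved HTML file
    // and removes the file afterwards, even if reading fails.
    public static string ReadAndDelete(string htmlPath)
    {
        try
        {
            return File.ReadAllText(htmlPath);
        }
        finally
        {
            // File.Delete does not throw if the file is already gone.
            File.Delete(htmlPath);
        }
    }
}
```

With the path from step 3, the call would be `string html = ConvertedHtmlReader.ReadAndDelete(@"d:\converted.html");`.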

It's not at all elegant, but if you're using Interop anyway, elegance may not be a requirement.

