Saturday, 26 April 2008

... that's unfortunately not a trivial task.

But with the new Office 2007 Word documents and their Open XML format, it is possible.

Background: The new Office documents (*.docx) are just data containers (zip files) that hold different parts (XML, images, styles etc.) and the relationships between them. The main part is an XML file describing the content (paragraphs, image positions, plain text etc.). More info at [1] and [2].
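
To see the container structure for yourself, here is a minimal sketch that lists the parts inside a .docx. It assumes a newer .NET version with System.IO.Compression.ZipArchive (not available in Visual Studio 2008), and the class and method names (DocxInspector, ListParts) are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

public static class DocxInspector
{
    // Returns the names of all entries (parts) stored in an Open XML package.
    // The stream is left open so the caller stays in control of its lifetime.
    public static List<string> ListParts(Stream packageStream)
    {
        var names = new List<string>();
        using (var archive = new ZipArchive(packageStream, ZipArchiveMode.Read, leaveOpen: true))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                names.Add(entry.FullName);
            }
        }
        return names;
    }
}
```

Called with File.OpenRead("Generated.docx"), the list typically includes entries such as [Content_Types].xml, _rels/.rels and word/document.xml.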

On Brian Jones' blog [3] I found the solution: with the "altChunk" element it is possible to reference an XHTML part within the document. 1+1=2 ... with this information and the latest Open XML SDK (April CTP) [4] I wrote a small piece of code which inserts the content of web pages into a new Word document.

Here are the code snippets; maybe you can reuse them somehow. You can also download the Visual Studio 2008 solution at [5].

First) Generate a new Word document

using System.IO;
using System.Text;
// plus the Open XML SDK namespaces that define WordprocessingDocument and MainDocumentPart

/// <summary>
/// Creates the new Word document using the Open XML SDK.
/// see http://msdn2.microsoft.com/en-us/library/bb656295.aspx
/// </summary>
/// <param name="document">The document.</param>
public static void CreateNewWordDocument(string document)
{
    using (WordprocessingDocument wordDoc = WordprocessingDocument.Create(document, WordprocessingDocumentType.Document))
    {
        // Set the content of the document so that Word can open it.
        MainDocumentPart mainPart = wordDoc.AddMainDocumentPart();

        // Write the main content XML structure into the main part.
        const string docXml =
@"<?xml version=""1.0"" encoding=""UTF-8"" standalone=""yes""?>
<w:document xmlns:w=""http://schemas.openxmlformats.org/wordprocessingml/2006/main"">
<w:body><w:p><w:r><w:t>Generated from Holgers Blog:</w:t></w:r></w:p></w:body>
</w:document>";

        using (Stream stream = mainPart.GetStream())
        {
            byte[] buf = (new UTF8Encoding()).GetBytes(docXml);
            stream.Write(buf, 0, buf.Length);
        }
    }
}


Secondly) Add a bookmark to the document

/// <summary>
/// Inserts a bookmark in the document.
/// </summary>
/// <param name="document">The document.</param>
/// <param name="bookmarkName">Name of the bookmark.</param>
public static void InsertBookmark(string document, string bookmarkName)
{
    using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(document, true))
    {
        using (Stream stream = wordDoc.MainDocumentPart.GetStream())
        {
            // create an XmlDocument from the XML of the main part
            XmlDocument xmlDocument = new XmlDocument();
            xmlDocument.LoadXml(new StreamReader(stream).ReadToEnd());

            // find all paragraph nodes; the bookmark goes after the last one
            XmlNodeList nodes = FindNodes(xmlDocument, "/w:document/w:body/w:p");
            if (nodes.Count > 0)
            {
                string bookmarkID = Guid.NewGuid().ToString();

                // create the bookmark string (start and end share the same id)
                string bookmark = string.Format("<w:bookmarkStart w:id=\"{0}\" w:name=\"{1}\"/><w:bookmarkEnd w:id=\"{0}\" />", bookmarkID, bookmarkName);

                // add the bookmark after the last paragraph
                nodes[nodes.Count - 1].CreateNavigator().InsertAfter(bookmark);

                // reset the stream and fill it with the new content
                byte[] buf = (new UTF8Encoding()).GetBytes(xmlDocument.OuterXml);
                stream.Seek(0, SeekOrigin.Begin);
                stream.Write(buf, 0, buf.Length);
            }
        }
    }
}


/// <summary>
/// Finds nodes in the XML via XPath.
/// This is extracted to a method because so many namespaces have to be registered.
/// </summary>
/// <param name="xmlDocument">The XML document.</param>
/// <param name="xPathExpression">The XPath expression.</param>
/// <returns>The nodes matching the expression.</returns>
public static XmlNodeList FindNodes(XmlDocument xmlDocument, string xPathExpression)
{
    // create the namespace manager and add some namespaces
    XmlNamespaceManager namespaceManager = new XmlNamespaceManager(xmlDocument.NameTable);
    namespaceManager.AddNamespace("r", "http://schemas.openxmlformats.org/officeDocument/2006/relationships");
    namespaceManager.AddNamespace("tns", "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties");
    namespaceManager.AddNamespace("dcmitype", "http://purl.org/dc/dcmitype/");
    namespaceManager.AddNamespace("w", "http://schemas.openxmlformats.org/wordprocessingml/2006/main");
    namespaceManager.AddNamespace("cp", "http://schemas.openxmlformats.org/package/2006/metadata/core-properties");
    namespaceManager.AddNamespace("ds", "http://schemas.openxmlformats.org/officeDocument/2006/customXml");
    namespaceManager.AddNamespace("vt", "http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes");
    namespaceManager.AddNamespace("v", "urn:schemas-microsoft-com:vml");
    namespaceManager.AddNamespace("w10", "urn:schemas-microsoft-com:office:word");
    namespaceManager.AddNamespace("wne", "http://schemas.microsoft.com/office/word/2006/wordml");
    namespaceManager.AddNamespace("b", "http://schemas.openxmlformats.org/officeDocument/2006/bibliography");
    namespaceManager.AddNamespace("sl", "http://schemas.openxmlformats.org/schemaLibrary/2006/main");
    namespaceManager.AddNamespace("m", "http://schemas.openxmlformats.org/officeDocument/2006/math");
    namespaceManager.AddNamespace("o", "urn:schemas-microsoft-com:office:office");
    namespaceManager.AddNamespace("dcterms", "http://purl.org/dc/terms/");
    namespaceManager.AddNamespace("a", "http://schemas.openxmlformats.org/drawingml/2006/main");
    namespaceManager.AddNamespace("dc", "http://purl.org/dc/elements/1.1/");
    namespaceManager.AddNamespace("wp", "http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing");
    namespaceManager.AddNamespace("xsi", "http://www.w3.org/2001/XMLSchema-instance");
    namespaceManager.AddNamespace("ve", "http://schemas.openxmlformats.org/markup-compatibility/2006");
    namespaceManager.AddNamespace("pkg", "http://schemas.microsoft.com/office/2006/xmlPackage");

    return xmlDocument.SelectNodes(xPathExpression, namespaceManager);
}
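
The bookmark step can also be sketched without the namespace manager, working directly on the XML of the main part with LINQ to XML. This is only an illustration, not the code from the download: the names BookmarkSketch and AppendBookmark are made up, and it uses an integer id on the assumption that the WordprocessingML schema types w:id as a decimal number.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class BookmarkSketch
{
    static readonly XNamespace W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns the document XML with an empty bookmark placed after the last
    // top-level paragraph. The input is returned unchanged when there is no
    // body or no paragraph. Note: the XML declaration is not part of the result.
    public static string AppendBookmark(string documentXml, string bookmarkName, int bookmarkId)
    {
        XDocument doc = XDocument.Parse(documentXml);
        XElement body = doc.Root.Element(W + "body");
        if (body == null)
        {
            return documentXml;
        }

        XElement lastParagraph = body.Elements(W + "p").LastOrDefault();
        if (lastParagraph == null)
        {
            return documentXml;
        }

        // bookmarkStart and bookmarkEnd are linked by sharing the same id
        lastParagraph.AddAfterSelf(
            new XElement(W + "bookmarkStart",
                new XAttribute(W + "id", bookmarkId),
                new XAttribute(W + "name", bookmarkName)),
            new XElement(W + "bookmarkEnd",
                new XAttribute(W + "id", bookmarkId)));

        return doc.ToString(SaveOptions.DisableFormatting);
    }
}
```

The returned string would still have to be written back into the main part stream, as InsertBookmark does.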
 

Thirdly) Add the XHTML part to the document and put it at the bookmark

/// <summary>
/// Adds the XHTML part to the document.
/// see Brian Jones Blog: http://blogs.msdn.com/brian_jones/archive/2006/08/08/692705.aspx
/// </summary>
/// <param name="document">The document.</param>
/// <param name="xHtmlStream">The XHTML stream.</param>
/// <param name="bookmarkName">Name of the bookmark.</param>
public static void AddXHtmlPart(string document, Stream xHtmlStream, string bookmarkName)
{
    using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(document, true))
    {
        MainDocumentPart mainPart = wordDoc.MainDocumentPart;

        string relationID = "myExternalXhtmlID";
        string altChunk = "<w:altChunk r:id=\"" + relationID + "\" />";

        // add the extended part (XHTML)
        ExtendedPart extPart = mainPart.AddExtendedPart("http://schemas.openxmlformats.org/officeDocument/2006/relationships/aFChunk", "application/xhtml+xml", "/AddedXhtml.xhtml", relationID);
        extPart.FeedData(xHtmlStream);

        // create a dictionary of bookmark names / XML snippets
        Dictionary<string, string> xmlSnippetCollection = new Dictionary<string, string>();
        xmlSnippetCollection.Add(bookmarkName, altChunk);

        // and replace the bookmarks
        using (Stream stream = mainPart.GetStream())
        {
            ReplaceBookmarks(stream, xmlSnippetCollection);
        }
    }
}



/// <summary>
/// Inserts the XML snippets from the collection at the matching bookmarks found in the XML stream.
/// </summary>
/// <param name="stream">The stream.</param>
/// <param name="xmlSnippetCollection">The XML snippet collection.</param>
private static void ReplaceBookmarks(Stream stream, Dictionary<string, string> xmlSnippetCollection)
{
    const string wordNamespace = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // create an XmlDocument from the passed XML stream
    XmlDocument xmlDocument = new XmlDocument();
    xmlDocument.LoadXml(new StreamReader(stream).ReadToEnd());

    // find all bookmarks
    XmlNodeList selectedNodes = FindNodes(xmlDocument, "/w:document/w:body//w:bookmarkStart");

    if (selectedNodes.Count > 0)
    {
        // add the r: namespace if it does not exist yet; it is necessary for the chunk
        if (xmlDocument.DocumentElement.Attributes["xmlns:r"] == null)
        {
            XmlAttribute nsAttribute = xmlDocument.CreateAttribute("xmlns", "r", "http://www.w3.org/2000/xmlns/");
            nsAttribute.Value = "http://schemas.openxmlformats.org/officeDocument/2006/relationships";
            xmlDocument.DocumentElement.Attributes.Append(nsAttribute);
        }

        foreach (XmlNode selectedNode in selectedNodes)
        {
            // look up w:name by name instead of relying on the attribute order
            XmlAttribute nameAttribute = selectedNode.Attributes["name", wordNamespace];
            if (nameAttribute == null)
                continue;

            string bookmarkName = nameAttribute.Value;
            if (xmlSnippetCollection.ContainsKey(bookmarkName))
            {
                // insert the references after the bookmarks
                // (after the paragraph, otherwise the document produces errors)
                if (selectedNode.ParentNode != null && selectedNode.ParentNode.Name == "w:p")
                    selectedNode.ParentNode.CreateNavigator().InsertAfter(xmlSnippetCollection[bookmarkName]);
                else
                    selectedNode.CreateNavigator().InsertAfter(xmlSnippetCollection[bookmarkName]);
            }
        }

        // reset the stream and fill it with the new content
        byte[] buf = (new UTF8Encoding()).GetBytes(xmlDocument.OuterXml);
        stream.Seek(0, SeekOrigin.Begin);
        stream.Write(buf, 0, buf.Length);
    }
}
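
For comparison, later versions of the Open XML SDK (2.0 and up, namespace DocumentFormat.OpenXml) offer typed classes for the same altChunk technique. The following is a rough sketch under that assumption, not the April CTP code used above; AltChunkSketch and AppendXhtml are made-up names, and the chunk is simply placed after the last paragraph instead of at a bookmark:

```csharp
using System.IO;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class AltChunkSketch
{
    // Stores the XHTML as an alternative format import part and references it
    // from the body with a w:altChunk element. chunkId is the relationship id
    // and must be unique within the main document part.
    public static void AppendXhtml(string documentPath, Stream xhtmlStream, string chunkId)
    {
        using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(documentPath, true))
        {
            MainDocumentPart mainPart = wordDoc.MainDocumentPart;

            AlternativeFormatImportPart chunkPart =
                mainPart.AddAlternativeFormatImportPart(AlternativeFormatImportPartType.Xhtml, chunkId);
            chunkPart.FeedData(xhtmlStream);

            AltChunk altChunk = new AltChunk { Id = chunkId };

            // place the chunk after the last paragraph, or at the end of an empty body
            Body body = mainPart.Document.Body;
            Paragraph lastParagraph = body.Elements<Paragraph>().LastOrDefault();
            if (lastParagraph != null)
                body.InsertAfter(altChunk, lastParagraph);
            else
                body.AppendChild(altChunk);

            mainPart.Document.Save();
        }
    }
}
```

Word replaces the altChunk with the imported content when it opens and saves the document, so the XHTML part only lives in the package until then.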


Fourthly) Put everything together

protected void btnDoTheMagic_Click(object sender, EventArgs e)
{
    // define the filename
    string fileName = Path.Combine(Server.MapPath(""), "Generated.docx");

    // generate a new document
    CreateNewWordDocument(fileName);

    // add a bookmark in the document
    InsertBookmark(fileName, "AddXHtmlHere");

    // get the stream of the website and add it to the document;
    // the using blocks close the response and its stream afterwards
    using (WebResponse response = WebRequest.Create(TextBox1.Text).GetResponse())
    using (Stream stream = response.GetResponseStream())
    {
        AddXHtmlPart(fileName, stream, "AddXHtmlHere");
    }

    // Force this content to be downloaded
    // as a Word document with the name of your choice.
    // A .docx file uses the Open XML content type rather than application/msword.
    Response.ContentType = "application/vnd.openxmlformats-officedocument.wordprocessingml.document";
    Response.AppendHeader("Content-Disposition", "attachment; filename=myword.docx");

    Response.WriteFile(fileName);
    Response.End();
}

 

In the end it looks like this. OK, it's not exactly like the original, but acceptable ... or not?

[1] http://openxmldeveloper.com/
[2] Open XML SDK Documentation
[3] Using XHTML in a WordprocessingML document
[4] download OpenXML SDK
[5] WordProgramming.zip

Saturday, 26 April 2008 00:04:48 (GMT Daylight Time, UTC+01:00)
Monday, 14 April 2008

... you will find a lot of people.

No, I am not schizophrenic (I believe), but last weekend I played around a little with the new Silverlight 2 Deep Zoom control [0].

... and that's the result: my cat, my city, myself ;-) (only visible in IE with the latest Silverlight plugin)

Use the mouse to zoom and move around.

What you need are some big, very detailed images and the Deep Zoom Composer [5], which generates the tiles. Then point the MultiScaleImage control at the generated output:

msiSnippet.png

The hardest but also most fun part was getting a nice, detailed picture. I only found the virtual pixel town [1] from a German pixel project. Unfortunately, most of the good gigapixel panorama images cost a lot, so I decided to generate my own ;) The first is a sweet "fear-me" cat [2], which I converted to ASCII text with the help of the mosascii m2 tool [3]. The big mosaic of myself I created with AndreaMosaic [4].

Notes:
[0] Thanks to Jeff Prosise for his Expression Blend solution (http://www.wintellect.com/cs/blogs/jprosise/archive/2008/04/01/silverlight-deep-zoom.aspx)
[1] http://www.vo-pixeltown.com/
[2] http://allcutecats.blogspot.com/2007/09/fear-me-kitten.html
[3] http://www.mosasciim2.com
[4] http://www.andreaplanet.com/andreamosaic/
[5] http://www.microsoft.com/downloads/details.aspx?familyid=457b17b7-52bf-4bda-87a3-fa8a4673f8bf&displaylang=en
[6] DeepZoom Memorabilia: http://memorabilia.hardrock.com/ 

 

Monday, 14 April 2008 18:28:01 (GMT Daylight Time, UTC+01:00)
