
Post on 09-Jun-2020


Transform Open XML Documents with Open XML SDK, Azure Functions and Microsoft Flow

Tom Jebo

Sr Escalation Engineer

India

2014

Frustration!

What if…we could automate?

Goals

• Allow presenters to upload their slide deck

• Automatically process the deck to change the theme

What do we need?

• A repository for the decks

• An understanding of Open XML File Format

• A way to programmatically transform a deck

• A way to automate the process

Agenda

• Overview of Open XML File Format Markup Languages

• Open XML SDK

• Demo: Transforming a Presentation

• Tools, Support & Resources

Brief history of Office Open XML

• 2000: First XML-based format used by Office XP

• 2003: Microsoft Office XML format released in Office 2003

• 2005/6: Office Open XML submitted to ECMA International

• 2007: Office 2007 makes OOXML the default file format

• 2008: ISO/IEC 29500:2008 published

Support by Office Version

First to support OOXML, default format

ECMA-376 support

ISO/IEC 29500 “Transitional” r/w

ISO/IEC 29500 “Strict” read

ECMA-376 read

ISO/IEC 29500 “Strict” r/w

i.e. File | Save As… and choose “Strict Open XML Presentation (*.pptx)”

“Strict” r/w


Get the standard

(search “iso 29500 download”)

Parts of the ISO 29500 Standard

• Part 1, Fundamentals: parts and markup reference

• Part 2, Open Packaging: packaging model

• Part 3, Compatibility/Extensibility: MCE elements and attributes

• Part 4, Transitional: namespaces and elements in transitional markup

WordprocessingML, SpreadsheetML, PresentationML & DrawingML

• Parts and elements to get started:

  • WordprocessingML
    • 11.3.10 (document.xml)
    • 17.2 and 17.3 (body, paragraphs and runs)

  • SpreadsheetML
    • 12.3.24 (sheet<x>.xml)
    • 18.3 (Worksheet elements)

  • PresentationML
    • 13.3.8 (slide<x>.xml)
    • 19.3 (slide elements)

• Schemas:
  • Annex A – W3C XML Schema
  • Annex B – RELAX NG

• Primer
  • Annex L – detailed intro to the MLs. Use this!
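The part names listed above (document.xml, sheet<x>.xml, slide<x>.xml) are entries inside a ZIP-based package, which is what Part 2 (Open Packaging) describes. As a rough illustration only, the following Python sketch builds a toy package in memory and lists its parts. The helper names `build_toy_package` and `list_parts` are hypothetical, and the toy package is far too small to open in Word; it only shows the container layout.

```python
import io
import zipfile

# Minimal content-types part; a real package declares many more entries.
CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Override PartName="/word/document.xml" '
    'ContentType="application/vnd.openxmlformats-officedocument.'
    'wordprocessingml.document.main+xml"/>'
    '</Types>'
)


def build_toy_package() -> bytes:
    """Create a tiny, illustrative OPC-style ZIP package in memory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as z:
        z.writestr("[Content_Types].xml", CONTENT_TYPES)
        z.writestr(
            "_rels/.rels",
            "<Relationships xmlns='http://schemas.openxmlformats.org/"
            "package/2006/relationships'/>",
        )
        z.writestr(
            "word/document.xml",
            "<w:document xmlns:w='http://schemas.openxmlformats.org/"
            "wordprocessingml/2006/main'><w:body/></w:document>",
        )
    return buf.getvalue()


def list_parts(package_bytes: bytes) -> list[str]:
    """Return the sorted entry names of a ZIP-based package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as z:
        return sorted(z.namelist())


parts = list_parts(build_toy_package())
print(parts)
```

The same `list_parts` call should work on the bytes of a real .docx, .xlsx or .pptx file, since all three use the same packaging model.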

<w:document …>
  <w:body>
    <w:p w14:paraId="2673269E" w14:textId="522FD3EB" w:rsidR="00BD0355" w:rsidRDefault="00BD0355">
      <w:r>
        <w:t xml:space="preserve">This is a run of text. It is part of a paragraph.</w:t>
      </w:r>
    </w:p>
    <w:p w14:paraId="43453C87" w14:textId="3F744A5F" w:rsidR="00BD0355" w:rsidRDefault="00BD0355">
      <w:r>
        <w:t>Here is another run of text in a new paragraph.</w:t>
      </w:r>
    </w:p>
  </w:body>
</w:document>
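The body / paragraph / run / text nesting in the WordprocessingML sample above can be walked with any XML parser. A minimal Python sketch, assuming a simplified copy of that markup (the `w14` attributes are dropped and the first paragraph is split into two runs for illustration; `paragraph_texts` is a hypothetical helper name):

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Simplified stand-in for word/document.xml.
DOCUMENT_XML = f"""
<w:document xmlns:w="{W}">
  <w:body>
    <w:p>
      <w:r><w:t xml:space="preserve">This is a run of text. </w:t></w:r>
      <w:r><w:t>It is part of a paragraph.</w:t></w:r>
    </w:p>
    <w:p>
      <w:r><w:t>Here is another run of text in a new paragraph.</w:t></w:r>
    </w:p>
  </w:body>
</w:document>
"""


def paragraph_texts(document_xml: str) -> list[str]:
    """Join the w:t text of every run, one string per w:p paragraph."""
    root = ET.fromstring(document_xml)
    return [
        "".join(t.text or "" for t in p.iterfind(".//w:t", NS))
        for p in root.iterfind("./w:body/w:p", NS)
    ]


texts = paragraph_texts(DOCUMENT_XML)
print(texts)
```

Text is stored per run, so a reader has to concatenate the runs of a paragraph to recover the sentence the user sees.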

<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml…" …>
  <dimension ref="A1:B4"/>
  <sheetData>
    <row r="1" spans="1:2" x14ac:dyDescent="0.25">
      <c r="A1"><v>1</v></c>
    </row>
    <row r="2" spans="1:2" x14ac:dyDescent="0.25">
      <c r="A2"><v>2</v></c>
    </row>
    <row r="4" spans="1:2" x14ac:dyDescent="0.25">
      <c r="B4" t="s"><v>1</v></c>
    </row>
  </sheetData>
</worksheet>
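In the SpreadsheetML sample above, cell B4 carries t="s", which means its <v> value is an index into the workbook's shared string table rather than a literal. A minimal Python sketch of that lookup, assuming an invented shared string table (the strings "Total" and "Hello" are made up for illustration, and `read_cells` is a hypothetical helper name):

```python
import xml.etree.ElementTree as ET

S = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS = {"s": S}

# Simplified stand-in for xl/worksheets/sheet1.xml.
SHEET_XML = f"""<worksheet xmlns="{S}">
  <sheetData>
    <row r="1"><c r="A1"><v>1</v></c></row>
    <row r="2"><c r="A2"><v>2</v></c></row>
    <row r="4"><c r="B4" t="s"><v>1</v></c></row>
  </sheetData>
</worksheet>"""

# Invented stand-in for xl/sharedStrings.xml.
SHARED_STRINGS_XML = f"""<sst xmlns="{S}">
  <si><t>Total</t></si>
  <si><t>Hello</t></si>
</sst>"""


def read_cells(sheet_xml: str, shared_strings_xml: str) -> dict:
    """Map cell references to values, resolving t="s" via the string table."""
    shared = [
        "".join(t.text or "" for t in si.iterfind(".//s:t", NS))
        for si in ET.fromstring(shared_strings_xml).iterfind("s:si", NS)
    ]
    cells = {}
    for c in ET.fromstring(sheet_xml).iterfind(".//s:c", NS):
        v = c.find("s:v", NS)
        if v is None:
            continue  # cell has no stored value
        if c.get("t") == "s":
            cells[c.get("r")] = shared[int(v.text)]  # index into string table
        else:
            cells[c.get("r")] = float(v.text)  # plain numeric cell
    return cells


cells = read_cells(SHEET_XML, SHARED_STRINGS_XML)
print(cells)
```

Only the numeric and shared-string cases are handled here; other cell types (inline strings, booleans, formulas) are left out to keep the sketch short.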

<p:sld xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" …>
  <p:cSld>
    <p:spTree>
      <p:sp>
        <p:nvSpPr>
          <p:cNvPr id="2" name="Title 1"/>
          <p:cNvSpPr><a:spLocks noGrp="1"/></p:cNvSpPr>
        </p:nvSpPr>
        <p:spPr/>
        <p:txBody>
          <a:p>
            <a:r>
              <a:rPr lang="en-US" dirty="0" smtClean="…"/>
              <a:t>Fancy art from the internet</a:t>
            </a:r>
          </a:p>
        </p:txBody>
      </p:sp>
    </p:spTree>
  </p:cSld>
</p:sld>
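The PresentationML sample above mixes two namespaces: `p:` elements describe the slide and its shapes, while the text itself lives in DrawingML `a:` elements. A minimal Python sketch that pairs each shape name with its text, assuming a simplified copy of that slide (`shape_texts` is a hypothetical helper name):

```python
import xml.etree.ElementTree as ET

NS = {
    "a": "http://schemas.openxmlformats.org/drawingml/2006/main",
    "p": "http://schemas.openxmlformats.org/presentationml/2006/main",
}

# Simplified stand-in for ppt/slides/slide1.xml.
SLIDE_XML = f"""<p:sld xmlns:a="{NS['a']}" xmlns:p="{NS['p']}">
  <p:cSld>
    <p:spTree>
      <p:sp>
        <p:nvSpPr>
          <p:cNvPr id="2" name="Title 1"/>
          <p:cNvSpPr/>
          <p:nvPr/>
        </p:nvSpPr>
        <p:spPr/>
        <p:txBody>
          <a:bodyPr/>
          <a:p><a:r><a:rPr lang="en-US"/><a:t>Fancy art from the internet</a:t></a:r></a:p>
        </p:txBody>
      </p:sp>
    </p:spTree>
  </p:cSld>
</p:sld>"""


def shape_texts(slide_xml: str) -> dict[str, str]:
    """Map each shape's name (p:cNvPr/@name) to its concatenated a:t text."""
    root = ET.fromstring(slide_xml)
    result = {}
    for sp in root.iterfind(".//p:sp", NS):
        name = sp.find("p:nvSpPr/p:cNvPr", NS).get("name")
        result[name] = "".join(t.text or "" for t in sp.iterfind(".//a:t", NS))
    return result


shapes = shape_texts(SLIDE_XML)
print(shapes)
```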

Supporting Open Specifications

[MS-OI29500]: Notes for Microsoft Office products implementing ISO/IEC 29500

[MS-ODRAWXML], [MS-DOCX], [MS-XLSX], [MS-PPTX]: Extensions to the DrawingML, WordprocessingML, SpreadsheetML and PresentationML standard elements defined in ISO/IEC 29500

See reference note for the URL to the MSDN page for these.

Open XML SDK on GitHub

• GitHub: Fork, build, modify

• NuGet package: Release and latest builds

• Issues: Report bugs

• Contribution: Enhancements, fixes

SDK Classes

OOXML SDK API (DocumentFormat.OpenXml.dll):

• Part API

• DOM API

• Framework API

• Schema validation

• Semantic validation

SDK Class generation process

1. Generate OpenXmlPart API source code

2. Build Framework API

3. Process schema files

4. Generate DOM API source code

5. Generate schema validation data

6. Generate semantic validation data

7. Build DocumentFormat.OpenXml DLL

Common Scenarios

Generation, Extraction, Transform

WordprocessingML

SpreadsheetML

PresentationML

Best Practices and Getting Started

• XSL Transformation using Flat OPC

• Use Templates (Office)

• Use Reflected code and modify (Productivity Tool)

• Use LINQ to find target elements and parts (.NET)

Real World Usage

• Microsoft Open Specifications!

• Microsoft internal/public

• Many major organizations

Putting it all together:

High-performance extraction, modification and generation of

Demo: Transform a Presentation using Azure Functions and Flow
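The demo itself uses the Open XML SDK inside an Azure Function and is not reproduced in this transcript. As a rough stand-in only, the Python sketch below shows the core idea of a theme change at the package level: copy every part of the deck and substitute the theme part. The part name `ppt/theme/theme1.xml` is the conventional one and is an assumption here, the helper names `replace_part` and `make_toy_deck` are hypothetical, and the toy deck is not a valid presentation. A real transform would also need to keep slide masters, layouts and relationships consistent with the new theme.

```python
import io
import zipfile

# Conventional theme part name; real decks may use a different one (assumption).
THEME_PART = "ppt/theme/theme1.xml"


def replace_part(package: bytes, part_name: str, new_content: bytes) -> bytes:
    """Return a copy of the ZIP package with one part's bytes replaced."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(package)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        if part_name not in src.namelist():
            raise KeyError(f"part not found: {part_name}")
        for info in src.infolist():
            # Copy every part unchanged except the one being swapped.
            data = new_content if info.filename == part_name else src.read(info.filename)
            dst.writestr(info.filename, data)
    return out.getvalue()


def make_toy_deck(theme_xml: bytes) -> bytes:
    """Build a tiny illustrative package with a presentation part and a theme part."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as z:
        z.writestr("ppt/presentation.xml", b"<presentation/>")
        z.writestr(THEME_PART, theme_xml)
    return buf.getvalue()


old_deck = make_toy_deck(b'<theme name="Old"/>')
new_deck = replace_part(old_deck, THEME_PART, b'<theme name="Conference"/>')

with zipfile.ZipFile(io.BytesIO(new_deck)) as z:
    swapped = z.read(THEME_PART)
    untouched = z.read("ppt/presentation.xml")
print(swapped, untouched)
```

Because the function takes bytes and returns bytes, it fits the shape of an upload-triggered step: read the uploaded deck, transform it, write the result back to the repository.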


Upcoming Sessions

Tomorrow:

9:00am Intro to Open Specifications

9:15am Exchange Server Protocol Overview 2019

9:45am-1:15pm

MAPI, FSSHTTP & WOPI Roundtables

1:15-4:45pm

SQL Server 2019 Protocol Overview

SQL Server Remote Storage (SMB)

SQL Server 2019 Big Data Cluster Overview

Introducing Azure SQL DB Edge

Tools

Open XML Package Editor for Visual Studio:

https://github.com/OfficeDev/Open-XML-Package-Editor-Power-Tool-for-Visual-Studio

OOXML Tools Extension for Chrome

(search “ooxml tools chrome” and install in Chrome)

Open XML SDK Productivity Tool

(search “open xml sdk 2.5”, click download, OpenXMLSDKToolV25.msi)

Support & Resources

SDK

Open XML SDK: https://github.com/OfficeDev/Open-Xml-Sdk

OpenXMLDeveloper: http://www.openxmldeveloper.org

libopc: http://libopc.codeplex.com (third-party open source OOXML library)

Transforming Open XML Documents using XSLT: https://blogs.msdn.microsoft.com/ericwhite/2008/09/29/transforming-open-xml-documents-using-xslt/

Azure Functions: https://docs.microsoft.com/en-us/azure/azure-functions/

Open Specifications

dochelp@Microsoft.com

https://social.msdn.microsoft.com/Forums/en-US/home?category=openspecifications