Processing Web Data With XML And XSLT - Part 2
Open4Tech Summer School, 2020
© 2020 Syncro Soft SRL. All rights reserved.
Bogdan Dumitru, Syncro [email protected]
Note:
• GitHub repository: https://github.com/dumitrubogdanmihai/processing-web-data-with-xml-and-xslt
• Questions
Let's recap
In the first part we:
• saw how browsers render web pages
• talked about Selenium
• created a web crawler
Today's Agenda
• XML
• XPath
• XSLT
• Live Coding
About XML
eXtensible Markup Language
• It is a markup language
– text is surrounded by tags (that provide semantics)
• It doesn't define a fixed set of elements
About XML - Syntax
XML Syntax Rules
• The prolog, if present, must be on the first line
• There must be exactly one root element
• Every start tag must have a matching end tag
– or be a self-closing tag
• Entities
– &lt; &gt; &amp; &apos; &quot;
• Comments
– <!-- TODO: fix it! -->
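The entity rule can be tried out with a small sketch. It assumes Python's standard library and a made-up sample string; `to_xml_text` is a hypothetical helper name, not something from the course repository.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# escape() replaces &, < and > by default; the two quote characters
# are passed in explicitly so all five predefined entities are covered.
QUOTES = {"'": "&apos;", '"': "&quot;"}

def to_xml_text(raw: str) -> str:
    """Return raw text with the reserved characters written as entities."""
    return escape(raw, QUOTES)

raw = 'if a < b && name == "O\'Neil"'   # hypothetical sample text
encoded = to_xml_text(raw)
print(encoded)

# A parser turns the entities back into the original characters.
decoded = ET.fromstring(f"<code>{encoded}</code>").text
print(decoded == raw)
```

Without the escaping step, the `<` and `&` characters in the sample would make the document not well-formed.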
About XML – Verification
Well-formed XML vs Valid XML
• Well-formed = conforms to the syntax rules
– e.g. no missing end tags, no overlapping tags
• Valid = conforms to the schema rules
– e.g. no elements other than those the schema declares
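A minimal well-formedness check, assuming Python's standard `xml.etree.ElementTree` parser. The sample documents and the `is_well_formed` helper are made up for illustration. Checking validity needs a schema-aware tool, which the standard library does not provide, so only the first half of the distinction is shown.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Hypothetical samples, one per rule mentioned on the slide.
samples = {
    "ok": "<note><to>Ana</to></note>",
    "missing end tag": "<note><to>Ana</note>",
    "overlapping tags": "<b><i>text</b></i>",
    "two root elements": "<a/><b/>",
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```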
XML Strong Points
• Semantics
– data is wrapped in semantics
– XML vocabularies
• Validation
– controlled structure
• Reuse
– data isn't duplicated
– XInclude
Where is it used?
Wherever any of the following needs arise:
• semantic content
• content reuse
• well-structured content
• content validation
XML vs HTML vs XHTML
• XML
– standard (specification) for describing structure and content
– extensible
– can explain what data means
• HTML (HyperText Markup Language)
– not extensible (fixed set of tags)
– can't explain what data means
• XHTML
– HTML that conforms to XML standards (well-formed HTML)
XML-related Technologies
The XML world is really big!
• XML
• XPath
• XSLT
• XQuery
• XSD, DTD
• SVG
• etc.
XML DOM
XML Document Object Model
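To give a feel for the DOM as a tree of nodes, here is a small sketch using Python's standard `xml.dom.minidom`. The sample document and the `outline` helper are assumptions made for this example.

```python
from xml.dom import minidom

# Hypothetical sample document.
doc = minidom.parseString(
    "<library><book id='b1'><title>XML Basics</title></book></library>"
)
root = doc.documentElement

def outline(node, depth=0):
    """Return one indented line per element or non-blank text node."""
    lines = []
    if node.nodeType == node.ELEMENT_NODE:
        lines.append("  " * depth + node.tagName)
        for child in node.childNodes:
            lines.extend(outline(child, depth + 1))
    elif node.nodeType == node.TEXT_NODE and node.data.strip():
        lines.append("  " * depth + repr(node.data))
    return lines

print("\n".join(outline(root)))

# Attributes are reached through the element node that owns them.
first_book = root.getElementsByTagName("book")[0]
print(first_book.getAttribute("id"))
```

Elements, text, and attributes are all objects in the tree; the text "XML Basics" is a child node of `title`, not a property of it.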
XPath
XML Path Language
• Selects a set of nodes within an XML document
• Heavily used in XSLT
• Most CSS selectors can be rewritten in XPath
XPath – Syntax 1
• Basic syntax
– / - the document root
– //div - all div elements within the document
– /html/body/div - div elements that are children of body
– /html/body/.. - the parent of body (html)
– /html/body/* - the child elements of body
– */@class - the “class” attributes of the child elements
– . - the current (context) node
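A few of these paths can be tried with Python's standard `xml.etree.ElementTree`. This is a sketch under two assumptions: the page below is made up, and ElementTree supports only a subset of XPath, so paths are written relative to the root element and attribute values are read with `.get()` instead of `@class`.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample page (well-formed, XHTML-like).
PAGE = """<html>
  <body>
    <div class="menu"><p>Home</p></div>
    <div class="content"><div class="note">Hi</div></div>
  </body>
</html>"""

root = ET.fromstring(PAGE)  # root is the <html> element

all_divs = root.findall(".//div")          # counterpart of //div
body_divs = root.findall("./body/div")     # counterpart of /html/body/div
body_children = root.findall("./body/*")   # counterpart of /html/body/*
parent_of_body = root.find("./body/..")    # counterpart of /html/body/..

# ElementTree cannot return attribute nodes, so the values are read directly.
classes = [div.get("class") for div in all_divs]

print(len(all_divs), len(body_divs), parent_of_body.tag, classes)
```

A full XPath engine would accept the absolute forms from the slide as written.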
XPath – Syntax 2
• Axes
– //p/following-sibling::* - sibling elements placed after p
– //p/preceding-sibling::* - sibling elements placed before p
– //p/descendant::* - descendants of p
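Python's standard ElementTree does not implement XPath axes, so the sketch below only emulates what the three axes select, by walking the tree by hand. The document and the `sibling_axes` and `descendants` helpers are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
DOC = ("<body><h1>Title</h1><p>first <b>bold</b></p>"
       "<div>a</div><p>second</p><span>b</span></body>")
root = ET.fromstring(DOC)

def sibling_axes(root, tag):
    """For every <tag> element, list the tags of its preceding and
    following siblings (what preceding-sibling::* / following-sibling::* select)."""
    results = []
    for parent in root.iter():
        children = list(parent)
        for i, child in enumerate(children):
            if child.tag == tag:
                before = [c.tag for c in children[:i]]
                after = [c.tag for c in children[i + 1:]]
                results.append((before, after))
    return results

def descendants(root, tag):
    """Tags of all elements below any <tag> element (what descendant::* selects)."""
    return [e.tag for el in root.iter(tag) for e in el.iter() if e is not el]

for before, after in sibling_axes(root, "p"):
    print("before:", before, "after:", after)
print("descendants of p:", descendants(root, "p"))
```

Siblings share the same parent, which is why `b` (a child of the first `p`) appears only among the descendants.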
XPath – Syntax 3
• Operators
– “|”
● //book | //magazine
– “=”
● //book[@price=9.8]
– “or”
● //book[@price<9.8 or @price>10]
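What the three operators select can be sketched with Python's standard ElementTree, with caveats: it has no `|` or `or`, and it compares attribute values as strings. The union and the `or` condition are therefore emulated in plain Python. The catalog and the price thresholds are made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog.
CATALOG = """<catalog>
  <book price="9.8" title="A"/>
  <book price="12.5" title="B"/>
  <magazine price="4.0" title="C"/>
</catalog>"""
root = ET.fromstring(CATALOG)

# "|" (union): emulated by concatenating the results of two queries.
books_and_magazines = root.findall(".//book") + root.findall(".//magazine")

# "=": supported as a predicate, but the value is matched as a string.
exact_price = root.findall(".//book[@price='9.8']")

# "or": emulated with a Python condition on the numeric price.
def price(element):
    return float(element.get("price"))

cheap_or_pricey = [e for e in root.findall(".//*")
                   if price(e) < 5 or price(e) > 10]

print([e.get("title") for e in books_and_magazines])
print([e.get("title") for e in exact_price])
print([e.get("title") for e in cheap_or_pricey])
```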
XSLT
Extensible Stylesheet Language Transformations
• Transform/remodel XML documents
XSLT
Basic concepts
• <template>
– <value-of>
– context (.)
– <copy>
• <apply-templates>
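Python's standard library has no XSLT processor, so the sketch below only imitates the processing model in plain Python: one function per template, a dispatch step that plays the role of `apply-templates`, and the current node as the context. The stylesheet is kept as a string for comparison and is not executed. The element names, sample data, and function names are all assumptions.

```python
import xml.etree.ElementTree as ET

# The stylesheet being imitated (for reference only, not executed here).
STYLESHEET = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="library">
    <ul><xsl:apply-templates/></ul>
  </xsl:template>
  <xsl:template match="book">
    <li><xsl:value-of select="title"/></li>
  </xsl:template>
</xsl:stylesheet>"""

# Hypothetical input document.
SOURCE = ("<library><book><title>XML Basics</title></book>"
          "<book><title>XSLT Basics</title></book></library>")

def apply_templates(node):
    """Process each child element with the template matching its name."""
    return "".join(TEMPLATES.get(child.tag, fallback)(child) for child in node)

def fallback(node):
    """Rough stand-in for the built-in rule: keep descending."""
    return apply_templates(node)

def library_template(node):   # plays the role of match="library"
    return "<ul>" + apply_templates(node) + "</ul>"

def book_template(node):      # plays the role of match="book"
    # node is the context (.); findtext stands in for value-of select="title"
    return "<li>" + node.findtext("title") + "</li>"

TEMPLATES = {"library": library_template, "book": book_template}

def transform(xml_text):
    root = ET.fromstring(xml_text)
    return TEMPLATES.get(root.tag, fallback)(root)

print(transform(SOURCE))
```

A real XSLT processor does the matching with XPath patterns and handles text nodes and output serialization itself; the imitation only shows how templates hand control to each other.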
Let's Code
• We'll extract useful data from the files generated in the previous session.