NLP in Archival Processing · bitcurator.net/files/2016/12/mennerich-1.pdf · Lloyd k-means clustering: ...
TRANSCRIPT
![Page 1: NLP in Archival Processing](https://reader033.vdocument.in/reader033/viewer/2022042208/5eaba8339871a907b93ff6f0/html5/thumbnails/1.jpg)
NLP in Archival Processing
Donald Mennerich, NYU Libraries
![Page 2](https://reader033.vdocument.in/reader033/viewer/2022042208/5eaba8339871a907b93ff6f0/html5/thumbnails/2.jpg)
Scale
![Page 3](https://reader033.vdocument.in/reader033/viewer/2022042208/5eaba8339871a907b93ff6f0/html5/thumbnails/3.jpg)
![Page 4](https://reader033.vdocument.in/reader033/viewer/2022042208/5eaba8339871a907b93ff6f0/html5/thumbnails/4.jpg)
![Page 5](https://reader033.vdocument.in/reader033/viewer/2022042208/5eaba8339871a907b93ff6f0/html5/thumbnails/5.jpg)
Forensics
![Page 6](https://reader033.vdocument.in/reader033/viewer/2022042208/5eaba8339871a907b93ff6f0/html5/thumbnails/6.jpg)
<!-- plugin_process -->
<pronomPuid>x-fmt/391</pronomPuid>
<pronomFormatName>Exchangeable Image File Format (Compressed)</pronomFormatName>
<pronomSignatureName>EXIF Compressed Image 2.2</pronomSignatureName>
<pronomMimeType>image/jpeg</pronomMimeType>
<pronomMatchType>signature</pronomMatchType>
<pronomSignatureFileVersion>formats-v70.xml</pronomSignatureFileVersion>
<pronomContainerFileVersion>20130501.xml</pronomContainerFileVersion>
<fidoVersion>1.3.1</fidoVersion>
<identificationUuid>49050200-e308-4060-886a-14a8efd82078</identificationUuid>
<scanStatus>PASSED</scanStatus>
<clamAVVersion>ClamAV 0.98.1</clamAVVersion>
<virusScanUuid>6dea8d54-107a-43d0-a1fe-beb9c6bc4a21</virusScanUuid>
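A fragment like the one above can be read into a simple lookup table for later processing. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the `<record>` wrapper element and the `parse_identification` helper are assumptions made for illustration, since the slide shows only a fragment without its enclosing element.

```python
import xml.etree.ElementTree as ET

# A few elements copied from the slide. The fragment has no root element,
# so a hypothetical <record> wrapper is added before parsing.
fragment = """
<pronomPuid>x-fmt/391</pronomPuid>
<pronomFormatName>Exchangeable Image File Format (Compressed)</pronomFormatName>
<pronomMimeType>image/jpeg</pronomMimeType>
<fidoVersion>1.3.1</fidoVersion>
<scanStatus>PASSED</scanStatus>
<clamAVVersion>ClamAV 0.98.1</clamAVVersion>
"""


def parse_identification(xml_fragment: str) -> dict:
    """Return a mapping of element name to element text."""
    root = ET.fromstring(f"<record>{xml_fragment}</record>")
    return {child.tag: (child.text or "").strip() for child in root}


record = parse_identification(fragment)
print(record["pronomPuid"], record["pronomMimeType"], record["scanStatus"])
```

A dictionary keeps the sketch short; a real workflow would more likely parse the complete metadata file rather than a pasted fragment.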
![Page 7](https://reader033.vdocument.in/reader033/viewer/2022042208/5eaba8339871a907b93ff6f0/html5/thumbnails/7.jpg)
![Page 8](https://reader033.vdocument.in/reader033/viewer/2022042208/5eaba8339871a907b93ff6f0/html5/thumbnails/8.jpg)
![Page 9](https://reader033.vdocument.in/reader033/viewer/2022042208/5eaba8339871a907b93ff6f0/html5/thumbnails/9.jpg)
![Page 10](https://reader033.vdocument.in/reader033/viewer/2022042208/5eaba8339871a907b93ff6f0/html5/thumbnails/10.jpg)
Scale
![Page 11](https://reader033.vdocument.in/reader033/viewer/2022042208/5eaba8339871a907b93ff6f0/html5/thumbnails/11.jpg)
![Page 12](https://reader033.vdocument.in/reader033/viewer/2022042208/5eaba8339871a907b93ff6f0/html5/thumbnails/12.jpg)
![Page 13](https://reader033.vdocument.in/reader033/viewer/2022042208/5eaba8339871a907b93ff6f0/html5/thumbnails/13.jpg)
![Page 14](https://reader033.vdocument.in/reader033/viewer/2022042208/5eaba8339871a907b93ff6f0/html5/thumbnails/14.jpg)
![Page 15](https://reader033.vdocument.in/reader033/viewer/2022042208/5eaba8339871a907b93ff6f0/html5/thumbnails/15.jpg)
![Page 16](https://reader033.vdocument.in/reader033/viewer/2022042208/5eaba8339871a907b93ff6f0/html5/thumbnails/16.jpg)
![Page 17](https://reader033.vdocument.in/reader033/viewer/2022042208/5eaba8339871a907b93ff6f0/html5/thumbnails/17.jpg)
Improvements
• Better infrastructure, distributed processing, machine learning
• Topic modeling, cluster analysis, document similarity
• Visualizations
• Integration with discovery, dissemination and access systems, Linked open data
![Page 18](https://reader033.vdocument.in/reader033/viewer/2022042208/5eaba8339871a907b93ff6f0/html5/thumbnails/18.jpg)
Beyond the obsolete…
![Page 19](https://reader033.vdocument.in/reader033/viewer/2022042208/5eaba8339871a907b93ff6f0/html5/thumbnails/19.jpg)
![Page 20](https://reader033.vdocument.in/reader033/viewer/2022042208/5eaba8339871a907b93ff6f0/html5/thumbnails/20.jpg)
![Page 21](https://reader033.vdocument.in/reader033/viewer/2022042208/5eaba8339871a907b93ff6f0/html5/thumbnails/21.jpg)
![Page 22](https://reader033.vdocument.in/reader033/viewer/2022042208/5eaba8339871a907b93ff6f0/html5/thumbnails/22.jpg)
![Page 23](https://reader033.vdocument.in/reader033/viewer/2022042208/5eaba8339871a907b93ff6f0/html5/thumbnails/23.jpg)
NLP
• Named entity extraction
• Topic modeling
• Clustering
• Classification
• Collaborative filtering
• Language detection
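To make the clustering item concrete, here is a minimal sketch of Lloyd's k-means algorithm in plain Python. It is an illustration under stated assumptions, not the presenter's pipeline: the four document vectors are invented toy term counts, and the function names (`lloyd_kmeans`, `squared_distance`, `mean`) are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def lloyd_kmeans(points, k, iterations=20, seed=0):
    """Alternate assignment and centroid-update steps until centroids stop moving."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids: k distinct input points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            new_centroids.append(mean(members) if members else centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Invented toy data: rows are documents, columns are counts of four terms.
docs = [
    [5, 0, 4, 0],
    [4, 1, 5, 0],
    [0, 5, 0, 4],
    [1, 4, 0, 5],
]
labels, centroids = lloyd_kmeans(docs, k=2)
print(labels)
```

On this toy data the first two documents end up in one cluster and the last two in the other; which cluster gets which label number depends on the random initialization.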
![Page 24](https://reader033.vdocument.in/reader033/viewer/2022042208/5eaba8339871a907b93ff6f0/html5/thumbnails/24.jpg)
![Page 25](https://reader033.vdocument.in/reader033/viewer/2022042208/5eaba8339871a907b93ff6f0/html5/thumbnails/25.jpg)
![Page 26](https://reader033.vdocument.in/reader033/viewer/2022042208/5eaba8339871a907b93ff6f0/html5/thumbnails/26.jpg)
![Page 27](https://reader033.vdocument.in/reader033/viewer/2022042208/5eaba8339871a907b93ff6f0/html5/thumbnails/27.jpg)
Thanks.