Text Mining the Hathi Trust

Thursday, February 14, 2019 // 10:00 am

Paley Library: Digital Scholarship Center -- 1210 W. Berks Street, Philadelphia, PA 19122


This workshop will introduce students to data analysis using HathiTrust’s repository (containing over 15 million books), especially its extracted features datasets, as well as its Data Portal’s tools for accessing copyrighted data. Students will learn not only how to navigate HathiTrust’s search functionalities, but also how to build their own custom datasets. Accessing HathiTrust’s data analytics tools is an onerous process, so we will walk through each step of accessing a Data Capsule. We will experiment with Voyant-Tools for user-friendly text analysis, as well as some easy-to-adapt R scripts.

Software: HathiTrust Data Analytics tools, Voyant-Tools, R Programming Language

This workshop is part of a series. Register at https://paleystudy.temple.edu/event/4921129

Digital Scholarship Center