
Text Mining the HathiTrust

Thu, Feb 07, 2019 | 10:00 am

This workshop introduces students to data analysis with HathiTrust’s repository, which contains over 15 million books. It focuses especially on the repository’s extracted features datasets and on the Data Portal’s tools for accessing copyrighted data. Students will learn how to navigate HathiTrust’s search functionalities and how to build their own custom datasets. Gaining access to HathiTrust’s data analytics tools is an onerous process, so we will walk through each step of accessing a Data Capsule. We will experiment with Voyant-Tools for user-friendly text analysis and with some easy-to-adapt R scripts.

Software: HathiTrust Data Analytics tools, Voyant-Tools, R programming language

This workshop is part of a series. Register at https://paleystudy.temple.edu/event/4921129

Paley Library

Digital Scholarship Center