…entering the complex and dynamic realm of digital humanities. Through my text mining and visualization work at RRCHNM, I have gained an understanding of how historians can use some of these tools to approach humanistic research related to the Appalachian Trail. My goal has been to engage with hands-on textual analysis tools to examine historical data from the Appalachian Trail Conservancy (ATC), specifically the Appalachian Trailway News (ATN). The overarching historical question was: is there evidence of the presence of Native Americans in the content of the ATN that goes beyond the naming of places and objects, that is, content that speaks about the various indigenous cultures along the Appalachian Trail?
Thanks to the first part of the internship, I am now capable of cleaning raw data and preparing it for analysis. Although arduous at times, this exercise proved very useful for appreciating the work entailed and the purpose of doing it in the first place. There were many aspects to consider as I created my procedure for cleaning up fifteen (15) volumes of the ATN. Where do I access the volumes? How do I organize the files as I clean them? Which file converter should I use? What is the most effective way to go through this process? Which criteria should I employ to remove the “noise”? These were required steps that allowed for more efficient topic modeling. The most challenging part of this activity was experiencing “distant reading” for the first time (Moretti). I had to constantly pull myself away from the content of the ATN and go against my normal method of close reading a text.
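To illustrate what removing “noise” can look like in practice, here is a minimal Python sketch. It is not the procedure I actually followed; the function name `clean_volume_text` and the specific rules (rejoining hyphenated words, dropping lines without letters, collapsing extra whitespace) are assumptions chosen for illustration, and real criteria would depend on the converted files.

```python
import re


def clean_volume_text(raw: str) -> str:
    """Hypothetical helper: strip common conversion noise from plain text."""
    # Rejoin words that were hyphenated across a line break ("Appa-\nlachian").
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)

    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        # Drop non-empty lines that contain no letters at all
        # (bare page numbers, rules, stray punctuation).
        if stripped and not re.search(r"[A-Za-z]", stripped):
            continue
        kept.append(stripped)
    text = "\n".join(kept)

    # Collapse runs of spaces or tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Collapse three or more newlines into one blank line.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


sample = "The Appa-\nlachian Trail\n12\n***\nruns   north."
cleaned = clean_volume_text(sample)
print(cleaned)
```

Each rule is deliberately conservative, since anything removed at this stage is invisible to the topic model later.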
During the second part of my internship, I mainly explored different methods of topic modeling the ATN data. Thanks to my work with Voyant Tools, I can now start making sense of larger amounts of text. With my growing familiarity with MALLET, I can now engage in more in-depth topic modeling routines. The larger the batch of cleaned volumes I completed, the better I appreciated what topic modeling platforms and software can do to help historians analyze larger portions of unformatted text. I can run multiple commands to train models, I have become more comfortable with distant reading, and I can watch new topics emerge that I would never have made sense of through a typical close reading.
The third part involved a brief study of the AT Thru-Hiker Guidebook (2020), an annual publication of the ATC, to find out whether there is an indigenous presence in its content that goes beyond place naming. Throughout all these exercises, the answer to my historical question became evident: there is no significant presence of indigenous lexicon pertaining to the tribal cultures along the Trail, except for toponyms.
As the final project, I am creating a LibGuide on topic modeling tools that will provide an overview of the open-source technologies available to users in the humanities. It will include links to these tools, as well as instructions on how to prepare data, visualize it with Voyant Tools, and analyze it with MALLET. Throughout this semester, my mentor, T. Mills Kelly, has provided me with the guidance and perspective I need to contextualize the data and make sense of my findings.