This module on data and text mining and analysis is both relevant and timely. Just yesterday, as I was working with Voyant and exploring data projects such as “Robots Reading Vogue,” I saw this in my news feed. This Bloomberg article provides a visual representation of this year’s presidential debate, with word analyses based on big data:
I think Voyant is one of the coolest and most useful tools I’ve ever used. That said, the web version is very glitchy. Getting key words to show for different states, and exporting a link that matched the correct visual, took over four hours. Also, if I stepped away from my computer for any length of time, I had to start over with stop words, filters, etc. To get the export links I wanted, I found it easier to reload individual documents (one per state) into Voyant, and I hope the activity links I entered do in fact show the differences I was looking for as I followed the activity directions. I would not use this with my students until I had worked out the kinks and fully tested the documents to be used in class. As an educator, I know all too well from experience that if something can go wrong with software or web-based applications when working with students, it usually does. That said, I have downloaded a version of Voyant to my computer, and I hope this will make it more user-friendly and more useful for data analysis.
Despite the technical difficulties, Voyant allows users to mine and assess enormous amounts of data in many different ways. To have such a tool is an incredible gift for both teachers and students. You can visualize word usage with word clouds, see links between words, chart the use of key words across a corpus or within a single document, and view word use in context, with a window that ranges from 10 words to the full text.
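To make the "word use in context" idea concrete, here is a minimal Python sketch of a keyword-in-context lookup. This is not how Voyant itself works internally; the function name `keyword_in_context`, the `window` parameter, and the sample sentence are all my own inventions for illustration.

```python
import re


def keyword_in_context(text, keyword, window=10):
    """Return (left, word, right) for each occurrence of keyword.

    `window` is the number of words kept on either side of the match.
    """
    # Split the text into simple word tokens (letters and apostrophes only).
    words = re.findall(r"[A-Za-z']+", text)
    hits = []
    for i, word in enumerate(words):
        if word.lower() == keyword.lower():
            left = " ".join(words[max(0, i - window):i])
            right = " ".join(words[i + 1:i + 1 + window])
            hits.append((left, word, right))
    return hits


# Invented sample text, not taken from any real corpus.
sample = (
    "The debate turned to jobs. Both candidates said jobs matter, "
    "and jobs came up again."
)

for left, word, right in keyword_in_context(sample, "jobs", window=2):
    print(f"{left} [{word}] {right}")
```

Widening `window` shows more of the surrounding text, which is roughly what sliding the context range from 10 words toward full text does.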
New users should:
- Open http://voyant-tools.org/
- Paste URLs or text, or upload a document, to generate the text data
- Manipulate “stop words” to appropriately cull key words
- Compare/contrast key words in different documents as well as across the entire corpus
- Study and analyze key words using the Cirrus (word cloud), Trends, Reader, Summary, and Contexts tools
- Draw conclusions
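For anyone who wants to see what the stop-word and comparison steps above amount to, here is a rough Python sketch. It assumes a tiny hand-made stop-word list and two invented documents (`doc_a`, `doc_b`); Voyant's own stop-word lists and counting rules are more involved, so treat this as an illustration only.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list for illustration.
STOP_WORDS = {"the", "a", "and", "of", "to", "in", "was", "i", "my"}


def key_words(text, stop_words=STOP_WORDS):
    """Count the words in text, ignoring case and skipping stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in stop_words)


# Invented documents standing in for separately uploaded files.
documents = {
    "doc_a": "The river and the farm. The farm was large.",
    "doc_b": "The town and the river. The river was wide.",
}

# Key-word counts for each document, then for the corpus as a whole.
per_document = {name: key_words(text) for name, text in documents.items()}
corpus = sum(per_document.values(), Counter())

for name, counts in per_document.items():
    print(name, counts.most_common(3))
print("corpus", corpus.most_common(3))
```

Adding or removing entries in `STOP_WORDS` changes which words rise to the top, which is why adjusting stop words matters before comparing documents.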
Trends: Frequency of “Mother” in North Carolina WPA Slave Narratives