AntConc – Vivero Peer Mentoring

This tutorial was written by Katherine Walden, Digital Liberal Arts Specialist at Grinnell College. The tutorial framework was created by Sarah Purcell (L.F. Parker Professor of History, Grinnell College) and Papa Ampim-Darko, a student research assistant at Grinnell College.

This tutorial was reviewed by Gina Donovan (Instructional Technologist, Grinnell College).

This tutorial is adapted from the Programming Historian’s Corpus Analysis with AntConc tutorial.

Text Analysis in AntConc is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Developed by Laurence Anthony, AntConc is a free, closed-source program that runs on Windows, macOS, and Linux. At the most basic level, AntConc is a concordancer: a program that constructs a concordance based on terms in a text or collection of texts. AntConc also allows users to visualize concordance results, generate word and keyword lists based on terms present in the text, and run and visualize cluster and collocation analyses. With Voyant, we explored a graphical user interface option for conducting textual analysis. AntConc offers a somewhat more hands-on, customizable approach to analyzing a text.

DOWNLOADING ANTCONC

Some campus computers (the DASIL lab computers, for example) have AntConc pre-installed. If you want to work with AntConc on your own computer, select the appropriate version for your operating system and follow the installation instructions.

1-Launch AntConc by double-clicking on the Desktop icon or searching for the program in the Start menu.

DATA

2-AntConc allows you to open single files or an entire file directory. For this tutorial, we will be working with a large number of oral history text files, so opening the directory makes more sense than loading these files individually. Download the dataset used for this tutorial: MN_Trans_OralHistories_ZIP. Right-click on the folder you downloaded to extract the contents. Copy the extracted folder to your Desktop.

About this data (from the University of Minnesota’s Tretter Transgender Oral History Project): The Tretter Transgender Oral History Project is part of the Jean-Nickolaus Tretter Collection in GLBT Studies at the University of Minnesota Libraries. Transgender voices and experiences are often missing in contemporary documentation and the historic record. The goal of this Project is to empower individuals to tell their story, while providing students, historians, and the public with a richer foundation of primary source material about the Transgender community. Materials are housed within the Tretter Collection. Phase 1 of this Project (2015-2018) focused on documenting the experience of transgender and gender queer people in the Upper Midwest. Oral Historian Andrea Jenkins conducted 200 interviews covering identity, family, love and experiences. These oral histories are posted online. There is also an online exhibit about Phase 1.

3-Select File->Open Dir and navigate to the cleaned_txt_files folder. Click OK. On Mac: Select File->Create Quick Corpus and navigate to the cleaned_txt_files folder. If it does not give you the option to open the folder, highlight all the files within the folder (Shift + Click) and then click Open.

4-The loaded files will be listed in the left-hand window in AntConc, and the total number of files will display at the bottom of that window.
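For readers who want to see what loading a directory amounts to, here is a minimal Python sketch. It is not how AntConc works internally. The function name load_corpus and the folder path in the usage comment are assumptions made for illustration; adjust the path to wherever your cleaned_txt_files folder actually lives.

```python
from pathlib import Path

def load_corpus(folder):
    """Read every .txt file in `folder` into a {filename: text} dictionary."""
    corpus = {}
    for path in sorted(Path(folder).glob("*.txt")):
        # errors="replace" keeps the load going if a file has odd characters.
        corpus[path.name] = path.read_text(encoding="utf-8", errors="replace")
    return corpus

# Usage (hypothetical path, adjust to your own machine):
# corpus = load_corpus(Path.home() / "Desktop" / "cleaned_txt_files")
# print(len(corpus), "files loaded")
```

The file count printed in the usage example corresponds to the total that AntConc displays at the bottom of its file list.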

ANTCONC’S FUNCTIONALITY

5-The main AntConc screen gives you access to seven different textual analysis tools.

Concordance searches for and displays keywords in context (KWIC).
Concordance Plot presents a preliminary, basic visualization of a KWIC search.
File View is like the Reader panel in Voyant—it shows you a full file view to see a search result in the larger context of a text.
Clusters highlights terms that appear together frequently in the text.
Collocates calculates the statistical likelihood of terms appearing together in the text. Where Clusters looks for term patterns exactly as they occur in the text, Collocates estimates how likely terms are to appear near each other.
Word List calculates how frequently words appear in your text.
Keyword List compares keywords from two text sources (a reference text and an analysis text).

SEARCHING KEYWORDS IN CONTEXT

6-AntConc (and other computing tools) excel at identifying patterns in language that are not always detected by the average reader. For example, function words like a, an, the, he, she, I, etc. (often called stopwords in textual analysis) don’t frequently catch our attention as readers. A computational tool focuses on analyzing the words as term objects, rather than interpreting them based on meaning, context, or function.
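To see how a program treats words as countable objects, consider the short Python sketch below. It is an illustration under simple assumptions (lowercasing everything, splitting on letters and apostrophes), not AntConc's own method, and the sample sentence is invented.

```python
import re
from collections import Counter

def word_list(text):
    """Return (word, count) pairs, most frequent first."""
    tokens = re.findall(r"[A-Za-z']+", text.lower())
    return Counter(tokens).most_common()

# Invented sample sentence, for illustration only.
sample = "I went to the school and the teacher gave me the book."
for word, count in word_list(sample)[:3]:
    print(count, word)
```

Even in a single sentence, a function word ("the") tops the list, although most readers would not notice it.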

7-Type “the” in the Search Term box at the bottom of the Concordance window and click Start.

8-The Concordance tab shows key words in context (KWIC), with the search term highlighted.

9-The Kwic Sort options allow you to change how AntConc displays or sorts the context for your search term. 1R includes the term immediately to the right of your search term, 2R includes the second term to the right from your search term, etc. 1L includes the term immediately to the left of your search term, 2L includes the second term to the left from your search term, etc.

10-Change the Kwic Sort options, and click on the Sort icon (on Mac: click the Start button again). How did your search results change? What happens if you continue to customize or edit the Kwic Sort options? How do you understand a key word differently based on how you tell the program to calculate context?
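The idea behind KWIC sorting can be sketched in a few lines of Python. This is a simplified illustration, not AntConc's implementation: the function names kwic and sort_kwic, the tokenizing rule, and the sample text are all assumptions made for the example.

```python
import re

def kwic(text, term, width=3):
    """Return (left words, hit, right words) for each match of `term`."""
    tokens = re.findall(r"[A-Za-z']+", text.lower())
    rows = []
    for i, tok in enumerate(tokens):
        if tok == term:
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            rows.append((left, tok, right))
    return rows

def sort_kwic(rows, levels=("1R", "2R", "3R")):
    """Sort rows by positions such as '1R' (first word right) or '2L'."""
    def key(row):
        left, _, right = row
        parts = []
        for level in levels:
            n, side = int(level[:-1]), level[-1]
            if side == "R":
                words, idx = right, n - 1
            else:
                words, idx = left, len(left) - n
            # Use an empty string when the context runs off the text.
            parts.append(words[idx] if 0 <= idx < len(words) else "")
        return parts
    return sorted(rows, key=key)

# Invented sample text, for illustration only.
sample = "The school was near the river. The teacher left the school early."
for left, hit, right in sort_kwic(kwic(sample, "the"), levels=("1R", "1L")):
    print(f"{' '.join(left):>25} [{hit}] {' '.join(right)}")
```

Changing the levels argument (for example to ("1L", "2L")) reorders the same hits, which is what happens when you change the Kwic Sort options in AntConc.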

VISUALIZING KEYWORDS IN CONTEXT

11-Search for “school” in the Concordance Tab. 12-Once your search has loaded, click on the Concordance Plot tab to visualize your search results. 13-Each instance of the keyword is represented as a vertical black line. AntConc visualizes how keyword appearances are distributed across each file in the Corpus Files.

14-Clicking on a specific line takes you to that passage of the text in File View. 15-How useful do you find these preliminary visualizations? How do these visualizations impact your understanding of the text? What questions do you have based on these visualizations?
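A concordance plot is essentially a scaled map of where hits fall within each file. The Python sketch below draws a text-only version under simple assumptions. The function name plot_line and the two file names are hypothetical, and the sample texts are invented.

```python
import re

def plot_line(text, term, width=40):
    """Draw one file as a bar of `width` cells, marking cells that hold a hit."""
    tokens = re.findall(r"[A-Za-z']+", text.lower())
    bar = ["."] * width
    for i, tok in enumerate(tokens):
        if tok == term:
            # Scale the token's position in the file to a cell in the bar.
            bar[i * width // len(tokens)] = "|"
    return "".join(bar)

# Hypothetical file names with invented contents.
files = {
    "interview_a.txt": "school " + "word " * 8 + "school",
    "interview_b.txt": "word " * 5 + "school " + "word " * 4,
}
for name, text in files.items():
    print(f"{name:16} {plot_line(text, 'school', width=10)}")
```

Each printed row plays the role of one plot in AntConc: the first file mentions the term at its start and end, the second only in the middle.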

SEARCH OPERATORS

16-If you’re familiar with Boolean searching, you know symbols can be used in a search to customize or focus your search results. AntConc uses a series of wildcard operators to allow you to further customize your search.

17-Go to Global Settings->Wildcard Settings to view or edit the full list of available wildcard operators. 18-Search for m?n and wom?n and compare your results.

Note on operators: The * operator is often used in Boolean searching and stands in for any number of characters. The ? operator is more specific because it stands in for exactly one character. For example, searching m*n will bring back results that include men, mean, melon, etc. Searching m?n will return men, man, and min. Similarly, wom?n will return woman and women.
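The difference between the two operators can be tried out with a short Python sketch that translates a wildcard pattern into a regular expression. It assumes * means zero or more characters and ? means exactly one character, and it covers only these two operators. The function names and the word list are invented for the example.

```python
import re

def wildcard_to_regex(pattern):
    """Translate * (zero or more characters) and ? (exactly one character)."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(r"\w*")
        elif ch == "?":
            parts.append(r"\w")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)

def search(pattern, words):
    """Return the words that match the whole wildcard pattern."""
    regex = wildcard_to_regex(pattern)
    return [w for w in words if regex.fullmatch(w)]

# Invented word list, for illustration only.
words = ["men", "man", "min", "mean", "melon", "woman", "women"]
print(search("m?n", words))    # ['men', 'man', 'min']
print(search("m*n", words))    # ['men', 'man', 'min', 'mean', 'melon']
print(search("wom?n", words))  # ['woman', 'women']
```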

CLUSTERS AND N-GRAMS

19-Click on the Clusters/N-Grams tab and search for sport. 20-AntConc ranks your search results and calculates each cluster's frequency and range (the number of files in which the cluster appears), while also displaying the text of the cluster.

21-The default Search Term Position places the search term on the left side of the cluster. Change the Search Term Position selection to On Right and click Start to re-run the search. How did your search results change? 22-Cluster Size determines the range for the number of terms AntConc searches and displays. How are your search results different when you change this range?
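The effect of Search Term Position and Cluster Size can be sketched in Python. This is a simplified illustration, not AntConc's implementation: it uses a single fixed cluster size rather than a range, counts frequency in one text only (so there is no range column), and ignores sentence boundaries. The function name clusters and the sample text are assumptions.

```python
import re
from collections import Counter

def clusters(text, term, size=2, position="left"):
    """Count word sequences of `size` words with `term` at the left or right edge."""
    tokens = re.findall(r"[A-Za-z']+", text.lower())
    counts = Counter()
    for i in range(len(tokens) - size + 1):
        gram = tokens[i:i + size]
        anchor = gram[0] if position == "left" else gram[-1]
        if anchor == term:
            counts[" ".join(gram)] += 1
    return counts.most_common()

# Invented sample text, for illustration only.
sample = "Sport teams met daily. I joined sport teams early. My favorite sport was hockey."
print(clusters(sample, "sport", size=2, position="left"))
print(clusters(sample, "sport", size=2, position="right"))
```

Switching position from "left" to "right" returns a different set of clusters for the same term, and raising size lengthens each cluster.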

EXPORTING IN ANTCONC

23-After you are satisfied with a search result, click File->Save Output to save the search result as a text file (*.txt). 24-Save the file as [SEARCH TERM]_cluster_search or another descriptive name. 25-Conduct another Cluster search for study, customize your results, and export as a text file. 26-Right-click on the exported text files and open in Notepad or Notepad++ to compare search results.

COLLOCATES AND WORD LISTS

27-As mentioned earlier in the tutorial, Clusters analyzes what words appear most frequently alongside your search term. 28-Collocates calculates which terms are statistically likely to appear near your search term. Freq calculates overall frequency, Freq(L) looks at frequency for terms to the left of your search term, and Freq(R) calculates frequency for terms to the right of your search term. Stat uses the Mutual Information (MI) and T-score calculations outlined in Stubbs (1995) to measure how strongly a term is associated with your search term.

29-Use family as your search term. 30-AntConc will display a pop-up window message about needing to generate a Word List. Click OK to have AntConc generate that list automatically. 31-What terms are statistically likely to appear in proximity to your search term? What happens when you change the Window Span (number of words to the right and left of your search term AntConc will include in the analysis)?
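To make the collocation columns more concrete, the Python sketch below computes Freq, Freq(L), Freq(R), and the two statistics for a tiny invented text. It uses the common textbook forms of the measures (MI as the base-2 logarithm of observed over expected co-occurrence, T-score as observed minus expected divided by the square root of observed). AntConc's own calculation may differ in details such as how the window span is handled, so treat the numbers as illustrative. The function name collocates is an assumption.

```python
import math
import re
from collections import Counter

def collocates(text, node, span=2):
    """Return (word, freq, freq_left, freq_right, MI, T-score) rows for `node`."""
    tokens = re.findall(r"[A-Za-z']+", text.lower())
    n_tokens = len(tokens)
    freq = Counter(tokens)
    left, right = Counter(), Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        # Collect words within `span` positions on each side of the hit.
        left.update(tokens[max(0, i - span):i])
        right.update(tokens[i + 1:i + 1 + span])
    rows = []
    for word in set(left) | set(right):
        observed = left[word] + right[word]
        expected = freq[node] * freq[word] / n_tokens
        mi = math.log2(observed / expected)
        t_score = (observed - expected) / math.sqrt(observed)
        rows.append((word, observed, left[word], right[word],
                     round(mi, 2), round(t_score, 2)))
    return sorted(rows, key=lambda r: r[4], reverse=True)

# Invented sample text, for illustration only.
sample = "My family moved. My family stayed. The town grew."
for row in collocates(sample, "family", span=1):
    print(row)
```

Increasing span widens the window on both sides of the search term, which is the same adjustment as changing the Window Span in AntConc.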

Reflection Questions:

How was your understanding of the text impacted by the analysis we did in AntConc? What questions do you still have?
What would be your next step in analyzing this text, using AntConc?
What types of research questions can you see textual analysis being useful to answer or respond to?
If you have already completed Voyant training: What do you notice is similar about Voyant Tools and AntConc as digital tools for textual analysis? What are the differences?
Which did you prefer working with? Why?