Software Review: NVivo as a Teaching Tool

For the past few weeks, DASIL has been publishing a series of blog posts comparing this year's two presidential candidates – Hillary Clinton and Donald Trump – using NVivo, a text-analysis software package. Given the increasing demand for qualitative data analysis in academic research and teaching, this post discusses the strengths and weaknesses of NVivo as a teaching tool for qualitative analysis.

Efficiency and reliability

Using software like NVivo in content analysis can add rigor to qualitative research. Running word searches or coding with NVivo produces more reliable results than doing so manually, since the software reduces human error. Furthermore, NVivo proves especially useful with large data sets – it would be extremely time-consuming to code hundreds of documents by hand with a highlighter pen.

Ease of use

NVivo is relatively simple to use. Users can import documents directly from word-processing packages in various formats, including Word documents and PDFs, and code them easily on screen via the point-and-click interface. Teachers and students can quickly become proficient with the software.

NVivo and social media

NVivo allows users to import Tweets, Facebook posts, and YouTube comments and incorporate them as part of their data. Given the rise of social media and the increased interest in studying its impact on society, this capability is likely to see heavier use.

Segmenting and identifying patterns 

NVivo allows users to create clusters of nodes and organize their data into categories and themes, making it easy for researchers to identify patterns. At the same time, the use of word clouds and cluster analysis also provides insight into prevailing themes and topics across data sets.


Limitations

While NVivo provides a reliable, general picture of the data, it is important to be aware of its limitations. It may be tempting to limit the data-analysis process to automatic word searches that yield a list of nodes and themes, but in-depth analysis and critical thinking skills are needed for meaningful results.

Although it is possible to search for particular words and derivations of those words, various ways in which ideas are expressed make it difficult to find all instances of a particular usage of words or ideas. Manual searches and evaluation of automatic word searches help to ensure that the data are, in fact, thoroughly examined.

Once individual themes in a data set are found, NVivo doesn't provide tools to map out how these themes relate to one another, making it difficult to visualize the inter-relationships of nodes and topics across data sets. Users need to think critically about how these themes emerge and relate to each other to gain a deeper understanding of the data.


Historical Data Requires Historical Finesse


Utilizing contemporary tools to analyze historical data provides a unique way to approach historical research, but it can be an arduous process, as modern tools may not be compatible with historical data. This summer, I have been working with Professor Sarah Purcell to create maps for her book on spectacle funerals of key figures during the U.S. Civil War and Reconstruction. Most commonly, the remains of these famous figures traveled from city to city by railroad, in some cases on a special funeral train, though they also traveled on rivers and, in one case, across the Atlantic Ocean. Nearly every historical figure discussed in the book has an accompanying map charting their extended funeral processional route. Using GIS technology, we are able to juxtapose census and election data with the geographic routes in highly analytical maps.

In order to layer election data onto the map for Col. Elmer Ellsworth (died 1861), I gathered county-level election data from the Interuniversity Consortium for Political and Social Research (ICPSR) and county-level census data from the National Historical Geographic Information System (NHGIS). I then needed to combine the ICPSR election data and the NHGIS census data in a joined spreadsheet before importing the data into ArcGIS software to link the data to its county location.

At first, I thought we could link the data using something called a "FIPS code." To standardize data and allow tables to be joined easily by location, the Federal Information Processing Standards assigned each county in the United States a unique five-digit code, commonly known as a FIPS code. The first two digits are the state code and the last three are the county code within the state. For example, the FIPS code for Poweshiek County, Iowa is 19157. This code is assigned to the current borders of Poweshiek County. Yet the data I was analyzing are from 1860, and Poweshiek County today covers a different land area than Poweshiek County in 1860. Thus, joining ICPSR and NHGIS data from the 19th century could not be completed using FIPS codes without introducing historical inaccuracy into the maps.
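The two-part structure of a FIPS code can be sketched in a few lines of Python (a hypothetical helper written for illustration, not part of any of the tools mentioned here):

```python
def split_fips(fips: str) -> tuple[str, str]:
    """Split a five-digit FIPS code into (state, county) parts."""
    if len(fips) != 5 or not fips.isdigit():
        raise ValueError(f"not a five-digit FIPS code: {fips!r}")
    # First two digits: state code; last three: county code within the state.
    return fips[:2], fips[2:]

# Poweshiek County, Iowa: state 19 (Iowa), county 157.
print(split_fips("19157"))  # → ('19', '157')
```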

In order to join two tables of data in any computer program, there must be a common column between them. From ICPSR, I had a table of county-level election data from 1860, and from NHGIS, a table of county-level 1860 census data. If I were joining data tables of current counties, the FIPS code would serve as my common column. Instead, I created a common column from the name of the county and state. Creating a unique name for each county ensures that the historic county data is correctly joined to the historic county borders; Poweshiek County's unique identifier would be "PoweshiekIowa." I quickly discovered that joining data by this concatenated column was not without error, so I went through each county individually to find discrepancies, many of which resulted from spelling inconsistencies between the two databases.
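A minimal Python sketch of this name-based join, under stated assumptions: the function names and the row-dictionary layout are invented for illustration (the actual work used spreadsheets and ArcGIS).

```python
def join_key(county: str, state: str) -> str:
    # Strip punctuation/whitespace and lowercase, so minor formatting
    # differences between the two databases don't break a match.
    clean = lambda s: "".join(ch for ch in s if ch.isalnum()).casefold()
    return clean(county) + clean(state)

def join_tables(election_rows, census_rows):
    """Join two lists of row-dicts on the concatenated county+state key."""
    census_by_key = {join_key(r["county"], r["state"]): r for r in census_rows}
    joined, unmatched = [], []
    for row in election_rows:
        key = join_key(row["county"], row["state"])
        if key in census_by_key:
            joined.append({**census_by_key[key], **row})
        else:
            unmatched.append(row)  # flag for manual spelling review
    return joined, unmatched
```

Rows left in `unmatched` are exactly the discrepancies that had to be hunted down by hand, county by county.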

After cleaning the data, the tables joined neatly. Using GIS, I then linked the combined election and census dataset to the geographic borders of the counties on the electronic map. I color coded the map by political party. The darker shade of each color shows where the political party won a majority of the votes in the county (greater than 50%), while the lighter shade shows where the party won a plurality of the votes. As you can see from the map's multiple colors, unlike modern American politics, the 1860 presidential election involved more than two prominent political parties, including Republicans, Northern and Southern Democrats, and the Constitutional Union Party. The political divide between North and South is clearly apparent along the Mason-Dixon Line between Pennsylvania and Maryland, foreshadowing the sectional conflict of the American Civil War nearly six months after the election.
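The majority/plurality shading rule can be sketched as a small classifier (hypothetical Python with invented vote counts; the actual shading was configured in ArcGIS):

```python
def shade(party_votes: dict[str, int]) -> tuple[str, str]:
    """Return the winning party and whether it won a majority (>50%)
    of the county's votes or merely a plurality."""
    total = sum(party_votes.values())
    winner = max(party_votes, key=party_votes.get)
    kind = "majority" if party_votes[winner] / total > 0.5 else "plurality"
    return winner, kind

print(shade({"Republican": 600, "N. Democrat": 300, "Const. Union": 100}))
# → ('Republican', 'majority')  — darker shade
print(shade({"Republican": 400, "N. Democrat": 350, "S. Democrat": 250}))
# → ('Republican', 'plurality')  — lighter shade
```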

Mapping historical data is certainly a different process than mapping current data and can prove to be more time-consuming and complex. Though current tools (like FIPS codes) can help standardize mapping techniques, they may not be applicable in historical data settings and current tools may need to be discarded or updated. Historical FIPS codes, anyone?


How Traditional Introductory Statistics Textbooks Fail to Serve Social Science Undergraduates

No weighting variable: the estimate is that about 50% of the population knows that the Jewish Sabbath starts on Friday.

Appropriately weighted data: the estimate changes by about 5 percentage points, suggesting that only about 45% of the population knows the correct start time.
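The gap between the two estimates can be illustrated with a toy sketch in Python (respondent data invented for illustration; real survey weights ship with the dataset): each respondent's answer counts in proportion to their survey weight.

```python
def unweighted_share(responses):
    """Fraction answering correctly, treating every respondent equally."""
    return sum(r["correct"] for r in responses) / len(responses)

def weighted_share(responses):
    """Fraction answering correctly, each respondent counted in
    proportion to their survey weight."""
    num = sum(r["weight"] * r["correct"] for r in responses)
    return num / sum(r["weight"] for r in responses)

# Toy data: the correct answers come from over-represented (hence
# down-weighted) respondents, so weighting lowers the estimate.
responses = [{"correct": 1, "weight": 0.9},
             {"correct": 0, "weight": 1.1}]
print(unweighted_share(responses))  # → 0.5
print(weighted_share(responses))    # → 0.45
```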

Full disclosure: I approach this topic simultaneously from the perspective of a social scientist and as the instructor of a traditional introductory statistics class for over twenty years. I am, thus, myself part of the problem. While I am mainly following the dictates of some of the most popular textbooks, it is fully within my power to diverge from the book. When I do not do so, it is really my own fault – a sheep following the sheep dogs.

Our worst failure as statistics teachers is to teach as if all or most of the data that our students will engage with in their future careers are from simple random samples. Continue reading →


Access to Research Includes Access to Data!

In February 2013, the Director of the White House Office of Science and Technology Policy (OSTP) issued a memorandum to all agency and department heads entitled, “Increasing Access to the Results of Federally Funded Scientific Research”.

The memo directed federal agencies that award more than $100 million in research grants to develop plans for increasing public access to peer-reviewed scientific publications. It also required researchers to better account for and manage the digital data resulting from their federally funded research. (At the same time, the OSTP directive acknowledges that access to some data needs to be controlled to protect human privacy, the confidentiality of business secrets, intellectual property interests, and other concerns.)

The OSTP recognizes that research data are valuable and need to be preserved. Increased public access to data – along with better access to the published literature – is fundamental to research, and permits

  • more thorough critiques of theories and interpretations, including replication of research results,
  • scholarly innovation that builds on past work, and
  • practical application of scholarly discoveries.

Continue reading →
