2016 RNC vs. DNC Convention: Night and Day

Using NVivo, a text-analysis program, DASIL compared Clinton’s and Trump’s convention speeches to demonstrate the stark contrast between the two presidential candidates. The previous post briefly examined key themes in each candidate’s address using word clouds. This analysis expands on that post with a more in-depth comparison of the two candidates’ approaches to the following themes:

Immigration:

Table demonstrating the frequency of mention of the word “immigration” or “immigrant(s)” by count

Table demonstrating the frequency of mention of the word “immigration” or “immigrant(s)” as percentage of total number of words in each speech.

In Donald Trump’s speech, 10 of the 13 mentions of “immigration” or “immigrant(s)” are accompanied by words with negative connotations, such as “illegal”, “radical”, “dangerous”, or “uncontrolled”. Trump presented immigration as a cause of poverty, violence, drug problems, unemployment, and terrorism.

In contrast, Clinton presented herself as an advocate for comprehensive immigration integration, which shows in her convention speech: in 2 of her 4 mentions, “immigration” or “immigrant(s)” is accompanied by positive words and phrases. She described immigrants as “contributing to our economy” and “hardworking”.
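The two measures in the tables above, a raw count and a share of all words in the speech, are easy to reproduce outside NVivo. The following is a minimal sketch, not NVivo's own algorithm: the function name `term_frequency`, the simple tokenizer, and the sample sentence are all assumptions made for illustration, and the sample is not taken from either speech.

```python
import re


def term_frequency(text, pattern):
    """Return (count, percent of all words) for words matching `pattern`.

    `pattern` is a regular expression that must match a whole word.
    """
    # Very simple tokenizer (an assumption): runs of ASCII letters, lowercased.
    words = re.findall(r"[a-z]+", text.lower())
    count = sum(1 for w in words if re.fullmatch(pattern, w))
    percent = 100 * count / len(words) if words else 0.0
    return count, percent


# Invented sample text, used only to show the calculation.
sample = ("Immigrants built this country. "
          "Immigration reform helps every immigrant family.")

# Matches "immigration", "immigrant", and "immigrants".
count, percent = term_frequency(sample, r"immigra(tion|nts?)")
print(count, round(percent, 1))
```

Reporting the percentage alongside the count matters because the two speeches differ in length, so raw counts alone are not directly comparable.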

Jobs:

Table demonstrating the frequency of mention of the word “job(s)” by count

Table demonstrating the frequency of mention of the word “job(s)” as percentage of total number of words in each speech.

Given the long-standing lag in job growth, outlining a vision for job creation and income gains is among the top priorities on both candidates’ agendas. As mentioned in a previous post, Trump held a pessimistic outlook on the American economy: 4 of his 13 mentions of “job(s)” are surrounded by words with negative connotations. The Republican nominee talked about the prospect of job and wage reductions under a Clinton administration and considered regulation “one of the greatest job-killers of them all.”

Hillary Clinton, on the other hand, chose to deliver a more hopeful view of the matter. She highlighted the prospect of good-paying jobs and the effectiveness of her policy in job creation. In none of the 18 times she touched on the subject of employment did she make a negative remark on the issue.

Patriotism:

Table demonstrating the frequency of mention of the word "America(ns)" by count and as percentage of total word count

Both presidential candidates frequently mentioned “America(ns)” in their speeches, and the word clouds visualize how often each of them used these words. In fact, Trump mentioned “America(ns)” almost three times as often as Clinton did, both in terms of count (number of times “America(ns)” is mentioned) and percentage (number of mentions as a percentage of total word count).

Even though both Trump and Clinton embraced patriotism in their convention speeches, they did so in two strikingly different ways. The Republican Party and its presidential nominee portrayed America as a country under attack by all things foreign; the country is in a dark place and Trump is the one to “make America great again.” In contrast to Trump’s nationalism, Clinton talked about America in optimistic tones, emphasizing the family values (faith, community, and togetherness) that middle-class Americans adhere to.

Portraits of Donald Trump and Hillary Clinton

2016 U.S. Presidential Race: Do Convention Speeches Predict the Winner?

After the Republican and Democratic Conventions in July, the 2016 U.S. presidential race is on between Democrat Hillary Clinton, who is making history as the first female presidential nominee from one of the two major political parties, and Republican Donald Trump, the contentious and provocative New York billionaire. The race for the White House this year is undoubtedly one of the most memorable events in the history of American politics, partly because of the stark contrast between the two candidates, from their political and economic agendas to their appeal to voters. Using NVivo, a text-analysis program, DASIL compared Clinton’s and Trump’s acceptance speeches at their respective party conventions to further demonstrate these differences.

Main theme and important issues

Word cloud of 30 most frequent words in Donald Trump's speech

30 most frequent words in Donald Trump’s speech

Looking at the 30 most frequent words in Trump’s speech, we can see that the main issues mentioned by the Republican candidate are immigration, national security, and public safety. The most common words in the speech include “violence”, “immigration”, “protect”, “border”, “laws”, and “jobs”, highlighting a dark portrait of the current state of America. Trump strongly emphasized that much must be changed in order to fix these issues, and that he, rather than a Democratic leader, will change this grim outcome by restoring law and order.

Word cloud of 30 most frequent words in Hillary Clinton's speech

30 most frequent words in Hillary Clinton’s speech

Clinton, on the other hand, gave a more optimistic and upbeat speech. While acknowledging the current issues facing America and the work that needs to be done, Clinton also highlighted the strengths that the nation brings to overcome these challenges. Some of the most frequent words in her speech are “family”, “people”, “works”, “jobs” and “together”, hinting at some issues that the Democratic presidential candidate wants to tackle. At the same time, these words center around the notion of inclusivity and staying united, which offers a stark contrast to Trump’s anti-immigration stance, isolationism, and Americanism.
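A word cloud like the ones described in this section starts from a ranked list of the most frequent words. Here is a rough sketch of how such a list could be built. It is not NVivo's word-frequency query: the stop-word list, the minimum word length, and the sample sentence are assumptions chosen only to keep the example small.

```python
import re
from collections import Counter

# A tiny, hypothetical stop-word list; a real analysis would use a longer one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "that"}


def top_words(text, n=30, stop_words=STOP_WORDS):
    """Return the n most frequent words as (word, count) pairs."""
    words = re.findall(r"[a-z]+", text.lower())
    # Drop stop words and very short tokens (the length cutoff is an assumption).
    counts = Counter(w for w in words if w not in stop_words and len(w) > 2)
    return counts.most_common(n)


# Invented sample text, not a quotation from either speech.
sample = ("Jobs and families. Good jobs for working families, "
          "and jobs for the people.")
top_two = top_words(sample, n=2)
print(top_two)
```

The choice of stop words strongly shapes the result, so two tools can produce different top-30 lists from the same speech.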

“We” versus “I”

Using NVivo, DASIL also compared how often the two presidential candidates used “we” words (such as we, our, ours, and ourselves) versus “I” words (such as I, me, my, mine, and myself) in their convention speeches.

Table showing Trump and Clinton's "we" and "I" words

For every time Clinton said “I”, she said “we” 1.83 times, while her Republican opponent said “we” only 1.50 times for each “I”. By this measure, Trump delivered a more self-focused convention speech than his Democratic rival. The difference in use of “we” versus “I” words between the two candidates reveals much about their speaking styles, personalities, and even chances of winning the election. A Bloomberg Politics study of convention speeches dating back to 1976 finds that the public tends to favor candidates who use more “we” words relative to “I” words. In nine out of 10 elections since 1976, the general election winner achieved a higher “we”-to-“I” score than the opponent [Bloomberg].
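The “we”-to-“I” ratio is simply the number of “we” words divided by the number of “I” words. The sketch below uses only the example words listed in this post, so it may not match the full word lists used by DASIL or Bloomberg. The function name `we_to_i_ratio` and the sample sentence are illustrative assumptions.

```python
import re

# Word lists taken from the examples in the post; the full lists may be longer.
WE_WORDS = {"we", "our", "ours", "ourselves"}
I_WORDS = {"i", "me", "my", "mine", "myself"}


def we_to_i_ratio(text):
    """Return (we_count, i_count, ratio); ratio is None if there are no "I" words."""
    # Splitting on non-letters means "we're" counts as "we" and "I'm" as "i".
    words = re.findall(r"[a-z]+", text.lower())
    we_count = sum(w in WE_WORDS for w in words)
    i_count = sum(w in I_WORDS for w in words)
    ratio = we_count / i_count if i_count else None
    return we_count, i_count, ratio


# Invented sample text, not a quotation from either speech.
sample = ("I believe we can do this. We will protect our jobs, "
          "and my plan is our plan.")
result = we_to_i_ratio(sample)
print(result)
```

A ratio above 1 means the speaker used more “we” words than “I” words, which is the case for both candidates here.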

Bar graph showing the number of "we" words per "me" word for presidential nominees since 1976

Bloomberg points out that “we” words inspire confidence in others and also reflect the speaker’s self-confidence, which is a key quality in good leadership. Clinton’s “we”-to-“I” victory over Trump in her convention speech suggests that she’s in the lead position to win in November. Trump’s speaking style is more personal; furthermore, his I-word usage reveals feelings of insecurity, perhaps due to a lack of political background and experience on political issues.

Meet Yujing Cao, DASIL’s new data scientist!

This year, DASIL welcomes a new member of our staff, Yujing Cao, who will be serving as the new data scientist. In her position at DASIL, Yujing will bring her expertise in data analysis and visualization to further expand DASIL’s capability to help students and faculty members integrate data analysis into research and classroom work.  In today’s big data era, enormous quantities of data are available, and Yujing will help Grinnell students and faculty explore them.

Yujing Cao is excited about joining DASIL and bringing a new level of data analysis to faculty research and teaching!

Originally from China, Yujing earned her bachelor’s degree in Statistics from Anhui University. Her passion for data science led her to a PhD program in Statistics at the University of Texas at Dallas, where she obtained her degree in 2016. Her research was on graphical modeling of biological pathways in genomic studies. She is also interested in network analysis, machine learning, and trying different tools for data visualization. In her spare time, she enjoys reading, hiking, and exercising.

Yujing was excited about the position at Grinnell because of her strong interests in teaching and in data visualization. As she puts it:

“I wanted to look for a position which provides opportunities to create interesting data visualizations along with other data analysis work. I love using graphs to tell stories behind different data sets.

Working environment is another factor that led to my decision to come to Grinnell. I strongly resonate with the core values of a liberal arts education. At Grinnell College, I can work in an academic environment helping faculty and students while promoting the use of data in research and learning.”

Yujing also discusses a number of skills crucial to success in the field of data science. Data science is an interdisciplinary field requiring knowledge from mathematics, statistics, data mining, and machine learning. Statistical knowledge and knowledge from other fields can help form good questions and seek direction, while programming skills (e.g. joining data sets and visualizing data) are needed for implementing our ideas. To be a good data scientist, you should possess strong programming and analytical skills.

According to Yujing, “One of the most important qualities for any data scientist is curiosity. Curiosity encourages us to dig in and make interesting discoveries about data. Also, good communication skills can make a great data scientist. You should be able to clearly articulate your results and the implications of your findings to others, including other data scientists and people who don’t share a similar background.”

Her tip for students interested in a career in data science is to keep an open mind, learn from different disciplines, and sharpen their programming skills. In addition, a student who is interested in becoming a data scientist should take advantage of any opportunity to work on hands-on projects that use real data.

Faculty or students interested in meeting with Yujing should drop by DASIL (ARH 130) or her office (Goodnow 103), or contact her via email at caoyujin@grinnell.edu for an appointment.

A Tool for Visualizing Regression Models

Will sales of a good increase when its price goes down? Does the life expectancy of a country have anything to do with its GDP? To help answer questions like these about the relationship between different measures, researchers and analysts often use regression techniques.

Linear regression is a widely used tool for quantifying the relationship between two or more quantitative variables. The underlying premise is simple: it is no more complicated than drawing a straight line through a scatterplot! This simple tool is nevertheless used for everything from market forecasting to economic models. Because regression is so pervasive in analytical fields, it is important to develop an intuition for regression models and what they actually do. For this, I have developed a visualization tool that allows you to explore the way regressions work.

You can import your own dataset or choose from a selection of others, but the default one is information on a selection of movies. Suppose you want to know the strategy for making the most money from a film. In regression terminology, you are asking which variables (factors) might be good predictors of a film’s box office gross.

The response variable is the measure you want to predict, which in this situation will be the box office gross (BoxOfficeGross). The attribute that you think might be a good predictor is the explanatory variable. The budget of the film might be a good explanatory variable to predict the revenue a film might earn, for example. Let’s change the explanatory variable of interest to Budget to explore this relationship. Do you see a clear pattern emerge from the scatterplot? Can you find a better predictor of BoxOfficeGross?

If you want to control for the effects of other pesky variables without having to worry about them directly, you can include them in your model as control variables.

Below the scatterplot are two important measures that are used in evaluating regression models: the p-value and the R² value. The p-value tells us how likely it would be to see a relationship at least as strong as the one in our data purely by chance, if the variables were in fact unrelated. In the context of a regression model, it suggests whether the specific combination of explanatory and control variables really does seem to affect the response variable in some way: a lower p-value means that there seems to be something actually going on in the data, as opposed to the points being just scattered randomly. The R² value, on the other hand, tells us what proportion of the variability in the response (predicted) variable is explained by the explanatory (predictor) variable; in other words, how good the model is. If a model has a low R² value and is incredibly bad at predicting our response, it might not be such a good model after all.
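To make the two measures concrete, here is a small sketch that computes R² for a one-variable linear regression and estimates a p-value by shuffling the data. The shuffling approach (a permutation test) is a stand-in chosen because it needs only the Python standard library and mirrors the "just by chance" idea directly; it is not necessarily how the visualization tool computes its p-value. The budget and gross figures are invented for illustration.

```python
import random


def r_squared(x, y):
    """R² of a simple linear regression of y on x (squared correlation)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def permutation_p_value(x, y, trials=2000, seed=0):
    """Share of random re-pairings of x and y that fit at least as well."""
    rng = random.Random(seed)
    observed = r_squared(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)  # break any real link between x and y
        if r_squared(x, shuffled) >= observed:
            hits += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (hits + 1) / (trials + 1)


# Hypothetical budgets and box office grosses (in millions), made up for this sketch.
budget = [10, 20, 30, 40, 50, 60, 70, 80]
gross = [25, 38, 70, 65, 110, 105, 150, 160]

r2 = r_squared(budget, gross)
p = permutation_p_value(budget, gross)
print(f"R^2 = {r2:.3f}, permutation p-value = {p:.4f}")
```

With data this tidy, almost no random re-pairing fits as well as the real one, so the p-value is small and R² is high at the same time.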

score vs runtime plot

If you want to predict a movie’s RottenTomatoesScore from its RunTime, for example, the incredibly small p-value might tempt you to conclude that, yes, longer movies do get better reviews! However, if you look at the scatterplot, you might get the feeling that something’s not right. The R² value tells us this other side of the story: though RunTime does appear to be correlated with RottenTomatoesScore, the strength of that relationship is just too weak for us to do anything with!
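The combination of a tiny p-value and a low R² is typical of a weak relationship measured on a large sample. The sketch below simulates that situation with synthetic data. The run times and scores are randomly generated rather than taken from the tool's movie dataset, and the slope and noise level are arbitrary assumptions. The p-value uses a normal approximation to the usual t-test, which is reasonable only for large samples like this one.

```python
import math
import random
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Synthetic data: a faint upward trend buried in a lot of noise.
rng = random.Random(42)
n = 2000
run_time = [rng.gauss(110, 20) for _ in range(n)]
score = [60 + 0.15 * (t - 110) + rng.gauss(0, 20) for t in run_time]

r = pearson_r(run_time, score)
r2 = r * r

# Test statistic for "no linear relationship"; for large n it is close to normal.
t_stat = r * math.sqrt((n - 2) / (1 - r2))
p = 2 * (1 - NormalDist().cdf(abs(t_stat)))

print(f"R^2 = {r2:.3f}, approximate p-value = {p:.2g}")
```

The trend is real, which is why the p-value is small, but it explains only a sliver of the variation in scores, which is why R² stays low.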

Play around with the default dataset provided, or use your own dataset by going to the Change Dataset tab at the top of the page. This visualization tool can be used to develop an intuition for regression analysis, to get a feel for a new dataset, or even in classrooms for a visual introduction to linear regression techniques.

Visualizing Mass Communications and State Institutions in Wartime China (1937-45)

In China, the study of history has always gone hand-in-hand with the study of geography. In the study of China’s modern history, however, the focus has shifted toward large-scale processes, such as revolution, and large-scale sociological transformations, such as changing class relations. More recently, some historians have started to bring geography back in. Pathbreaking endeavors such as the China Historical GIS project and ChinaMap, built on Harvard University’s WorldMap platform, allow researchers to visualize the transformation of China across space and time. The result has been a new understanding of China and Chinese history highlighting the spatial distribution of ethnic and linguistic diversity, economic development, elite networks, and state institutions. One exciting result of this new understanding is that it allows students and researchers alike to visualize large-scale processes across time periods, which can in turn lead to new questions about how different places might have experienced the same era or event. Through the use of spatial approaches, we are challenged to rethink the applicability of national historical narratives to local human landscapes.

As a teacher and researcher of East Asian history, I focus much of my work on how media, institutions, and person-to-person networks have connected the modern Chinese state to populations both inside and outside of China. Working in tandem with DASIL, I have begun to build and visualize datasets which describe how the “connective tissue” of state-building looked during the period of China’s War of Resistance to Japan (1937-1945), a period of intense destruction and dislocation which some historians have also described as a key period of modernization. This data is drawn from two editions of The China Handbook: a publication of the Chinese Ministry of Information released in 1943 and again in 1946. I discovered this publication quite by happenstance while searching the Grinnell College Library collections for local gazetteer data related to the period of China’s Republican Era (1911-1949). The value of The China Handbook is that it provides comprehensive provincial and urban data for a number of indicators of state development; here we (DASIL’s outstanding post-bac fellow, Bonnie Brooks ’15, and I) have focused on data concerning communications, education, and health care. To be fair, and as admitted by The China Handbook’s original editor, Hollington K. Tong, this data is not exhaustive, nor is it necessarily reliable given the rapidity of changes brought about by war and the resulting partition of China into competing political zones. It does, however, represent at least a starting point for visualizing what China’s wartime states looked like “on the ground,” viewed through the lens of communications and other institutional infrastructure.

Below the level of national boundaries, modern China is divided into numerous separate administrative units known as provinces. However, the number of provinces has changed with time and successive governments, which poses a challenge for those seeking to visualize data at the province level for eras during which the number of these units was larger than it is today—as was the case during the latter half of the Republican Era, which witnessed a proliferation of efforts to tame China’s restive and geopolitically fragile borders through the process of province-building. A key part of Bonnie’s contribution, then—the results of which will hopefully be used and refined by other researchers working at the intersection of geographic information systems (GIS) and modern Chinese history—was the creation of new shapefiles corresponding to each province that existed during the 1937-1945 period. The resulting maps are thus entirely new creations, and will hopefully serve to help bridge the current gap which lies between geospatial research on imperial China and research on contemporary China after Mao.  The shapefiles are available for download in DASIL’s Downloadable Data section.

For the map:

    • The Contents button will display all layers. Uncheck the box next to a layer name to hide that layer. To view the legend, click on the “Show Legend” icon below the layer name.
    • To examine other variables, find the “Change Style” button below the name of the layer you wish to view, then select the desired variable from the “Choose an attribute to show” drop-down menu. You may alter the map with colors, symbols, or size. You may also alter variables (e.g. normalize variables by population).
    • Click on an individual Chinese province to see available data.
    • The shapefiles featured in the map are available for download on the DASIL website. Click here for the download.

 
