2016 Presidential Election Contributions Analysis by Kenneth Curtis

Introduction

This data comes from the Federal Election Commission and includes individual contributions, refunds to individuals and transfers made from authorized committees. To see the full list of variables and other info about this particular dataset, click this link.

For this analysis, I will be using data for all states that made contributions to the 2016 Presidential Campaign to understand the big picture.

The Big Picture

To start, let’s look at the contribution data from all states in America. The dataset is named df_all.

## 'data.frame':    7440252 obs. of  18 variables:
##  $ cmte_id          : chr  "C00458844" "C00458844" "C00458844" "C00458844" ...
##  $ cand_id          : Factor w/ 25 levels "P00003392","P20002671",..: 11 11 11 11 11 11 11 11 11 11 ...
##  $ cand_nm          : Factor w/ 25 levels "Bush, Jeb","Carson, Benjamin S.",..: 19 19 19 19 19 19 19 19 19 19 ...
##  $ contbr_nm        : Factor w/ 1393128 levels "'CALL, ALLAN",..: 112879 318178 213391 1077550 1077550 789174 318874 213391 1027867 537615 ...
##  $ contbr_city      : Factor w/ 26456 levels "","'CALLAHAN",..: 24977 1044 6102 780 780 6102 780 6102 780 780 ...
##  $ contbr_st        : Factor w/ 116 levels "20","30","AA",..: 1 2 5 5 5 5 5 5 5 5 ...
##  $ contbr_zip       : Factor w/ 871835 levels "",".","`1136",..: 871767 871771 92745 92485 92485 92710 92704 92745 152 92481 ...
##  $ contbr_employer  : Factor w/ 399182 levels "","''I LIKE COMICS''",..: 334165 234488 101162 372773 372773 98147 102964 101162 144561 292072 ...
##  $ contbr_occupation: Factor w/ 127674 levels "","''''FOR HIRE'''' DRIVER",..: 77537 86545 120534 82249 82249 46214 114905 120534 8157 96806 ...
##  $ contb_receipt_amt: num  175 25 100 200 100 ...
##  $ contb_receipt_dt : Factor w/ 849 levels "01-APR-15","01-APR-16",..: 396 424 528 261 208 179 703 146 478 800 ...
##  $ receipt_desc     : Factor w/ 203 levels ""," SEE REATTRIBUTION",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ memo_cd          : Factor w/ 2 levels "","X": 1 1 1 1 1 1 1 1 1 1 ...
##  $ memo_text        : Factor w/ 1721 levels ""," SEE REATTRIBUTION",..: 1 992 1 1 1 1 1 1 992 1 ...
##  $ form_tp          : Factor w/ 3 levels "SA17A","SA18",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ file_num         : int  1082559 1082559 1056862 1082559 1082559 1082559 1029436 1056862 1082559 1047126 ...
##  $ tran_id          : Factor w/ 7289531 levels "A0000FD2A304E432AAD5",..: 3664753 3665478 3573467 3609940 3607294 3606756 4128523 3643619 3665564 4178277 ...
##  $ election_tp      : Factor w/ 11 levels "","G2015","G2016",..: 8 8 8 8 8 8 8 8 8 8 ...
##    cmte_id               cand_id       
##  Length:7440252     P00003392:3506081  
##  Class :character   P60007168:2063097  
##  Mode  :character   P80001571: 782711  
##                     P60006111: 557581  
##                     P60005915: 248227  
##                     P60006723: 104814  
##                     (Other)  : 177741  
##                       cand_nm                   contbr_nm      
##  Clinton, Hillary Rodham  :3506081   ACTBLUE         :  15244  
##  Sanders, Bernard         :2063097   TRUITT, ROBERTA :   1667  
##  Trump, Donald J.         : 782711   BODNICK, KATIE  :   1326  
##  Cruz, Rafael Edward 'Ted': 557581   AMISIAL, WILFRID:   1098  
##  Carson, Benjamin S.      : 248227   PURCELL, LARRY  :    723  
##  Rubio, Marco             : 104814   SMITH, DAVID    :    689  
##  (Other)                  : 177741   (Other)         :7419505  
##         contbr_city        contbr_st           contbr_zip     
##  NEW YORK     : 207053   CA     :1304346   021440031:  13559  
##  LOS ANGELES  : 102730   NY     : 649460   00000    :   8486  
##  SAN FRANCISCO:  90946   TX     : 548396   99999    :   4144  
##  WASHINGTON   :  89894   FL     : 426057            :   1733  
##  BROOKLYN     :  87472   MA     : 295667   021443132:   1685  
##  (Other)      :6862154   WA     : 292317   974043146:   1667  
##  NA's         :      3   (Other):3924009   (Other)  :7408978  
##       contbr_employer                contbr_occupation  
##  N/A          : 987952   RETIRED              :1662981  
##  RETIRED      : 931955   NOT EMPLOYED         : 625753  
##  SELF-EMPLOYED: 535215   INFORMATION REQUESTED: 245354  
##  NONE         : 453134   ATTORNEY             : 197728  
##  NOT EMPLOYED : 265110   TEACHER              : 139558  
##  (Other)      :4261917   (Other)              :4567753  
##  NA's         :   4969   NA's                 :   1125  
##  contb_receipt_amt   contb_receipt_dt  
##  Min.   :  -93308   29-FEB-16:  69429  
##  1st Qu.:      15   31-MAR-16:  67539  
##  Median :      28   12-JUL-16:  64383  
##  Mean   :     126   11-JUL-16:  57853  
##  3rd Qu.:      94   31-OCT-16:  55354  
##  Max.   :12777706   31-MAY-16:  52813  
##                     (Other)  :7072881  
##                      receipt_desc     memo_cd    
##                            :7337197    :6081375  
##  Refund                    :  54429   X:1358877  
##  REDESIGNATION FROM PRIMARY:  10224              
##  REDESIGNATION TO GENERAL  :  10219              
##  REATTRIBUTION TO SPOUSE   :   4515              
##  REATTRIBUTION FROM SPOUSE :   4466              
##  (Other)                   :  19202              
##                                                          memo_text      
##                                                               :4708273  
##  * EARMARKED CONTRIBUTION: SEE BELOW                          :1987691  
##  * HILLARY VICTORY FUND                                       : 653108  
##  NOTE: ABOVE CONTRIBUTION EARMARKED THROUGH THIS ORGANIZATION.:  15272  
##  REDESIGNATION FROM PRIMARY                                   :  10224  
##  REDESIGNATION TO GENERAL                                     :  10219  
##  (Other)                                                      :  55465  
##   form_tp           file_num                       tran_id       
##  SA17A:6094210   Min.   :1003942   SB28A.1243          :      6  
##  SA18 :1291613   1st Qu.:1077916   A20961603DA6C49AFA23:      5  
##  SB28A:  54429   Median :1098663   A46D29B1678904380A70:      5  
##                  Mean   :1101464   A587D73F5F36B4479B05:      5  
##                  3rd Qu.:1133832   A7A5F60F0D3114D58B3E:      5  
##                  Max.   :1146285   A8C448FEBB0A046ECB9E:      5  
##                                    (Other)             :7440221  
##   election_tp     
##  P2016  :4789297  
##  G2016  :2634954  
##         :  14022  
##  O2016  :   1891  
##  P2020  :     76  
##  P2012  :      3  
##  (Other):      9

By looking at the structure and summary of df_all, we can already make a few observations. The summary of the cand_nm (candidate name) column shows the six candidates that received the most contributions, with Hillary Clinton first at over 3.5 million contributions. In the contbr_nm (contributor name) column, ActBlue made the largest number of contributions to the campaign.

ActBlue is a nonprofit, building fundraising technology for the left. Our mission is to democratize power and help small-dollar donors make their voices heard in a real way. Together, we’ve raised 2,397,348,889 dollars for Democrats and progressive causes in just 14 years. We’ve built more than just a fundraising platform. We’ve created the kind of grassroots power that can take on, and beat back, the power of corporate spending and secretive super PACs.

Later in the analysis, I will examine the relationship between ActBlue and the candidates.

Other notable observations come from the contbr_city (contributor city), contbr_st (contributor state), and contbr_occupation (contributor occupation) columns. New York is the top city, California is the top state, and the top occupation among contributors is “Retired”.
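For readers who want to reproduce a structure and summary listing like the one above, here is a minimal sketch. The file and its three rows are hypothetical stand-ins for the FEC download, and only a handful of the 18 columns are included; the column names are taken from the listing.

```r
# Hypothetical stand-in for the FEC file: a tiny CSV written to a temp file.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "cmte_id,cand_nm,contbr_nm,contbr_st,contb_receipt_amt",
  "C001,\"Clinton, Hillary Rodham\",\"DOE, JANE\",CA,100",
  "C002,\"Sanders, Bernard\",ACTBLUE,MA,27",
  "C002,\"Sanders, Bernard\",\"ROE, RICHARD\",NY,15"
), csv_path)

# Keep the committee id as character and read the remaining text columns
# as factors, which matches the column types shown in the listing.
df_small <- read.csv(csv_path, stringsAsFactors = TRUE,
                     colClasses = c(cmte_id = "character"))

str(df_small)      # column types and first values
summary(df_small)  # per-column counts and quartiles
```

Reading text columns as factors is what makes summary() print the most frequent levels of each column, which is where the "top city" and "top state" observations come from.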

Next we will look at the distribution of the columns. We’ll start by looking at which candidate appears in the dataset the most.

## Warning: Ignoring unknown parameters: binwidth, bins, pad

With this plot we can easily see that Hillary Clinton and Bernard Sanders are the candidates that received the largest number of contributions, with Donald Trump and Ted Cruz coming in third and fourth.

## Selecting by n
## # A tibble: 10 x 2
##    contbr_nm               n
##    <fct>               <int>
##  1 ACTBLUE             15244
##  2 AMISIAL, WILFRID     1098
##  3 BODNICK, KATIE       1326
##  4 JOHNSON, DAVID        635
##  5 PURCELL, LARRY        723
##  6 SAUNDERS, ELIZABETH   676
##  7 SMITH, DAVID          689
##  8 TRUITT, ROBERTA      1667
##  9 TYLER, CAMM           669
## 10 WILLIAMS, JAMES       683

This dataset has over seven million contribution records and roughly 1.4 million distinct contributor names; the table above lists the ten names that appear most often. Later in the analysis, I will create plots that show who contributed the most money to the campaigns.
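A frequency table like this one can be sketched in base R as follows. The six names below are a made-up sample, not the real data. Note that dplyr's top_n(), which prints the "Selecting by n" message, keeps rows in their existing order, which would explain why a top-ten table is not sorted by n; the sketch sorts explicitly instead.

```r
# Toy sample of contributor names (hypothetical rows, real-looking labels).
contbr_nm <- factor(c("ACTBLUE", "SMITH, DAVID", "ACTBLUE",
                      "TRUITT, ROBERTA", "ACTBLUE", "SMITH, DAVID"))

# Count how often each name appears, then sort from most to least frequent.
name_counts <- as.data.frame(table(contbr_nm), responseName = "n")
name_counts <- name_counts[order(-name_counts$n), ]

# Keep at most the ten most frequent names.
top_names <- head(name_counts, 10)
top_names
```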

## Selecting by n

This graph shows the distribution of the top ten cities that contributed to the campaign.

For the contributor states column, I will show all of the states in one graph, ordered from the most contributions to the fewest. In this dataset, the column contains abbreviations that are not U.S. states. I removed those abbreviations and focused on the states in America (plus DC).

# packages used below: dplyr (%>%, arrange), ggplot2 (plots), scales (comma labels)
library(dplyr)
library(ggplot2)
library(scales)

# table of state distribution
state_contrb_num <- as.data.frame(table(df_all$contbr_st))

# change column names of table
colnames(state_contrb_num) <- c("state", "n")

# arrange in descending order
state_contrb_num <- state_contrb_num %>%
  arrange(desc(n))

# keep only the 50 states and DC; drop every other abbreviation
state_contrb_num_2 <- subset(state_contrb_num, state %in% c("AL", "AK", "AZ", 
          "AR", "CA", "CO", "CT", "DC", "DE", "FL", "GA", "HI", "ID", 
          "IL", "IN", "IA", "KS", "KY", "LA", "ME", "MD", "MA", "MI", 
          "MN", "MS", "MO", "MT", "NE", "NV", "NH", "NJ", "NM", "NY", 
          "NC", "ND", "OH", "OK", "OR", "PA", "RI", "SC", "SD", "TN", 
          "TX", "UT", "VT", "VA", "WA", "WV", "WI", "WY"))

# plot the distribution, bars ordered from most to fewest contributions
ggplot(state_contrb_num_2, aes(reorder(state, -n), n)) +
  geom_bar(stat = "identity", color = "black", fill = "lightblue") +
  scale_y_continuous(expand = c(0,0), labels = comma, breaks = seq(0, 1500000, 50000)) +
  xlab("state") +
  ylab("count")
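As an alternative to typing the abbreviations by hand, base R ships a built-in constant, state.abb, with the 50 state abbreviations. The sketch below is one possible variant of the filtering step, not the code used for the plot. The CA and NY counts come from the summary earlier; the other rows are invented for illustration.

```r
# The 50 states from R's built-in constant, plus DC (not included in state.abb).
valid_states <- c(state.abb, "DC")

# Toy version of the state count table; "AA", "20" and "ZZ" stand in for
# the non-state codes found in the real column.
state_counts <- data.frame(
  state = c("CA", "NY", "AA", "20", "DC", "ZZ"),
  n     = c(1304346, 649460, 12, 1, 500, 3)
)

# Keep only rows whose code is a state or DC.
state_counts_clean <- subset(state_counts, state %in% valid_states)
state_counts_clean
```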

California gave the most contributions for the 2016 election, with New York and Texas coming in behind.

Next we’ll see the contributor occupation graph.

## Selecting by n

The top occupation of the contributors is “retired”. “Not employed” comes in second, and “attorney” is the next named occupation. We can also see that a good number of people chose not to give their occupation.

Let’s now look at which dates were the most popular for contributors.

## Selecting by n

With this graph we can see the top 10 dates contributions were made on.

Finally, let’s look at the distribution of the contb_receipt_amt, which shows the amount of the contribution made to each candidate.

## Selecting by n

While there were many different contribution amounts in the dataset, these are the ten amounts that appeared most often. According to an article in the Wall Street Journal, these types of contributions are important for candidates, as they show signs of “grassroots enthusiasm”.

On the other side of the equation are candidates who have been largely bankrolled by donations of $200 or less. The level of small-dollar donations is often read as a sign of grassroots enthusiasm. Candidates who rely heavily on small donations have a key advantage: They can continue to tap the same donors for more money, while campaigns whose donors have hit the legal donation limit must find new donors to grow their war chests.

The Big Picture: Summary

After looking at the distributions in this dataset, we now know that the most active city for contributions is New York, and the most active state is California. Hillary Clinton received more contributions than any other candidate, and ActBlue is the most active contributor in this dataset. ActBlue may be the most active, but a closer look at the data will likely show that other contributors gave more money. Based on the distribution of the contb_receipt_amt column, many different amounts were given, and some were in the millions, but the most common amounts were a few hundred dollars or less. Later in the analysis, I will explore which candidate received the most of these small donations. I also found that most of the contributions happened in February, March, and July. Later on, I will make plots to show the amount of money contributed over the course of 2016.

As we now know in 2018 (when this analysis was done), Donald Trump won the 2016 presidential election, but interestingly enough, he was in third place for the number of contributions received. It will be interesting to dive deeper into the data to see how money flowed to the candidates during this election.

Money Flow

For this section I will look at the relationship between the money and the variables I explored in the earlier section. First, let’s look at the relationship between the candidates and money: who received the most in cash?

Before plotting this graph, I expected to see the same order as in the distribution of the number of contributions made to each candidate. Looking at the plot above, even though Bernie Sanders received more contributions, Donald Trump still came out ahead. This could mean that Sanders was given many small contributions, and Trump was given fewer, larger donations. As expected, Hillary Clinton came out on top and received over $500,000,000 in contributions.
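The difference between ranking candidates by number of contributions and by dollars received can be sketched with a small aggregation. The six rows below are invented to illustrate the effect, and the short candidate labels are placeholders, not values from the dataset.

```r
# Toy contributions: one candidate gets many small gifts, another fewer large ones.
df_toy <- data.frame(
  cand_nm           = c("Clinton", "Sanders", "Sanders", "Sanders", "Trump", "Trump"),
  contb_receipt_amt = c(2700, 27, 15, 50, 1000, 500)
)

# Total dollars per candidate.
totals <- aggregate(contb_receipt_amt ~ cand_nm, data = df_toy, FUN = sum)
names(totals)[2] <- "total"

# Number of contributions per candidate.
counts <- aggregate(contb_receipt_amt ~ cand_nm, data = df_toy, FUN = length)
names(counts)[2] <- "n"

# Combine both measures and order by total dollars received.
by_cand <- merge(counts, totals, by = "cand_nm")
by_cand <- by_cand[order(-by_cand$total), ]
by_cand
```

In this toy data the candidate with the most contributions has the smallest total, which is the same kind of reordering described above.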

To look at this more closely, let’s see which candidates received higher or lower amounts in contributions.

Money Flow: Types of Donations

This graph is hard to read in detail, but it does show us some important information. We can clearly see that Hillary Clinton alone has received donations in the millions. We also see that candidates such as Bernie Sanders stand out from the rest because of the number of smaller donations they received. To look at the data more closely, I will plot the same graph, but with the top 0.5% of each variable removed.
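One way to drop the top 0.5% of a variable is to compute its 99.5th percentile and keep only values at or below it. This is a sketch under that assumption, using a made-up vector of amounts; it is not necessarily how the plot in this report was trimmed.

```r
# Toy amounts: 199 ordinary values and one extreme outlier.
amt <- c(1:199, 1e6)

# 99.5th percentile of the amounts (default quantile type 7).
cutoff <- quantile(amt, 0.995)

# Keep everything at or below the cutoff; the outlier is removed.
trimmed <- amt[amt <= cutoff]

length(trimmed)
max(trimmed)
```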

## Warning: Removed 8001 rows containing missing values (geom_point).

Using alpha in geom_point, we are able to see the volume of each type of donation for each candidate. Ted Cruz, Donald Trump, Bernie Sanders, and Hillary Clinton stand out the most in these graphs. Let’s look at each of these candidates compared to one another.

Now that we can look at these four candidates against each other, we see that Clinton receives more high dollar contributions than any other candidate, with some donations reaching over $10 million. In contrast, Bernie Sanders receives more low dollar donations than all of the other candidates. Cruz and Trump both receive similar types of donations.

Let’s look at this plot again, but with the top 0.5% of contributions removed.

## Warning: Removed 6524 rows containing missing values (geom_point).

Interestingly, each candidate’s contributions seem to stop at around $2,700. This could be because, according to the Federal Election Commission, individuals were limited to giving $2,700 per election to a candidate committee and $5,000 per year to a PAC. The rules differ depending on whether the recipient is a candidate, a PAC, or a party committee.

After analyzing the types of contributions the candidates receive, I became curious about the percentage of each candidate’s donations that were under $200. Small donations are a good indicator of how much support a candidate has from the population.

## Using cand_nm as id variables

As we know, Clinton leads in total contributions and in those under $200. But as we see in the bottom table, Sanders leads all of the candidates in the percentage of donations under $200 at over 70%, with Clinton sitting at 25%. We can also see that about half of the donations to Cruz and Carson were under $200. Studying this graph gives a good indication of what types of donations each candidate receives.
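A share-of-small-donations figure can be sketched as below. Two assumptions are made here that the report does not state: the percentage is taken over the number of contributions (not dollars), and refunds (negative amounts) are excluded first. The rows and the short candidate labels are invented.

```r
# Toy contributions, including one refund (negative amount).
df_toy <- data.frame(
  cand_nm           = c(rep("Clinton", 4), rep("Sanders", 5)),
  contb_receipt_amt = c(2700, 1000, 100, 50, 27, 15, 50, 250, -27)
)

# Assumption: refunds are not donations, so drop non-positive amounts.
positive <- subset(df_toy, contb_receipt_amt > 0)

# Mean of a logical vector is the fraction of TRUE values, so this gives
# the percentage of each candidate's contributions that are under $200.
pct_small <- tapply(positive$contb_receipt_amt < 200, positive$cand_nm, mean) * 100
pct_small
```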

Now that we understand the range of amounts given, and have concluded that Hillary Clinton was the only candidate that received donations of more than $1 million, let’s see just how big these donations are, and who gave them.

Money Flow: The Donors

##                    cand_nm                         contbr_nm
## 1  Clinton, Hillary Rodham HILLARY VICTORY FUND - UNITEMIZED
## 2  Clinton, Hillary Rodham HILLARY VICTORY FUND - UNITEMIZED
## 3  Clinton, Hillary Rodham HILLARY VICTORY FUND - UNITEMIZED
## 4  Clinton, Hillary Rodham HILLARY VICTORY FUND - UNITEMIZED
## 5  Clinton, Hillary Rodham HILLARY VICTORY FUND - UNITEMIZED
## 6  Clinton, Hillary Rodham HILLARY VICTORY FUND - UNITEMIZED
## 7  Clinton, Hillary Rodham HILLARY VICTORY FUND - UNITEMIZED
## 8  Clinton, Hillary Rodham HILLARY VICTORY FUND - UNITEMIZED
## 9  Clinton, Hillary Rodham HILLARY VICTORY FUND - UNITEMIZED
## 10 Clinton, Hillary Rodham HILLARY VICTORY FUND - UNITEMIZED
## 11 Clinton, Hillary Rodham HILLARY VICTORY FUND - UNITEMIZED
## 12 Clinton, Hillary Rodham HILLARY VICTORY FUND - UNITEMIZED
## 13 Clinton, Hillary Rodham HILLARY VICTORY FUND - UNITEMIZED
## 14 Clinton, Hillary Rodham HILLARY VICTORY FUND - UNITEMIZED
## 15 Clinton, Hillary Rodham HILLARY VICTORY FUND - UNITEMIZED
## 16 Clinton, Hillary Rodham HILLARY VICTORY FUND - UNITEMIZED
## 17 Clinton, Hillary Rodham HILLARY VICTORY FUND - UNITEMIZED
## 18 Clinton, Hillary Rodham HILLARY VICTORY FUND - UNITEMIZED
## 19 Clinton, Hillary Rodham HILLARY VICTORY FUND - UNITEMIZED
## 20 Clinton, Hillary Rodham HILLARY VICTORY FUND - UNITEMIZED
## 21 Clinton, Hillary Rodham HILLARY VICTORY FUND - UNITEMIZED
## 22 Clinton, Hillary Rodham HILLARY VICTORY FUND - UNITEMIZED
## 23 Clinton, Hillary Rodham HILLARY VICTORY FUND - UNITEMIZED
##    contb_receipt_amt contb_receipt_dt
## 1          2069048.8        31-AUG-16
## 2          6358481.9        24-AUG-16
## 3         12777705.6        31-JUL-16
## 4          1605003.4        30-SEP-16
## 5          7402361.5        30-JUN-16
## 6          4904860.5        31-MAR-16
## 7          1797624.9        31-DEC-15
## 8          1603724.4        04-DEC-15
## 9          1467070.9        31-JAN-16
## 10         3686373.3        29-FEB-16
## 11         3600489.1        29-APR-16
## 12         4575438.6        31-MAY-16
## 13         2976430.3        19-OCT-16
## 14         2445439.1        06-OCT-16
## 15         4560967.1        29-SEP-16
## 16         2595810.1        22-SEP-16
## 17         2669426.3        15-SEP-16
## 18         4126693.2        13-OCT-16
## 19         2274127.9        03-NOV-16
## 20         1951834.0        07-NOV-16
## 21          974215.3        08-NOV-16
## 22         2766224.4        27-OCT-16
## 23         1486569.5        31-OCT-16

As we can see in the table above, all of the large contributions were provided by the Hillary Victory Fund:

The Hillary Victory Fund was a joint fundraising committee for Hillary for America (the Hillary Clinton presidential campaign organization), the Democratic National Committee (DNC), and 33 state Democratic committees. As of May 2016, the Fund had raised $61 million in donations.

The Fund’s promotional materials described it as a way to “support Hillary Clinton and Democrats up and down the ticket.” Individual donations were first allocated to Hillary for America (up to $2,700 or $5,400 for married couples), then to the Democratic National Committee (up to $33,400) and finally divided among state parties. During the primaries, the state parties received little of the funds raised. The Bernie Sanders campaign criticized the Fund and alleged that Clinton’s campaign was “looting funds meant for the state parties to skirt fundraising limits on her presidential campaign.”

Now that we’ve seen high dollar contributions for the Clinton campaign, let’s take a look at Ted Cruz, Donald Trump, and Bernie Sanders.

##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -16603.0    100.0    250.0    547.3    533.0  28750.0

Out of 102,881 donors, Cruz’s top donor is Willie T. Langston who gave $28,750 to the 2016 campaign.

Now we will look at the top donors for Bernie Sanders.

##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
##  -90761.1     100.0     211.2     395.0     456.0 3095699.9
## # A tibble: 2 x 2
##   contbr_nm    contributed_amount
##   <fct>                     <dbl>
## 1 ACTBLUE                3095700.
## 2 MARTIN, JOHN             10279.

As discussed earlier, Bernie Sanders’ campaign took in many small donations, 233,517 to be exact. Interestingly, Sanders has a much smaller average contribution than Cruz but received a considerable amount overall. One notable donor is ActBlue, whose contributions total $3,095,699.89.
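Per-donor totals like the ones summarized above can be sketched by summing contb_receipt_amt for each contributor name. The rows below are invented; contributed_amount matches the column name in the table, and the rest is an illustrative assumption.

```r
# Toy contributions to a single candidate, including a refund.
one_cand <- data.frame(
  contbr_nm         = c("ACTBLUE", "ACTBLUE", "MARTIN, JOHN", "DOE, JANE", "DOE, JANE"),
  contb_receipt_amt = c(1000, 2000, 500, 27, -27)
)

# Sum every contribution (and refund) per contributor name.
donor_totals <- aggregate(contb_receipt_amt ~ contbr_nm, data = one_cand, FUN = sum)
names(donor_totals)[2] <- "contributed_amount"

# Distribution of per-donor totals, and the single largest donor.
summary(donor_totals$contributed_amount)
top_donor <- donor_totals[which.max(donor_totals$contributed_amount), ]
top_donor
```

Because refunds are summed in with contributions, a donor's total can be zero or negative, which is consistent with the negative minimums in the summaries.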

While we are discussing ActBlue, let’s take a look at which candidates they contributed to, and how much.

Sanders was the only candidate ActBlue contributed to in this dataset. They donated to him over 15 thousand times in 2016. According to opensecrets.org, ActBlue spent $645.3 million for the 2016 election.

Next, we look at Donald Trump’s donors.

##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -17499.2     28.0     71.7    236.0    200.0  22525.3

For the 2016 election, Donald Trump had 526,807 donors. These contributors differ from those of the other candidates in that few of them gave a total of more than $20,000. Also, even though Trump had a great number of donors, he still did not receive more money than Clinton or Sanders. This could be because of the high dollar donations Clinton received from the Hillary Victory Fund and the volume of donations Sanders received through ActBlue.

Now that we understand how money relates to each candidate, let’s look at how money moved over time.

Money Flow: Time

This plot gives us a good idea of how much money was given to each candidate throughout 2016. Earlier in our analysis we looked at the distribution of the dates in this dataset and concluded that the most popular months were February, March, and July. Interestingly, these were not months where the most money was given as we can see in this plot.
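To total money by month, the contb_receipt_dt values (formatted like 29-FEB-16) first have to be converted to dates. The sketch below uses a hypothetical helper, parse_fec_date, that maps the month abbreviation by hand so the result does not depend on the system locale. The amounts are made up.

```r
# Toy receipts using the date format seen in the dataset.
receipts <- data.frame(
  contb_receipt_dt  = c("29-FEB-16", "31-MAR-16", "12-JUL-16", "11-JUL-16", "31-DEC-15"),
  contb_receipt_amt = c(50, 100, 25, 75, 10)
)

# Hypothetical helper: turn "DD-MON-YY" into a Date, assuming years 20YY.
parse_fec_date <- function(x) {
  x    <- as.character(x)
  day  <- substr(x, 1, 2)
  mon  <- match(substr(x, 4, 6), toupper(month.abb))  # "FEB" -> 2
  year <- substr(x, 8, 9)
  as.Date(sprintf("20%s-%02d-%s", year, mon, day))
}

receipts$date  <- parse_fec_date(receipts$contb_receipt_dt)
receipts$month <- format(receipts$date, "%Y-%m")

# Total dollars per calendar month.
monthly <- aggregate(contb_receipt_amt ~ month, data = receipts, FUN = sum)
monthly
```

Comparing a table of counts per month with this table of dollars per month is one way to check whether the busiest dates are also the dates with the most money.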

Let’s look at the four candidates discussed earlier to get a better look at how much money was donated to them through 2016.

Plotting this graph shows us that Cruz and Sanders have similar donation patterns. Clinton and Trump also resemble each other. Another interesting observation is that there seem to be spikes of donations for Clinton before the end of every month.

Next we will look at donations made based on an individual’s occupation.

Money Flow: Occupations

## Selecting by contribution_total

Out of 127,675 different occupations that contributed to presidential campaigns, “retired” was the top occupation. In second place is a blank label; these contributors decided not to give their occupation information.

Let’s look at how each occupation relates to each candidate.

Hillary Clinton has received the most donations in almost every one of the top 25 occupations. One notable occupation category is “Requested Per Best Efforts”, where candidates like Jeb Bush, Ben Carson, Marco Rubio, and Ted Cruz all have high amounts of donations. Another notable candidate in these plots is Donald Trump, who received most of his donations from occupations such as “Retired”, “Real Estate”, “Information Requested”, “Owner”, and “Sales”. Let’s take a closer look at the top candidates in a similar plot.

When we narrow down the number of candidates, we see that Sanders has a few notable occupation categories such as “Not Employed”, “Engineer”, “Software Engineer”, “Student”, “Teacher”, and “Writer”. Donald Trump also has more donations in the “Business Owner” category than Hillary Clinton.

Next, we will look at cities in the same way as we did above.

Money Flow: Cities

## Selecting by contribution_total

As we discovered in the distribution of the cities in this data set, contributors from New York gave the most in 2016.

Let’s look at which candidates the cities gave to.

Unsurprisingly, Clinton leads in all of the top 25 cities except for Houston. Cities such as Dallas, Houston, Las Vegas, Miami, and San Antonio have multiple candidates with significant contributions.
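Which candidate leads in each city can be checked with a cross-tabulation of dollars by city and candidate. The sketch below uses invented amounts and short placeholder candidate labels; only the column names come from the dataset.

```r
# Toy contributions from two cities to two candidates.
df_toy <- data.frame(
  contbr_city       = c("HOUSTON", "HOUSTON", "HOUSTON", "NEW YORK", "NEW YORK"),
  cand_nm           = c("Cruz", "Cruz", "Clinton", "Clinton", "Cruz"),
  contb_receipt_amt = c(500, 300, 600, 1000, 100)
)

# Rows are cities, columns are candidates, cells are summed dollars.
city_cand <- xtabs(contb_receipt_amt ~ contbr_city + cand_nm, data = df_toy)

# For each city, pick the candidate whose column holds the largest total.
leader <- colnames(city_cand)[apply(city_cand, 1, which.max)]
names(leader) <- rownames(city_cand)
leader
```

The same pattern works for occupations by swapping contbr_city for contbr_occupation.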

Finally, here are the top candidates compared to each other.

Ted Cruz received most of his donations from cities in Texas such as Houston, where he even beat Clinton. His other top cities are San Antonio, Dallas, and Austin.

Money Flow: Summary

For this analysis, my goal was to investigate the flow of money in the 2016 presidential election. First, I discovered that Hillary Clinton received the most money in the campaign compared to all of the candidates, with Bernie Sanders, Ted Cruz, and Donald Trump coming in behind her.

I then wanted to explore which candidates received high dollar and low dollar donations and made scatter plots that showed the distribution of donations for each candidate. Hillary Clinton received a great number of the high dollar donations, while Sanders, Cruz, and Trump received a large number of low dollar donations. According to an article in the Wall Street Journal, these types of contributions are important for candidates, as they show signs of “grassroots enthusiasm”.

After seeing the types of donations each candidate was receiving, I looked at who these donations were coming from. All of Clinton’s donations that were over $1 million came from the Hillary Victory Fund. This source of money for the Clinton campaign turned out to be controversial, as it was accused of operating illegally and of money laundering. Nevertheless, the fund enabled Clinton to have the most money compared to all of the other candidates. Another notable donor was ActBlue. This organization is a nonprofit that builds fundraising technology for the left. They donated to Sanders over 15 thousand times in the 2016 campaign and did not donate to any other candidate in this data set. According to opensecrets.org, ActBlue spent $645.3 million for the 2016 election.

Finally, I used the data to explore which cities and occupations donated to each candidate. Overall, donors who were retired gave the most money, and New York was the top city. Hillary Clinton led in all of these categories, but Cruz and Trump had significant donations from cities in Texas.

Final Thoughts

When I first started this project, I wanted to study a single state. I chose Kentucky since I would be moving there soon. Wouldn’t it be great to understand the political spectrum of the state before you move there? I thought so. But then I thought, wouldn’t it be even better to understand all of America? It wasn’t hard to find the data for every single state. There was a simple button called ALL.zip on the Federal Election Commission page. Once it was downloaded, I realized how big this dataset was: 1.4 GB, 7,440,252 rows, and 18 variables, my biggest dataset to explore so far. The task before me felt daunting. After spending a great amount of time with the dataset, I became much more comfortable with it. The following are my best observations and the plots that I think are the most interesting.

Plot: Who’s The Millionaire?

Well, to answer the question in the headline, all of the candidates are millionaires. One candidate that repeatedly stood out was Hillary Clinton. As you can see on the y axis for each candidate, Hillary’s extends into the millions for the types of contributions. If I were to use the same x and y scale for all of the other candidates, the donations they received would look like tiny dots.

With this plot, we are able to compare the types of contributions each candidate received during the 2016 election. Understanding this could help us draw conclusions about which type of candidate they are. For example, according to an article in the Wall Street Journal, the level of small-dollar donations is often read as a sign of grassroots enthusiasm. Candidates who rely on small donations have an advantage because they can continue to tap the same donors for more money, while campaigns whose donors have hit the legal donation limit have to find new donors to keep the money rolling in.

From the plot, we see that Bernie Sanders shows great levels of grassroots enthusiasm with over 300 thousand small contributions. Other candidates such as Donald Trump, Ted Cruz, and Ben Carson also had a decent number of small contributions. This point was reinforced when I calculated the percentage of donations under $200 for each candidate.

Plot: The Grassroots Candidate

After plotting this graph, it was easy to compare the candidates that showed more grassroots enthusiasm. Bernie Sanders leads the pack with over 70% of his donations under $200, with Ted Cruz, Ben Carson, and Donald Trump behind. Clinton, the candidate with the most donations throughout the dataset, has about 25% of her donations under $200.

Plot: Occupations Supporting Candidates

Understanding which occupations give the most to each candidate could also be a very useful statistic. With this plot, we can better understand the types of people that donate to each candidate. For example, one notable occupation category is “Requested Per Best Efforts”, where candidates like Jeb Bush, Ben Carson, Marco Rubio, and Ted Cruz all have high amounts of donations. Another notable candidate is Donald Trump, who received most of his donations from occupations such as “Retired”, “Real Estate”, “Information Requested”, “Owner”, and “Sales”. Sanders has a few notable occupation categories such as “Not Employed”, “Engineer”, “Software Engineer”, “Student”, “Teacher”, and “Writer”.

These three plots in this section are great summaries of this dataset. They help us better understand what kind of money each candidate is receiving, and what kind of people are giving to them.

Reflection

Exploring this data proved to be challenging mainly because of the large number of variables for each plot. It was hard to present such large values on plots in a manner that would be understandable to the audience. The main relationship I studied in this data set was between the candidates and the money that was donated to them, but further exploration can be done. For example, it would be useful to map out the United States using the zip code column and show statistics for each state. You could even go as far as building an interactive map that lets users explore how money flowed in the campaign.

Overall, exploring this data was fun and eye-opening, and I achieved the goal that I set out for myself.

Sources