~hrbrmstr/epidata

ref: 047ff85a9e43657b4ea6c2223ff1645e592ced1c epidata/README.Rmd -rw-r--r-- 4.3 KiB
047ff85aBob Rudis README 4 years ago
                                                                                
---
output: rmarkdown::github_document
---

`epidata` : Tools to Retrieve Economic Policy Institute Data Library Extracts

The [Economic Policy Institute](http://www.epi.org/data/) provides researchers, media, and
the public with easily accessible, up-to-date, and comprehensive historical data on the 
American labor force. It is compiled from Economic Policy Institute analysis of government
data sources. Use it to research wages, inequality, and other economic indicators over 
time and among demographic groups. Data is usually updated monthly.

The following functions are implemented:

- `get_black_white_wage_gap`: Retrieve the percent by which hourly wages of black workers
   are less than hourly wages of white workers
- `get_college_wage_premium`: Retrieve the percent by which hourly wages of college graduates
   exceed those of otherwise equivalent high school graduates
- `get_employment_to_population_ratio`: Retrieve the share of the civilian noninstitutional
   population that is employed
- `get_gender_wage_gap`: Retrieve the percent by which hourly wages of female workers are
   less than hourly wages of male workers
- `get_hispanic_white_wage_gap`: Retrieve the percent by which hourly wages of Hispanic
   workers are less than hourly wages of white workers
- `get_labor_force_participation_rate`: Retrieve the share of the civilian noninstitutional
   population that is in the labor force
- `get_long_term_unemployment`: Retrieve the share of the labor force that has been unemployed
   for six months or longer
- `get_median_and_mean_wages`: Retrieve the hourly wage in the middle of the wage distribution
- `get_non_high_school_wage_penalty`: Retrieve the percent by which hourly wages of workers
   without a high school diploma (or equivalent) are less than wages of otherwise equivalent
   workers who have graduated from high school
- `get_underemployment`: Retrieve the share of the labor force that is "underemployed"
- `get_unemployment`: Retrieve the share of the labor force without a job
- `get_unemployment_by_state`: Retrieve the share of the labor force without a job (by state)
- `get_wages_by_education`: Retrieve the average hourly wages of workers disaggregated by the
   highest level of education attained
- `get_wages_by_percentile`: Retrieve wages at ten distinct points in the wage distribution
- `get_wage_ratios`: Retrieve the level of inequality within the hourly wage distribution

### Installation

```{r eval=FALSE}
devtools::install_github("hrbrmstr/epidata")
```

```{r message=FALSE, warning=FALSE, error=FALSE, include=FALSE}
options(width=120)
```

### Usage

```{r message=FALSE, warning=FALSE, error=FALSE}
library(epidata)

# current version
packageVersion("epidata")

get_black_white_wage_gap()

get_underemployment()

get_median_and_mean_wages("gr")
```

### Extended Example

```{r message=FALSE, warning=FALSE, error=FALSE, fig.width=10, fig.height=8, fig.retina=2}
library(tidyverse)
library(epidata)
library(ggrepel)

unemployment <- get_unemployment()
wages <- get_median_and_mean_wages()

glimpse(wages)

glimpse(unemployment)

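# collapse unemployment to one mean rate per calendar year,
# attach that year's median wage, and drop years without wage data
group_by(unemployment, date=as.integer(lubridate::year(date))) %>%
  summarise(rate=mean(all)) %>%
  left_join(select(wages, date, median), by="date") %>%
  filter(!is.na(median)) %>%
  arrange(date) -> df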

cols <- ggthemes::tableau_color_pal()(3)

ggplot(df, aes(rate, median)) +
  geom_path(color=cols[1], arrow=arrow(type="closed", length=unit(10, "points"))) +
  geom_point() +
  geom_label_repel(aes(label=date),
                   alpha=c(1, rep((4/5), (nrow(df)-2)), 1),
                   size=c(5, rep(3, (nrow(df)-2)), 5),
                   color=c(cols[2],
                           rep("#2b2b2b", (nrow(df)-2)),
                           cols[3]),
                   family="Hind Medium") +
  scale_x_continuous(name="Unemployment Rate", expand=c(0,0.001), labels=scales::percent) +
  scale_y_continuous(name="Median Wage", expand=c(0,0.25), labels=scales::dollar) +
  labs(title="U.S. Unemployment Rate vs Median Wage Since 1978",
       subtitle="Wage data is in 2015 USD",
       caption="Source: EPI analysis of Current Population Survey Outgoing Rotation Group microdata") +
  hrbrmisc::theme_hrbrmstr(grid="XY")
```
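
The reshaping step in the example above (average the unemployment rate per year, join the annual median wage, drop years with no wage figure) can be tried offline with a small sketch. All values below are invented for illustration and are not EPI data. The column names `date`, `all` and `median` mirror the ones used above. It is an assumption here that the unemployment table has several rows per year and that the wage table is keyed by an integer year. The year is extracted with base `format()` rather than `lubridate::year()` to keep the dependencies down to `dplyr`.

```r
library(dplyr)

# Toy unemployment rates, several rows per year (invented values)
unemployment <- tibble(
  date = as.Date(c("2014-01-01", "2014-02-01",
                   "2015-01-01", "2015-02-01",
                   "2016-01-01")),
  all  = c(0.066, 0.067, 0.057, 0.055, 0.049)
)

# Toy annual median wages (invented values); 2016 is deliberately missing
wages <- tibble(
  date   = c(2014L, 2015L),
  median = c(17.00, 17.50)
)

df <- unemployment %>%
  # replace the Date column with its integer year and group on it
  group_by(date = as.integer(format(date, "%Y"))) %>%
  # one mean rate per year
  summarise(rate = mean(all)) %>%
  # attach the wage for the matching year (NA where none exists)
  left_join(wages, by = "date") %>%
  # keep only years that have a wage figure
  filter(!is.na(median)) %>%
  arrange(date)

df
```

The 2016 row disappears because it has no matching wage, which is why the plot above only covers years present in both data sets.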

### Test Results

```{r message=FALSE, warning=FALSE, error=FALSE}
library(epidata)
library(testthat)

date()

test_dir("tests/")
```