
Commit 5f5b7c1

Merge pull request #25 from jamesdunham/v0.2.12
Merge v0.2.12
2 parents 25dd83f + 73eb8fc commit 5f5b7c1

21 files changed: +585 -589 lines changed

DESCRIPTION

Lines changed: 2 additions & 2 deletions
@@ -1,7 +1,7 @@
 Package: dgo
 Title: Dynamic Estimation of Group-Level Opinion
-Version: 0.2.11
-Date: 2017-10-26
+Version: 0.2.12
+Date: 2017-11-13
 Description: Fit dynamic group-level IRT and MRP models from individual or
 aggregated item response data. This package handles common preprocessing
 tasks and extends functions for inspecting results, poststratification, and

Makefile

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@ else
 R := R
 endif

-all: clean docs data readme build check install
+all: clean docs data readme build check install site

 quick: clean

NEWS.md

Lines changed: 16 additions & 0 deletions
@@ -1,3 +1,19 @@
+## dgo 0.2.12
+
+* Allow modeling of unobserved groups using aggregated data. The previous
+behavior was to drop rows in `aggregate_data` indicating zero trials. (They
+don't represent item responses.) Preserving them has the effect that
+unobserved groups, defined partially or entirely by the values of the grouping
+variables in zero-trial rows in `aggregate_data`, can be included in a model.
+* Fix an unexpected error when 1) `aggregate_data` is used without `item_data`,
+2) no demographic groups are specified via `group_names`, and 3) geographic
+`modifier_data` is used.
+* Fix the check for missing `modifier_data`. Geographic `modifier_data` must
+cover all combinations of the geo and time variables in the item response data
+(individual or aggregated), but because of a bug in the validation of the
+geographic data, this requirement was not always enforced. In some cases a
+warning would appear instead of an error.
+
 ## dgo 0.2.11

 * Add poststratification over posterior samples (closes #21).
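
Note on the first 0.2.12 entry: the effect of keeping zero-trial rows can be sketched in base R. This is an illustration only, not dgo's internal code. The columns `item`, `n_grp`, and `s_grp` appear in this commit; the `state`, `year`, and `race3` columns, the values, and the `groups()` helper are assumptions made up for the example.

```r
# Hypothetical aggregated data: one row per group-item combination, with
# trial counts (n_grp) and success counts (s_grp). The grouping columns and
# all values are invented for illustration.
aggregate_data <- data.frame(
  state = c("AK", "AK", "AL", "AL"),
  year  = 2006L,
  race3 = c("white", "black", "white", "black"),
  item  = "abortion",
  n_grp = c(120, 0, 95, 40),
  s_grp = c(70, 0, 50, 22),
  stringsAsFactors = FALSE
)

# Behavior before 0.2.12: rows with zero trials were dropped.
old <- aggregate_data[aggregate_data$n_grp > 0, ]

# Hypothetical helper: the distinct groups implied by a data frame.
groups <- function(d) unique(d[, c("state", "race3")])

nrow(groups(old))             # 3: the AK/black group is lost
nrow(groups(aggregate_data))  # 4: the zero-trial row still defines a group
```

With the filter gone, a group that was never observed can still enter the model through the values of its grouping variables.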

R/restrict_input_data.r

Lines changed: 1 addition & 6 deletions
@@ -63,7 +63,7 @@ restrict_modifier <- function(modifier_data, group_grid, ctrl) {
   modifier_data <- modifier_data[geo_time_grid, nomatch = 0]

   # confirm that modifier data covers all modeled geo and time
-  missing_geo_time <- modifier_data[!geo_time_grid]
+  missing_geo_time <- geo_time_grid[!modifier_data]
   if (nrow(missing_geo_time)) {
     stop("Not all pairs of time periods and geographic areas are in ",
       "modifier_data. ", nrow(missing_geo_time), " missing.")
@@ -122,11 +122,6 @@ restrict_aggregates <- function(aggregate_data, ctrl) {
     stop("no rows in aggregate data remaining after subsetting to items ",
       "in `aggregate_item_names`")

-  aggregate_data <- aggregate_data[get("n_grp") > 0]
-  if (!nrow(aggregate_data))
-    stop("no rows in aggregate data remaining after dropping unobserved ",
-      "group-item combinations")
-
   extra_colnames <- setdiff(names(aggregate_data),
     c(ctrl@geo_name, ctrl@time_name, ctrl@group_names, "item", "s_grp", "n_grp"))
   if (length(extra_colnames)) {
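
Note on the `restrict_modifier()` fix: the direction of a data.table anti-join determines which table's unmatched rows are returned. The sketch below uses toy tables to show why swapping the operands matters. The column names `state`, `year`, and `income` and all values are assumptions, and the joins use an explicit `on =` rather than the keys the package sets.

```r
library(data.table)

# Every geo-time pair the model needs (toy values).
geo_time_grid <- CJ(state = c("AK", "AL"), year = 2006:2007)

# Modifier data that lacks the AL/2007 pair.
modifier_data <- data.table(
  state  = c("AK", "AK", "AL"),
  year   = c(2006L, 2007L, 2006L),
  income = c(1.2, 1.3, 0.9)
)

by_cols <- c("state", "year")

# Inner join: keep only the modifier rows that fall inside the grid.
restricted <- modifier_data[geo_time_grid, on = by_cols, nomatch = 0]

# Old check: modifier rows absent from the grid. After the inner join
# above, every remaining row matches the grid, so this is empty here.
old_check <- restricted[!geo_time_grid, on = by_cols]

# New check: grid rows absent from the modifier data.
new_check <- geo_time_grid[!restricted, on = by_cols]

nrow(old_check)  # 0
nrow(new_check)  # 1 (AL, 2007)
```

In this toy case only the second form reports the missing pair, which matches the requirement stated in NEWS.md that `modifier_data` cover all geo-time combinations.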

R/shape_hierarchical.r

Lines changed: 6 additions & 3 deletions
@@ -10,8 +10,10 @@ shape_hierarchical_data <- function(modifier_data, modifier_names, group_grid_t,
   hierarchical <- data.table::copy(modifier_data)
   hierarchical <- drop_extra_cols(hierarchical, modifier_names, ctrl)
   data.table::setkeyv(hierarchical, c(ctrl@geo_name, ctrl@time_name))
-  unmodeled <- zero_unmodeled(hierarchical, modifier_names, group_grid_t, ctrl)
-  hierarchical <- rbind(hierarchical, unmodeled)
+  if (length(ctrl@group_names)) {
+    unmodeled <- zero_unmodeled(hierarchical, modifier_names, group_grid_t, ctrl)
+    hierarchical <- rbind(hierarchical, unmodeled)
+  }
   zz <- create_zz(hierarchical, modifier_names, ctrl)
   return(zz)
 }
@@ -40,7 +42,8 @@ zero_unmodeled <- function(hierarchical, modifier_names, group_grid_t, ctrl) {
     paste0(x, unique(group_grid_t[[x]]))[-1]
   }))
   unmodeled_frame <- expand.grid(c(list(unmodeled_param_levels,
-    ctrl@time_filter), rep(list(0L), length(modifier_names))))
+    ctrl@time_filter), rep(list(0L), length(modifier_names))),
+    stringsAsFactors = FALSE)
   unmodeled_frame <- setNames(unmodeled_frame, c(ctrl@geo_name, ctrl@time_name,
     modifier_names))
   data.table::setDT(unmodeled_frame, key = c(ctrl@geo_name, ctrl@time_name))
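
Note on the `stringsAsFactors = FALSE` argument: `expand.grid()` converts character vectors to factors by default, so the flag changes the column type of the generated frame. A minimal base-R sketch with made-up inputs follows. The reading that this keeps the geographic column's type consistent with character data elsewhere in the function is an assumption, not something the commit states.

```r
# Default behavior: character inputs become factor columns.
default_grid <- expand.grid(geo = c("AK", "AL"), year = 2006:2007)

# With the argument added in this commit: character inputs stay character.
char_grid <- expand.grid(geo = c("AK", "AL"), year = 2006:2007,
                         stringsAsFactors = FALSE)

class(default_grid$geo)  # "factor"
class(char_grid$geo)     # "character"
```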

README.Rmd

Lines changed: 21 additions & 20 deletions
@@ -1,4 +1,5 @@
 ---
+title: 'dgo: Dynamic Estimation of Group-Level Opinion'
 output: github_document
 ---
 [![Build Status](https://travis-ci.org/jamesdunham/dgo.svg?branch=master)](https://travis-ci.org/jamesdunham/dgo)
@@ -7,29 +8,29 @@ output: github_document

 # Introduction

-dgo is an R package for the dynamic estimation of group-level opinion. The
-package can be used to estimate subpopulation groups' average latent
-conservatism (or other latent trait) from individuals' responses to dichotomous
-questions using a Bayesian group-level IRT approach developed by [Caughey and
-Warshaw
-2015](http://pan.oxfordjournals.org/content/early/2015/02/04/pan.mpu021.full.pdf+html)
-that models latent traits at the level of demographic and/or geographic groups
-rather than individuals. This approach uses a hierarchical model to borrow
-strength cross-sectionally and dynamic linear models to do so across time. The
-group-level estimates can be weighted to generate estimates for geographic
-units, such as states.
-
-dgo can also be used to estimate smoothed estimates of subpopulation groups'
-average responses on individual survey questions using a dynamic multi-level
-regression and poststratification (MRP) model ([Park, Gelman, and Bafumi
+dgo is an R package for the dynamic estimation of group-level public opinion.
+You can use the package to estimate latent trait means in subpopulations from
+survey data. For example, dgo can estimate the average policy liberalism in each
+American state over time among Democrats, Independents, and Republicans, given
+their answers to survey questions about policy proposals.
+
+dgo accomplishes this using a Bayesian group-level IRT approach developed by
+[Caughey and Warshaw
+2015](http://pan.oxfordjournals.org/content/early/2015/02/04/pan.mpu021.full.pdf+html).
+It models latent traits at the level of demographic and geographic groups rather
+than individuals. It uses a hierarchical model to borrow strength
+cross-sectionally and dynamic linear models to do so across time.
+
+The package can also be used to estimate smoothed estimates of subpopulations'
+average responses to single survey items, using a dynamic multi-level regression
+and poststratification (MRP) model ([Park, Gelman, and Bafumi
 2004](http://stat.columbia.edu/~gelman/research/published/StateOpinionsNationalPolls.050712.dkp.pdf)).
-For instance, it could be used to estimate public opinion in each state on
+For instance, you can use dgo to estimate public opinion in each state on
 same-sex marriage or the Affordable Care Act.

 This model opens up new areas of research on historical public opinion in the
-United States at the subnational level. It also enables scholars of comparative
-politics to estimate dynamic models of public opinion opinion at the country or
-subnational level.
+United States at the subnational level. It also allows scholars of comparative
+politics to estimate dynamic cross-national models of public opinion.

 ```{r, knitr-options, echo = FALSE}
 # rmarkdown::render("README.Rmd")
@@ -67,7 +68,7 @@ If you don't have already have RStan, follow its
 Load the package and set RStan's recommended options for a local, multicore
 machine with excess RAM:

-```{r, result = 'hide'}
+```{r, result = 'hide', message = FALSE}
 library(dgo)
 rstan_options(auto_write = TRUE)
 options(mc.cores = parallel::detectCores())

README.md

Lines changed: 92 additions & 47 deletions
@@ -1,73 +1,118 @@
-
-[![Build Status](https://travis-ci.org/jamesdunham/dgo.svg?branch=master)](https://travis-ci.org/jamesdunham/dgo) [![Build status](https://ci.appveyor.com/api/projects/status/1ta36kmoqen98k87?svg=true)](https://ci.appveyor.com/project/jamesdunham/dgo) [![codecov](https://codecov.io/gh/jamesdunham/dgo/branch/master/graph/badge.svg)](https://codecov.io/gh/jamesdunham/dgo)
-
-Introduction
-============
-
-dgo is an R package for the dynamic estimation of group-level opinion. The package can be used to estimate subpopulation groups' average latent conservatism (or other latent trait) from individuals' responses to dichotomous questions using a Bayesian group-level IRT approach developed by [Caughey and Warshaw 2015](http://pan.oxfordjournals.org/content/early/2015/02/04/pan.mpu021.full.pdf+html) that models latent traits at the level of demographic and/or geographic groups rather than individuals. This approach uses a hierarchical model to borrow strength cross-sectionally and dynamic linear models to do so across time. The group-level estimates can be weighted to generate estimates for geographic units, such as states.
-
-dgo can also be used to estimate smoothed estimates of subpopulation groups' average responses on individual survey questions using a dynamic multi-level regression and poststratification (MRP) model ([Park, Gelman, and Bafumi 2004](http://stat.columbia.edu/~gelman/research/published/StateOpinionsNationalPolls.050712.dkp.pdf)). For instance, it could be used to estimate public opinion in each state on same-sex marriage or the Affordable Care Act.
-
-This model opens up new areas of research on historical public opinion in the United States at the subnational level. It also enables scholars of comparative politics to estimate dynamic models of public opinion opinion at the country or subnational level.
-
-Installation
-============
-
-dgo can be installed from [CRAN](https://CRAN.R-project.org/package=dgo):
+dgo: Dynamic Estimation of Group-Level Opinion
+================
+
+[![Build
+Status](https://travis-ci.org/jamesdunham/dgo.svg?branch=master)](https://travis-ci.org/jamesdunham/dgo)
+[![Build
+status](https://ci.appveyor.com/api/projects/status/1ta36kmoqen98k87?svg=true)](https://ci.appveyor.com/project/jamesdunham/dgo)
+[![codecov](https://codecov.io/gh/jamesdunham/dgo/branch/master/graph/badge.svg)](https://codecov.io/gh/jamesdunham/dgo)
+
+# Introduction
+
+dgo is an R package for the dynamic estimation of group-level public
+opinion. You can use the package to estimate latent trait means in
+subpopulations from survey data. For example, dgo can estimate the
+average policy liberalism in each American state over time among
+Democrats, Independents, and Republicans, given their answers to survey
+questions about policy proposals.
+
+dgo accomplishes this using a Bayesian group-level IRT approach
+developed by [Caughey and Warshaw
+2015](http://pan.oxfordjournals.org/content/early/2015/02/04/pan.mpu021.full.pdf+html).
+It models latent traits at the level of demographic and geographic
+groups rather than individuals. It uses a hierarchical model to borrow
+strength cross-sectionally and dynamic linear models to do so across
+time.
+
+The package can also be used to estimate smoothed estimates of
+subpopulations’ average responses to single survey items, using a
+dynamic multi-level regression and poststratification (MRP) model
+([Park, Gelman, and Bafumi
+2004](http://stat.columbia.edu/~gelman/research/published/StateOpinionsNationalPolls.050712.dkp.pdf)).
+For instance, you can use dgo to estimate public opinion in each state
+on same-sex marriage or the Affordable Care Act.
+
+This model opens up new areas of research on historical public opinion
+in the United States at the subnational level. It also allows scholars
+of comparative politics to estimate dynamic cross-national models of
+public opinion.
+
+# Installation
+
+dgo can be installed from
+[CRAN](https://CRAN.R-project.org/package=dgo):

 ``` r
 install.packages("dgo")
 ```

-Or get the latest version from [GitHub](https://github.com/jamesdunham/dgo) using [devtools](https://github.com/hadley/devtools/):
+Or get the latest version from
+[GitHub](https://github.com/jamesdunham/dgo) using
+[devtools](https://github.com/hadley/devtools/):

 ``` r
 if (!require(devtools, quietly = TRUE)) install.packages("devtools")
 devtools::install_github("jamesdunham/dgo")
 ```

-dgo requires a working installation of [RStan](http://mc-stan.org/interfaces/rstan.html). If you don't have already have RStan, follow its "[Getting Started](https://github.com/stan-dev/rstan/wiki/RStan-Getting-Started)" guide.
+dgo requires a working installation of
+[RStan](http://mc-stan.org/interfaces/rstan.html). If you don’t have
+already have RStan, follow its “[Getting
+Started](https://github.com/stan-dev/rstan/wiki/RStan-Getting-Started)”
+guide.

-Usage
-=====
+# Usage

-Load the package and set RStan's recommended options for a local, multicore machine with excess RAM:
+Load the package and set RStan’s recommended options for a local,
+multicore machine with excess RAM:

 ``` r
 library(dgo)
-#> Loading required package: dgodata
-#> Loading required package: rstan
-#> Loading required package: ggplot2
-#> Loading required package: StanHeaders
-#> rstan (Version 2.16.2, packaged: 2017-07-03 09:24:58 UTC, GitRev: 2e1f913d3ca3)
-#> For execution on a local, multicore CPU with excess RAM we recommend calling
-#> rstan_options(auto_write = TRUE)
-#> options(mc.cores = parallel::detectCores())
 rstan_options(auto_write = TRUE)
 options(mc.cores = parallel::detectCores())
 ```

 The minimal workflow from raw data to estimation is:

 1. shape input data using the `shape()` function; and
-2. pass the result to the `dgirt()` function to estimate a latent trait (e.g., conservatism) or `dgmrp()` function to estimate opinion on a single survey question.
-
-Troubleshooting
-===============
-
-Please [report issues](https://github.com/jamesdunham/dgo/issues) that you encounter.
-
-- OS X only: RStan creates temporary files during estimation in a location given by `tempdir()`, typically an arbitrary location in `/var/folders`. If a model runs for days, these files can be cleaned up while still needed, which induces an error. A good solution is to set a safer path for temporary files, using an environment variable checked at session startup. For help setting environment variables, see the Stack Overflow question [here](https://stackoverflow.com/questions/17107206/change-temporary-directory). Confirm the new path before starting your model run by restarting R and checking the output from `tempdir()`.
-
-- Models fitted before October 2016 (specifically &lt; [\#8e6a2cf](https://github.com/jamesdunham/dgo/commit/8e6a2cfbe00b2cd4a908b3067241e06124d143cd)) using dgirt are not fully compatible with dgo. Their contents can be extracted without using dgo, however, with the `$` indexing operator. For example: `as.data.frame(dgirtfit_object$stan.cmb)`.
-
-- Calling `dgirt()` or `dgmrp()` can generate [warnings](http://mc-stan.org/misc/warnings#compiler-warnings) during model compilation. These are safe to ignore, or can be suppressed by following the linked instructions.
-
-Contributing and citing
-=======================
-
-dgo is under development and we welcome [suggestions](https://github.com/jamesdunham/dgo/issues).
+2. pass the result to the `dgirt()` function to estimate a latent trait
+(e.g., conservatism) or `dgmrp()` function to estimate opinion on a
+single survey question.
+
+# Troubleshooting
+
+Please [report issues](https://github.com/jamesdunham/dgo/issues) that
+you encounter.
+
+- OS X only: RStan creates temporary files during estimation in a
+location given by `tempdir()`, typically an arbitrary location in
+`/var/folders`. If a model runs for days, these files can be cleaned
+up while still needed, which induces an error. A good solution is to
+set a safer path for temporary files, using an environment variable
+checked at session startup. For help setting environment variables,
+see the Stack Overflow question
+[here](https://stackoverflow.com/questions/17107206/change-temporary-directory).
+Confirm the new path before starting your model run by restarting R
+and checking the output from `tempdir()`.
+
+- Models fitted before October 2016 (specifically \<
+[\#8e6a2cf](https://github.com/jamesdunham/dgo/commit/8e6a2cfbe00b2cd4a908b3067241e06124d143cd))
+using dgirt are not fully compatible with dgo. Their contents can be
+extracted without using dgo, however, with the `$` indexing
+operator. For example: `as.data.frame(dgirtfit_object$stan.cmb)`.
+
+- Calling `dgirt()` or `dgmrp()` can generate
+[warnings](http://mc-stan.org/misc/warnings#compiler-warnings)
+during model compilation. These are safe to ignore, or can be
+suppressed by following the linked instructions.
+
+# Contributing and citing
+
+dgo is under development and we welcome
+[suggestions](https://github.com/jamesdunham/dgo/issues).

 The package citation is:

-Dunham, James, Devin Caughey, and Christopher Warshaw. 2017. dgo: Dynamic Estimation of Group-level Opinion. R package. <https://jdunham.io/dgo/>.
+Dunham, James, Devin Caughey, and Christopher Warshaw. 2017. dgo:
+Dynamic Estimation of Group-level Opinion. R package.
+<https://jdunham.io/dgo/>.
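
Note on the OS X troubleshooting item in this README: the confirmation step it describes can be done with a few lines of base R. This is a sketch, not part of the package. It assumes the variable in question is `TMPDIR`, which R consults at startup according to `?tempdir`, and that it would be set in a startup file such as `~/.Renviron` before restarting R.

```r
# Where this R session writes temporary files. The directory is fixed at
# startup, so changes to the environment need a restart to take effect.
tmp <- tempdir()
cat("tempdir():", tmp, "\n")

# TMPDIR as seen by this session ("<unset>" if it was never set).
# Assumption: a line such as TMPDIR=/some/persistent/dir in ~/.Renviron,
# followed by a restart, is how the path would be changed.
cat("TMPDIR:   ", Sys.getenv("TMPDIR", unset = "<unset>"), "\n")

# Check whether the session is still using a /var/folders location.
in_var_folders <- grepl("/var/folders", tmp, fixed = TRUE)
cat("Under /var/folders:", in_var_folders, "\n")
```

If the last line still prints `TRUE` after a restart, the new setting has not been picked up and a long model run remains exposed to the cleanup problem described above.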

data/toy_dgirt_in.rda

-48 Bytes
Binary file not shown.

data/toy_dgirtfit.rda

-5.36 KB
Binary file not shown.
