Dr. Bradley E. Cox

Lip Service or Actionable Insights? Linking Student Experiences to Institutional Assessment and Data-Driven Decision Making in Higher Education

Cox, B. E., Reason, R. D., Tobolowsky, B. F., Brower, R. L., Patterson, S., Luczyk, S., & Roberts, K. L. (2017). Lip service or actionable insights? Linking student experiences to assessment, accountability, and data-driven decision making in higher education. Journal of Higher Education, 88(6), 835-862. doi: 10.1080/00221546.2016.1272320

ABSTRACT
Despite an increasing focus on issues of accountability, assessment, and data-driven decision making (DDDM) within the postsecondary context, assumptions regarding their value remain largely untested. The current study uses empirical data from 114 senior administrators and 8,847 students at 57 institutions in five states to examine the extent to which institutional assessment and data-driven decision making shape the experiences of first-year students. Nearly all these schools regularly collect some form of assessment data, and more than half report using assessment data to inform decision making. However, the institutional adoption of policies related to the collection of assessment data or the application of data-driven decision making appears to have no relationship with student experiences or outcomes in the first year of college. Thus, findings from the current study are consistent with the small, but growing, body of literature questioning the effectiveness of accountability and assessment policies in higher education.


Full text from a pre-publication draft of the article is pasted below…

Lip Service or Actionable Insights? Linking Student Experiences to Institutional Assessment and Data-Driven Decision Making in Higher Education

Roughly one in four students who begins at a four-year college or university does not return to that institution for a second year (ACT Inc., 2013). This troubling statistic has not changed dramatically over the last 30 years, despite institutions of higher education implementing countless reforms in an effort to increase student success (e.g., learning, persistence, graduation). Among these efforts are hundreds of specific initiatives designed to facilitate student engagement, which has been found to predict student grades and persistence, particularly for underrepresented and underprepared students (Kuh, Cruce, Shoup, Kinzie, & Gonyea, 2008; Kuh, Kinzie, Buckley, Bridges, & Hayek, 2007) – with a strong focus on the critical first year of college (Barefoot et al., 2005; Upcraft & Gardner, 1989; Upcraft, Gardner, Barefoot, & Associates, 2005).

Although various piecemeal initiatives have certainly contributed to improved outcomes at many institutions, they have not appreciably increased national persistence rates. As a result, educational policymakers and administrators have come under growing pressure to address this critical issue. However, this mandate is complicated by economic circumstances that have heightened the need for greater operational efficiency within higher education (Altbach, Berdahl, & Gumport, 2005; Paulsen & Smart, 2001). Responsive to these external pressures, and in an effort to demonstrate their effectiveness, many colleges and universities are now spending hundreds of thousands of dollars annually to cover the costs of assessment (Cooper & Terrell, 2013), most of which continues to focus on student experiences and outcomes (Bresciani, Gardner, & Hickmott, 2009; Ory, 1992; Schuh & Associates, 2009; Schuh & Gansemer-Topf, 2010).

Underlying this approach is the assumption that educational quality is likely to be improved when decision makers develop policies and implement practices informed by relevant assessment data (i.e., “data-driven decision making” or DDDM). However, the assumption that assessment practices and data-driven decision making by institutions of higher education yield improved student experiences and outcomes remains largely untested. Therefore, the purpose of this paper is to articulate distinct perspectives on the use of assessment and DDDM in higher education, document the extent to which these practices have been implemented, and use empirical data from 114 senior administrators and 8,847 students at 57 diverse postsecondary institutions across five states to examine linkages between assessment/DDDM policies and student experiences in the first year of college. Specifically, this study addresses two research questions:

  1. To what extent are institutions of higher education employing assessment and data-driven decision making regarding students’ first year of college?
  2. To what extent does institutional adoption of assessment and data-driven decision making correspond to levels of first-year student engagement and perceived gains?

Background and Context

The terms “assessment” and “data-driven decision making” have been used in many educational contexts and have been defined by scholars in numerous ways, with Suskie (2009) noting “the vocabulary of assessment is not yet standardized” (p. 3). Upcraft, Schuh, Miller, Terenzini, and Whitt  (1996) employ a traditional definition of assessment, stating that “assessment is any effort to gather, analyze, and interpret evidence, which describes institutional, divisional, or agency effectiveness” (p. 18). We initially adopted this traditional definition of assessment and used the term data-driven decision making (DDDM) to reflect the application of such evidence to guide decision making. The names of our policy scales reflect these differentiated definitions. However, the distinction between assessment and DDDM has been blurred by more contemporary authors (e.g., Marsh, Pane, & Hamilton, 2006) who argue that DDDM captures the entire cycle of collecting data, interpreting findings, and using the resulting evidence to inform decision making. Therefore, consistent with these more recent definitions and in an effort to ease readability, we use the terms assessment and DDDM interchangeably throughout the remainder of the manuscript except when referencing a specific policy scale generated for this study. 

Throughout our review of the literature, we were struck by the dearth of empirical evidence supporting the claim that assessment practices contribute to positive student outcomes in higher education. Though we did not find large-scale analyses on the effects of DDDM in higher education, comprehensive reviews have been conducted for K-12 education. For instance, a RAND Corporation review of DDDM in education conceded that “like most of the literature to date on DDDM, these studies are primarily descriptive and do not address the effects of DDDM on student outcomes” (Marsh et al., 2006, p. 1). Similarly, a U.S. Department of Education report on using data to guide instructional decision making found little linkage between the two (Hamilton et al., 2009).

Perhaps because of scant empirical evidence linking assessment and educational outcomes, scholars studying the topic have not yet coalesced around a single theoretical framework. Therefore, we seek guidance from both traditional college effects models (e.g., Astin, 1991; Kuh et al., 2007; Terenzini & Reason, 2005; Pascarella, 1985) and organizational theory’s new institutionalism (e.g., DiMaggio & Powell, 1983; March & Olsen, 1984; Selznick, 1957; Zucker, 1987).

College Effects Models

Many of the traditional college effects models used to frame studies of college student experiences and outcomes implicitly embrace Schön’s (1983) concept of “reflective practice” (p. 1), which suggests that assessment activities are a necessary ingredient for organizational learning and improved operational effectiveness. They also contend that institutional structures, policies, and cultures shape campus environments, which, in turn, shape student experiences and outcomes. Although these assumptions are embedded in decades-old models (e.g., Astin, 1991; Pascarella, 1985), we highlight two relatively new college effects models.

Terenzini and Reason (2005) offer the Parsing the First Year of College model as a framework for studying how college affects students. Although a framework best utilized for multi-institutional studies, the Parsing model grew out of Gardner’s Foundations of Excellence project, an ongoing, campus-based assessment initiative designed to assist campus officials through a yearlong data-driven decision-making process. The Parsing model connects students’ experiences and outcomes to the organizational context, which is influenced by institutional policies through academic and co-curricular programs as well as the faculty culture that is reinforced through policies. According to Terenzini and Reason, as well as the Foundations of Excellence project, data-driven decision making about institutional policies is essential to improving student learning.

Kuh, Kinzie, Buckley, Bridges, and Hayek (2007) offer a somewhat more fluid model focused on student engagement, which they place at the intersection of “student behaviors” and “institutional conditions” (p. 8); they developed the model following a comprehensive review of the literature on college student success. Among the institutional conditions shaping student engagement are the first-year experience, academic support services, and the campus environment. Moreover, Kuh et al. (2007) suggest institutional policies mediate the effects of students’ backgrounds (e.g., high school preparation, family support) on their college experiences. These institutional conditions and policies are themselves shaped by external factors like the economy, state/federal policies, and the push for accountability in higher education. Thus, the Kuh et al. (2007) model exemplifies the manner in which public calls for efficiency and accountability can influence institutional policies and campus conditions that, in turn, affect students’ experiences and outcomes.

Although college effects models imply that assessment policies should contribute to institutional improvement, few studies document empirical connections between assessment and improvement. McCormick, Kinzie, and Korkmaz (2011) provide some evidence that intentional data collection over time may result in improvement on NSSE Benchmark scores, leading those authors to posit that data collected in previous years might be informing better practices; that is, colleges and universities might be engaging in DDDM. Although the study was not able to directly connect the use of NSSE data to institutional policy changes that influenced students’ growth, the findings do suggest that institutional improvement is possible over time. Further, McCormick and colleagues found that the strongest motivation for assessment was leaders’ efforts to create a culture of improvement. These leaders at schools demonstrating improvement in NSSE scores over time reported that governmental mandates and national calls for accountability were not very influential.

Collectively, these college effects models suggest that assessment policies would contribute to an institutional “culture of evidence” (Oburn, 2005, p. 19) in which assessment is integrated into all aspects of university operations. Institutions with such a culture collect assessment data as a matter of habit and routinely apply that data to operational decisions. When they do so, institutions operate in accordance with the principles of reflective practice (Schön, 1983), which suggests that organizational learning can result from assessment practice, but only when knowledge is used to change behavior (Argyris & Schön, 1996; Ebrahim, 2005; Senge, 1990).

Yet even those who draw hope from the optimistic perspective of reflective practice concede that colleges and universities are slow-moving entities in which the collection of assessment data far outpaces their use of results to inform policy or practice. Banta and Blaich (2011) noted that only 6% of institutional examples representing “good practice” in assessment actually had evidence indicating student learning had improved as a result. They summarized, “we scoured current literature, consulted experienced colleagues, and reviewed our own experiences, but we could identify only a handful of examples of the use of assessment findings in stimulating improvements” (p. 22). Nonetheless, Banta and Blaich (2011), as well as Schuh and Associates (2009), hold onto the hope that colleges and universities can use assessment for reflective practice through which institutions can effectively create or maintain environments conducive to student engagement and learning.

Organizational Theory

In contrast, those applying organizational theory’s new institutionalism perspective (March & Olsen, 1984; Zucker, 1987) might worry that coercive isomorphism pushes colleges to adopt assessment policies for primarily symbolic reasons. Such isomorphism occurs as institutions respond to financial, social, and political pressures by adopting policies and practices that have been legitimized by the external environment. Often, these decisions are made not for operational efficiency, but for political advantage or institutional survival (March & Olsen, 1984; Selznick, 1957; Zucker, 1987). Indeed, some theorists (e.g., Scott & Meyer, 1994) “observed that the success of educational organizations depends on the extent to which they conform to established specifications and expectations” (Dey, Milem & Berger, 1997, p. 310).

The resulting policies, processes, and structures are not necessarily reflective of or appropriate for effective implementation of work activities (DiMaggio & Powell, 1983; Meyer & Rowan, 1977). Therefore, coercive isomorphism (DiMaggio & Powell, 1983) can lead institutions to adopt such policies as a symbolic gesture of conformity/compliance while actually operating independently of the structures created to support implementation of those policies. By decoupling these symbolic policies and structures from the day-to-day operations of the organization, administrators can buffer core operations and personnel from external interference. Moreover, because institutions of higher education are “loosely coupled systems” (Weick, 1976, p. 1), where faculty members operate with considerable autonomy and are free from daily monitoring by administrators, individual actors may resist implementation of what they see as merely symbolic policies (Coburn, 2004). Such policies eventually become “rationalized myths” which depict “various formal structures as rational means to the attainment of desirable ends” (Meyer & Rowan, 1977, p. 509) while not actually contributing to the achievement of those ends. Therefore, Weick (1976) warns that institution-level policies may be too far removed from students’ daily lives to have any appreciable effect on their experiences or outcomes.

Applied to the use of assessment policies by colleges and universities, these organizational theories give rise to concerns about “coercive accountability” (Shore & Wright, 2000, p. 57) and “accountability myopia” (Ebrahim, 2005). Ebrahim’s “accountability myopia” thesis warns of the unintended consequences of assessment practices in organizations. For example, assessment may often represent a purely symbolic function in postsecondary institutions (Feldman & March, 1981), used for “legitimating existing activities rather than for identifying problematic areas for improvement” (Ebrahim, 2005, p. 67). Further, outcome measures important to powerful stakeholders are more likely to be assessed than outcome measures important to less powerful groups. For instance, the economic benefits of higher education are widely assessed and subsequently reported by the popular press and business groups, whereas the psychosocial benefits of college, such as prejudice reduction or civic engagement, are less frequently assessed and may only be monitored by academic researchers and community activists. Thus, “accountability is also about power, in that asymmetries in resources become important in influencing who is able to hold whom accountable” (Ebrahim, 2005, p. 60). Shore and Wright (2000) refer to this phenomenon in higher education as “coercive accountability” (p. 57).

The contrasting perspectives offered by college effects models and organizational theory provide us with two distinct lenses through which to review the effectiveness of institutional assessment policies in higher education – one of the most important and contested issues currently being debated among scholars, practitioners, and policymakers.

Method

The current study comes out of the Linking Institutional Policies to Student Success (LIPSS) project that seeks to identify specific institution-wide policies that might be leveraged to increase college student engagement. In this section, we describe the project only as it relates to the current analyses. Complete details about the project (e.g., full survey instruments, list of participating institutions, and summary of related literature) can be found at http://cherti.fsu.edu/LIPSS/index.html.

Sample

In January 2012, project staff contacted administrators at all 108 bachelor’s degree-granting institutions across five states (i.e., California, Florida, Iowa, Texas, and Pennsylvania) that had already signed up to participate in the 2012 administration of the National Survey of Student Engagement – a widely used and validated annual survey about student behaviors related to educationally beneficial activities while in college. This paper draws its data from 8,847 first-year students and 114 senior administrators at the 57 institutions that chose to participate in the LIPSS project (see Table 1). The sample is sizable in number and diverse in composition, inclusive of a wide range of institutional types, sizes, levels of selectivity, and sources of control/funding. Nonetheless, because the sample is not necessarily representative of the nation’s colleges and universities, we urge caution when considering generalizations to other institutions.

INSERT TABLE 1 ABOUT HERE

Institutional Policy Scales

We choose to focus our attention on institution-level policies for three reasons. First, the public consequences of state and/or federal accountability initiatives (e.g., rankings, ratings, or funding) are typically levied upon individual institutions. Second, unlike department- or person-specific activities that typically have a narrowly defined purpose or target relatively few students, policymaking at the institutional level can affect an entire campus in wide-ranging ways. Third, we focus on institution-level policies because they are actionable. Although student outcomes depend heavily on individual student characteristics and choices, institutions establish environments that can facilitate or discourage student engagement in effective educational practice. Moreover, although administrators must act within varied contextual constraints created by state legislatures, governing boards, campus cultures, and financial circumstances, most senior administrators (e.g., Provosts or Vice Presidents for Student Affairs) have considerable leeway to adapt, and the leverage to enforce, policies and practices within their respective divisions/offices.

Institutional policy data come from the Survey of Academic Policies, Programs, and Practices (distributed to each Chief Academic Officer, typically the Provost) and the Survey of Student Affairs Policies, Programs, and Practices (distributed to the Chief Student Affairs Officer, typically the Vice President for Student Affairs or Dean of Students). These surveys asked administrators to report the extent to which their institution has aligned its programs, policies, and practices (e.g., first-year seminars, emphasis on diversity, recent assessment efforts) with the broad conclusions of the currently available research literature on first-year student success (Cox et al., 2012; Kuh et al., 2007; Pascarella & Terenzini, 1991, 2005; Terenzini & Reason, 2005; Upcraft et al., 2005). Together, these surveys’ items form a priori scales reflecting clusters of institutional policies. A scale score of one indicates an institution’s complete adoption of all measured policies related to a particular policy cluster. It may be helpful to think of the scale scores as approximations where a score of 0.5 indicates that an institution has done approximately one half (50%) of what it could be doing to align its policies with the available research.

Analyses reported in this paper make use of five such policy-cluster scales: (1) Assessment of Student Affairs Programs; (2) Recent Assessment Efforts regarding Student Learning; (3) Recent Assessment Efforts regarding Student Persistence; (4) Data-Driven Decision Making in Student Affairs; and (5) Data-Driven Decision Making in Academic Affairs. These scales relate to the systematic, consistent, and/or recent collection and use of assessment data to inform decision making at the institutional level. For example, the “Data-Driven Decision Making – Academic Affairs” scale has a Cronbach’s alpha of 0.90 and is based on four individual items reflecting university efforts to use data for resource allocation, course development/design, and academic department evaluation and planning. Scale reliability is strong for all five scales (Cronbach’s alpha ranging from 0.79 to 0.90), and details about scale composition are presented in Table 2 and the Appendix.
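To make the scale mechanics concrete, the sketch below scores a hypothetical four-item policy cluster (assuming, as described above, that items are coded on a 0-1 adoption scale and averaged) and computes the standard Cronbach's alpha reliability statistic. The item values and institution counts are illustrative only and do not reproduce the study's actual items or data.

```python
import numpy as np

def cronbach_alpha(item_matrix):
    """Standard Cronbach's alpha: k/(k-1) * (1 - sum(item variances) / variance(total))."""
    X = np.asarray(item_matrix, dtype=float)
    k = X.shape[1]
    item_vars = X.var(axis=0, ddof=1)
    total_var = X.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical 0/1 adoption responses from six institutions to a four-item policy cluster
items = np.array([
    [1, 1, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 0],
])

scale_scores = items.mean(axis=1)       # e.g., 0.75 = roughly 75% of measured policies adopted
print(scale_scores)
print(round(cronbach_alpha(items), 2))  # internal-consistency reliability of the cluster
```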

INSERT TABLE 2 ABOUT HERE

Student Experience and Outcome Data

The student data come from the National Survey of Student Engagement (NSSE), which was completed by 8,847 first-year students attending the 57 participating institutions (see Tables 1 and 2). Institutional sample sizes ranged from 19 to 486 students, with a mean of 155 participating students on each campus. All student-level data are weighted to reflect the gender and full-time/part-time distribution of first-year students at each participating institution. Dependent/criterion variables in these analyses are all scales composed of items from the National Survey of Student Engagement. Table 2 provides details regarding the composition and internal consistency of the nine student-level outcome scales used in these analyses: five traditional NSSE benchmarks of effective educational practice in undergraduate education (i.e., Enriching Educational Experiences, Supportive Campus Environments, Level of Academic Challenge, Student-Faculty Interaction, and Active & Collaborative Learning; Cronbach’s alpha ranging from .60 to .79), a measure of “deep” learning experiences (alpha .85), and three self-reported measures of learning and development (i.e., perceived gains in Practical Competence, General Education, and Personal & Social Development; alphas of .83, .84, and .87, respectively). These scales reflect students’ engagement in educationally beneficial activities as well as their perceptions of growth during the first year of college, which have been linked to students’ first-to-second-year persistence even amid a variety of statistical controls (Hu, McCormick, & Gonyea, 2012).
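As a rough illustration of what weighting to an institution's gender and enrollment-status distribution can look like, the sketch below computes simple post-stratification weights for a single institution. NSSE documents its own weighting procedure, which may differ in detail, and all names and values here are hypothetical.

```python
import pandas as pd

def within_institution_weights(sample, population, cells=("gender", "enrollment_status")):
    """Weight respondents so each gender x enrollment cell matches its share of the
    institution's first-year population (illustrative post-stratification only)."""
    cells = list(cells)
    pop_share = population.value_counts(cells, normalize=True)
    samp_share = sample.value_counts(cells, normalize=True)
    weights = (pop_share / samp_share).rename("weight")
    return sample.join(weights, on=cells)

# Hypothetical population file and respondent file for one institution
population = pd.DataFrame({
    "gender": ["F"] * 60 + ["M"] * 40,
    "enrollment_status": ["full-time"] * 90 + ["part-time"] * 10,
})
respondents = pd.DataFrame({
    "gender": ["F", "F", "M", "M", "F", "M"],
    "enrollment_status": ["full-time"] * 5 + ["part-time"],
})
print(within_institution_weights(respondents, population))
```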

To isolate the effects of institution-level policies, we included several control variables at the student level. Each model includes variables representing students’ gender/sex, race/ethnicity, age, major, first-generation status, transfer status, full/part-time enrollment, involvement with varsity athletics, on-campus residence, and their ACT Composite score. These student-level covariates were drawn from available NSSE data and are included as control variables because they are explicitly mentioned by Terenzini and Reason (2005) or Kuh et al. (2006), or have been shown by previous analyses of similar data to affect students’ NSSE results (e.g., Gayles, 2015; Umbach, Palmer, Kuh, & Hannah, 2006). Following Raudenbush and Bryk’s (2002) guidelines, we centered all level-1 (student) control variables around their grand means. We also entered several uncentered level-2 control variables reflecting institutional characteristics that are static or not easily manipulated by colleges or universities: public/private, highest degree offered, undergraduate enrollment, racial distribution of student body, Pell grant eligibility of the student body, and admissions selectivity as rated by the Barron’s guide. Primary models include all control variables, one independent variable of interest (i.e., policy scale), and one outcome variable (i.e., student experience and outcome scale).
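Grand-mean centering, as used here for the level-1 controls, simply subtracts each variable's overall sample mean so that model intercepts describe an average student. A minimal sketch with hypothetical column names follows.

```python
import pandas as pd

# Hypothetical student-level records; column names are illustrative only.
students = pd.DataFrame({
    "act_composite": [24, 29, 21, 31, 26],
    "age": [18, 19, 18, 20, 19],
})

# Grand-mean centering: subtract each control variable's overall (grand) mean,
# so the multilevel model's intercept refers to a student at the sample average.
for col in ["act_composite", "age"]:
    students[col + "_gmc"] = students[col] - students[col].mean()

print(students)
```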

Missing Data Augmentation

Like most quantitative datasets in higher education, ours included missing data. But unlike most analyses in education (Cox, McIntosh, Reason, & Terenzini, 2014; Peugh & Enders, 2004), we employed an advanced statistical procedure to maintain the integrity of the data without artificially reducing the sample size or underestimating standard errors. Specifically, we followed the suggestions of Cox et al. (2014) and used multiple imputation to produce 10 distinct datasets, each created after 100 iterations of an imputation model that included all of the variables used in the eventual analytic model, auxiliary student-level variables, institutional dummy codes, and several interaction terms not subsequently used for analyses. The imputation model was therefore more complex than the subsequent analytic model (Allison, 2002; Cox et al., 2014; Graham, 2009; Rubin, 1987; Schafer, 1997). Analyses for this study were conducted using the SPSS v. 22 software package, which uses algorithms derived from Rubin (1987) and Schafer (1997) to pool results across all 10 datasets.
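SPSS performs the pooling step internally; for readers who want to see what the Rubin (1987) combining rules actually do to a coefficient estimated in each of the 10 imputed datasets, here is a minimal sketch with hypothetical values.

```python
import numpy as np

def pool_rubin(estimates, variances):
    """Combine one coefficient across m imputed datasets using Rubin's (1987) rules."""
    q = np.asarray(estimates, dtype=float)   # per-imputation point estimates
    u = np.asarray(variances, dtype=float)   # per-imputation squared standard errors
    m = len(q)
    q_bar = q.mean()                         # pooled point estimate
    u_bar = u.mean()                         # within-imputation variance
    b = q.var(ddof=1)                        # between-imputation variance
    t = u_bar + (1 + 1 / m) * b              # total variance
    return q_bar, np.sqrt(t)                 # pooled estimate and its standard error

# Hypothetical coefficient estimates and variances from 10 imputations
est, se = pool_rubin(
    [0.04, 0.06, 0.05, 0.03, 0.05, 0.07, 0.04, 0.05, 0.06, 0.05],
    [0.0004] * 10,
)
print(round(est, 3), round(se, 3))
```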

Analytic Approach

After producing descriptive statistics (see Table 1), we continued analyses by building a series of random-intercept multi-level models (Raudenbush & Bryk, 2002) using an unrestricted variance/covariance matrix for each of the dependent/criterion variables. For each of the nine student outcomes, we first ran an empty/null model, which allowed us to calculate an intra-class correlation (ICC) reflecting the extent to which variability in the outcome measure is attributable to differences between institutions. The second model included all level-1 (student) control variables. In the third model, we added the level-2 (institution) control variables. In our fourth and final set of models, for each of the nine outcome variables, we independently added each of the policy scales to our previous models.

Constructing the final analytic models in this manner allowed us to isolate the net effects of each policy scale and interpret resulting model coefficients as indicative of the manner in which increasing levels of policy adoption are associated with student experiences and outcomes for the average student in our sample. Moreover, Raudenbush and Bryk (2002) imply that such a construction makes the coefficients for our level-2 policy variables of interest robust to potential misspecification of the level-1 (student) model. Finally, because our independent variables of interest occur at the institutional level (level-2), where we have 57 institutions, we use the more liberal critical p-value of 0.10 instead of the more traditional 0.05.
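The models themselves were estimated in SPSS. As an illustration of the first step described above, fitting an empty/null random-intercept model and computing the intra-class correlation, a rough Python equivalent using statsmodels might look like the following, with simulated data standing in for the actual NSSE scales.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Simulated stand-in data: 57 institutions, ~150 students each, with modest
# between-institution variation in an outcome scale (values are arbitrary).
n_inst, n_per = 57, 150
inst = np.repeat(np.arange(n_inst), n_per)
inst_effect = rng.normal(0, 3, n_inst)[inst]
outcome = 50 + inst_effect + rng.normal(0, 12, n_inst * n_per)
df = pd.DataFrame({"institution": inst, "outcome": outcome})

# Empty/null random-intercept model: outcome ~ 1 with a random intercept per institution.
null_model = smf.mixedlm("outcome ~ 1", df, groups=df["institution"]).fit()

# Intra-class correlation: share of outcome variance lying between institutions.
tau00 = null_model.cov_re.iloc[0, 0]   # between-institution (intercept) variance
sigma2 = null_model.scale              # within-institution (residual) variance
icc = tau00 / (tau00 + sigma2)
print(f"ICC = {icc:.3f}")
```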

Power, Robustness, and Interpretation

We have taken several precautions to avoid threats related to both type-1 and type-2 errors of interpretation. At no point do we base our interpretation on a single coefficient’s statistical significance or an arbitrarily chosen cut point for reduction in level-2 residual variance. We also acknowledge that none of the results would independently remain statistically significant were we to apply individual Bonferroni post-hoc adjustments for critical p-values. Rather, we look for patterns of results that are suggestive of underlying policy effects. Together with the use of multiple imputation and multi-level modeling, these interpretive cautions specifically protect against type-1 errors (i.e., incorrectly claiming a policy effect).

Moreover, we protect against type-2 errors (i.e., incorrectly dismissing a true policy effect) in two ways. First, we conducted several versions of power analyses using the Optimal Design software (version 3.01) specific to multi-level models (see Figure 1). An initial model that calculated the minimum detectable effect size for a policy variable, assuming it was the only institutional (level-2) variable in the analytic model, revealed that our analyses can, with 90% power, detect effect sizes of .13 to .23 (the precise minimum detectable effect size varies with the intra-class correlation of each dependent variable). Recognizing that the inclusion of level-2 covariates would affect our ability to detect policy effects, and consistent with results from exploratory models using this same dataset, we also ran power analyses in which other level-2 covariates (e.g., size, public/private) explain 17-42% of the intra-class correlations. These power analyses confirm that our final analytic models, those which include all of the level-2 covariates, could detect level-2 effect sizes of .10 to .21.

INSERT FIGURE 1 ABOUT HERE

Second, after seeing the results from our complete models (with level-2 covariates included), we ran an additional set of models in which each single policy variable was the only level-2 variable included. These models effectively maximized the possibility of finding a statistically significant policy effect, even if that effect could be accounted for by other level-2 covariates. The results from these analyses are presented as supplemental findings below.

Limitations

The current study uses advanced statistical methods to analyze a large and unique dataset. Nonetheless, the results of these analyses must be considered in light of a few limitations. Conceptually, this study focuses exclusively on the first year of college and, therefore, makes use of relatively narrow definitions of assessment, data-driven decision making, and student success. The study also relies on self-reported data from both the institutions and the individual students, meaning there is potential that both administrators and students are biased to report that they are doing what they perceive to be the right things. However, the considerable variability of scale scores derived from both institutional and student data may placate concerns about restricted range or social desirability bias among respondents.

Although self-reported data in general, and NSSE specifically, have received some recent criticism (Bowman, 2011; Campbell & Cabrera, 2011; Porter, 2011), scholarly debate on the topic is far from conclusive, with several scholars (e.g., Ewell, McClenney, & McCormick, 2011; Kuh, Hu, & Vesper, 2000; Pike, 2011; 2013) recognizing the appropriate use of such data in a variety of circumstances. Indeed, similar data are used in roughly one-half of the articles published by top-tier journals in the field (Pike, 2011), and even critics like Bowman and Seifert (2011) note that student engagement “promote[s] a variety of positive outcomes, even for students who do not perceive such an effect” (p. 285). Much of the criticism has been directed toward students’ self-assessment of cognitive development or self-reported learning gains, which some of the harshest critics (e.g., Bowman & Herzog, 2011) note may be appropriate for use in large, multi-institution studies when the resulting scales are interpreted not as analogous to longitudinal improvements on criterion-referenced standardized tests but as students’ broad perceptions of their own growth while in college (Gonyea & Miller, 2011). Moreover, Hu et al. (2012) conclude that self-reported gains “retain their positive relation to persistence even after controls are applied” (p. 393-394). We deem the use of NSSE scales appropriate in this paper because a) we draw our data from 57 diverse institutions; b) the random intercept multi-level model, as we use it, addresses variations in the conditional mean scale score for whole institutions, not individual students; and c) we consider student engagement, participation in activities that contribute to deep learning, and perceived development to be proximal indicators of educationally beneficial experiences during the first year of college.

With respect to specific NSSE scales, we acknowledge that the Level of Academic Challenge (LAC) scale and the Deep Learning (DL) scale share some individual items and are, thus, not completely independent of each other. In addition, we note that a few of the individual items in the Enriching Educational Experiences (EEE) scale may not pertain to first-year students (e.g., study abroad, culminating senior experience). However, we retain the EEE scale in its original form for three reasons. First, the majority of items in the scale capture experiences that are both relevant to first-year students and critical to college outcomes (e.g., encounters with diversity, co-curricular activities). Second, by calculating the EEE scale using the syntax provided by NSSE, we maintain consistency with calculations done for participating institutions. Third, almost all published reports using 2002-2012 NSSE data have reported results from all five NSSE benchmarks (EEE, SCE, LAC, SFI, and ACL) side-by-side.

Other limitations of our study relate to the size of our sample and our analytic modeling procedures. With cross-sectional data from 57 institutions, our use of random-intercept HLM without an explicit structural or causal model may be obfuscating more subtle or conditional relationships between institutional policies and student experiences or outcomes. Moreover, although some of the policy scales address institutional assessment efforts taken specifically during the previous three years, other scales reflect the use of assessment data without reference to a specific time frame, thus limiting our ability to make causal claims about the longitudinal effects of such policies. Nonetheless, we have taken several steps to protect against both type-1 and type-2 errors (see Power, Robustness, and Interpretation section), and we make no claims of causal policy effects. In fact, our results suggest that these assessment and DDDM policies are not even correlated with student experiences or outcomes, let alone causally related to them.

Findings

Adoption of Assessment and Data-Driven Decision-Making (DDDM) Policies

Descriptive statistics from our study (see Tables 2 and 3) suggest that institutions are starting to embrace recent calls to increase the use of assessment and DDDM. Mean scale scores indicate that institutions in this study adopted roughly 49–69% of measured policies related to assessment and DDDM.

INSERT TABLE 3 ABOUT HERE

Deeper examination of Table 3 reveals meaningful variation across institutions. For example, although two schools in our study have adopted 100% of the student affairs assessment policies we measured, two other schools have adopted none of those policies. Table 3 documents similar variability for each of the five assessment and DDDM scales. However, we did not detect any patterns regarding which institutions were adopting which policies. T-tests and ANOVA comparisons reveal no statistically significant differences in rates of policy adoption across institutional control (public/private) or across states.

Links between Assessment and DDDM Policies and Student Experiences and Outcomes

Counter to researcher expectations, our hierarchical linear models (presented in Table 4) offer no evidence to suggest that assessment and DDDM policies are related to student engagement or self-reported gains during the first year of college. Although we analyzed a total of 45 models (5 policy scales × 9 student outcomes) and used a liberal critical value for statistical significance (0.10), in none of the analyses did any of the policy scales’ coefficients indicate a statistically significant relationship with any measure of student experiences or outcomes.

INSERT TABLE 4 ABOUT HERE

Our findings of non-significance defy even simple random chance. With a critical p-value of 0.10, researchers would expect to make a type-1 error (i.e., finding statistical significance and incorrectly claiming a policy effect) in approximately 10 percent of analyses. Applied to the current study, we would expect our 45 analyses to produce 4.5 statistically significant policy coefficients simply by chance. That our analyses did not yield even a single statistically significant policy coefficient suggests that our findings of non-significance are interpretable as reflecting the overall lack of connection between students’ experiences during the first year of college and their institutions’ policies on assessment and DDDM.
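As a back-of-the-envelope illustration (our own arithmetic, not part of the original analysis): if the 45 tests were independent, which they are not strictly, since they share data and correlated outcomes, the probability of observing zero rejections at the 0.10 level would be about 0.90^45, or roughly 0.009.

```python
# Back-of-the-envelope check (illustrative only; the 45 tests are not truly independent):
alpha, n_tests = 0.10, 45
expected_by_chance = alpha * n_tests          # 4.5 significant coefficients expected
p_zero_significant = (1 - alpha) ** n_tests   # ~0.009 if the tests were independent
print(expected_by_chance, round(p_zero_significant, 4))
```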

Supplemental Analyses

Surprised by these findings, we conducted three types of supplemental analyses to assess the robustness of findings from our primary analyses. Our first approach to assessing the robustness of our findings (or lack thereof) addressed Weick’s (1976) concern that institutional policies might be too far removed from students’ everyday experiences to have any detectable influence on our measures of student engagement and outcomes. To address this possibility, we generated 17 additional policy scales based on other questions from the same data sources (i.e., the Survey of Academic Policies, Programs, and Practices and the Survey of Student Affairs Policies, Programs, and Practices). These supplemental scales reflect a range of similarly distal institution-level policy clusters that are unrelated to assessment or DDDM. Additional scales derived from the student affairs survey measured, for example, institutions’ information dissemination practices, staffing policies, and administrative coordination efforts. Scales from the academic affairs survey related to, for example, the diversity of curricular offerings and policies about faculty involvement in first-year courses and events. Cronbach’s alpha for these scales ranged from 0.65 to 0.89. We then replicated the analyses presented in this paper by rerunning the final hierarchical linear models with each of these 17 policy scales in place of the 5 original assessment and DDDM scales. The subsequent analyses yielded 34 instances of statistical significance (32 positive and 2 negative coefficients), an average of 2.0 statistically significant coefficients per policy scale – whereas the 45 analyses with assessment and DDDM policy scales did not yield a single statistically significant coefficient.

For our second robustness check, we removed all of the institution-level covariates (e.g., size, public/private) from our models and reran each of our 45 original analyses with the assessment and DDDM scales as the only level-2 variables in the models. In doing so, we protected against the possibility that coefficients for these scales were unable to reach statistical significance because of multicollinearity among level-2 covariates. By removing all of the institution-level covariates, we maximized the opportunity for the policy scales to display statistically significant relationships with the outcome variables, even if those relationships were, in reality, attributable to other not-so-malleable factors like institutional size or selectivity. Yet, even with these more liberal models, the assessment and DDDM scales yielded statistically significant coefficients in only 2 of the 45 analyses – and 1 of those coefficients was negative. Thus, even when we artificially maximized the opportunity for assessment and DDDM policy scales to generate even spurious findings of statistical significance, the results seem to indicate that these policies have little to no positive connection with student experiences and outcomes.

Finally, replicating the no-covariate models using the 17 supplemental policy scales yielded 43 statistically significant coefficients, only 1 of which was negative. Thus, these supplemental policy scales yielded roughly 2.5 statistically significant (almost exclusively positive) coefficients per policy scale – quite a contrast with the 0.4 statistically significant (and half negative) coefficients for the assessment and DDDM policy scales. The consistent pattern of results from all three sets of supplemental analyses thus further substantiates our contention that the institution-level assessment and DDDM policies in this study have no meaningful empirical connection with student experiences and outcomes in the first year of college.

Discussion

We now understand how the contrasting perspectives of reflective practice and accountability myopia drive both the desperate hope and intense fear often expressed by different stakeholders when discussing accountability, assessment, and data-driven decision making in higher education. It is the hope that “reflective practice” (Schön, 1983, p. 1) leads to educational improvements that undergirds the push for institutional accountability and widespread assessment use throughout higher education (e.g., Banta & Blaich, 2011; Schuh & Associates, 2009; see also the Spellings Commission report during the Bush administration, and Obama’s efforts to launch a College Scorecard and link financial aid to college value metrics). Yet the fear of “accountability myopia” (Ebrahim, 2005, p. 56) unsettles many faculty members and fuels resistance to more intense government involvement in postsecondary accreditation (Zemsky, 2011) or the adoption of No-Child-Left-Behind-like mandates within higher education (Flaherty, 2013). That our results yield no evidence of linkages between assessment/DDDM and student experiences/outcomes seems more likely to validate than to placate any lingering fears of “accountability myopia” (Ebrahim, 2005, p. 56) or “coercive accountability” (Shore & Wright, 2000, p. 57) in higher education.

Although stakeholders may view the topic from different perspectives, our findings suggest that college and university leaders have started to get the message about implementing assessment practices in higher education. Nearly all of the 57 schools in this study regularly collect some form of assessment data; more than half of those same institutions report using these data to inform decision making about personnel, courses, programs, and/or resource allocation. Nonetheless, because our sample included only schools that were paying to complete the National Survey of Student Engagement (NSSE), we would have anticipated higher rates of data-driven decision making among participating schools. Moreover, counter to researcher expectations and inconsistent with findings from McCormick et al. (2011), the institutional adoption of policies related to the collection of assessment data or the application of DDDM appears to have no relationship with student experiences or outcomes in the first year of college. Thus, it appears our results call into question the efficacy of institutions’ internal efforts at assessment. We offer two possible explanations for our (lack of) correlational findings.

Explanation 1: Improper Implementation of Data-Driven Decision Making

One potential explanation mirrors a widely accepted and frequently repeated sentiment that DDDM is theoretically good, but often poorly implemented; perhaps we found no connection between DDDM and student outcomes because the people charged with collecting and/or using such data are not effectively doing DDDM. Essentially, this “problem of practice” explanation argues that a) policies are not effectively implemented, and/or b) institutional decision makers are not properly interpreting or applying results from the assessment data being collected. Both possibilities merit serious consideration.

According to Mintzberg (1979), policy implementation within professional bureaucracies like institutions of higher education requires complete buy-in and cooperation from individuals at all levels in the organization. Because implementation of assessment policies necessitates active participation from students, faculty members, staff, and administrators at many levels within the bureaucracy, any of these individuals can undermine DDDM policies.

Although these policies are subject to deliberate subversion (e.g., a faculty member manipulating the timing of end-of-course evaluations or a student ignoring instructions and simply marking answer “A” for all questions), those adopting a more hopeful perspective would posit that DDDM policies are more frequently undermined by inadvertent or seemingly unavoidable limitations. For example, Bresciani, Gardner, and Hickmott (2009) described three common challenges affecting the use of assessment by faculty members and administrators: “(a) lack of time, (b) lack of resources, and (c) lack of understanding of assessment” (p. 136). Among student affairs professionals, three additional barriers were identified: (a) lack of understanding of the student development theories that undergird their practice; (b) lack of collaboration across departments, divisions, and/or with faculty members; and (c) disconnect between the services student affairs professionals can provide and the resources actually needed for the outcome to be realized (Bresciani, et al., 2009).

Of course, effective assessment also requires good data. However, obtaining good data may be an inherently challenging task. Often overwhelmed by the frequency and volume of survey requests, students may provide low-quality responses (Chen, 2011), or simply ignore requests for information (Adams & Umbach, 2011; Sax, Gilmartin, & Bryant, 2003). Thus, it seems assessment may be its own worst enemy. As institutions respond to increasing demands for accountability by collecting increasing volumes of assessment data, they simultaneously undermine the quality of the data collected, thereby inhibiting their ability to draw valid and reliable conclusions from the processes intended to facilitate effective data-driven decision making.

Even with good assessment data and complete buy-in, efforts to use that data to make informed decisions will have little impact on student success if the data are misused or misinterpreted by those making decisions. To effectively make informed decisions, institutional decision makers need easy access to timely and relevant data presented in a clear and simple format. There is at least some evidence to suggest that technological improvements are making such access possible; one recent study indicates that institutional research offices on many campuses now manage data “dashboards” that improve ease of access to assessment data (Association for Institutional Research, 2013). Still, getting the right data to the right person at the right time and in the right format remains difficult when increasing demands for accountability outpace institutional capacity to collect, analyze, interpret, report, and act on assessment data (Blaich & Wise, 2011).

Although there are surely inefficiencies in assessment policy implementation and limitations to institutions’ ability to leverage associated data to inform practice, our findings can also be viewed as evidence challenging and/or supporting the normative perspectives that frame this study.

Explanation 2: Accountability Myopia or Coercive Accountability

Results from the current study run counter to expectations implicit in many popular college effects models (e.g., Astin, 1991; Terenzini & Reason, 2005; Pascarella, 1985). Instead, the results align with expectations derived from organizational theories associated with new institutionalism (e.g., DiMaggio & Powell, 1983; March & Olsen, 1984; Meyer & Rowan, 1977; Zucker, 1987).

New institutionalism suggests that organizations must be responsive to the pressures exerted by their environments (Scott & Meyer, 1994). With the majority of the studied colleges and universities reporting the adoption of assessment-related policies, it appears institutional leaders have, indeed, responded to growing external pressure to demonstrate operational efficiency and institutional effectiveness. Moreover, we found no statistically significant differences in policy adoption rates between public and private schools, nor between schools from different states. These results suggest that the environmental pressure to adopt DDDM policies is pervasive, extending across state borders and transcending the public/private divide.

Such consistent legitimization of accountability measures seems likely to push institutional leaders, like the senior administrators we surveyed for this study, to validate the isomorphic adoption of assessment policies across the country and at all types of postsecondary institutions (DiMaggio & Powell, 1983). As more institutions have implemented assessment policies, the pressure for hold-outs and nay-sayers to “conform to established specifications and expectations” (Dey, Milem & Berger, 1997, p. 310) has likewise increased. Even skeptical institutional leaders likely perceive that they are now expected (by both policymakers and their peer institutions) to engage in assessment on their campuses. Reluctant administrators might now be adopting assessment policies because a failure to do so could be viewed by others as an indication of institutional negligence which, if exposed, could threaten the organization’s survival or reputational well-being (Meyer & Rowan, 1977).

The result of such coercive and mimetic isomorphism (DiMaggio & Powell, 1983) is widespread adoption of policies that, while appearing consistent with the accountability imperative, serve more as symbolic gestures implemented to maintain status than as mechanisms for long-lasting change (March & Olsen, 1984; Zucker, 1987). Of course, some institutional leaders have rational expectations that more information can lead an organization to make better decisions. But even then, Zucker (1987) notes that policies imposed upon an organization by external forces (as occurs, for example, when funding becomes contingent upon assessment data to justify expenditures for specific programs or services) can end up serving primarily symbolic functions even if they have support from senior administrators.

Implications

Collectively, our findings support conclusions that have implications for theory, policy, and practice. Our data reveal that most institutions have adopted some degree of DDDM as part of their regular institutional operations. This finding is consistent with Kuh et al.’s (2007) argument that external environmental pressures can shape institutional policies. However, our findings are somewhat inconsistent with their expectation that such institutional policies mediate relationships between students’ incoming characteristics and their engagement in educationally effective practices. Likewise, our results cannot validate Terenzini and Reason’s (2005) expectation that institutional policies contribute to distinct institutional environments that have unique effects on students’ experiences and outcomes.

Although there is considerable variability in the rates of assessment/DDDM policy adoption among participating institutions, suggesting that some schools have distinguished themselves from peer institutions by embracing policies consistent with the accountability movement, there is no evidence that such policy differentiation has led to commensurate differentiation in terms of first-year student experiences and outcomes. These results are consistent with the notion from organizational theory’s new institutionalism that although institutional policies have the potential to influence student experiences and outcomes, they are unlikely to do so when the policies are adopted as symbolic responses to external pressures – especially when the policies do not have faculty buy-in, diminish institutional autonomy, or commodify postsecondary institutions through coercive or mimetic isomorphism.

Nonetheless, though our findings give us some reason to challenge the assumption that more assessment will improve student outcomes, we acknowledge that pressures for accountability are not likely to dissipate any time soon. Recognizing that many institutional leaders feel that they must (or at least should) embrace DDDM on their campuses, we encourage these leaders to proactively mitigate the potential problems of practice that might otherwise undermine even well-intentioned efforts to leverage assessment to improve student outcomes. One avenue for addressing the problems of practice is to secure faculty (and student) buy-in for campus-wide assessment initiatives by actively involving them throughout the process. Doing so would involve a bottom-up approach to DDDM that minimizes coercive accountability by incorporating the perspectives of faculty, students, and other campus stakeholders in decisions about what gets assessed and how the results get used. Such a collaborative approach would blur the distinction between those who “assess” and those who “are assessed.”

Conclusion

We began this study believing, as do many who work within or around higher education, that

  1. Assessment data provide evidence of effectiveness and highlight areas needing improvement, so institutions ought to collect lots of such data.
  2. Decision making is improved when informed by relevant data, so administrators ought to systematically use assessment data to shape institutional policies and practices.
  3. Accountability measures provide incentives for institutions to document and improve student experiences and outcomes, so educational policy leaders ought to allocate resources in ways that reward institutions whose assessment data demonstrate student success.

But upon completion of this study, we have begun to question the validity of these assumptions. Indeed, findings from the current study are consistent with the small, but growing, body of literature questioning the effectiveness of the accountability imperative in higher education. At the state level, Tandberg and Hillman (2013) found that, “while performance funding may have brought forth other outcomes … (e.g., greater accountability and oversight), it [performance funding] has generally not achieved the most basic goal all states believe is central to their performance efforts – improving degree productivity” (p. 7). At the institution level, Banta and Blaich (2011) have noted difficulty in finding evidence that assessment efforts have led to improved student learning. In both cases, the authors hold some hope that, with sufficient tweaking, state accountability and institutional assessment efforts can lead to improved student outcomes.

Although we admire the faith of other authors who conduct research on related topics (e.g., Banta & Blaich, 2011; Tandberg & Hillman, 2013), we tend to lean toward a more cynical conclusion. For, too often, we have observed the arguments supporting or challenging this movement toward accountability reduced to polarized and politicized ideological beliefs about the purpose of higher education and the mechanisms through which the academy can maintain (or develop, depending on your point of view) educational excellence and operational efficiency. For as Ebrahim (2009) notes, “framings of accountability problems and their solutions tend to be driven by normative agendas rather than by empirical realities” (p. 890). Therefore, we fear that accountability, assessment, and data-driven decision making will deteriorate into mere buzzwords – sound bites used by rival stakeholders to point accusatory fingers or defend pre-determined perspectives. And when it comes to the deeply entrenched problems affecting college student experiences and outcomes, real solutions are sure to require far more than just lip-service.


References

ACT Inc. (2013). 2013 retention/completion summary tables. Retrieved from http://www.act.org/research/policymakers/pdf/13retain_trends.pdf

Adams, M. J. D., & Umbach, P. D. (2011). Nonresponse and online student evaluations of teaching: Understanding the influence of salience, fatigue, and academic environments. Research in Higher Education, 53(5), 576-591. doi: 10.1007/s11162-011-9240-5

Allison, P. D. (2002). Missing data. Thousand Oaks, CA: Sage.

Altbach, P. G., Berdahl, R. O., & Gumport, P. J. (Eds.). (2005). American higher education in the twenty-first century: Social, political, and economic challenges (2nd ed.). Baltimore, MD: Johns Hopkins University Press.

Argyris, C., & Schön, D.A. (1996). Organizational learning II: Theory, method, and practice. Reading, MA: Addison-Wesley.

Association for Institutional Research. (2013). Over 80% of IR dashboards not publically accessible. eAIR. Tallahassee, FL. Retrieved from http://admin.airweb.org/eAIR/specialfeatures/Pages/DashboardsNotAccessible.aspx

Astin, A.W. (1991). Assessment for excellence: The philosophy and practice of assessment and evaluation in higher education. Phoenix, AZ: Oryx Press.

Banta, T. W., & Blaich, C. (2011). Closing the assessment loop. Change: The Magazine of Higher Learning, 43(1), 22-27. doi: 10.1080/00091383.2011.538642

Barefoot, B. O., Gardner, J. N., Cutright, M., Morris, L.V., Schroeder, C. C., Schwartz, S. W., & Swing, R. L. (2005). Achieving and sustaining institutional excellence for the first year of college. San Francisco, CA: Jossey-Bass.

Blaich, C. F., & Wise, K. (2011). From gathering to using assessment results: Lessons from the Wabash National Study (NILOA Occasional Paper 8). Retrieved from http://www.learningoutcomeassessment.org/documents/Wabash_001.pdf

Bresciani, M. J., Gardner, M. M., & Hickmott, J. (2009).  Demonstrating student success: A practical guide to outcomes-based assessment of learning and development in student affairs. Sterling, VA: Stylus.

Bowman, N. (2011). Examining systematic errors in predictors of college student self-reported gains. New Directions for Institutional Research, 150, 7-19.

Bowman, N. A., & Herzog, S. (2011). Reconciling (seemingly) discrepant findings: Implications for practice and future research. New Directions for Institutional Research, 2011(150), 113-120. doi:10.1002/ir.393

Bowman, N. A., & Seifert, T. A. (2011). Can college students accurately assess what affects their learning and development? Journal of College Student Development, 52(3), 270-290. doi:10.1353/csd.2011.0042

Campbell, C., & Cabrera, A. (2011). How sound is NSSE? Investigating the psychometric properties of NSSE at a public, research-extensive institution. The Review of Higher Education, 35(1), 77-103.

Chen, P.-S.D. (2011). Finding quality responses: The problem of low-quality survey responses and its impact on accountability measures. Research in Higher Education, 52(7), 659-674. doi: 10.1007/s11162-011-9217-4

Coburn, C. E. (2004). Beyond decoupling: Rethinking the relationship between the institutional environment and the classroom. Sociology of Education, 77(3), 211-244.

Cooper, T., & Terrell, T. (2013). What are institutions spending on assessment? Is it worth the cost? (NILOA Occasional Paper 18). Retrieved from http://learningoutcomesassessment.org/documents/What%20are%20institutions%20spending%20on%20assessment%20Final.pdf

Cox, B. E., McIntosh, K. L., Reason, R. D., & Terenzini, P. T. (2014). Working with missing data in higher education research: A primer and real-world example. Review of Higher Education, 37(3), 377-402. doi: 10.1353/rhe.2014.0026.

Cox, B. E., Reason, R. D., Tobolowsky, B. F., Underwood, R. B., Luczyk, S., Nix, S., Dean, J. & Wetherell, T. K. (2012). Linking institutional policies to student success: Initial results from a five-state pilot study. Tallahassee, FL: Florida State University’s Center for Higher Education Research, Teaching, and Innovation.

Dey, E. L., Milem, J. F., & Berger, J. B. (1997). Changing patterns of publication productivity: Accumulative advantage or institutional isomorphism? Sociology of Education, 70(4), 308-323.

DiMaggio, P. J., & Powell, W. W. (1983). The iron cage revisited: Institutional isomorphism and collective rationality in organizational fields. American Sociological Review, 48(2), 147-160.

Ebrahim, A. (2005). Accountability myopia: Losing sight of organizational learning. Nonprofit and Voluntary Sector Quarterly, 34(1), 56-87.

Ebrahim, A. (2009). Placing the normative logics of accountability in “thick” perspective. American Behavioral Scientist, 52(6), 885-904.

Ewell, P. T., McClenney, K., & McCormick, A. C. (2011). Measuring engagement. Inside Higher Ed. Retrieved from https://www.insidehighered.com/views/2011/09/20/measuring-engagement

Feldman, M. S., & March, J. G. (1981). Information in organizations as signal and symbol. Administrative Science Quarterly, 26(2), 171-186.

Flaherty, C. (2013). Disappointed, not surprised. Inside Higher Ed. Retrieved from http://www.insidehighered.com/news/2013/08/23/faculty-advocates-react-obamas-plan-higher-ed

Gayles, J. G. (2015). Engaging student athletes. In S. J. Quaye & S. R. Harper (Eds.), Student engagement in higher education: Theoretical perspectives and practical approaches for diverse populations (2nd ed., pp. 209-220). New York, NY: Routledge.

Gonyea, R., & Miller, A. (2011). Clearing the AIR about the use of self-reported gains in institutional research. New Directions for Institutional Research, 150, 99-111.

Graham, J. W. (2009). Missing data analysis: Making it work in the real world. Annual Review of Psychology, 60(1), 549-576. doi:10.1146/annurev.psych.58.110405.085530

Hamilton, L., Halverson, R., Jackson, S., Mandinach, E., Supovitz, J., & Wayman, J. (2009). Using student achievement data to support instructional decision making (NCEE 2009-4067). Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education. Retrieved from http://ies.ed.gov/ncee/wwc/publications/practiceguides/

Hu, S., McCormick, A. C., & Gonyea, R. M. (2012). Examining the relationship between student learning and persistence. Innovative Higher Education, 37(5), 387-395. doi:10.1007/s10755-011-9209-5

Kuh, G.D., Cruce, T.M., Shoup, R., Kinzie, J., & Gonyea, R.M. (2008). Unmasking the effects of student engagement on first-year college grades and persistence. Journal of Higher Education, 79(5), 540-563.

Kuh, G., Hu, S., & Vesper, N. (2000). They shall be known by what they do:  An activities-based typology of college students. Journal of College Student Development, 41, 228-244.

Kuh, G.D., Kinzie, J., Buckley, J.A., Bridges, B.K., & Hayek, J.C. (2007). Piecing together the student success puzzle: Research, propositions, and recommendations.  ASHE Higher Education Report, 32(5), 1-182.

March, J. G., & Olsen, J. P. (1983). The new institutionalism: Organizational factors in political life. American Political Science Review, 78(3), 734-749.

Marsh, J. A., Pane, J. F., & Hamilton, L. S. (2006). Making sense of data-driven decision making in education. RAND. Retrieved from http://www.rand.org/content/dam/rand/pubs/occasional_papers/2006/RAND_OP170.pdf.

McCormick, A. C., Kinzie, J., & Korkmaz, A. (2011). Understanding evidence-based improvement in higher education: The case of student engagement. Paper presented at the Annual Meeting of the American Educational Research Association. New Orleans, LA.

Meyer, J. W., & Rowan, B. (1977). Institutionalized organizations: Formal structure as myth and ceremony. American Journal of Sociology, 83(2), 340-363.

Mintzberg, H. (1979). The structuring of organizations: A synthesis of the research. Englewood Cliffs, N.J.: Prentice-Hall.

Oburn, M. (2005). Building a culture of evidence in student affairs. New Directions for Community Colleges, 2005(131), 19-32.

Ory, J.C. (1992). Meta-assessment: Evaluating assessment activities. Research in Higher Education, 33(4), 467-481.

Pascarella, E. T. (1985). Students’ affective development within the college environment. The Journal of Higher Education, 56(6), 640-663.

Pascarella, E. T., & Terenzini, P. T. (1991). How college affects students: Findings and insights from twenty years of research. San Francisco, CA: Jossey-Bass.

Pascarella, E. T., & Terenzini, P. T. (2005). How college affects students: A third decade of research (Vol. 2). San Francisco, CA: Jossey-Bass.

Paulsen, M. B., & Smart, J. C. (2001). The finance of higher education: Theory, research, policy, and practice. New York, NY: Algora Publishing.

Peugh, J. L., & Enders, C. K. (2004). Missing data in educational research: A review of reporting practices and suggestions for improvement. Review of Educational Research, 74(4), 525-556. doi: 10.3102/00346543074004525

Pike, G. R. (2011). Using college students’ self-reported learning outcomes in scholarly research. New Directions for Institutional Research, 2011(150), 41-58. doi:10.1002/ir.388

Pike, G. R. (2013). NSSE benchmarks and institutional outcomes: A note on the importance of considering the intended uses of a measure in validity studies. Research in Higher Education, 54(2), 149-170. doi:10.1007/s11162-012-9279-y

Porter, S. (2011). Do college student surveys have any validity? The Review of Higher Education, 35(1), 45-76.

Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods (2nd ed.). Thousand Oaks, CA: Sage.

Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. New York, NY: Wiley.

Sax, L. J., Gilmartin, S. K., & Bryant, A. N. (2003). Assessing response rates and nonresponse bias in web and paper surveys. Research in Higher Education, 44(3), 409-432.

Schafer, J. L. (1997). Analysis of incomplete multivariate data. London: Chapman & Hall.

Schuh, J. H., & Associates. (2009). Assessment methods for student affairs. San Francisco, CA: Jossey-Bass.

Schuh, J. H., & Gansemer-Topf, A. M. (2010). The role of student affairs in student learning assessment (NILOA occasional paper no.7). Urbana, IL: University of Illinois and Indiana University, National Institute for Learning Outcomes Assessment.

Scott, W. R., & Meyer, J. W. (1994). Institutional environments and organizations: Structural complexity and individualism. Thousand Oaks, CA: Sage.

Selznick, P. (1957). Leadership in administration. New York, NY: Harper and Row.

Senge, P. (1990). The fifth discipline: The art and practice of the learning organization. New York, NY: Currency Doubleday.  

Schön, D. A. (1983). The reflective practitioner: How professionals think in action. New York, NY: Basic Books.

Shore, C., & Wright, S. (2000). Coercive accountability: The rise of audit culture in higher education. In M. Strathern (Ed.), Audit cultures: Anthropological studies in accountability, ethics, and the academy, (pp. 57-89). London: Routledge.

Suskie, L. (2009). Assessing student learning: A common sense guide. San Francisco, CA: Jossey-Bass.

Tandberg, D., & Hillman, N. (2013). State performance funding for higher education: Silver bullet or red herring? (WISCAPE Policy Brief). Madison, WI: University of Wisconsin-Madison, Wisconsin Center for the Advancement of Postsecondary Education (WISCAPE). Retrieved from https://www.wiscape.wisc.edu/docs/WebDispenser/wiscapedocuments/pb018.pdf

Terenzini, P. T., & Reason, R. D. (2005). Parsing the first year of college: A conceptual framework for studying college impacts. Paper presented at the Annual meeting of the Association for the Study of Higher Education, Philadelphia, PA.

Umbach, P. D., Palmer, M. M., Kuh, G. D., & Hannah, S. J. (2006). Intercollegiate athletes and effective educational practices: Winning combination or losing effort? Research in Higher Education, 47(6), 709-733.

Upcraft, M. L., & Gardner, J. N. (1989). The freshman year experience: Helping students survive and succeed in college. San Francisco, CA: Jossey-Bass.

Upcraft, M. L., Gardner, J. N., Barefoot, B. O., & Associates. (2005). Challenging and supporting the first-year student: A handbook for improving the first year of college. San Francisco, CA: Jossey-Bass.

Upcraft, M. L., Schuh, J. H., Miller, T. K., Terenzini, P. T., & Whitt, E. J. (1996). Assessment in student affairs: A guide for practitioners (1st ed.). San Francisco, CA: Jossey-Bass.

Weick, K. E. (1976). Educational organizations as loosely coupled systems. Administrative Science Quarterly, 21, 1-19.

Zemsky, R. (2011). The unwitting damage done by the Spellings Commission. The Chronicle of Higher Education. Retrieved from http://chronicle.com/article/The-Unwitting-Damage-Done-by/129051

Zucker, L. G. (1987). Institutional theories of organization. Annual Review of Sociology, 13, 443-464.