Philanthropy Magazine

The Evaluation Wars

William Schambra, Senior Fellow Emeritus

SUMMARY: Evaluation studies can’t substitute for site visits and real knowledge of your community, Schambra warns in "The Evaluation Wars." This article looks at the history of evaluation and how giant foundations tend to get bogged down in conflicting theories while smaller donors use local wisdom to find effective grantees. It argues that a purely scientific philanthropy is impossible; small foundations in particular should attempt a more intuitive style of grantmaking based on practical wisdom and keen observations close to home. Schambra also offers tips on things to look for during site visits to assess effectiveness. The full text of this article, which appeared in the May/June 2003 issue of Philanthropy magazine, can be found below.

"Am I doing good by giving this money away?" is a question everyone in philanthropy should ask regularly. It is hard for funders to do good; it is all too easy for them to feel good--if they don’t bother to investigate the consequences of their giving.

And so we hear much about choosing "effective" grantees and even more about "evaluating" grants given. In fact, it often sounds as if responsible philanthropy requires legions of scientists to evaluate the results of one’s funding. Does that mean that small- and medium-sized foundations that lack the resources to undertake vast evaluative studies should just give up, perhaps passing their assets to some giant foundation with the scientific wherewithal to do the job right?

No. First, the history of philanthropic evaluation reveals that it has certain limitations and even pitfalls for donors of all sizes. Second, a small or medium foundation can often outperform larger foundations when it comes to uncovering truly effective grantees.

Before we consider how to find the best grantees without a staff of hundreds, let’s look at the long saga of evaluation.

From Charity to "Scientific Giving"

Today’s effort to move beyond "sentimental charity" to "rigorous, results-oriented" giving is nothing new. It was the founding principle of large, institution-based giving that began with the Rockefeller, Carnegie, and Russell Sage philanthropies early in the twentieth century. As John Jordan reports in Machine-Age Ideology, these foundations searched for a mathematically rigorous social science that would give enlightened elites a way to achieve "social control" over the benighted masses. The Rockefeller Foundation declared in an early mission statement that it hoped to "increase the body of knowledge which in the hands of competent social technicians may be expected in time to result in substantial social control."

So tightly interwoven were philanthropy and the new social sciences that for all of August each summer in the 1920s, Rockefeller, Carnegie, and Sage officials would gather in the cool shade of Hanover, New Hampshire, with researchers from foundation-backed projects at the major universities and members of the National Bureau of Economic Research and the Social Sciences Research Council. They hoped to create "a vision from which new ideas may emerge," as philanthropist Beardsley Ruml put it.

After such an enthusiastic embrace of scientific, measurable outcomes by the very founders of contemporary philanthropy, how is it that, almost a century later, we still talk about the need to move from sentimental giving to rigorous, measurable outcomes as if the idea had just occurred to us?

As it turns out, social science underwent something of a crisis of confidence during the 1960s and ’70s. Social reformers began to discover problems with the scientific, experimental method for studying the effects of social programs. This method required a donor (the government or a private entity) to design a narrowly focused social intervention, introduce it exactly the same way across a series of sites, and then compare the outcomes to "control groups"--sites with identical characteristics but not subjected to the treatment. Such efforts were cumbersome and expensive, and they tended to produce results far too late to be useful for planning. Above all, experimental social science tended to show that no program ever worked. Hence evaluation expert Peter Rossi’s famous Iron Law: "The expected value for any measured effect of a social program is zero." Thus some analysts became skeptics of government programs, while others persisted with their interventions, insisting they be measured in different ways.

The latter point of view was eloquently expressed by Lisbeth Schorr in two widely acclaimed books from the 1990s, Within Our Reach and Common Purpose. She argued that "the conventions governing traditional evaluation of program impact have systematically defined out of contention precisely those interventions that sophisticated funders and program people regard as most promising." To be evaluated scientifically, programs had to be narrowly defined and inflexibly applied for a limited experimental period. Truly effective programs, by contrast, tended to be "comprehensive, flexible, responsive, and persevering," and to "deal with families as parts of neighborhoods and communities." To evaluate these sorts of "comprehensive community initiatives," a new mode of measurement called "theory-based evaluation" was required. This approach, de rigueur among today’s more sophisticated foundations, employs a "logic model" or "theory of change" to explain how a complex range of community elements may be brought together to produce "social change." Actual outcomes from the subsequent program can then be compared to predicted outcomes.

There is only one problem, as Gary Walker and Jean Grossman point out in their sobering and insightful Philanthropy and Outcomes: Dilemmas in the Quest for Accountability. Without control groups, this approach cannot show scientifically that the outcomes were produced by the program, and so "it is often difficult to know to what degree the outcomes achieved are attributable to the initiative funded." Hence the nonprofit scholar Ira Edelman found that evaluation schemes "failed to provide the kind of certainty" that experimental and statistical methods can offer. "Key players" reacted by rejecting evaluation reports and making "demands for ‘hard’ evidence that programs were working--evidence that was impossible to provide."

The Evaluation Wars rage on, with a bewildering variety of often conflicting approaches to measurement. Community activists champion "participatory evaluation," while wealthy entrepreneurs are drawn to "social return on investment." Other contenders taking the field include Patton’s utilization-focused evaluation, Stufflebeam’s decision/accountability-oriented evaluation, Scriven’s goal-free evaluation, Shadish’s needs-based evaluation, and Norton and Kaplan’s balanced scorecard measurement-based management approach. Small wonder that a recent survey of philanthropic leaders by the Center for Effective Philanthropy concluded, "each foundation has developed its own combination of metrics independently, described in different language, and applied in different ways--there is no common vocabulary to permit sharing among foundations nor any coordinated attempt to collect these measures more efficiently."

The Center also found that "even among the 225 largest foundations in the country, which have the greatest resources to evaluate grants, more than 40 percent of CEOs estimate that fewer than one-quarter of their foundations’ grants are evaluated." And so, 80 years after social scientists and philanthropists gathered in New Hampshire to create a hard science of human behavior, the Center can only hope "that one day foundation leaders and trustees will have ready access to comprehensive objective performance data to guide their key decisions."

Until this hope--now almost a century old, with no fulfillment in sight--is realized, what can the family-managed or lightly staffed smaller foundation do to evaluate its programs? Even if a modest donor dares to pluck an evaluation method from the maelstrom, the World Economic Forum’s recent study Philanthropy Measures Up warns that although "dozens of interesting efforts at and resources for measuring philanthropic impact" exist, "few organizations have developed systematic approaches" that can be applied "relatively quickly and cost effectively by individual donors, small foundations, and businesses."

Wise and Effective Giving

In short, "scientific" giving is expensive and burdensome to start with, and even when these hurdles are overcome its value remains uncertain. Worst of all, the effort to evaluate grants rigorously may lead donors away from some highly effective potential grantees. Trying to discover, say, the scientifically perfect drug rehabilitation program may force a donor to ignore the most effective drug rehab programs already operating in his own city.

A purely scientific philanthropy is impossible; small foundations in particular should attempt a more intuitive style of grantmaking based on practical wisdom and keen observations close to home. Donors should recognize that their own back yards likely contain dozens of highly effective grassroots groups that should not be overlooked simply because they have never been evaluated by a phalanx of wonkish experts. Bob Woodson at the National Center for Neighborhood Enterprise and Amy Sherman at the Hudson Institute have worked with hundreds of grassroots leaders around the country who not only run solid programs themselves but are also invaluable sources of local wisdom about other programs known on "the street" to be truly effective. These local wise men and women can identify promising candidates for funding, and donors can then vet the candidates through discerning site visits. The visit is critical, for sometimes even the most effective "agents of change" can only explain what they are up to by saying, "Come and see."

Some tips for "eyeballing effectiveness" during the visit: Look first for activity. Typically, effective grassroots groups are quite busy and at times may even appear to be chaotically overwhelmed. If they are truly serving the neighborhood, they wind up not just pursuing their announced mission but dealing with the full range of human needs brought to their doorsteps by a desperate community. For example, when children started showing up at Cordelia Taylor’s Family House in Milwaukee, she could have sent them away and told them she was running a senior care facility--or she could have tried to meet their obvious need for a caring adult presence. She of course aimed for the latter. She had her son James take the kids under his wing with after-school activities and a homework club; eventually, a volunteer organized karate classes in the basement. This sort of flexible, unplanned, "off-mission" response to whatever the neighborhood needs next is a hallmark of effective grassroots work, but it plays havoc with program budgets and outcome-reporting, and makes it difficult to attract funding from larger foundations that expect to see rigid adherence to program designs and budget line-items. As Leon Watkins, director of Los Angeles’s Family Helpline, told the Capital Research Center, "When someone comes in and tells me their house just burnt down, or they bring in a little girl with serious mental problems and she has no place to stay, what program do you put that under? It’s hard to explain to people that concept. People who pledge support want to see programs. But that’s what life is like here--whatever comes up, that’s the program."

A group’s busyness suggests its valued place in the neighborhood, and so does another visible sign of effectiveness: the tangible respect the neighborhood shows the group. Cordelia Taylor had to reassure the construction company working on Family House’s expansion that it needn’t put fencing around its equipment at night, because the neighborhood knew it was helping her. Nothing was ever stolen from the unfenced site. Sister Jennie Lechtenberg notes that one sign of PUENTE Learning Center’s importance to its East Los Angeles community is the fact that no graffiti appears on her facility’s walls. (Conversely, locked doors and elaborate security systems are telltale signs of a group with an uneasy relationship to the surrounding neighborhood.)

Similarly, on a site visit to an effective grassroots group one will see that the group genuinely respects the neighborhood, no matter how "pathological" it may seem to the experts. The group’s leadership typically lives in the neighborhood it serves (Robert Woodson calls this the "zip code test"). The leaders know the neighborhood thoroughly and can take you to its most forbidding corners without fear. John Worm, a housing specialist at ACTS Community Development Corporation, can drive a donor through his Milwaukee neighborhood, telling the stories of every house and every low-income family he has ever matched up. ("The Lopezes in 608 just built their own back porch. Looks like it probably has seven code violations, but they couldn’t be prouder of it.")

A further sign of respect for the community is the way good grassroots leaders do not refer to the people they serve as "clients" and never treat them as passive, helpless victims of circumstances. Grassroots leaders know and use the names of those they are serving and are quite willing to interrupt a site visit to answer an urgent query. Visitors often see program participants washing dishes or picking up the trash during a tour, because the program shows its respect for the dignity of those it helps by asking something in return.

Grassroots groups also gain respect from neighborhoods and those they serve by having staff who themselves have overcome the problems they help others with. A former drunk with an eighth-grade education can often achieve far better results with current drunks than an entire staff of college-trained doctors, therapists, and social workers. If a nonprofit lacks any staff who can look a participant in the eye and tell him, "I beat this problem, and you can too," it’s unlikely to be effective at reforming people. But if it has some of its own former participants now on staff, with or without professional credentials, it is probably turning lives around.

A site visit should also reveal evidence of good stewardship, well before you ever get to the point of examining the books. In one of my site visits for the Bradley Foundation, Sharon Mays-Ferguson pointed out a somewhat awkward window at Intercessions, her group home for teen mothers, and noted that she had wedged it in during construction because it had been donated by a supporter. When one tours Bob Coté’s Step 13, a residential addiction treatment center in Denver, one learns which church donated that crucifix for his chapel and which company donated that sofa or that computer. For effective grassroots groups, nothing--not even a cast-off window--is wasted, and every donation of resources and volunteer energy is welcomed, remembered, and acknowledged.

The very method a project uses to tell you about itself says something about its effectiveness. A quality grassroots group will seldom show you into an empty conference room to view a PowerPoint presentation about abstract intervention statistics. Rather, you are usually invited to sit down with a group of people whose lives have been touched by the group. Since many grassroots groups are rooted in religious faith, understanding what they do means witnessing comprehensive life transformations, not superficial behavior modifications--and those can only be conveyed by listening to the full stories of redeemed souls.

A final indicator of effectiveness appears when the site visit arrives at the funding pitch. From a solid grassroots group, you never hear "unless you make a grant, we cannot do anything." In fact, you may not get an explicit pitch at all. The often implicit message is, "You have seen the fruits of our labor, not a promise about what we might do with more money. If you choose to help us, great. If not, fine. But we will be here, laboring." As Woodson argues, you should look for the groups that were already working before funding became available and that will continue to be there if it no longer is. Good groups especially do not depend on the vagaries of government funds for their existence.

One mistake foundations often make is to rush past these subtle but truly significant indicators of effectiveness in their haste to see an audited statement, a sophisticated accounting system, or an elaborate method for tracking and reporting outcomes. A better approach would be to rely on favorable first impressions and make a small initial grant to the group. You will have the opportunity later, now as a trusted supporter, to ask for improvements in accounting and reporting--things you should be willing to help the group provide. Woodson’s National Center works with several assistance providers who are particularly sensitive to the strengths of grassroots groups, and Amy Sherman has identified a number of "intermediaries" around the country that also fit this mold (see her article in the July/August 2002 issue).

Once you have established funding relationships with grassroots leaders, they begin to lead you to other effective programs. The late Bill Lock at Community Enterprises of Greater Milwaukee was not only himself an effective program director but also a font of community wisdom about who else was doing good work in the city. He never hesitated to share that information, even if it pointed beyond his church’s denominational walls, because he understood all effective neighborhood leaders to be doing God’s work.

By now, it should be clear that eyeballing effectiveness depends less on what a group does than on what it is. To be a transforming and healing agent in a neighborhood, a group must be embedded in the neighborhood, reflecting its best traditions and hopes for the future. It is not enough simply to deliver services, however efficiently.

While the largest foundations continue to tie themselves in knots in the Evaluation Wars, smaller foundations are free to undertake the deliberate, cumulative, intuitive process of mapping networks of effective grassroots groups in their own backyards, compiling over time their own checklists of effectiveness, based on their own local experiences. Measurable outcomes are no substitute for this deeper wisdom.