The lead local story in the TU today describes the rather large discrepancy between the proportion of African American citizens in Albany and the proportion of African Americans who have jobs in the Jennings administration. As a social scientist, I must say that I am aghast at the statistical analysis the paper uses. But first the facts:
In a city where African-Americans comprised 22 percent of the overall work force in the 2000 census, they now hold only 11 percent of about 1,400 jobs in the administration of Mayor Jerry Jennings, based on a Times Union analysis of the city work force as of this March.
That disparity is even stronger in the top posts, the kind that provide both a good paycheck and an opportunity to wield influence over how things run in a city where everything from getting a pothole fixed to finding a municipal job is handled out of City Hall.
Blacks were about 13 percent of all professional workers living in Albany, according to the census, but hold less than 1 percent of the city's 86 professional-level jobs -- lawyers, accountants or engineers, as well as lieutenants in the police and fire departments.
Under Jennings, roughly 5 percent of department heads and top policymaking administrators are African-Americans, while census figures show that blacks comprised about 14 percent of administrative workers living in the city in 2000.
Blacks fare better in getting mid- and low-level city jobs, according to the Times Union analysis. About 18 percent of the ranks of clerical and paraprofessional jobs -- clerks, dispatchers, recreation aides and summer counselors -- are African-Americans. That is closer to black proportions among all city residents in the work force, at 23 percent in clerical jobs and 35 percent in paraprofessional ones, in the census. African-Americans also hold about 15 percent of city maintenance and service jobs -- about half the proportion of all such workers who live in Albany.
This apparent imbalance comes as the city's black population has been growing in recent decades, from about 16,200, or 16 percent, in 1980, to nearly 27,000, or 28 percent, in 2000, a period when the city's overall population fell from about 101,000 to just under 96,000.
As one might expect, the political ramifications of this report were felt instantly:
The issue of minority employment in the city was cast as a campaign issue Monday by [mayoral candidate Archie] Goodbee.
"These data paint a sorry picture of a lack of official attention and commitment by this administration," he said. "Leaders in the minority community that I talk with have expressed a high degree of frustration in their efforts to improve this situation."
The response from city hall was less than pleasant:
A spokesman for the mayor questioned the timing of McLaughlin's inquiry. "Why is this coming up now?" said Jennings spokesman Joe Rabito. "We are making aggressive efforts to increase minority recruitment," he said. "We welcome council member McLaughlin's interest now. We wish she would have taken an interest before an election year, because we have been working on it for some time."
We're going to mostly talk statistics in this post because the TU does a shoddy job with their data today, but we should first note that the mayor's response to this was completely out of line. As I'm going to show below, I think there is a case to be made that the observed discrepancy in racial job holdings is not evidence of racism or even intentional practices. However, to brush aside a report is ludicrous. As Democracy In Albany has pointed out, the mayor's response is simply an accusation that people are trying to play politics for less than honest motives. But the crux is this: the mayor's office can agree with the report or it can dispute the report, but the mayor cannot simply brush it aside as if it was done with less than honest motives. This wasn't some fly-by-night interest group, it was the local newspaper of record!
But, as I said, we need to get the whole picture of what is going on with the statistics. As a social scientist by trade, I spend my days trying to make sense out of such data. Articles like this one are notorious for two types of problems:
1) Taking some data and drawing a logical, but completely incorrect, conclusion: The implicit conclusion in the story is that this is a purposeful result of hiring practices in the Jennings administration. Maybe. Maybe not. But the mere statistics do not get us there. We need to do a little more work to get to that.
2) Incorrect use of statistical analysis: Journalists routinely employ shoddy statistical methods in many cases, and then don't use some very helpful statistical methods in other cases.
For those interested, I'm now going to walk through some of the statistics (in layman's terms) that we can use to further analyze the data provided by the Times Union.
What we have here is data. By itself, data tells you nothing. In order to properly interpret data, you need to develop a hypothesis and then make sure your data is suitable for testing that hypothesis. If it is, you can then do some proper statistical testing and draw some valid conclusions.
Obviously, newspapers don't have time to go through stats lessons in every article. So a lot of this becomes implicit in the article. Let's look at the implicit stuff in the TU article from today:
Data: the data they use is a breakdown of job holdings in the Jennings administration by race, as well as census data on the racial profile of the city of Albany.
Hypothesis: A number of hypotheses seem to be swirling through the article. The most gentle claim is that there is a racial imbalance in Jennings administration jobs. Some stronger hypotheses - for instance that this is intentional racism - might be implied but are never stated.
Suitability: The TU never seems to question the suitability of their data. This is an error.
Do some testing: The only testing the TU does is a straight correlation. This is also an error.
Conclusions: The TU doesn't really draw a conclusion - it simply lays out the imbalance in the correlation. This, from a newspaper perspective, is ok. But from a social science perspective, it is an error.
To illustrate how you might go about a semi-serious analysis of this data, I'll walk through the data the way I would approach this as a social scientist.
Step 1:
Notice the potential problem and describe it. The TU does a nice job here. Blacks make up 22% of the city's work force. They make up 11% of the city employees. They are most underrepresented in the top-level jobs, and best represented in the bottom-level jobs. The TU gives some nice breakdowns of the proportion of workers by race in each of these categories, as well as some sectors of city jobs.
Step 2:
Confirm that this is statistically significant. We need to make sure that such a result couldn't happen by random chance. For this, we use a complicated statistical technique that tests proportions. In this case, it is definitely statistically significant - there would be less than a 1 in 1000 chance of observing an imbalance this great if city hall workers were selected at random from the population of the city of Albany. (The z-stat was over 100). However...
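Before getting to that caveat, here is a minimal sketch of what a test of proportions looks like, for those who want to check the arithmetic at home. It assumes a one-sample z-test that compares the 11% share of roughly 1,400 city jobs against the 22% work-force share quoted by the TU. That is an assumption on my part about the setup, not necessarily the exact test behind the figure quoted above, and the function names are made up for illustration. With these inputs the statistic comes out near -10 in this sketch; the exact value depends on which test and baseline are used, but the p-value is far below the 1-in-1000 threshold either way.

```python
import math

def one_sample_proportion_z(observed_share, expected_share, n):
    """z-statistic for an observed proportion against a hypothesized one."""
    # Standard error under the null hypothesis that the true share is expected_share
    standard_error = math.sqrt(expected_share * (1 - expected_share) / n)
    return (observed_share - expected_share) / standard_error

def two_sided_p_value(z):
    """Two-sided p-value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

# Figures quoted from the Times Union article
n_city_jobs = 1400      # "about 1,400 jobs"
observed = 0.11         # African American share of city jobs
expected = 0.22         # African American share of the city work force

z = one_sample_proportion_z(observed, expected, n_city_jobs)
p = two_sided_p_value(z)
print(f"z = {z:.2f}, two-sided p = {p:.3g}")
```

Note that this treats the 1,400 workers as a random draw from a pool that is 22% African American, which is exactly the assumption that needs checking.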
Step 3:
Confirm your underlying assumptions about your variables. That statistical significance assumes that the two samples we are working with - city workers and the city residents - are "nested populations," (i.e. all the workers are drawn from the city residents). As it turns out, they are not! According to the mayor's office:
He said it is not fair to consider only the city's racial characteristics, because the Jennings administration draws potential workers from outside Albany's boundaries. Only 56 percent of city workers live in Albany, Cavazos said.
This throws a serious wrench in the TU analysis, because we don't know the racial composition of the surrounding area from where the workers are drawn. For the statistical comparison to be correct, we need the racial data from sample #1 (city workers) to be compared to the racial data from the true population pool (i.e. the general population of all the areas that city workers come from). Right now this is not the case. City residents are being used as the true population pool, but city workers are drawn from a larger pool (city residents and non-city residents). And 44% of the city workers are coming from this non-city portion of the true population pool.
The racial statistics of the true population pool are the crucial missing information. What we know is the racial statistics of the city portion of the true population pool. So, for instance, if the non-city portion of the true population pool is 93% white and has a population of 100,000 residents, then the imbalance in city hall jobs all of a sudden goes away! If it is 100% white and has a population of 200,000 residents, then all of a sudden there is an imbalance in favor of African Americans! Note that I have no idea what the population size or racial composition is of the non-city portion of the true population pool. If it's a very low population, or if it's not overwhelmingly white, the imbalance in jobs remains statistically significant. Also note that this says nothing about the types of jobs people get at city hall, only the number of workers.
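To make the pool argument concrete, here is a rough sketch of how the expected African American share changes once a non-city population is blended in. Everything about the non-city pool is hypothetical, as are the function names. The sketch also mixes the article's city population (just under 96,000) with its 22% work-force share, which is a simplifying assumption. The point at which the gap closes moves a lot with these inputs, so it is worth plugging in real figures rather than trusting round-number examples.

```python
def blended_share(city_pop, city_share, outside_pop, outside_share):
    """African American share of a hiring pool made of city plus non-city residents."""
    total = city_pop + outside_pop
    return (city_pop * city_share + outside_pop * outside_share) / total

def break_even_outside_pop(city_pop, city_share, outside_share, target_share):
    """Non-city population at which the blended share equals target_share."""
    if not outside_share < target_share < city_share:
        raise ValueError("target must lie strictly between the two shares")
    return city_pop * (city_share - target_share) / (target_share - outside_share)

CITY_POP = 96_000    # approximate 2000 city population, from the article
CITY_SHARE = 0.22    # work-force share from the article (used as a stand-in)
OBSERVED = 0.11      # African American share of city jobs, from the article

# Hypothetical non-city pools: (population, African American share)
scenarios = [(50_000, 0.10), (100_000, 0.07), (200_000, 0.00)]

for outside_pop, outside_share in scenarios:
    share = blended_share(CITY_POP, CITY_SHARE, outside_pop, outside_share)
    print(f"outside pool {outside_pop:>7,} at {outside_share:.0%}: "
          f"blended share {share:.1%} vs observed {OBSERVED:.0%}")

# How large would an all-white outside pool need to be to explain 11%?
needed = break_even_outside_pop(CITY_POP, CITY_SHARE, 0.00, OBSERVED)
print(f"break-even outside population at 0%: {needed:,.0f}")
```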
UPDATE: A comment over at Democracy in Albany points to the Albany County racial statistics: apparently 11% of Albany County residents are African American. So if the true population pool for city jobs is the whole county, there is no numerical imbalance.
(Of course, this only raises the political question of why Albany city workers are mostly being brought in from outside Albany - to the detriment of the African American population in the city. But that would be a political question, not a statistical one.)
Step 4:
Come up with some hypotheses.
Although the recognition of the racial statistics of the true population pool makes the problem look a lot less serious, there is the disturbing realization that most of the African-Americans who work for the city work in the lowest jobs, and almost none work in the highest jobs. (We could again go back and make sure this observation (1 African American in 86 jobs) is statistically significant, but I already have - it is. There is less than a .01% chance you would observe this result if the jobs were pulled randomly from the true population pool (the corrected one that accounts for the whole county).)
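For anyone who wants to check a result like this, here is a minimal sketch using an exact binomial tail: the chance of seeing at most 1 African American among 86 professional jobs if each job were filled at random from a pool that is 11% African American. The exact binomial is a stand-in chosen for simplicity, not necessarily the test behind the figure above, and the precise probability depends on the test and the baseline share used. Under these assumptions it comes out well under 0.1%.

```python
import math

def prob_at_most(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n_professional_jobs = 86        # from the Times Union article
african_american_holders = 1    # "less than 1 percent" of 86 jobs
county_share = 0.11             # Albany County share cited in the update

p_value = prob_at_most(african_american_holders, n_professional_jobs, county_share)
print(f"P(at most 1 of 86 at an 11% share) = {p_value:.6f}")
```

The independence assumption here (each job filled by a separate random draw) is obviously a simplification of how hiring works.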
One hypothesis we could make is that there is some sort of racial discrimination, either explicit ("we don't hire blacks as policy"), implicit ("we don't have a policy, but we all know that we don't hire blacks"), or subconscious ("we have no problem hiring blacks, but when we interview them we unintentionally discount them as candidates") that is causing the low number of African Americans in the top positions. This seems to be the quick conclusion some people have come to, and also the implicit TU conclusion.
A second hypothesis would be that there is a lurking variable here. A lurking variable is a third variable that explains away a correlation or an imbalance. For instance, you might look at statistics about car crashes and see that tons more accidents happen in NYC than in Saratoga. It would be wrong to draw the conclusion that the roads in Saratoga are safer than in NYC. Why? Can you see the lurking variable? Tons more cars drive on the roads in NYC, so there are lots more potential accidents. In fact, per mile driven per car, the roads in NYC are almost certainly safer than in Saratoga. The vast difference in miles driven in the two places in an average day is the lurking variable that explains the incorrect conclusion.
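The crash example can be sketched in a few lines. The numbers below are made up purely for illustration (they are not real crash or traffic statistics), and the function name is hypothetical; the only point is that the ranking flips once you divide by miles driven.

```python
def crashes_per_million_miles(crashes, miles_driven):
    """Crash rate normalized by exposure (miles driven)."""
    return crashes / (miles_driven / 1_000_000)

# Made-up numbers purely for illustration; not real statistics
hypothetical = {
    "NYC": {"crashes": 5_000, "miles_driven": 10_000_000_000},
    "Saratoga": {"crashes": 100, "miles_driven": 100_000_000},
}

rates = {
    place: crashes_per_million_miles(d["crashes"], d["miles_driven"])
    for place, d in hypothetical.items()
}

for place, d in hypothetical.items():
    print(f"{place}: {d['crashes']:,} crashes, "
          f"{rates[place]:.2f} per million miles driven")
```

In this made-up data NYC has far more crashes in total but the lower rate per mile, which is the pattern described above.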
What might be a lurking variable that could reject a racism conclusion here? Well, the mayor's office suggests one:
"While a certain percentage of the population may be women or minority, a smaller percentage of that group actually posses the necessary skills and qualifications to be considered for employment," he wrote.
Of course, people tend to bristle when they hear stuff like this. A response from Councilwoman Carolyn McLaughlin:
Cavazos' statement also drew a rebuke from McLaughlin, who is assistant budget and planning officer for the New York State Teachers Retirement System. "It's insulting to say that," she said. "What kind of brainpower do we have in the city of Albany? I have a master's degree and so do a number of my friends."
Of course, both sides are partially right here. The mayor's office is correct in saying that the African American population in the city is disproportionately less educated than the population as a whole. McLaughlin is correct, however, because the imbalance in the top jobs in the mayor's office is just far too great to be explained by this imbalance in education. We are not talking about filling 10,000 jobs that require a Ph.D. here. We're talking about only having 1 African American in the top 86 jobs.
So while I sympathize with the possibility that lurking variables - such as education - can throw a wrench into these statistical problems, in this case it's somewhat ludicrous, because the numbers are so small and the population of jobs is tiny compared to the population of the city.
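One way to put a number on how much work the education variable would have to do is to sweep the assumed share of qualified candidates who are African American and see when 1 in 86 stops looking surprising. This is a rough sketch using an exact binomial tail. The candidate shares are hypothetical inputs apart from the 22% and 13% figures from the article, and it assumes each job is an independent random draw. Under these assumptions, a qualified share around 10% still leaves the observed result very unlikely, and the share has to fall to roughly 5% or below before it clears a conventional 5% threshold.

```python
import math

def prob_at_most(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

N_JOBS = 86     # professional-level jobs, from the article
HOLDERS = 1     # African Americans holding those jobs

# 22% and 13% come from the article; the smaller shares are hypothetical
candidate_shares = (0.22, 0.13, 0.10, 0.05, 0.02)

results = {share: prob_at_most(HOLDERS, N_JOBS, share) for share in candidate_shares}

for share, prob in results.items():
    print(f"qualified share {share:.0%}: P(at most 1 of {N_JOBS}) = {prob:.4f}")
```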
Step 5: Draw some tentative conclusions and look for more data: My conclusion (from the pitiful amount of data we have) is that the imbalance in jobs by race in the city as reported by the TU is due to a combination of three things:
1) Bad statistical work. Not correctly counting the population the jobs are drawn from is a huge mistake. If workers are coming from the suburbs, then the racial composition of the suburbs is a fact we need to know. Shame on the TU for not even considering this. Of course, the politics of why city jobs are being given to people who don't live there is a question I would like answered.
2) Lurking variable of education. The mayor's office is probably right, at least partially. The percentage of qualified candidates for the top jobs in the mayor's office who are African American is probably smaller than 22%. It might be 10%. So that explains the imbalance partially. However, it doesn't explain a 1-in-86 result. Not even close.
3) Some sort of discrimination, or at least non-aggressiveness in minority job hiring. Note that I didn't say racism here. I doubt it is a simple story like that. The most serious problem in my view is the outside-the-city hiring. I'm sure that the city could find plenty of qualified candidates - and plenty of qualified African Americans - if city jobs were restricted (at least partially) to city residents. The mayor's office could certainly do more to aggressively seek out qualified minority candidates. Again, a good start would be to aggressively seek out candidates who live in the city - that might clear the problem up to a large degree by itself. I'm confident that this strand of the problem is built on laziness and entrenched networks, not really racism. However, I'm open to the possibility that there is at least some subconscious racial discrimination in the hiring process, if not more implicit things. This, of course, can be very difficult to root out, especially if it is at the level of the individual.
I'd like to see more data on all of this. Most crucial would be a description of the population pool (not just the city). I'd also like to see statistics on educational background, as well as some data on who the city interviewed for the top posts. More data is always more helpful.
UPDATE: I want to make it clear - it apparently wasn't in my original post - that I'm not advocating any sort of affirmative action or diversity investment on the part of the city. If the city is acting in good faith and being truly colorblind, and the job results are simply coming out as they are because the best candidates are getting the jobs, then I'm fine with that. Really. I don't think the imbalance itself is reason for alarm unless it's connected to racism or some other policy that is systematically denying jobs to qualified African Americans. In fact, the point of my statistical analysis was largely to tear apart the TU article. But I can't ignore a situation where only 1 of the top 86 jobs goes to an African American, can I? You don't need high-powered statistics to see a problem with that, right?