In fall 2022, The Daily Californian introduced its first source audit, overseen by the managing editor of race and equity, or MERE, along with the DEI Committee chair and the editor in chief. The initiative was a trial run of an audit meant to bring more awareness and transparency to the paper’s sourcing.
Given that the source audit was a new introduction, its integration took time and saw changes in the auditing process. This included meeting with departments to discuss feedback on the audit and overall user accessibility. After implementation of the source audit from September to December 2022, a decision was made to trial a static audit for content from January to April 2023. This version of auditing relied on information that was already publicly available rather than staffers asking sources directly for demographic details.
By piloting both a source and static audit of content, I was able to gauge demographic trends in the paper’s coverage. Furthermore, both forms of auditing allowed me to analyze the strengths and weaknesses of the different types of audits as well as the feedback from staff regarding each one.
The following report details the structure, data and takeaways from these audits.
The purpose of gathering this information was to create an index of sources at The Daily Californian, so that the paper and reporters could determine which identities are underrepresented in editorial content. In doing so, the MERE, DEI Committee chair and DEI liaisons planned to work with each department to determine ways to improve outreach to a more diverse range of sources in order to better represent the Berkeley community. We also wanted to reinforce the value of championing historically marginalized voices in our coverage in a way that accurately represented different identities and intersectionalities of identities.
The source audit for fall 2022 was piloted with the idea of having individual departments enter data for sources from respective content. Given that the news department and projects pieces typically relied the most on source-based content, these two departments were the ones which utilized the audit.
This version of auditing sources also relied on an audit-as-you-go approach, meaning that writers were asked to enter data on sources as the relevant articles were published. Additionally, a script and source audit guide were made to help direct writers in how to ask sources questions that would help answer parts of the audit.
One of the primary benefits of this methodology was the ability to hear directly from sources how they identify. This allowed for accuracy in auditing as well as minimized the need to answer questions with “unknown.” However, sources were still able to skip questions or choose not to disclose. They were also informed that all information would be anonymized and the data would only be used for overall trend analysis.
Staffers were also provided a guide for using the source audit, including a script for asking questions during the interview and advice on how to respect the choice of sources who declined to answer. However, staff described difficulty implementing the audit into the natural flow of interviews with sources, and some sources were resistant to answering demographic questions.
For this reason, roundtable discussions were held between the MERE, DEI chair, EIC and news department to explore the option of alternative auditing formats.
To address this resistance to the source audit, the MERE conducted a static audit for the following spring 2023 semester.
Data for the static audit was gathered by the MERE from news story documents and a source registry. Information to answer each question on the audit also came from online research on the background of sources. For cases where an educated guess could not be made about any element of information on the source, a “prefer not to disclose” or “unknown” option was available. The option “other” was also included to account for information that does not fit into any of the categories included for each question.
Like the source audit, the static audit focused on news and project pieces from a four-month time period. Each article audited had, on average, two to three sources. Repeat sources were also included to account for the frequency that one specific source may appear in the paper’s content.
In addition to recording data on the source, the MERE also recorded what type of article the source was cited in — including campus, city, state and national coverage.
Age of source
Data for the fall 2022 and spring 2023 semesters come from 95 and 280 responses, respectively. For questions with multiple selection options, percent results indicate the percentage of sources selecting each option, meaning that some percentages add up to over 100%. Additionally, “PND” represents “prefer not to disclose” or “unknown.”
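To illustrate why multiple-selection questions can total more than 100%, here is a minimal Python sketch. The function name `option_shares` and the toy responses labeled "A", "B" and "C" are assumptions made up for illustration; they are not the audit's actual code or data.

```python
from collections import Counter


def option_shares(responses):
    """Return the percent of respondents who selected each option.

    Each response is a set of selected options, so one respondent can
    count toward several options at once.
    """
    if not responses:
        raise ValueError("at least one response is required")
    counts = Counter(option for selected in responses for option in set(selected))
    total = len(responses)
    return {option: round(100 * n / total, 1) for option, n in counts.items()}


# Toy data for illustration only (hypothetical, not audit data):
# four respondents, two of whom selected more than one option.
toy_responses = [{"A"}, {"A", "B"}, {"B", "C"}, {"C"}]
shares = option_shares(toy_responses)
total_percent = sum(shares.values())

print(shares)
print(total_percent)
```

Because each percentage is calculated against the number of respondents rather than the number of selections, the shares in this toy example sum to more than 100%.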
Of the data collected, there were 57 campus-based sources, 30 city-based sources and eight sources that can be grouped as “other” to account for less common articles focused on obituaries and state or national affairs. This totaled to a sample size of 95 sources.
Given that the sample size of 95 sources does not reflect all sources for the given time period, there is potential for significant nonresponse bias in the fall 2022 audit. The following analysis of data therefore serves the purpose of reflecting the sources entered by staff for the given time period, but it is not intended to be representative of all sources for that period of time.
The overall age breakdown for the 95 respondents is as follows: 1.1% younger than 18 years old, 21.1% between 18 and 25, 40% between 26 and 50, 21.1% between 51 and 65, and 10.5% older than 65. Another 6.3% of the sample preferred not to disclose.
For the campus-based pieces of the sample, 24.6% of respondents were between 18 and 25 years old, 40.4% between 26 and 50 years old, 15.8% between 51 and 65 years old, and 8.8% older than 65. For these pieces, 10.5% of respondents preferred not to disclose. Given that campus consists predominantly of students and faculty or campus administration, the prevalence of sources between 26 and 50 years old suggests that there could be more sources who are campus authority or faculty than students who would most likely occupy a younger demographic of 18 to 25 years old.
On the city side of the sample, 3.3% of sources were less than 18 years old, 20% were between 18 and 25 years old, 46.7% between 26 and 50 years old, 26.7% between 51 and 65 years old and 3.3% were older than 65 years old. This city breakdown suggests that city sources typically fall between 26 and 65 years old, with few sources older than 65.
For the 95 respondents, the breakdown of pronouns was 55.8% he/him, 36.8% she/her, 5.3% they/them and 1.1% she/they. The remaining respondents preferred not to say their pronouns.
Of the 57 campus-based sources, 52.6% used he/him pronouns, 42.1% used she/her, 1.8% used they/them and 1.8% preferred not to disclose. Of the 30 city-based sources, 56.7% used he/him, 30% used she/her, 10% used they/them and 3.3% used she/they. This means that sources using he/him pronouns made up more than half of the recorded sources for both campus and city coverage.
Of the 95 respondents, 43.2% were white, 16.8% were Latine or Hispanic, 13.7% were Asian or Asian American, 9.5% were Black, African or African American, 9.5% were mixed race, 1% were Middle Eastern and 6.3% preferred not to disclose.
For campus-based pieces, 40.4% of sources were white, 17.5% were Latine or Hispanic, 14% were Asian or Asian American, 10.5% were Black, African or African American, 5.3% were mixed race and 1.8% were Middle Eastern. Another 10.5% of sources for campus pieces did not disclose. In city-based pieces, 43.3% of sources were white, 20% were Latine or Hispanic, 16.7% were mixed race, 10% were Asian or Asian American and 10% were Black, African or African American.
The spring 2023 static audit drew on a larger sample of 280 sources to minimize nonresponse bias. Of the 280 sources, 144 were associated with campus coverage, 110 with city coverage and 26 fell under the “other” category that includes state and national affairs and obituaries.
As with the fall 2022 source audit, this static audit analysis focuses on the demographics broken down based on campus and city coverage.
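As a rough check on how each sample splits across coverage types, the source counts reported for each semester can be converted into shares of the total. The sketch below uses only the counts stated in this report; the function name `coverage_shares` and the dictionary layout are assumptions for illustration, not the audit's actual tooling.

```python
def coverage_shares(counts):
    """Convert raw source counts per coverage type into percent shares."""
    total = sum(counts.values())
    return {kind: round(100 * n / total, 1) for kind, n in counts.items()}


# Source counts as reported for each audit.
fall_2022 = {"campus": 57, "city": 30, "other": 8}
spring_2023 = {"campus": 144, "city": 110, "other": 26}

fall_shares = coverage_shares(fall_2022)
spring_shares = coverage_shares(spring_2023)

for label, counts, shares in [
    ("fall 2022", fall_2022, fall_shares),
    ("spring 2023", spring_2023, spring_shares),
]:
    print(label, "total sources:", sum(counts.values()), shares)
```

The totals printed by this sketch match the sample sizes of 95 and 280, and campus sources make up the largest share in both semesters.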
Of the 280 sources, the age ranges included 1.1% respondents younger than 18 years old, 18.9% between 18 and 25, 51.8% between 26 and 50, 8.2% between 51 and 65, 8.6% older than 65 years old and 11.4% with unknown ages. This distribution reveals that more than half of all sources in this sample group were ages 26 to 50 years old.
For campus-specific pieces, sources had the following age distribution: 23.6% between 18 and 25, 52.8% between 26 and 50, 7.6% between 51 and 65, 9% older than 65 years old and 6.9% with unknown ages.
As in the overall sample, the sources for campus coverage were predominantly between 26 and 50 years old. The same age range was also the largest group among campus sources in the fall 2022 source audit, at 40.4%. Based on this consistency, campus coverage appears to rely on sources who are older than the typical age of campus students but younger than the oldest members of the community.
For the city side, 2.7% of sources were younger than 18 years old, 13.6% were between 18 and 25 years old, 50.9% between 26 and 50 years old, 8.2% between 51 and 65 years old, 8.2% older than 65 years old and 16.4% had unknown ages. Once again, this distribution reveals how most sourcing for city-based content remains within the age range between 26 and 50 years old.
For gender-based identification, 53.6% of sources used he/him pronouns, 37.9% used she/her, 1.8% used they/them, 2.5% used she/they and 1.1% used he/they. Another 3.2% of sources in this overall sample represented multiple identities or had pronouns and gender that could not be determined.
Broken down into campus coverage, 50% of sources used he/him pronouns, 43.1% used she/her, 2.8% used they/them, 2.8% used she/they and 1.4% used he/they. On the city side, 57.3% used he/him, 30% used she/her, 0.9% used they/them, 2.7% used she/they, 0.9% used he/they and 8.2% remained unknown.
For race and ethnicity, the overall source distribution was as follows: 38.2% white, 12.1% Latine or Hispanic, 14.3% Asian or Asian American, 13.6% Black, African or African American, 2.5% Middle Eastern, 1.1% Indigenous or Native American and 2.1% mixed race. The racial and ethnic backgrounds of 17.9% of overall sources were unknown.
The distribution for campus coverage is very similar to the overall sample, with 39.6% white sources, 14.6% Latine or Hispanic, 13.9% Asian or Asian American, 12.5% Black, African or African American, 4.2% Middle Eastern, 2.1% Indigenous or Native American and 0.7% mixed race. The race and ethnicity of 13.2% of campus sources were unknown.
Drawing broader conclusions from the small fall 2022 sample would risk misrepresenting the paper’s actual sources for that time period. However, comparing the larger spring 2023 sample to the demographics of the nearest census for Berkeley helps put the diversity of Daily Cal coverage into perspective.
The July 2023 census for Berkeley only outlined the demographics for ages of persons under 5 years old, under 18 years old and 65 years or older. These percentages were recorded as 3.2%, 12.2% and 15.9%, respectively.
For Daily Cal sourcing, articles that had the most diversity in age for spring 2023 covered Berkeley City Council’s discussion of affordable housing, the SPUR program monitoring biodiversity, an obituary to a UC Berkeley statistics professor emeritus and a discussion on campus active-shooter preparedness.
The July 2023 census for Berkeley did not have a category specific to gender identity and only had a category for sex, for which 51.1% of respondents were female.
In terms of Daily Cal coverage that did help bring in different gendered perspectives at the paper, some highlights include the articles featuring understaffing of UC Berkeley’s Disabled Students’ Program and UC Berkeley Library’s awarded grants in arts and humanities.
In the July 2023 census, the city of Berkeley had 51.9% of its population identify as white (with no Hispanic or Latino background), 20.8% identify as Asian, 12.1% as Hispanic or Latine, 7.8% as Black or African American, 9.8% as mixed race, 0.7% as Native American or Alaska Native and 0.2% as Native Hawaiian or Pacific Islander.
This means that, compared to Berkeley’s 2023 population, the 280 sources from spring 2023 Daily Cal coverage often included a larger share of sources of color than the census recorded. However, because of the large percentage of sources with unknown racial and ethnic backgrounds, the diversity may not be as significant as the known backgrounds suggest.
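One way to make this comparison concrete is to subtract each census share from the matching share in the spring 2023 audit. The sketch below is an illustration only: the category pairings are an assumption, since the audit and the census define groups differently (for example, the census white category excludes people of Hispanic or Latino background). The name `percentage_point_gaps` is hypothetical.

```python
def percentage_point_gaps(audit, census):
    """Audit share minus census share, in percentage points, per shared category."""
    return {
        group: round(audit[group] - census[group], 1)
        for group in audit
        if group in census
    }


# Overall spring 2023 audit shares, as reported above (percent of 280 sources).
spring_2023_audit = {
    "white": 38.2,
    "Latine or Hispanic": 12.1,
    "Asian": 14.3,
    "Black": 13.6,
    "mixed race": 2.1,
    "Native American": 1.1,
}

# Census shares for Berkeley, as reported above (percent of population).
berkeley_census = {
    "white": 51.9,
    "Latine or Hispanic": 12.1,
    "Asian": 20.8,
    "Black": 7.8,
    "mixed race": 9.8,
    "Native American": 0.7,
}

gaps = percentage_point_gaps(spring_2023_audit, berkeley_census)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.1f} points")
```

A positive gap means a group makes up a larger share of audited sources than of the census population. Because 17.9% of sources had unknown racial and ethnic backgrounds, these gaps should be read as rough indicators rather than precise measurements.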
While the paper is continually improving how it represents the Berkeley community, the presence of perspectives often excluded in journalism signals a culture that is working toward cultivating diversity in racial and ethnic representation.
Some examples of coverage that contributed to racial and ethnic diversity in spring 2023 coverage include the articles on UC Berkeley’s 13th annual TEDx conference, closure of space for the Latinx Student Resource Center, tenant advocacy for Alameda County, Berkeley Forum’s spring 2023 speakers and Berkeley City Council plans for affordable housing.
In terms of methodology, the static audit seemed to be more effective for ensuring accurate representation of the number of sources the paper engages with. However, this version of the audit did leave some background information unknown because there was no enforced requirement for reporters to collect information on the demographic background of sources.
If the static audit were to be implemented once more for the paper, it would benefit from including a training that instructs reporters to have questions regarding pronouns as standard practice. This would help minimize the likelihood of misidentifying sources. A similar practice could be integrated to ask for other details (such as age and race or ethnicity) that would help produce a more informative audit. Furthermore, larger sample sizes of audited content can help lessen the potential for nonresponse bias.
One downside that must be noted is the inability of the static audit to consider other facets of identity, such as disabilities, socioeconomic status, first-generation individuals, international individuals, student parents, reentry students and other identifications. For this reason, the audit works best when it is viewed as only a snippet of the greater conversations that must be had regarding coverage and community representation in the newsroom.
I would like to thank Arfa Momin and Nikki Iyer for producing the data visualizations for these audits. I would also like to thank Katherine Shok, the former editor in chief, for helping introduce the idea of a source auditing system to the Daily Cal. Finally, I want to thank the DEI chair Michael Temprano for advocating on behalf of the audit during staff conversations.
This project was developed by the Projects Department at The Daily Californian.
Data for this project come from an audit of the paper’s sources in fall 2022 and spring 2023.
Questions, comments or corrections? Email projects@dailycal.org.
Code, data and text are open-source on GitHub.
Copyright © 2022 The Daily Californian, The Independent Berkeley Student Publishing Co., Inc.