Abstract
Empirical philosophers of science aim to base their philosophical theories on observations of scientific practice. But since there is far too much science to observe it all, how can we form and test hypotheses about science that are sufficiently rigorous and broad in scope, while avoiding the pitfalls of bias and subjectivity in our methods? Part of the answer, we claim, lies in the computational tools of the digital humanities (DH), which allow us to analyze large volumes of scientific literature. Here we advocate for the use of these methods by addressing a number of large-scale, justificatory concerns, specifically about the epistemic value of journal articles as evidence for what happens elsewhere in science, and about the ability of DH tools to extract this evidence. Far from ignoring the gap between scientific literature and the rest of scientific practice, effective use of DH tools requires critical reflection about these relationships.