"Hypothesis formation is more than a banal step in the scientific method. Prior beliefs can influence study results by informing how studies are designed and whether their findings are interpreted as spurious or true. Outcomes from observational studies are particularly vulnerable to being biased by authors’ prior beliefs [1] via mechanisms ranging from subconscious priming [2] to overt p-hacking [3]. This may explain why researchers invested in certain study outcomes may be more likely to find their sought-after associations [4]. Given this, best-practice guidelines for reporting observational research recommend describing authors’ a priori hypotheses [5]. Thus, we sought to characterize the frequency of explicitly stated hypotheses in articles across major general medicine journals.
We conducted a repeated cross-sectional analysis of studies in four general medicine journals (JAMA, Annals of Internal Medicine, The BMJ, The New England Journal of Medicine) published in 1999 and 2019. Observational research published as original articles or brief reports was included. Data extracted from each article included the publication year, author degrees, presence or absence of an explicitly stated hypothesis, and the direction of the primary hypotheses and study findings (association, no association, or other, including mixed or unclear directionality of findings). The primary outcome of interest, the presence or absence of a hypothesis, was defined as an explicit written statement, made anywhere in the article, of the authors’ prior belief about the direction of the study’s primary outcome. Statements were considered explicit if they conveyed the authors’ prior beliefs regarding the primary outcome in the study (e.g., “we postulated”) but not if they simply described the standard setup of hypothesis tests for statistical analyses.
One of three authors (AC, AZ, SR) extracted the data from the four journals, and a fourth author (JN) reviewed 10% of papers to assess agreement on the presence or absence of a hypothesis. Agreement on this coding was substantial (kappa 0.88, 95% CI 0.76–0.99).
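For readers unfamiliar with the kappa statistic, the following is a minimal sketch of how Cohen's kappa is computed for two raters making a yes/no judgment. It is not the authors' code (their analyses were run in R), the function name cohen_kappa is an assumption made for illustration, and the counts are hypothetical because the underlying agreement table is not reported.

```python
def cohen_kappa(both_yes, a_yes_b_no, a_no_b_yes, both_no):
    """Cohen's kappa for two raters and a binary (yes/no) rating."""
    n = both_yes + a_yes_b_no + a_no_b_yes + both_no

    # Observed agreement: share of papers where both raters gave the same code.
    observed = (both_yes + both_no) / n

    # Chance agreement, from each rater's marginal rate of coding "yes".
    a_yes = (both_yes + a_yes_b_no) / n
    b_yes = (both_yes + a_no_b_yes) / n
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)

    # Kappa rescales observed agreement so that 0 is chance and 1 is perfect.
    return (observed - expected) / (1 - expected)


# HYPOTHETICAL counts for 80 double-coded papers (not the study's data).
kappa = cohen_kappa(both_yes=10, a_yes_b_no=1, a_no_b_yes=1, both_no=68)
print(f"kappa = {kappa:.2f}")
```

Because a stated hypothesis is uncommon, two raters would agree on "absent" most of the time by chance alone; kappa corrects for this, which is why it is reported rather than raw percent agreement.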
Descriptive statistics and Pearson’s chi-squared tests were used to characterize and assess differences between categorical variables, respectively. Analyses were performed using R version 3.6.0.
Eight hundred twenty-three articles were reviewed. Of these, 495 (60%) reported associations, 93 (11%) reported no associations, and 235 (29%) reported mixed or unclear directionality of findings in the main analyses. One hundred eleven (13.5%) articles had a clearly stated hypothesis, of which 99 (89.2%) hypothesized finding associations in the main analyses. Articles with a hypothesis were more likely to report associations in the main analyses (76% vs 57%, p<0.01) and less likely to have mixed or unclear outcomes (11% vs 32%, p<0.01).
Articles published in 1999 and 2019 did not differ in the prevalence of reported hypotheses (12.8% and 14.6%, p=0.52). The presence of a PhD author was also not associated with a difference in hypothesis prevalence (Table 1). Although articles whose first or last authors held “other” degrees tended to have the lowest percentage with a hypothesis, these associations did not reach statistical significance (p=0.06 and p=0.46, respectively)."
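The comparison of association rates between articles with and without a stated hypothesis (76% vs 57%) can be illustrated with a Pearson chi-squared test on a 2x2 table. The sketch below is not the authors' analysis, which was run in R. The cell counts are approximations back-calculated from the rounded percentages and the group sizes (111 articles with a hypothesis, 712 without), so they do not sum exactly to the 495 articles reported as finding associations. The function name chi_squared_2x2 is an assumption, and no continuity correction is applied, since the excerpt does not say whether one was used.

```python
import math


def chi_squared_2x2(table):
    """Pearson chi-squared test (no continuity correction) for a 2x2 table.

    Returns the test statistic and the p-value for 1 degree of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected

    # With 1 degree of freedom, the chi-squared upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# APPROXIMATE counts reconstructed from rounded percentages (an assumption):
# rows = hypothesis stated / not stated
# columns = reported an association / did not
table = [
    [84, 27],    # about 76% of 111 articles
    [406, 306],  # about 57% of 712 articles
]

statistic, p_value = chi_squared_2x2(table)
print(f"chi-squared = {statistic:.2f}, p = {p_value:.4f}")
```

With these approximate counts the p-value falls well below 0.01, in line with the reported result. The exact statistic depends on the true cell counts, which the excerpt does not give.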
Read more in the Journal of General Internal Medicine.