I only glanced at the paper, but I wonder how much of this is explained by just random chance?
It looks like they used multiple-choice quizzes to measure both science knowledge and the propensity to respond "don't know", which they treat as an indicator of (low) confidence. Any "don't know" response was counted as incorrect, while a correct guess increased the participant's "science knowledge" score.
Thus, a willingness to guess at random on the multiple-choice test would both increase measured "science knowledge" and make the participant appear overconfident.
I mean, the data modeling assumed that people who guessed were doing so completely at random, without eliminating any options (see the section "Simulation").
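A back-of-the-envelope simulation of that concern (a sketch only: the question count, the four-option format, and the scoring rule here are my assumptions, not necessarily the paper's). It compares a "guesser" who always picks an option when unsure against a "hedger" who answers "don't know":

```python
import random

# Hypothetical quiz parameters; the paper's actual format may differ.
N_QUESTIONS = 20
N_OPTIONS = 4        # answer choices per question
P_KNOWN = 0.5        # fraction of questions a participant genuinely knows
TRIALS = 10_000

def take_quiz(always_guess: bool) -> tuple[int, int]:
    """Return (knowledge score, confident-but-wrong count) for one quiz."""
    correct, confident_wrong = 0, 0
    for _ in range(N_QUESTIONS):
        if random.random() < P_KNOWN:             # question genuinely known
            correct += 1
        elif always_guess:                        # unknown: pick uniformly at random
            if random.random() < 1 / N_OPTIONS:
                correct += 1                      # lucky guess inflates "knowledge"
            else:
                confident_wrong += 1              # wrong guess reads as overconfidence
        # hedger answers "don't know": scored as incorrect, never overconfident
    return correct, confident_wrong

for label, guesses in (("guesser", True), ("hedger", False)):
    runs = [take_quiz(guesses) for _ in range(TRIALS)]
    knowledge = sum(r[0] for r in runs) / TRIALS
    overconf = sum(r[1] for r in runs) / TRIALS
    print(f"{label}: knowledge {knowledge:.1f}/{N_QUESTIONS}, "
          f"confident-but-wrong {overconf:.1f}")
```

With these numbers the guesser averages about 12.5/20 versus the hedger's 10/20, while also racking up about 7.5 confident-but-wrong answers: higher measured knowledge and higher apparent overconfidence from identical underlying knowledge.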
If I'm looking at the right document, one question asked which city out of Chicago, New York, and LA has the greatest annual temperature range (accompanied by a plot).
Almost all respondents said New York or Chicago, rather than LA or "All equal".
As research slowly migrates to independent open-access journals and conferences, I wonder if there is an opportunity to coordinate (and incentivize) the peer review process in a decentralized manner.
Maybe a digital currency that's created by reproducing (or disproving) existing papers (weighted by lack of reproduction), and with which the author of a submitted manuscript "pays" reviewers and editors, who in turn use it to submit their own papers?
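As a toy sketch of how such a ledger might operate (every class name, number, and rule below is invented for illustration; honestly verifying a claimed replication is the hard part this glosses over):

```python
from collections import defaultdict

# Hypothetical economic parameters, for illustration only.
REPLICATION_REWARD = 3   # max credit minted for reproducing (or disproving) a paper
SUBMISSION_FEE = 5       # credit an author spends to submit a manuscript

class ReviewLedger:
    """Toy credit ledger: replications mint credit, submissions spend it."""

    def __init__(self):
        self.balances = defaultdict(int)
        self.replications = defaultdict(int)   # paper id -> replication count

    def record_replication(self, researcher: str, paper: str) -> None:
        # Weight the reward by how few replications the paper already has,
        # i.e. the "weighted by lack of reproduction" idea above.
        reward = max(1, REPLICATION_REWARD - self.replications[paper])
        self.replications[paper] += 1
        self.balances[researcher] += reward

    def submit_paper(self, author: str, reviewers: list[str]) -> bool:
        if self.balances[author] < SUBMISSION_FEE:
            return False   # must replicate others' work before submitting
        self.balances[author] -= SUBMISSION_FEE
        share, remainder = divmod(SUBMISSION_FEE, len(reviewers))
        for reviewer in reviewers:               # fee flows to the reviewers,
            self.balances[reviewer] += share     # who can spend it on their own papers
        self.balances[reviewers[0]] += remainder
        return True

ledger = ReviewLedger()
ledger.record_replication("alice", "smith2019")        # alice mints 3 credits
ledger.record_replication("alice", "jones2021")        # and 3 more
print(ledger.submit_paper("alice", ["bob", "carol"]))  # True; fee split among reviewers
print(dict(ledger.balances))                           # {'alice': 1, 'bob': 3, 'carol': 2}
```

The interesting design question, which this toy version dodges entirely, is who attests that a replication actually happened and was sound.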
The coin of science is truth, and history has shown repeatedly that it responds quite poorly when any other goal is incentivised.
That includes "publish or perish" (promotes low-value journal-milling), impact factor (ditto), various forms of degree-conferring and/or tenure-gating (journals become gatekeepers to disciplines and careers, and can feed influence back to senior faculty and/or administrators to enshrine that status), and more.
Schopenhauer's essay on authorship is strongly recommended.
Peer review, for all the awe it inspires presently, is quite modern as a significant factor, dating to the 1980s.
Huh, I didn't know that peer review was that recent.
For all of its issues, I still think there would be some value in encouraging people to find flaws in papers and note them publicly, instead of relying on an oral history of which papers are accurate (hence my suggestion to reward reproducing or disproving papers while slightly penalizing the submission of new research).
Many fields have reproducibility issues, and my experience has been that research groups informally build their own internal lists of (seemingly) inaccurate publications.
It would be naive to believe that even the most devoted individual would opt to starve rather than pivot their work to eat. As it stands, there isn't much of an incentive, other than "truth" or general political reputation management, to document the reproducibility of others' work.
Erroneous works make it difficult to find relevant, accurate ones.
There's an interesting sociology and economics of science and academia which could use more attention and ... research. This includes why and how specialisation emerges, and why narrowly-scoped study, scholarship, and instruction win out over synoptic, integrative, systemic, or "generalist" approaches. The problems are not especially new or specific to our time. Yes, there were in the past scientific generalists who studied or contributed to a large number of fields, but there have also always been those who guarded their own domains, not necessarily scientific, against such approaches: jealousies between science and religion, law, medicine, and the crafts or (technical) arts, for example.
I think that trend is more generally human and speaks to tribalistic tendencies, which themselves emerge for numerous and complex reasons. The innate complexity of fields, the difficulty of judging or assessing new contributions, and an inherent conservatism focused on what is rather than what might be, on preserving the established order against change, seem like major components.
I'll note, just for the record, that in the computer and technical field, as I age and (hopefully) gain experience, I increasingly find myself in the more conservative camp.
On compensation and reward: it's been observed in education, where much has been made of metrics focused on individual teachers and students, that those indicators seem to hold very little validity. The strongest evidence comes from teachers working at multiple institutions, either simultaneously or transferring within a given year: in those datasets, there's no correlation between a teacher's measured performance at the two institutions.
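For what it's worth, that kind of check is simple to state; a sketch with synthetic data (the numbers below are randomly generated for illustration, not drawn from any real study):

```python
import random
import statistics

random.seed(0)
# Hypothetical value-added scores for 200 teachers, each measured at two
# institutions. Under the finding described above, the two measurements
# share no teacher-specific signal, so we model them as independent noise.
scores_at_a = [random.gauss(0, 1) for _ in range(200)]
scores_at_b = [random.gauss(0, 1) for _ in range(200)]

r = statistics.correlation(scores_at_a, scores_at_b)  # Pearson r (Python 3.10+)
print(f"cross-institution correlation: r = {r:.3f}")  # hovers near zero
```

A metric that actually measured a stable property of the teacher would show r well above zero across institutions.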
I'd argue similarly that attempting to reward or penalise researchers based on individual performance or papers is ... a false metricism; that true assessments of value are difficult, and that most metrics used (papers published and patents issued, for example) are meaningless. There's little agreement even on what, say, the ten most significant inventions of all time have been. (And were any of those developed in the past 100 years?)
Cultivating a research cadre, or sets of cadres; ensuring that they are supported; testing where appropriate against measurable criteria, while allowing for a high level of fuzziness and uncertainty in those metrics; and looking more to the aggregate performance of teams, institutions, disciplines, and the like, might be a better approach. Markets work poorly here. Institutions and patronage in its various forms (including national support) seem far better suited.
And yes, revisiting and testing of recent and older results should be a constant part of that process.