METRICS International Seminar

Stefan Schandelmaier (Dec 17)

A new library of methods guidance and other plans to improve knowledge translation from methods developers and meta-researchers to primary researchers


Meta-research and systematic reviews document the use of inappropriate and outdated research methods in primary health research. Methods developers express frustration about the poor uptake of recommendations made in research papers and textbooks. In his presentation, Stefan will argue why better knowledge translation may have great potential to improve the quality of primary research. Using the example of subgroup analysis, he will demonstrate the remarkable inefficiency of methods papers in influencing research practice and discuss potential causes. A comparison between clinical and methodological research will reveal a lack of knowledge translation structures. Stefan will present potential solutions including plans to develop a methods library, discuss the different types of methodology guidance, the need to clarify terminology, and ways to increase the quality and usefulness of methodology guidelines. Critical comments, ideas for alternative solutions, and any other input will be most welcome!


Stefan is a broadly interested methodologist currently working at Basel University in Switzerland. After graduating in medicine and classical music from the University of Freiburg in Germany, he developed an interest in research methodology and joined a group of researchers trained in clinical epidemiology at the University of Basel. Stefan was involved in social insurance medicine, systematic reviews, and meta-research before he completed a PhD in Health Research Methods at McMaster University. During his PhD, Stefan addressed issues related to subgroup analyses with the main publication being the Instrument to assess the Credibility of Effect Modification Analyses (ICEMAN). Other areas of methodological expertise are clinical trials and meta-analysis, early stopping of trials, patient recruitment, reporting issues, and methods for interpreting patient-reported outcomes. Stefan's new focus is finding ways to improve knowledge translation from methodologists to research practitioners.


Terry Klassen (Dec 10)

RCTs in Child Health: Frequentist approaches failing, is the future Bayesian analyses?


We have been following child health RCTs every five years using a random sample of RCTs starting in 2007. Initially we focused on how well designed and implemented the trials were, using the Cochrane Risk of Bias tool. With recent evidence that p-hacking and publication bias are a serious issue in many RCTs, we were curious what was happening in child health. We used our samples of 300 RCTs from each of 2007 and 2017 to examine to what extent p-hacking and publication bias were potentially distorting the evidence. We reflect on whether we need a cultural shift or a turn to Bayesian analyses to address the issues identified.


Terry Klassen is a Pediatric Emergency Physician and Clinician Scientist at the Children’s Hospital Research Institute of Manitoba (CHRIM) and University of Manitoba in Winnipeg. He is the CEO and Scientific Director of CHRIM. He is a Tier 1 Canada Research Chair in Clinical Trials. He is focused on the design and conduct of randomized controlled trials and systematic reviews to improve the outcomes of acutely ill and injured children presenting to the emergency department. He is working to improve the methods and conduct of research in this area.


Katrin Auspurg and Alexander Tekles (Dec 3)

Do male researchers disregard the work of female researchers? The role of gender in citation decisions


Katrin Auspurg holds a full professorship in sociology, specializing in quantitative empirical research, at the Department of Sociology at the Ludwig-Maximilians-University (LMU) Munich, Germany. She is currently speaker of the LMU Open Science Center. Her research interests include incentive structures and gender inequalities in science, and how to explore risk factors and evidence for publication bias in observational social science studies. She also investigates how inequalities in the labor market and the family intersect. Recent publications examine possible bias caused by sampling strategies (Research on Social Stratification and Mobility), the fairness of earnings and gender status beliefs (American Sociological Review), ethnic discrimination on housing markets, and possible publication bias in this research field (Journal of Ethnic and Migration Studies).

Alexander Tekles is a PhD candidate at the Department of Sociology at the Ludwig-Maximilians-University Munich, Germany and the headquarters of the Max-Planck-Society. His doctoral research centres on gender inequalities in science. He has also been doing research on the disambiguation of author names in bibliometric databases and the validation of bibliometric indicators.


Despina Koletsi (Nov 19)

GRADE…ing Quality of the Evidence. Where do we stand?


The effectiveness of health care interventions should be supported by good evidence so as to be useful and beneficial for clinical decision making. The introduction of the Grading of Recommendations, Assessment, Development and Evaluations (GRADE) system has offered the potential to rank the quality of the evidence from systematic reviews and meta-analyses in a less subjective way, while it has become evident that high-quality evidence is generally scarce. Therefore, initiatives to identify medical interventions that are supported by high/low quality of the evidence, or are in need of update, are considered crucial for fostering actions that may guide clinical research and enhance research design, conduct, analysis and dissemination, for informed decisions in practice. The aim of this presentation is to map the evolution of evidence quality in biomedical research. In addition, the current state of evidence in oral health and orthodontic research will be presented.


Despina Koletsi is currently Senior Scientist/Lecturer at the University of Zurich, Switzerland. She qualified from the School of Dentistry, University of Athens and received her orthodontic training from the same University, where she completed a Master of Science in Orthodontics. She earned a Doctorate from the University of Bonn and a second Master’s Degree in Epidemiology from the London School of Hygiene and Tropical Medicine, University of London. She also completed the Postgraduate Program in Learning and Teaching in Higher Education of the University of London and obtained the University of London Worldwide Recognized Tutor Status. She has been in private orthodontic practice in Athens, Greece, since 2012. Her research interests and areas of expertise focus on orthodontics, oral health, epidemiology, research methodology, and meta-epidemiology. She is a member of the editorial board of 5 peer-reviewed journals and serves as an Independent Reviewer for 30 international journals in Orthodontics, Paediatric Dentistry, Dentistry, Medicine, Epidemiology and Evidence-based research.


Andreas Schneck (Nov 12)

Are most published research findings false? Trends in statistical power, publication bias and p-hacking as well as the false discovery rate in psychology (1975–2017)


The replicability crisis (or cases of fraud) challenges scientific integrity and not only results in a loss of trust by society; it may also lead to wrong or even harmful policy or medical decisions. The question is: how reliable are scientific results, and how does this reliability develop over time? Based on 39,324 papers in psychology published between 1975 and 2017, this presentation examines three measures of scientific integrity empirically: statistical power, publication bias and p-hacking, and the false discovery rate. In sum, at 44.4%, the mean statistical power was too low, and publication bias and p-hacking were substantial; only statistical power showed a slight upward trend over time. Taken together, the false discovery rate of 31.9% indicates that about one-third of all statistically significant findings were statistical artifacts rather than substantive results.
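As a rough illustration of how statistical power feeds into a false discovery rate, the sketch below uses the textbook relationship between power, the significance threshold, and the share of tested hypotheses that are true. It is not the method used in the study: the abstract does not say how the 31.9% figure was derived, and this simplified formula ignores publication bias and p-hacking. The function name, the 0.05 threshold, and the prior values are all assumptions chosen for illustration.

```python
def false_discovery_rate(power: float, alpha: float, prior: float) -> float:
    """Expected share of significant results that are false positives.

    power: probability of detecting a true effect (1 - beta)
    alpha: significance threshold (type I error rate)
    prior: assumed share of tested hypotheses that are true (hypothetical input)
    """
    false_positives = alpha * (1.0 - prior)  # true nulls that reach significance
    true_positives = power * prior           # true effects that reach significance
    return false_positives / (false_positives + true_positives)


if __name__ == "__main__":
    # Mean power of 44.4% is taken from the abstract; alpha = 0.05 and the
    # priors below are illustrative assumptions, not values from the study.
    for prior in (0.1, 0.25, 0.5):
        fdr = false_discovery_rate(power=0.444, alpha=0.05, prior=prior)
        print(f"prior={prior:.2f} -> FDR={fdr:.3f}")
```

Under this simplified model, the false discovery rate rises as power falls or as the share of true hypotheses shrinks, which is why low average power is a concern even before selective reporting is considered.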


Andreas Schneck is a postdoctoral researcher at the LMU Munich (Department of Sociology). He has been doing meta-research on publication bias as well as research on the scientific reward system and social inequalities.


Janneke van 't Hooft (Nov 5)

The road towards more Useful randomised controlled trials (RCTs). Preterm birth RCTs as an example.


Clinical research can provide knowledge that facilitates medical decision making and therefore helps patients and societies by increasing health, reducing harm and targeting an efficient use of available resources. It is absolutely critical that the knowledge derived from clinical research is meaningful and useful, so that by itself, or when integrated with other studies in evidence synthesis, decision models or guidelines, it leads to favourable health outcomes. Unfortunately, evidence suggests that a high proportion of research is not useful or may even provide false claims. Ineffective research practices - a problem leading to ‘research waste’ - are estimated to consume as much as 85% of the global medical research budget.

To target the gap between clinical research and its impact on clinical decision making, research should have solid methodology to increase its validity and truthfulness (e.g. low risk of bias, proper reporting and proper use of statistics). Many collaborative initiatives have developed checklists to address these issues, like the CONSORT checklist for reporting and the Cochrane Risk of Bias tool. However, reporting thoroughness and bias appraisal do not capture whether this research is useful or not. There are more aspects to consider in order to declare usefulness. A paper published in 2016 by Ioannidis, ‘Why most clinical research is not useful’, elaborated on eight features that can characterize useful research. Those include problem base (“is there a health problem that is big/important enough to fix?”), context placement (“has prior evidence been systematically assessed to inform (the need for) new studies?”), information gain (“is the proposed study large and long enough to be sufficiently informative?”), pragmatism (“does the research reflect real life?”), patient centeredness (“does the research reflect top patient priorities?”), value for money (“is the research worth the money?”), feasibility (“can this research be done?”), and transparency indicators (“are methods, data, and analysis verifiable and unbiased?”).

In this talk I will present the results of operationalizing the eight previously proposed usefulness criteria for clinical research and applying them to 350 randomised controlled trials (RCTs) in preterm birth prevention.


Janneke van 't Hooft, MD, PhD, holds a Master's in Clinical Epidemiology and is currently a specialist in training at the Department of Obstetrics and Gynaecology at the Amsterdam UMC, University of Amsterdam, The Netherlands. In 2019-2020 she worked as a Postdoc at the Meta Research and Innovation Center at Stanford (METRICS). She combines her clinical work with research in which she supervises several PhD students. Her previous research focused on improving evaluation research in obstetrics and gynaecology by coordinating the development of core outcome sets and setting (international) research priorities in obstetrics and gynaecology.


Robert J MacCoun (Oct 29)

p-Hacking: A Strategic Analysis


The phenomenon of p-hacking occurs when researchers engage in questionable practices that enable them to report findings as being statistically significant. I offer four models of p-hacking behavior – unconditional, strategic, greedy, and restrained – and explore the implications of each model. I then discuss the implications of recent reforms (routine replication, pre-registration, and blinded data analysis) with respect to these models.


Robert MacCoun is a social psychologist and public policy analyst who has published numerous studies on a variety of topics, including illicit drug use, drug policy, judgment and decision-making, citizens’ assessments of fairness in the courts, social influence processes, and bias in the use and interpretation of research evidence by scientists, journalists and citizens. A preeminent scholar working at the border of law and psychology, his scholarship involves a mix of experimental and non-experimental empirical research as well as computational modeling and simulation.

MacCoun’s recent publications include “Hide Results to Seek the Truth,” Nature (2015), “Half Baked: The Retail Promotion of Marijuana Edibles,” New England Journal of Medicine (2015), and “The Burden of Social Proof: Shared Thresholds and Social Influence,” Psychological Review (2012).

MacCoun’s book with Peter Reuter, Drug War Heresies (Cambridge, 2001) is considered a landmark scholarly analysis of the drug legalization debate. MacCoun has also written extensively on the military’s “Don’t Ask, Don’t Tell” policy. His publications and expert testimony on military unit cohesion were influential in the 1993 and 2010 policy debates about allowing gays and lesbians to serve openly in the US military.

Prior to joining SLS in 2014, MacCoun was a member of the faculties of the Law School and the Goldman School of Public Policy at UC Berkeley. He has been a visiting professor at Stanford Law School and at Princeton’s Woodrow Wilson School. From 1986 to 1993 he was a behavioral scientist at RAND Corporation, where he served as a staff member at the Institute for Civil Justice and the Drug Policy Research Center as well as a faculty member at the RAND Graduate School of Policy Studies.

MacCoun also holds a joint appointment as a Senior Fellow at the Freeman Spogli Institute at Stanford University and a courtesy appointment in the Department of Psychology.


Atle Fretheim (Oct 22)

Randomized trial of school closures in Norway (spoiler alert: it didn’t happen)


As Norway went into lockdown and closed all schools, researchers at the Norwegian Institute of Public Health planned for a randomized trial of school re-openings – to assess the impact of school closure on the spread of COVID-19, and other outcomes. The trial gained support among key policy makers, but was in the end dropped by the government. It was also rejected by the Ethical Committees. The primary investigator and driving force for the trial, Atle Fretheim, will present his story, which entails methodological, practical and political hurdles. His introduction will serve as a starting point for a discussion about a global problem: practically no randomized trials of population-scale infection control measures are being undertaken, anywhere. The Norwegian team came close, but failed – at least they have, so far. Is it realistic to call for RCTs in the area? Should we resort to observational approaches that are more practically feasible?


Atle Fretheim, MD, is a Research Director at the Norwegian Institute of Public Health and Adjunct Professor at Oslo Metropolitan University. He has been doing health services and policy research over the last 20 years, including impact evaluations (e.g. cluster trials) and systematic reviews. He heads the Centre for Informed Health Choices at the Norwegian Institute of Public Health. In 2011-12 he spent a year with the Dept. of Population Medicine, Harvard Medical School, as a Harkness Fellow in Health Policy and Practice.


Hank Greely (Oct 15)

CRISPR Babies: Assessing Human Germline Genome Editing


In late November 2018, word leaked, then spewed, out that two embryos that He Jiankui had edited using CRISPR had become baby girls. The ensuing controversy engendered continuing factual uncertainties and ethical and political controversies. In this talk, adapted from my forthcoming book, CRISPR People: The Science and Ethics of Editing Humans (MIT Press, Feb. 2021), I will very briefly sketch what we think He Jiankui did, assess the ethics of his experiments, suggest some lessons of “Science” from it, and end with some reflections on the possible future role of “proven safe” human germline genome editing.


Henry T. (Hank) Greely is the Deane F. and Kate Edelman Johnson Professor of Law; Professor, by courtesy, of Genetics; and Director of the Center for Law and the Biosciences at Stanford University. He specializes in ethical, legal, and social issues arising from the biosciences. He is a founder and immediate past President of the International Neuroethics Society; chairs the California Advisory Committee on Human Stem Cell Research; chairs the Ethical, Legal, and Social Issues Committee of the Earth BioGenome Project; and currently serves on two National Academy of Sciences Committees, the first on Developing a Research Agenda and Research Governance Approaches for Climate Intervention Strategies that Reflect Sunlight to Cool Earth and the second on Ethical, Legal, and Regulatory Issues Associated with Neural Chimeras and Organoids. He serves on the NIH BRAIN Initiative’s Multi-Council Working Group and co-chairs the Initiative’s Neuroethics Work Group. He published THE END OF SEX AND THE FUTURE OF HUMAN REPRODUCTION in 2016. His next book, CRISPR PEOPLE: THE SCIENCE AND ETHICS OF EDITING HUMANS, will be published in February 2021.

Professor Greely graduated from Stanford in 1974 and Yale Law School in 1977.  He served as a law clerk for Judge John Minor Wisdom on the United States Court of Appeals for the Fifth Circuit and Justice Potter Stewart of the United States Supreme Court.  After working during the Carter Administration in the Departments of Defense and Energy, he entered private law practice in Los Angeles in 1981.  He joined the Stanford faculty in 1985.


Deborah Zarin (Oct 8)

Lack of Harmonization of Outcome Measures in COVID trials, and other Musings


Deborah A. Zarin, MD, is the Program Director of the Multi-Regional Clinical Trials Center of Brigham and Women’s Hospital, Advancing the Clinical Trials Enterprise. She was the Director of between 2005 and 2018.  In that capacity, she oversaw the world’s largest clinical trials registry, as well as the development and implementation of the first public database for summary clinical trial results. She also played a major role in the development and implementation of key legal and policy mandates for clinical trial reporting, including regulations under FDAAA (42 CFR Part 11) and the NIH trial reporting policy. Dr Zarin’s recent research has been on the quality of trial reporting, as well as issues in the design and analysis of clinical trials.

Previous positions held by Dr. Zarin include the Director, Technology Assessment Program, at the Agency for Healthcare Research and Quality, and the Director of the Practice Guidelines program at the American Psychiatric Association. In these positions, Dr. Zarin conducted systematic reviews and related analyses in support of evidence based clinical and policy recommendations.

Dr. Zarin graduated from Stanford University and received her doctorate in medicine from Harvard Medical School. She completed a clinical decision making fellowship and a pediatric internship, and is board certified in general psychiatry as well as in child and adolescent psychiatry.


Peter Gøtzsche (Oct 1)

Mental health survival kit and withdrawal from psychiatric drugs


This book will help people with mental health issues survive and come back to a normal life. The general public believes that drugs against depression and psychosis and admission to a psychiatric ward are more often harmful than beneficial, and this is also what the science shows. Even so, most people continue taking psychiatric drugs for many years. This is mainly because they have developed drug dependence. Psychiatrists and other doctors have made hundreds of millions of people dependent on psychiatric drugs and yet have done virtually nothing to find out how to help them come off them safely again, which can be very difficult. The book explains in detail how harmful psychiatric drugs are and tells people how they can withdraw safely from them. It also advises on how people with mental health issues may avoid becoming psychiatric “career” patients and losing 10 or 15 years of their life to psychiatry.

Pdf edition (214 pages): DKK150 (about €20). See Contents here.

Please either:

1) wire DKK150 using IBAN Account number DK5867710005437934 with swiftcode LAPNDKK1 (Laegernes Bank). Write your email address in the message field. If needed, the account holder is: Peter Gøtzsche, bank address: Dirch Passers Allé 76, DK-2000 Frederiksberg, or

2) go to my GoFundMe account, donate DKK150 and write to me via this homepage and give me your email address.

The book has also appeared in Danish (see just below), and will appear in Dutch, Greek, Icelandic, Portuguese, Spanish and Swedish.


Professor Peter C Gøtzsche graduated as a Master of Science in biology and chemistry in 1974 and as a physician in 1984. He is a specialist in internal medicine; worked with clinical trials and regulatory affairs in the drug industry 1975-1983, and at hospitals in Copenhagen 1984-95. Co-founded the Cochrane Collaboration (the founder is Sir Iain Chalmers), and established the Nordic Cochrane Centre in 1993. Became professor of Clinical Research Design and Analysis in 2010 at the University of Copenhagen and has been a member of the Cochrane Governing Board twice. Co-founded the Council for Evidence-based Psychiatry in the UK in 2014 and the International Institute for Psychiatric Drug Withdrawal in Sweden in 2016. Founded the Institute for Scientific Freedom in 2019. Currently works as researcher, lecturer, author and independent consultant, e.g. in lawsuits. Visiting professor, University of Newcastle.

Twitter: @PGtzsche1


Peter Grabitz, Maia Salholz-Hillel, Nicholas DeVito (Sept 10)

“Rapid Results Dissemination of Registered COVID-19 Clinical Trials”


Dissemination of clinical trial results is important for medical decision making. Public health emergencies, like the COVID-19 pandemic, further amplify the importance of complete and timely reporting of results. Normal results dissemination timelines expect results within 12 months of completion; however, the WHO suggests that this timeframe should be “greatly shortened” in an emergency context. The primary aim of this project is to assess the extent and timeliness of reporting for registered COVID-19 clinical trials as summary results on trial registries, preprints, and peer-reviewed publications. Using a combination of automated and manual search strategies, we will collect data linking results to trial registries at six-month intervals in order to provide ongoing information on results dissemination during the pandemic.

We will present an outline of our methods and some of our experiences working on this project to date. We hope this will lead to an interesting discussion about the issue of pandemic results dissemination and feedback on the project.

The study protocol is available at


Nicholas J. DeVito is a doctoral student and researcher at the DataLab and the Centre for Evidence-Based Medicine at the University of Oxford. His research focuses on topics in health policy, research integrity, and transparency.

Maia Salholz-Hillel is a doctoral student and researcher at the QUEST Center for Transforming Biomedical Research within the Berlin Institute of Health, Charité Universitätsmedizin Berlin. Her research focuses on metrics for biomedical research evaluation. She has a background in cognitive neuroscience and previously worked in science policy.

Peter Grabitz is a physician turned meta-researcher at the QUEST Center for Transforming Biomedical Research within the Berlin Institute of Health, Charité Universitätsmedizin Berlin. He works on transparency and integrity in the biomedical sciences.


Jelte Wicherts (Sept 3)

The continuing secrecy surrounding psychological research data


In this talk, I will address the question of why the majority of psychological researchers continue to publish irreproducible results by failing to share their data after publishing the results. I discuss recent meta-scientific findings related to scientific norms, p-hacking, privacy issues, data handling, and misconduct. I also discuss what individual researchers, academic institutions, funders, and journals can do to improve the reproducibility and transparency of social science research.


Jelte M. Wicherts is a meta-researcher, co-founder of the Meta-Research Center at Tilburg University, methodology professor, and head of the Department of Methodology and Statistics at the Tilburg School of Social and Behavioral Sciences. Among other things, he studies data sharing, statistical errors, publication bias, biases in analyses, statistical intuitions, misconduct, peer review, reproducibility, replicability, and psychometric models of intelligence and other psychological traits.


Mario Malički (Aug 27)

From amazing work to I beg to differ: analysis of bioRxiv preprints that received one public comment till September 2019


Despite 70% of preprint servers allowing users to post comments on their platforms, and researchers perceiving the possibility of receiving comments as one of the advantages of preprints compared to traditional publishing, no research (to the best of our knowledge) has examined the nature of comments or the actors involved in preprint commenting. We conducted a cross-sectional study of 1,983 bioRxiv preprints that received a single comment on the bioRxiv platform between May 21, 2015 (as the bioRxiv comment API does not provide comments posted before this date) and September 9, 2019 (study data collection date). More than two thirds of those comments were posted by non-authors (n=1,366, 69%) while the remainder were posted by the preprint’s authors (n=617, 31%). Twelve percent of non-authors’ comments (n=168) were full review reports resembling those traditionally submitted during the journal peer review process.


Mario Malički is a Postdoctoral Fellow at METRICS. After finishing the School of Medicine at the University of Zagreb, Croatia, he obtained an MA in Literature and Medicine at King’s College, London, UK, and then worked at the University of Split School of Medicine in the Departments of Medical Humanities and Research in Biomedicine and Health, where he obtained his PhD in Medical Ethics titled: Integrity of scientific publications in biomedicine. He has been researching authorship, peer review, duplicate publications, and publication bias. From 2017 to 2019 Mario was a postdoc at AMC and ASUS Amsterdam, Netherlands, and in 2020 he joined METRICS where he will focus on meta-research of preprints.


Nikolaos Pandis (Aug 20)

Who is running the show? The role of the industry in the practice of dentistry and a brief overview of evidence quality in dental research


This short presentation will consist of three parts. In the first part, I will attempt to discuss the role of the industry in the practice of dentistry. Secondly, I will provide a brief overview of the quality of the evidence in dental research considering aspects of design, conduct, analysis, and reporting. In the last part, I will outline the active implementation scheme for RCT reporting followed by the American Journal of Orthodontics and Dentofacial Orthopedics (AJODO).

Short CV

Associate professor: University of Bern, Switzerland
Private practice: Corfu, Greece
Degree of dental surgery: University of Athens, Greece
Orthodontic specialty certificate and MS: The Ohio State University, USA
Fellowship in craniofacial orthodontics: University of Texas, Dallas, USA
Dr. med dent in orthodontic biomechanics: University of Bonn, Germany
MSc in clinical trials: London School of Hygiene & Tropical Medicine, UK
PhD in Epidemiology: University of Ioannina, Greece
MS: Biostatistics, University of Hasselt, Belgium
Diplomate of the American Board of Orthodontics
External tutor for the MS in clinical trials: London School of Hygiene & Tropical Medicine, UK
Associate Editor in the AJODO: RCT handling and monthly column on research design methodology


David R Grimes (Aug 13)

Lies, Damned lies, and statistics - the influence of bad science on public understanding


Scientists and physicians rank amongst the most trusted professions in the world, and diligent research underpins crucial medical and societal advances. But scientists are far from infallible, and the consequences of dubious research echo far beyond the confines of academia. When this happens, the ramifications can be severe, and difficult to shake. To skew matters further, a long-standing and narrow-minded fixation on novel results has led to an explosion of dubious publications, and the rise of predatory publishers - all of which conspire to confound public understanding. And whether it is the furore over hydroxychloroquine or Andrew Wakefield's bloody legacy on vaccination rates, there is a substantial cost when questionable research prospers. To ascertain the damage this can cause, and predict how to circumvent it, understanding how research is understood by the public is crucial, as is understanding the actors that drive questionable data, from the anti-vaccine movement to alternative health advocates. In this talk, we'll explore how the proliferation of untrustworthy results not only affects research quality but also has a toxic influence on public trust and health policy - and how we may address it.


Dr David Robert Grimes is a physicist, cancer researcher, and author. He is an assistant professor at Dublin City University, and a visiting researcher at University of Oxford. His research focuses chiefly on the application of radiotherapy physics, and oxygen modelling, and on factors influencing public perception and understanding of science. He is also a science writer, frequently contributing to the Guardian, Irish Times, Sunday Business Post, and BBC on a wide spectrum of science, society and philosophical topics. He was joint recipient of the 2014 Nature / Sense about Science Maddox Prize for Standing Up for Science. His first book, "The Irrational Ape - Why Flawed Logic Puts Us All at Risk and How Critical Thinking Can Save the World", has just been published by Simon & Schuster.


Kevin Boyack (July 30)

A detailed open access model of the PubMed literature


Portfolio analysis is a fundamental practice of organizational leadership and is a necessary precursor of strategic planning. Successful application requires a highly detailed model of research options. We have constructed a model, the first of its kind, that accurately characterizes these options for the biomedical literature. The model comprises over 18 million PubMed documents from 1996-2019. Document relatedness was measured using a hybrid citation analysis + text similarity approach. The resulting 606.6 million document-to-document links were used to create 28,726 document clusters and an associated visual map. Clusters are characterized using metadata (e.g., phrases, MeSH) and over 20 indicators (e.g., funding, patent activity). The map and cluster-level data are embedded in Tableau to provide an interactive model enabling in-depth exploration of a research portfolio.


Kevin Boyack has been with SciTech Strategies since July 2007. Previously he worked at Sandia National Laboratories in areas of combustion, transport processes, socio-economic war gaming, and science mapping. His recent work and current interests include detailed mapping of the structure and dynamics of science and technology, accuracy of maps and classifications, merging of multiple data types and sources, identification and prediction of emerging topics, and development of advanced metrics.


Sally Cripps (July 23)

A Tale of > 2 Models


Forecasting models have been influential in shaping decision-making in the COVID-19 pandemic in two main areas. The first area is the use of prediction models for decision making in resource allocation, and the second is the use of models for decision making regarding the impact of non-pharmaceutical interventions (NPIs). However, there is concern that predictions and inference from these models may have been misleading. In this talk I will discuss findings regarding the accuracy of four prediction models for daily death counts in the state of New York in the early stages of the pandemic, as well as inference on the impact of non-pharmaceutical interventions on the effective reproduction number for COVID-19 in various European countries. Our conclusions are that models performed poorly in the prediction of daily deaths and need to be subject to pre-specified real-time performance tests before their results are provided to policy makers and public health officials. In addition, we conclude that different trajectories of the effective reproduction number give rise to the same trajectory of daily deaths, calling into question the effectiveness of NPIs such as lockdown in reducing the spread of COVID-19.


Sally Cripps is an internationally recognised scholar and Professor in Bayesian statistics, a Professor of Maths and Statistics at the University of Sydney and Director of the Centre for Translational Data Science (CTDS). She holds a bachelor’s degree in Chemical Engineering from the University of Sydney, an MBA from the University of Western Australia and a PhD in statistics from the University of NSW. She is also Chair of the International Society for Bayesian Analysis’s section Bayesian Education and Research in Practice.

Sally’s research focus is the development of novel probabilistic models which are motivated by the need to solve an applied problem with the potential for impact. She has particular expertise in the use of mixture models for complex phenomena, modelling longitudinal data, nonparametric regression, the spectral analysis of time series, and the construction of transition kernels in MCMC schemes which efficiently explore posterior distributions of interest.


Mario Malički (July 2)

How should we classify changes between manuscript versions?


The transparency and efficacy of peer review of scholarly manuscripts have received significant criticism in the last decade, which has led to an increase in studies of peer review. However, no standards have been proposed or implemented for reporting changes between submitted manuscript versions (S) and published versions of record (VoR), nor for changes between preprints (P) and S or VoR. I will present a proposal for mapping those changes, and the results of a pilot study I conducted. I would also like to hear your opinions on how (and if) you would like to see journals or researchers report what changes occurred in manuscripts due to peer review and publication.


Mario Malički is a Postdoctoral Fellow at METRICS. After finishing the School of Medicine at the University of Zagreb, Croatia, he obtained an MA in Literature and Medicine at King’s College, London, UK, and then worked at the University of Split School of Medicine in the Departments of Medical Humanities and Research in Biomedicine and Health, where he obtained his PhD in Medical Ethics titled: Integrity of scientific publications in biomedicine. He has been researching authorship, peer review, duplicate publications, and publication bias. From 2017 to 2019 Mario was a postdoc at AMC and ASUS Amsterdam, Netherlands, and in 2020 he joined METRICS, where he will focus on meta-research of preprints.

Vinay Prasad (June 18)

The lay of the land of cancer research


Vinay Prasad MD MPH is a practicing hematologist-oncologist and Associate Professor of Medicine at the University of California San Francisco. He studies cancer drugs, health policy, and evidence-based medicine. He is the author of over 240 academic articles and of the books Ending Medical Reversal (2015) and Malignant (2020). He hosts the oncology podcast Plenary Session. He tweets @VPrasadMDMPH.

Valentin Danchev (June 11)

Designing a Clinical-Data Marketplace to Accelerate Sharing and Reuse of Covid-19 Clinical Data and Beyond


The availability of individual participant data (IPD) from clinical trials is more vital now than ever for harnessing Covid-19 treatment discovery, reproducibility, and transparency. Yet sharing of clinical trial data is rare in general (Danchev, Min, Borghi, Baiocchi, Ioannidis, 2020), and appears even rarer among Covid-19 trials. For example, among the 1,324 Covid-19 trials registered as of May 11, 2020, only 144 plan to share IPD; the rest are either reluctant (n=572), undecided (n=266), or lack a data sharing statement (n=342).

How could we accelerate clinical trial data sharing? One approach is to mandate data sharing. Many proposals have been made, but these have so far faced considerable resistance. Another approach is to incentivize trialists and sponsors, e.g., by citing shared datasets. There exists, however, a considerable time lag between the sharing of a clinical trial dataset and the accumulation of potential credit (if any), making the approach less suitable for a pandemic response.

In this talk, I will outline a third approach that draws an intuitive analogy between data sharing and online marketplaces. Similarly to a pre-Airbnb lodging marketplace, in which many would host relatives and friends but not strangers, many investigators would still share data with coinvestigators and collaborators but not with scientists at large. In the pre-pandemic world, Airbnb transformed the lodging marketplace, making it possible for numerous unknown hosts and travelers to securely connect. I leverage multidisciplinary insights from diverse bodies of knowledge, ranging from network science and online marketplaces to meta-research and sociology of science, to prototype a clinical-data marketplace: a secure socio-technological infrastructure that could, if implemented, generate incentives for data sharing, integrate the fragmented landscape of clinical trial data, and ultimately accelerate Covid-19 clinical trial data sharing and reuse at scale. I will also outline key principles as well as (many) challenges of a clinical-data marketplace, and will discuss how it would differ from other networked marketplaces.


Valentin Danchev is a postdoctoral fellow at the Meta-Research Innovation Center at Stanford (METRICS). Valentin’s research is at the intersection of meta-research and computational sociology, with a special interest in social network analysis, open data and data sharing, and reward systems in science.


Chirag Patel (June 4)

Real-world meta-science to identify environmental disparities and non-genetic correlates of phenotypes

Disentangling how environmental factors lead to critical health disparities has been largely unfeasible to date. Most studies consider a single disease or environmental factor at a time, losing the holistic picture of trajectory to disease risk and multimorbidity. There are several potentially promising avenues to begin to shine light on the complex nature of disease. First, large administrative health and biobank-derived cohorts are a popular source of health data that enable more comprehensive analysis across “exposomes” (the totality of human environmental exposure) and “genomes”. Second, our ability to assay thousands to millions of phenotypes and exposures at large scale, such as exogenous and endogenous bacteria and viruses, is almost routine. How can we make these observational data and new measurements useful? In this talk, Chirag Patel will give a tour of the use of observational administrative and large-scale exposomic data to explain disease risk and the many challenges ahead.


Chirag Patel is an associate professor of biomedical informatics at Harvard Medical School where his group develops methods to elucidate the role of genetic and environmental factors in health disparities and disease risk. He attained a PhD in biomedical informatics at Stanford University.


Denes Szucs (May 28)

Sample size evolution in neuroimaging research


We evaluated 1038 of the most cited structural and functional (fMRI) magnetic resonance brain imaging papers (1161 studies) published during 1990-2012 and 273 papers (302 studies) published in top neuroimaging journals in 2017 and 2018. Of the highly cited experimental fMRI studies, 96% had a single group of participants, and these studies had a median sample size of 12; highly cited clinical fMRI studies (with patient participants) had a median sample size of 14.5, and clinical structural MRI studies had a median sample size of 50. The sample size of highly cited experimental fMRI studies increased at a rate of 0.74 participants per year, and this rate of increase was commensurate with the median sample sizes of neuroimaging studies published in top neuroimaging journals in 2017 (23 participants) and 2018 (24 participants). Only 4 of 131 papers in 2017 and 5 of 142 papers in 2018 had pre-study power calculations, most for single t-tests and correlations. Only 14% of highly cited papers reported the number of excluded participants, whereas about 45% of papers in 2017 and 2018 reported excluded participants. Targeted interventions from publishers and funders could facilitate an increase in sample sizes and adherence to better standards.



Dr Szucs is Reader in Cognitive Neuroscience and Psychology at the University of Cambridge, United Kingdom. He is deputy director of the Centre for Neuroscience in Education at the Department of Psychology in Cambridge and is an official fellow of Darwin College, Cambridge where he is also a graduate student tutor. Dr Szucs is Senior Fellow in the Science of Learning at the UNESCO, United Nations (from 2018). Dr Szucs has held various research grants from UK, European and USA funders. Dr Szucs was the recipient of the James S McDonnell Foundation Scholar Award in 2013 and was elected as a Fellow of the Association for Psychological Science (USA) in 2019.


Perrine Janiaud (May 21)

100 DAYS LATER - The worldwide clinical trial research agenda in the beginning of the COVID-19 pandemic: results from COVID-evidence


Never before have clinical trials received more public attention than those testing interventions for COVID-19. In this follow-up of the COVID-evidence project, Perrine Janiaud will present the characteristics and time trends of trials testing interventions to treat or prevent COVID-19 during the first 100 days of the pandemic (from December 30th, 2019 to April 9th, 2020).


Perrine is a postdoctoral fellow at METRICS. She completed her PhD in clinical epidemiology at Claude Bernard University in Lyon, France, on the rational use of available evidence before extrapolating the benefit-risk ratio from adults to children. Perrine’s research focuses on clinical trial methodology, with a strong interest in pragmatic trials, but also on the reporting and quality of the evidence used to inform decision making.


Ben W Mol (May 14)

Cost-effectiveness of liberal versus strict lock-down for COVID-19: are we using the correct metrics?


Objectives: To balance the costs and effects comparing a strict lock-down versus a flexible social distancing strategy for societies affected by COVID-19 disease.
Design: Cost-effectiveness analysis.
Participants: We used societal data and COVID-19 mortality rates from the public domain.
Interventions: The intervention was a strict lock-down strategy as followed by Denmark, Finland and Norway. The reference was flexible social distancing as currently applied in Sweden.
Outcomes: We derived mortality rates from COVID-19 statistics, assumed the expected life years lost from each COVID-19 death to be 11 years and calculated expected mortality rates with modelling. We compared scenarios where a vaccine would be available after 1 month, 6 months, 12 months or never.
The incremental financial costs of the intervention were calculated as the costs of closing kindergartens and schools for children <16 years and a reduction in revenues of hotels and restaurants of 40% and 60%. Calculations were projected per one million inhabitants.
Main outcome measure: Life years saved.
Results: In Sweden, the number of people dying with COVID-19 per million inhabitants was 1,290, resulting in 14,185 life years lost. In neighbouring countries with a strict lockdown strategy, the number of people dying with COVID-19 varied between 159 and 363 per million, resulting in 1,752 to 4,013 life years lost. The incremental costs to save 1 life year were between Euro 90,062 and Euro 111,094 for Norway and Denmark, respectively.
Conclusions: Comparisons of public health interventions for COVID-19 should take into account life years saved and not lost lives. A comparison between a strict lock-down and a more liberal policy of social distancing renders strict lock-down to be cost-effective when society is willing to pay Euro 100,000 per life year saved.
PS: Costs will be adjusted based on input from KPMG Sweden that we expect to get on Tuesday
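
The arithmetic behind the outcome measure can be sketched as follows. This is only an illustration, not the authors' model: the function names are hypothetical, and the only inputs taken from the abstract are the 11 life years lost per death and the deaths per million inhabitants. The products come out close to, but not identical with, the life-years figures reported in the abstract, which were presumably computed from unrounded mortality rates.

```python
# Assumption stated in the abstract: 11 expected life years lost per COVID-19 death.
LIFE_YEARS_PER_DEATH = 11


def life_years_lost(deaths_per_million, years_per_death=LIFE_YEARS_PER_DEATH):
    """Life years lost per million inhabitants."""
    return deaths_per_million * years_per_death


def cost_per_life_year_saved(incremental_cost, reference_lyl, intervention_lyl):
    """Incremental cost divided by life years saved versus the reference strategy.

    `incremental_cost` must come from a separate cost model; the abstract
    does not report it, so no value is assumed here.
    """
    saved = reference_lyl - intervention_lyl
    if saved <= 0:
        raise ValueError("the intervention saves no life years")
    return incremental_cost / saved


# Deaths per million inhabitants as quoted in the abstract.
sweden = life_years_lost(1290)          # reference: flexible social distancing
strict_low = life_years_lost(159)       # lowest rate among strict lock-down countries
strict_high = life_years_lost(363)      # highest rate among strict lock-down countries

print(sweden, strict_low, strict_high)
print(sweden - strict_high, sweden - strict_low)  # range of life years saved
```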

Ben W. Mol, Jonathan Karnon


Ben (Willem) Mol is Professor of Obstetrics and Gynaecology at Monash University. He is focused on the organisation of multi-centric evaluative research in Obstetrics, Gynaecology and Fertility. The research is focused mainly upon everyday practices. Born and trained in the Netherlands, Ben was instrumental in setting up a nationwide research consortium in women’s health that produced many landmark studies.

Ben came to Australia in 2014 and started at Monash in 2018. Since his arrival, Ben has held continuous NHMRC funding, including a prestigious investigator grant. During his time in Australia, he has developed extensive relations with Asian universities. Ben has mentored >100 PhD students and published >1200 papers, many in high impact journals. A publication in Nature acknowledged Ben as one of the most prolific medical scientists. His professional adage is ‘A day without randomisation is a day without progress.’



Cathrine Axfors (May 7)

Population-level COVID-19 mortality risk for non-elderly individuals


In this METRICS International Forum, I will present updated findings on the “Population-level COVID-19 mortality risk for non-elderly individuals overall and for non-elderly individuals without underlying diseases in pandemic epicenters” written together with John Ioannidis and Despina Contopoulos-Ioannidis (medRxiv preprint originally posted April 8, see attached update).

The update includes data from 11 European countries, Canada, and 12 US states, with >800 COVID-19 deaths as of April 24. We calculated the proportion of COVID-19 deaths in people <65 years old (overall and without underlying diseases); absolute risk of COVID-19 death as of May 1, 2020, in people <65 and in those ≥80 years old; and the absolute COVID-19 death risk expressed as equivalent of death risk from driving a motor vehicle. The results corroborate the differential risk of COVID-19 death depending on age, with a very low risk for COVID-19 death among the vast majority of the workforce, and provide a reference for risk estimates in terms of a ubiquitous everyday activity.

The presentation will make room for a meta-research discussion on COVID-19 mortality statistics. Given the time, we may also dwell on a Swedish perspective of the COVID-19 pandemic.


Cathrine Axfors is a Visiting Postdoctoral Scholar at METRICS and affiliated with the Department for Women’s and Children’s Health at Uppsala University in Sweden. She completed her MD training and her PhD (on trait anxiety in the peripartum period and its association with depression and health care use) at Uppsala University. Her main research interest is meta-research in biomedicine, in particular regarding register-based research, women’s health and physical exercise.


Stephan Bruns (Apr 30)

Estimating the extent of inflated significance in economics


Using a sample of 138,876 published p-values from 372 meta-analyses, we estimate the counterfactual distribution of p-values that would have occurred if all studies had estimated the respective genuine effects unbiasedly. Comparing factually observed p-values with counterfactual p-values suggests that 43% of all p-values are mislocated. A p-value is mislocated if it is published as being statistically significant but actually expected to be non-significant or published as being non-significant but actually expected to be significant. For p-values published as being statistically significant, about 55% are expected to be non-significant. Sensitivity analyses suggest that these numbers are fairly robust. Exploratory regression analysis indicates that p-values are more reliable in microeconomics compared to macroeconomics and more reliable the larger the share of adequately powered studies in the respective research field.
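
The definition of a mislocated p-value given above can be sketched in a few lines. This is an illustration only: the abstract does not say how the counterfactual p-values are estimated, so they are taken as given inputs here, and the 0.05 threshold, the function names, and the toy data are all assumptions.

```python
# Assumed significance threshold; the abstract does not state one.
ALPHA = 0.05


def is_mislocated(observed_p, counterfactual_p, alpha=ALPHA):
    """True if the published and the counterfactual p-value fall on
    different sides of the significance threshold."""
    return (observed_p < alpha) != (counterfactual_p < alpha)


def share_mislocated(pairs, alpha=ALPHA):
    """Share of (observed, counterfactual) pairs that are mislocated."""
    flags = [is_mislocated(obs, cf, alpha) for obs, cf in pairs]
    return sum(flags) / len(flags)


def share_significant_expected_nonsignificant(pairs, alpha=ALPHA):
    """Among p-values published as significant, the share expected to be
    non-significant under the counterfactual."""
    significant = [(obs, cf) for obs, cf in pairs if obs < alpha]
    flipped = [cf for _, cf in significant if cf >= alpha]
    return len(flipped) / len(significant)


# Toy data, invented purely for illustration: (observed p, counterfactual p).
toy_pairs = [(0.01, 0.20), (0.03, 0.01), (0.30, 0.40), (0.04, 0.06)]
print(share_mislocated(toy_pairs))
print(share_significant_expected_nonsignificant(toy_pairs))
```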


Stephan Bruns is an Assistant Professor at the Centre for Environmental Science at Hasselt University in Belgium and affiliated with the Econometrics and Statistics Group at the University of Göttingen in Germany. He obtained his PhD on the role of meta-analysis in economics from the Max-Planck-Institute of Economics. His main research interest is meta-research in economics and environmental science. 


Noah Haber and Sarah Wieten (Apr 23)

DAG With Omitted Objects Displayed (DAGWOOD): A method for confronting causal assumptions




Directed acyclic graphs (DAGs) are frequently used in epidemiology as a guide to assess causal inference assumptions. However, DAGs show the model as assumed, but not the assumption decisions themselves. We propose a framework and method which algorithmically reveals these hidden assumptions, both conceptually and graphically.

The DAGWOOD framework combines a root DAG (representing the DAG in the proposed analysis), a set of branch DAGs (representing alternative hidden assumptions to the root DAG), a graphical overlay (representing the branch DAGs over the root DAG), and a ruleset governing them. All branch DAGs follow the same rules for validity: they must 1) change the root DAG, 2) be a valid, identifiable causal DAG, and either 3a) require a change in the adjustment set to estimate the effect of interest, or 3b) change the number of frontdoor paths. The set of branch DAGs corresponds to a list of alternative assumptions, where all members of the assumption list must be justifiable as being negligible or non-existent. A graphical overlay helps show these alternative assumptions on top of the root DAG.

We define two types of branch DAGs: exclusion restrictions and misdirection restrictions. Exclusion restrictions add a single- or bi-directional arc between two existing nodes in the root DAG (e.g. direct pathways and colliders), while misdirection restrictions represent alternative pathways that could be drawn between objects (e.g., reversing the direction of causation for a controlled confounder turning that variable into a collider). Together, these represent all single-change assumptions to the root DAG.
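
As a rough sketch of how single-change candidates might be enumerated from a root DAG, consider the following. It is not the authors' implementation: the function names are hypothetical, only single-directional arcs are added, misdirection is simplified to reversing one existing arc, and of the validity rules only "changes the root DAG" and acyclicity are checked (identifiability, adjustment sets, and frontdoor paths are not).

```python
from itertools import permutations


def is_acyclic(nodes, edges):
    """Kahn's algorithm: a graph is acyclic if every node can be removed
    in topological order."""
    indegree = {node: 0 for node in nodes}
    for _, child in edges:
        indegree[child] += 1
    queue = [node for node in nodes if indegree[node] == 0]
    removed = 0
    while queue:
        node = queue.pop()
        removed += 1
        for parent, child in edges:
            if parent == node:
                indegree[child] -= 1
                if indegree[child] == 0:
                    queue.append(child)
    return removed == len(nodes)


def candidate_branch_dags(nodes, root_edges):
    """List single-change candidates as (type, changed_arc) tuples."""
    root = set(root_edges)
    candidates = []
    # Exclusion restrictions: add an arc between two nodes not yet connected.
    for a, b in permutations(nodes, 2):
        if (a, b) in root or (b, a) in root:
            continue
        if is_acyclic(nodes, root | {(a, b)}):
            candidates.append(("exclusion", (a, b)))
    # Misdirection restrictions (simplified): reverse one existing arc.
    for a, b in sorted(root):
        if is_acyclic(nodes, (root - {(a, b)}) | {(b, a)}):
            candidates.append(("misdirection", (b, a)))
    return candidates


# Toy root DAG: exposure X affects outcome Y only through mediator M.
nodes = ["X", "M", "Y"]
root_edges = [("X", "M"), ("M", "Y")]
for kind, arc in candidate_branch_dags(nodes, root_edges):
    print(kind, arc)
```

A fuller version would pass each candidate through the remaining validity rules before adding it to the assumption list.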

The DAGWOOD framework 1) makes explicit and organizes important causal model assumptions, 2) reinforces best DAG practices, 3) provides a framework for critical evaluation of causal models, and 4) can be used in iterative processes for generating causal models.


Noah Haber is a postdoctoral researcher at the Meta-Research Innovation Center at Stanford (METRICS). He specializes in cross-disciplinary causal inference between econometrics and epidemiology, systematic strength of evidence review, health economics, meta-research, statistics, and HIV/AIDS.

Sarah Wieten is the Clinical Ethics Fellow at Stanford Health Care and a postdoctoral researcher at the Stanford Center for Biomedical Ethics. She specializes in interdisciplinary projects at the intersection of epistemology and ethics in health care.


Florian Naudet (Apr 16)

Will the ICMJE clinical data sharing policy achieve its intended objectives?


In 2016, the International Committee of Medical Journal Editors (ICMJE), one of the most influential groups of medical journal editors, published an editorial stating that “it is an ethical obligation to responsibly share data generated by interventional clinical trials because participants have put themselves at risk”. The ICMJE considered that there is an implicit social contract imposing an ethical obligation that results should lead to the greatest possible benefit to society. The ICMJE suggested that de-identified IPD should be made publicly available no later than 6 months after publication of the main trial results. However, this proposal triggered debate, and a large number of trial designers and investigators were reluctant to adopt this new norm. Their concerns related to the feasibility of the proposed requirements, the resources needed, the real or perceived risks for trial participants, and the need to protect the interests of patients and researchers. As a consequence, the ICMJE stepped back from its initial proposal before it came into action. The final requirements do not make data-sharing mandatory, but “merely” require a data-sharing plan to be included in each paper from 1st of July 2018 (and pre-specified in study registration for clinical trials that will begin enrolling participants on or after 1st of January 2019), as a condition for publication. With 5645 ICMJE-affiliated journals claiming to follow the ICMJE recommendations, this policy was expected to result in a huge movement toward the goal of full transparency and data-sharing in order to maximize the value of clinical trials’ results. The talk will detail outputs of the Reproducibility in Therapeutic Research program (ANR-17-CE36-0010), including various surveys of journals and funders, a retrospective analysis of RCTs published in Annals of Internal Medicine, and a scoping review.
All these outputs will give you a taste of the possible impact of the policy and call for changes to the current ICMJE policy.


Florian Naudet is a psychiatrist, meta-researcher and former post-doctoral fellow at METRICS. He is currently Professor of Therapeutics at Rennes 1 University, France. His research interests are to evaluate and develop methodological solutions to assess treatments in patients, primarily but not exclusively in psychiatric research. He has a strong interest in studying research waste and data sharing practices. He has no COI to disclose.


Benjamin Djulbegovic (Apr 9)

Probability of discovering new treatments


The talk assumes that a treatment is not really discovered until it is clinically evaluated (and found to be efficacious or effective). It further assumes that (clinical) research is conducted in order to address uncertainties (unknowns) about a given phenomenon of interest (e.g., the effect of a treatment). Thus, acknowledgment and articulation of uncertainties represents a key scientific pillar of the research enterprise. But, because clinical research invariably involves humans, acknowledgment of uncertainty also represents a moral requirement for human experimentation.

It is through this application of the clinical trial method that we construct a rational response to clinical uncertainties by linking the theory of human experimentation with the theory of rational choice. However, in doing so, necessary constraints are introduced on how much we can learn and discover. The talk will introduce hypotheses and predictions and provide empirical data (some more conclusive than others) about treatment success as a function of our response to resolving unknowns in the hierarchy of clinical uncertainties.


Benjamin Djulbegovic is a Professor at City of Hope, where he serves as Director of the Program for Evidence-based Analytics. His main academic and research interest lies in attempts to measure and optimize clinical research and the practice of medicine by understanding both the nature of medical evidence and decision-making. To this effect, his work aims to integrate methods and techniques across evidence-based medicine (EBM), predictive analytics, health outcome research, and decision sciences with the goal of improving health care. The role of uncertainty and rationality in science and clinical medicine has been one of the common themes across his work, particularly evident in his analysis of equipoise and the role of regret. Dr. Djulbegovic has systematically applied the science of EBM and decision analysis to the entire fields of hematology and oncology, which resulted in two books; the book “Reasoning and decision-making in hematology” was listed as one of the best books by J Natl Cancer Inst, and the book “Decision Making in Oncology. Evidence-based management” was assessed as “one of the first and best attempts to apply an evidence-based approach to the practice of medical oncology”.

As of December 2019, he has published 345 papers in peer-reviewed journals, 195 abstracts and numerous book chapters. He has received numerous awards for his work, which has been published in major scientific and medical journals including Nature, Lancet, JAMA, New England Journal of Medicine, etc. He has also taught widely on these subjects. During the last 20 years, Dr. Djulbegovic has received continuous external funding from both federal and private entities. He was also selected for Newsweek’s list of Top Cancer Doctors and, since 2011, has been continuously selected among the top 1% of US doctors by US News & World Report in the field of hematology. His h-index is 63; in 2018 he was included in the list of highly cited researchers, which “recognizes world-class researchers selected for their exceptional research performance, demonstrated by production of multiple highly cited papers that rank in the top 1% by citations for field and year in Web of Science.”


Ioana Cristea (Apr 2)

Reliability and accessibility of findings and tools reported in highly-cited research

Description: In this METRICS International Forum, Ioana Cristea will discuss ongoing research about the reliability of effects from highly-cited studies in emotion research, and the accessibility of highly-cited measurement instruments across all sciences. A couple of working ideas about mental health issues and interventions related to the COVID-19 epidemic will also be discussed.

Bio: Ioana Cristea is Assistant Professor at the Department of Brain and Behavioral Sciences, University of Pavia, Italy and a Research Affiliate at the Meta-Research Innovation Center at Stanford University, USA (METRICS). Between 2016 and 2017, she was a Fulbright Visiting Senior Scholar at Stanford University. Her work focusses on critically appraising the efficacy and safety of various psychological and pharmacological interventions for mental disorders, as well as documenting the systematic effects of diverse strains of bias, such as financial and non-financial conflicts of interest.


Joshua Wallach (March 26)

Opportunities for sharing and evaluating research: medRxiv and YODA project


During the second METRICS International meeting, Joshua Wallach will discuss Yale University-related research and data sharing initiatives, including medRxiv (preprint server) and the Yale Open Data Access Project (clinical trial data request platform). He will outline opportunities for participation and collaboration.


Joshua Wallach is an assistant professor within the Yale School of Public Health. He received his MS and PhD in Epidemiology and Clinical Research at Stanford University. Over the past few years, his work with the Collaboration for Research Integrity and Transparency (CRIT) at Yale has focused on the tools, standards, and approaches used to assess the safety, efficacy, and performance of FDA-regulated products. Within the Department of Environmental Health Sciences at Yale School of Public Health, his work combines meta-research and real-world evaluations of medical products. He is a member of the Yale Open Data Access Project, a medRxiv affiliate, and a METRICS faculty affiliate.


Lars Hemkens (March 19)

The COVID-evidence project: brainstorming on a new initiative


COVID-19 is a challenge for those who need to make treatment decisions, plan new trials, or develop clinical practice guidelines and systematic reviews. Hundreds of trials are planned or underway, and some results are expected to be published soon. We aim to support evidence-based decision-making for COVID-19 with a new initiative, started last week: COVID-evidence.

COVID-evidence is a database of the currently available evidence on treatments for COVID-19. It collects information from various data sources about all planned, ongoing, or completed trials on any intervention for patients with SARS-CoV-2 infection. With a freely available, continuously updated database we aim to provide an overview of the available trial evidence on the benefits and harms of interventions for SARS-CoV-2 infection.

In this Forum session, I would like to present the initiative, brainstorm, and obtain feedback on features and application.


Lars G. Hemkens, MD, MPH is Senior Scientist and Deputy Director of the Basel Institute for Clinical Epidemiology and Biostatistics (ceb), Department of Clinical Research, University Hospital Basel. Previously he was at the Stanford Prevention Research Center (Stanford University) and at IQWiG (Department of the Director). His key interests include answering health care questions with data that were not made for answering those questions. His work focuses on routinely collected data and big data, pragmatic trials and meta-research. As head of the study design and methods team at the Department of Clinical Research, he co-designs various clinical studies in a wide range of medical fields. He is a board member of the Network for Evidence Based Medicine and co-speaker of its methods section, an associate editor of Trials, and sits on the editorial board of BMJ Evidence-Based Medicine and on the working committee of the RECORD reporting guideline group. He is coordinator of the RCD for RCT initiative, which aims to explore the use of routinely collected data for clinical trials. He has published more than 70 peer-reviewed articles, including publications in JAMA, BMJ, CMAJ, and PLoS Medicine, and has co-authored 9 IQWiG reports.