The p-Value Debate and Statistical (Mal)practice – Implications for the Agricultural and Food Economics Community

Bibliographic information



GJAE - German Journal of Agricultural Economics

Volume 72 (2023), Edition 01


Publisher
dfv Mediengruppe, Frankfurt am Main
Publication year
2023
ISSN-Online
2191-4028
ISSN-Print
2191-4028

Chapter information




Authors:
T. Heckelei, S. Hüttel, M. Odening, J. Rommel


Preview:

A lively debate is ongoing in the scientific community about statistical malpractice and the related publication bias. No general consensus exists on the consequences, which is reflected in the heterogeneous rules that scientific journals set for the use and reporting of statistical inference. This paper provides an overview of the debate, discusses how it is perceived by the agricultural economics community, and derives implications for our roles as researchers, contributors to the scientific publication process, and teachers. Following a ‘Mixed Methods Review’, we start by summarizing the current state of the p-value debate in the context of the replication crisis and commonly applied statistical practices in our community. We then present the motivation, design, results, and discussion of an explorative and descriptive survey on statistical knowledge and practice among researchers in the agricultural economics community in Austria, Germany, and Switzerland. Instead of providing specific guidelines or rules, we derive implications for our roles in the scientific process to support a needed long-term cultural change in empirical scientific practices. Acceptance of scientific work should rest largely on theoretical and methodological rigor, with perceived relevance arising from the questions asked, the methodology employed, and the data used, not from the results generated. Revised and clear journal guidelines, the creation of resources for teaching and research, and public recognition of good practice are suggested measures to move forward.

Keywords: statistical inference; p-hacking; publication bias; replication crisis; pre-registration

DOI: 10.30430/gjae.2023.0231

