TrueTuesday: journal impact factor Mar 19, 2019
Ms. Noémie AUBERT BONN
Being a scientist is so much more than just a title. We all share an environment, a culture and, most importantly, a community. Yet although scientific findings and knowledge are covered in all kinds of media, the factors that affect, characterize, and influence this community and its environment are rarely discussed and often even taboo. Every six weeks, Ph.D. student Noémie Aubert Bonn tackles such a topic together with 5 experts for TrueTuesday. The third one is the Journal Impact Factor.
A CONTROVERSIAL TOOL
If you’re a scientist, you certainly know what the Journal Impact Factor (JIF) is, and you’ve probably used it many times. But the JIF is a highly controversial tool: some love it, others hate it. So how can such a small number raise such controversy?
Let’s start with the basics. According to Garfield, widely known as the father of the JIF, “A journal’s impact factor is based on 2 elements: the numerator, which is the number of citations in the current year to items published in the previous 2 years, and the denominator, which is the number of substantive articles and reviews published in the same 2 years”. In other words, the JIF is the average number of citations received in a given year by the papers that journal X published in the two preceding years.
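To make Garfield's definition concrete, here is a minimal sketch of the calculation in Python. The function name and the example figures (600 citations, 200 citable items) are made up for illustration and do not describe any real journal.

```python
def journal_impact_factor(citations_this_year, citable_items):
    """Ratio described by Garfield.

    citations_this_year: citations received in the current year by items
        the journal published in the previous two years (numerator).
    citable_items: substantive articles and reviews the journal published
        in those same two years (denominator).
    """
    if citable_items <= 0:
        raise ValueError("the journal must have published at least one citable item")
    return citations_this_year / citable_items


# Hypothetical journal: 200 articles and reviews over two years,
# cited 600 times in the current year.
example_jif = journal_impact_factor(600, 200)
print(example_jif)
```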
Initially, the JIF was introduced to help librarians subscribe to the most useful (i.e., most cited) journals. Yet, being such a convenient and easy metric, it quickly became a tool for researchers to decide which journal would give them the most visibility, and for evaluators to assess researchers or individual papers. So here comes the debate: Is the JIF a reliable indicator of researchers’ work?
While this question is still unsolved to this day, we decided to start with some facts about what the JIF is not:
- The JIF is not an article-level metric. It is an average over the journal’s output, and the underlying citation distribution is typically highly skewed. Say a journal publishes three papers: if paper 1 has 100 citations and papers 2 and 3 have none, the JIF will be about 33, a value that describes none of the three papers.
- The JIF does not measure quality. It measures average journal citations, so possibly visibility, relatedness, and timeliness. Whether these correlate with quality is part of the debate.
- The JIF is not inherently normalized (i.e. it depends on citing traditions and thus differs between disciplines and article types).
- The JIF is not immune to manipulation.
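The three-paper example from the first point can be checked in a few lines of Python. This is only an illustration of how a skewed distribution pulls the average away from the typical paper; the variable names are ours.

```python
from statistics import mean, median

# Citation counts of the three papers from the example above.
citations = [100, 0, 0]

# The JIF behaves like the mean, which one highly cited paper dominates.
jif_like_average = mean(citations)

# The median describes the typical paper in this journal.
typical_paper = median(citations)

print(round(jif_like_average, 1), typical_paper)
```

The average is driven entirely by paper 1, while the median shows that the typical paper in this journal was not cited at all.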
But what do the experts have to say on this matter? Some hate it, some see value in it, and some might simply have a funny story to tell about it.
ANNA HATCH ON THE DECLARATION ON RESEARCH ASSESSMENT
The Declaration on Research Assessment (DORA) started as a collaborative effort at the 2012 Annual Meeting of the American Society for Cell Biology in San Francisco. DORA’s main objective is to improve the ways in which the outputs of scholarly research are evaluated. We talked with Anna Hatch, DORA Community Manager, to hear more about it.
Q: What triggered DORA? What was the decisive event that started the idea?
A: As you mentioned, the idea for DORA started at the 2012 Annual Meeting for the American Society for Cell Biology when a group of scholarly editors and publishers were discussing the influence of the Journal Impact Factor and journal prestige in hiring, promotion, and funding decisions. They recognized the need to improve the ways the outputs of research are evaluated, and this conversation led to the declaration.
Q: Why is the impact factor such an important target of DORA?
A: The declaration outlines a number of reasons, but a major one is that the impact factor does not describe the scientific quality of research in an individual article. Not only that, but researchers make a variety of scholarly contributions—data, reagents, software, policy changes, teaching, public outreach, and more—that cannot be described by a single metric.
Q: DORA is signed by over 1000 organisations and over 13 500 individuals. Do you feel that, beyond its signatories, DORA also initiated a change in practice?
A: We are starting to see change! For example, as part of its open access 2020 policy, all Wellcome-funded organizations must publicly commit to assessing research outputs based on the intrinsic merit of the work, not the title of the journal or the publisher. We have a collection of good practices on our website highlighting progress and innovation in research evaluation.
ALFONSO RODRIGUEZ-NAVARRO ON FLIPPING A COIN
Continuing with our topic of the journal impact factor, we reached out all the way to Spain to speak with Alfonso Rodriguez-Navarro, Prof. Emeritus and researcher in Biotechnology, Genetics and Molecular Biology at Universidad Politécnica de Madrid. Together with Prof. dr. Ricardo Brito, a researcher at Universidad Complutense de Madrid, Alfonso just published a paper entitled “Evaluating research and researchers by the journal impact factor: Is it better than coin flipping?” in the Journal of Informetrics; a preprint is also available on arXiv.
Q: What are the main conclusions of your study?
A: Using the JIF to compare two papers implies a risk of failure. This failure occurs when the paper published in the journal with the higher impact factor receives a lower number of citations than the other paper published in the journal with the lower impact factor. We found that comparing papers from journals where the JIFs are not so different yields a failure probability close to 0.5, which is equivalent to evaluating by flipping a coin. In other words, the JIF is not a convenient indicator for research assessment. We support the San Francisco DORA.
Q: Can you briefly explain how you came to these conclusions?
A: We confirmed previous conclusions by taking advantage of the fact that citations follow the same type of distribution, a lognormal function, at every level of aggregation, from researchers to journals and countries. This function allowed us to calculate the failure probabilities of the JIF method.
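To give a feel for this kind of calculation, here is a rough sketch in Python. It is not the authors' method: it assumes that citations in both journals follow continuous lognormal distributions with the same spread, and the value of `ASSUMED_SIGMA` is an illustrative guess rather than a figure from their paper. Under those assumptions, the probability that a paper from the higher-JIF journal receives fewer citations than a paper from the lower-JIF journal has a closed form.

```python
from math import log, sqrt
from statistics import NormalDist

# Assumed spread of log-citations within a journal (illustrative value only).
ASSUMED_SIGMA = 1.1


def failure_probability(jif_high, jif_low, sigma=ASSUMED_SIGMA):
    """Chance that a random paper from the higher-JIF journal is cited
    less than a random paper from the lower-JIF journal."""
    if jif_low <= 0 or jif_high < jif_low:
        raise ValueError("expected jif_high >= jif_low > 0")
    # The mean of a lognormal is exp(mu + sigma^2 / 2), so with equal sigma
    # the gap between the two mu values is the log of the JIF ratio.
    mu_gap = log(jif_high / jif_low)
    # The difference of two independent normal log-citation values
    # has standard deviation sigma * sqrt(2).
    return NormalDist().cdf(-mu_gap / (sigma * sqrt(2)))


# Hypothetical journal pairs, from identical JIFs to very different ones.
for high, low in [(2.0, 2.0), (3.0, 2.5), (10.0, 1.0)]:
    print(high, low, round(failure_probability(high, low), 3))
```

With identical JIFs the probability is exactly 0.5, and it stays close to 0.5 while the two JIFs are similar, which matches the coin-flipping comparison made in the interview.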
Q: As of now, how have your findings been received in the community? Have you faced more resistance or more appreciation?
A: Our study is a confirmation aimed at the recalcitrant users of the JIF for research assessment, who still abound. Despite DORA, some researchers and research administrators are still using the JIF for funding projects or analyzing the merits of researchers. We performed our study as an answer to a paper published in Nature World News, which is cited in our report.
LUDO WALTMAN ON THE VALUE OF THE JIF
Up until now, we have mostly highlighted weaknesses of the impact factor, and especially of using the impact factor to evaluate researchers and individual papers. Nevertheless, an interesting counter-opinion arose last year when Prof. Ludo Waltman (together with Prof. Vincent Traag), from the Centre for Science and Technology Studies at Leiden University, shared a paper entitled “Use of the journal impact factor for assessing individual articles need not be wrong” on arXiv.org.
Q: In your paper, you use a statistical model to show that using the impact factor for assessing individual papers is not necessarily worse than other techniques. Can you explain your reasoning in a few words?
A: Citation counts of individual articles may offer only a moderately accurate estimate of the value of an article. If journals publish articles that are of reasonably similar quality, the impact factor of a journal may offer a more accurate estimate of the value of an article. This is because idiosyncrasies in article-level citation counts may partly cancel out in journal-level citation statistics.
Q: At the moment, there seems to be a growing movement against the use of impact factors for evaluating individual papers. Has this general opinion influenced the way your paper was received by the scientific community?
A: Yes, when I published my paper on arXiv, many colleagues were interested in it. I believe this is because there is so much debate going on about the way impact factors are used.
Q: In 2015, you participated in writing The Leiden Manifesto, an important document which highlights ten principles to guide research evaluation. Are the conclusions of this paper compatible with the principles of the Manifesto?
A: Yes, they are compatible. The Leiden Manifesto doesn’t say whether or not it is acceptable to use journal metrics at the level of individual articles. Therefore there is no inconsistency between my paper and the Leiden Manifesto.
MIKE TAYLOR ON ALTMETRICS
Even though the impact factor is the most well-known impact metric there is, a number of other metrics exist, for example, altmetrics or alternative metrics. To understand what altmetrics are and what the company Altmetric does, we contacted Mike Taylor, Head of Metrics Development at Digital Science.
Q: Before anything, what are altmetrics? What do they look at, and what do they measure?
A: Altmetrics are a set of data that capture and count non-traditional citations and mentions of scholarly work, for example policy citations by governments, patent citations, references in Wikipedia and mentions on social media. At Altmetric.com (a Digital Science company that provides altmetrics to the scholarly community) we describe altmetrics as a measure of attention. Our donut visualisation gives an idea of diversity by using different colours to represent each attention type, and users can click through to get more detail. The Altmetric Attention Score, which is often found in the middle of those donuts, is a weighted count of the online attention we’ve found for a publication.
Q: Compared with the Impact Factor, what advantages and disadvantages do altmetrics have?
A: Most importantly, altmetrics data are much more diverse than citation-based metrics like the Impact Factor - they provide a different set of perspectives on the reach and influence of a research publication and focus on the article, rather than being journal-based. Timing is another important difference: traditionally, citations can take a long time to appear, whereas altmetrics can be picked up (or ‘tracked’) by Altmetric on the same day that a research output is published online - meaning that authors, publishers, institutions and funders can gain a much better understanding of how their work is being received amongst a much broader audience. With increasing pressure from funders and government to demonstrate socio-economic impact and public engagement, altmetrics are growing in status.
Of course, altmetrics can be seen as being “new”, and many researchers are right to ask questions about their utility. Although the name is not yet ten years old, several hundred papers about altmetrics have been published in a wide variety of journals; there are two annual conferences and a new specialist journal, J Altmetrics. This literature is shedding new light on the mechanisms of scientific communications, engagement, and impact - especially in the relatively new, online world - and altmetrics is a key part in exploring these trends.
Q: What do you think Altmetric is best suited for, and where should we be most careful in using it?
A: For individual researchers and students, altmetrics are an excellent discovery tool. They help us find additional discussions and explanations of research papers and provide context that it would be hard for the reader to gather elsewhere. And personally, I have benefited greatly from having ‘met’ people on Twitter and other social platforms: many of these relationships have become productive, scholarly partnerships.
The context has to be one of the key things to remember when looking at the different sources: altmetrics are far more than a simple number - they offer a route into understanding the broader impact and reach of research. The first guideline in the Leiden Manifesto offers some great advice to everyone using metrics: “Quantitative evaluation should support qualitative, expert assessment.”
For example, knowing that Wikipedia is often used as an educational resource allows us to understand some of the motivation that may have led the authors to create a link to the research.
For organizations that are involved in academic communication, altmetrics help them understand how effective they are being in their engagement and communication tasks. In the last year, I’ve met with people in research offices who use Altmetric data to keep academics up-to-date with the conversations around their research, and with employees at technical universities, for whom patents are a very important part of understanding their status within certain topics.
For academics applying for grants, altmetrics can provide strong evidence of broader impact and engagement activities to support their applications. We’ve heard of grant applications to the National Institutes of Health in the USA and to the European Commission that have used Altmetric data as part of their evidence on past engagement activities.
SVEN HENDRIX ON THE IMPACT FACTOR AT UHASSELT
We have heard both positive and negative things about the impact factor and its alternatives. Here, at Hasselt University, the Impact Factor is still a part of researcher evaluation, and it is also important for graduating in certain faculties. For this last part, we will learn more about how the impact factor is used at UHasselt by talking to professor Sven Hendrix, Director of the Doctoral School for Health & Life Sciences (DSHLS), and author of the blog ‘smart science career’, where he published a post entitled ‘10 Simple strategies to increase the impact factor of your publication’.
Q: Why did the DSHLS choose to use the Impact Factor for evaluating researchers and research students? What are the advantages?
A: When we discussed the minimal requirements for our students to obtain a Ph.D. degree, we came to the conclusion that Ph.D. students should be made aware of the fact that impact factors are still broadly used as a quantitative measurement to evaluate the "quality" and "excellence" of a researcher. Typical contexts are grant evaluations and selection committees for positions of postdocs and professors. We are very aware of the fact that there is a lot of debate and critique on using IFs as proxies of excellence. However, there still is no major breakthrough in how to handle this problem better.
Q: Being the Director of the doctoral school, do you ever see challenges in using Impact factors for research evaluation?
A: It is a permanent challenge because the focus on IFs has quite some side effects which are not easy to handle. There is a substantial element of unpredictability or luck in getting your papers published with a certain impact factor, although there are multiple strategies which increase the chances substantially. Some I have described in my blog smartsciencecareer.com. There is also the notorious dilemma that many PIs wish to publish as high as possible, while the Ph.D. students wish to finish their thesis before their contract ends. Ph.D. students at the end of their contract have the tendency to accept lower impact factors just to get permission to defend their thesis. Furthermore, there are domains such as rehabilitation sciences where the highest impact factor in the domain is much lower than, for example, in immunology or neuroscience. Thus, Ph.D. students in these domains are forced to increase the number of their publications to reach a sufficiently high cumulative impact factor. However, in the previous years, most students published much better than the publication requirements.
Q: If Impact Factors stopped being used in evaluating researchers, do you think scientific outputs will change? Will they improve or worsen?
A: (laughs) Whenever a QUANTITATIVE measurement is used, a complex human performance is reduced to *one* number such as the impact factor or the number of citations in the last 2, 5 or 10 years. However, quantitative measurements are necessary for most selection procedures (grants, positions, promotions ...) to be as transparent as possible. Thus, if impact factors are replaced by other parameters, it would be wise to have a panel of several parameters which give a broader picture than just one number.
Be sure to check out our social media next Tuesday to discover what our new TrueTuesday topic will be, and don’t forget to check out our previous articles about ‘mental health’ and ‘mistakes & failure’.