How do you know you’ve saturated your reading for a lit review?
How have you found the journal papers you’ve read?
How do you know you’ve read everything relevant written about your topic before the proverbial “This addresses the gap in the research…”?
I’m preparing for a systematic review of literature, or at least reaching saturation of certain topics in preparation for my proposal. One of the podcasts I listen to, Everything Hertz, has become slightly less confusing to me over the past year. I can’t always follow the conversations Dan Quintana and James Heathers are having about statistics, but I enjoy much of the content, and the episode titled “Chaos in the brickyard” prompted this post.
In this podcast episode, Quintana and Heathers began by discussing the use of Google Scholar because of a conference where @Jevinwest had presented findings that Google Scholar is altering scholarly citation patterns – citations are getting more concentrated, limited to the top 1-2 search results in Google Scholar. The podcast hosts posed two questions: Are the appropriate citations being ignored? Is there a bigger problem?
Heathers rhetorically asked how many people have read a journal article, googled something that was mentioned in it, found the citation, and appended it to their own paper without reading it. Scholars are citing the top results, and Heathers wonders whether this is a lazy way to research citations and whether appropriate citations are being ignored.
Many databases are behind paywalls, and the luck of a graduate student or scholar depends on the school where they are stationed and how well resourced its research libraries are. If searches can’t be done within certain databases due to these limitations, does anyone go beyond the first page of search results on Google? Are proper keyword searches the key to getting successful, relevant citations?
There are many questions with unsatisfying answers. To be fair, Google’s algorithm is as much a secret as those of other databases, but it’s an open platform. Quintana and Heathers applaud how easy Google Scholar is to use and that it can give you what you are looking for. It will also find a preprint and link to the PDF, even if the paper is behind a paywall – it opens up access to research, legally. However, relying on Google Scholar/Search brings a challenge: as these searches occur more and more, what academics click on and cite will shape the rankings and introduce bias into search results, thereby compounding the problem. Is the first reference the most appropriate for a disciplinary focus? they ask.
I was trying to find the paper that conducted research on what Heathers alluded to – that scholars don’t always read the stuff they cite. I remembered that I had read about it on Twitter and that the method used in the paper had amused me.
The researchers concluded that approximately 80% of scholars had not read the papers they cited. They demonstrated this by tracing how far an identical error in citations of a seminal paper was repeated from one paper to the next. They examined 4300 citations of this paper in their research.
My problem: I couldn’t remember the authors, or where or when it had been published.
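The idea of tracing a repeated misprint can be sketched in a few lines of Python. This is only a toy illustration, not the authors’ actual method or data: the page numbers below are invented, and the function name `copied_share` and its simple heuristic (the first appearance of each distinct misprint counts as an independent mistake, and every repeat counts as a copy) are assumptions made for this example.

```python
from collections import Counter

# Hypothetical data: the page number as printed in ten citations of one paper.
# The correct value and every misprint are invented for illustration.
CORRECT_PAGE = "204"
cited_pages = [
    "204", "204", "240", "204", "240",
    "204", "024", "240", "204", "204",
]


def copied_share(pages, correct):
    """Return (total misprints, distinct misprints, share of repeated misprints).

    Assumption for this sketch: the first occurrence of each distinct
    misprint is an independent error, and every later identical one was
    copied from someone else's reference list.
    """
    misprints = Counter(p for p in pages if p != correct)
    total = sum(misprints.values())
    distinct = len(misprints)
    if total == 0:
        return 0, 0, 0.0
    repeats = total - distinct
    return total, distinct, repeats / total


total, distinct, share = copied_share(cited_pages, CORRECT_PAGE)
print(f"{total} misprints, {distinct} distinct, {share:.0%} look copied")
```

With this toy input the sketch reports 4 misprints, 2 of them distinct, so half of the misprints repeat an earlier one. A real analysis would need actual citation records and a more careful model of how errors arise.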
My search process for the paper:
I began on Twitter and searched my handle with the word “citation” because I remembered that this was where I had first read about the paper. I hoped that I might have shared the paper in a tweet. This was unsuccessful, but I did not look through all my retweets because I couldn’t remember when I had read it or what key words had been used in the title of the article. I also didn’t recall whether someone had taken a screenshot of the abstract or actually typed the authors’ names and content. I then moved on to Google and Google Scholar, using the following search terms in this order:
- academic paper that shows academics do not read journal articles (Google)
- citation errors – academics not reading journal articles (Google)
- citation errors – academics not reading journal articles because citation errors are copied (Google)
- citation errors repeat in journal articles (Google)
- plagiarism by academics -views +citation (Google Scholar)
- citations share same errors (Google)
- citations share same errors not read (Google Scholar)
- citation-based plagiarism (Google Scholar)
- errors in references plagiarism (Google Scholar)
- found a paper: Propagation of errors in citation networks (false positive, but it provided a key word “propagation”)
- propagation of citation errors – eureka! (Google)
I found an article in Nature with enough description in the Google search results (not Scholar) that I was able to recognize this as the right story.
I returned to Google Scholar to find the paper I was looking for because Google is wonderfully efficient. Others have searched for this paper, using the same functions, and once I was able to find a reference to Simkin and Roychowdhury in an article, it was easy to find, using Google Scholar. My search terms and methods may not have been optimal, but key words, as Heathers suggested in the podcast, provided me with the result I wanted, and Google delivered. A PDF of a preprint, to boot!
I began to think about my literature review for my master’s and the process I went through to find literature associated with the Canadian Test of English for Scholars and Trainees (CanTEST). This information would occupy part of one paragraph of one section in my literature review. Specifically, I was searching for something Alister Cumming had mentioned in class that he had written in the 1980s, when he was actively working on the design of the CanTEST. All my regular methods resulted in nothing. I had been trying to use the Summon database at work as well as ERIC, using all sorts of Boolean searches. I sought out the help of a librarian who often visited my class to conduct research workshops, and after almost an hour of searching various databases, she helped me find the article. There is a CanTest framework for cancer detection that was affecting our search results. This involved a lot of knowledge, research, and time to cite a paper once in a master’s thesis. Funny search epilogue: on October 6, 2019, in two clicks, Google Scholar can provide me with a link to the paper I couldn’t find in 2014.
I also recall my current supervisor visiting our proseminar class a few years ago and talking about how people should read research. Her disdain for people who cite each other stayed with me. If you see someone or a group of people citing each other, there is not much there to learn, and they are not interested in learning.
I’ve a list of things I need to read to prepare for my proposal; some are suggestions from my supervisor, some are articles I found in lists of references from papers I was reading, some are suggestions from web pages I have visited for other papers, and the largest list is from things that I have found while browsing my feed on Twitter. I will visit the library after I finish reading from these lists to help me determine if I’ve reached saturation, but I’ll also be asking a librarian for help. My budding knowledge about social network analysis tells me I can’t know all possible connections to my research topic, despite my own reading. I’m ready for a database to surprise me with my knowledge gap. Google isn’t there, yet.
Ball, P. (2002, December 12). Paper trail reveals references go unread by citing authors. Nature, 420, 594. Retrieved from https://www.nature.com/articles/420594a
Quintana, D. S., & Heathers, J. A. J. (Hosts). (2019, September 16). Chaos in the brickyard. Everything Hertz [Audio podcast]. Retrieved from https://osf.io/xfd2p/
Simkin, M. V., & Roychowdhury, V. P. (2002). Read before you cite! arXiv preprint cond-mat/0212043.