“Sridhar Gutam is a senior scientist at ICAR-Indian Institute of Horticultural Research, Bengaluru. He is also the convenor of Open Access India, an organisation advocating open access, open data and open education in India….”
In November of 2014, in a first, unexpected move for the field of particle physics, the Compact Muon Solenoid (CMS) experiment — one of the main detectors in the world’s largest particle accelerator, the Large Hadron Collider — released to the public an immense amount of data, through a website called the CERN Open Data Portal.
The data, recorded and processed throughout the year 2010, amounted to about 29 terabytes of information, yielded from 300 million individual collisions of high-energy protons within the CMS detector. The sharing of these data marked the first time any major particle collider experiment had released such an information cache to the general public.
A new study by Jesse Thaler, an associate professor of physics at MIT and a long-time advocate for open access in particle physics, and his colleagues now demonstrates the scientific value of this move. In a paper published in Physical Review Letters, the researchers used the CMS data to reveal, for the first time, a universal feature within jets of subatomic particles, which are produced when high-energy protons collide. Their effort represents the first independent, published analysis of the CMS open data.
“In our field of particle physics, there isn’t the tradition of making data public,” says Thaler. “To actually get data publicly with no other restrictions — that’s unprecedented.”
Part of the reason groups at the Large Hadron Collider and other particle accelerators have kept proprietary hold over their data is the concern that such data could be misinterpreted by people who may not have a complete understanding of the physical detectors and how their various complex properties may influence the data produced.
“The worry was, if you made the data public, then you would have people claiming evidence for new physics when actually it was just a glitch in how the detector was operating,” Thaler says. “I think it was believed that no one could come from the outside and do those corrections properly, and that some rogue analyst could claim existence of something that wasn’t really there.”
“This is a resource that we now have, which is new in our field,” Thaler adds. “I think there was a reluctance to try to dig into it, because it was hard. But our work here shows that we can understand in general how to use this open data, that it has scientific value, and that this can be a stepping stone to future analysis of more exotic possibilities.”
Thaler’s co-authors are Andrew Larkoski of Reed College, Simone Marzani of the State University of New York at Buffalo, and Aashish Tripathee and Wei Xue of MIT’s Center for Theoretical Physics and Laboratory for Nuclear Science.
Seeing fractals in jets
When the CMS collaboration publicly released its data in 2014, Thaler sought to apply new theoretical ideas to analyze the information. His goal was to use novel methods to study jets produced from the high-energy collision of protons.
Protons are essentially accumulations of even smaller subatomic particles called quarks and gluons, which are bound together by interactions known in physics parlance as the strong force. One feature of the strong force that has been known to physicists since the 1970s describes the way in which quarks and gluons repeatedly split and divide in the aftermath of a high-energy collision.
This feature can be used to predict the energy imparted to each particle as it cleaves from a mother quark or gluon. In particular, physicists can use an equation, known as an evolution equation or splitting function, to predict the pattern of particles that spray out from an initial collision, and therefore the overall structure of the jet produced.
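As an illustration that is not taken from the article itself, one commonly quoted splitting function is the leading-order textbook form for a quark radiating a gluon. The symbols below are assumptions introduced here: z is the fraction of the parent quark's momentum carried by the gluon, and C_F is the quark color factor of quantum chromodynamics.

```latex
P_{q \to q g}(z) = C_F \, \frac{1 + (1 - z)^2}{z}, \qquad C_F = \frac{4}{3}
```

The function grows as z approaches 0, which reflects the tendency of a splitting to share energy unevenly, with one branch carrying only a small fraction of the total.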
“It’s this fractal-like process that describes how jets are formed,” Thaler says. “But when you look at a jet in reality, it’s really messy. How do you go from this messy, chaotic jet you’re seeing to the fundamental governing rule or equation that generated that jet? It’s a universal feature, and yet it has never directly been seen in the jet that’s measured.”
In 2014, the CMS released a preprocessed form of the detector’s 2010 raw data that contained an exhaustive listing of “particle flow candidates,” or the types of subatomic particles that are most likely to have been released, given the energies measured in the detector after a collision.
The following year, Thaler published a theoretical paper with Larkoski and Marzani, proposing a strategy to more fully understand a complicated jet in a way that revealed the fundamental evolution equation governing its structure.
“This idea had not existed before,” Thaler says. “That you could distill the messiness of the jet into a pattern, and that pattern would match beautifully onto that equation — this is what we found when we applied this method to the CMS data.”
To apply his theoretical idea, Thaler examined 750,000 individual jets that were produced from proton collisions within the CMS open data. He looked to see whether the pattern of particles in those jets matched with what the evolution equation predicted, given the energies released from their respective collisions.
Taking each collision one by one, his team looked at the most prominent jet produced and used previously developed algorithms to trace back and disentangle the energies emitted as particles cleaved again and again. The primary analysis work was carried out by Tripathee, as part of his MIT bachelor’s thesis, and by Xue.
“We wanted to see how this jet came from smaller pieces,” Thaler says. “The equation is telling you how energy is shared when things split, and we found when you look at a jet and measure how much energy is shared when they split, they’re the same thing.”
The team was able to reveal the splitting function, or evolution equation, by combining information from all 750,000 jets they studied, showing that the equation — a fundamental feature of the strong force — can indeed predict the overall structure of a jet and the energies of particles produced from the collision of two protons.
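The comparison described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the team's analysis code: the momentum-sharing definition (softer branch divided by the sum), the leading-order quark-gluon formula, the cut at z = 0.1, and the four example splittings are all assumptions or made-up values chosen for illustration. The function names are hypothetical, and the jet-declustering step that produces the two branches is not shown.

```python
from typing import List, Sequence


def momentum_sharing(pt_a: float, pt_b: float) -> float:
    """Fraction of the pair's transverse momentum carried by the softer branch."""
    total = pt_a + pt_b
    if total <= 0:
        raise ValueError("combined transverse momentum must be positive")
    return min(pt_a, pt_b) / total


def symmetrized_splitting(z: float) -> float:
    """Assumed leading-order shape P(z) + P(1 - z), with P(z) = (1 + (1 - z)^2) / z.

    The overall color factor is dropped because only the shape is compared.
    """
    if not 0.0 < z < 1.0:
        raise ValueError("z must lie strictly between 0 and 1")
    return (1 + (1 - z) ** 2) / z + (1 + z ** 2) / (1 - z)


def normalized_histogram(values: Sequence[float], edges: Sequence[float]) -> List[float]:
    """Fraction of values falling in each bin; the last bin includes its upper edge."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


# Made-up transverse momenta (GeV) of the two branches of four splittings.
example_splittings = [(30.0, 10.0), (25.0, 24.0), (50.0, 8.0), (40.0, 12.0)]
z_values = [momentum_sharing(a, b) for a, b in example_splittings]

# Bins from an assumed cut at z = 0.1 up to the maximum possible value, 0.5.
edges = [0.1, 0.2, 0.3, 0.4, 0.5]
measured = normalized_histogram(z_values, edges)

# Predicted shape evaluated at bin centers and normalized the same way.
centers = [(lo + hi) / 2 for lo, hi in zip(edges[:-1], edges[1:])]
raw = [symmetrized_splitting(c) for c in centers]
predicted = [r / sum(raw) for r in raw]

for c, m, p in zip(centers, measured, predicted):
    print(f"z ~ {c:.2f}: measured fraction {m:.2f}, predicted fraction {p:.2f}")
```

With only four invented splittings the measured column is not meaningful. The point is the shape of the procedure: one momentum-sharing value per jet, accumulated over many jets, then compared with the curve the equation predicts.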
While this may not generally be a surprise to most physicists, the study represents the first time this equation has been seen so clearly in experimental data.
“No one doubts this equation, but we were able to expose it in a new way,” Thaler says. “This is a clean verification that things behave the way you’d expect. And it gives us confidence that we can use this kind of open data for future analyses.”
Thaler hopes his and others’ analysis of the CMS open data will spur other large particle physics experiments to release similar information, in part to preserve their legacies.
“Colliders are big endeavors,” Thaler says. “These are unique datasets, and we need to make sure there’s a mechanism to archive that information in order to potentially make discoveries down the line using old data, because our theoretical understanding changes over time. Public access is a stepping stone to making sure this data is available for future use.”
This research was supported, in part, by the MIT Charles E. Reed Faculty Initiatives Fund, the MIT Undergraduate Research Opportunities Program, the U.S. Department of Energy, and the National Science Foundation.
Over the last few decades, there has been ongoing debate and distress regarding the effects of the journal subscription paywall and the very real barriers to knowledge access that it creates. As major academic publishers invest in and redirect their business strategies toward open access and alternative payment structures, it may seem as if the battle for access to knowledge is starting to be won. However, as big publishers move towards openness, they have also been redirecting their business strategies towards the acquisition of scholarly infrastructure, the tools and services that underpin the scholarly research life cycle, many of which are geared towards data analytics. We argue that moves toward increased control over openness and data analytics by big publishers are simultaneous processes of profit maximization. Could it be that our attention on the paywall has distracted us from the strategic takeover of infrastructure by the publishers? These processes should be examined closely, as they are actively entrenching the publishers’ power and control and could worsen the exclusion of already marginalized researchers and institutions.
“Given broad acceptance that the UK should move towards wider access to research, the debate has naturally moved on to the question of implementation. The details matter, including the words we use. The problem is that the terminology is being systematically misused. And that misuse is poisoning debate….”
Over the summer, librarians and academic leaders in Germany came together to lead a push in taking down the paywalls that block access to so many scientific research articles. The initiative, named Projekt DEAL, represents a bold push toward open access that could change the landscape of academic publishing.
The latest developments in Projekt DEAL pick up on a battle now over two years in the making, in which libraries and universities in Germany have united in pushing large publishers to adopt a new business model. The institutions are looking to forgo the typical subscription-based academic publishing business model in favor of paying an annual lump sum that covers the publication costs of all papers whose first authors are associated with German institutions.
“However, despite the current success, this strategy of winning over faculty hasn’t been very effective: only a fraction of the current access is created by gold/green open access, much of it stems from sci-hub and sharing sites such as ResearchGate. In other words, as fantastic as full access to the literature that we now enjoy feels, it was brought about only to a small extent by the changed publication behavior of faculty.”
A thesis by Sigurbjörg Jóhannesdóttir, submitted in October 2015.
Abstract: Open Access (OA) is introduced and discussed in relation to open scholarship and the international scientific community. The status of Open Access in Iceland is explored through the laws and policies relating to OA, gratis and libre publications within scholarly journals, publication within open repositories, and the opportunities that scientists have to publish scholarly papers in OA.
Data was collected through interviews with experts in the Open Access field. Two questions were used from a study of OA that was conducted among scientists at Reykjavik University (RU) in 2014, as well as an analysis of a list of their published articles in scholarly journals in 2013.
The results show that OA is growing slowly in Iceland. Four institutions have OA policies. Icelandic scientists are not taking full advantage of journals’ rules on publishing articles in OA. Scientists’ beliefs concerning the barriers standing in their way to publishing scholarly papers in OA are based on a lack of knowledge and a lack of access to institutional repositories in which they might wish to publish their articles.
The opportunities and challenges that Icelandic universities face regarding open scholarship are outlined and discussed. The universities need to have policies for OA and Open Educational Resources (OER) which are consistent with what is happening internationally. Academics need to receive helpful information on OA; they also need to receive encouragement, advice and support concerning publishing in OA. The universities and the scientific community in Iceland need to take a joint decision on the best ways to continue preserving and publishing research and educational resources in OA.
“With an estimated 190 million residents, Nigeria is the largest country in Africa. A remarkable 60% of Nigerians are school-aged, creating one of the largest student bodies in the world. With internet access in Nigeria quickly growing, local Wikimedians are working together to raise awareness for the platform and how Nigeria’s many students can both use and improve Wikipedia.”
“Here’s a little exercise which I’ve now done looking at research papers in a wide variety of disciplines. Look at the referenced sources in a recently published paper. Unless you are reading….this paper at one of the few fully-funded research libraries, you will find that a significant number of the referenced sources are unavailable to you. Open access is simply not there….Lots of the referenced sources will have to be obtained by inter-library loan or not at all. Your ability to participate in the scholarly inquiries of your field are highly constrained….I think there’s a case to be made that journal publishers may be missing a trick. There is a point in time when a publisher’s self-interest in the quality of their about-to-be-published work would be well-served by encouraging authors of referenced sources to share their past articles. This is also a moment in time at which the authors of referenced sources are also missing a trick but are unaware of it…..”