Science Beam – using computer vision to extract PDF data | Labs | eLife

“There’s a vast trove of science out there locked inside the PDF format. From preprints to peer-reviewed literature and historical research, millions of scientific manuscripts today can only be found in a print-era format that is effectively inaccessible to the web of interconnected online services and APIs that are increasingly becoming the digital scaffold of today’s research infrastructure….Extracting key information from PDF files isn’t trivial. …It would therefore certainly be useful to be able to extract all key data from manuscript PDFs and store it in a more accessible, more reusable format such as XML (of the publishing industry standard JATS variety or otherwise). This would allow for the flexible conversion of the original manuscript into different forms, from mobile-friendly layouts to enhanced views like eLife’s side-by-side view (through eLife Lens). It will also make the research mineable and API-accessible to any number of tools, services and applications. From advanced search tools to the contextual presentation of semantic tags based on users’ interests, and from cross-domain mash-ups showing correlations between different papers to novel applications like ScienceFair, a move away from PDF and toward a more open and flexible format like XML would unlock a multitude of use cases for the discovery and reuse of existing research….We are embarking on a project to build on these existing open-source tools, and to improve the accuracy of the XML output. One aim of the project is to combine some of the existing tools in a modular PDF-to-XML conversion pipeline that achieves a better overall conversion result compared to using individual tools on their own. 
In addition, we are experimenting with a different approach to the problem: using computer vision to identify key components of the scientific manuscript in PDF format….To this end, we will be collaborating with other publishers to collate a broad corpus of valid PDF/XML pairs to help train and test our neural networks….”

The Virtual Reformation — Kill Your Darlings

“With more books available, supply created demand. People, particularly those with means, began to learn to read. Even before Martin Luther nailed his ‘The 95 Theses’ to the church door in 1517, cracks were beginning to appear in the ironclad control the Catholic Church had previously exercised over access to information and knowledge….But even in the face of such draconian consequences, the public continued to demand their own direct relationship with God and their right to read the Bible in their own language. What people were really agitating for, perhaps, was access to information and knowledge. They were no longer willing to know only what the priestly class wanted them to know….Now that everybody with a smart device has access to the media as well as the ability to create content themselves, things that used to be kept quiet are getting out; everyone can have a direct relationship with what used to be privileged information….”

PBJ is now a leading open access plant journal – Daniell – 2017 – Plant Biotechnology Journal – Wiley Online Library

“Welcome to the first issue of the fifteenth volume of Plant Biotechnology Journal. I would like to start this editorial by announcing the successful transition of PBJ from a subscription-based journal to an open access journal supported exclusively by authors. This resulted in enhanced free global access to all readers. I applaud the PBJ management team for offering free open access to all articles published in this journal in the past 14 years. As the first among the top ten open access plant science journals, based on 2016 citations, PBJ is very likely to be ranked among the top three journals publishing original research. PBJ is now compatible with mobile platforms, tablets, iPads, and iPhones and offers several new options to evaluate short- and long-term impact of published articles, including Altmetric scores, article readership, and citations….”

Open Data Companion (ODC) – Android Apps on Google Play

“Open Data Companion (ODC) provides a unified access point to over 170 open data portals and thousands of datasets from around the world; right from your mobile device. Crafted with mobile-optimised features and design, this is an easy and convenient way to find, access and share open data.

Open Data Companion provides a framework for all Private Sector, State, Regional, National and Worldwide open data portals to deliver open data to all mobile users….”

Why This Audio Map for the Blind Offers an Open-Data Roadmap for the Country — Backchannel — Medium

“Imagine you’re blind. You have a smartphone, and you’re trying to find your own way to a spot downtown. To get there you’ll need precise voice directions to specific building numbers, but you can’t find an app that meets the challenge.

Next, imagine you’re an app-maker who wants to provide the most accurate navigation at the lowest cost to seeing-impaired customers. To do that you’ll need access to an accurate database of street addresses. While cities routinely collect this information, it isn’t necessarily publicly available.

Now a pioneering open data project in Louisville, Kentucky is lighting a torch to show cities, civic tech enthusiasts, and local businesses how to make sure assistive technology like this is easily and cheaply available. And its methods are so simple that they can be applied to many more problems where open public data can make a difference….”
