GitHub preserves its open-source software code deep in the arctic for future generations – SiliconANGLE

“GitHub Inc. said today it has delivered a copy of all of the open-source software code stored on its website to a data repository at the Arctic World Archive, which is a very long-term archival facility buried 250 meters deep in the permafrost of an Arctic mountain.

The operation is part of the GitHub Archive Program, which is a project announced last year that aims to preserve today’s open-source software for future generations. To do that, GitHub said, it will store its code in an archive called the GitHub Arctic Code Vault, which it says has been built to last for a thousand years….”

Octopub

“An ODI experiment, Octopub offers a simple way to prepare and check a dataset, and publish it online onto the GitHub platform….

Data isn’t open until an open licence has been applied. You can choose a licence that suits your needs.

If you know nothing about licences, nothing to worry about, we’ll help you choose….

Want your data to be high quality? Reusable? Machine readable? We encourage you to apply schemas to your files, and we can help you get started….

Octopub can check the quality of your CSV files for common errors.

We’ll give you quality feedback, and you can review and re-upload as often as you need to until the data you want to publish is of the highest standard….”

GitHub Archive Program: the journey of the world’s open source code to the Arctic – The GitHub Blog

“At GitHub Universe 2019, we introduced the GitHub Archive Program along with the GitHub Arctic Code Vault. Our mission is to preserve open source software for future generations by storing your code in an archive built to last a thousand years.

On February 2, 2020, we took a snapshot of all active public repositories on GitHub to archive in the vault. Over the last several months, our archive partner Piql wrote 21TB of repository data to 186 reels of piqlFilm (digital photosensitive archival film). Our original plan was for our team to fly to Norway and personally escort the world’s open source code to the Arctic, but as the world continues to endure a global pandemic, we had to adjust our plans. We stayed in close contact with our partners, waiting for the time when it was safe for them to travel to Svalbard. We’re happy to report that the code was successfully deposited in the Arctic Code Vault on July 8, 2020. …”

Investigating the Scholarly Git Experience Survey

“You have been invited to take part in a research study to learn more about how people in academia interact with version control, specifically Git, and source code hosting platforms (e.g. GitLab, SourceForge, GitHub). This study will be conducted by Vicky Steeves and Sarah Nguyen of NYU Division of Libraries.

If you agree to be in this study, you will be asked to complete a 30 question survey about version control and source code hosting….”

GitHub is now free for all teams | TechCrunch

“GitHub today announced that all of its core features are now available for free to all users, including those that are currently on free accounts. That means free unlimited private repositories with unlimited collaborators for all, including teams that use the service for commercial projects, as well as up to 2,000 minutes per month of free access to GitHub Actions, the company’s automation and CI/CD platform….”

GitHub Repositories with Links to Academic Papers: Open Access, Traceability, and Evolution

Abstract: Traceability between published scientific breakthroughs and their implementation is essential, especially when Open Source Software implements bleeding-edge science in its code. However, aligning the link between GitHub repositories and academic papers can prove difficult, and the link impact remains unknown. This paper investigates the role of academic paper references contained in these repositories. We conducted a large-scale study of 20 thousand GitHub repositories to establish prevalence of references to academic papers. We use a mixed-methods approach to identify Open Access (OA), traceability and evolutionary aspects of the links. Although referencing a paper is not typical, we find that a vast majority of referenced academic papers are OA. In terms of traceability, our analysis revealed that machine learning is the most prevalent topic of repositories. These repositories tend to be affiliated with academic communities. More than half of the papers do not link back to any repository. A case study of referenced arXiv papers shows that most of these papers are high-impact and influential and do align with academia, referenced by repositories written in different programming languages. From the evolutionary aspect, we find very few changes of papers being referenced and links to them.

 

Peeling back the curtain – The Economist

“While we take care to identify our sources, we have not often published the data behind them. Sometimes, this is for good reason: some data are proprietary or otherwise not ours to publish. Often, we have simply not made the time to do it. This is a shame: releasing data can give our readers extra confidence in our work, and allows researchers and other journalists to check, and to build upon, our work. So we’re looking to change this, and publish more of our data on GitHub….”

Get Free Private GitHub Repositories Through GitHub Education

We’ve written a fair amount about GitHub here at ProfHacker. To cite just a few examples, Lincoln described how to fork syllabi using GitHub, George outlined how to preserve your Twitter archive using GitHub Pages, and Konrad wrote a long series introducing the basics of GitHub in detail.

I resisted GitHub for a long time, against the advice of my fellow Profs. Hacker and other colleagues, but recently have moved toward using it to store code related to my teaching and my research, as well as to host sites for classes and research projects. In the coming weeks I plan to write tutorials outlining precisely how I’ve been building class and project pages using RStudio or Jekyll paired with GitHub Pages, but first I wanted to recall those previous posts so that interested readers can set up their own GitHub accounts, which they will need to follow those tutorials.

In the meantime, however, I wanted to share exciting news I learned only very recently from a colleague. In his Getting Started with GitHub post, Konrad noted,

GitHub accounts are free, and remain so as long as you allow your repositories to be open source and available to the world. You only have to get a paid account if you wish to have private repositories protected from prying eyes.

This is generally true, but sometimes academics might want a private repository: for a class website in progress, perhaps, or for other materials—such as, in my case, a promotion dossier in progress—that would benefit from GitHub’s versioning but cannot be made publicly available.

Fortunately, however—and this is what I did not realize until just months ago—GitHub offers free individual and team accounts through their GitHub Education program. Students 13 years and older can apply for the Student Developer Pack, which gives access to specialized tools and unlimited private repositories. Educators and/or researchers can apply for a free individual Developer plan, which also offers unlimited private repositories, as well as for free Team plans for academic groups, such as a classroom or a research project team.

The process is simple. Visit the GitHub Education page, click request a discount, log into your account, and fill out the form. You should use your institutional email account, if you have one, and you may have to upload a document demonstrating your affiliation. From there the GitHub Education team has to approve your request, which in my case took only a few hours.

I don’t have a huge need for private repositories, but it is nice to have the option, and it has allowed me to benefit from GitHub on a few projects I wouldn’t have felt comfortable putting on GitHub otherwise. If you’re reading ProfHacker, there’s a good chance you’ll qualify for a free educational account as well.

Are you using GitHub for teaching and/or research, and if so did you know about GitHub Education? Tell us about how you’re using GitHub in the comments.

Open Source Publishing and Distribution Platform for OA Books – Open Book Publishers

“An important part of OBP’s business model has been the ability to harness emerging digital technologies to bring down the publication costs associated with scholarly texts. In addition we have developed an extensive and cost effective distribution network for both digital and printed editions of our titles.

We now intend to reformat and update our software and processes for release as Open Source content, and make all the code freely available for others to adopt and adapt from our GitHub account. As we will be using and maintaining this code for our own operations, on completion we will be in a position to provide a complete, modular, managed, Open Source book publishing and distribution platform for others to freely adopt as they wish.”