Satoshi Village: the blog of Daniel Himmelstein

Mapping the Long Trail: the best is now free with OpenStreetMap

On January 24, 2010, Zeke Farwell completed the Long Trail. He had begun just two days earlier, connecting more than 50 segments of trail spanning over 200 miles. Motivated by “the longest and most well known hiking trail in Vermont”, Zeke often revisited portions of the trail in the years that followed.

Now, I’m not revealing a secret FKT (fastest known time), but rather the completion of the Long Trail route on OpenStreetMap. OpenStreetMap is like Wikipedia …

Supporting Alexandra Elbakyan’s nomination for the 2020 John Maddox Prize

The John Maddox Prize has been awarded annually since 2012 to “researchers who have shown great courage and integrity in standing up for science and scientific reasoning against fierce opposition and hostility”. The prize is a joint initiative between the journal Nature and the Sense about Science charity.

Fergus Kane nominated Alexandra Elbakyan, creator of Sci-Hub, for the prize in 2018. Although she made the final shortlist, she did not win. Dr. Kane has nominated …

On author versus numeric citation styles

Should citations in scholarly writing appear as author-year snippets, like (Pantcheva, 2018; Zelle, 2015), or numbers, like [1,2]? Let’s refer to these two methods as author-style and numeric-style. You may have also heard them referred to as the Harvard and Vancouver referencing systems.


Here’s an example of author-style from our recent Sci-Hub Coverage Study published in eLife. First, see how citations appear in the main text:

Sci-Hub Coverage Study in eLife: author-style citations

Notice how studies with 3 …

Dangerous dusts and malignant mesotheliomas: Q&A with Alison Grimes

In 2015, Simeonov and I proposed that oxygen is an inhaled carcinogen, resulting in greater lung cancer incidence at lower elevations. Our study proved controversial, even evoking criticism from Cancer Research UK, which we responded to elsewhere on this blog. While the link between elevation and lung cancer is an open question, the association between asbestos and mesothelioma is indisputable, having been cemented by a century of epidemiologic interrogation.

Here, I chat with Alison Grimes …

University software licenses prevent reproducible science

Today is an exciting day for reproducibility in computational sciences. Continuous analysis awakens with its publication in Nature Biotechnology. Continuous analysis is a method for automatically re-executing a study whenever its source code is updated. Any changes resulting from the update are tracked and visible.

Once properly configured, continuous analysis makes a computational study fully reproducible at every state throughout its history. It works by combining two technologies. First, continuous integration monitors the source data …

The most interesting case of scientific irreproducibility?

On February 26, 2016, the first version of an article titled “How blockchain-timestamped protocols could improve the trustworthiness of medical science” was posted to F1000Research. The paper had two authors: Greg Irving of the University of Cambridge and John Holden of Garswood Surgery. The article describes a method for timestamping clinical trials, so that a trial's existence at a given time can be verified at a later date. The technique uses the Bitcoin blockchain as an immutable …

The licensing of bioRxiv preprints

Jordan Anaya of Omnes Res — creator of the PrePubMed search engine for biomedical preprints — recently compared bioRxiv to PeerJ Preprints. We agree that PeerJ offers the better technology and user experience. However, bioRxiv has greater adoption in the biodata sciences.

In fact, since my last blog post on preprints at the beginning of 2016, bioRxiv has grown by 149% from 2,785 to 6,933 preprints. The growth has been fueled largely by the efforts …

My PhD Exhibit

1,700 days after moving to San Francisco for graduate school, I gave my thesis seminar. We livestreamed the seminar on YouTube. The stream peaked at 20 concurrent viewers and had viewers from America, Bulgaria, Brazil, Britain, Canada, Czechia, France, and Israel. Here’s a shout-out to everyone who tuned in. If you missed it, the recording (below) and slides are online.

After the seminar, the Baranzini Lab organized a reception. The reception was …

Four years of fellowship: annual summaries for my NSF Graduate Research Fellowship

As a first-year graduate student at UCSF, I took a mandatory course titled Scientific Writing, which helped students apply for the National Science Foundation’s Graduate Research Fellowship. I was fortunate to receive the fellowship (Grant No. 1144247), which has funded the bulk of my PhD since its third year.

At the end of each fellowship year, fellows submit an Annual Activities Report, which includes a written Fellowship Year Summary. Below I’ve reproduced …

The history of publishing delays

Last June, I released a summary of the recent publishing delays at 3,475 journals. The post attracted lots of attention via Twitter and Nature News, primarily because scientists are frustrated with the sluggish pace of publishing.

However, a major question remained. Are publication delays getting shorter or longer? Kendall Powell, writing a feature for Nature News released in tandem with this post, contacted me. Her investigation had uncovered a widespread belief that delays were …
