You Don't Get to 2000 Open Data Sets Without Making a Few Friends – or: How I Got to be Called the Mark Zuckerberg of Open Source Genetics

What Have I Done?!

There are many firm believers in the different kinds of openness: open access, open source, open data, open science, open you-name-it. And at least to me, some of the most interesting things happen at the intersection of those different opens, which is probably where openSNP – the project I co-founded in 2011 – can be located. It’s an open source project which tries to crowdsource the collection of open genetic data. This is done by enabling people to donate their personal genetic information into the public domain, alongside phenotypic annotations. And for good measure we also factor in open access, by text mining the Public Library of Science and other open databases for primary literature. What started as a somewhat freakish idea in 2011 had by mid-2015 grown into a database of over 2100 genetic data sets and over 300 different phenotypes.

The Impact of Going Open

The existence of openSNP is arguably a great success for (citizen) scientists, teachers and everyone else who likes to play around with genetic data. But somewhat ironically, running a project outside academia, one designed to overcome academia’s limitations, has also had a lasting impact on my academic career: Besides teaching me lots of useful skills, working on the project led to winning awards & grants, finding jobs and collaborators, writing papers and being invited to conferences.

Grants & Awards

One of the earliest successes was winning the PLOS/Mendeley Binary Battle in 2011, for making the most creative use of their respective open Application Programming Interfaces. This had a twofold effect: First, it got us some funding to get the platform up and running, and second, it generated wide media coverage, from scientific journals such as Nature to mainstream media like Forbes. Another success grant-wise was winning one of the first Grants4Apps by Bayer HealthCare in 2013, which again generated some basic funding and media coverage. Speaking of media coverage: it culminated in what is probably the most ridiculous comparison ever made, the Mark Zuckerberg of open-source genetics.

Jobs & Papers

Getting awards and media coverage is nice, but having a job and success in terms of the main academic currency – papers – can be just as important. Being somewhat dragged into the spotlight helped in those two departments as well. In early 2012 it led to a job as a research assistant at the Max Planck Institute for Intelligent Systems, which culminated in the creation of easyGWAS, a web platform for performing genome-wide association studies in model organisms, and an accompanying manuscript.

Other resulting, still ongoing collaborations look into how Direct-To-Consumer genetic tests can be analyzed using low-budget tools and how citizen science and participant-led research are challenging the traditional research system. And last but not least, openSNP and the work we’ve done on it have been published as an open access paper in PLOS ONE.

Travel the World, Meet Interesting People

So far, doing open science has also provided lots of travel opportunities: visiting conferences and workshops, in the academic world as well as in industry. These ranged, amongst many more, from a panel in San Francisco, through a workshop and Research Hack Days in Zurich, to giving talks in Shenzhen and Dublin. While spreading the open science gospel can be very rewarding in itself, meeting others with similar interests can be even more so. Besides finding potential partners in crime (also known as collaborators), it also opens the door to new friendships.

This is the End

While being an active advocate for openness in academia and science in general is still met with raised eyebrows and shaking heads from time to time, it definitely isn’t true any longer that being open necessarily equates to career suicide. Publishing open access, doing open source and facilitating open data can also mean a huge career boost, giving networking opportunities, job offers and otherwise unpublished papers. To sum it up: 5/5, would do again.

Reviews

  • Ross Mounce

    Excellent contribution. I didn't know you were also using text mining too. Have you considered mining non-open access journals too, not just PLOS & the PMC OA subset?

    I'm sure you probably know this but the UK has a specific copyright exception to enable non-commercial text and data mining, without the explicit permission of the copyright holders (provided you have legitimate access). I hope the rest of Europe gets an even better exception that allows mining for any purpose, including commercial purposes.

    10/10 for your title by the way. Love it :)

    • Bastian Greshake

      Thanks for the great suggestions! We are already going through the catalog of Mendeley to identify papers which are about the genetic variants stored in openSNP. We link to those publications as well and for many we are also able to indicate whether they are open or closed access.

      But in general we are not too deep into text mining territory so far, would be really cool to enhance on this though. So if any experts on text mining feel like they have some spare time, we would love to get contributions!

      p.s. glad you like the title. :-)

  • Joshua Nicholson

    This is a great piece on how doing something open has helped an individual and community advance scientifically. I hope it encourages future Zuckerbergs of open science!

  • Bev Acreman

    The title is great, and the innovation described in the article is inspiring. I love the idea of things that can happen at the "intersections of these different opens", and this line sums up the article perfectly: "publishing open access, doing open source and facilitating open data can also mean a huge career boost, giving networking opportunities, job offers and otherwise unpublished papers"

  • Melissa Haendel

    I love this story. It's the story that all scientists want to come true. It is also the story of a patient's fantasies come true. It is so wonderful that kicking this off started a whirlwind of conspirators, leading to an environment where we can say, what the hay, to those insurance companies. Make my patient data free (OH WAIT, THERE IS NOT A FORM FOR THAT!).

    Would love to help make openSNP phenotype data interoperable with other genomic health efforts.

    Congratulations on an amazing idea and an amazing effort.

    4pts for inspiration factor
    5pts for compelling nature of evidence
    3pts for quality of writing
    5pts for impact of activities on personal goals, science, or scholarship.

  • Michael Crusoe

    Open science enhances serendipity, and Bastian Greshake’s success story (“You Don't Get to 2000 Open Data Sets Without Making a Few Friends – or: How I Got to be Called the Mark Zuckerberg of Open Source Genetics”) for the Winnower/ARCS contest clearly shows the many potential benefits, both predictable (increased community visibility, papers, and a job) and sublime (the hyperbolic nickname from the title, world travel).

    Like many other readers I want to hear more of this story, outside of the word limit of the competition.

    Competing interests: I met Bastian at the Bioinformatics Open Source Conference, 2015 and we follow each other on Twitter.

  • Danielle Robinson

    Solid submission outlining the author's career after developing openSNP. Some additional detail on the early years would have been interesting. Was it an immediate success? Were there specific partnerships or decisions that took the project to the next level?
    4pts for inspiration factor
    5pts for compelling nature of evidence
    3pts for quality of writing
    5pts for impact of activities on personal goals, science, or scholarship.

    17/20

  • Stephanie Westcott

    4pts for "inspiration factor"
    5pts for compelling nature of evidence
    4pts for quality of writing
    5pts for impact of activities on personal goals, science, or scholarship.

    18 total points

  • Lore Mroz

    If you want to know why the Zuckerberg comparison is so ridiculous, read the disclaimer of openSNP. It's so brutally honest that I wonder how anyone ever uploads their data on openSNP. But people keep doing it, and the above article sums up quite nicely why that might be. I especially like how the different aspects of openness are addressed - synergies with academic work, publicity, funding, networking and dealing with the occasional raised eyebrow. 5/5, would read again.

    Disclaimer: I know the author personally.

License

This article and its reviews are distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and redistribution in any medium, provided that the original author and source are credited.