Hi folks, so it's about a week until the proposal is due. So far, we have a draft up, which should be viewable by anyone. It is almost there, so there should be no problem getting it in come submission time. However, it could still use some help on three points: what the session would actually look like, how far off the beaten path we can go without turning off the conference organizers, and how far off the beaten path we need to go to make the session useful.
One thing that would really help outside of the proposal itself is to have actual letters of support. That way the organizers will know there is serious interest and commitment for a session on Open Science - it's a gamble for them, frankly, but much less of one if there is a good crowd on board.
So if you would like to support this proposal and are willing to commit to participating should it get accepted, please send me an email to that effect (with as many details of your anticipated participation as you can provide at this time), and I will include all the emails as "supplementary material" next Friday. Please also disseminate this call on your own blogs if you can. Many thanks in advance!
Thursday, January 31, 2008
The scooping debate continues
Bora Zivkovic over at ScienceBlogs posted about a scooping story that was just published in Nature. The story itself is quite scandalous, since the individual in question doesn't have a great reputation as far as I could tell from reading various comments on blogs posting on the subject. Go see some of them at ScienceBlogs to get an idea. What I want to address in this post has to do with Bora's commentary on the story, since he suggests that in an Open Science world, scooping would be much more difficult to pull off, since everything is documented and associated with time stamps, and the community can rally behind the "first to blog". I'm not sure the picture is that simple.
Something that I did not fully appreciate before about scooping is that the "scooper" will claim that the discovery was made independently, which is difficult to disprove in many fields. Paleontology and archeology may have a very slight advantage here in that some types of discoveries are singular and tangible - a fossil in the desert, artifacts under a dirt mound - physical things at physical locations. Someone would be hard-pressed to say they were at the same place someone else was, digging up the same object (though, as the aetosaur controversy makes clear, scooping of a related sort can still happen). In biology, you might slave away for years to discover that protein A regulates protein B, but if someone else publishes it first, you're out of luck. Sure, you have the reagents and the cell lines and the protein products - but so do your competitors, as long as they have the basic equipment and resources in place to reproduce what you did. So the fear that making your research public would increase your risk of getting scooped may not be that unfounded. Closed scientists could easily leech off the hard work of Open scientists with no one being able to prove anything.
Of course, that is an extremely pessimistic view, but unfortunately, it's a knee-jerk reaction from many people in biomedical fields. Until Open Science becomes so widespread that anything Closed is viewed with suspicion, there will be the possibility of exploitation. And that statement itself smacks highly of Big Brother. I don't think we want a "tyranny of Openness" any more than we want scooping to happen. My conclusion from this rather sobering train of thought is that yes, scooping is a moral outrage and the fact that it is an issue is frustrating, but because it does happen, we need to think carefully about how to prevent it from escalating as more people go Open. Can we prevent it? Is collective disapproval enough?
Wednesday, January 30, 2008
OpenMM, Google Cell, and a marriage between informatics and simulation
Vijay Pande gave a talk today at the SimBIOS weekly seminar series here at Stanford. You may know him from such hits as "Folding@home", which has gone almost triple platinum since it was first released. He is a major figure in the protein folding and molecular simulation world, so the talk was definitely well attended.
Vijay's talk centered around a few major projects and themes, many of which I thought might be interesting to those in the Open Science world. The following are my summaries of those themes.
OpenMM - Open Molecular Mechanics
The molecular dynamics community is fragmented, with many different codes and software packages existing to do MD with overlapping functionality. A result of this is that different labs do things their own way, and new advances are adopted slowly because they must be ported into each set of codes. To address this, they are developing OpenMM, an extensible API for molecular mechanics that will, in principle, unify the MD community the way OpenGL did for graphics. If OpenMM is used as the back end to software applications, advances in theory or hardware will immediately translate to those applications.
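To make the "common back end" idea a bit more concrete, here is a minimal Python sketch of the design pattern. To be clear, this is not OpenMM's actual API: every name in it (ForceBackend, ReferenceBackend, AlternativeBackend, chain_energy) is hypothetical, and the "physics" is just a toy chain of springs. The only point is that the application code talks to a shared contract, so swapping in a new back end requires no changes on the application side.

```python
from typing import Protocol, Sequence


class ForceBackend(Protocol):
    """Hypothetical contract that every back end must satisfy."""

    def spring_energy(self, positions: Sequence[float], k: float, r0: float) -> float:
        ...


class ReferenceBackend:
    """Plain-loop implementation, standing in for an existing code base."""

    def spring_energy(self, positions, k, r0):
        total = 0.0
        # Sum 0.5 * k * (distance - r0)^2 over neighbouring particles.
        for a, b in zip(positions, positions[1:]):
            total += 0.5 * k * (abs(b - a) - r0) ** 2
        return total


class AlternativeBackend:
    """Stand-in for a later port (new hardware or a new algorithm)."""

    def spring_energy(self, positions, k, r0):
        stretches = (abs(b - a) - r0 for a, b in zip(positions, positions[1:]))
        return 0.5 * k * sum(s * s for s in stretches)


def chain_energy(backend: ForceBackend, positions: Sequence[float]) -> float:
    """Application code: depends only on the contract, not on a specific back end."""
    return backend.spring_energy(positions, k=2.0, r0=1.0)


# Toy 1D chain of three particles (made-up coordinates).
CHAIN = [0.0, 1.0, 2.5]

if __name__ == "__main__":
    for backend in (ReferenceBackend(), AlternativeBackend()):
        print(type(backend).__name__, chain_energy(backend, CHAIN))
```

Both back ends give the same energy for the same chain, and chain_energy never needs to know which one it was handed. That is the sense in which an improvement behind the API would carry over to every application built on top of it.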
Google Cell
Ok, this isn't really what he called it, but it's what I immediately thought of when he talked about their hopes to build a structural picture of an entire cell in atomic detail. I think he may have referred to the project under the name "AMOEBA", since that is what they're shooting for first. The idea is to use structural data from x-ray crystallography, cryoEM, and tomography, from which more and more high-quality data is being produced every day, and turn to physics-based simulation to fill in the details. This is a very high-level idea and I love it, if they can do it. And if they do... well, that made me think of a potential interface for it - Google Cell. Like Google Maps but for the cell instead of the Earth. Pan and zoom and click on interesting features, maybe even do searches! We're definitely years away from it, but the possibilities are endless.
Simulation-aided drug design
Many approaches towards drug design involve docking of potential ligands to rigid crystal structures. But Jim Wells at UCSF has shown that proteins can undergo allosteric changes upon binding to different ligands. Simulation can allow docking with "induced fit", providing a more realistic prediction of binding and possibly even affinity. There was a really cool animation that went along with that part of the talk, which will hopefully be posted soon.
Physics-based simulation is transferable
One of the good things about simulation is that it is transferable across many disciplines. Informatics, too, actually. But when data is scarce, it sometimes pays to borrow tools from other fields. The example Vijay used was that of the protein folding problem. We know relatively little about how proteins fold, but we do know a lot about chemistry and physics, which ultimately govern how proteins fold. So why not exploit our knowledge of chemistry and physics to try and learn more about protein folding? And that is precisely what molecular dynamics simulations do. More generally, transferability is a useful concept to keep in mind. In fact, one of the founding ideas behind SimBIOS is the idea that many diverse biological problems can be solved using the same tools and basic principles, one of them being physics-based simulation.
Informatics & Simulation, happily ever after?
An interesting observation Vijay made was that informaticians and simulation people have traditionally formed two separate, and sometimes antagonistic, camps. Something like "physics-based simulation can't teach us anything!" vs. "informatics is sloppy". But in Vijay's work, it is becoming evident that while both approaches have their strengths and weaknesses, much more can be accomplished when the two are combined. He used the analogy of trying to find a specific person in a large city. You could use a device that beeps when close to the target, but you could search all day without ever getting anywhere near the person. Instead, if you looked the person up in a phone book - used information, in other words - you could find the person's house quite easily, and then the device would be extremely useful. Informatics is very good at getting in the ballpark, but sometimes suffers from being lower resolution. Simulation, on the other hand, can be very precise, but sometimes needs guidance or it will never find the solution. A partnership between informatics and simulation therefore seems not only powerful, but natural.
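The phone book analogy can be sketched in a few lines of Python. This is my own toy illustration, not anything from the talk: the city is a line of integer addresses, the "device" is a short-range check, the "phone book" only knows which block of 100 addresses a person lives in, and all names and numbers (PHONE_BOOK, TARGET, the budget of 200 checks) are made up.

```python
from itertools import count, islice
from typing import Iterator, Optional

TARGET = 7421                # where the person actually is (unknown to the searcher)
PHONE_BOOK = {"alice": 7400}  # coarse information: start of a block of 100 addresses


def outward(start: int) -> Iterator[int]:
    """Yield addresses in order of increasing distance from start."""
    yield start
    for offset in count(1):
        yield start + offset
        yield start - offset


def scan(start: int, target: int, budget: int, radius: int = 2) -> Optional[int]:
    """Precise but short-range search: check up to `budget` addresses around start.

    Returns the first address at which the device beeps, or None if the
    budget runs out first.
    """
    for position in islice(outward(start), budget):
        if abs(position - target) <= radius:  # the device beeps
            return position
    return None


def informed_start(name: str, block_width: int = 100) -> int:
    """Coarse lookup: the middle of the block listed in the phone book."""
    return PHONE_BOOK[name] + block_width // 2


if __name__ == "__main__":
    print("device alone:       ", scan(0, TARGET, budget=200))
    print("phone book + device:", scan(informed_start("alice"), TARGET, budget=200))
```

With the same budget of checks, the device alone never gets near the target, while starting from the phone book's block puts the device within range almost immediately. The coarse step gets you into the ballpark, and the precise step finishes the job.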
Tuesday, January 29, 2008
PSB proposal up on Google Docs
I've posted the proposal on Google Docs. If you've been commenting, I invited you. Does anyone know if it's possible to make the document open to everyone?
Monday, January 28, 2008
First draft of PSB proposal
[Edit: I will move this to Google Docs later today when I get a chance, but feel free to post comments until then and I will do my best to incorporate them in the draft I put up there.]
-------------------------------------------------------------------------------
(Note: This first attempt is probably way too flowery and expansive, and I am probably missing a lot of good material. The proposal can be up to 6 pages long, and what I've written is about 1, maybe 2 pages at most. Please help augment/modify it to be more meat and less air!)
Instructions:
In 1-6 pages,
- Identify a coherent topic that can be addressed by 3 to 12 papers (define a specific technical area)
- Justify why the proposed area is appropriate for PSB (discuss why the topic is timely and important, and how the topic has been addressed in other conferences or recent publications)
- Argue that there is likely to be sufficient high quality, unpublished material to fill the session, e.g. a list of researchers you intend to solicit for papers
- Provide a short autobiographical sketch and an explicit statement that your organization endorses your involvement
-------------------------------------------------------------------------------
SESSION TITLE:
“Open Science: tools, approaches, and implications”
INTRODUCTION & FOCUS:
The practice of science undergoes constant evolution. As discoveries are made, technologies developed, and data generated, new paradigms in the way science is conducted arise and flourish. We are currently witnessing an unprecedented period of scientific and technological advancement, due mostly to the ubiquity and power of computing at multiple levels. Not only has computing drastically changed our ability to produce and analyze data, it is also changing the ways in which we store knowledge and communicate about science. A common theme emerges from these changes: openness.
Openness in science manifests itself in many ways. Open source tools and open access publishing are, by now, familiar concepts. Research would stall without the many public scientific databases and repositories available. The proliferation of such databases, in turn, has spurred the development of open standards and terminologies for data and information exchange ranging from experimental protocols [flow cytometry ref] and biological pathway descriptions [SBML and BioPAX ref] to biomedical text categorization [UMLS, MeSH refs]. Perhaps most notably, the last few years have witnessed an increased interest in what is being termed Open Notebook Science - the practice of disclosing publicly all or part of one's research or laboratory activities, usually through the use of blogs and wikis [ref on ONS].
This session would address the development and practice of Open Science with an emphasis on the following areas:
- tools and resources for facilitating Open Science (open standards for exchange, tools for conducting Open Science, databases, ontologies),
- approaches towards Open Science (implementations and investigations of standards, licensing, open access publishing, open notebook science)
- socio-cultural studies of aspects of Open Science (case studies and investigations)
- (potentially: Open Science approaches towards science education?)
JUSTIFICATION:
Without openness, science, and especially the biomedical sciences, would suffer. This is most evident with regard to open data. Many fields rely on open data from public databases such as GenBank, Swiss-Prot, and the Protein Data Bank, in addition to countless other resources. The availability of scientific literature also influences the rate at which research advances. Open data, open access, and open source have all become indispensable to research in the biomedical sciences, and their success suggests that even greater benefit would result from increased openness. Governments all over the world are throwing their weight behind Open efforts, the most recent example of which is the NIH mandate in the U.S. that all publicly-funded investigators make their publications open access, which was signed into law in early January.
It is evident that we are on the cusp of embracing Open Science, and yet rigorous forums for presenting methods and discussing issues have been rare, lost among special interest groups and fringe sessions convened for some related but different purpose. Examples are the Bioinformatics Open Source Conference, held annually as a Special Interest Group (SIG) at Intelligent Systems for Molecular Biology (ISMB), the BioOntologies and BioLINK SIGs at ISMB the past few years, and a Birds of a Feather session at ISMB 2007. In addition to ISMB, the American Medical Informatics Association (AMIA) held sessions on health data exchange and communication in 2007 and 2008, and PSB itself regularly features sessions on data integration, Semantic Web, ontologies, and BioNLP, all of which are related to Open Science as either applications, beneficiaries, or technologies. None of these previous meetings were expressly focused on Open Science as a general concept, however. The best example of an Open Science-themed meeting may be the 2008 Science Blogging Conference held in mid-January in North Carolina, where several of the sessions concentrated on Open Science, public scientific data, and Open Science in the developing world.
There is clearly interest in Open Science in the science community, and the fact that so many areas of research - including a significant proportion of those highlighted at PSB - depend on Open Science principles suggests that it is a fruitful topic for investigation. It is time for the biomedical sciences and biocomputing - which arguably have the most to gain - to begin exploring the challenges and potential within Open Science as they would any other new technology or development. In particular, systematic studies of the current scientific climate and challenges of Open Science - behavioral, cultural, technological - are needed. This session on Open Science would highlight research, tools, and issues relevant to Open Science both to those active in the Open Science community and those interested in learning about Open Science. As Open Science is a relatively novel concept, few scientific publications or conferences have addressed it specifically. A growing body of literature from popular media, including BusinessWeek, the NY Times, and Scientific American, supplements the few studies that explicitly investigate data sharing and open access literature in the biomedical sciences [Campbell EG et al 2002, Wren JD 2005, Piwowar HA 2007].
Thus, the stage is primed for additional research on the climate and culture of science, as well as tools and resources designed to facilitate Open Science. Papers can be solicited from a number of angles related to Open Science, such as from the bio-ontologies, BioNLP, or open source tools communities. We will also solicit research and policy papers from those involved in Open Access publishing and open data sharing. Importantly, however, we will invite those who are directly involved in the development or practice of Open Science, including, but not limited to: Jean-Claude Bradley (Drexel University), Cameron Neylon (University of Southampton), Rosie Redfield (University of British Columbia), Michael Barton (University of Manchester), Peter Suber (Open Access correspondent at the Scholarly Publishing and Academic Resources Coalition), Bill Hooker (Shriner Hospital), and Heather Piwowar (University of Pittsburgh).
Adoption of Open Science, although widespread in public and government institutions, is still rare at the level of the individual researcher due to technological and cultural obstacles. Both types of obstacles can be addressed in an international, scientific forum exploring the tools, resources, and questions facing Open Science. By participating in a session on Open Science, the research community convened at PSB will be uniquely prepared to undertake the much-needed methodology development, scientific inquiry, and discussion necessary for advancing Open Science.
[Edit: thinking of adding a section at the end suggesting a format and structure for the session that would make it most in keeping with and productive for Open Science, e.g.:
- All sessions have an intro tutorial the day before the conference. We should also have a tutorial, but I would prefer it not to be on ONS (save that for the actual session). Any suggestions here?
- Keynote on the history, importance, different aspects, and potential future of Open Science
- Research papers (primary talks) (2-4)
- Tools/demos/tutorials (2-4)
- Policy papers? (1-2)
- Panel discussion on some issue
- Open discussion on some question where action may be needed (maybe planning the next Open Science meeting)]
Sunday, January 27, 2008
Science and sharing
Following up on my last post on this subject, it appears there has been a recent spate of posts and articles about sharing in science. Taken together, they're a great summary of the challenges Open Science is facing, but also of the benefits and the steps some groups are taking to enhance openness.
Both Neil Saunders and Cameron Neylon point out this article in the NY Times, and a post on the 23andme blog.
In response to the 23andme post, Jasper A. Bovenberg refers to his paper in Genomics, Society, and Policy from 2005.
There is also an article from BusinessWeek in 2007 about some big pharma companies embracing Science 2.0, and an article in Scientific American from a few weeks ago debating the pros and cons.
Friday, January 25, 2008
Is the danger of being scooped field-dependent?
The students in my program get together once in a while for what we call "Researchome", also known fondly as "dinner-ome", originally conceived as a casual forum in which students could present their research or other topics of interest to other students, while getting dinner for free. But without someone to present, there is no justification to have Researchome, so rather than deprive a dozen grad students of free food, I threw together a quick presentation of Open Science and my proposal for PSB for our Researchome last night.
Biomedical informatics students are a smart bunch, so there was some great discussion. Naturally, the concern over getting scooped came up, and while I was quick to pooh-pooh it as a naive/narcissistic fear, the others were fairly adamant that it was a valid concern. Several gave personal anecdotes. And the picture that started to emerge was one where the danger of being scooped was highly dependent on the field you were in - theoretical vs. applied, basic vs. translational, science vs. medicine, all of which may put different emphasis on the idea vs. the implementation.
According to one of the students at the Researchome, in theoretical disciplines such as math or physics, credit is given as soon as an idea is recorded. But in fields like cell biology, just having the idea for an experiment or a hypothesis is not enough; instead, you must conduct the experiment and demonstrate successful results through a peer-reviewed publication before credit is given. Because of this, people in these fields are more reluctant to be open about their research before it has been published, and getting scooped can have real consequences for someone's career and funding. In my limited explorations of the world wide open science web, it seems as if a significant portion of those participating are chemists. Is getting scooped less of a concern in chemistry than it is in, say, molecular biology, and, if so, why? If there is a discrepancy between fields in the danger of being scooped, how should this be addressed as the open science community moves forward? Is it possible to change the standards by which success and intellectual credit are determined?
Aside from this interesting issue, some valid points were brought up concerning the proposal for an open science session at PSB. One is that the audience at PSB is by and large composed of scientists who don't generate their own data, but use the data generated by others. A lot of high-throughput, -omics, and bioinformatics-minded people. Therefore, open data and open source will probably be of greater interest to them than the more overarching idea of open notebook science. Focusing on standards, exchange formats, and tools and methodologies for conducting open science may be a good approach.
The other consensus that the students came to was that open science is so broad and important a topic that it should be featured as a session at a much larger conference, such as ISMB or AMIA. Many were bemused as to why I chose what is arguably a niche conference as a venue for open science. My rationale at this point is that it is the soonest we could possibly organize a meeting on open science jointly with an established conference, and I think the audience is relevant enough for it to be productive. Being smaller, it may also be a good stepping stone towards a bigger meeting, and I wouldn't be surprised if some efforts began for that before PSB 2009 comes around.
Many thanks to the students who attended the Researchome for their feedback. I'll be working on a draft of the proposal over the next couple days.
Thursday, January 24, 2008
Going forward on the PSB proposal
After finding out more information about conference logistics, it looks like we have enough momentum to start writing the proposal. Most of the discussion is taking place on Science in the Open, so let's just keep it there, at least until I get the draft of the proposal up. If you are interested in contributing to the proposal, please email me so I can set up editing permissions in the draft document!
Wednesday, January 23, 2008
The blog of negative results
Magda, inspired by her group meeting today, decided to start a blog of negative results. We've all been there before, and it probably wasn't funny then, but what they say is true: you'll look back on those times and have a good laugh. Now, with the Worst Result Ever blog, you can appreciate the humor that much faster - it's the darker and the lighter side of academia all in one!
If you have your own results that are so bad it's funny, and don't mind sharing, feel free to contact Magda so she can add it to what will surely be a growing collection.
Monday, January 21, 2008
An additional feeler for PSB
In an earlier post, I mentioned the possibility of submitting a proposal to PSB for a session on Open Science. Since then, I've gotten some informal feedback that the organizing committee is interested in the idea, but needs more information. The advice I received was to submit a regular session proposal detailing what kind of papers are expected and the people who would potentially participate. Essentially, the topic needs to be able to attract enough substantial, scientific papers for the committee to be convinced of its merit. Part of the process will involve soliciting paper submissions from specific people known to be working in the field.
On Cameron Neylon's blog, the point was made that PSB is an expensive conference to attend, and for this reason it could be a non-optimal choice for an Open Science meeting. However, if there are no strong objections to having multiple meetings on Open Science in the next year or two (if there are, then that is definitely worth discussing!), then we have nothing to lose (and probably some to gain) from submitting a proposal to (and if all goes well, hosting a session at) PSB. As Pedro Beltrao points out, it doesn't hurt to gain more exposure, even if it ends up being just a short tutorial. Also, I've been seeing a lot of Open Science activity in the US and Europe, but have yet to catch wind of it elsewhere; PSB, being located in the Pacific, draws a lot of attendees from Asia and Oceania, and it would be interesting to see what efforts and issues regarding Open Science are happening there.
So my big questions are these:
1. What should be the focus of this session on Open Science? (first, frame it as a traditional PSB session, then perhaps as a "creative" session)
2. What kind of substantial/technical/research papers can be written about Open Science?
3. Who are the major players in the field? Who would the session chair invite to submit a paper?
4. Who is willing to help write/organize the actual proposal and session?
But my most important question is this: Is anyone involved in the field interested in chairing this session?
Being a novice grad student who has very little experience with Open Science (this blog is it so far, and it only started a week ago!), I feel that such a session would be much more effective with a more senior Open Science advocate as chair. One of the duties of the session chair is to give a tutorial on the subject, another reason why someone with more experience would be better suited to this role.
Again, the deadline for session proposals is Feb 8th, so if it is going to happen, it will have to happen pretty fast. The community has already shown that bigger things (a grant proposal) can be accomplished in less time (a week), so I am not worried about getting it done if a consensus is reached.
If you have any answers or suggestions for the above 4 questions or about the session in general, and especially if you are interested in chairing, please respond to this post or email me!
* Edit: I originally mixed up Cameron's blog (Science in the Open) with JC Bradley's blog (UsefulChemistry), and have corrected it. Apologies to both!
Friday, January 18, 2008
New meaning to "publish or perish" - an opening for Open Science?
The saying "publish or perish" is well-known in academia, and typically both actions refer to the same subject - you, the aspiring/struggling grad student/post-doc/fellow/assistant professor. A recent correspondence in Nature puts a new and bracing spin on the phrase.
I think at some point most academic researchers have experienced the conflict that can arise when it is time to write a paper. On the one hand, you're getting a chance to reward those months or years of hard work with some exposure and a line on your CV, and invest in the potential for future collaborations. On the other, maybe you've just gotten started on a really promising or exciting research direction, are in a groove, work-wise, and to have something like writing suddenly vying for your attention just means that both activities suffer. You feel that you can't drop what you're working on to write the paper, but the paper writing is distracted and unfocused because you're still trying to conduct research half the time (and thinking about it more than that). But we march on to these two seemingly competing drummers, fueled somewhat by the vague hope that our work, once it is in the public domain, will also contribute a drop in the bucket that is scientific advancement of our species.
But what about other species? In conservation biology, "publish or perish" can take on a new, and frighteningly literal, meaning. Time spent working on publications is time taken away from research on ecosystems and endangered wildlife. In the meantime, Earth's natural resources and diversity suffer. To prevent this from happening, the authors of the letter suggest (only slightly ironically) the adoption of a new impact factor:
This impact factor would be based on an estimation of how much worse the conservation status of an endangered species or ecosystem might be in the absence of the candidate's research. It would select for targeted investigation that should help to fill in 'the great divide', and would exclude opportunistic ecology papers claiming to be of conservation significance.
Even if this proposal was made half in jest, it does highlight some important questions. The first sentence essentially asks: how much faster could research be conducted (and, by translation, medical or scientific advances be developed) if there were less emphasis on publication? The second sentence is quite a bit more complex, since it seems it would bring in value judgments on the worth of specific research questions - something that would be hard to define objectively and is easily influenced by prevailing trends, funding, and big talk.
So let's talk about the first idea - that the pace of scientific advancement suffers from the emphasis placed on publication. Obviously, research needs to be disseminated if it is going to contribute. But here is where Open Science comes in. Suppose Open Science and Open Notebook Science became the norm rather than the burgeoning, but still fringe, movement that it is now. Two big questions immediately come to my mind: Would publication matter as much as it does now? Would research proceed faster? I say no and yes.
With most, if not all, of your methods, data, and results made public, formal publication would not be necessary for others to learn of and benefit from your work. Peer-review may become an intrinsic part of the entire research process. Of course, a formal summary of your work adds great value and would be indispensable for someone searching for information on your field of study, but much of the pressure to publish could be alleviated. Add to that the increased exposure to the entire community and you get enormous potential to speed up your research in addition to research in general. You can learn what is working and not working in your experiments, get useful feedback and suggestions, and meet people who may be able to help you, all on a much faster timescale. At the same time, new ideas may be spawned, collaborations fostered, and interesting connections made between concepts.
Obviously, the future of Open Science is not going to be as rosy as that, at least in the early stages of its evolution (issues like patenting and privacy are valid and worth lengthy discussion in their own right, but are beyond the scope of this post). In fields like conservation biology, however, the shadow cast by "publish or perish" has terribly real implications, and the move towards Open Science will help to lift it. Can anyone really argue that Open Science is a bad thing?
Wednesday, January 16, 2008
Open Science at PSB 2009?
I attended a seminar today where we came up with a list of ideas for sessions for the 2009 meeting of the Pacific Symposium on Biocomputing (PSB). This group of people organized a session at PSB 2008 called "Multi-scale modeling", so many of the ideas were related to computational modeling. But since I am interested in Open Science, I asked whether there would be interest in having an Open Science session at PSB.
PSB is a pretty prestigious conference that prides itself on covering only the "hot topics" in biology and biocomputing. Combine that with their locale (Hawaii), and you can imagine it is fairly difficult to get accepted. But even though a session on Open Science would differ from their traditional sessions (fewer primary research papers, more tutorials / descriptions of experiences / discussion), I think it would be great to talk about Open Science there, since A) PSB is prestigious, B) attendees are probably pretty forward-thinking, C) by Jan 2009, we will be ready to have some really productive discussion, and D) did I mention Hawaii?
The deadline for submitting a session proposal is Feb 8th, which is only a couple of weeks away. If there is support for this, I will try to carry it through. Here is the call for papers, which includes guidelines for what makes a good session, and responsibilities of a session chair. I'm not sure that I would be the best choice for session chair, since I have not participated in any open science endeavors yet and the session chair is required to give a 1 hr tutorial on the session topic, so if anyone is interested in being the session chair, please let me know! Ideas for what kind of papers could be solicited would also be appreciated.
A win for Open Access
NIH's public access policy is now mandatory for all NIH-funded investigators. The policy requires submission of a full, electronic version of each published manuscript to PubMed Central, where the full text or PDF is freely available for viewing or download. The BioMed Central blog has a good post about it.
This might be the biggest push towards open access to date, with a government mandate covering the majority of US researchers. Not surprisingly, many publishers are not happy with this development, but I don't think there's much they can do about it, since they will lose a significant number of authors/manuscripts to Open Access journals if they try to deny them, and with that will go a significant amount of their prestige.
Now, what would be even more helpful (and perhaps PubMed Central is doing this) is to provide free access to the full text of each article. Even better, free access to structured text of each article. Natural language processing of biomedical text is a growing field that would benefit hugely if sections, figures, tables, captions, references, etc., were all labeled as such in a computer-readable way. Maybe it's not such a pipe dream, even, to imagine all articles structured this way and indexed with biomedical terms as an automatic pre-publication step.
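To make the "labeled in a computer-readable way" idea a bit more concrete, here is a rough sketch of what a text-mining tool gains from structured text. The markup and tag names (`section`, `caption`, `ref`) are made up for illustration and do not follow any real publisher's or archive's schema; the sample article content is also invented. It only assumes Python's standard library.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: these tag names are assumptions for illustration,
# not the schema of any actual journal or archive.
ARTICLE_XML = """
<article>
  <title>Protein A regulates protein B</title>
  <section name="Methods">
    <p>Cells were cultured for 48 hours.</p>
  </section>
  <section name="Results">
    <p>Protein B levels dropped when protein A was knocked down.</p>
    <figure id="fig1"><caption>Protein B expression over time.</caption></figure>
  </section>
  <references>
    <ref id="r1">Example reference one.</ref>
    <ref id="r2">Example reference two.</ref>
  </references>
</article>
"""


def summarize(xml_text):
    """Pull out the labeled parts of an article without guessing at layout."""
    root = ET.fromstring(xml_text.strip())
    return {
        "title": root.findtext("title"),
        # Section names come straight from the labels, in document order.
        "sections": [s.get("name") for s in root.iter("section")],
        # Captions can be collected no matter which section they sit in.
        "captions": [c.text for c in root.iter("caption")],
        "reference_count": len(root.findall("references/ref")),
    }


summary = summarize(ARTICLE_XML)
print(summary)
```

With plain text or a PDF, each of those four lookups would need its own error-prone heuristic; with labels, they are one-liners.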
It will be interesting to follow the drama between Open Access, traditional publishers, and the NIH policy in the next few months!
Friday, January 11, 2008
It's a small world after all
One Big Lab is inspired by the idea of connectedness between scientists. Research shouldn't have to be hindered by a lack of time, resources, or knowledge; if there exists someone who possesses any of those three things who is willing to help someone else, then there should be a way to connect those individuals. In the end, everybody wins - especially society, and especially science.
Many things will have to happen before this idea can become tangible, and we have some thoughts on what One Big Lab could eventually look like, but for now, this blog will be a place for reflection and discussion on all things Open Science. Hopefully, we'll learn a lot, meet some people, and bring the ideas behind One Big Lab to life.