Friday, April 18, 2008

Call for collaboration: calcium site predictions in need of validation

Time to walk the walk? ;)

I work in a bioinformatics lab, and one of our major projects is protein function modeling and prediction from structure. This means that we often come up with predictions, but have little in the way of experimental validation. A small project done by a post-doc in the lab is looking like it could turn into a paper, and what could really give it the juice it (and many bioinformatics papers) needs to target a top tier journal would be validation in a living system.

In brief, we have a list of predictions for potentially novel calcium-binding sites in known calcium-binding proteins (i.e. new sites in addition to the ones already known) that we would like to validate. Probably only 2 or 3 validations would be sufficient. Since these proteins already bind calcium, some kind of quantitative assay on mutant versions of the proteins may be necessary (e.g. protein X normally binds this much calcium, mutate the loop predicted to bind and show that it now binds less).

If you or anyone you know is interested in collaborating with the post-doc to validate some of her predictions experimentally, please shoot me an email or respond here. Suggestions welcome, too!

Monday, April 14, 2008

Envisioning the scientific community as One Big Lab

The blogosphere has been abuzz recently, or, at least, it seems that way if you've only been checking up on it sporadically the last few weeks. Jennifer Rohn's post about lab notebooks has spurred over 100 lively comments spanning electronic lab notebooks, peer-review, openness in science, and the reward system in science, making for an engrossing peek at the social science of science. Cameron followed up with musings on that discussion. Pawel Szczesny writes about what it means to be a freelancing scientist. All of this is fascinating and it is exciting to contemplate both what the future of science holds and the obstacles we will need to overcome; the fact that there are indeed stubborn obstacles (technological as well as cultural) and potentially tremendous rewards makes the anticipation of that future all the more heightened.

Emboldened by the collective fervor, I would like to propose an idea - an idea with the same name as this blog. But first, the back story.

About 8 months ago, one of my lab mates was writing up a short paper for submission to a translational bioinformatics conference. The work she was submitting revolved around a powerful literature-search tool tailored for pharmacogenomics called Pharmspresso. Although Pharmspresso had features lacking in existing search methods and was thus useful, the intent was for it to recognize genes, drugs and polymorphisms in free text, and so she needed a way to evaluate its performance. The evaluation task would be straightforward: given a set of pharmacogenomics papers, what percentage of the mentions of genes, drugs, and polymorphisms does Pharmspresso capture? Getting the list of recognized entities from Pharmspresso would be easy, just give it the documents and set it running. But what would be the gold standard?

Typically, gold standards are created by humans. In this case, it would be the list of entities recognized by human readers with the appropriate knowledge to make the distinctions, in the same set of papers. To get her gold standard then, she essentially asked favors of her colleagues in the lab and the department, which translated to a number of them reading papers and doing data entry during free time (or during faculty talks) at a departmental retreat in early fall - not exactly fun, but done out of a sense of duty to science and the goodness of their hearts.
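The evaluation described above amounts to measuring recall: of all the mentions the human readers recorded, what fraction did the tool also report? Here is a minimal sketch of that calculation. The `recall` function, the tuple layout, and the example mentions are all made up for illustration; none of it is taken from Pharmspresso or from the actual evaluation.

```python
def recall(gold, predicted):
    """Fraction of gold-standard mentions that the tool also reported."""
    gold = set(gold)
    if not gold:
        raise ValueError("gold standard is empty")
    return len(gold & set(predicted)) / len(gold)


# Hypothetical mentions, each recorded as (paper, entity type, entity text).
gold_mentions = {
    ("paper1", "gene", "CYP2D6"),
    ("paper1", "drug", "codeine"),
    ("paper2", "gene", "VKORC1"),
    ("paper2", "drug", "warfarin"),
}
tool_mentions = {
    ("paper1", "gene", "CYP2D6"),
    ("paper1", "drug", "codeine"),
    ("paper2", "drug", "warfarin"),
    ("paper2", "gene", "TPMT"),  # reported by the tool, absent from the gold standard
}

print(f"{recall(gold_mentions, tool_mentions):.0%}")  # prints 75%
```

The extra mention the tool reports does not affect this number; it would only show up in a precision measurement, which the post does not discuss.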

Afterwards, while socializing during one of the poster sessions, this task came up, and the discussion (in which Samuel Flores, Magda Jonikas, Yael Garten, Alain Laederach, and Bernie Daigle all participated) quickly turned to alternative solutions for tackling this and similar problems in science - those requiring knowledge and resources external to your own. As another example, many bioinformaticians work on problems that produce predictions of functions which would benefit from experimental tests of their validity. Conversely, a wet lab may benefit greatly from someone with computational expertise guiding or leading the data analysis, or even providing the hypotheses for experimental studies (in the form of predictions). This is the stuff from which many collaborations are born, but it may be difficult to find the right people in the first place, or the task at hand might seem not quite collaboration-worthy.

In essence, the problem boils down to this: you or your lab possesses a certain collection of skills, knowledge, and resources (hereafter referred to as simply resources), but your needs may not be fully addressed by what you possess. The solution lies in this simple proposition: some other person or lab has what you're looking for.

While it makes sense for a lab or individual to grow their resources and be mostly self-sufficient, at some point it becomes more economical to outsource certain tasks - to companies for antibody development, software for data analysis, supercomputers for high-throughput computing, etc. In some cases, the exchange takes place directly at the academic level, for example, with some labs maintaining and sharing specific cell lines or mouse strains for use by other researchers, or less directly through the use of published and available tools for all sorts of tasks in bioinformatics. So it would seem that outsourcing is common and accepted. But aside from these sorts of established avenues, what other needs do scientists have in conducting their research that are not easily solved? How often is a line of inquiry abandoned or slowed because of a lack of necessary skills, knowledge, or material resources?

The idea behind One Big Lab is that the scientific community should act as, well, one big lab, sharing resources when it makes sense, so that everyone, and especially the community as a whole, benefits.

During that discussion at the departmental retreat, the solution boiled down to some form of online transaction service built around a credit system. Scientist X would like 5 gold standard outputs for a certain task, so she posts a description of the task along with some credit attached. Other users can then sign up to complete the task, after which they receive the stated number of credits. Of course, in order to post tasks, you need to have a balance of credits you can draw from - which you earn by doing other people's tasks. Getting credits into the system to start needs to be figured out (give everyone N credits? Money for credits?), but assuming there's some baseline of credit floating around amongst the various users, an equilibrium should eventually be reached (at least, that's the hope).
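To make the credit flow concrete, here is a rough sketch of the bookkeeping such a service might do. Everything in it is an assumption for illustration: the `Ledger` class, its method names, the "give everyone N credits" seeding, and the choice to hold the reward in escrow when a task is posted. It also treats each task as a single unit of work, so a request for 5 gold standard outputs would be posted as 5 tasks.

```python
class Ledger:
    """Toy credit ledger: users earn credits by completing other users' tasks."""

    def __init__(self, starting_credits=10):
        # Assumption: every new user is seeded with the same number of credits.
        self.starting_credits = starting_credits
        self.balances = {}
        self.tasks = {}
        self._next_id = 1

    def register(self, user):
        self.balances[user] = self.starting_credits

    def post_task(self, poster, description, reward):
        if reward <= 0:
            raise ValueError("reward must be positive")
        if self.balances[poster] < reward:
            raise ValueError("not enough credits to post this task")
        # Hold the reward in escrow so the poster cannot spend it twice.
        self.balances[poster] -= reward
        task_id = self._next_id
        self._next_id += 1
        self.tasks[task_id] = {
            "poster": poster,
            "description": description,
            "reward": reward,
            "done": False,
        }
        return task_id

    def complete_task(self, task_id, worker):
        task = self.tasks[task_id]
        if task["done"]:
            raise ValueError("task already completed")
        if worker == task["poster"]:
            raise ValueError("posters cannot complete their own tasks")
        task["done"] = True
        self.balances[worker] += task["reward"]


ledger = Ledger(starting_credits=10)
ledger.register("scientist_x")
ledger.register("scientist_y")
task_id = ledger.post_task("scientist_x", "Annotate one paper for the gold standard", 4)
ledger.complete_task(task_id, "scientist_y")
print(ledger.balances)  # {'scientist_x': 6, 'scientist_y': 14}
```

Once every task is completed, the total number of credits is the same as when the users registered. In this toy version credits only move between users, which is why the question of how credits enter the system in the first place matters.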

Variations on this theme are natural - have a peer rating system, have the final credit payment be subject to a bidding system (based somehow on user ratings, e.g. highly rated users can ask for more credits to complete a task and the task-poster may select which user to "hire" based on the user ratings as well as how much each user is asking), have some kind of mechanism for taking transactions "offline" into serious collaborations, etc. Tasks may run the gamut from routine and rote to intellectually stimulating and scientifically rewarding. Obviously, guidelines will have to be set for what transactions may be appropriate for this forum and which ones might be more suited for formal, collaborative relationships - but even here, a forum such as this could be very useful for finding collaborators.

In addition to the scientific transaction system, there could be other features that build on the community aspect, such as journal clubs, informal manuscript review, resources for students, and discussion forums. There could be repositories for knowledge or links to existing ones, informal or formal consulting, and casual exchange of ideas which could stimulate research or professional development. All of this should reinforce the idea that science is strengthened by community and the scientific community should not be held back by insufficient allocation of resources.

Although there are a number of websites out there that tackle some of these aspects, especially the community-building ones, I haven't really seen much resembling the transaction system, which is really the core of the idea. Pawel's freelance science comes close, and what I'd like to see is a formalized community-wide online service for essentially that. Maybe this is technically infeasible right now with the way grants work (it may be difficult to justify spending time or resources on other people's research) or with the way scientists work, but I would like to think that the basic premise - bringing together people with complementary skills and resources - makes sense and balances out in everyone's favor. (Whether this premise actually pans out in practice is up for debate - if we offered credits for cash, would anyone ever do someone else's tasks, or would demand outpace supply? By the same token, there could be "freelance" scientists like Pawel who primarily complete tasks, and could then have the option of "cashing out".) I'm sure there are a ton of tricky legal, IP, financial, organizational, etc., not to mention social and cultural, issues (would you trust someone you don't know to do work for you?), but I think the idea of having One Big Lab is worth exploring.

If I had the time, skills, and business acumen I would throw together a prototype and work out a business plan, but at the moment the most I can do is outsource it to the closest thing we have to One Big Lab - the blogosphere. ;)

Incidentally, Alain Laederach had come up with a similar idea about a year earlier and we thought about naming it "Experitrade" - an online system for trading experiments, essentially, but the name sounded too corporate and the grant he wrote never got off the ground. But the idea has persisted and inspired One Big Lab.

So, I'd welcome any thoughts, logical extensions, deal-makers or deal-breakers, important issues to consider, "prior art"... does anyone think this idea has legs? Will it work if it is completely altruistic? Does adding money into the equation detract from its mission or the science? What sorts of technical and organizational roadblocks are there? Clearly it makes the most sense, if any prototype is developed, to start small - with a couple of participating labs or within a school or university, which helps with the trust issue as well. But I'd like to make sure I'm not completely missing the picture!

Friday, April 11, 2008

New paper-protocol-lab-knowledge sharing website out of Stanford

Stanford PhD student Jason Hoyt in the Department of Genetics was fed up with the inadequate presence of literature resources on the web, specifically good discussion surrounding papers, so he's set out to build his own website that would allow users to post, rate, and discuss papers, in addition to other features. Jason says,

Hey fellow colleagues and grad students. So, about a year and a half ago I got tired of the lack of good discussion around research literature online. For instance, what was the best review paper in the field of a new research project I was about to start? So, I started building a website.

What I ended up with was:
-A citation manager called 'My Libraries' (easily download papers to EndNote)
-A lab database called 'WikiGroups' for any lab in the world
-A protocols database
-A paper search that gives better results than PubMed (this depends on you adding more papers)
-Import papers from PubMed
-Contact or colleague manager called 'Notes'
-A 'My World' page that gathers all the latest from your colleagues, lab group activities and school seminars.

It's in beta, so please report any bugs or feature requests (form available on all pages).

It's called Ologeez. From the plural of the suffix "-Ology," it refers to every branch of learning. If you find it useful, let other departments or schools know.

After very briefly exploring Ologeez, it seems like a competent addition to the handful of other science-oriented resource- and knowledge-sharing websites currently available. OpenWetWare offers lab websites and shared protocols, but doesn't have literature-oriented resources. PLoS ONE has a journal club feature, but only for PLoS ONE papers, and PLoS doesn't host lab websites or protocols. Laboratree and SciLink offer nice networking and some content management features, but don't support lab websites, and literature discussion is indirect at best. Although Ologeez has very few users and entries right now, people may find it useful to be able to set up a lab presence with shared protocols and papers, post and discuss interesting papers, and keep up to date with what their colleagues are doing, all in one website. It includes categories for all branches of science and research, including business/econ, law, and math.

Given its inclusiveness, it has the potential to spread school-wide, though it'll be interesting to see if it catches on enough for the discussion and search features to be useful.