Tuesday, February 26, 2008

Tools for analyzing "lists" in biology

My latest research is focused on cluster/list annotation in biology. Given a cluster or list of genes or proteins that were grouped together using some metric (expression profile, sequence or structure similarity, interactions, etc.), how can you discover descriptive terms or labels for that cluster? This seems to be a common question, and yet I've had trouble finding tools that do specifically what I am trying to do: investigate a list of biological entities as a group. I've found many that give you tons of information for single genes or proteins, which isn't much help for a whole list, and a few that give you information for a group, but these are either organism-specific or limited to one or two types of data (e.g. GO terms).
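
To make the question concrete, here is a minimal sketch of one common baseline for labeling a list: term over-representation scored with a hypergeometric test. This is not the text-based method I am developing, and it is not taken from any particular tool. The function names, the `annotations` mapping (gene to set of terms), and the `alpha` cutoff are all assumptions made up for illustration, and there is no multiple-testing correction.

```python
from math import comb


def hypergeom_tail(k, K, n, N):
    """P(X >= k) when drawing n genes from N, of which K carry the term."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def enriched_terms(cluster, annotations, alpha=0.05):
    """Return (term, count_in_cluster, p_value) for over-represented terms.

    annotations: hypothetical dict mapping gene -> set of terms; its keys
    are treated as the background. Genes not in the background are ignored.
    """
    background = set(annotations)
    members = set(cluster) & background
    N, n = len(background), len(members)

    # How many background genes carry each term.
    term_counts = {}
    for terms in annotations.values():
        for t in terms:
            term_counts[t] = term_counts.get(t, 0) + 1

    # How many cluster genes carry each term.
    cluster_counts = {}
    for g in members:
        for t in annotations[g]:
            cluster_counts[t] = cluster_counts.get(t, 0) + 1

    results = []
    for t, k in cluster_counts.items():
        p = hypergeom_tail(k, term_counts[t], n, N)
        if p <= alpha:
            results.append((t, k, p))
    return sorted(results, key=lambda r: r[2])


# Toy data, invented purely for illustration.
toy_annotations = {
    "g1": {"kinase", "membrane"},
    "g2": {"kinase"},
    "g3": {"kinase"},
    "g4": {"membrane"},
    "g5": {"membrane"},
    "g6": {"membrane"},
    "g7": {"membrane"},
    "g8": {"membrane"},
    "g9": {"membrane"},
    "g10": {"transport"},
}
toy_result = enriched_terms(["g1", "g2", "g3"], toy_annotations)
```

The terms here could just as well be GO terms, words pulled from abstracts, or anything else you can attach to a gene, which is exactly why the limited range of data types in existing tools feels so restrictive.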

Since I am developing a text-based method to do this, I'd like to be able to compare it to existing methods that solve the same problem. What I am looking for is two or three available methods that give you information relevant to a list of biological entities from multiple species, at least one of which uses literature or text mining. Does anyone know of such methods, or have ideas of where to look? Various PubMed and Google searches have failed me!

Unrelated, but also done today: I submitted the PSB proposal to Nature Precedings, as several of you requested. I will update once I hear back from their review process.

3 comments:

Bill Hooker said...

Try asking over at the Open Helix blog; they were very helpful to me.

shwu said...

Thanks! I've left them a comment and look forward to seeing what they come up with! By the way, that What's Your Problem thing they've got going is pretty awesome... open bioinformatics consulting!

Jean-Claude Bradley said...

I'm happy to see you went the Precedings route with your proposal! If more people started doing that it could really be useful to scientific progress.