Google Summer of Code 2007
"Google Summer of Code is a program that offers student developers stipends to write code for various open source projects. Google works with several open source, free software and technology-related groups to identify and fund several projects over a three-month period. Historically, the program has brought together over 1,000 students with over 100 open source projects to create hundreds of thousands of lines of code."
get more open source code created and released for the benefit of all
inspire young developers to begin participating in open source development
help open source projects identify and bring in new developers/committers
provide students the opportunity to do work related to their academic pursuits during the summer
give students more exposure to real-world software development scenarios
What is GenMAPP?
GenMAPP is a visualization and analysis tool for biological data. GenMAPP illustrates the relationships between various genes and proteins to help researchers understand their data in terms of connected, biological pathways. Over 15,000 people from more than 40 countries have registered to download the GenMAPP program. There are over 260 publications that reference GenMAPP or use GenMAPP to display data in the context of biological pathways. GenMAPP is 100% open source, and all new development is in Java, MySQL, Derby, XML, and Web technologies such as wikis, in collaboration with the UCSF library, BiGCaT Bioinformatics, and the Cytoscape Consortium. Our development team is composed of individuals who are both biologists and programmers, providing a unique perspective on building and using open source tools.
Accepted GSoC Projects
Enhanced search strategy in Cytoscape: by Maital Ashkenazi, mentored by David States, Allan Kuchinsky, and Gary Bader
Visual history of pathway modification: by Martijn van Iersel, mentored by Alexander Pico, Kristina Hanspers, and Patrick Ahles
GSoC Pathway Editor for WikiPathways: by Thomas Kelder, mentored by Kristina Hanspers, Alexander Pico, and Bruce Conklin
Graph layout library for GenMAPP: by Nikolic Aleksandar, mentored by Michael Smoot, Gary Bader, and Scooter Morris
Resources
Communication
Email
Project Wiki Pages (above)
Live voice chat on Skype
Live text chat on SlashNET: irc.slashnet.org #genmapp-gsoc (useful commands)
Live text chat on SlashNET: irc.slashnet.org #summer-discuss (useful commands)
Live virtual socials in Second Life! wiki:/SecondLifeMeeting
Start your own blog!
GSoC Planet
Blogs
For Students
For Mentors
Overview of Ideas
As we are prototyping new features and functions for GenMAPP we are exploring a number of more general areas ideal for Google Summer of Code students. If you have solid CS skills and have interests in the biological domain (do you think genes are cool?), then you should apply!
IDEA 1: WikiPathways: Content Management
Ever hear of Wikipedia? Well, we are doing the same thing for biological pathways. A major hurdle to scientific research involving pathways is collecting and maintaining the information from the scattered (and busy) experts in various fields of study around the world. In the same manner as Wikipedia, we'd like to bring the power of editing pathways to the people. This approach complements, but contrasts with, more traditional efforts involving specialized groups of curators. The success of a wiki relies on good content management: resolving changes, divergences, convergences and disputes in a way that is fair, impartial and effective. A good wiki also needs intuitive search and navigation strategies that evolve with the complexity of the site. Let's work on a body of code that controls content management and navigation for wiki projects. We can test the code immediately on our WikiPathways site. We already have content and eager contributors. If we build it, they will come.
IDEA 1A: Pathway history
An essential feature of wikis is that the history of edits is stored. This encourages contributions, because contributors know they cannot accidentally destroy information. But visualizing history is easier for one-dimensional text than for pathways, which are two-dimensional vector graphics. Visualizing the history of pathways in an intuitive way is a major challenge that we'd like to see solved.
IDEA 1B: Pathways and the semantic web
The concept of wikis brings us closer to the ideal of the semantic web, where information is in a context suitable for automatic analysis. BioPAX is based on the Web Ontology Language (OWL) and is a more advanced way to describe biological pathways. By making it possible to incorporate BioPAX information into WikiPathways, we bring the ideal of the semantic web a step closer.
IDEA 1C: Embedded pathway editor
To encourage scientists to contribute their pathway information to the community, WikiPathways has to provide an easy and intuitive way of editing pathways. Currently the threshold for editing a pathway is rather high: an external editor is started using Java Web Start, many JARs are downloaded, and a JVM has to be installed on the client machine. It would be better to implement an editor that is embedded in the wiki page (e.g. using AJAX), making it possible to start editing a pathway with a single mouse click.
IDEA 2: Innovative & Intuitive Interfaces
A major downfall of most informatics tools in the biological domain is their poorly designed, clunky GUIs. As biologists, we have a lot of ideas about how the interfaces should look, feel and work. As developers, we often think about what can be done instead of what should be done. Like most interfaces, ours need to display available options and provide easy access to common functions. Good designs can be implemented immediately in the interfaces for the next version of GenMAPP.
IDEA 2A: Database Interface
There are some nice interfaces for search, like Apple's Spotlight, with auto-complete and continuously updated indexing. We are interested in developing a similarly intuitive interface for database-driven editing. Imagine drawing a pathway using a drafting board like in Illustrator and then linking the objects you've drawn to a dedicated database of gene names and attributes. We have the drafting board and we have the databases. What we need is a good interface for identifying and linking the visual objects with the proper database objects.
IDEA 2B: Data Import Interface
We need upload/import interfaces that are specialized for different biological data types (listed below). This interface would also need to support the application of pre-processing algorithms and statistics to help define color-coded visualizations to be used in downstream analyses.
RNA microarray
Proteomics
Single Nucleotide Polymorphisms (SNPs)
Protein-Protein interactions
All-exon splicing arrays
IDEA 3: Visualizing Multiple Data Points/Types
Biological networks are frequently associated with data that come from a diversity of sources -- e.g. gene expression via microarrays, protein abundance via mass spectrometry, chromosomal aberrations via comparative genomic hybridization, genetic polymorphisms. A number of approaches have been developed for overlaying data from multiple sources on network nodes and edges, for example mapping multiple gene expression measurements onto a 'striped' node color overlay. We are seeking proposals that implement novel, alternative ways of visualizing multiple data measurements on network nodes and edges.
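To make the 'striped' overlay mentioned above concrete, here is a minimal sketch of how a node's bounding box could be divided into one colored stripe per measurement. Everything in it is an assumption for illustration: the class StripedNodeOverlay, the Stripe type, and the blue-to-red gradient are hypothetical and are not taken from GenMAPP or Cytoscape.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: one vertical stripe per measurement inside a node's box. */
class StripedNodeOverlay {

    /** A colored rectangle; rgb is packed as 0xRRGGBB. */
    static final class Stripe {
        final double x, y, width, height;
        final int rgb;

        Stripe(double x, double y, double width, double height, int rgb) {
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
            this.rgb = rgb;
        }
    }

    /** Maps a value in [min, max] onto an assumed blue (low) to red (high) gradient. */
    static int toRgb(double value, double min, double max) {
        double t = (max > min) ? (value - min) / (max - min) : 0.5;
        t = Math.max(0.0, Math.min(1.0, t)); // clamp out-of-range values
        int red = (int) Math.round(255 * t);
        int blue = 255 - red;
        return (red << 16) | blue;
    }

    /** Splits the box (x, y, w, h) into equal-width stripes, one per value. */
    static List<Stripe> stripes(double x, double y, double w, double h,
                                double[] values, double min, double max) {
        List<Stripe> result = new ArrayList<>();
        if (values.length == 0) {
            return result; // nothing to draw for a node without data
        }
        double stripeWidth = w / values.length;
        for (int i = 0; i < values.length; i++) {
            result.add(new Stripe(x + i * stripeWidth, y, stripeWidth, h,
                                  toRgb(values[i], min, max)));
        }
        return result;
    }

    public static void main(String[] args) {
        double[] expression = {-2.0, 0.0, 2.0}; // made-up example measurements
        for (Stripe s : stripes(0, 0, 90, 30, expression, -2.0, 2.0)) {
            System.out.printf("x=%.1f width=%.1f color=#%06X%n", s.x, s.width, s.rgb);
        }
    }
}
```

A proposal for this idea would presumably go beyond stripes, but a baseline like this makes it easier to compare alternatives on the same data.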
IDEA 4: Network Workspace Sharing/Publishing
Currently, users are able to save their session in a single file for later use or to pass to their colleagues. We are seeking proposals to create a novel system for saving an interactive project to the web. You could imagine that a biologist in Boston could share their new molecular interaction network on the web and their collaborator in Paris could immediately interact with it. This would support more sharing and collaboration in biology.
IDEA 5: New Network Visualizations
We have a core set of basic graph (network) layout algorithms, from both proprietary and open source libraries. Biologists clearly need more advanced layout algorithms that consider biological information when performing the layout. We seek proposals to write new layout algorithms that make use of biological attribute data. For instance, a layout could organize nodes in concentric rings, with each ring holding nodes of one type or of one degree. Lots of cool ideas are showcased at http://www.visualcomplexity.com/
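As a rough illustration of the concentric-ring example, the sketch below groups nodes by a single attribute and spaces each group evenly around its own ring. It assumes nodes are plain string ids with one string attribute; the class ConcentricRingLayout and its layout method are hypothetical and do not use the Cytoscape layout API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: place nodes on concentric rings, one ring per attribute value. */
class ConcentricRingLayout {

    /**
     * @param nodeTypes   node id mapped to its attribute value (e.g. "kinase")
     * @param ringSpacing distance between consecutive rings
     * @return node id mapped to an {x, y} position centered on the origin
     */
    static Map<String, double[]> layout(Map<String, String> nodeTypes, double ringSpacing) {
        // Group node ids by attribute value, keeping first-seen order
        // so that ring assignment is deterministic.
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : nodeTypes.entrySet()) {
            groups.computeIfAbsent(entry.getValue(), k -> new ArrayList<>())
                  .add(entry.getKey());
        }

        Map<String, double[]> positions = new LinkedHashMap<>();
        int ring = 1;
        for (List<String> members : groups.values()) {
            double radius = ring * ringSpacing;
            for (int i = 0; i < members.size(); i++) {
                // Spread the members of this group evenly around the ring.
                double angle = 2 * Math.PI * i / members.size();
                positions.put(members.get(i),
                        new double[] {radius * Math.cos(angle), radius * Math.sin(angle)});
            }
            ring++;
        }
        return positions;
    }

    public static void main(String[] args) {
        Map<String, String> types = new LinkedHashMap<>();
        types.put("A", "kinase");   // made-up example nodes
        types.put("B", "kinase");
        types.put("C", "receptor");
        layout(types, 100.0).forEach((id, p) ->
                System.out.printf("%s -> (%.1f, %.1f)%n", id, p[0], p[1]));
    }
}
```

Grouping by degree instead of type would only change how the groups map is built; a real proposal would also need to think about edge crossings between rings.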
IDEA 6: Advanced Editing Features and Tools
We have a simple editing tool with a limited set of features. This is sufficient to support most tasks. However, additional tools and features would be useful in supporting tasks like preparation of figures for publication or presentation. Examples of such features are
support for grid lines and snapping to grid
management of clipboard objects
keyboard accelerators for editing operations
a command-line 'script' interpreter for adding nodes and edges by typing text, e.g. 'A inhibits B'
adding simple graphical annotations, such as text, basic shapes, and symbols
and other features/tools that one might expect in products like Adobe Illustrator and Microsoft PowerPoint. We seek proposals for enhancing our network editing environment with novel features and tools.
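For the command-line 'script' interpreter in the list above, a first step might look like the sketch below, which turns a line such as 'A inhibits B' into a source node, an interaction type, and a target node. The three-token "source verb target" grammar is an assumption based on that single example, and the class EdgeCommandParser is hypothetical rather than part of any existing editor.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: parse lines like "A inhibits B" into an edge description. */
class EdgeCommandParser {

    /** A directed edge from source to target, labeled with an interaction type. */
    static final class Edge {
        final String source, interaction, target;

        Edge(String source, String interaction, String target) {
            this.source = source;
            this.interaction = interaction;
            this.target = target;
        }
    }

    // Assumed grammar: exactly three whitespace-separated tokens.
    private static final Pattern COMMAND =
            Pattern.compile("\\s*(\\S+)\\s+(\\S+)\\s+(\\S+)\\s*");

    /** Returns the parsed edge, or empty if the line does not fit the assumed grammar. */
    static Optional<Edge> parse(String line) {
        Matcher m = COMMAND.matcher(line);
        if (!m.matches()) {
            return Optional.empty();
        }
        // Normalize the verb so that "Inhibits" and "inhibits" are treated alike.
        String interaction = m.group(2).toLowerCase(Locale.ROOT);
        return Optional.of(new Edge(m.group(1), interaction, m.group(3)));
    }

    public static void main(String[] args) {
        parse("A inhibits B").ifPresent(e ->
                System.out.println(e.source + " -[" + e.interaction + "]-> " + e.target));
    }
}
```

Node names containing spaces, or commands that add several edges at once, would need a richer grammar than this.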
If you want to apply
We would like to know who you are and how you think. Incorporate the following into your application:
Your interests w.r.t. programming
The background behind your interests
Your interests and background w.r.t. biology (not required)
Your ideas for a project (or expand from one above)
Refer to and link to other projects or products that illustrate your ideas
What can you bring to the team?
For more background information see:
GenMAPP Website: Current open source tool.
GenMAPP Wiki: GenMAPP development wiki
Cytoscape Website: Another open source network tool which will serve as the platform for future GenMAPP development in Java.
Cytoscape Wiki: Cytoscape development wiki
Cytoscape Javadoc: Cytoscape API and source code documentation
Cytoscape WebStart: Run Cytoscape as a WebStart App!
PathVisio Website: Another open source pathway visualization tool which will serve as the editor for WikiPathways.org.
PathVisio Webstart: Run PathVisio as a WebStart App!
If you are selected
You will be working with a small, active group of programmers who also speak biology
You will be gaining experience in a rapidly evolving field that interfaces computer and biological sciences
You might make more than you would mowing lawns!