The transPLANT integrated search service enables data integration through a single point of entry to a suite of resources under a common schema optimised for search applications. Resources include information on genes, transcripts, markers, phenotypes, and metabolic reactions (see table below).
Search results from all partner databases are returned in a consistent format, ranked by relevance to the search term used. Each result is accompanied by a URL linking the user back to the appropriate source information. The list of results matching a search term can be filtered by several search facets, including data source, data type, and species. Multiple facets can be added or removed dynamically, allowing the available information to be explored before selecting one or more links to retrieve further details.
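As an illustration of how facet filters of this kind are commonly expressed against a Solr index, the sketch below builds a query string from Solr's standard `q`, `fq`, `facet`, and `facet.field` parameters. The field names (`data_source`, `data_type`, `species`) and the helper `build_facet_query` are assumptions made for this example and are not taken from the transPLANT schema.

```python
from urllib.parse import urlencode

# Hypothetical facet field names; the real schema may use different ones.
FACET_FIELDS = ("data_source", "data_type", "species")


def build_facet_query(term, filters=None, rows=10):
    """Return a URL-encoded Solr query string with optional facet filters.

    `filters` maps a facet field to the single value it should be restricted to.
    Values are wrapped in double quotes so that phrases match exactly;
    embedded quotes are not escaped in this simplified sketch.
    """
    params = [
        ("q", term),
        ("rows", str(rows)),
        ("wt", "json"),
        ("facet", "true"),
    ]
    # Ask Solr to return counts for each facet field.
    params += [("facet.field", field) for field in FACET_FIELDS]
    # Each active facet becomes one filter query (fq).
    for field, value in sorted((filters or {}).items()):
        params.append(("fq", f'{field}:"{value}"'))
    return urlencode(params)


query_string = build_facet_query(
    "dwarf", {"species": "Hordeum vulgare", "data_type": "gene"}
)
```

Adding or removing a facet in the interface would then correspond to adding or removing one entry in `filters` and reissuing the query.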
The search is built on the open source search engine Apache Solr, with a standard schema devised to capture core information from the range of resources available. The schema is described here, and implemented here. There are two methods for adding resources to the results: 1) data from a particular resource may be collected via FTP in tabular format and indexed 'locally' (into a separate Solr core using the common schema), or 2) results may be retrieved by dynamic (sharded) querying of remote Solr servers hosted independently by individual data providers.
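The second method can be pictured with Solr's standard `shards` request parameter, which takes a comma-separated list of cores to query and merge. The following is a minimal sketch under stated assumptions: the host names, core names, and the helper `build_sharded_query` are hypothetical placeholders, not the actual transPLANT deployment.

```python
from urllib.parse import urlencode

# Hypothetical shard addresses: one locally indexed core and one remote
# core hosted by a data provider. Real hosts and core names will differ.
EXAMPLE_SHARDS = [
    "localhost:8983/solr/local_core",
    "provider.example.org:8983/solr/remote_core",
]


def build_sharded_query(term, shards, rows=10):
    """Return a URL-encoded Solr query string spanning several shards."""
    if not shards:
        raise ValueError("at least one shard is required")
    params = [
        ("q", term),
        ("rows", str(rows)),
        ("wt", "json"),
        # Solr fans the query out to every listed shard and merges the results.
        ("shards", ",".join(shards)),
    ]
    return urlencode(params)


sharded = build_sharded_query("dwarf", EXAMPLE_SHARDS)
```

Because every shard is queried with the same fields, this approach relies on each remote server exposing the common schema.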