ted sequences in all selected organisms but not in other related organisms. An initial total of September Pathogen-Host Omics Data proteome. When choosing diagnostic targets, one needs to take care that these two non-pathogenic organisms are not detected. The initial list of in all studies. Either the proteins or the DNA coding for these proteins can be used to develop and test pathogen detection systems. All the “core unique” proteins detected in this study lacked meaningful functional annotation, which is not surprising, as such unique proteins are not easy to characterize. One protein identified as a target is a remnant of a prophage protein. Such proteins are well known to be related to virulence. Another protein is from the pXO
Discussion
A systems approach to biology or medicine requires the sharing, integration and navigation of large and diverse experimental data sets to develop the models and hypotheses required to make new discoveries and to develop new treatments. To date, this has most often been done with selected research data, or within an institution or program where common instrumentation and methods make standardization of experimental practices and data management easier to achieve. Alternative approaches require either reanalyzing all the data with a common methodology, as has been done in some data repositories, or assigning a common statistical metric to all data of a certain type to allow functional coupling. These approaches are all potentially useful, but practically difficult to achieve on a large scale with heterogeneous data. The protein-centric approach we employed is a relatively simple, yet powerful and practical, way to integrate and navigate diverse sets of omics data in a manner useful for systems biology. Proteins are often the biologically functional elements in cellular networks; thus, many types of data can be mapped to and through proteins as a common biological object.
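As a minimal sketch of what mapping data "to and through proteins" can look like in practice, the Python below groups records from three data types under a shared protein identifier and then asks which proteins have evidence in several types at once. Everything in it is an assumption for illustration: the accessions, field names, and the helpers `integrate` and `proteins_with` are hypothetical and do not reflect the actual MPD schema or code.

```python
from collections import defaultdict

# Hypothetical records; accessions and values are illustrative only.
microarray = [
    {"protein_id": "P00001", "experiment": "array_A", "regulated": "up"},
    {"protein_id": "P00002", "experiment": "array_A", "regulated": "down"},
]
mass_spec = [
    {"protein_id": "P00001", "experiment": "ms_B", "detected": True},
]
interactions = [
    {"protein_id": "P00001", "partner_id": "P00003"},
    {"protein_id": "P00002", "partner_id": "P00001"},
]


def integrate(**datasets):
    """Group records from every dataset under their shared protein identifier."""
    index = defaultdict(lambda: defaultdict(list))
    for data_type, records in datasets.items():
        for record in records:
            index[record["protein_id"]][data_type].append(record)
    return index


def proteins_with(index, *data_types):
    """Return sorted protein IDs that have records in all named data types."""
    return sorted(
        pid
        for pid, by_type in index.items()
        if all(t in by_type for t in data_types)
    )


index = integrate(
    microarray=microarray, mass_spec=mass_spec, interactions=interactions
)
print(proteins_with(index, "microarray", "mass_spec"))
```

Because the protein identifier is the only shared key, a new data type can be added by passing one more keyword argument to `integrate`, without changing how existing types are stored or queried.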
The lightweight data warehouse approach used for the MPD proved useful in practice, especially with large datasets, as its simple design and schema allow greater flexibility to add new data types and to modify search and analysis capabilities. Similar lightweight approaches and schemas designed to optimize queries have been shown to be useful in the integration of genomic data. The main drawback of this approach is that the warehouse does not contain all the data. However, this is rarely a problem if the data are available in some other data resource optimized for that particular data type and if some upfront analysis of the user’s needs for query and analysis options is performed. For example, our use case analysis suggested that for microarray and mass spectrometry data, individual raw intensities, machine-specific parameters and most calculated numerical values were not required for general queries and analysis across the combined data, as these values were comparable only within the particular analysis performed in one lab. As a result, most numerical values were not included in the MPD for the default search but are accessible for display via hyperlinks to our Protein Data Center or FTP site. However, if a new attribute appears or users request searches on a particular value omitted from the warehouse, adding it is a relatively simple matter of adding new data columns. For instance, in example II, our combination and analysis of mass spectrometry and protein interaction data, we could include peptide