A parallel and distributed mining system is proposed for finding literature on protein-protein
interactions in databases on the Internet. The proposed system uses text mining to identify
words in the literature that discriminate protein-protein interaction papers. A
threshold called matching-degree is also evaluated to check whether a given paper might be related to
protein-protein interactions. Furthermore, a keypage-based search mechanism is adopted to find
papers related to protein-protein interactions from a given document. The system is designed with a web-based
graphical user interface and a parallel and distributed job-dispatching kernel. Experimental
results indicate that the proposed system helps researchers
find protein-protein interaction literature within an overwhelming volume of information. Moreover,
the parallel and distributed architecture makes the system scalable, and the speedup
and efficiency of the system are promising. With two servers the speedup is 1.95, and with three servers
the speedup is 3.97, which yields efficiencies of 0.975 and 0.9925, respectively.
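
As a quick check of the reported figures, the sketch below computes parallel efficiency under the standard definition E = S / p (speedup divided by the number of servers). The function name `efficiency` is illustrative and is not part of the proposed system. Under this definition 1.95 / 2 gives 0.975, while 0.9925 corresponds to 3.97 / 4, so the second configuration is evaluated here with p = 4 as an assumption; with p = 3 the same speedup would give an efficiency above 1.

```python
def efficiency(speedup: float, servers: int) -> float:
    """Parallel efficiency E = S / p (standard definition, assumed here)."""
    if servers <= 0:
        raise ValueError("servers must be a positive integer")
    return speedup / servers


# Reported speedups from the abstract; the server count of 4 is an assumption
# chosen because it reproduces the reported efficiency of 0.9925.
two_servers = efficiency(1.95, 2)
four_servers = efficiency(3.97, 4)

# For comparison: the same speedup divided by 3 exceeds 1 (superlinear).
three_servers = efficiency(3.97, 3)

print(f"p=2: E={two_servers:.4f}")
print(f"p=4: E={four_servers:.4f}")
print(f"p=3: E={three_servers:.4f}")
```
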
Keywords: Protein-protein interactions, parallel and distributed computing, intelligent computing, data mining, information
retrieval, keypage-based search.