Here, we attempt to identify and catalog physical interactions between pairs or groups of proteins using text mining. Protein-protein interactions are important for studying intracellular signaling pathways, modeling protein structures, and understanding other cellular processes. Identifying protein-protein interactions for fusion proteins is an area that still lacks useful information and resources. ProtFus catalogs possible mentions of fusion protein interactions found in text using a Natural Language Processing method. Publicly available information from biomedical research is readily accessible through the internet and is becoming a powerful resource for predicting protein-protein interactions and for protein docking.
Text mining takes comparatively less time and fewer resources than classical high-throughput techniques. ProtFus detects binary relations between interacting proteins in individual sentences using rule-based or pattern-based information extraction. In principle, this text mining approach is broadly divided into two standard stages: information retrieval, in which literature containing the names of either or both proteins or protein complexes is selected, and information extraction, in which occurrences of tokens or residues are detected and retrieved.
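The two stages can be illustrated with a minimal Python sketch. It is not the ProtFus implementation: the trigger phrases, the function names (`retrieve`, `split_sentences`, `extract`), and the example sentences are all hypothetical and chosen only to show how a pattern-based extractor might pull binary relations out of single sentences after a simple retrieval filter.

```python
import re
from itertools import permutations

# Hypothetical trigger phrases; an assumption, not the actual ProtFus rule set.
TRIGGERS = ("interacts with", "binds to", "binds", "associates with")


def retrieve(documents, proteins):
    """Stage 1, information retrieval: keep texts naming at least one protein."""
    names = [p.lower() for p in proteins]
    return [d for d in documents if any(n in d.lower() for n in names)]


def split_sentences(text):
    """Naive sentence splitter on terminal punctuation followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract(documents, proteins):
    """Stage 2, information extraction: match '<A> <trigger> <B>' per sentence."""
    relations = []
    for doc in documents:
        for sentence in split_sentences(doc):
            # Ordered pairs, so that "A binds B" and "B binds A" stay distinct.
            for a, b in permutations(proteins, 2):
                for trigger in TRIGGERS:
                    # Lookarounds stop a name from matching inside a longer,
                    # hyphenated name (e.g. "BCR" inside "BCR-ABL1").
                    pattern = (
                        rf"(?<![\w-]){re.escape(a)}(?![\w-])\s+"
                        rf"{re.escape(trigger)}\s+"
                        rf"(?<![\w-]){re.escape(b)}(?![\w-])"
                    )
                    if re.search(pattern, sentence, flags=re.IGNORECASE):
                        relations.append((a, trigger, b))
    return relations


# Made-up example input, for illustration only.
example_docs = [
    "BCR-ABL1 interacts with GRB2 in leukemia cells. The weather was fine.",
    "GRB2 binds to SOS1.",
    "No proteins are named here.",
]
example_proteins = ["BCR-ABL1", "GRB2", "SOS1"]

selected = retrieve(example_docs, example_proteins)
for relation in extract(selected, example_proteins):
    print(relation)
```

A rule set this small only catches sentences in which the two names sit directly on either side of the trigger phrase. A practical system would also need a protein name dictionary with synonyms and patterns that tolerate intervening words, which this sketch leaves out.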