Application Note pubs.acs.org/jcim
Activity Landscape Plotter: A Web-Based Application for the Analysis of Structure−Activity Relationships
Mariana González-Medina,* Oscar Méndez-Lucio, and José L. Medina-Franco*
School of Chemistry, Department of Pharmacy, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510, Mexico
Supporting Information
ABSTRACT: Activity landscape modeling is a powerful method for the quantitative analysis of structure−activity relationships. This cheminformatics area is in continuous growth, and several quantitative and visual approaches are constantly being developed. However, these approaches often fall into disuse because access to them is limited. Herein, we present Activity Landscape Plotter as the first freely available web-based tool to automatically analyze structure−activity relationships of compound data sets. Based on the concept of activity landscape modeling, the online service computes pairwise structure and activity comparisons from an input data set supplied by the user. For visual analysis, Activity Landscape Plotter generates Structure−Activity Similarity and Dual-Activity Difference maps. The user can interactively navigate through the maps and export all the pairwise structure−activity information as comma-delimited files. Activity Landscape Plotter is freely accessible at https://unam-shiny-difacquim.shinyapps.io/ActLSmaps/.
■
INTRODUCTION
Structure−activity relationship (SAR) analyses based on the concept of activity landscape modeling (ALM) are becoming common practice in drug discovery. They are used to identify property cliffs,1 to guide compound optimization,2 and to avoid the detrimental effects of activity cliffs on classical quantitative structure−activity relationship (QSAR) models and similarity searching.3 Some tools introduced to model activity landscapes have been implemented in stand-alone programs or commercial packages; however, their availability and implementation could limit their broad use. Examples are SARANEA,9 which includes functions such as SARI4 and network-like similarity graphs (NSGs);5 DataWarrior,6 free software to analyze activity cliffs, calculate the Structure−Activity Landscape Index (SALI),7 or generate hierarchical trees; and Canvas,8 a commercial suite that has an application for activity cliff analysis. Of note, Guha and Van Drie have made freely available R functions to generate SALI matrices and a network visualization of such matrices.7 However, no online service is openly available to the community that automatically outputs all pairwise structure−activity comparisons, generates Structure−Activity Similarity (SAS) maps, Dual-Activity Difference (DAD) maps, or SALI values, and supports analysis of the results. The goal of this work is to present the features of Activity Landscape Plotter, a free web-based tool for ALM. The current features and implementation of the first version of Activity Landscape Plotter are discussed and exemplified with two benchmark data sets of relevance in epigenetic drug discovery.
© 2017 American Chemical Society
■
IMPLEMENTATION AND FEATURES
Activity Landscape Plotter is an online service that generates two major types of analysis for activity landscape modeling: SAS maps9 and DAD maps.10 Table 1 summarizes the major functions implemented in the current version of the plotter. The back-end of Activity Landscape Plotter was developed in the R programming language. The front-end was implemented with the R package Shiny,11 and all the fingerprints and similarity values are computed using the R package rcdk.12 The input data must be uploaded to Activity Landscape Plotter as a comma- or tab-delimited file in which the first column contains user-provided SMILES, followed by one to four columns with activity data provided as −log values, such as pIC50 or pKi, and a column with the compound identification (ID). The input file may contain at most 400 compounds; otherwise the user receives a warning and the app does not produce the plots. The outputs of each function implemented in Activity Landscape Plotter are summarized in Table 1 and further elaborated in the following sections.

Received: December 20, 2016
Published: February 24, 2017
Journal of Chemical Information and Modeling
DOI: 10.1021/acs.jcim.6b00776 J. Chem. Inf. Model. 2017, 57, 397−402

Table 1. Major Functions Implemented in the Current Version of Activity Landscape Plotter

function: SAS map
    characteristics: data points color-coded by SALI, the most active compound in the pair, or density
    description of output (a), for the chosen fingerprint and activity: csv file with the user-given ID and SMILES for the pair of compounds, similarity, activity difference, the activity value of the most active compound in the pair, and SALI
    description of output (a), for the chosen thresholds: csv file for each area of the plot with the user-given ID and SMILES for the pair of compounds, similarity, activity difference, the activity value of the most active compound in the pair, and SALI value
    ref: SAS maps color-coded by SALI were implemented in this tool13,14

function: DAD map
    characteristics: data points color-coded by similarity, selectivity, or density
    description of output (a), for the chosen fingerprint and activities: csv file with the user-given ID and SMILES for the pair of compounds, similarity, selectivity, activity difference 1, and activity difference 2
    description of output (a), for the chosen thresholds: csv file for each area of the plot with the pair of compounds, similarity, activity difference 1, and activity difference 2
    ref: DAD maps color-coded by selectivity and density were implemented in this tool10

(a) All maps can be exported as 900 × 800 pixel TIFFs.

Figure 1. SAS map panel and menu in Activity Landscape Plotter.

Structural Similarity. The user can choose between three molecular fingerprints from the R rcdk package12 to calculate all pairwise similarities. The ECFP option encodes circular fingerprints with a diameter of four or six, Pubchem represents PubChem’s binary substructure fingerprints with 881 bits, and MACCS computes the 166-bit MACCS keys as described by MDL Information Systems. The current version of Activity Landscape Plotter uses the Tanimoto index15 to quantify the fingerprint-based similarity values.

Activity Difference. For the SAS maps, the activity difference plotted on the Y-axis is the absolute difference between the activity values of each pair of compounds for the chosen target. For example, for pIC50:

$$|\Delta \mathrm{pIC}_{50}(T)_{i,j}| = |\mathrm{pIC}_{50}(T)_i - \mathrm{pIC}_{50}(T)_j| \tag{1}$$

where $\mathrm{pIC}_{50}(T)_i$ and $\mathrm{pIC}_{50}(T)_j$ are the activities of the ith and jth molecules (j > i) for target T.16 For the DAD map, this tool computes the activity difference for each possible pair in the data set against both targets. For example, for pIC50 the activity differences for each target are calculated with the following expression:

$$\Delta \mathrm{pIC}_{50}(T)_{i,j} = \mathrm{pIC}_{50}(T)_i - \mathrm{pIC}_{50}(T)_j \tag{2}$$

where $\mathrm{pIC}_{50}(T)_i$ and $\mathrm{pIC}_{50}(T)_j$ are the activities of the ith and jth molecules (j > i) for target T.16

Figure 2. (A) Prototype SAS map divided in four quadrants: nondescriptive, activity cliffs, similarity cliffs, and smooth SAR. (B) Prototype DAD map. The thresholds intersect the axes at activity difference values of ±t, generating nine quadrants: Z1u, Z1d, Z2u, Z2d, Z3u, Z3d, Z4l, Z4r, and Z5.

Figure 3. Examples of SAS maps generated with different tools. The plots were generated using activity vs HDAC1 (Activity 1 in the input file) and the extended connectivity fingerprint (ECFP) diameter 4. (A) SAS map generated with in-house Python scripts. (B) Plot generated with Activity Landscape Plotter. Data points are colored by SALI.

SALI. SALI was introduced7 to quantify activity cliffs in terms of similarity and activity difference. In Activity Landscape Plotter, SALI is computed using the equation:

$$\mathrm{SALI}_{i,j} = \frac{|A_i - A_j|}{1 - \mathrm{sim}(i,j)} \tag{3}$$
where $A_i$ and $A_j$ are the activities of the ith and the jth molecules, and $\mathrm{sim}(i,j)$ is the similarity coefficient between the two molecules. Whenever two compounds have a similarity value of 1, the SALI value is infinite and R is unable to produce the plot. As a workaround, we replace every 0 obtained for the difference 1 − sim(i, j) with 0.01.

Most Active Compound in the Pair. Activity Landscape Plotter includes the option Max.Activity, which outputs the activity of the most active compound in the pair.13

Selectivity. The selectivity of each compound between two targets was calculated as

$$|\Delta \mathrm{selectivity}_i| = |\mathrm{activity}_i(T_1) - \mathrm{activity}_i(T_2)| \tag{4}$$

where $\mathrm{activity}_i(T_1)$ and $\mathrm{activity}_i(T_2)$ are the activities of the ith compound for targets 1 and 2, respectively. Since each data point on the DAD maps represents the activity differences for two targets, the selectivity for two compounds is calculated as

$$\Delta \mathrm{selectivity}_{i,j} = \mathrm{selectivity}_i - \mathrm{selectivity}_j \tag{5}$$

where $\mathrm{selectivity}_i$ and $\mathrm{selectivity}_j$ are the selectivities of the ith and jth compounds (eq 4), respectively.

SAS Maps. On the SAS maps generated with Activity Landscape Plotter, the structural similarity is plotted on the X-axis and the activity difference on the Y-axis. The user can choose among the activities provided in the input file and the three fingerprints described. Figure 1 shows a SAS map produced with Activity Landscape Plotter and the menu on the SAS map panel.

Coloring Data Points. In the current implementation, the SAS map can be colored by three different approaches (Table 1). The options SALI, Max.Activity, and density can be found on the panel of Activity Landscape Plotter called SAS map, in the option “Select the color for the data points”. For SALI and Max.Activity, data points are colored on a continuous color scale from high values (red) through intermediate values (yellow-to-orange) to low values (green). The third option, density, depicts in red the areas of the plot with the highest density of points, while the areas with low density are colored in light pink-to-gray.

DAD Maps. For the DAD maps generated with Activity Landscape Plotter, activity difference 1 is plotted on the X-axis and activity difference 2 is plotted on the Y-axis. The user can choose two of the activities provided in the input file.

Coloring Data Points. The user has three options to color the data points on the DAD maps: similarity, selectivity, and density. Data points representing pairs of compounds with high similarity or selectivity are colored red, those with intermediate similarity or selectivity yellow-to-orange, and those with low similarity or selectivity green. Data points color-coded red by selectivity indicate that there is a considerable activity difference for one of the compounds toward one of the targets. The third option, density, produces a plot colored by the number or density of data points in the plot. The regions with more data points are red, while the regions with fewer data points are light pink-to-gray.

Figure 4. Example of defining the thresholds and of the information contained in the comma-delimited file. (A) SAS map produced with Activity Landscape Plotter. Data points are colored by the most active compound in the pair, and the thresholds are set at 0.5 for the X-axis and 2.0 for the Y-axis. The activity cliffs area is selected in blue. (B) The “You selected” area shows all the pairs selected on the plot. (C) Example of a comma-delimited file exported using the “Activity Cliffs” button.

Setting up Thresholds in SAS and DAD Maps. For SAS maps, Activity Landscape Plotter allows the user to enter threshold values that distinguish high from low structural similarity under “Choose the X axis threshold” and high from low activity difference under “Choose the Y axis threshold” (Figure 1). The two thresholds give rise to four quadrants in the SAS maps (Figure 2A): smooth SAR, activity cliffs, similarity cliffs, and nondescriptive. These quadrants and the information obtained from each of them are reviewed elsewhere.16 For DAD maps, the user can choose the thresholds for the X-axis (activity difference 1) and Y-axis (activity difference 2). These thresholds generate nine quadrants on a DAD map (Figure 2B), which determine boundaries for low or high activity differences between the two targets. Each quadrant offers different information regarding structural changes on the pairs of compounds and the activity toward each target; a detailed explanation has been reviewed elsewhere.16

Exporting Images and Raw Calculation Data from SAS and DAD Maps. All the information used to produce the SAS and DAD maps can be exported using the “Download SASmap data” and “Download DADmap data” buttons. These comma-delimited files contain the pairwise information calculated for the entire data set (Table 1). Of note, a user who wishes to compare the structures of the compounds after obtaining the activity landscape can do so using the SMILES for the pairs of compounds in the downloadable file. In addition, once the user has selected the thresholds that divide the SAS and DAD maps, each section can be downloaded individually using the buttons under the “You selected” area; the label on each button matches the quadrant name shown in Figure 2. The user can export any plot as a 900 × 800 pixel TIFF by clicking the button named “Download image” under the plot and the “You selected” section.

Browsing SAS and DAD Maps. The user can interact with the plot by selecting a specific data point. The pairwise information corresponding to that point is displayed in the “You selected” area. Selecting a region displays the information for all the compound pairs in it.
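The pairwise quantities behind a SAS map, as described in the sections above, can be sketched in a few lines of Python. This is a minimal illustration and not the plotter's R code. Fingerprints are assumed to be precomputed and passed as sets of on-bit positions; the function names (`tanimoto`, `sali`, `quadrant`, `sas_pairs`, `to_csv`) and the CSV column names are hypothetical; the assignment of quadrant names follows the usual SAS-map convention, which the text defers to ref 16; and treating a value equal to a threshold as "high" is an assumption.

```python
import csv
import io
from itertools import combinations


def tanimoto(bits_a, bits_b):
    """Tanimoto index of two fingerprints given as sets of on-bit positions."""
    union = len(bits_a | bits_b)
    return len(bits_a & bits_b) / union if union else 0.0


def sali(act_i, act_j, sim, floor=0.01):
    """SALI = |A_i - A_j| / (1 - sim); a zero denominator becomes `floor`."""
    denom = 1.0 - sim
    if denom == 0.0:
        denom = floor  # workaround described in the text for sim(i, j) = 1
    return abs(act_i - act_j) / denom


def quadrant(sim, act_diff, x_thr, y_thr):
    """Name the SAS-map region of one pair (naming convention assumed)."""
    high_sim = sim >= x_thr
    high_diff = act_diff >= y_thr
    if high_sim:
        return "activity cliffs" if high_diff else "smooth SAR"
    return "nondescriptive" if high_diff else "similarity cliffs"


def sas_pairs(compounds, x_thr=0.5, y_thr=2.0):
    """Build one row per compound pair.

    `compounds` is a list of (id, fingerprint_bits, activity) tuples, with
    activity on a -log scale such as pIC50.
    """
    rows = []
    for (id_i, fp_i, a_i), (id_j, fp_j, a_j) in combinations(compounds, 2):
        sim = tanimoto(fp_i, fp_j)
        diff = abs(a_i - a_j)
        rows.append({
            "id_1": id_i,
            "id_2": id_j,
            "similarity": sim,
            "activity_difference": diff,
            "max_activity": max(a_i, a_j),
            "sali": sali(a_i, a_j, sim),
            "quadrant": quadrant(sim, diff, x_thr, y_thr),
        })
    return rows


def to_csv(rows):
    """Serialize the pair rows as comma-delimited text."""
    buffer = io.StringIO()
    fieldnames = list(rows[0].keys()) if rows else []
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Toy data: bit sets and activities are invented for illustration only.
toy = [
    ("c1", {1, 2, 3, 4}, 7.0),
    ("c2", {1, 2, 3, 4}, 5.0),
    ("c3", {1, 2, 9}, 6.5),
]
pairs = sas_pairs(toy)
```

With the thresholds of 0.5 and 2.0 used in Figure 4, the pair c1/c2 has identical bit sets, so its SALI uses the 0.01 denominator and the pair falls in the activity cliffs region. The two pairs involving c3 have a similarity of 0.4 and fall below the X-axis threshold.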
■
RESULTS AND DISCUSSION
To illustrate the application of the plotter, we used two data sets of epigenetic relevance. The files associated with these examples are provided as Supporting Information and are freely available at the application web site.

SAS Maps. The functions and a SAS map generated with Activity Landscape Plotter are illustrated with a data set of 140 pyrimidine hydroxyl amide compounds and their activity for histone deacetylase isoforms 1, 2, 3, and 6 (HDAC1, HDAC2, HDAC3, HDAC6).17 In this example, the online tool performs 9730 pairwise comparisons that are presented as SAS maps. Figure 3B shows a SAS map for HDAC1 generated with this tool and color-coded by SALI. This plot is compared to a graph produced using in-house Python scripts and the MayaChemTools18 fingerprint ECFP4 (Figure 3A). Using in-house scripts17 or Activity Landscape Plotter (Figure 3B) leads to the same activity landscape.

Example of Browsing SAS Maps and Exporting SALI Values. Figure 4 illustrates the visualization on the plotter and the function to export the raw data of the data points selected on the activity cliff quadrant using the download button “Activity Cliffs”. After the thresholds are set, the comma-delimited files under the SAS map change to show only the pairs of compounds present in each area of the SAS map. To exemplify this function, in Figure 4 we defined the thresholds on the plot as 0.5 for the X-axis and 2.0 for the Y-axis; we then selected the entire activity cliffs area (Figure 4A). The compounds selected in the activity cliffs area are the same in the comma-delimited file exported using the “Activity Cliffs” button (Figure 4C) and in the “You selected” area.

DAD Maps. The functions and a DAD map generated with Activity Landscape Plotter are illustrated with a data set of 88 bromodomain inhibitors (BRDis)19 and their activity obtained from ChEMBL20 against isoforms 2, 3, and 4.21 In this example, Activity Landscape Plotter performs and outputs 2775 pairwise comparisons, which are presented on the DAD map. Figure 5 compares the DAD maps produced with the activities for BRD2 and BRD3 using this tool and in-house Python scripts. Figure 5A depicts a DAD map generated with Python, which is equivalent to the map generated using the online plotter in Figure 5B.

Figure 5. Examples of DAD maps. The input file contains screening data for 88 bromodomain inhibitors (BRDis): 3486 data points. Both plots were generated using the activity differences for BRD2 (Activity 1 in the input file) and BRD3 (Activity 2 in the input file). (A) DAD map generated with in-house Python scripts. (B) DAD map generated with Activity Landscape Plotter; data points are color-coded by similarity computed with MACCS keys.
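As a companion sketch for the DAD map, the signed activity differences of eq 2 can be computed for two targets over every compound pair, which also makes the number of data points explicit: n compounds give n(n − 1)/2 pairs. The code below is an illustration in Python rather than the plotter's implementation. The names `n_pairs` and `dad_points`, the tuple layout, and the toy activities are hypothetical, and the per-compound selectivity takes the absolute value that the left-hand side of eq 4 suggests, which is an assumption.

```python
from itertools import combinations


def n_pairs(n):
    """Number of unordered compound pairs, n(n - 1)/2."""
    return n * (n - 1) // 2


def dad_points(compounds):
    """Return one DAD-map point per compound pair (i before j in the input).

    `compounds` is a list of (id, activity_T1, activity_T2) tuples with
    activities on a -log scale such as pIC50.
    """
    points = []
    for (id_i, a1_i, a2_i), (id_j, a1_j, a2_j) in combinations(compounds, 2):
        # Per-compound selectivity between the two targets; the absolute
        # value is an assumption based on how eq 4 is written.
        sel_i = abs(a1_i - a2_i)
        sel_j = abs(a1_j - a2_j)
        points.append({
            "id_1": id_i,
            "id_2": id_j,
            "activity_difference_1": a1_i - a1_j,  # X-axis, signed (eq 2)
            "activity_difference_2": a2_i - a2_j,  # Y-axis, signed (eq 2)
            "selectivity_difference": sel_i - sel_j,  # eq 5
        })
    return points


# Toy data: activities are invented for illustration only.
toy = [("c1", 7.0, 5.0), ("c2", 6.0, 6.5), ("c3", 5.5, 5.5)]
points = dad_points(toy)
```

Here `n_pairs(140)` gives 9730, the number of comparisons reported for the HDAC data set, and the 400-compound input limit caps a map at `n_pairs(400)` = 79800 points. By the same formula, 2775 and 3486 pairs correspond to 75 and 84 compounds; the text does not state how these counts relate to the 88 compounds in the BRD data set, so no filtering rule is assumed here.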
■
CONCLUSIONS AND FUTURE DEVELOPMENTS
Activity Landscape Plotter is the first web-based tool to conduct activity landscape studies. The user can generate plots using different molecular fingerprints. These plots, along with their data, can be exported to perform other analyses. The tool lets the user set thresholds that divide the SAS and DAD maps into regions for the analysis of different types of landscapes. In addition, the user can interact with different plots and different data sets. We expect the community to use Activity Landscape Plotter to analyze structure−activity relationships for their own data sets. In future versions, several utilities will be implemented, such as other similarity coefficients, the combination of similarity values using different data fusion options, other types of molecular descriptors, and different options to color each plot. We invite other research groups to include their methods in this plotter to automate activity landscape analyses. In addition, the source code of the current version of Activity Landscape Plotter is available upon request.
■
ASSOCIATED CONTENT
Supporting Information
The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jcim.6b00776.
Examples of input and output data files that illustrate the features of Activity Landscape Plotter (ZIP)
■
AUTHOR INFORMATION
Corresponding Authors
*Phone: +5255-5622-3899. Ext. 44458. E-mail: mgm_14392@hotmail.com (M.G.-M.).
*E-mail: [email protected] (J.L.M.-F.).
ORCID
Mariana González-Medina: 0000-0001-7365-939X
José L. Medina-Franco: 0000-0003-4940-1107
Funding
This work was supported by the Universidad Nacional Autónoma de México (UNAM) [grant PAPIME PE200116] and the Programa de Apoyo a la Investigación y el Posgrado (PAIP) [grant 5000−9163], Facultad de Química, UNAM.
Notes
The authors declare no competing financial interest.
■
ACKNOWLEDGMENTS
We thank Fernanda Saldívar-González for providing the data set of HDAC inhibitors and Fernando Prieto-Martínez for providing the data set of BRD inhibitors. We also thank Mario Omar García-Sánchez for providing the images produced with Python.
■
DEDICATION
This paper is dedicated to Nicolás Medina Sandoval on the occasion of his 85th birthday.
■
REFERENCES
(1) Medina-Franco, J. L. Activity Cliffs: Facts or Artifacts? Chem. Biol. Drug Des. 2013, 81 (5), 553−6.
(2) Bajorath, J. Modeling of Activity Landscapes for Drug Discovery. Expert Opin. Drug Discovery 2012, 7 (6), 463−473.
(3) Cruz-Monteagudo, M.; Medina-Franco, J. L.; Pérez-Castillo, Y.; Nicolotti, O.; Cordeiro, M. N. D. S.; Borges, F. Activity Cliffs in Drug Discovery: Dr Jekyll or Mr Hyde? Drug Discovery Today 2014, 19 (8), 1069−1080.
(4) Peltason, L.; Bajorath, J. SAR Index: Quantifying the Nature of Structure−Activity Relationships. J. Med. Chem. 2007, 50 (23), 5571−5578.
(5) Wawer, M.; Peltason, L.; Weskamp, N.; Teckentrup, A.; Bajorath, J. Structure−Activity Relationship Anatomy by Network-like Similarity Graphs and Local Structure−Activity Relationship Indices. J. Med. Chem. 2008, 51 (19), 6075−6084.
(6) Sander, T.; Freyss, J.; von Korff, M.; Rufener, C. DataWarrior: An Open-Source Program for Chemistry Aware Data Visualization And Analysis. J. Chem. Inf. Model. 2015, 55 (2), 460−473.
(7) Guha, R.; Van Drie, J. H. Structure-Activity Landscape Index: Identifying and Quantifying Activity Cliffs. J. Chem. Inf. Model. 2008, 48 (3), 646−658.
(8) Canvas. Schrödinger. https://www.schrodinger.com/canvas (accessed Feb 1, 2017).
(9) Maggiora, G. M.; Shanmugasundaram, V. Molecular Similarity Measures. Methods Mol. Biol. 2011, 672, 39−100.
(10) Perez-Villanueva, J.; Santos, R.; Hernandez-Campos, A.; Giulianotti, M. A.; Castillo, R.; Medina-Franco, J. L. Structure-activity Relationships of Benzimidazole Derivatives as Antiparasitic Agents: Dual Activity-Difference (DAD) Maps. MedChemComm 2011, 2 (1), 44−49.
(11) Shinyapps.io by RStudio. http://www.shinyapps.io (accessed Feb 1, 2017).
(12) Guha, R. Chemical Informatics Functionality in R. J. Stat. Soft. 2007, 18, 1−16.
(13) Perez-Villanueva, J.; Santos, R.; Hernandez-Campos, A.; Giulianotti, M. A.; Castillo, R.; Medina-Franco, J. L. Towards a Systematic Characterization of the Antiprotozoal Activity Landscape of Benzimidazole Derivatives. Bioorg. Med. Chem. 2010, 18 (21), 7380−7391.
(14) Naveja, J. J.; Medina-Franco, J. L. Activity Landscape of DNA Methyltransferase Inhibitors Bridges Chemoinformatics with Epigenetic Drug Discovery. Expert Opin. Drug Discovery 2015, 10 (10), 1059−1070.
(15) Bajusz, D.; Rácz, A.; Héberger, K. Why is Tanimoto Index an Appropriate Choice for Fingerprint-Based Similarity Calculations? J. Cheminf. 2015, 7, 20.
(16) Medina-Franco, J. L. Scanning Structure-Activity Relationships With Structure-Activity Similarity And Related Maps: From Consensus Activity Cliffs to Selectivity Switches. J. Chem. Inf. Model. 2012, 52 (10), 2485−2493.
(17) Saldívar-González, F. I.; Naveja, J. J.; Palomino-Hernández, O.; Medina-Franco, J. L. Getting SMARt in Drug Discovery: Chemoinformatics Approaches for Mining Structure-Multiple Activity Relationships. RSC Adv. 2017, 7 (2), 632−641.
(18) Sud, M. MayaChemTools: An Open Source Package for Computational Drug Discovery. J. Chem. Inf. Model. 2016, 56 (12), 2292−2297.
(19) García-Sánchez, M. O.; Cruz-Monteagudo, M.; Medina-Franco, J. L. Quantitative Structure-Epigenetic Activity Relationships. In Advances in QSAR modeling with Applications in Pharmaceutical, Chemical, Food, Agricultural and Environmental Sciences; Roy, K., Ed.; 2017; in press.
(20) Papadatos, G.; Overington, J. P. The ChEMBL Database: A Taster for Medicinal Chemists. Future Med. Chem. 2014, 6 (4), 361−364.
(21) Prieto-Martinez, F. D.; Gortari, E. F.-d.; Mendez-Lucio, O.; Medina-Franco, J. L. A Chemical Space Odyssey of Inhibitors of Histone Deacetylases and Bromodomains. RSC Adv. 2016, 6 (61), 56225−56239.