On the Validity versus Utility of Activity Landscapes: Are All Activity Cliffs Statistically Significant? Guha, R. & Medina-Franco, J. L. J. Cheminf., 2014.
@article{Guha:2013fk,
	Abstract = {Most work on the topic of activity landscapes has focused on their quantitative description and visual representation, with the aim of aiding navigation of SAR. Recent developments have addressed applications such as quantifying the proportion of activity cliffs, investigating the predictive abilities of activity landscape methods and so on. However, all these publications have worked under the assumption that the activity landscape models are "real" (i.e., statistically significant).
RESULTS:
The current study addresses for the first time, in a quantitative manner, the significance of a landscape or individual cliffs in the landscape. In particular, we question whether the activity landscape derived from observed (experimental) activity data is different from a randomly generated landscape. To address this we used the SALI measure with six different data sets tested against one or more molecular targets. We also assessed the significance of the landscapes for single and multiple representations.
CONCLUSIONS:
We find that non-random landscapes are data set and molecular representation dependent. For the data sets and representations used in this work, our results suggest that not all representations lead to non-random landscapes. This indicates that not all molecular representations should be used to a) interpret the SAR and b) combined to generate consensus models. Our results suggest that significance testing of activity landscape models and in particular, activity cliffs, is key, prior to the use of such models},
	Author = {Guha, R. and Medina-Franco, J. L.},
	Date-Added = {2013-07-18 13:03:55 -0400},
	Date-Modified = {2014-08-13 12:35:57 +0000},
	Journal = {J.~Cheminf.},
	Number = {11},
	Title = {On the Validity versus Utility of Activity Landscapes: Are All Activity Cliffs Statistically Significant?},
	Volume = {6},
	Year = {2014},
	Bdsk-Url-1 = {http://dx.doi.org/10.1186/1758-2946-6-11}}
