Collaborative visual analysis with RCloud. North, S., Scheidegger, C., Urbanek, S., & Woodhull, G. In 2015 IEEE Conference on Visual Analytics Science and Technology (VAST), pages 25–32, October 2015.
Consider the emerging role of data science teams embedded in larger organizations. Individual analysts work on loosely related problems, and must share their findings with each other and the organization at large, moving results from exploratory data analyses (EDA) into automated visualizations, diagnostics and reports deployed for wider consumption. There are two problems with the current practice. First, there are gaps in this workflow: EDA is performed with one set of tools, and automated reports and deployments with another. Second, these environments often assume a single-developer perspective, while data scientist teams could get much benefit from easier sharing of scripts and data feeds, experiments, annotations, and automated recommendations, which are well beyond what traditional version control systems provide. We contribute and justify the following three requirements for systems built to support current data science teams and users: discoverability, technology transfer, and coexistence. In addition, we contribute the design and implementation of RCloud, a system that supports the requirements of collaborative data analysis, visualization and web deployment. About 100 people used RCloud for two years. We report on interviews with some of these users, and discuss design decisions, tradeoffs and limitations in comparison to other approaches.
@inproceedings{north_collaborative_2015,
	title = {Collaborative visual analysis with {RCloud}},
	doi = {10.1109/VAST.2015.7347627},
	abstract = {Consider the emerging role of data science teams embedded in larger organizations. Individual analysts work on loosely related problems, and must share their findings with each other and the organization at large, moving results from exploratory data analyses (EDA) into automated visualizations, diagnostics and reports deployed for wider consumption. There are two problems with the current practice. First, there are gaps in this workflow: EDA is performed with one set of tools, and automated reports and deployments with another. Second, these environments often assume a single-developer perspective, while data scientist teams could get much benefit from easier sharing of scripts and data feeds, experiments, annotations, and automated recommendations, which are well beyond what traditional version control systems provide. We contribute and justify the following three requirements for systems built to support current data science teams and users: discoverability, technology transfer, and coexistence. In addition, we contribute the design and implementation of RCloud, a system that supports the requirements of collaborative data analysis, visualization and web deployment. About 100 people used RCloud for two years. We report on interviews with some of these users, and discuss design decisions, tradeoffs and limitations in comparison to other approaches.},
	booktitle = {2015 {IEEE} {Conference} on {Visual} {Analytics} {Science} and {Technology} ({VAST})},
	author = {North, Stephen and Scheidegger, Carlos and Urbanek, Simon and Woodhull, Gordon},
	month = oct,
	year = {2015},
	keywords = {Type of Work: Tool/Software, HOW: ???, Maybe related. RCloud allows distributed (collaborative) analysis in R, WHY: collaborative analysis (using R)},
	pages = {25--32},
}
