May 2006 Newsletter
 
Cyberenvironments for re-engineering the research process

Jim Myers, Associate Director, NCSA Cyberenvironments and Technologies Directorate

One could argue that scientists have been slowly losing the capabilities that enabled the scientific revolution. Hundreds of years ago, the top minds in any given culture had access to the overwhelming majority of their culture's knowledge. There wasn't that much data to go around, and thinkers tended to congregate at the intellectual epicenters. They all used a common format -- text and drawings -- even if that meant speaking several languages. And they all used a very limited set of tools. Mostly, they had their brains, pencils, and benefactors.

That systemic view is something we've lost, locking up knowledge in a variety of tools and databases and individual disciplines.

Cyberenvironments -- integrated, end-to-end software systems that will make cyberinfrastructure as accessible and usable as Web browsers made the Internet -- mitigate that loss. They'll give researchers the ability to make connections across the whole of human knowledge and to have the global perspective that enabled their forebears' revolutionary advances.

Cyberenvironments support the re-engineering of science and engineering research processes. Part of their role will be to provide an easy-to-use interface to local and shared instruments, sensor arrays, data stores and data sets, computational systems, networks, scientific and engineering applications, data analysis and visualization tools, and services, all within a secure framework. They go beyond simply providing access to cyber-resources by enhancing researchers' abilities to manage complex projects and automate processes within and across projects and disciplines as well as collaborate effectively with colleagues near and far.

Cyberenvironments can also be tailored to allow researchers and educators to interact with the cyberinfrastructure using concepts and approaches familiar to their specific discipline.

Cyberenvironments are frequently thought of as being limited to gateways to large-scale computational capabilities or community data stores or as collaboration spaces. They are also often assumed to be Web portal-based and disconnected from local tools. NCSA realizes that these definitions are not sufficient to capture the full potential of cyberinfrastructure and satisfy the future needs of scientists and engineers. In particular, future cyberenvironments must:

* Allow researchers to manage large-scale and complicated scientific projects and processes.
* Allow researchers to manage the diverse and large-scale experimental, computational, and data resources needed to address challenging problems and complex phenomena.
* Bridge local, institutional, and national cyberinfrastructure to create a seamless environment that assures the most efficient and effective resources and capabilities are brought to bear on the problem at hand.
* Assist in the bi-directional connection between raw or group research artifacts (data, notes, plans, etc.) and published artifacts (vetted data, annotations, best practices, reviews, and papers) to enhance the flow of information between basic research and application and between research and education.

In effect, they must enable the on-demand creation of new science and engineering sub-disciplines that allow community knowledge and shared resources to be quickly gathered around problems of interest.

To provide these capabilities, NCSA is developing high-level service abstractions, above those currently available in the cyberinfrastructure, and additional capabilities for automating or semi-automating processes. The concept of visual knowledge discovery -- using data analysis to categorize, cluster, and extract features from large data sets, coupled with interactive visualization -- is a prime example of the new capabilities needed to allow users to quickly digest data and build understanding. Similarly, capabilities to manage semantic information about data and resources will enable higher-level capabilities such as provenance tracking, annotation, and collaborative data curation. A growing number of projects are developing the kinds of rich capabilities that cyberenvironments will ultimately need. NCSA is at the forefront of applied research and development in these areas.

NCSA is also enhancing the sustainability, adaptability, and scalability of cyberenvironments. We're using current and emerging technologies such as Web and grid services, translating or integrating middleware (for example, NCSA's MyProxy), globally unique identifiers and metadata, workflow and provenance, and semantic descriptions of resources and data. These technologies reduce the burden of coupling cyberinfrastructure into cyberenvironments, making it easier to integrate components from many sources. Meanwhile, they maintain the environments' end-to-end nature.
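To make the idea of globally unique identifiers, metadata, and provenance a little more concrete, here is a minimal sketch in Python. It is purely illustrative and does not represent NCSA's software: the `Artifact` and `ProvenanceStore` names, the metadata keys, and the example data set names are all hypothetical, and identifiers are simply random UUIDs.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Artifact:
    """A data item with a globally unique identifier and free-form metadata."""
    name: str
    metadata: dict = field(default_factory=dict)
    derived_from: tuple = ()  # identifiers of the artifacts this one came from
    uid: str = field(default_factory=lambda: uuid.uuid4().hex)


class ProvenanceStore:
    """Hypothetical in-memory registry that records how artifacts were derived."""

    def __init__(self):
        self._artifacts = {}

    def register(self, name, derived_from=(), **metadata):
        # Store only the parents' identifiers, so links survive renaming.
        parent_ids = tuple(parent.uid for parent in derived_from)
        artifact = Artifact(name=name, metadata=metadata, derived_from=parent_ids)
        self._artifacts[artifact.uid] = artifact
        return artifact

    def lineage(self, artifact):
        """Return the names of all upstream artifacts, nearest first."""
        seen, names, queue = set(), [], list(artifact.derived_from)
        while queue:
            uid = queue.pop(0)
            if uid in seen:
                continue
            seen.add(uid)
            parent = self._artifacts[uid]
            names.append(parent.name)
            queue.extend(parent.derived_from)
        return names


# Illustrative usage with made-up artifact names and metadata.
store = ProvenanceStore()
raw = store.register("raw_sensor_readings", instrument="hypothetical-sensor-array")
cleaned = store.register("cleaned_readings", derived_from=[raw], step="outlier removal")
figure = store.register("summary_plot", derived_from=[cleaned], step="visualization")
print(store.lineage(figure))
```

Because each record points to its sources by identifier rather than by file name or location, a published figure can in principle be traced back to the raw data behind it, which is the kind of link between raw and published artifacts described above.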

Revamping current development methods and tools to support this approach will be crucial to producing persistent cyberenvironments that support and evolve with research over decades.