
ICPSR Virtual Data Enclave Opens Doors to Accept More Restricted-Use Data

Mark Thompson-Kolar
ICPSR Senior Editor

(May 8, 2014) The Virtual Data Enclave (VDE) at ICPSR is now available to accept a variety of restricted-use data from depositors.

The VDE provides researchers access to quantitative and qualitative restricted-use data in a secure environment. It is a virtual machine launched from the researcher’s desktop but operating on a remote server at ICPSR, much like logging in remotely to another physical computer. The virtual machine is isolated from the user’s physical desktop computer, so files cannot be downloaded to it. Users also are prevented from emailing, copying, or otherwise moving files outside of the secure environment.

Data remain on ICPSR file servers and are accessed and analyzed by researchers virtually. The results of analyses are reviewed by ICPSR for disclosure risk before they are transferred to researchers.

Providing data to researchers

"The VDE offers new ways of providing important data to the research community while respecting the confidentiality of subjects," said ICPSR Director George Alter. "We are reminded every day that computer security is a complex problem, and the VDE is an important new tool for reconciling data access with data protection."

Two thematic archives at ICPSR have experience using the virtual environment to provide access to data. They are the Bill & Melinda Gates Foundation-sponsored Measures of Effective Teaching Longitudinal Database (MET LDB) and the Substance Abuse and Mental Health Data Archive (SAMHDA), funded by the Substance Abuse & Mental Health Services Administration. (SAMHDA’s virtual enclave is named the Data Portal.)

About 100 data files from 10 studies in the two archives are currently housed within the virtual enclave.

Other archives at ICPSR are exploring or implementing VDE access, said Asmat Noori, assistant director of ICPSR’s Computer and Network Services department.

The VDE is a standard Windows desktop environment with Microsoft Office and a broad range of widely used statistical packages and GIS software.

Benefits for depositors and funders

For the depositor or funder, the VDE offers several benefits. It:

"We’ve found that depositors are broadening their field of vision in terms of what data they will consider depositing," said Johanna Bleckman, a manager of the MET LDB project. "Data that were formerly assumed to be too risky are being seriously considered for access and secondary analysis via ICPSR, which is a big win for the field. We’ve also seen a fair amount of interest in exploring VDE access to data with contextual variables or geocodes."

The number of VDE users continues to grow. About 60 project or research groups, using about 300 user accounts, have access to files in the VDE.

Benefits for researchers

For researchers, the VDE:

Part of geospatial data project

Additionally, the VDE is a key element in a two-year, NSF-funded research project, "Research on Unique Confidentiality Risks & Geospatial Data Sharing within a Virtual Archive." The project explores the unique confidentiality characteristics of geospatial data and tests various methods of masking such data within the VDE. Douglas Richardson, executive director of the Association of American Geographers, is Principal Investigator. Alter is co-PI.

"The virtual data environment allows the sharing of confidential geospatial research data among researchers, and it also allows some of that data to be masked and removed from the VDE for publication, distribution, and so forth, once it has been transformed," Richardson said.

Bleckman said researchers have helped ICPSR enhance the enclave over the past two years. "They have provided feedback on the user experience, and we have refined the tool and the experience in response."

Technology utilized in public-access service

The technologies of the VDE also are being employed in ICPSR’s new public-access data sharing service, openICPSR, for handling restricted-use datasets. "A virtual environment is an expensive and complicated thing to build, and we’ve got experience using it," Detterman said. "So it’s a great thing that openICPSR can utilize the existing virtual environment infrastructure and our knowledge about using it."

"ICPSR loves data and wants to see people use data," said Marcotte. "When data are restricted-use, the virtual environment provides an additional avenue for making them available to researchers."

Data providers interested in depositing data for use in the virtual enclave should contact Amy Pienta, ICPSR director of Acquisitions (apienta@umich.edu).
