ProQuest TDM Studio is a cloud-based tool that lets UW researchers text and data mine (TDM) large sets of content from news, scholarly, and other publications that the University of Washington Libraries licenses from ProQuest. TDM Studio offers two levels of working with the data: Visualizations and Workbench.
Follow these steps to determine if TDM Studio will be helpful to your research:
To get started, register to use TDM Studio. You must register with ProQuest and create an individual password to access TDM Studio; access is not through UW NetID authentication. See the ProQuest Privacy Policy.
Feature | TDM Studio Visualizations | TDM Studio Workbench |
Corpus | 10 newspapers, covering primarily the 1990s to the present, based on current UW subscriptions: Chicago Tribune, Globe and Mail, The Guardian, Los Angeles Times, New York Times, South China Morning Post, Sydney Morning Herald, Times of India, Wall Street Journal, and Washington Post. Also includes ProQuest Dissertations and Theses. | Includes most scholarly journals, newspapers, industry and popular magazines, dissertations, theses, and other primary source texts, for all time periods and publication dates available through UW subscriptions to ProQuest databases. |
Coding Skills | No advanced coding skills are required to mine content and generate data visualizations. Uses a graphical user interface and provides pre-built visualizations that can be applied to the content being analyzed, including support for geographical analysis, topic modeling, and sentiment analysis. | Requires one or more team members with knowledge of the R or Python programming language for use within TDM Studio's Jupyter Notebook coding environment. |
Access | All UW students, faculty, and staff with current UW NetIDs. | All UW students, faculty, and staff with current UW NetIDs. Research teams (2-5 people) who need to collaborate on a project in the same workbench should email tdmstudio@clarivate.com to request that a research team workbench be created. At least one team member must be a current UW student, faculty member, or staff member with a current UW NetID. |
Period of Access for Research or Teaching | 24/7, for as long as you are a current UW student, faculty, or staff member with a valid UW NetID. | 24/7, for as long as you are a current UW student, faculty, or staff member with a valid UW NetID. |
Storage Limits | Each researcher can work on up to 10 projects simultaneously, each consisting of 10,000 documents or fewer. | Research team members can work on up to 10 dataset projects simultaneously, each consisting of up to 2 million documents. |
Data Export Limits | Citations and geographic locations from each specific dataset may be exported. Screen captures of visualizations may be taken, saved, and published. | A rolling seven-day maximum of 15 MB may be exported outside of the TDM Studio environment. The full-text corpus remains in TDM Studio: full-text datasets cannot be exported, only programs and secondary analysis. |
More Information |
UW Libraries is piloting this resource, and your feedback is vital to determining whether we continue to subscribe to it. Please share feedback by email: uwlib-tdm@uw.edu.