Text and Data Mining Guide: API

Step-by-step guide on how to get started with your text mining project along with examples of past text mining projects from UW researchers and students.

How to select an API?

  1. Identify all potential APIs
  2. Identify the amount of data you need to collect and identify the tier of access you’d need for each API in your list
  3. Identify the amount of money you’d need to invest to use the API (if you need a large amount of data)
  4. Pick the API that best matches your budget and data needs
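
The four steps above can be sketched as a small filter-and-choose routine. This is only an illustration under stated assumptions: the `Tier` record, the `pick_api` helper, and every name and number in `CANDIDATES` are hypothetical, and "best match" is interpreted here as "the cheapest tier that still covers the amount of data needed". A real comparison would use the tiers and prices published by each API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    api: str          # API name (step 1: the list of potential APIs)
    tier: str         # access tier offered by that API
    max_records: int  # how many records this tier lets you collect
    cost_usd: float   # what this tier would cost for the project


def pick_api(tiers: list[Tier], records_needed: int, budget_usd: float) -> Optional[Tier]:
    """Return the cheapest tier that covers the data need within budget, or None."""
    # Steps 2 and 3: keep only tiers that are large enough and affordable.
    feasible = [
        t for t in tiers
        if t.max_records >= records_needed and t.cost_usd <= budget_usd
    ]
    # Step 4: prefer the lowest cost; break ties by the larger allowance.
    return min(feasible, key=lambda t: (t.cost_usd, -t.max_records), default=None)


# Hypothetical candidates; the names and numbers are made up for illustration.
CANDIDATES = [
    Tier("ExampleAPI-A", "free", 10_000, 0.0),
    Tier("ExampleAPI-A", "paid", 1_000_000, 500.0),
    Tier("ExampleAPI-B", "free", 50_000, 0.0),
]

if __name__ == "__main__":
    print(pick_api(CANDIDATES, records_needed=40_000, budget_usd=100.0))
```

If no tier is both large enough and affordable, `pick_api` returns `None`, which signals that either the budget or the data requirement has to be revisited.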

List of APIs

Twitter has now made its API available to researchers. See the other guide for more information.

| API Name | Description | Free / Paid | Limitations | Help Contact |
|---|---|---|---|---|
| arXiv | Provides access to metadata and article abstracts for the e-prints hosted on arXiv | Free; no key required | None | arXiv help |
| SAO/NASA Astrophysics Data System (ADS) | Provides access to the ADS database of bibliographic data on astronomy and physics publications | Free; key required | Rate limits apply | |
| BioMed Central | Provides access both to metadata and full-text content for the 260,000 open access journals published on BioMed Central | Free; key required | None | |
| Chronicling America | Provides access to information about historic newspapers and select digitized newspaper pages | Free; no key required | None | |
| CrossRef | Provides access to metadata records with CrossRef DOIs, covering about 75 million scholarly works from around 5,000 publishers | Free; no key required | None | |
| Digital Public Library of America | Provides metadata on items and collections indexed by the DPLA. Also includes partner data from Harvard, New York Public Library, ARTstor, and others | Free; key required | None | Troubleshooting & FAQ |
| HathiTrust (Bibliographic API) | Provides bibliographic and rights information for items in the HathiTrust Digital Library. Please note that this API is not intended for bulk retrieval of records | Free; no key required | None; permission must be sought for bulk retrieval | |
| HathiTrust (Data API) | Provides access to HathiTrust and Google digitized texts of public domain works. Volumes digitized by Google will require agreement with Google | Free; key required | No specific limits; however, please see their policies on data use | |
| JSTOR Data for Research | Not a true API, but provides access to content on JSTOR for research and teaching | Free; requires MyJSTOR account registration | Max 25,000 documents per dataset; users can get larger datasets by special request | Data for Research help |
| Library of Congress | Multiple APIs available to download bibliographic data and search Library of Congress digital collections, including images, public radio and television, and historic newspapers | Free; most APIs do not require key | Varies | |
| Nature | Provides access to the metadata and more than 460,000 open access full-text documents from Springer Nature | Free; varied access requirement | No specific limits; however, downloads should be limited to “reasonable rates” | Springer Nature TDM Policy |
| National Library of Medicine | NLM offers 29 separate APIs for accessing a wide variety of content from various NLM databases | Free; varied access requirement | Varies | Varies |
| National Center for Biotechnology Information | Several public APIs to access many databases and tools including PubMed, PMC, Gene, Nuccore and Protein | Free; most APIs do not require key | Varies | NCBI Help Manual |
| OECD | Provides access to a selection of top used OECD datasets | Free; no key required | Max 1,000,000 results per query; max URL length of 1,000 characters | OECD.Stats help |
| Open Academic Graph | Downloadable datasets for citations drawn from two large academic graphs: Microsoft Academic Graph (MAG) and AMiner (not an API) | Free; no key required | None | |
| ORCID | Queries and searches the ORCID researcher identifier system and obtains researcher profile data | Free; ORCID iD account required | Two options: 1) use the free Public API, which only returns data marked as “public”; 2) become an ORCID member to receive API credentials | ORCID API FAQ |
| Oxford English Dictionary (OED) | Oxford University Press grants research access to the Corpus for academic projects that can demonstrate a strong practical need for this data | Free; key required. Academic researchers can request free access | 3,000 requests per month and 60 calls per minute with the free option; other options available | Oxford Dictionaries Contact; API Forum |
| PLOS Article-Level Metrics | Retrieves article-level metrics (including usage statistics, citation counts, and social networking activity) for articles published in PLOS journals and articles added to PLOS Hubs: Biodiversity | Free; key required | Results limited to batches of 50 at a time | Contact api@plos.org for high-volume use requests; questions can also be posted in the PLOS API Google Group |
| PLOS Search | Allows PLOS content to be queried for integration into web, desktop, or mobile applications | Free; key required | Max 7,200 requests a day, 300 per hour, 10 per minute; users should wait 5 seconds for each query to return results; requests should not return more than 100 rows; high-volume users should contact PLOS; API users are limited to no more than five concurrent connections from a single IP address | Questions can also be posted in the PLOS API Google Group |
| World Bank | Provides access to World Bank statistical databases, indicators, projects, and loans, credits, financial statements and other data related to financial operations | Free; no key required | Request volume limits are unspecified, but should be “reasonable” | |
| IEEE Xplore | Provides metadata and DOIs for IEEE Xplore articles | Cost negotiated per request | Key required; must subscribe to or be a member of an institution that subscribes to IEEE Xplore (UW subscribes) | |
| STAT!Ref OpenSearch | Bibliographic search service for displaying STAT!Ref results on a website | Free (with subscription) | Free to register for users at a subscribing institution | |
| Twitter | The Twitter API provides the tools you need to contribute to, engage with, and analyze the conversation happening on Twitter | Academic Research product track provides free access to the full history of public conversation | Limitations might apply if using another tier of access. The Academic Research product allows collection with a higher monthly Tweet volume cap of 10 million | Twitter help topics |
| Wiley | Allows text- and data-mining access to content in the Wiley Online Library | Free (with subscription) | Must be part of a subscribing institution to have full-text access. Users will encounter a click-through agreement and will receive a Client API Token, which is needed when requesting full text of articles | Wiley TDM help page |
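
Several entries above state request limits, and a mining script has to stay under them on the client side. As a rough sketch, the throttle below enforces the three PLOS Search rates quoted in the table (10 requests per minute, 300 per hour, 7,200 per day) with a sliding window per limit. The class name `MultiWindowLimiter` and the constant `PLOS_SEARCH_LIMITS` are hypothetical names chosen for illustration and are not part of any PLOS library. The sketch does not cover the other PLOS guidance (the 5-second wait, the 100-row cap, or the five-connection limit).

```python
import time
from collections import deque

# (max_requests, window_seconds) pairs taken from the PLOS Search row above.
PLOS_SEARCH_LIMITS = [(10, 60), (300, 3600), (7200, 86400)]


class MultiWindowLimiter:
    """Client-side throttle enforcing several 'N requests per window' limits."""

    def __init__(self, limits, clock=time.monotonic, sleep=time.sleep):
        # One deque of request timestamps per limit.
        self._limits = [(n, w, deque()) for n, w in limits]
        self._clock = clock
        self._sleep = sleep

    def wait_time(self):
        """Seconds to wait before the next request is allowed (0.0 if none)."""
        now = self._clock()
        wait = 0.0
        for max_requests, window, stamps in self._limits:
            # Forget requests that have left this window.
            while stamps and now - stamps[0] >= window:
                stamps.popleft()
            if len(stamps) >= max_requests:
                # The oldest request must age out before another is sent.
                wait = max(wait, stamps[0] + window - now)
        return wait

    def acquire(self):
        """Block until a request is allowed, then record it."""
        delay = self.wait_time()
        while delay > 0:
            self._sleep(delay)
            delay = self.wait_time()
        now = self._clock()
        for _, _, stamps in self._limits:
            stamps.append(now)
```

A script would create one `MultiWindowLimiter(PLOS_SEARCH_LIMITS)` and call `acquire()` immediately before each HTTP request. The same class can be reused for other APIs by passing their own limits, for example `[(60, 60)]` for a 60-calls-per-minute cap.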