Written By: Utsav Ranjit
The nature of research has transformed over the past decade or so, and research today tends to be data-intensive. Koltay (2019) describes this data-driven approach as Research 2.0, in which research is increasingly based on large datasets and digital artifacts and involves open, networked systems. A major step in data-driven research is finding relevant and credible datasets to analyze. If you are looking for datasets for a class assignment, a research project, or just to brush up on your data analysis skills, the UNT Libraries have some useful sources that can help.
So, where can you find datasets? The UNT Libraries’ Finding Datasets guide lists many credible sources. Some are open sources that do not require a subscription, such as:
- U.S. Census data: A platform for accessing data and digital content from the U.S. Census Bureau.
- Texas Open Data Portal: A source for administrative data reported by various departments in the state of Texas.
- U.S. Government open data: The federal open government data site.
Other open sources are listed on the guide's public data sources page. You can also look for datasets in subscription-based sources, such as:
- IBIS World: A collection of U.S. and global industry market research and U.S. risk ratings.
- Cross-National Time-Series Data Archive: A longitudinal national data series that provides annual data on demographic, social, political, and economic topics for all countries.
- Social Explorer: An online research tool designed to provide quick and easy access to current and historical census data and demographic information.
Accessing subscription-based sources off campus requires you to authenticate with your EUID and password. The guide provides a brief description of almost every source it lists, so you get an overview of the kinds of datasets to expect from each one.
You can also find datasets using dataset search engines. These sites host a wide variety of datasets from many contributors, so check the quality and credibility of the data before you use it. Two popular search engines for datasets are:
- Kaggle Datasets: An open data-sharing platform. It is popular among data analysts because of its notebook feature, which lets users publish their own analyses of a dataset.
- Google Dataset Search: An aggregator that lets users find datasets stored across the web through a simple keyword search.
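If you are comfortable with a little code, the advice to check data quality before using a dataset can be sketched in a few lines of Python. This is only a minimal illustration, not a complete credibility check: the function name `summarize_csv` and the sample values are made up for this example, and the sketch assumes the dataset is a CSV file with a header row. It counts rows, empty cells per column, and exact duplicate rows, which are common first signs of a messy dataset.

```python
import csv
import io
from collections import Counter


def summarize_csv(text):
    """Return basic quality indicators for CSV text that has a header row.

    Hypothetical helper written for this post; it is not part of any library.
    """
    reader = csv.DictReader(io.StringIO(text))
    columns = reader.fieldnames or []
    missing = Counter()   # empty cells per column
    seen = set()          # rows already encountered
    duplicates = 0
    rows = 0
    for row in reader:
        rows += 1
        for col in columns:
            value = row.get(col)
            # Short rows yield None; blank cells yield an empty string.
            if value is None or value.strip() == "":
                missing[col] += 1
        key = tuple(row.get(col) for col in columns)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return {
        "rows": rows,
        "columns": list(columns),
        "missing": {col: missing[col] for col in columns},
        "duplicate_rows": duplicates,
    }


# Invented sample data, used only to show the function at work.
sample = "region,year,value\nA,2020,10\nB,2020,\nA,2020,10\n"
report = summarize_csv(sample)
print(report)
```

To try it on a downloaded file, read the file's contents into a string and pass that string to the function. Counts like these do not tell you whether a source is credible, but they can flag gaps and repeated records worth investigating before you start your analysis.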
Hopefully, these resources make your quest to find datasets more of a guided adventure than an endless exploration on Google. If you have any questions about searching for datasets or need help with your research, feel free to Ask Us.
Koltay, T. (2019). Accepted and emerging roles of academic libraries in supporting research 2.0. The Journal of Academic Librarianship, 45(2), 75-80. https://doi.org/10.1016/j.acalib.2019.01.001