Datasets: Home

Contact The Library


CHAT

  • To chat with KU Librarians, please see the browser sidebar on the library website


TEXT

  • KU Librarians are available for texting services at 810-255-9009
  • KU Archivist is available for texting services at 810-255-0022


  • For general inquiries and research assistance, please call a Librarian at 810-762-9598
  • For information regarding renewing books, your library account or reserves, please call Circulation at 810-762-7814 
  • For information regarding a book or article request, please call Materials On Demand (MOD) at 810-762-9841
  • For historical questions, please call the Archivist at 810-762-9690

What Are Datasets?

Datasets, also called "data sets," are structured groups of raw data, statistics, and information compiled during and after a research study. They are often presented in spreadsheets or charts. Although there is a growing movement toward open data, not every agency has adopted it yet. Many governmental agencies and non-profit organizations around the world offer their data freely, while most for-profit companies charge a fee for access.

Depending on the topic you're researching, there are many different sources for datasets. For example, if you are looking for information on United States population and demographics, the U.S. Census Bureau is a great starting place. For data on public opinion about modern social issues, such as politics, the media, and technology, you might start by searching the Pew Research Center. Looking for engineering and science data? Try Google Dataset Search, which searches thousands of data repositories around the world and locates the metadata of millions of datasets where they are hosted.
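Because datasets are often distributed as spreadsheets, a first step after downloading one is usually to load it and check its columns. The following is a minimal sketch in Python, assuming the dataset is a CSV file. The sample values and the helper names `load_rows` and `column_mean` are hypothetical and are not taken from any of the sources listed in this guide.

```python
import csv
import io
from statistics import mean

# Hypothetical sample data standing in for a downloaded CSV file.
SAMPLE_CSV = """city,year,population
Alpha,2020,1000
Beta,2020,2500
Gamma,2020,4000
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one dict per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def column_mean(rows, column):
    """Average a numeric column; CSV values arrive as strings, so convert first."""
    return mean(float(row[column]) for row in rows)


rows = load_rows(SAMPLE_CSV)
print(len(rows), "rows; columns:", list(rows[0].keys()))
print("mean population:", column_mean(rows, "population"))
```

To work with a real file, the same approach applies with `open(path, newline="")` in place of `io.StringIO`. Larger datasets may be easier to handle with a dedicated data analysis library.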

Popular Datasets

  • ApolloScape - Includes datasets covering scene parsing, car instance, lane segmentation, detection/tracking, trajectory and more.

  • Audi Autonomous Driving Dataset - 2.3 TB of data including 2D semantic segmentation, 3D point clouds, 3D bounding boxes, and vehicle bus data.

  • Argoverse - Two public datasets supported by highly detailed maps to test, experiment, and teach self-driving vehicles how to understand the world around them.

  • Berkeley DeepDrive - diverse dataset for autonomous driving from UC Berkeley. Also called BDD100K.

  • Cityscapes Dataset - semantic, instance-wise dense pixel annotations of 30 classes. 

  • Comma2k19 - a dataset of over 33 hours of commuting on California's Highway 280.

  • Google Landmark Dataset V2 - 5 million images depicting human-made and natural landmarks spanning 200 thousand classes.

  • Kitti-360 - a large-scale dataset with 3D and 2D annotations.

  • Leddar PixSet - full-waveform flash LiDAR dataset for autonomous vehicle R&D.

  • Level 5 - large collection of 3D annotation, lidar point clouds, traffic agent movement, and semantic map annotations.

  • nuScenes - large-scale public dataset for autonomous driving using the full sensor suite of a real self-driving car. 

  • Oxford Radar Robotcar Dataset - this dataset captures many different combinations of weather, traffic, and pedestrians, along with construction and roadwork.

  • PandaSet - open-source AV dataset combining Hesai’s best-in-class LiDAR sensors with Scale AI’s high-quality data annotation.

  • Udacity Self Driving Car Dataset - this dataset contains 97,942 labels across 11 classes and 15,000 images.

  • Waymo Open Dataset - motion dataset comprising object trajectories and corresponding 3D maps for 103,354 segments.

  • IEEE DataPort - this source holds thousands of open datasets on electrical and electronic engineering. Requires a free account to access.
  • CERN Open Data - over 2 petabytes of particle physics open data. 

  • EarthChem Library - open chemistry and earth science datasets.

  • EarthData - NASA's free and open Earth science data is interactive, interoperable, and accessible for research and societal benefit both today and tomorrow.

  • Figshare - data repository where research outputs are available in a citable, shareable and discoverable manner.

  • Global Health Observatory - the WHO's gateway to health-related statistics for more than 1000 indicators for its 194 Member States.

  • Harvard Dataverse - topics include math, science, engineering, business, social sciences and more.

  • Mendeley Data - over 29 million searchable datasets.

  • OSF - find projects, data, materials, and collaborators on OSF that might be helpful to your own research.

  • Statista Statistics Portal - datasets on a wide variety of sciences and social sciences.

  • UCI Machine Learning Repository - collection of databases, domain theories, and data generators used by the machine learning community for the empirical analysis of machine learning algorithms.

  • FBI Crime Data Explorer - datasets covering topics such as hate crime statistics, human trafficking, assaults on officers, arrest data, and crime by territory.

  • GitHub AwesomeData Social Sciences

  • ICPSR - world's largest social and behavioral science data archive. 250,000 data files available.

  • Open Data Flint - Flint-based data gathered from academic institutions, local organizations, and federal agencies to encourage a healthier and informed community.

  • Data.Gov.UK - search over 17,000 datasets from the government of the United Kingdom.

  • DataHub - thousands of datasets from financial market data and population growth to cryptocurrency prices.

  • International Monetary Fund - datasets on finance, economic outlook, trade, consumer price indices, and more.

  • Open Data Canada - vast array of subjects, including science, health, technology, labor, and transport.

  • UN Data - data from around the world, including social, economic, trade, education, and health indicators.

  • World Bank Open Data - financial and population data for countries around the world.

Citing Your Sources

Public Services Librarian

Meagan Brown
Library: 2-202 AB
