Be specific about your topic so you can narrow your search, but be flexible enough to tailor your needs to existing sources.
This is what you should be able to define:
Social Unit: This is the population that you want to study.
It can be:
Time: This is the period of time you want to study.
Things to think about:
Space: Geography or place.
There are two main types of geographic classifications:
Remember to define your topic with enough flexibility to adapt to available data!
Data is not available for every conceivable topic. Some data is hidden (behind a paywall, for example), uncollected, or otherwise unavailable. Be prepared to try alternative data.
Look within a data repository that collects datasets within the general subject area that you are searching for.
Check out the other tabs in this guide for more disciplinary repository examples.
Ask yourself: Who might collect and publish this type of data?
Then visit the organization’s website and see if you're right! Or, search for them as an author in the library catalog.
These are some of the main types of data producers:
Government Agencies
The government collects data to aid in policy decisions and is the largest producer of data overall. For example, the U.S. Census Bureau, Federal Election Commission, Federal Highway Administration, and many other agencies collect and publish data. To better understand the structure of government agencies, read the U.S. Government Manual and browse FedStats. United States government data is generally free and publicly available, but some of it may require access through library resources or special requests.
Non-Government Organizations
Many independent non-commercial and nonprofit organizations collect and publish data that supports their mission. For example, the International Monetary Fund, United Nations, World Health Organization, and many others collect and publish data. Data from NGOs may be free or fee-based.
Academic Institutions
Academic research projects funded by public and private foundations create a wealth of data. For example, the Michigan State of the State Survey, Panel Study of Income Dynamics, American National Election Studies, and many other research projects collect and publish data. Much of this type of data is free and publicly available, but may require access through library resources. Access to data from smaller original research projects may depend on contacting individual researchers.
Private Sector
Commercial firms collect and publish data as a paid service to clients or to sell broadly. Examples include marketing firms, pollsters, trade organizations, and business information providers. This information is almost always fee-based and may not be available for public release.
Search for research studies based on secondary analysis of publicly available data sets.
Unfortunately, citation of research data is often incomplete. Sometimes the best you will get is the title of the data set used, but check to see whether the data or a related publication is cited and follow it up. Don't make the same mistake when you publish: cite your data.
Knowing when to call in reinforcements is important.
Depending on which search strategy you used, you may have already found the dataset file download link directly on a website. Or, you may have just a reference/citation to a dataset or producer. Here are some common ways to find the dataset files themselves.
Once you’ve chosen a data set that you believe will work, evaluate it carefully. Is it appropriate? Does it come from an authoritative source? Does it fit your needs? Does it cover your Where, When, and Who or What requirements? Are you willing to compromise on your requirements or manipulate the data to fit your needs? Always read the documentation and codebook to make sure that the variables really measure what you want to analyze.
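If the data set is a simple table, part of this evaluation can be done with a quick script. The following is only a minimal sketch in Python: the sample data, the column names (`state`, `year`, `median_income`), and the `check_coverage` helper are all hypothetical and stand in for whatever your own file and codebook contain. It reports which places, years, and variables from your requirements do not appear in the data.

```python
import csv
import io

# Hypothetical sample data standing in for a downloaded data set file.
SAMPLE_CSV = """state,year,median_income
Michigan,2010,45000
Michigan,2011,46000
Ohio,2010,44000
"""


def check_coverage(rows, place_col, time_col, places, years, variables):
    """Report which Where / When / What requirements the rows do not cover."""
    rows = list(rows)
    columns = set(rows[0].keys()) if rows else set()
    # Collect the places and years that actually appear in the data.
    found_places = {r[place_col] for r in rows} if place_col in columns else set()
    found_years = {int(r[time_col]) for r in rows} if time_col in columns else set()
    return {
        "missing_places": sorted(set(places) - found_places),
        "missing_years": sorted(set(years) - found_years),
        "missing_variables": sorted(set(variables) - columns),
    }


# Assumed requirements: three states, the years 2010-2012, two variables.
rows = csv.DictReader(io.StringIO(SAMPLE_CSV))
report = check_coverage(
    rows,
    place_col="state",
    time_col="year",
    places=["Michigan", "Ohio", "Indiana"],
    years=range(2010, 2013),
    variables=["median_income", "population"],
)
print(report)
```

Note that this check only looks at the data set as a whole. It does not confirm that every place has every year, and it cannot tell you whether a variable measures what you think it does. That still requires reading the documentation and codebook.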