Data Collection – Methods Types and Examples

Data Collection


Data collection is the process of gathering and collecting information from various sources to analyze and make informed decisions based on the data collected. This can involve various methods, such as surveys, interviews, experiments, and observation.

In order for data collection to be effective, it is important to have a clear understanding of what data is needed and what the purpose of the data collection is. This can involve identifying the population or sample being studied, determining the variables to be measured, and selecting appropriate methods for collecting and recording data.

Types of Data Collection

Types of Data Collection are as follows:

Primary Data Collection

Primary data collection is the process of gathering original and firsthand information directly from the source or target population. This type of data collection involves collecting data that has not been previously gathered, recorded, or published. Primary data can be collected through various methods such as surveys, interviews, observations, experiments, and focus groups. The data collected is usually specific to the research question or objective and can provide valuable insights that cannot be obtained from secondary data sources. Primary data collection is often used in market research, social research, and scientific research.

Secondary Data Collection

Secondary data collection is the process of gathering information from existing sources that have already been collected and analyzed by someone else, rather than conducting new research to collect primary data. Secondary data can be collected from various sources, such as published reports, books, journals, newspapers, websites, government publications, and other documents.

Qualitative Data Collection

Qualitative data collection is used to gather non-numerical data such as opinions, experiences, perceptions, and feelings, through techniques such as interviews, focus groups, observations, and document analysis. It seeks to understand the deeper meaning and context of a phenomenon or situation and is often used in social sciences, psychology, and humanities. Qualitative data collection methods allow for a more in-depth and holistic exploration of research questions and can provide rich and nuanced insights into human behavior and experiences.

Quantitative Data Collection

Quantitative data collection is used to gather numerical data that can be analyzed using statistical methods. This data is typically collected through surveys, experiments, and other structured data collection methods. Quantitative data collection seeks to quantify and measure variables, such as behaviors, attitudes, and opinions, in a systematic and objective way. This data is often used to test hypotheses, identify patterns, and establish correlations between variables. Quantitative data collection methods allow for precise measurement and generalization of findings to a larger population. It is commonly used in fields such as economics, psychology, and natural sciences.
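
As a minimal sketch of what "establishing correlations between variables" can look like in practice, the following Python computes a Pearson correlation coefficient for two numeric variables. The function name and the survey figures are hypothetical illustrations, not data from any study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the mean (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical survey responses: hours studied and exam score per respondent.
hours_studied = [2, 4, 6, 8, 10]
exam_score = [55, 60, 70, 80, 85]
r = pearson_r(hours_studied, exam_score)
print(round(r, 3))
```

A value close to 1 or -1 suggests a strong linear relationship, while a value near 0 suggests little linear relationship. Correlation on its own does not show that one variable causes the other.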

Data Collection Methods

Data Collection Methods are as follows:

Surveys

Surveys involve asking a sample of individuals or organizations a set of questions to collect data. Surveys can be conducted in person, over the phone, or online.

Interviews

Interviews involve a one-on-one conversation between the interviewer and the respondent. Interviews can be structured or unstructured and can be conducted in person or over the phone.

Focus Groups

Focus groups are group discussions that are moderated by a facilitator. Focus groups are used to collect qualitative data on a specific topic.


Observation

Observation involves watching and recording the behavior of people, objects, or events in their natural setting. Observation can be done overtly or covertly, depending on the research question.


Experiments

Experiments involve manipulating one or more variables and observing the effect on another variable. Experiments are commonly used in scientific research.

Case Studies

Case studies involve in-depth analysis of a single individual, organization, or event. Case studies are used to gain detailed information about a specific phenomenon.

Secondary Data Analysis

Secondary data analysis involves using existing data that was collected for another purpose. Secondary data can come from various sources, such as government agencies, academic institutions, or private companies.

How to Collect Data

The following are some steps to consider when collecting data:

  • Define the objective : Before you start collecting data, you need to define the objective of the study. This will help you determine what data you need to collect and how to collect it.
  • Identify the data sources : Identify the sources of data that will help you achieve your objective. These sources can be primary sources, such as surveys, interviews, and observations, or secondary sources, such as books, articles, and databases.
  • Determine the data collection method : Once you have identified the data sources, you need to determine the data collection method. This could be through online surveys, phone interviews, or face-to-face meetings.
  • Develop a data collection plan : Develop a plan that outlines the steps you will take to collect the data. This plan should include the timeline, the tools and equipment needed, and the personnel involved.
  • Test the data collection process: Before you start collecting data, test the data collection process to ensure that it is effective and efficient.
  • Collect the data: Collect the data according to the data collection plan you developed earlier. Make sure you record the data accurately and consistently.
  • Analyze the data: Once you have collected the data, analyze it to draw conclusions and make recommendations.
  • Report the findings: Report the findings of your data analysis to the relevant stakeholders. This could be in the form of a report, a presentation, or a publication.
  • Monitor and evaluate the data collection process: After the data collection process is complete, monitor and evaluate the process to identify areas for improvement in future data collection efforts.
  • Ensure data quality: Ensure that the collected data is of high quality and free from errors. This can be achieved by validating the data for accuracy, completeness, and consistency.
  • Maintain data security: Ensure that the collected data is secure and protected from unauthorized access or disclosure. This can be achieved by implementing data security protocols and using secure storage and transmission methods.
  • Follow ethical considerations: Follow ethical considerations when collecting data, such as obtaining informed consent from participants, protecting their privacy and confidentiality, and ensuring that the research does not cause harm to participants.
  • Use appropriate data analysis methods : Use appropriate data analysis methods based on the type of data collected and the research objectives. This could include statistical analysis, qualitative analysis, or a combination of both.
  • Record and store data properly: Record and store the collected data properly, in a structured and organized format. This will make it easier to retrieve and use the data in future research or analysis.
  • Collaborate with other stakeholders : Collaborate with other stakeholders, such as colleagues, experts, or community members, to ensure that the data collected is relevant and useful for the intended purpose.
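
One way to act on the data security and ethics steps above is to replace direct identifiers with pseudonyms before records are stored or shared. The sketch below uses a keyed hash (HMAC-SHA256) from the Python standard library. The field names, the key, and the choice to shorten the digest to 16 characters are all assumptions made for illustration; a real project would follow its own data protection requirements.

```python
import hashlib
import hmac

# Hypothetical secret; in practice this would be stored separately from the data.
SECRET_KEY = b"replace-with-a-real-secret"

def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Return a stable keyed hash that stands in for the real identifier."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    # Shortened for readability in this sketch; keep the full digest if collisions matter.
    return digest.hexdigest()[:16]

def strip_identifiers(record: dict, secret_key: bytes) -> dict:
    """Drop the name and email fields and add a pseudonymous participant ID."""
    cleaned = {k: v for k, v in record.items() if k not in {"name", "email"}}
    cleaned["participant_id"] = pseudonymize(record["email"], secret_key)
    return cleaned

# Hypothetical survey response.
response = {"name": "Jane Doe", "email": "jane@example.org", "satisfaction": 4}
safe_response = strip_identifiers(response, SECRET_KEY)
print(sorted(safe_response))
```

Because the same email always maps to the same pseudonym under a given key, repeated responses from one participant can still be linked without storing who that participant is.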

Applications of Data Collection

Data collection methods are widely used in different fields, including social sciences, healthcare, business, education, and more. Here are some examples of how data collection methods are used in different fields:

  • Social sciences : Social scientists often use surveys, questionnaires, and interviews to collect data from individuals or groups. They may also use observation to collect data on social behaviors and interactions. This data is often used to study topics such as human behavior, attitudes, and beliefs.
  • Healthcare : Data collection methods are used in healthcare to monitor patient health and track treatment outcomes. Electronic health records and medical charts are commonly used to collect data on patients’ medical history, diagnoses, and treatments. Researchers may also use clinical trials and surveys to collect data on the effectiveness of different treatments.
  • Business : Businesses use data collection methods to gather information on consumer behavior, market trends, and competitor activity. They may collect data through customer surveys, sales reports, and market research studies. This data is used to inform business decisions, develop marketing strategies, and improve products and services.
  • Education : In education, data collection methods are used to assess student performance and measure the effectiveness of teaching methods. Standardized tests, quizzes, and exams are commonly used to collect data on student learning outcomes. Teachers may also use classroom observation and student feedback to gather data on teaching effectiveness.
  • Agriculture : Farmers use data collection methods to monitor crop growth and health. Sensors and remote sensing technology can be used to collect data on soil moisture, temperature, and nutrient levels. This data is used to optimize crop yields and minimize waste.
  • Environmental sciences : Environmental scientists use data collection methods to monitor air and water quality, track climate patterns, and measure the impact of human activity on the environment. They may use sensors, satellite imagery, and laboratory analysis to collect data on environmental factors.
  • Transportation : Transportation companies use data collection methods to track vehicle performance, optimize routes, and improve safety. GPS systems, on-board sensors, and other tracking technologies are used to collect data on vehicle speed, fuel consumption, and driver behavior.

Examples of Data Collection

Examples of Data Collection are as follows:

  • Traffic Monitoring: Cities collect real-time data on traffic patterns and congestion through sensors on roads and cameras at intersections. This information can be used to optimize traffic flow and improve safety.
  • Social Media Monitoring : Companies can collect real-time data on social media platforms such as Twitter and Facebook to monitor their brand reputation, track customer sentiment, and respond to customer inquiries and complaints in real-time.
  • Weather Monitoring: Weather agencies collect real-time data on temperature, humidity, air pressure, and precipitation through weather stations and satellites. This information is used to provide accurate weather forecasts and warnings.
  • Stock Market Monitoring : Financial institutions collect real-time data on stock prices, trading volumes, and other market indicators to make informed investment decisions and respond to market fluctuations in real-time.
  • Health Monitoring : Medical devices such as wearable fitness trackers and smartwatches can collect real-time data on a person’s heart rate, blood pressure, and other vital signs. This information can be used to monitor health conditions and detect early warning signs of health issues.

Purpose of Data Collection

The purpose of data collection can vary depending on the context and goals of the study, but generally, it serves to:

  • Provide information: Data collection provides information about a particular phenomenon or behavior that can be used to better understand it.
  • Measure progress : Data collection can be used to measure the effectiveness of interventions or programs designed to address a particular issue or problem.
  • Support decision-making : Data collection provides decision-makers with evidence-based information that can be used to inform policies, strategies, and actions.
  • Identify trends : Data collection can help identify trends and patterns over time that may indicate changes in behaviors or outcomes.
  • Monitor and evaluate : Data collection can be used to monitor and evaluate the implementation and impact of policies, programs, and initiatives.

When to use Data Collection

Data collection is used when there is a need to gather information or data on a specific topic or phenomenon. It is typically used in research, evaluation, and monitoring and is important for making informed decisions and improving outcomes.

Data collection is particularly useful in the following scenarios:

  • Research : When conducting research, data collection is used to gather information on variables of interest to answer research questions and test hypotheses.
  • Evaluation : Data collection is used in program evaluation to assess the effectiveness of programs or interventions, and to identify areas for improvement.
  • Monitoring : Data collection is used in monitoring to track progress towards achieving goals or targets, and to identify any areas that require attention.
  • Decision-making: Data collection is used to provide decision-makers with information that can be used to inform policies, strategies, and actions.
  • Quality improvement : Data collection is used in quality improvement efforts to identify areas where improvements can be made and to measure progress towards achieving goals.

Characteristics of Data Collection

Data collection can be characterized by several important characteristics that help to ensure the quality and accuracy of the data gathered. These characteristics include:

  • Validity : Validity refers to the accuracy and relevance of the data collected in relation to the research question or objective.
  • Reliability : Reliability refers to the consistency and stability of the data collection process, ensuring that the results obtained are consistent over time and across different contexts.
  • Objectivity : Objectivity refers to the impartiality of the data collection process, ensuring that the data collected is not influenced by the biases or personal opinions of the data collector.
  • Precision : Precision refers to the level of detail and exactness of the data collected, ensuring that the data is specific enough to answer the research question or objective.
  • Timeliness : Timeliness refers to the efficiency and speed with which the data is collected, ensuring that the data is collected in a timely manner to meet the needs of the research or evaluation.
  • Ethical considerations : Ethical considerations refer to the ethical principles that must be followed when collecting data, such as ensuring confidentiality and obtaining informed consent from participants.

Advantages of Data Collection

There are several advantages of data collection that make it an important process in research, evaluation, and monitoring. These advantages include:

  • Better decision-making : Data collection provides decision-makers with evidence-based information that can be used to inform policies, strategies, and actions, leading to better decision-making.
  • Improved understanding: Data collection helps to improve our understanding of a particular phenomenon or behavior by providing empirical evidence that can be analyzed and interpreted.
  • Evaluation of interventions: Data collection is essential in evaluating the effectiveness of interventions or programs designed to address a particular issue or problem.
  • Identifying trends and patterns: Data collection can help identify trends and patterns over time that may indicate changes in behaviors or outcomes.
  • Increased accountability: Data collection increases accountability by providing evidence that can be used to monitor and evaluate the implementation and impact of policies, programs, and initiatives.
  • Validation of theories: Data collection can be used to test hypotheses and validate theories, leading to a better understanding of the phenomenon being studied.
  • Improved quality: Data collection is used in quality improvement efforts to identify areas where improvements can be made and to measure progress towards achieving goals.

Limitations of Data Collection

While data collection has several advantages, it also has some limitations that must be considered. These limitations include:

  • Bias : Data collection can be influenced by the biases and personal opinions of the data collector, which can lead to inaccurate or misleading results.
  • Sampling bias : The sample from which data is collected may not be representative of the entire population, leading to inaccurate results.
  • Cost : Data collection can be expensive and time-consuming, particularly for large-scale studies.
  • Limited scope: Data collection is limited to the variables being measured, which may not capture the entire picture or context of the phenomenon being studied.
  • Ethical considerations : Data collection must follow ethical principles to protect the rights and confidentiality of the participants, which can limit the type of data that can be collected.
  • Data quality issues: Data collection may result in data quality issues such as missing or incomplete data, measurement errors, and inconsistencies.
  • Limited generalizability : Findings based on the data collected may not apply to other contexts or populations.
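
Sampling bias, listed above, is often reduced by sampling each subgroup in proportion to its size. The following is a minimal sketch of proportional stratified random sampling using only the standard library. The function name, the urban/rural split, and the 10% sampling fraction are hypothetical choices for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, fraction, seed=0):
    """Draw the same fraction from every stratum so each group stays represented."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    strata = defaultdict(list)
    for unit in population:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        # Take at least one unit per stratum, even from very small groups.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 80 urban and 20 rural respondents.
population = [("urban", i) for i in range(80)] + [("rural", i) for i in range(20)]
sample = stratified_sample(population, stratum_of=lambda unit: unit[0], fraction=0.1)

counts = {"urban": 0, "rural": 0}
for region, _ in sample:
    counts[region] += 1
print(counts)  # {'urban': 8, 'rural': 2}
```

A simple random sample of 10 from the same population could, by chance, contain no rural respondents at all; stratifying guarantees that both groups appear in the same 80/20 proportion as the population.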

About the author

Muhammad Hassan

Researcher, Academic Writer, Web developer

Table of Contents

  • What Is Data Collection?
  • Why Do We Need Data Collection?
  • What Are the Different Data Collection Methods?
  • Data Collection Tools
  • The Importance of Ensuring Accurate and Appropriate Data Collection
  • Issues Related to Maintaining the Integrity of Data Collection
  • What Are Common Challenges in Data Collection?
  • What Are the Key Steps in the Data Collection Process?
  • Data Collection Considerations and Best Practices
  • Choose the Right Data Science Program
  • Are You Interested in a Career in Data Science?

What is Data Collection? Definition, Types, Tools, and Techniques

Data collection is the process of gathering and analyzing accurate data from various sources to answer research problems, identify trends and probabilities, and evaluate possible outcomes. Knowledge is power, information is knowledge, and data is information in digitized form, at least as defined in IT. Hence, data is power. But before you can leverage that data into a successful strategy for your organization or business, you need to gather it. That’s your first step.

So, to help you get the process started, we shine a spotlight on data collection. What exactly is it? Believe it or not, it’s more than just doing a Google search! Furthermore, what are the different types of data collection? And what kinds of data collection tools and data collection techniques exist?

If you want to get up to speed on what the data collection process is, you’ve come to the right place.

Data collection is the process of collecting and evaluating information or data from multiple sources to find answers to research problems, answer questions, evaluate outcomes, and forecast trends and probabilities. It is an essential phase in all types of research, analysis, and decision-making, including that done in the social sciences, business, and healthcare.

Accurate data collection is necessary to make informed business decisions, ensure quality assurance, and keep research integrity.

During data collection, researchers must identify the data types, the sources of data, and the methods being used. We will soon see that there are many different data collection methods. Research, commercial, and government fields all rely heavily on data collection.

Before an analyst begins collecting data, they must answer three questions first:

  • What’s the goal or purpose of this research?
  • What kinds of data are they planning on gathering?
  • What methods and procedures will be used to collect, store, and process the information?

Additionally, we can break up data into qualitative and quantitative types. Qualitative data covers descriptions such as color, size, quality, and appearance. Quantitative data, unsurprisingly, deals with numbers, such as statistics, poll numbers, percentages, etc.

Before a judge makes a ruling in a court case or a general creates a plan of attack, they must have as many relevant facts as possible. The best courses of action come from informed decisions, and information and data are synonymous.

The concept of data collection isn’t a new one, as we’ll see later, but the world has changed. There is far more data available today, and it exists in forms that were unheard of a century ago. The data collection process has had to change and grow with the times, keeping pace with technology.

Whether you’re in the world of academia, trying to conduct research, or part of the commercial sector, thinking of how to promote a new product, you need data collection to help you make better choices.

Now that you know what data collection is and why we need it, let's take a look at the different methods of data collection. While the phrase “data collection” may sound all high-tech and digital, it doesn’t necessarily entail things like computers, big data, and the internet. Data collection could mean a telephone survey, a mail-in comment card, or even some guy with a clipboard asking passersby some questions. But let’s see if we can sort the different data collection methods into a semblance of organized categories.

Primary and secondary methods of data collection are two approaches used to gather information for research or analysis purposes. Let's explore each data collection method in detail:

1. Primary Data Collection:

Primary data collection involves the collection of original data directly from the source or through direct interaction with the respondents. This method allows researchers to obtain firsthand information specifically tailored to their research objectives. There are various techniques for primary data collection, including:

a. Surveys and Questionnaires: Researchers design structured questionnaires or surveys to collect data from individuals or groups. These can be conducted through face-to-face interviews, telephone calls, mail, or online platforms.

b. Interviews: Interviews involve direct interaction between the researcher and the respondent. They can be conducted in person, over the phone, or through video conferencing. Interviews can be structured (with predefined questions), semi-structured (allowing flexibility), or unstructured (more conversational).

c. Observations: Researchers observe and record behaviors, actions, or events in their natural setting. This method is useful for gathering data on human behavior, interactions, or phenomena without direct intervention.

d. Experiments: Experimental studies involve the manipulation of variables to observe their impact on the outcome. Researchers control the conditions and collect data to draw conclusions about cause-and-effect relationships.

e. Focus Groups: Focus groups bring together a small group of individuals who discuss specific topics in a moderated setting. This method helps in understanding opinions, perceptions, and experiences shared by the participants.

2. Secondary Data Collection:

Secondary data collection involves using existing data collected by someone else for a purpose different from the original intent. Researchers analyze and interpret this data to extract relevant information. Secondary data can be obtained from various sources, including:

a. Published Sources: Researchers refer to books, academic journals, magazines, newspapers, government reports, and other published materials that contain relevant data.

b. Online Databases: Numerous online databases provide access to a wide range of secondary data, such as research articles, statistical information, economic data, and social surveys.

c. Government and Institutional Records: Government agencies, research institutions, and organizations often maintain databases or records that can be used for research purposes.

d. Publicly Available Data: Data shared by individuals, organizations, or communities on public platforms, websites, or social media can be accessed and utilized for research.

e. Past Research Studies: Previous research studies and their findings can serve as valuable secondary data sources. Researchers can review and analyze the data to gain insights or build upon existing knowledge.

Now that we’ve explained the various techniques, let’s narrow our focus even further by looking at some specific tools. For example, we mentioned interviews as a technique, but we can further break that down into different interview types (or “tools”).

Word Association

The researcher gives the respondent a set of words and asks them what comes to mind when they hear each word.

Sentence Completion

Researchers use sentence completion to understand what kind of ideas the respondent has. This tool involves giving an incomplete sentence and seeing how the interviewee finishes it.


Role-Playing

Respondents are presented with an imaginary situation and asked how they would act or react if it were real.

In-Person Surveys

The researcher asks questions in person.

Online/Web Surveys

These surveys are easy to accomplish, but some users may be unwilling to answer truthfully, if at all.

Mobile Surveys

These surveys take advantage of the increasing proliferation of mobile technology. Mobile collection surveys rely on mobile devices like tablets or smartphones to conduct surveys via SMS or mobile apps.

Phone Surveys

No researcher can call thousands of people at once, so they need a third party to handle the chore. However, many people have call screening and won’t answer.


Direct Observation

Sometimes, the simplest method is the best. Researchers who make direct observations collect data quickly and easily, with little intrusion or third-party bias. Naturally, it’s only effective in small-scale situations.

Accurate data collection is crucial to preserving the integrity of research, regardless of the field of study or the preferred way of defining data (quantitative or qualitative). Errors are less likely to occur when the right data collection tools are used, whether they are newly developed, modified versions of existing ones, or already available.

The effects of incorrectly performed data collection include the following:

  • Erroneous conclusions that squander resources
  • Decisions that compromise public policy
  • Inability to answer research questions accurately
  • Harm to human or animal participants
  • Misleading other researchers into pursuing futile research avenues
  • The study's inability to be replicated and validated

The degree of impact from flawed data collection may vary by discipline and type of investigation, but when such findings are used to support public policy recommendations, they have the potential to cause disproportionate harm.

Let us now look at the various issues that we might face while maintaining the integrity of data collection.

The main reason for maintaining data integrity is to support the detection of errors in the data gathering process, whether they were made intentionally (deliberate falsifications) or not (systematic or random errors).

Quality assurance and quality control are two strategies that help protect data integrity and guarantee the scientific validity of study results.

Each strategy is used at various stages of the research timeline:

  • Quality assurance: activities that take place before data collection begins
  • Quality control: activities performed during and after data collection

Let us explore each of them in more detail now.

Quality Assurance

Because quality assurance comes before data collection, its primary goal is "prevention" (i.e., forestalling problems with data collection). Prevention is the best way to protect the accuracy of data collection. The clearest example of this proactive step is a standardized protocol set out in a thorough and comprehensive procedures manual for data collection.

Poorly written manuals increase the likelihood that problems and mistakes go unnoticed early in the research effort. These shortcomings can show up in several ways:

  • Failure to specify the exact topics and methods for training or retraining staff members in data collection
  • An incomplete list of the items to be collected
  • No system in place to track changes to procedures that may occur as the investigation continues
  • A vague description of the data collection tools to be used, instead of detailed, step-by-step instructions on how to administer tests
  • Uncertainty about when, how, and by whom the data will be reviewed
  • Unclear instructions for using, adjusting, and calibrating the data collection equipment

Now, let us look at how to ensure Quality Control.

Quality Control

Although quality control activities (detection/monitoring and intervention) take place during and after data collection, the specifics should be carefully documented in the procedures manual. A clearly defined communication structure is a prerequisite for establishing monitoring systems. Once data collection problems are discovered, there should be no ambiguity about how information flows between the principal investigators and staff. A poorly designed communication structure encourages lax oversight and reduces opportunities for error detection.

Detection or monitoring can take the form of direct staff observation during site visits, conference calls, or regular and frequent reviews of data reports to spot discrepancies, extreme values, or invalid codes. Site visits might not be appropriate for all disciplines. Still, without routine auditing of records, whether qualitative or quantitative, it will be challenging for investigators to confirm that data collection is taking place in accordance with the methods defined in the manual. Additionally, quality control determines the appropriate responses, or "actions," to fix flawed data collection procedures and reduce recurrences.
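
A routine review of data reports like the one described above can be partly automated. The sketch below scans records for missing values, invalid codes, and out-of-range values. The codebook, the age range, and the record layout are hypothetical; a real study would take them from its own procedures manual.

```python
# Hypothetical codebook and plausible ranges for this sketch.
VALID_CODES = {"gender": {"F", "M", "X"}}
VALID_RANGES = {"age": (18, 99)}

def audit_records(records):
    """Return a list of (record_index, field, problem) tuples for follow-up."""
    problems = []
    for index, record in enumerate(records):
        for field, allowed in VALID_CODES.items():
            value = record.get(field)
            if value is None:
                problems.append((index, field, "missing"))
            elif value not in allowed:
                problems.append((index, field, f"invalid code {value!r}"))
        for field, (low, high) in VALID_RANGES.items():
            value = record.get(field)
            if value is None:
                problems.append((index, field, "missing"))
            elif not low <= value <= high:
                problems.append((index, field, f"out of range: {value}"))
    return problems

# Hypothetical collected records, three of which contain a problem.
records = [
    {"gender": "F", "age": 34},
    {"gender": "Q", "age": 27},
    {"gender": "M", "age": 240},
    {"gender": "X"},
]
for problem in audit_records(records):
    print(problem)
```

A check like this only flags records for a person to review; deciding whether a flagged value is a typo, a coding error, or a genuine outlier remains a judgment call for the study team.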

Problems with data collection, for instance, that call for immediate action include:

  • Fraud or misbehavior
  • Systematic mistakes, procedure violations 
  • Individual data items with errors
  • Issues with certain staff members or a site's performance 

In the social and behavioral sciences, where primary data collection involves human subjects, researchers are trained to include one or more secondary measures that can be used to verify the quality of the information obtained from those subjects.

For instance, a researcher conducting a survey might be interested in the prevalence of risky behaviors among young adults as well as the social factors that influence the likelihood and frequency of those behaviors.

Let us now explore a few of the common challenges in data collection, so we can understand them better and avoid them.

Data Quality Issues

The main threat to the broad and successful application of machine learning is poor data quality. Data quality must be your top priority if you want to make technologies like machine learning work for you. Let's talk about some of the most prevalent data quality problems in this blog article and how to fix them.

Inconsistent Data

When working with multiple data sources, the same information may differ between sources. The differences could be in formats, units, or occasionally spellings. Inconsistent data can also be introduced during company mergers or system migrations. Inconsistencies tend to accumulate and reduce the value of data if they are not continually resolved. Organizations that focus heavily on data consistency do so because they want only reliable data to support their analytics.
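One way to picture resolving such inconsistencies is a small normalization layer applied before data from different sources is combined. The sketch below is illustrative only: the accepted date formats, the country aliases, and the function names are assumptions, and a real project would take them from its own data dictionary.

```python
from datetime import datetime

# Assumed input formats and spellings; extend these to match the real sources.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y%m%d")
COUNTRY_ALIASES = {
    "usa": "United States",
    "u.s.": "United States",
    "united states": "United States",
}
LB_TO_KG = 0.45359237


def normalize_date(text):
    """Return an ISO 8601 date string, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None  # leave unparseable values for manual review


def normalize_weight_kg(value, unit):
    """Convert a weight to kilograms, rounded to two decimals."""
    unit = unit.strip().lower()
    if unit == "kg":
        return round(float(value), 2)
    if unit in ("lb", "lbs"):
        return round(float(value) * LB_TO_KG, 2)
    raise ValueError(f"unknown unit: {unit!r}")


def normalize_country(name):
    """Map known spelling variants to one canonical spelling."""
    cleaned = name.strip()
    return COUNTRY_ALIASES.get(cleaned.lower(), cleaned)


print(normalize_date("31/01/2023"), normalize_weight_kg(150, "lbs"), normalize_country(" USA "))
```

Returning `None` for unrecognized dates, rather than guessing, keeps ambiguous values visible instead of silently converting them.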

Data Downtime

Data drives the decisions and operations of data-driven businesses. However, there may be brief periods when their data is unreliable or not ready. This data unavailability can have a significant impact on businesses, from customer complaints to subpar analytical results. A data engineer is said to spend about 80% of their time updating, maintaining, and guaranteeing the integrity of the data pipeline. Because the operational lead time from data capture to insight is long, the marginal cost of asking the next business question is high.

Schema changes and migration problems are just two examples of the causes of data downtime. Data pipelines can be difficult to manage because of their size and complexity. Data downtime must be monitored continuously and reduced through automation.
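Automated monitoring of this kind can start very simply. The sketch below shows two hypothetical checks: one that reports tables that have not been refreshed recently, and one that compares actual column names with an expected schema. The table names, the 24-hour threshold, and the function names are all assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def stale_tables(last_loaded, now, max_age=timedelta(hours=24)):
    """Return names of tables whose most recent load is older than max_age."""
    return sorted(
        name for name, loaded_at in last_loaded.items() if now - loaded_at > max_age
    )


def schema_drift(expected_columns, actual_columns):
    """Return (missing, unexpected) column names relative to the expected schema."""
    expected, actual = set(expected_columns), set(actual_columns)
    return sorted(expected - actual), sorted(actual - expected)


# Fixed timestamps keep the example reproducible.
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
last_loaded = {
    "orders": datetime(2024, 1, 2, 9, 0, tzinfo=timezone.utc),
    "customers": datetime(2023, 12, 30, 9, 0, tzinfo=timezone.utc),
}

print(stale_tables(last_loaded, now))
print(schema_drift(["id", "email", "created_at"], ["id", "email_address", "created_at"]))
```

In practice such checks would run on a schedule and raise an alert. Here they simply return values so the logic is easy to follow.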

Ambiguous Data

Even with thorough oversight, some errors can still occur in large databases or data lakes. The problem becomes more overwhelming when data streams in at high speed. Spelling mistakes can go unnoticed, formatting problems can occur, and column headings can be misleading. Such ambiguous data can cause a number of problems for reporting and analytics.

Duplicate Data

Streaming data, local databases, and cloud data lakes are just a few of the sources of data that modern enterprises must contend with. They might also have application and system silos. These sources are likely to duplicate and overlap each other quite a bit. For instance, duplicate contact information has a substantial impact on customer experience. If certain prospects are ignored while others are engaged repeatedly, marketing campaigns suffer. The likelihood of biased analytical outcomes increases when duplicate data are present. It can also result in ML models with biased training data.
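A minimal sketch of removing duplicate contact records might look like the following. It assumes, purely for illustration, that two contacts are the same person when their email addresses match after trimming whitespace and ignoring case. Real matching rules are usually more involved.

```python
def normalize_email(email):
    """Trim whitespace and lowercase so trivially different spellings match."""
    return email.strip().lower()


def deduplicate_contacts(contacts):
    """Keep the first record seen for each normalized email address."""
    unique = {}
    for contact in contacts:
        key = normalize_email(contact["email"])
        unique.setdefault(key, contact)
    return list(unique.values())


contacts = [
    {"name": "Ana Silva", "email": "Ana@Example.com"},
    {"name": "A. Silva", "email": " ana@example.com "},
    {"name": "Bo Chen", "email": "bo@example.com"},
]

for contact in deduplicate_contacts(contacts):
    print(contact["name"], normalize_email(contact["email"]))
```

Keeping the first occurrence is an arbitrary choice. A merge step that combines fields from all duplicates may suit some data sets better.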

Too Much Data

Although we emphasize data-driven analytics and its advantages, having too much data is itself a data quality problem. There is a risk of getting lost in an abundance of data when searching for information relevant to your analytical efforts. Data scientists, data analysts, and business users are said to devote 80% of their work to finding and organizing the appropriate data. As data volume increases, other data quality problems become more serious, particularly when dealing with streaming data and large files or databases.

Inaccurate Data

For highly regulated businesses like healthcare, data accuracy is crucial. Given the current experience, it is more important than ever to increase the data quality for COVID-19 and later pandemics. Inaccurate information does not provide you with a true picture of the situation and cannot be used to plan the best course of action. Personalized customer experiences and marketing strategies underperform if your customer data is inaccurate.

Data inaccuracies can be attributed to a number of causes, including data decay, human error, and data drift. Worldwide, data is estimated to decay at a rate of about 3% per month, which is quite concerning. Data integrity can be compromised while data is transferred between systems, and data quality can deteriorate over time.
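To get a feel for what a monthly decay rate of about 3% means over a year, the short calculation below assumes the rate compounds, so that each month 3% of the still-valid records go stale. That compounding model is an assumption made for illustration. The figure itself is the one quoted above.

```python
def share_still_valid(monthly_decay_rate, months):
    """Fraction of records still valid after compounding monthly decay."""
    return (1 - monthly_decay_rate) ** months


remaining = share_still_valid(0.03, 12)
print(f"Still valid after 12 months: {remaining:.1%}")
print(f"Decayed after 12 months: {1 - remaining:.1%}")
```

Under this assumption roughly 69% of records remain valid after a year, so close to a third would need to be refreshed or verified.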

Hidden Data

Most businesses use only a portion of their data, with the remainder sometimes lost in data silos or discarded in data graveyards. For instance, the customer service team might not receive client data from sales, missing an opportunity to build more precise and comprehensive customer profiles. Hidden data causes organizations to miss opportunities to develop new products, enhance services, and streamline processes.

Finding Relevant Data

Finding relevant data is not easy. There are several factors that we need to consider while trying to find relevant data, including:

  • Relevant domain
  • Relevant demographics
  • Relevant time period

Many other factors may also need to be considered.

Data that is not relevant to our study on any of these factors is effectively unusable, and we cannot proceed with its analysis effectively. This could lead to incomplete research or analysis, collecting data again and again, or shutting down the study.

Deciding the Data to Collect

Determining what data to collect is one of the most important decisions in data collection and should be one of the first. We must choose the subjects the data will cover, the sources we will use to gather it, and the quantity of information we will require. Our answers to these questions will depend on our aims, or what we expect to achieve using our data. As an illustration, we may choose to gather information on the categories of articles that website visitors between the ages of 20 and 50 most frequently access. We can also decide to compile data on the typical age of all the clients who made a purchase from our business over the previous month.

Not addressing this could lead to duplicated work, the collection of irrelevant data, or the failure of the study as a whole.

Dealing With Big Data

Big data refers to extremely large data sets with more intricate and diversified structures. These traits typically make the data harder to store and analyze, and harder to extract results from. In particular, big data refers to data sets that are so large or complex that conventional data processing tools are insufficient. It covers the overwhelming amount of data, both unstructured and structured, that a business faces on a daily basis.

The amount of data produced by healthcare applications, the internet, social networking sites, sensor networks, and many other sources is growing rapidly as a result of recent technological advancements. Big data refers to the vast volume of data created from numerous sources in a variety of formats at extremely fast rates. Dealing with this kind of data is one of the many challenges of data collection and is a crucial step toward collecting data effectively.

Low Response and Other Research Issues

Poor design and low response rates have been shown to be two issues with data collection, particularly in health surveys that use questionnaires. This can leave a study with insufficient data. Creating an incentivized data collection program might help in this case to get more responses.

Now, let us look at the key steps in the data collection process.

The data collection process has five key steps, explained briefly below.

1. Decide What Data You Want to Gather

The first thing that we need to do is decide what information we want to gather. We must choose the subjects the data will cover, the sources we will use to gather it, and the quantity of information that we would require. For instance, we may choose to gather information on the categories of products that an average e-commerce website visitor between the ages of 30 and 45 most frequently searches for. 

2. Establish a Deadline for Data Collection

We can now begin creating a strategy for data collection. We should set a deadline for our data collection at the outset of the planning phase. Some forms of data we might want to collect continuously. For instance, we might want to set up a method for tracking transactional data and website visitor statistics over the long term. However, if we are tracking data for a particular campaign, we will track it over a defined time frame. In these situations, we will have a schedule for when we will begin and finish gathering data.

3. Select a Data Collection Approach

We will select the data collection technique that will serve as the foundation of our data gathering plan at this stage. We must take into account the type of information that we wish to gather, the time period during which we will receive it, and the other factors we decide on to choose the best gathering strategy.

4. Gather Information

Once our plan is complete, we can put it into action and begin gathering data. We can store and organize our data in our data management platform (DMP). We need to be careful to follow our plan and keep an eye on how it is going. Especially if we are collecting data regularly, it may be helpful to set up a timetable for checking in on how our data gathering is progressing. As circumstances change and we learn new details, we might need to amend our plan.

5. Examine the Information and Apply Your Findings

Once we have gathered all of our information, it is time to examine our data and organize our findings. The analysis stage is essential because it transforms raw data into insights that can be applied to improve our marketing strategies, products, and business decisions. The analytics tools included in our DMP can assist with this phase. Once we have discovered the patterns and insights in our data, we can put them to use to improve our business.

Let us now look at some data collection considerations and best practices that one might follow.

We must plan carefully before spending time and money traveling to the field to gather data. Effective data collection strategies can help us collect richer, more accurate data while saving time and resources.

Below, we will be discussing some of the best practices that we can follow for the best results -

1. Take Into Account the Price of Each Extra Data Point

Once we have decided on the data we want to gather, we need to make sure to take the expense of doing so into account. Our surveyors and respondents will incur additional costs for each additional data point or survey question.

2. Plan How to Gather Each Data Piece

Freely accessible data is scarce. Sometimes the data exists, but we may not have access to it. For instance, unless we have a compelling reason, we cannot openly view another person's medical information. Some types of information can also be challenging to measure.

Consider how time-consuming and difficult it will be to gather each piece of information while deciding what data to acquire.

3. Think About Your Choices for Data Collecting Using Mobile Devices

Mobile-based data collecting can be divided into three categories -

  • IVRS (interactive voice response system) - Calls the respondents and asks them pre-recorded questions.
  • SMS data collection - Sends a text message to the respondent, who can then answer questions by text on their phone.
  • Field surveyors - Enter data directly into an interactive questionnaire while speaking to each respondent, using a smartphone app.

We need to select the appropriate tool for our survey and respondents because each one has its own advantages and disadvantages.

4. Carefully Consider the Data You Need to Gather

It's all too easy to get information about anything and everything, but it's crucial to only gather the information that we require. 

It is helpful to consider these 3 questions:

  • What details will be helpful?
  • What details are available?
  • What specific details do you require?

5. Remember to Consider Identifiers

Identifiers, or details describing the context and source of a survey response, are just as crucial as the information about the subject or program that we are actually researching.

In general, adding more identifiers will enable us to pinpoint our program's successes and failures with greater accuracy, but moderation is the key.

6. Data Collecting Through Mobile Devices is the Way to Go

Although collecting data on paper is still common, modern technology relies heavily on mobile devices. They enable us to gather many various types of data at relatively lower prices and are accurate as well as quick. There aren't many reasons not to pick mobile-based data collecting with the boom of low-cost Android devices that are available nowadays.

1. What is data collection with example?

Data collection is the process of collecting and analyzing information on relevant variables in a predetermined, methodical way so that one can respond to specific research questions, test hypotheses, and assess results. Data collection can be either qualitative or quantitative. Example: A company collects customer feedback through online surveys and social media monitoring to improve their products and services.

2. What are the primary data collection methods?

As is well known, gathering primary data is costly and time intensive. The main techniques for gathering data are observation, interviews, questionnaires, schedules, and surveys.

3. What are data collection tools?

The term "data collecting tools" refers to the tools/devices used to gather data, such as a paper questionnaire or a system for computer-assisted interviews. Tools used to gather data include case studies, checklists, interviews, occasionally observation, surveys, and questionnaires.

4. What’s the difference between quantitative and qualitative methods?

While qualitative research focuses on words and meanings, quantitative research deals with figures and statistics. You can systematically measure variables and test hypotheses using quantitative methods. You can delve deeper into ideas and experiences using qualitative methodologies.

5. What are quantitative data collection methods?

Probability sampling, interviews, questionnaires, observation, and document review are the most typical and frequently used methods for collecting quantitative information, whether offline or online, although there are numerous others.

6. What is mixed methods research?

User research that includes both qualitative and quantitative techniques is known as mixed methods research. For deeper user insights, mixed methods research combines insightful user data with useful statistics.

7. What are the benefits of collecting data?

Collecting data offers several benefits, including:

  • Knowledge and Insight
  • Evidence-Based Decision Making
  • Problem Identification and Solution
  • Validation and Evaluation
  • Identifying Trends and Predictions
  • Support for Research and Development
  • Policy Development
  • Quality Improvement
  • Personalization and Targeting
  • Knowledge Sharing and Collaboration

8. What’s the difference between reliability and validity?

Reliability is about consistency and stability, while validity is about accuracy and appropriateness. Reliability focuses on the consistency of results, while validity focuses on whether the results are actually measuring what they are intended to measure. Both reliability and validity are crucial considerations in research to ensure the trustworthiness and meaningfulness of the collected data and measurements.

Data collection in research: Your complete guide

Last updated

31 January 2023

Reviewed by

Cathy Heath

In the late 16th century, Francis Bacon coined the phrase "knowledge is power," which implies that knowledge is a powerful force, like physical strength. In the 21st century, knowledge in the form of data is unquestionably powerful.

But data isn't something you just have - you need to collect it. This means utilizing a data collection process and turning the collected data into knowledge that you can leverage into a successful strategy for your business or organization.

Believe it or not, there's more to data collection than just conducting a Google search. In this complete guide, we shine a spotlight on data collection, outlining what it is, types of data collection methods, common challenges in data collection, data collection techniques, and the steps involved in data collection.

  • What is data collection?

There are two specific data collection techniques: primary and secondary data collection. Primary data collection is the process of gathering data directly from sources. It's often considered the most reliable data collection method, as researchers can collect information directly from respondents.

Secondary data collection relies on data that has already been collected by someone else and is readily available. This data is usually less expensive and quicker to obtain than primary data.

  • What are the different methods of data collection?

There are several data collection methods, which can be either manual or automated. Manual data collection is carried out by hand, typically with pen and paper, while automated data collection uses software to collect data from online sources, such as social media, website data, and transaction data.

Here are the five most popular methods of data collection:

Surveys

Surveys are a very popular method of data collection that organizations can use to gather information from many people. Researchers can conduct multi-mode surveys that reach respondents in different ways, including in person, by mail, over the phone, or online.

As a method of data collection, surveys have several advantages. For instance, they are relatively quick and easy to administer, you can be flexible in what you ask, and they can be tailored to collect data on various topics or from certain demographics.

However, surveys also have several disadvantages. For instance, they can be expensive to administer, and the results may not represent the population as a whole. Additionally, survey data can be challenging to interpret. It may also be subject to bias if the questions are not well-designed or if the sample of people surveyed is not representative of the population of interest.

Interviews

Interviews are a common method of collecting data in social science research. You can conduct interviews in person, over the phone, or even via email or online chat.

Interviews are a great way to collect qualitative and quantitative data. Qualitative interviews are likely your best option if you need to collect detailed information about your subjects' experiences or opinions. If you need to collect more generalized data about your subjects' demographics or attitudes, then quantitative interviews may be a better option.

Interviews are relatively quick and very flexible, allowing you to ask follow-up questions and explore topics in more depth. The downside is that interviews can be time-consuming and expensive due to the amount of information to be analyzed. They are also prone to bias, as both the interviewer and the respondent may have certain expectations or preconceptions that may influence the data.

Direct observation

Observation is a direct way of collecting data. It can be structured (with a specific protocol to follow) or unstructured (simply observing without a particular plan).

Organizations and businesses use observation as a data collection method to gather information about their target market, customers, or competition. Businesses can learn about consumer behavior, preferences, and trends by observing people using their products or services.

There are two types of observation: participatory and non-participatory. In participatory observation, the researcher is actively involved in the observed activities. This type of observation is used in ethnographic research, where the researcher wants to understand a group's culture and social norms. Non-participatory observation is when researchers observe from a distance and do not interact with the people or environment they are studying.

There are several advantages to using observation as a data collection method. It can provide insights that may not be apparent through other methods, such as surveys or interviews. Researchers can also observe behavior in a natural setting, which can provide a more accurate picture of what people do and how and why they behave in a certain context.

There are some disadvantages to using observation as a method of data collection. It can be time-consuming, intrusive, and expensive to observe people for extended periods. Observations can also be tainted if the researcher is not careful to avoid personal biases or preconceptions.

Automated data collection

Business applications and websites are increasingly collecting data electronically to improve the user experience or for marketing purposes.

There are a few different ways that organizations can collect data automatically. One way is through cookies, which are small pieces of data stored on a user's computer. They track a user's browsing history and activity on a site, measuring levels of engagement with a business’s products or services, for example.
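To make the cookie mechanism a little more concrete, the sketch below uses Python's standard `http.cookies` module to build the cookie value a site might send to a browser and to read the identifier back from a later request. The cookie name `visitor_id`, the 30-day lifetime, and the helper function names are assumptions for illustration only.

```python
import uuid
from http.cookies import SimpleCookie

THIRTY_DAYS_IN_SECONDS = 60 * 60 * 24 * 30


def issue_visitor_cookie(visitor_id=None):
    """Build the value of a Set-Cookie header carrying a visitor identifier."""
    cookie = SimpleCookie()
    cookie["visitor_id"] = visitor_id or uuid.uuid4().hex
    cookie["visitor_id"]["max-age"] = THIRTY_DAYS_IN_SECONDS
    cookie["visitor_id"]["path"] = "/"
    return cookie.output(header="").strip()


def read_visitor_id(cookie_header):
    """Extract the visitor identifier from a request's Cookie header, if present."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("visitor_id")
    return morsel.value if morsel else None


print(issue_visitor_cookie("abc123"))
print(read_visitor_id("visitor_id=abc123; theme=dark"))
```

The identifier alone says nothing about the visitor. It only lets the site recognize the same browser across requests, and that link is what makes engagement measurable over time.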

Another way organizations can collect data automatically is through web beacons. Web beacons are small images embedded on a web page to track a user's activity.

Finally, organizations can also collect data through mobile apps, which can track user location, device information, and app usage. This data can be used to improve the user experience and for marketing purposes.

Automated data collection is a valuable tool for businesses, helping improve the user experience or target marketing efforts. Businesses should aim to be transparent about how they collect and use this data.

Sourcing data through information service providers

Organizations need to be able to collect data from a variety of sources, including social media, weblogs, and sensors. The process to do this and then use the data for action needs to be efficient, targeted, and meaningful.

In the era of big data, organizations are increasingly turning to information service providers (ISPs) and other external data sources to help them collect data to make crucial decisions. 

Information service providers help organizations collect data by offering personalized services that suit the specific needs of the organizations. These services can include data collection, analysis, management, and reporting. By partnering with an ISP, organizations can gain access to the newest technology and tools to help them to gather and manage data more effectively.

There are also several tools and techniques that organizations can use to collect data from external sources, such as web scraping, which collects data from websites, and data mining, which involves using algorithms to extract data from large data sets. 
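As a small, hedged illustration of web scraping, the sketch below pulls link targets out of an HTML string using only Python's standard `html.parser` module. It works on a hard-coded snippet rather than a live website, and the class and function names are hypothetical. A real scraper would also need to fetch pages and respect each site's terms of use.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every anchor tag encountered."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html_text):
    """Return all link targets found in the given HTML text, in document order."""
    collector = LinkCollector()
    collector.feed(html_text)
    return collector.links


page = '<p><a href="/reports">Reports</a> and <a href="https://example.com/data">data</a></p>'
print(extract_links(page))
```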

Organizations can also use APIs (application programming interface) to collect data from external sources. APIs allow organizations to access data stored in another system and share and integrate it into their own systems.

Finally, organizations can also use manual methods to collect data from external sources. This can involve contacting companies or individuals directly to request data. By using the right tools and methods, organizations can get the insights they need.

  • What are common challenges in data collection?

There are many challenges that researchers face when collecting data. Here are five common examples:

Big data environments

Data collection can be a challenge in big data environments for several reasons. First, data can be located in different places, such as archives, libraries, or online, and the sheer volume of data can make it difficult to identify the most relevant data sets.

Second, the complexity of data sets can make it challenging to extract the desired information. Third, the distributed nature of big data environments can make it difficult to collect data promptly and efficiently.

It is therefore important to have a well-designed data collection strategy that considers the specific needs of the organization and which data sets are the most relevant. Alongside this, consideration should be given to the tools and resources available to support data collection and protect data from unintended use.

Data bias

Data bias is a common challenge in data collection. It occurs when data is collected from a sample that is not representative of the population of interest.

There are different types of data bias, but some common ones include selection bias, self-selection bias, and response bias. Selection bias can occur when the collected data does not represent the population being studied. For example, if a study only includes data from people who volunteer to participate, that data may not represent the general population.

Self-selection bias can also occur when people self-select into a study, such as by taking part only if they think they will benefit from it. Response bias happens when people respond in a way that is not honest or accurate, such as by only answering questions that make them look good. 

These types of data bias present a challenge because they can lead to inaccurate results and conclusions about behaviors, perceptions, and trends. Data bias can be avoided by identifying potential sources or themes of bias and setting guidelines for eliminating them.
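One simple way to look for signs of selection bias is to compare the make-up of the sample with known figures for the population. The sketch below does this for a single grouping variable. The age bands and population shares are invented for illustration, and a gap on its own suggests, rather than proves, that the sample is unrepresentative.

```python
from collections import Counter


def representation_gaps(sample_groups, population_shares):
    """Return sample share minus population share for each group.

    Positive values mean the group is over-represented in the sample.
    """
    total = len(sample_groups)
    if total == 0:
        raise ValueError("sample is empty")
    counts = Counter(sample_groups)
    return {
        group: counts.get(group, 0) / total - share
        for group, share in population_shares.items()
    }


# Hypothetical survey respondents and population figures.
sample = ["18-29"] * 6 + ["30-49"] * 3 + ["50+"] * 1
population = {"18-29": 0.2, "30-49": 0.4, "50+": 0.4}

for group, gap in representation_gaps(sample, population).items():
    print(f"{group}: {gap:+.0%}")
```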

Lack of quality assurance processes

One of the biggest challenges in data collection is the lack of quality assurance processes. This can lead to several problems, including incorrect data, missing data, and inconsistencies between data sets.

Quality assurance is important because there are many data sources, and each source may have different levels of quality or corruption. There are also different ways of collecting data, and data quality may vary depending on the method used. 

There are several ways to improve quality assurance in data collection. These include developing clear and consistent goals and guidelines for data collection, implementing quality control measures, using standardized procedures, and employing data validation techniques. By taking these steps, you can ensure that your data is of adequate quality to inform decision-making.
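Two of the problems named above, missing data and inconsistencies between data sets, lend themselves to simple automated checks. The sketch below is one possible shape for them. The record layout, field names, and function names are hypothetical.

```python
def missing_rates(records, fields):
    """Return the share of records in which each field is absent or empty."""
    total = len(records)
    rates = {}
    for field in fields:
        missing = sum(1 for record in records if record.get(field) in (None, ""))
        rates[field] = missing / total if total else 0.0
    return rates


def inconsistent_ids(left, right, field):
    """Return ids present in both data sets whose values for a field disagree.

    Both arguments are dictionaries mapping a record id to a record.
    """
    shared_ids = left.keys() & right.keys()
    return sorted(
        record_id
        for record_id in shared_ids
        if left[record_id].get(field) != right[record_id].get(field)
    )


survey = [
    {"id": 1, "email": "ana@example.com", "city": "Lisbon"},
    {"id": 2, "email": "", "city": "Porto"},
    {"id": 3, "email": None, "city": None},
    {"id": 4, "email": "bo@example.com", "city": "Faro"},
]
crm = {1: {"city": "Lisbon"}, 2: {"city": "Braga"}, 5: {"city": "Faro"}}
survey_by_id = {record["id"]: record for record in survey}

print(missing_rates(survey, ["email", "city"]))
print(inconsistent_ids(survey_by_id, crm, "city"))
```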

Limited access to data

Another challenge in data collection is limited access to data. This can be due to several reasons, including privacy concerns, the sensitive nature of the data, security concerns, or simply the fact that data is not readily available.

Legal and compliance regulations

Most countries have regulations governing how data can be collected, used, and stored. In some cases, data collected in one country may not be used in another. This means gaining a global perspective can be a challenge. 

For example, if a company is required to comply with the EU General Data Protection Regulation (GDPR), it may not be able to collect data from individuals in the EU without a lawful basis, such as their explicit consent. This can make it difficult to collect data from a target audience.

Legal and compliance regulations can be complex, and it's important to ensure that all data collected is done so in a way that complies with the relevant regulations.

  • What are the key steps in the data collection process?

There are five steps involved in the data collection process. They are:

1. Decide what data you want to gather

Have a clear understanding of the questions you are asking, and then consider where the answers might lie and how you might obtain them. This saves time and resources by avoiding the collection of irrelevant data, and helps maintain the quality of your datasets. 

2. Establish a deadline for data collection

Establishing a deadline for data collection helps you avoid collecting too much data, which can be costly and time-consuming to analyze. It also allows you to plan for data analysis and prompt interpretation. Finally, it helps you meet your research goals and objectives and allows you to move forward.

3. Select a data collection approach

The data collection approach you choose will depend on several factors, including the type of data you need, available resources, and the project timeline. For instance, if you need qualitative data, you might choose a focus group or interview methodology. If you need quantitative data, a survey or observational study may be the most appropriate form of collection.

4. Gather information

When collecting data for your business, identify your business goals first. Once you know what you want to achieve, you can start collecting data to reach those goals. The most important thing is to ensure that the data you collect is reliable and valid. Otherwise, any decisions you make using the data could result in a negative outcome for your business.

5. Examine the information and apply your findings

As a researcher, it's important to examine the data you have collected before you apply your findings, because data can be misleading and lead to inaccurate conclusions. Ask yourself: is it what you were expecting? Is it similar to other datasets you have looked at?

There are many scientific ways to examine data, but some common methods include:

  • looking at the distribution of data points
  • examining the relationships between variables
  • looking for outliers

By taking the time to examine your data and noticing any patterns, strange or otherwise, you can avoid making mistakes that could invalidate your research.
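The three checks above can be sketched with Python's standard library. This is an illustrative example only: the `hours` and `scores` values are made-up sample data, and the 1.5 × IQR rule used to flag outliers is one common convention among several.

```python
import statistics


def summarize(values):
    """Describe the distribution of a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "median": median,
        "stdev": statistics.stdev(values),
        "q1": q1,
        "q3": q3,
    }


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical sample: hours of study and test scores for eight participants.
hours = [1, 2, 3, 4, 5, 6, 7, 8]
scores = [52, 55, 61, 64, 70, 73, 79, 140]

print(summarize(scores))        # distribution of data points
print(pearson(hours, scores))   # relationship between variables
print(iqr_outliers(scores))     # candidate outliers
```

A flagged value is not automatically an error. It is a prompt to go back to the source and check whether it reflects a recording mistake or a genuine observation.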

How qualitative analysis software streamlines the data collection process

Knowledge derived from data does indeed carry power. However, if you don't convert the knowledge into action, it will remain a resource of unexploited energy and wasted potential.

Luckily, data collection tools enable organizations to streamline their data collection and analysis processes and leverage the derived knowledge to grow their businesses. For instance, qualitative analysis software can be highly advantageous in data collection by streamlining the process, making it more efficient and less time-consuming.

Qualitative analysis software also provides a structure for data collection and analysis, which helps keep data quality high. It can help uncover patterns and relationships that would otherwise be difficult to discern. Moreover, you can use it to replace more expensive data collection methods, such as focus groups or surveys.

Overall, qualitative analysis software can be valuable for any researcher looking to collect and analyze data. By increasing efficiency, improving data quality, and providing greater insights, qualitative software can help to make the research process much more efficient and effective.



Data Collection Methods

Data collection is the process of gathering information from all relevant sources to find answers to the research problem, test the hypothesis (if you are following a deductive approach) and evaluate the outcomes. Data collection methods can be divided into two categories: secondary methods of data collection and primary methods of data collection.

Secondary Data Collection Methods

Secondary data is data that has already been published in books, newspapers, magazines, journals, online portals etc. These sources hold an abundance of data about your research area in business studies, almost regardless of the nature of that area. Applying an appropriate set of criteria to select the secondary data used in the study therefore plays an important role in increasing research validity and reliability.

These criteria include, but are not limited to, the date of publication, the credentials of the author, the reliability of the source, the quality of discussions, the depth of analyses, and the extent to which the text contributes to the development of the research area. Secondary data collection is discussed in greater depth in the Literature Review chapter.

Secondary data collection methods offer a range of advantages, such as saving time, effort and expense. However, they have a major disadvantage: secondary research does not contribute to the expansion of the literature by producing fresh (new) data.

Primary Data Collection Methods

Primary data is data that did not exist before; it is the unique findings of your research. Primary data collection and analysis typically require more time and effort than secondary data research. Primary data collection methods can be divided into two groups: quantitative and qualitative.

Quantitative data collection methods are based on mathematical calculations in various formats. Methods of quantitative data collection and analysis include questionnaires with closed-ended questions, methods of correlation and regression, mean, mode and median, and others.

Quantitative methods are often cheaper to apply and can be applied within a shorter period of time than qualitative methods. Moreover, because quantitative methods are highly standardised, it is easy to compare findings.

Qualitative research methods, on the contrary, do not involve numbers or mathematical calculations. Qualitative research is closely associated with words, sounds, feelings, emotions, colours and other elements that are non-quantifiable.

Qualitative studies aim to ensure a greater depth of understanding. Qualitative data collection methods include interviews, questionnaires with open-ended questions, focus groups, observation, game or role-playing, case studies etc.

Your choice between quantitative or qualitative methods of data collection depends on the area of your research and the nature of research aims and objectives.


John Dudovskiy


Data Collection Methods: A Comprehensive View

  • Written by John Terra
  • Updated on February 21, 2024


Companies that want to be competitive in today’s digital economy enjoy the benefit of countless reams of data available for market research. In fact, thanks to the advent of big data, there’s a veritable tidal wave of information ready to be put to good use, helping businesses make intelligent decisions and thrive.

But before that data can be used, it must be processed. But before it can be processed, it must be collected, and that’s what we’re here for. This article explores the subject of data collection. We will learn about the types of data collection methods and why they are essential.

We will detail primary and secondary data collection methods and discuss data collection procedures. We’ll also share how you can learn practical skills through online data science training.

But first, let’s get the definition out of the way. What is data collection?

What is Data Collection?

Data collection is the act of collecting, measuring and analyzing different kinds of information using a set of validated standard procedures and techniques. The primary objective of data collection procedures is to gather reliable, information-rich data and analyze it to make critical business decisions. Once the desired data is collected, it undergoes a process of data cleaning and processing to make the information actionable and valuable for businesses.

Your choice of data collection method (also called a data gathering procedure) depends on the research questions you're working on, the type of data required, and the available time and resources. You can categorize data-gathering procedures into two main methods:

  • Primary data collection. Primary data is collected via first-hand experiences and does not rely on previously collected data. The data obtained by primary data collection methods is exceptionally accurate and geared to the research's motive. They are divided into two categories: quantitative and qualitative. We'll explore the specifics later.
  • Secondary data collection. Secondary data is the information that's been used in the past. The researcher can obtain data from internal and external sources, including organizational data.

Let’s take a closer look at specific examples of both data collection methods.

The Specific Types of Data Collection Methods

As mentioned, primary data collection methods are split into quantitative and qualitative. We will examine each method’s data collection tools separately. Then, we will discuss secondary data collection methods.

Quantitative Methods

Quantitative techniques for demand forecasting and market research typically use statistical tools. When using these techniques, historical data is used to forecast demand. These primary data-gathering procedures are most often used to make long-term forecasts. Statistical analysis methods are highly reliable because they carry minimal subjectivity.

  • Barometric Method. Also called the leading indicators approach, data analysts and researchers employ this method to speculate on future trends based on current developments. When past events are used to predict future events, they are considered leading indicators.
  • Smoothing Techniques. Smoothing techniques can be used in cases where the time series lacks significant trends. These techniques eliminate random variation from historical demand and help identify demand levels and patterns to estimate future demand. The most popular methods used in these techniques are the simple moving average and the weighted moving average methods.
  • Time Series Analysis. The term “time series” refers to a sequence of values of a variable recorded at equal time intervals; the long-run direction of the series is known as the trend. Using patterns in the series, organizations can predict customer demand for their products and services over the projected period.
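To illustrate the two smoothing techniques named above, here is a minimal sketch of a simple moving average and a weighted moving average. The demand figures, the window of three periods, and the weights are made-up values chosen for illustration.

```python
def simple_moving_average(series, window):
    """Average of each run of `window` consecutive values."""
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]


def weighted_moving_average(series, weights):
    """Weighted average of each run of len(weights) consecutive values.

    Weights are listed oldest to newest, so a larger final weight
    gives more influence to the most recent observation.
    """
    window = len(weights)
    if window == 0 or window > len(series):
        raise ValueError("need between 1 and len(series) weights")
    total = sum(weights)
    return [
        sum(w * x for w, x in zip(weights, series[i - window + 1 : i + 1])) / total
        for i in range(window - 1, len(series))
    ]


# Hypothetical monthly demand for a product.
demand = [120, 130, 125, 140, 135, 150]

sma = simple_moving_average(demand, window=3)
wma = weighted_moving_average(demand, weights=[1, 2, 3])

print(sma)
print(wma)
```

In this kind of setup, the last value of either smoothed series is often used as a naive estimate of demand for the next period. The weighted version reacts faster to recent changes because the newest observation carries the largest weight.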

Qualitative Methods

Qualitative data collection methods are instrumental when no historical information is available, or numbers and mathematical calculations aren’t required. Qualitative research is closely linked to words, emotions, sounds, feelings, colors, and other non-quantifiable elements. These techniques rely on experience, conjecture, intuition, judgment, emotion, etc. Quantitative methods do not provide motives behind the participants’ responses. Additionally, they often don’t reach underrepresented populations and usually involve long data collection periods. Therefore, you get the best results using quantitative and qualitative methods together.

  • Questionnaires. Questionnaires are a printed set of either open-ended or closed-ended questions. Respondents must answer based on their experience and knowledge of the issue. A questionnaire is often one part of a survey, but it can also be used on its own without being part of one.
  • Surveys. Surveys collect data from target audiences, gathering insights into their opinions, preferences, choices, and feedback on the organization’s goods and services. Most survey software has a wide range of question types, or you can also use a ready-made survey template that saves time and effort. Surveys can be distributed via different channels such as e-mail, offline apps, websites, social media, QR codes, etc.

Once researchers collect the data, survey software generates reports and runs analytics algorithms to uncover hidden insights. Survey dashboards give you statistics relating to completion rates, response rates, filters based on demographics, export and sharing options, etc. Practical business intelligence depends on the synergy between analytics and reporting. Analytics uncovers valuable insights while reporting communicates these findings to the stakeholders.

  • Polls. Polls consist of one or more multiple-choice questions. Marketers can turn to polls when they want to take a quick snapshot of the audience’s sentiments. Since polls tend to be short, getting people to respond is more manageable. Like surveys, online polls can be embedded into various media and platforms. Once the respondents answer the question(s), they can be shown how they stand concerning other people’s responses.
  • Delphi Technique. The name is a callback to the Oracle of Delphi, a priestess at Apollo’s temple in ancient Greece, renowned for her prophecies. In this method, marketing experts are given the forecast estimates and assumptions made by other industry experts. The first batch of experts may then use the information provided by the other experts to revise and reconsider their estimates and assumptions. The total expert consensus on the demand forecasts creates the final demand forecast.
  • Interviews. In this method, interviewers talk to the respondents either face-to-face or by telephone. In the first case, the interviewer asks the interviewee a series of questions in person and notes the responses. The interviewer can opt for a telephone interview if the parties cannot meet in person. This data collection form is practical for use with only a few respondents; repeating the same process with a considerably larger group takes longer.
  • Focus Groups. Focus groups are one of the primary examples of qualitative data in education. In focus groups, small groups of people, usually around 8-10 members, discuss the research problem’s common aspects. Each person provides their insights on the issue, and a moderator regulates the discussion. When the discussion ends, the group reaches a consensus.

Secondary Data Collection Methods

Secondary data is the information that's been used in past situations. Secondary data collection methods can include quantitative and qualitative techniques. In addition, secondary data is easily available, so it's less time-consuming and expensive than using primary data. However, the authenticity of data gathered with secondary data collection tools can be difficult to verify.

Internal secondary data sources:

  • CRM Software
  • Executive summaries
  • Financial Statements
  • Mission and vision statements
  • Organization’s health and safety records
  • Sales Reports

External secondary data sources:

  • Business journals
  • Government reports
  • Press releases

The Importance of Data Collection Methods

Data collection methods play a critical part in the research process as they determine the quality and accuracy of the collected data. Here’s a sample of some reasons why data collection procedures are so important:

  • They determine the quality and accuracy of collected data
  • They ensure the data and the research findings are valid, relevant and reliable
  • They help reduce bias and increase the sample’s representation
  • They are crucial for making informed decisions and arriving at accurate conclusions
  • They provide accurate data, which facilitates the achievement of research objectives

So, What’s the Difference Between Data Collecting and Data Processing?

Data collection is the first step in data processing. Data collection involves gathering information (raw data) from various sources such as interviews, surveys, questionnaires, etc. Data processing describes the steps taken to organize, manipulate and transform the collected data into a useful and meaningful resource. This process may include tasks such as cleaning and validating data, analyzing and summarizing data, and creating visualizations or reports.

So, data collection is just one step in the overall data processing chain of events.



4.5 Data Collection Methods

Choosing the most appropriate and practical data collection method is an important decision that must be made carefully. It is important to recognise that the quality of data collected in a qualitative manner is a direct reflection of the skill and competence of the researcher. Advanced interpersonal skills are required, especially the ability to accurately interpret and respond to subtle participant behavior in a variety of situations. Interviews, focus groups and observations are the primary methods of data collection used in qualitative healthcare research (Figure 4.7). 62


Interviews can be used to explore individual participants’ views, experiences, beliefs and motivations. There are three fundamental types of research interviews: structured, semi-structured and unstructured.

Structured interviews, also known as standardised open-ended interviews, are carefully prepared ahead of time, and each participant is asked the same question in a certain sequence. 63 A structured interview is essentially an oral questionnaire in which a pre-determined list of questions is asked, with little or no variation and no room for follow-up questions to answers that require further clarification. 63 Structured interviews are relatively quick and easy to develop and use and are especially useful when you need clarification on a specific question. 63 However, by its very nature, it allows only a limited number of participant responses, so it is of little use if “depth” is desired. This approach resists improvisation and the pursuit of intuition but can promote consistency among participants. 63

Semi-structured interviews, also known as the general interview guide approach, include an outline of questions to ensure that all pertinent topics are covered. 63 A semi-structured interview consists of a few key questions that help define the area to be explored but also allow the interviewer or respondent to diverge and explore ideas or responses in more detail. 64 This interview format is used most frequently in healthcare, as it provides participants with some guidance about what to talk about. The flexibility of this approach, especially when compared to structured interviews, is that it allows participants to discover or refine important information that may not have been previously considered relevant by the research team. 63

Unstructured interviews, also known as informal conversational interviews, consist of questions that are spontaneously generated in the natural flow of conversation, reflect no preconceptions or ideas, and have little or no organisation. 65 Such conversations can easily start with an opening question such as, “Can you tell me about your experience at the clinic?” It then proceeds primarily based on the initial response. Unstructured interviews tend to be very lengthy (often hours), lack pre-set interview questions, and provide little guidance on what to talk about, which can be difficult for participants. 63

As a result, they are often considered only when great “depth” is required, little is known about the subject, or another viewpoint on a known issue is requested. 63 Significant freedom in unstructured interviews allows for more intuitive and spontaneous exchanges between the researcher and the participants. 63

Advantages and Disadvantages

Interviews can be conducted by phone, face-to-face or online, depending on participants’ preferences and availability. Participants are often flattered to be asked, so they make the time to speak with you and reward you with candour. 66 Usually, interviews provide flexibility to schedule sessions at the convenience of the interviewees. 66 Interviews also involve less observer or participant bias, as other participants’ experiences or opinions do not influence the interviewee. They give interviewees enough talk time and spare them from spending time listening to others. Additionally, the interviewer can observe the non-verbal behaviour of the interviewee and potentially record it as data. 66

Interviews also have inherent weaknesses. Conducting interviews can be very costly and time-consuming. 66 Interviews also provide less anonymity, which is usually a major concern for many respondents. 66 Nonetheless, qualitative interviews can be a valuable tool to help uncover meaning and understanding of phenomena. 66


Focus group

Focus groups are group interviews that explore participants’ knowledge and experiences and how and why individuals act in various ways. 67   This method involves bringing a small group together to discuss a specific topic or issue. The groups typically include 6-8 participants and are conducted by an experienced moderator who follows a topic guide or interview guide. 67 The conversations can be audio or videotaped and then transcribed, depending on the researchers’ and participants’ preferences. In addition, focus groups can include an observer who records nonverbal parts of the encounter, potentially with the help of an observation guide. 67

Advantages and disadvantages

Focus groups effectively bring together homogenous groups of people with relevant expertise and experience on a specific issue and can offer comprehensive information. 67 They are often used to gather information about group dynamics, attitudes, and perceptions and can provide a rich source of data. 67

Disadvantages include less control over the process and a lower level of participation by each individual. 67 Also, focus group moderators, as well as those responsible for data processing, require prior experience. Focus groups are less suitable for discussing sensitive themes as some participants may be reluctant to express their opinions in a group environment. 67 Furthermore, it is important to watch for the creation of “groupthink” or dominance of certain group members, as group dynamics and social dynamics can influence focus groups. 67


Observations involve the researcher observing and recording the behaviour and interactions of individuals or groups in a natural setting. 67 Observations are especially valuable for gaining insights about a specific situation and real behaviour. They can be participant (the researcher participates in the activity) or non-participant (the researcher observes from a distance) in nature. 67 The observer in participant observations is a member of the observed context, such as a nurse working in an intensive care unit. The observer is “on the outside looking in” in non-participant observations, i.e. present but not a part of the scenario, attempting not to impact the environment by their presence. 67 During the observation, the observer notes everything or specific elements of what is happening around them, such as physician-patient interactions or communication between different professional groups. 67

The advantage of performing observations includes reducing the gap between the researcher and the study. Issues may be found that the researcher was unaware of and are relevant in gaining a greater understanding of the research. 67 However, observation can be time-consuming, as the researcher may need to observe the behaviour or interactions for an extended period to collect enough data. In addition, they can be influenced by the researcher’s biases, which can affect the accuracy and validity of the data collected. 68

An Introduction to Research Methods for Undergraduate Health Profession Students Copyright © 2023 by Faith Alele and Bunmi Malau-Aduli is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License , except where otherwise noted.


Data Collection Methods: Sources & Examples


Data is a collection of facts, figures, objects, symbols, and events gathered from different sources. Organizations collect data using various data collection methods to make better decisions. Without data, it would be difficult for organizations to make appropriate decisions, so data is collected from different audiences at various points in time.

For example, an organization must collect data on product demand, customer preferences, and competitors before launching a new product. If data is not collected beforehand, the organization’s newly launched product may fail for many reasons, such as low demand or an inability to meet customer needs.

Although data is a valuable asset for every organization, it does not serve any purpose until it is analyzed or processed to achieve the desired results.

What are Data Collection Methods?

Data collection methods are techniques and procedures for gathering information for research purposes. They can range from simple self-reported surveys to more complex experiments and can involve either quantitative or qualitative approaches.

Some common data collection methods include surveys, interviews, observations, focus groups, experiments, and secondary data analysis. The data collected through these methods can then be analyzed and used to support or refute research hypotheses and draw conclusions about the study’s subject matter.


Understanding Data Collection Methods

Data collection methods encompass a variety of techniques and tools for gathering both quantitative and qualitative data. These methods are integral to the data collection process, ensuring accurate and comprehensive data acquisition. 

Quantitative data collection methods involve systematic approaches to collecting numerical data, such as surveys, polls, and statistical analysis, aimed at quantifying phenomena and trends.

Conversely, qualitative data collection methods focus on capturing non-numerical information through techniques such as interviews, focus groups, and observations, to delve deeper into understanding attitudes, behaviors, and motivations.

Employing a combination of quantitative and qualitative data collection techniques can enrich organizations’ datasets and yield comprehensive insights into complex phenomena.

Effective utilization of accurate data collection tools and techniques enhances the accuracy and reliability of collected data, facilitating informed decision-making and strategic planning.

Importance of Data Collection Methods

Data collection methods play a crucial role in the research process as they determine the quality and accuracy of the data collected. Here are some of the main reasons why data collection methods are important.

  • Quality and Accuracy: The choice of data collection method directly impacts the quality and accuracy of the data obtained. Properly designed methods help ensure that the data collected is relevant to the research questions and free from errors.
  • Relevance, Validity, and Reliability: Effective data collection methods help ensure that the data collected is relevant to the research objectives, valid (measuring what it intends to measure), and reliable (consistent and reproducible).
  • Bias Reduction and Representativeness: Carefully chosen data collection methods can help minimize biases inherent in the research process, such as sampling bias or response bias. They also aid in achieving a representative sample, enhancing the findings’ generalizability.
  • Informed Decision Making: Accurate and reliable data collected through appropriate methods provide a solid foundation for making informed decisions based on research findings. This is crucial for both academic research and practical applications in various fields.
  • Achievement of Research Objectives: Data collection methods should align with the research objectives to ensure that the collected data effectively addresses the research questions or hypotheses. Properly collected data facilitates the attainment of these objectives.
  • Support for Validity and Reliability: Validity and reliability are essential aspects of research quality. The choice of data collection methods can either enhance or detract from the validity and reliability of research findings. Therefore, selecting appropriate methods is critical for ensuring the credibility of the research.

The importance of data collection methods cannot be overstated, as they play a key role in the research study’s overall success and internal validity.


Types of Data Collection Methods

The choice of data collection method depends on the research question being addressed, the type of data needed, and the resources and time available. Data collection methods can be categorized into primary and secondary methods.

1. Primary Data Collection Methods

Primary data is collected first-hand and has not been used in the past. Data gathered through primary data collection methods is specific to the research’s motive and tends to be highly accurate.

Primary data collection methods can be divided into two categories: quantitative methods and qualitative methods.

Quantitative Methods:

Quantitative techniques for market research and demand forecasting usually use statistical tools. In these techniques, demand is forecasted based on historical data. These methods of primary data collection are generally used to make long-term forecasts. Statistical analysis methods are highly reliable as subjectivity is minimal.


  • Time Series Analysis: A time series is a sequence of values of a variable recorded at equal time intervals; the pattern in those values is known as a trend. Using these patterns, an organization can predict the demand for its products and services over a projected time period.
  • Smoothing Techniques: Smoothing techniques can be used in cases where the time series lacks significant trends. They eliminate random variation from the historical demand, helping identify patterns and demand levels to estimate future demand. The most common methods used in smoothing demand forecasting are the simple moving average and weighted moving average methods.
  • Barometric Method: Also known as the leading indicators approach, this method is used to anticipate future trends based on current developments. When past events are considered to predict future events, they act as leading indicators.
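
The two smoothing methods named above can be illustrated with a short sketch. The demand figures, the window size, and the weights below are made up for illustration, and the function names are hypothetical; treating the last smoothed value as a forecast is one simple convention, not the only one.

```python
from typing import List, Sequence


def simple_moving_average(values: Sequence[float], window: int) -> List[float]:
    """Return the plain average of each run of `window` consecutive values."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i : i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def weighted_moving_average(
    values: Sequence[float], weights: Sequence[float]
) -> List[float]:
    """Return the weighted average of each run of len(weights) values.

    weights[-1] is applied to the most recent value in each window.
    """
    window = len(weights)
    total = sum(weights)
    if window == 0 or window > len(values) or total == 0:
        raise ValueError("weights must be non-empty, sum to non-zero, and fit the data")
    return [
        sum(v * w for v, w in zip(values[i : i + window], weights)) / total
        for i in range(len(values) - window + 1)
    ]


# Hypothetical monthly demand figures, made up for illustration.
demand = [120, 130, 125, 140, 135, 150]

sma = simple_moving_average(demand, window=3)
wma = weighted_moving_average(demand, weights=[1, 2, 3])

# One simple convention: use the last smoothed value as next month's estimate.
print(round(sma[-1], 2), round(wma[-1], 2))
```

The weighted version reacts faster to recent changes because the latest month carries the largest weight, while the simple version treats every month in the window equally.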

Qualitative Methods:

Qualitative data collection methods are especially useful when historical data is unavailable or when numbers or mathematical calculations are unnecessary.

Qualitative research is closely associated with words, sounds, feelings, emotions, colors, and non-quantifiable elements. These techniques are based on experience, judgment, intuition, conjecture, emotion, etc.

Quantitative methods do not provide the motive behind participants’ responses, often don’t reach underrepresented populations, and require long periods of time to collect the data. Hence, it is best to combine quantitative methods with qualitative methods.

1. Surveys: Surveys collect data from the target audience and gather insights into their preferences, opinions, choices, and feedback related to a business’s products and services. Most survey software offers a wide range of question types.

You can also use a ready-made survey template to save time and effort. Online surveys can be customized to match the business’s brand by changing the theme, logo, etc. They can be distributed through several channels, such as email, website, offline app, QR code, social media, etc. 

You can select the channel based on your audience’s type and source. Once the data is collected, survey software can generate various reports and run analytics algorithms to discover hidden insights. 

A survey dashboard can give you statistics related to response rate and completion rate, along with demographics-based filters and export and sharing options. Integrating survey builders with third-party apps can help you get the most out of the effort spent on online real-time data collection.
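
As a rough illustration of the two dashboard statistics mentioned above, the sketch below assumes one common pair of definitions: response rate as the share of invited people who started the survey, and completion rate as the share of those who started and then finished. Survey tools may define these differently, and the counts are hypothetical.

```python
def rate(part: int, whole: int) -> float:
    """Return part/whole as a percentage, or 0.0 when whole is zero."""
    return 100.0 * part / whole if whole else 0.0


# Hypothetical counts, made up for illustration.
invited, started, completed = 500, 200, 150

response_rate = rate(started, invited)      # share of invitees who started
completion_rate = rate(completed, started)  # share of starters who finished

print(f"Response rate: {response_rate:.1f}%")
print(f"Completion rate: {completion_rate:.1f}%")
```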

Practical business intelligence relies on the synergy between analytics and reporting, where analytics uncovers valuable insights, and reporting communicates these findings to stakeholders.

2. Polls: Polls consist of one single-choice or multiple-choice question. They are useful when you need to get a quick pulse of the audience’s sentiments. Because they are short, it is easier to get responses from people.

Like surveys, online polls can be embedded into various platforms. Once the respondents answer the question, they can also be shown how they compare to others’ responses.

3. Interviews: In this method, the interviewer questions the respondents face-to-face or by telephone. In face-to-face interviews, the interviewer asks the interviewee a series of questions in person and notes down the responses. If it is not feasible to meet the person, the interviewer can go for a telephone interview.

This form of data collection is suitable for only a few respondents. It is too time-consuming and tedious to repeat the same process if there are many participants.


4. Delphi Technique: In the Delphi method, market experts are provided with the estimates and assumptions of other industry experts’ forecasts. Experts may reconsider and revise their estimates and assumptions based on this information. The consensus of all experts on demand forecasts constitutes the final demand forecast.

5. Focus Groups: Focus groups are one example of qualitative data in education. In a focus group, a small group of people, around 8-10 members, discuss the common areas of the research problem. Each individual provides his or her insights on the issue concerned.

A moderator regulates the discussion among the group members. At the end of the discussion, the group reaches a consensus.

6. Questionnaire: A questionnaire is a printed set of open-ended or closed-ended questions that respondents must answer based on their knowledge and experience with the issue. A questionnaire is often part of a survey, but its end goal may or may not be a survey.

Secondary Data Collection Methods

Secondary data is data that has been used in the past. The researcher can obtain it from data sources both internal and external to the organization.

Internal sources of secondary data:

  • Organization’s health and safety records
  • Mission and vision statements
  • Financial Statements
  • Sales Report
  • CRM Software
  • Executive summaries

External sources of secondary data:

  • Government reports
  • Press releases
  • Business journals

Secondary data collection methods can also involve quantitative and qualitative techniques. Secondary data is easily available and is less time-consuming and less expensive to obtain than primary data. However, the authenticity of the data gathered cannot be verified using these methods.

Regardless of the data collection method of your choice, there must be direct communication with decision-makers so that they understand and commit to acting according to the results.

For this reason, we must pay special attention to the analysis and presentation of the information obtained. Remember that the data must be useful and actionable, and the data collection method you choose has much to do with that.

How QuestionPro Can Help in Data Collection Methods

QuestionPro is a comprehensive online survey software platform that can greatly assist in various data collection methods. Here’s how it can help:

  • Survey Creation: QuestionPro offers a user-friendly interface for creating surveys with various question types, including multiple-choice, open-ended, Likert scale, and more. Researchers can customize surveys to fit their specific research needs and objectives.
  • Diverse Distribution Channels: The platform provides multiple channels for distributing surveys, including email, web links, social media, and embedding surveys on websites. This enables researchers to reach a wide audience and collect data efficiently.
  • Panel Management: QuestionPro offers panel management features, allowing researchers to create and manage panels of respondents for targeted data collection. This is particularly useful for longitudinal studies or when targeting specific demographics.
  • Data Analysis Tools: The platform includes robust data analysis tools that enable researchers to analyze survey responses in real time. Researchers can generate customizable reports, visualize data through charts and graphs, and identify trends and patterns within the data.
  • Data Security and Compliance: QuestionPro prioritizes data security and compliance with regulations such as GDPR and HIPAA. The platform offers features such as SSL encryption, data masking, and secure data storage to ensure the confidentiality and integrity of collected data.
  • Mobile Compatibility: With the increasing use of mobile devices, QuestionPro ensures that surveys are mobile-responsive, allowing respondents to participate in surveys conveniently from their smartphones or tablets.
  • Integration Capabilities: QuestionPro integrates with various third-party tools and platforms, including CRMs, email marketing software, and analytics tools. This allows researchers to streamline their data collection processes and incorporate survey data into their existing workflows.
  • Customization and Branding: Researchers can customize surveys with their branding elements, such as logos, colors, and themes, enhancing the professional appearance of surveys and increasing respondent engagement.

The conclusion you obtain from your investigation will set the course of the company’s decision-making, so present your report clearly, and list the steps you followed to obtain those results.

Make sure that whoever will take the corresponding actions understands the importance of the information collected and that it gives them the solutions they expect.

QuestionPro offers a comprehensive suite of features and tools that can significantly streamline the data collection process, from survey creation to analysis, while ensuring data security and compliance. Remember that at QuestionPro, we can help you collect data easily and efficiently. Request a demo and learn about all the tools we have for you.




What is Data Collection? Methods, Types, Tools, Examples

Appinio Research · 09.11.2023 · 33min read


Are you ready to unlock the power of data? In today's data-driven world, understanding the art and science of data collection is the key to informed decision-making and achieving your objectives.

This guide will walk you through the intricate data collection process, from its fundamental principles to advanced strategies and ethical considerations. Whether you're a business professional, researcher, or simply curious about the world of data, this guide will equip you with the knowledge and tools needed to harness the potential of data collection effectively.

What is Data Collection?

Data collection is the systematic process of gathering and recording information or data from various sources for analysis, interpretation, and decision-making. It is a fundamental step in research, business operations, and virtually every field where information is used to understand, improve, or make informed choices.

Key Elements of Data Collection

  • Sources: Data can be collected from a wide range of sources, including surveys, interviews, observations, sensors, databases, social media, and more.
  • Methods: Various methods are employed to collect data, such as questionnaires, data entry, web scraping, and sensor networks. The choice of method depends on the type of data, research objectives, and available resources.
  • Data Types: Data can be qualitative (descriptive) or quantitative (numerical), structured (organized into a predefined format) or unstructured (free-form text or media), and primary (collected directly) or secondary (obtained from existing sources).
  • Data Collection Tools: Technology plays a significant role in modern data collection, with software applications, mobile apps, sensors, and data collection platforms facilitating efficient and accurate data capture.
  • Ethical Considerations: Ethical guidelines, including informed consent and privacy protection, must be followed to ensure that data collection respects the rights and well-being of individuals.
  • Data Quality: The accuracy, completeness, and reliability of collected data are critical to its usefulness. Data quality assurance measures are implemented to minimize errors and biases.
  • Data Storage: Collected data needs to be securely stored and managed to prevent loss, unauthorized access, and breaches. Data storage solutions range from on-premises servers to cloud-based platforms.

Importance of Data Collection in Modern Businesses

Data collection is of paramount importance in modern businesses for several compelling reasons:

  • Informed Decision-Making: Collected data serves as the foundation for informed decision-making at all levels of an organization. It provides valuable insights into customer behavior, market trends, operational efficiency, and more.
  • Competitive Advantage: Businesses that effectively collect and analyze data gain a competitive edge. Data-driven insights help identify opportunities, optimize processes, and stay ahead of competitors.
  • Customer Understanding: Data collection allows businesses to better understand their customers, their preferences, and their pain points. This insight is invaluable for tailoring products, services, and marketing strategies.
  • Performance Measurement: Data collection enables organizations to assess the performance of various aspects of their operations, from marketing campaigns to production processes. This helps identify areas for improvement.
  • Risk Management: Businesses can use data to identify potential risks and develop strategies to mitigate them. This includes financial risks, supply chain disruptions, and cybersecurity threats.
  • Innovation: Data collection supports innovation by providing insights into emerging trends and customer demands. Businesses can use this information to develop new products or services.
  • Resource Allocation: Data-driven decision-making helps allocate resources efficiently. For example, marketing budgets can be optimized based on the performance of different channels.

Goals and Objectives of Data Collection

The goals and objectives of data collection depend on the specific context and the needs of the organization or research project. However, there are some common overarching objectives:

  • Information Gathering: The primary goal is to gather accurate, relevant, and reliable information that addresses specific questions or objectives.
  • Analysis and Insight: Collected data is meant to be analyzed to uncover patterns, trends, relationships, and insights that can inform decision-making and strategy development.
  • Measurement and Evaluation: Data collection allows for the measurement and evaluation of various factors, such as performance, customer satisfaction, or market potential.
  • Problem Solving: Data collection can be directed toward solving specific problems or challenges faced by an organization, such as identifying the root causes of quality issues.
  • Monitoring and Surveillance: In some cases, data collection serves as a continuous monitoring or surveillance function, allowing organizations to track ongoing processes or conditions.
  • Benchmarking: Data collection can be used for benchmarking against industry standards or competitors, helping organizations assess their performance relative to others.
  • Planning and Strategy: Data collected over time can support long-term planning and strategy development, ensuring that organizations adapt to changing circumstances.

In summary, data collection is a foundational activity with diverse applications across industries and sectors. Its objectives range from understanding customers and making informed decisions to improving processes, managing risks, and driving innovation. The quality and relevance of collected data are pivotal in achieving these goals.

How to Plan Your Data Collection Strategy?

Before kicking things off, we'll review the crucial steps of planning your data collection strategy. Your success in data collection largely depends on how well you define your objectives, select suitable sources, set clear goals, and choose appropriate collection methods.

Defining Your Research Questions

Defining your research questions is the foundation of any effective data collection effort. The more precise and relevant your questions, the more valuable the data you collect.

  • Specificity is Key: Make sure your research questions are specific and focused. Instead of asking, "How can we improve customer satisfaction?" ask, "What specific aspects of our service do customers find most satisfying or dissatisfying?"
  • Prioritize Questions: Determine the most critical questions that will have the most significant impact on your goals. Not all questions are equally important, so allocate your resources accordingly.
  • Alignment with Objectives: Ensure that your research questions directly align with your overall objectives. If your goal is to increase sales, your research questions should be geared toward understanding customer buying behaviors and preferences.

Identifying Key Data Sources

Identifying the proper data sources is essential for gathering accurate and relevant information. Here are some examples of key data sources for different industries and purposes.

  • Customer Data: This can include customer demographics, purchase history, website behavior, and feedback from customer service interactions.
  • Market Research Reports: Utilize industry reports, competitor analyses, and market trend studies to gather external data and insights.
  • Internal Records: Your organization's databases, financial records, and operational data can provide valuable insights into your business's performance.
  • Social Media Platforms: Monitor social media channels to gather customer feedback, track brand mentions, and identify emerging trends in your industry.
  • Web Analytics: Collect data on website traffic, user behavior, and conversion rates to optimize your online presence.

Setting Clear Data Collection Goals

Setting clear and measurable goals is essential to ensure your data collection efforts remain on track and deliver valuable results. Goals should be:

  • Specific: Clearly define what you aim to achieve with your data collection. For instance, increasing website traffic by 20% in six months is a specific goal.
  • Measurable: Establish criteria to measure your progress and success. Use metrics such as revenue growth, customer satisfaction scores, or conversion rates.
  • Achievable: Set goals that your team can realistically work towards. Overly ambitious goals can lead to frustration and burnout.
  • Relevant: Ensure your goals align with your organization's broader objectives and strategic initiatives.
  • Time-Bound: Set a timeframe within which you plan to achieve your goals. This adds a sense of urgency and helps you track progress effectively.

Choosing Data Collection Methods

Selecting the correct data collection methods is crucial for obtaining accurate and reliable data. Your choice should align with your research questions and goals. Here's a closer look at various data collection methods and their practical applications.

Types of Data Collection Methods

Now, let's explore different data collection methods in greater detail, including examples of when and how to use them effectively:

Surveys and Questionnaires

Surveys and questionnaires are versatile tools for gathering data from a large number of respondents. They are commonly used for:

  • Customer Feedback: Collecting opinions and feedback on products, services, and overall satisfaction.
  • Market Research: Assessing market preferences, identifying trends, and evaluating consumer behavior.
  • Employee Surveys: Measuring employee engagement, job satisfaction, and feedback on workplace conditions.

Example: If you're running an e-commerce business and want to understand customer preferences, you can create an online survey asking customers about their favorite product categories, preferred payment methods, and shopping frequency.

To enhance your data collection endeavors, check out Appinio, a modern research platform that simplifies the process and maximizes the quality of insights. Appinio offers user-friendly survey and questionnaire tools that enable you to effortlessly design surveys tailored to your needs. It also provides seamless integration with interview and observation data, allowing you to consolidate your findings in one place.

Discover how Appinio can elevate your data collection efforts. Book a demo today to unlock a world of possibilities in gathering valuable insights!

Interviews

Interviews involve one-on-one or group conversations with participants to gather detailed insights. They are particularly useful for:

  • Qualitative Research: Exploring complex topics, motivations, and personal experiences.
  • In-Depth Analysis: Gaining a deep understanding of specific issues or situations.
  • Expert Opinions: Interviewing industry experts or thought leaders to gather valuable insights.

Example: If you're a healthcare provider aiming to improve patient experiences, conducting interviews with patients can help you uncover specific pain points and suggestions for improvement.


Observations

Observations entail watching and recording behaviors or events in their natural context. This method is ideal for:

  • Behavioral Studies: Analyzing how people interact with products or environments.
  • Field Research: Collecting data in real-world settings, such as retail stores, public spaces, or classrooms.
  • Ethnographic Research: Immersing yourself in a specific culture or community to understand their practices and customs.

Example: If you manage a retail store, observing customer traffic flow and purchasing behaviors can help optimize store layout and product placement.

Document Analysis

Document analysis involves reviewing and extracting information from written or digital documents. It is valuable for:

  • Historical Research: Studying historical records, manuscripts, and archives.
  • Content Analysis: Analyzing textual or visual content from websites, reports, or publications.
  • Legal and Compliance: Reviewing contracts, policies, and legal documents for compliance purposes.

Example: If you're a content marketer, you can analyze competitor blog posts to identify common topics and keywords used in your industry.
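
A minimal sketch of this kind of content analysis is a keyword frequency count. The post titles, the stopword list, and the function name `top_keywords` below are hypothetical stand-ins; a real analysis would work on full article text and a much larger stopword list.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# A tiny, hand-picked stopword list, assumed for this example only.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "for", "is", "how", "with"}


def top_keywords(documents: Iterable[str], n: int = 5) -> List[Tuple[str, int]]:
    """Count words across all documents and return the n most frequent."""
    counts: Counter = Counter()
    for text in documents:
        words = re.findall(r"[a-z']+", text.lower())
        # Skip stopwords and very short tokens such as "it".
        counts.update(w for w in words if w not in STOPWORDS and len(w) > 2)
    return counts.most_common(n)


# Hypothetical competitor post titles, made up for illustration.
posts = [
    "How to design a customer survey",
    "Survey fatigue and how to avoid it",
    "Customer feedback: the survey questions that work",
]

keywords = top_keywords(posts, n=2)
print(keywords)
```

Counting raw words is deliberately crude; it surfaces recurring topics quickly but ignores synonyms and phrases, so the result is best read as a starting point for manual review.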

Web Scraping

Web scraping is the automated process of extracting data from websites. It's suitable for:

  • Competitor Analysis: Gathering data on competitor product prices, descriptions, and customer reviews.
  • Market Research: Collecting data on product listings, reviews, and trends from e-commerce websites.
  • News and Social Media Monitoring: Tracking news articles, social media posts, and comments related to your brand or industry.

Example: If you're in the travel industry, web scraping can help you collect pricing data for flights and accommodations from various travel booking websites to stay competitive.
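
To make the idea concrete, the sketch below extracts prices from a small HTML fragment using only Python's standard library. The markup, the `price` class name, and the `PriceParser` class are assumptions made for illustration; real booking sites use their own page structure, and you should check a site's terms of use before collecting data from it.

```python
from html.parser import HTMLParser
from typing import List


class PriceParser(HTMLParser):
    """Collects the numeric text of every element whose class is 'price'."""

    def __init__(self) -> None:
        super().__init__()
        self._in_price = False
        self.prices: List[float] = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs.
        if dict(attrs).get("class") == "price":
            self._in_price = True

    def handle_endtag(self, tag):
        # Good enough for flat markup where price elements have no children.
        self._in_price = False

    def handle_data(self, data):
        text = data.strip()
        if self._in_price and text:
            self.prices.append(float(text.lstrip("$")))


# Hypothetical page fragment; no network request is made in this sketch.
SAMPLE_HTML = """
<ul>
  <li><span class="name">Flight A</span> <span class="price">$199.00</span></li>
  <li><span class="name">Flight B</span> <span class="price">$249.50</span></li>
</ul>
"""

parser = PriceParser()
parser.feed(SAMPLE_HTML)
cheapest = min(parser.prices)
print(parser.prices, cheapest)
```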

Social Media Monitoring

Social media monitoring involves tracking and analyzing conversations and activities on social media platforms. It's valuable for:

  • Brand Reputation Management: Monitoring brand mentions and sentiment to address customer concerns or capitalize on positive feedback.
  • Competitor Analysis: Keeping tabs on competitors' social media strategies and customer engagement.
  • Trend Identification: Identifying emerging trends and viral content within your industry.

Example: If you run a restaurant, social media monitoring can help you track customer reviews, comments, and hashtags related to your establishment, allowing you to respond promptly to customer feedback and trends.

By understanding the nuances and applications of these data collection methods, you can choose the most appropriate approach to gather valuable insights for your specific objectives. Remember that a well-thought-out data collection strategy is the cornerstone of informed decision-making and business success.

How to Design Your Data Collection Instruments?

Now that you've defined your research questions, identified data sources, set clear goals, and chosen appropriate data collection methods, it's time to design the instruments you'll use to collect data effectively.

Design Effective Survey Questions

Designing survey questions is a crucial step in gathering accurate and meaningful data. Here are some key considerations:

  • Clarity: Ensure that your questions are clear and concise. Avoid jargon or ambiguous language that may confuse respondents.
  • Relevance: Ask questions that directly relate to your research objectives. Avoid unnecessary or irrelevant questions that can lead to survey fatigue.
  • Avoid Leading Questions: Formulate questions that do not guide respondents toward a particular answer. Maintain neutrality to get unbiased responses.
  • Response Options: Provide appropriate response options, including multiple-choice, Likert scales, or open-ended formats, depending on the type of data you need.
  • Pilot Testing: Before deploying your survey, conduct pilot tests with a small group to identify any issues with question wording or response options.

Craft Interview Questions for Insightful Conversations

Developing interview questions requires thoughtful consideration to elicit valuable insights from participants:

  • Open-Ended Questions: Use open-ended questions to encourage participants to share their thoughts, experiences, and perspectives without being constrained by predefined answers.
  • Probing Questions: Prepare follow-up questions to delve deeper into specific topics or clarify responses.
  • Structured vs. Semi-Structured Interviews: Decide whether your interviews will follow a structured format with predefined questions or a semi-structured approach that allows flexibility.
  • Avoid Biased Questions: Ensure your questions do not steer participants toward desired responses. Maintain objectivity throughout the interview.

Build an Observation Checklist for Data Collection

When conducting observations, having a well-structured checklist is essential:

  • Clearly Defined Variables: Identify the specific variables or behaviors you are observing and ensure they are well-defined.
  • Checklist Format: Create a checklist format that is easy to use and follow during observations. This may include checkboxes, scales, or space for notes.
  • Training Observers: If you have a team of observers, provide thorough training to ensure consistency and accuracy in data collection.
  • Pilot Observations: Before starting formal data collection, conduct pilot observations to refine your checklist and ensure it captures the necessary information.

Streamline Data Collection with Forms and Templates

Creating user-friendly data collection forms and templates helps streamline the process:

  • Consistency: Ensure that all data collection forms follow a consistent format and structure, making it easier to compare and analyze data.
  • Data Validation: Incorporate data validation checks to reduce errors during data entry. This can include dropdown menus, date pickers, or required fields.
  • Digital vs. Paper Forms: Decide whether digital forms or traditional paper forms are more suitable for your data collection needs. Digital forms often offer real-time data validation and remote access.
  • Accessibility: Make sure your forms and templates are accessible to all team members involved in data collection. Provide training if necessary.
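
The data validation checks described above (required fields, date pickers, dropdown menus) can be mirrored in code when form entries are processed. The field names, the allowed channel values, and the `validate_record` function in this sketch are hypothetical.

```python
from datetime import date
from typing import Dict, List

# Assumed form schema, for illustration only.
REQUIRED_FIELDS = ("respondent_id", "visit_date", "channel")
ALLOWED_CHANNELS = {"email", "web", "phone"}  # plays the role of a dropdown menu


def validate_record(record: Dict[str, str]) -> List[str]:
    """Return a list of validation errors; an empty list means the record passes."""
    errors: List[str] = []

    # Required-field check.
    for field in REQUIRED_FIELDS:
        if not record.get(field, "").strip():
            errors.append(f"{field} is required")

    # Date check, similar to what a date picker enforces.
    visit_date = record.get("visit_date", "").strip()
    if visit_date:
        try:
            date.fromisoformat(visit_date)
        except ValueError:
            errors.append("visit_date must be a valid ISO date such as 2024-03-26")

    # Restricted-choice check, similar to a dropdown.
    channel = record.get("channel", "").strip()
    if channel and channel not in ALLOWED_CHANNELS:
        errors.append(f"channel must be one of {sorted(ALLOWED_CHANNELS)}")

    return errors


good = validate_record(
    {"respondent_id": "R-001", "visit_date": "2024-03-26", "channel": "email"}
)
bad = validate_record(
    {"respondent_id": "", "visit_date": "2024-02-30", "channel": "fax"}
)
print(good)
print(bad)
```

Returning every error at once, rather than stopping at the first, lets the person entering data fix a record in a single pass.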

The Data Collection Process

Now that your data collection instruments are ready, it's time to embark on the data collection process itself. This section covers the practical steps involved in collecting high-quality data.

1. Preparing for Data Collection

Adequate preparation is essential to ensure a smooth data collection process:

  • Resource Allocation: Allocate the necessary resources, including personnel, technology, and materials, to support data collection activities.
  • Training: Train data collection teams or individuals on the use of data collection instruments and adherence to protocols.
  • Pilot Testing: Conduct pilot data collection runs to identify and resolve any issues or challenges that may arise.
  • Ethical Considerations: Ensure that data collection adheres to ethical standards and legal requirements. Obtain necessary permissions or consent as applicable.

2. Conducting Data Collection

During data collection, it's crucial to maintain consistency and accuracy:

  • Follow Protocols: Ensure that data collection teams adhere to established protocols and procedures to maintain data integrity.
  • Supervision: Supervise data collection teams to address questions, provide guidance, and resolve any issues that may arise.
  • Documentation: Maintain detailed records of the data collection process, including dates, locations, and any deviations from the plan.
  • Data Security: Implement data security measures to protect collected information from unauthorized access or breaches.

3. Ensuring Data Quality and Reliability

After collecting data, it's essential to validate and ensure its quality:

  • Data Cleaning: Review collected data for errors, inconsistencies, and missing values. Clean and preprocess the data to ensure accuracy.
  • Quality Checks: Perform quality checks to identify outliers or anomalies that may require further investigation or correction.
  • Data Validation: Cross-check data with source documents or original records to verify its accuracy and reliability.
  • Data Auditing: Conduct periodic audits to assess the overall quality of the collected data and make necessary adjustments.
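
As one possible illustration of the cleaning and quality-check steps above, the sketch below drops missing values and then flags outliers with the interquartile range (IQR) rule. The IQR rule and the 1.5 multiplier are a common convention chosen here as an assumption, not a method prescribed by this guide, and the scores are made up.

```python
import statistics
from typing import List, Optional, Tuple


def clean_and_flag(
    raw: List[Optional[float]], k: float = 1.5
) -> Tuple[List[float], List[float]]:
    """Drop missing values, then flag values outside the IQR fences."""
    values = [v for v in raw if v is not None]
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    outliers = [v for v in values if v < low or v > high]
    return values, outliers


# Hypothetical satisfaction scores on a 1-10 scale; None marks a missing answer.
raw_scores = [7, 8, None, 6, 9, 7, 8, 95, None, 7]

values, outliers = clean_and_flag(raw_scores)
print(len(values), outliers)
```

A flagged value is a prompt for investigation rather than automatic deletion; here the 95 is probably a data entry error on a 10-point scale, which is exactly what cross-checking against source records would confirm.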

4. Managing Data Collection Teams

If you have multiple team members involved in data collection, effective management is crucial:

  • Communication: Maintain open and transparent communication channels with team members to address questions, provide guidance, and ensure consistency.
  • Performance Monitoring: Regularly monitor the performance of data collection teams, identifying areas for improvement or additional training.
  • Problem Resolution: Be prepared to promptly address any challenges or issues that arise during data collection.
  • Feedback Loop: Establish a feedback loop for data collection teams to share insights and best practices, promoting continuous improvement.

By following these steps and best practices in the data collection process, you can ensure that the data you collect is reliable, accurate, and aligned with your research objectives. This lays the foundation for meaningful analysis and informed decision-making.

How to Store and Manage Data?

It's time to explore the critical aspects of data storage and management, which are pivotal in ensuring the security, accessibility, and usability of your collected data.

Choosing Data Storage Solutions

Selecting the proper data storage solutions is a strategic decision that impacts data accessibility, scalability, and security. Consider the following factors:

  • Cloud vs. On-Premises: Decide whether to store your data in the cloud or on-premises. Cloud solutions offer scalability, accessibility, and automatic backups, while on-premises solutions provide more control but require significant infrastructure investments.
  • Data Types: Assess the types of data you're collecting, such as structured, semi-structured, or unstructured data. Choose storage solutions that accommodate your data formats efficiently.
  • Scalability: Ensure that your chosen solution can scale as your data volume grows. This is crucial for preventing storage bottlenecks.
  • Data Accessibility: Opt for storage solutions that provide easy and secure access to authorized users, whether they are on-site or remote.
  • Data Recovery and Backup: Implement robust data backup and recovery mechanisms to safeguard against data loss due to hardware failures or disasters.

Data Security and Privacy

Data security and privacy are paramount, especially when handling sensitive or personal information.

  • Encryption: Implement encryption for data at rest and in transit. Use TLS (the successor to the deprecated SSL protocol) for communication and a robust, well-reviewed algorithm such as AES for storage.
  • Access Control: Set up role-based access control (RBAC) to restrict access to data based on job roles and responsibilities. Limit access to only those who need it.
  • Compliance: Ensure that your data storage and management practices comply with relevant data protection regulations, such as GDPR, HIPAA, or CCPA.
  • Data Masking: Use data masking techniques to conceal sensitive information in non-production environments.
  • Monitoring and Auditing: Continuously monitor access logs and perform regular audits to detect unauthorized activities and maintain compliance.
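
To make role-based access control and data masking more concrete, here is a minimal Python sketch. The role names, permission names, and masking rule are hypothetical; production systems usually enforce these rules in the database or through an identity provider rather than in application code.

```python
# Hypothetical roles and permissions, chosen only for illustration.
ROLE_PERMISSIONS = {
    "analyst": {"read_masked"},
    "data_steward": {"read_masked", "read_raw"},
}

def can(role, permission):
    """Return True if the role has been granted the permission."""
    return permission in ROLE_PERMISSIONS.get(role, set())

def mask_email(email):
    """Keep the first character and the domain, hide the rest of the local part."""
    local, _, domain = email.partition("@")
    return local[:1] + "*" * (len(local) - 1) + "@" + domain

def read_email(role, email):
    """Return the raw or masked value depending on the caller's role."""
    if can(role, "read_raw"):
        return email
    if can(role, "read_masked"):
        return mask_email(email)
    raise PermissionError(f"role {role!r} may not read this field")

print(read_email("analyst", "jane.doe@example.com"))
print(read_email("data_steward", "jane.doe@example.com"))
```

Unknown roles fall through to a `PermissionError`, so access is denied by default.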

Data Organization and Cataloging

Organizing and cataloging your data is essential for efficient retrieval, analysis, and decision-making.

  • Metadata Management: Maintain detailed metadata for each dataset, including data source, date of collection, data owner, and description. This makes it easier to locate and understand your data.
  • Taxonomies and Categories: Develop taxonomies or data categorization schemes to classify data into logical groups, making it easier to find and manage.
  • Data Versioning: Implement data versioning to track changes and updates over time. This ensures data lineage and transparency.
  • Data Catalogs: Use data cataloging tools and platforms to create a searchable inventory of your data assets, facilitating discovery and reuse.
  • Data Retention Policies: Establish clear data retention policies that specify how long data should be retained and when it should be securely deleted or archived.
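
One simple way to keep the metadata and retention rules described above together is a small record per dataset. The sketch below is an assumption about how such a record might look; the field names and the 365-day retention period are invented for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class DatasetMetadata:
    name: str
    source: str
    owner: str
    collected_on: date
    description: str
    retention_days: int  # how long the dataset may be kept

    def expires_on(self):
        """Date on which the dataset should be deleted or archived."""
        return self.collected_on + timedelta(days=self.retention_days)

    def is_expired(self, today):
        return today >= self.expires_on()

# Hypothetical catalogue entry.
survey = DatasetMetadata(
    name="customer_feedback_2024_q1",
    source="web survey",
    owner="research team",
    collected_on=date(2024, 1, 15),
    description="Post-purchase satisfaction survey responses",
    retention_days=365,
)
print(survey.expires_on(), survey.is_expired(date(2024, 6, 1)))
```

A scheduled job could loop over such records and list the datasets that are due for review.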

How to Analyze and Interpret Data?

Once you've collected your data, the next step is to extract valuable insights from it through analysis and interpretation.

Data Cleaning and Preprocessing

Data cleaning and preprocessing are essential steps to ensure that your data is accurate and ready for analysis.

  • Handling Missing Data: Develop strategies for dealing with missing data, such as imputation or removal, based on the nature of your data and research objectives.
  • Outlier Detection: Identify and address outliers that can skew analysis results. Consider whether outliers should be corrected, removed, or retained based on their significance.
  • Normalization and Scaling: Normalize or scale data to bring it within a common range, making it suitable for certain algorithms and models.
  • Data Transformation: Apply data transformations, such as logarithmic scaling or categorical encoding, to prepare data for specific types of analysis.
  • Data Imbalance: Address class imbalance issues in datasets, particularly in machine learning applications, to avoid biased model training.
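
The first three preprocessing steps above can be sketched with Python's standard library. The measurements are made up, and median imputation, the 1.5 × IQR rule, and min-max scaling are only one reasonable combination among many; the right choices depend on your data.

```python
import statistics

values = [12.0, 15.0, None, 14.0, 13.0, 95.0, 16.0]  # made-up measurements

# 1. Median imputation for missing entries.
observed = [v for v in values if v is not None]
median = statistics.median(observed)
imputed = [median if v is None else v for v in values]

# 2. Flag outliers with the 1.5 * IQR rule.
q1, _, q3 = statistics.quantiles(imputed, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [v for v in imputed if v < low or v > high]

# 3. Min-max scale the remaining values to the range [0, 1].
kept = [v for v in imputed if low <= v <= high]
lo, hi = min(kept), max(kept)
scaled = [(v - lo) / (hi - lo) for v in kept]

print("outliers:", outliers)
print("scaled:", scaled)
```

Here the outlier is dropped before scaling; whether to remove, correct, or keep it is a judgment call that should be documented.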

Exploratory Data Analysis (EDA)

EDA is the process of visually and statistically exploring your data to uncover patterns, trends, and potential insights.

  • Descriptive Statistics: Calculate basic statistics like mean, median, and standard deviation to summarize data distributions.
  • Data Visualization: Create visualizations such as histograms, scatter plots, and heatmaps to reveal relationships and patterns within the data.
  • Correlation Analysis: Examine correlations between variables to understand how they relate to each other, keeping in mind that correlation alone does not establish causation.
  • Hypothesis Testing: Conduct hypothesis tests to assess the significance of observed differences or relationships in your data.
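
As a small worked example of the descriptive statistics and correlation analysis above, the following sketch summarises one variable and computes the Pearson correlation coefficient by hand. The two series are hypothetical numbers chosen for illustration.

```python
import math
import statistics

ad_spend = [10.0, 20.0, 30.0, 40.0, 50.0]   # hypothetical weekly spend
signups = [12.0, 25.0, 31.0, 38.0, 54.0]    # hypothetical weekly sign-ups

summary = {
    "mean": statistics.mean(signups),
    "median": statistics.median(signups),
    "stdev": statistics.stdev(signups),  # sample standard deviation
}

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

r = pearson_r(ad_spend, signups)
print(summary, round(r, 3))
```

A coefficient close to 1 indicates a strong positive linear relationship in this sample, not that spending causes sign-ups.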

Statistical Analysis Techniques

Choose appropriate statistical analysis techniques based on your research questions and data types.

  • Descriptive Statistics: Use descriptive statistics to summarize and describe your data, providing an initial overview of key features.
  • Inferential Statistics: Apply inferential statistics, including t-tests, ANOVA, or regression analysis, to test hypotheses and draw conclusions about population parameters.
  • Non-parametric Tests: Employ non-parametric tests when assumptions of normality are not met or when dealing with ordinal or nominal data.
  • Time Series Analysis: Analyze time-series data to uncover trends, seasonality, and temporal patterns.
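
To show what an inferential test computes, here is a sketch of Welch's t statistic for two independent samples, written with the standard library only. The satisfaction scores are invented, and Welch's variant is chosen here as an assumption because it does not require equal variances. Turning the statistic into a p-value needs a t-distribution, which a statistics package such as SciPy provides (for example `scipy.stats.ttest_ind` with `equal_var=False`).

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and approximate degrees of freedom."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    n_a, n_b = len(sample_a), len(sample_b)
    se_a, se_b = var_a / n_a, var_b / n_b
    t = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df

# Hypothetical satisfaction scores from two stores.
store_a = [4.1, 3.8, 4.5, 4.0, 4.2]
store_b = [3.2, 3.6, 3.1, 3.5, 3.4]
t_stat, dof = welch_t(store_a, store_b)
print(round(t_stat, 2), round(dof, 1))
```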

Data Visualization

Data visualization is a powerful tool for conveying complex information in a digestible format.

  • Charts and Graphs: Utilize various charts and graphs, such as bar charts, line charts, pie charts, and heatmaps, to represent data visually.
  • Interactive Dashboards: Create interactive dashboards using tools like Tableau, Power BI, or custom web applications to allow stakeholders to explore data dynamically.
  • Storytelling: Use data visualization to tell a compelling data-driven story, highlighting key findings and insights.
  • Accessibility: Ensure that data visualizations are accessible to all audiences, including those with disabilities, by following accessibility guidelines.

Drawing Conclusions and Insights

Finally, drawing conclusions and insights from your data analysis is the ultimate goal.

  • Contextual Interpretation: Interpret your findings in the context of your research objectives and the broader business or research landscape.
  • Actionable Insights: Identify actionable insights that can inform decision-making, strategy development, or future research directions.
  • Report Generation: Create comprehensive reports or presentations that communicate your findings clearly and concisely to stakeholders.
  • Validation: Cross-check your conclusions with domain experts or subject matter specialists to ensure accuracy and relevance.

By following these steps in data analysis and interpretation, you can transform raw data into valuable insights that drive informed decisions, optimize processes, and create new opportunities for your organization.

How to Report and Present Data?

Now, let's explore the crucial steps of reporting and presenting data effectively, ensuring that your findings are communicated clearly and meaningfully to stakeholders.

1. Create Data Reports

Data reports are the culmination of your data analysis efforts, presenting your findings in a structured and comprehensible manner.

  • Report Structure: Organize your report with a clear structure, including an introduction, methodology, results, discussion, and conclusions.
  • Visualization Integration: Incorporate data visualizations, charts, and graphs to illustrate key points and trends.
  • Clarity and Conciseness: Use clear and concise language, avoiding technical jargon, to make your report accessible to a diverse audience.
  • Actionable Insights: Highlight actionable insights and recommendations that stakeholders can use to make informed decisions.
  • Appendices: Include appendices with detailed methodology, data sources, and any additional information that supports your findings.

2. Leverage Data Visualization Tools

Data visualization tools can significantly enhance your ability to convey complex information effectively. Top data visualization tools include:

  • Tableau: Tableau offers a wide range of visualization options and interactive dashboards, making it a popular choice for data professionals.
  • Power BI: Microsoft's Power BI provides powerful data visualization and business intelligence capabilities, suitable for creating dynamic reports and dashboards.
  • Python Libraries: Utilize Python libraries such as Matplotlib, Seaborn, and Plotly for custom data visualizations and analysis.
  • Excel: Microsoft Excel remains a versatile tool for creating basic charts and graphs, particularly for smaller datasets.
  • Custom Development: Consider custom development for specialized visualization needs or when existing tools don't meet your requirements.

3. Communicate Findings to Stakeholders

Effectively communicating your findings to stakeholders is essential for driving action and decision-making.

  • Audience Understanding: Tailor your communication to the specific needs and background knowledge of your audience. Avoid technical jargon when speaking to non-technical stakeholders.
  • Visual Storytelling: Craft a narrative that guides stakeholders through the data, highlighting key insights and their implications.
  • Engagement: Use engaging and interactive presentations or reports to maintain the audience's interest and encourage participation.
  • Question Handling: Be prepared to answer questions and provide clarifications during presentations or discussions. Anticipate potential concerns or objections.
  • Feedback Loop: Encourage feedback and open dialogue with stakeholders to ensure your findings align with their objectives and expectations.

Data Collection Examples

To better understand the practical application of data collection in various domains, let's explore some real-world examples, including those in the business context. These examples illustrate how data collection can drive informed decision-making and lead to meaningful insights.

Business Customer Feedback Surveys

Scenario: A retail company wants to enhance its customer experience and improve product offerings. To achieve this, they initiate customer feedback surveys.

Data Collection Approach:

  • Survey Creation: The company designs a survey with specific questions about customer preferences, shopping experiences, and product satisfaction.
  • Distribution: Surveys are distributed through various channels, including email, in-store kiosks, and the company's website.
  • Data Gathering: Responses from thousands of customers are collected and stored in a centralized database.

Data Analysis and Insights:

  • Customer Sentiment Analysis: Using natural language processing (NLP) techniques, the company analyzes open-ended responses to gauge customer sentiment.
  • Product Performance: Analyzing survey data, the company identifies which products receive the highest and lowest ratings, leading to decisions on which products to improve or discontinue.
  • Store Layout Optimization: By examining feedback related to in-store experiences, the company can adjust store layouts and signage to enhance customer flow and convenience.
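
The product performance step can be sketched in a few lines of Python. The products and ratings below are hypothetical; a real survey export would have many more rows and columns.

```python
from collections import defaultdict

# Hypothetical survey rows: (product, rating on a 1-5 scale).
responses = [
    ("kettle", 5), ("kettle", 4), ("toaster", 2),
    ("toaster", 3), ("blender", 4), ("kettle", 5),
]

# Group ratings by product.
ratings = defaultdict(list)
for product, rating in responses:
    ratings[product].append(rating)

# Average rating per product, then rank from best to worst.
average = {product: sum(r) / len(r) for product, r in ratings.items()}
ranked = sorted(average, key=average.get, reverse=True)

for product in ranked:
    print(product, round(average[product], 2), f"(n={len(ratings[product])})")
```

Printing the number of responses next to each average helps avoid over-reading a score that rests on only a few answers.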

Healthcare Patient Record Digitization

Scenario: A healthcare facility aims to transition from paper-based patient records to digital records for improved efficiency and patient care.

Data Collection Approach:

  • Scanning and Data Entry: Existing paper records are scanned, and data entry personnel convert them into digital format.
  • Electronic Health Record (EHR) Implementation: The facility adopts an EHR system to store and manage patient data securely.
  • Continuous Data Entry: As new patient information is collected, it is directly entered into the EHR system.

Data Analysis and Insights:

  • Patient History Access: Physicians and nurses gain instant access to patient records, improving diagnostic accuracy and treatment.
  • Data Analytics: Aggregated patient data can be analyzed to identify trends in diseases, treatment outcomes, and healthcare resource utilization.
  • Resource Optimization: Analysis of patient data allows the facility to allocate resources more efficiently, such as staff scheduling based on patient admission patterns.

Social Media Engagement Monitoring

Scenario: A digital marketing agency manages social media campaigns for various clients and wants to track campaign performance and audience engagement.

Data Collection Approach:

  • Social Media Monitoring Tools: The agency employs social media monitoring tools to collect data on post engagement, reach, likes, shares, and comments.
  • Custom Tracking Links: Unique tracking links are created for each campaign to monitor traffic and conversions.
  • Audience Demographics: Data on the demographics of engaged users is gathered from platform analytics.

Data Analysis and Insights:

  • Campaign Effectiveness: The agency assesses which campaigns are most effective in terms of engagement and conversion rates.
  • Audience Segmentation: Insights into audience demographics help tailor future campaigns to specific target demographics.
  • Content Strategy: Analyzing which types of content (e.g., videos, infographics) generate the most engagement informs content strategy decisions.
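
Campaign effectiveness is often compared with simple ratios. The sketch below assumes engagement rate is engagements divided by reach and conversion rate is conversions divided by clicks; platforms define these metrics differently, and the campaign names and totals are invented.

```python
# Hypothetical campaign totals exported from a monitoring tool.
campaigns = {
    "spring_sale": {"reach": 20000, "engagements": 1400, "clicks": 800, "conversions": 40},
    "how_to_video": {"reach": 8000, "engagements": 960, "clicks": 400, "conversions": 10},
}

def rates(stats):
    """Engagement rate = engagements / reach; conversion rate = conversions / clicks."""
    return {
        "engagement_rate": stats["engagements"] / stats["reach"],
        "conversion_rate": stats["conversions"] / stats["clicks"],
    }

report = {name: rates(stats) for name, stats in campaigns.items()}
best_engagement = max(report, key=lambda name: report[name]["engagement_rate"])

for name, r in report.items():
    print(name, f"{r['engagement_rate']:.1%}", f"{r['conversion_rate']:.1%}")
```

In this made-up data the video engages a larger share of its audience while the sale converts a larger share of its clicks, which is the kind of trade-off the analysis is meant to surface.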

These examples showcase how data collection serves as the foundation for informed decision-making and strategy development across diverse sectors. Whether improving customer experiences, enhancing healthcare services, or optimizing marketing efforts, data collection empowers organizations to harness valuable insights for growth and improvement.

Ethical Considerations in Data Collection

Ethical considerations are paramount in data collection to ensure privacy, fairness, and transparency. Addressing these issues is not only responsible but also crucial for building trust with stakeholders.

Informed Consent

Obtaining informed consent from participants is an ethical imperative. Transparency is critical, and participants should fully understand the purpose of data collection, how their data will be used, and any potential risks or benefits involved. Consent should be voluntary, and participants should have the option to withdraw their consent at any time without consequences.

Consent forms should be clear and comprehensible, avoiding overly complex language or legal jargon. Special care should be taken when collecting sensitive or personal data to ensure privacy rights are respected.

Privacy Protection

Protecting individuals' privacy is essential to maintain trust and comply with data protection regulations. Data anonymization or pseudonymization should be used to prevent the identification of individuals, especially when sharing or publishing data. Data encryption methods should be implemented to protect data both in transit and at rest, safeguarding it from unauthorized access.

Strict access controls should be in place to restrict data access to authorized personnel only, and clear data retention policies should be established and adhered to, preventing unnecessary data storage. Regular privacy audits should be conducted to identify and address potential vulnerabilities or compliance issues.
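
Pseudonymization can be done in several ways. One common approach, sketched here as an assumption rather than a prescribed method, replaces each identifier with a keyed hash (HMAC-SHA256) so that the same person always maps to the same token but the token cannot be reversed without the key. The key and identifier below are placeholders.

```python
import hashlib
import hmac

def pseudonymize(identifier, secret_key):
    """Map an identifier to a stable pseudonym using HMAC-SHA256."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]  # shortened for readability

# In practice the key would come from a secrets manager, not source code.
key = b"example-key-for-illustration-only"
token = pseudonymize("participant-0042", key)
print(token)
```

Because anyone holding the key can re-link tokens to people, pseudonymized data is still personal data under regulations such as GDPR and should be protected accordingly.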

Bias and Fairness in Data Collection

Addressing bias and ensuring fairness in data collection is critical to avoid perpetuating inequalities. Data collection methods should be designed to minimize potential biases, such as selection bias or response bias. Efforts should be made to achieve diverse and representative samples, ensuring that data accurately reflects the population of interest. Fair treatment of all participants and data sources is essential, with discrimination based on characteristics such as race, gender, or socioeconomic status strictly avoided.

If algorithms are used in data collection or analysis, biases that may arise from automated processes should be assessed and mitigated. Ethical reviews or expert consultations may be considered when dealing with sensitive or potentially biased data. By adhering to ethical principles throughout the data collection process, individuals' rights are protected, and a foundation for responsible and trustworthy data-driven decision-making is established.
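
A quick check for the representativeness discussed above is to compare each group's share of the sample with its share of the population. The age groups, population shares, and the 5-percentage-point threshold in this sketch are assumptions for illustration.

```python
from collections import Counter

# Hypothetical population shares and sample; group labels are placeholders.
population_share = {"18-34": 0.30, "35-54": 0.40, "55+": 0.30}
sample = ["18-34"] * 50 + ["35-54"] * 40 + ["55+"] * 10

counts = Counter(sample)
total = len(sample)
gaps = {
    group: counts[group] / total - share
    for group, share in population_share.items()
}

# Groups whose sample share deviates from the population by more than 5 points.
flagged = sorted(g for g, gap in gaps.items() if abs(gap) > 0.05)
print(flagged)
```

Flagged groups might call for additional recruitment or for weighting during analysis.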

Data collection is the cornerstone of informed decision-making and insight generation in today's data-driven world. Whether you're a business seeking to understand your customers better, a researcher uncovering valuable trends, or anyone eager to harness the power of data, this guide has equipped you with the essential knowledge and tools. Remember, ethical considerations are paramount, and the quality of data matters.

Furthermore, as you embark on your data collection journey, always keep in mind the impact and potential of the information you gather. Each data point is a piece of the puzzle that can help you shape strategies, optimize operations, and make a positive difference. Data collection is not just a task; it's a powerful tool that empowers you to unlock opportunities, solve challenges, and stay ahead in a dynamic and ever-changing landscape. So, continue to explore, analyze, and draw valuable insights from your data, and let it be your compass on the path to success.




Data Collection Methods | Step-by-Step Guide & Examples

Published on 4 May 2022 by Pritha Bhandari.

Data collection is a systematic process of gathering observations or measurements. Whether you are performing research for business, governmental, or academic purposes, data collection allows you to gain first-hand knowledge and original insights into your research problem.

While methods and aims may differ between fields, the overall process of data collection remains largely the same. Before you begin collecting data, you need to consider:

  • The aim of the research
  • The type of data that you will collect
  • The methods and procedures you will use to collect, store, and process the data

To collect high-quality data that is relevant to your purposes, follow these four steps.

Table of contents

  • Step 1: Define the aim of your research
  • Step 2: Choose your data collection method
  • Step 3: Plan your data collection procedures
  • Step 4: Collect the data
  • Frequently asked questions about data collection

Before you start the process of data collection, you need to identify exactly what you want to achieve. You can start by writing a problem statement: what is the practical or scientific issue that you want to address, and why does it matter?

Next, formulate one or more research questions that precisely define what you want to find out. Depending on your research questions, you might need to collect quantitative or qualitative data:

  • Quantitative data is expressed in numbers and graphs and is analysed through statistical methods.
  • Qualitative data is expressed in words and analysed through interpretations and categorisations.

If your aim is to test a hypothesis, measure something precisely, or gain large-scale statistical insights, collect quantitative data. If your aim is to explore ideas, understand experiences, or gain detailed insights into a specific context, collect qualitative data.

If you have several aims, you can use a mixed methods approach that collects both types of data.

  • Your first aim is to assess whether there are significant differences in perceptions of managers across different departments and office locations.
  • Your second aim is to gather meaningful feedback from employees to explore new ideas for how managers can improve.


Based on the data you want to collect, decide which method is best suited for your research.

  • Experimental research is primarily a quantitative method.
  • Interviews, focus groups, and ethnographies are qualitative methods.
  • Surveys, observations, archival research, and secondary data collection can be quantitative or qualitative methods.

Carefully consider what method you will use to gather data that helps you directly answer your research questions.

When you know which method(s) you are using, you need to plan exactly how you will implement them. What procedures will you follow to make accurate observations or measurements of the variables you are interested in?

For instance, if you’re conducting surveys or interviews, decide what form the questions will take; if you’re conducting an experiment, make decisions about your experimental design.


Sometimes your variables can be measured directly: for example, you can collect data on the average age of employees simply by asking for dates of birth. However, often you’ll be interested in collecting data on more abstract concepts or variables that can’t be directly observed.

Operationalisation means turning abstract conceptual ideas into measurable observations. When planning how you will collect data, you need to translate the conceptual definition of what you want to study into the operational definition of what you will actually measure.

  • You ask managers to rate their own leadership skills on 5-point scales assessing the ability to delegate, decisiveness, and dependability.
  • You ask their direct employees to provide anonymous feedback on the managers regarding the same topics.
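
The operational definition in this example can be turned into numbers quite directly. The following sketch averages the three 5-point items into a composite leadership score and compares the manager's self-rating with the employees' ratings. All ratings are hypothetical, and a plain unweighted average is an assumption; other scoring schemes are possible.

```python
import statistics

# Hypothetical 1-5 ratings for one manager on the three operationalised items.
self_rating = {"delegation": 4, "decisiveness": 5, "dependability": 4}
employee_ratings = [
    {"delegation": 3, "decisiveness": 4, "dependability": 5},
    {"delegation": 2, "decisiveness": 4, "dependability": 4},
]

def composite(rating):
    """Average the item scores into a single leadership score."""
    return statistics.mean(list(rating.values()))

self_score = composite(self_rating)
employee_score = statistics.mean([composite(r) for r in employee_ratings])
gap = self_score - employee_score

print(round(self_score, 2), round(employee_score, 2), round(gap, 2))
```

A positive gap means the manager rates themselves higher than their employees do.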

You may need to develop a sampling plan to obtain data systematically. This involves defining a population, the group you want to draw conclusions about, and a sample, the group you will actually collect data from.

Your sampling method will determine how you recruit participants or obtain measurements for your study. To decide on a sampling method you will need to consider factors like the required sample size, accessibility of the sample, and time frame of the data collection.
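
As one possible sampling method, the sketch below draws a stratified random sample: the same fraction is selected at random from each department. The sampling frame, the 10% fraction, and the fixed random seed are assumptions for the example; simple random, systematic, or cluster sampling may suit other studies better.

```python
import random

# Hypothetical sampling frame: 60 employees in sales, 40 in engineering.
frame = [("sales", i) for i in range(60)] + [("engineering", i) for i in range(40)]

def stratified_sample(units, fraction, seed=0):
    """Draw the same fraction at random from each stratum (here: department)."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    strata = {}
    for stratum, unit_id in units:
        strata.setdefault(stratum, []).append((stratum, unit_id))
    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)
        sample.extend(rng.sample(members, k))
    return sample

sample = stratified_sample(frame, fraction=0.1)
print(len(sample), sample[:3])
```

Recording the seed alongside the sampling plan lets another researcher reproduce exactly the same draw.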

Standardising procedures

If multiple researchers are involved, write a detailed manual to standardise data collection procedures in your study.

This means laying out specific step-by-step instructions so that everyone in your research team collects data in a consistent way – for example, by conducting experiments under the same conditions and using objective criteria to record and categorise observations.

This helps ensure the reliability of your data, and you can also use it to replicate the study in the future.

Creating a data management plan

Before beginning data collection, you should also decide how you will organise and store your data.

  • If you are collecting data from people, you will likely need to anonymise and safeguard the data to prevent leaks of sensitive information (e.g. names or identity numbers).
  • If you are collecting data via interviews or pencil-and-paper formats, you will need to perform transcriptions or data entry in systematic ways to minimise distortion.
  • You can prevent loss of data by having an organisation system that is routinely backed up.

Finally, you can implement your chosen methods to measure or observe the variables you are interested in.

The closed-ended questions ask participants to rate their manager’s leadership skills on scales from 1 to 5. The data produced is numerical and can be statistically analysed for averages and patterns.

To ensure that high-quality data is recorded in a systematic way, here are some best practices:

  • Record all relevant information as and when you obtain data. For example, note down whether or how lab equipment is recalibrated during an experimental study.
  • Double-check manual data entry for errors.
  • If you collect quantitative data, you can assess the reliability and validity to get an indication of your data quality.
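
Double-checking manual data entry is often done by having the data typed in twice and comparing the two versions. The sketch below shows one way this comparison might look; the participant IDs, fields, and values are invented.

```python
def compare_entries(first_pass, second_pass):
    """Return (record id, field, first value, second value) for each mismatch."""
    mismatches = []
    for record_id, first in first_pass.items():
        second = second_pass.get(record_id, {})
        for field, value in first.items():
            if second.get(field) != value:
                mismatches.append((record_id, field, value, second.get(field)))
    return mismatches

# Hypothetical data typed in twice by two different people.
entry_a = {"P01": {"age": 34, "score": 4}, "P02": {"age": 51, "score": 2}}
entry_b = {"P01": {"age": 34, "score": 4}, "P02": {"age": 15, "score": 2}}

mismatches = compare_entries(entry_a, entry_b)
print(mismatches)
```

Each mismatch is then resolved by going back to the original paper form.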

Data collection is the systematic process by which observations or measurements are gathered in research. It is used in many different contexts by academics, governments, businesses, and other organisations.

When conducting research, collecting original data has significant advantages:

  • You can tailor data collection to your specific research aims (e.g., understanding the needs of your consumers or user testing your website).
  • You can control and standardise the process for high reliability and validity (e.g., choosing appropriate measurements and sampling methods).

However, there are also some drawbacks: data collection can be time-consuming, labour-intensive, and expensive. In some cases, it’s more efficient to use secondary data that has already been collected by someone else, but the data might be less reliable.

Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to test a hypothesis by systematically collecting and analysing data, while qualitative methods allow you to explore ideas and experiences in depth.

Reliability and validity are both about how well a method measures something:

  • Reliability refers to the consistency of a measure (whether the results can be reproduced under the same conditions).
  • Validity refers to the accuracy of a measure (whether the results really do represent what they are supposed to measure).

If you are doing experimental research, you also have to consider the internal and external validity of your experiment.

In mixed methods research, you use both qualitative and quantitative data collection and analysis methods to answer your research question.

Operationalisation means turning abstract conceptual ideas into measurable observations.

For example, the concept of social anxiety isn’t directly observable, but it can be operationally defined in terms of self-rating scores, behavioural avoidance of crowded places, or physical anxiety symptoms in social situations.

Before collecting data, it’s important to consider how you will operationalise the variables that you want to measure.

Cite this Scribbr article


Bhandari, P. (2022, May 04). Data Collection Methods | Step-by-Step Guide & Examples. Scribbr. Retrieved 25 March 2024, from https://www.scribbr.co.uk/research-methods/data-collection-guide/


7 Data Collection Methods & Tools For Research

The underlying need for data collection is to capture quality evidence that answers the questions that have been posed. Through data collection, businesses or management can deduce quality information that is a prerequisite for making informed decisions.

Collecting data carefully improves the quality of your information, so that you can draw inferences and make informed decisions based on what is factual.

By the end of this article, you will understand why picking the best data collection method is necessary for achieving your set objective.


What is Data Collection?

Data collection is a methodical process of gathering and analyzing specific information to proffer solutions to relevant questions and evaluate the results. It focuses on finding out all there is to a particular subject matter. Data is collected to be further subjected to hypothesis testing which seeks to explain a phenomenon.

Hypothesis testing eliminates assumptions while making a proposition from the basis of reason.


For collectors of data, there is a range of outcomes for which the data is collected. But the key purpose for which data is collected is to put a researcher in a vantage position to make predictions about future probabilities and trends.

The core forms in which data can be collected are primary and secondary data. While the former is collected by a researcher through first-hand sources, the latter is collected by an individual other than the user. 

Types of Data Collection 

Before discussing the various types of data collection, it is pertinent to note that data collection itself falls under two broad categories: primary data collection and secondary data collection.

Primary Data Collection

Primary data collection is, by definition, the gathering of raw data at the source. It is the process by which a researcher collects original data for a specific research purpose. It can be further divided into two segments: qualitative and quantitative data collection methods.

  • Qualitative Research Method 

Qualitative research methods do not involve collecting numerical data or data that must be derived through mathematical calculation; rather, they focus on non-quantifiable elements such as the feelings or emotions of respondents. An example of such a method is an open-ended questionnaire.


  • Quantitative Method

Quantitative methods produce data expressed in numbers, which require mathematical calculation to interpret. An example would be the use of a questionnaire with close-ended questions to arrive at figures that can be analyzed mathematically, for instance with the mean, mode, and median, or with correlation and regression.

Secondary Data Collection

Secondary data collection, on the other hand, is referred to as the gathering of second-hand data collected by an individual who is not the original user. It is the process of collecting data that is already existing, be it already published books, journals, and/or online portals. In terms of ease, it is much less expensive and easier to collect.

Your choice between Primary data collection and secondary data collection depends on the nature, scope, and area of your research as well as its aims and objectives. 

Importance of Data Collection

There are several underlying reasons for collecting data, especially for a researcher. Here are a few of them:

  • Integrity of the Research

A key reason for collecting data, be it through quantitative or qualitative methods, is to ensure that the integrity of the research question is maintained.

  • Reduce the likelihood of errors

The correct use of appropriate data collection methods reduces the likelihood of errors in the results.

  • Decision Making

To minimize the risk of errors in decision-making, it is important that accurate data is collected so that the researcher doesn’t make uninformed decisions. 

  • Save Cost and Time

Data collection saves the researcher time and funds that would otherwise be misspent without a deeper understanding of the topic or subject matter.

  • To support a need for a new idea, change, and/or innovation

To prove the need for a change in the norm or the introduction of new information that will be widely accepted, it is important to collect data as evidence to support these claims.

What is a Data Collection Tool?

Data collection tools refer to the devices/instruments used to collect data, such as a paper questionnaire or computer-assisted interviewing system. Case studies, checklists, interviews, surveys or questionnaires, and sometimes observation are all tools used to collect data.

It is important to decide on the tools for data collection because research is carried out in different ways and for different purposes. The objective behind data collection is to capture quality evidence that allows analysis to lead to the formulation of convincing and credible answers to the posed questions.

The Formplus online data collection tool is perfect for gathering primary data, i.e. raw data collected from the source. With our online and offline data-gathering tool, you can easily collect data through at least three data collection methods, i.e. online questionnaires, focus groups, and reporting.

In our previous articles, we’ve explained why quantitative research methods are more effective than qualitative methods. However, with the Formplus data collection tool, you can gather all types of primary data for academic, opinion or product research.

Top Data Collection Methods and Tools for Academic, Opinion, or Product Research

The following are the top 7 data collection methods for Academic, Opinion-based, or product research. Also discussed in detail are the nature, pros, and cons of each one. At the end of this segment, you will be best informed about which method best suits your research. 

Interviews

An interview is a face-to-face conversation between two individuals with the sole purpose of collecting relevant information to satisfy a research purpose. Interviews are of different types, namely structured, semi-structured, and unstructured, with each having a slight variation from the other.

Use this interview consent form template to let an interviewee give you consent to use data gotten from your interviews for investigative research purposes.

  • Structured Interviews – Simply put, a structured interview is a verbally administered questionnaire. It stays at the surface level and is usually completed within a short period. It is highly recommended for speed and efficiency, but it lacks depth.
  • Semi-structured Interviews – In this method, there are several key questions that cover the scope of the areas to be explored. It allows a little more leeway for the researcher to explore the subject matter.
  • Unstructured Interviews – This is an in-depth interview that allows the researcher to collect a wide range of information with a purpose. An advantage of this method is the freedom it gives a researcher to combine structure with flexibility, even though it is more time-consuming.

Pros

  • In-depth information
  • Freedom of flexibility
  • Accurate data

Cons

  • Time-consuming
  • Expensive to collect

What are The Best Data Collection Tools for Interviews? 

For collecting data through interviews, here are a few tools you can use to easily collect data.

  • Audio Recorder

An audio recorder is used for recording sound on disc, tape, or film. Audio information can meet the needs of a wide range of people, as well as provide alternatives to print data collection tools.

  • Digital Camera

An advantage of a digital camera is that the images it captures can be transmitted to a monitor screen when the need arises.

  • Camcorder

A camcorder is used for collecting data through interviews. It provides a combination of both an audio recorder and a video camera. The data provided is qualitative in nature and allows the respondents to answer questions asked exhaustively. If you need to collect sensitive information during an interview, a camcorder might not work for you as you would need to maintain your subject’s privacy.

Want to conduct an interview for qualitative data research or a special report? Use this online interview consent form template to allow the interviewee to give their consent before you use the interview data for research or report. With premium features like e-signature, upload fields, form security, etc., Formplus Builder is the perfect tool to create your preferred online consent forms without coding experience. 


Questionnaires

This is the process of collecting data through an instrument consisting of a series of questions and prompts to receive a response from the individuals it is administered to. Questionnaires are designed to collect data from a group.

For clarity, it is important to note that a questionnaire isn’t a survey, rather it forms a part of it. A survey is a process of data gathering involving a variety of data collection methods, including a questionnaire.

A questionnaire uses three kinds of questions: fixed-alternative, scale, and open-ended. Each question is tailored to the nature and scope of the research.
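
To make the three question kinds concrete, here is a minimal sketch of how they might be modelled and checked in code. The `Question` class, the `is_valid` helper, and the sample questions are hypothetical and are not part of any particular survey tool.

```python
from dataclasses import dataclass, field


@dataclass
class Question:
    """One questionnaire item; `kind` is one of the three types named above."""
    text: str
    kind: str  # "fixed", "scale", or "open"
    options: list = field(default_factory=list)  # used by fixed-alternative items
    scale: tuple = (1, 5)  # inclusive range used by scale items


def is_valid(question, answer):
    """Return True if `answer` is acceptable for the question's type."""
    if question.kind == "fixed":
        return answer in question.options
    if question.kind == "scale":
        low, high = question.scale
        return isinstance(answer, int) and low <= answer <= high
    if question.kind == "open":
        return isinstance(answer, str) and answer.strip() != ""
    raise ValueError(f"unknown question kind: {question.kind}")


# Hypothetical three-item questionnaire.
questions = [
    Question("Do you own a smartphone?", "fixed", options=["Yes", "No"]),
    Question("How satisfied are you with it?", "scale", scale=(1, 5)),
    Question("What would you improve?", "open"),
]
answers = ["Yes", 4, "Longer battery life"]

results = [is_valid(q, a) for q, a in zip(questions, answers)]
print(results)
```

Fixed-alternative and scale answers are easy to count and compare, while open-ended answers are only checked for being non-empty here because their content needs qualitative review.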

Pros

  • Can be administered in large numbers and is cost-effective.
  • It can be used to compare and contrast previous research to measure change.
  • Easy to visualize and analyze.
  • Questionnaires offer actionable data.
  • Respondent identity is protected.
  • Questionnaires can cover all areas of a topic.
  • Relatively inexpensive.

Cons

  • Answers may be dishonest or the respondents may lose interest midway.
  • Questionnaires built mainly on close-ended questions yield limited qualitative data.
  • Questions might be left unanswered.
  • Respondents may have a hidden agenda.
  • Not all questions can be analyzed easily.

What are the Best Data Collection Tools for Questionnaires? 

  • Formplus Online Questionnaire

Formplus lets you create powerful forms to help you collect the information you need. Use the Formplus online questionnaire form template to get actionable trends and measurable responses. Conduct research, optimize knowledge of your brand, or just get to know an audience with this form template. The form template is fast, free, and fully customizable.

  • Paper Questionnaire

A paper questionnaire is a data collection tool consisting of a series of questions and/or prompts for the purpose of gathering information from respondents. Paper questionnaires are mostly designed for statistical analysis of the responses.

Reporting

By definition, data reporting is the process of gathering and submitting data to be further subjected to analysis. The key aspect of data reporting is reporting accurate data because inaccurate data reporting leads to uninformed decision-making.

Pros

  • Informed decision-making.
  • Easily accessible.

Cons

  • Self-reported answers may be exaggerated.
  • The results may be affected by bias.
  • Respondents may be too shy to give out all the details.
  • Inaccurate reports will lead to uninformed decisions.

What are the Best Data Collection Tools for Reporting?

Reporting tools enable you to extract and present data in charts, tables, and other visualizations so users can find useful information. You could source data for reporting from Non-Governmental Organizations (NGO) reports, newspapers, website articles, and hospital records.

  • NGO Reports

An NGO report is an in-depth and comprehensive account of the activities carried out by the NGO, covering areas such as business and human rights. The information contained in these reports is research-specific and forms an acceptable academic base for collecting data. NGOs often focus on development projects which are organized to promote particular causes.

  • Newspapers

Newspaper data are relatively easy to collect and are sometimes the only continuously available source of event data. Even though there is a problem of bias in newspaper data, it is still a valid tool for collecting data for reporting.

  • Website Articles

Gathering and using data contained in website articles is another tool for data collection. Collecting data from web articles is a quicker and less expensive data collection method. Two major disadvantages of using this data reporting method are biases inherent in the data collection process and possible security/confidentiality concerns.

  • Hospital Care records

Health care involves a diverse set of public and private data collection systems, including health surveys, administrative enrollment and billing records, and medical records, used by various entities, including hospitals, community health centers (CHCs), physicians, and health plans. The data provided is generally clear, unbiased, and accurate, but it must be obtained through legal means, as medical data is subject to strict regulation.


Existing Data

This is the introduction of new investigative questions in addition to/other than the ones originally used when the data was initially gathered. It involves adding measurement to a study or research. An example would be sourcing data from an archive.

Pros

  • Accuracy is very high.
  • Easily accessible information.

Cons

  • Problems with evaluation.
  • Difficulty in understanding.

What are the Best Data Collection Tools for Existing Data?

The concept of Existing data means that data is collected from existing sources to investigate research questions other than those for which the data were originally gathered. Tools to collect existing data include: 

  • Research Journals – Unlike newspapers and magazines, research journals are intended for an academic or technical audience, not general readers. A journal is a scholarly publication containing articles written by researchers, professors, and other experts.
  • Surveys – A survey is a data collection tool for gathering information from a sample population, with the intention of generalizing the results to a larger population. Surveys have a variety of purposes and can be carried out in many ways depending on the objectives to be achieved.

Observation

This is a data collection method by which information on a phenomenon is gathered through observation. The observer can act as a complete observer, an observer as a participant, a participant as an observer, or a complete participant. This method is a key base for formulating a hypothesis.

Pros

  • Easy to administer.
  • Results tend to be more accurate.
  • It is a universally accepted practice.
  • It sidesteps the unwillingness of respondents to report on themselves.
  • It is appropriate for certain situations.

Cons

  • Some phenomena aren’t open to observation.
  • It cannot always be relied upon.
  • Bias may arise.
  • It is expensive to administer.
  • Its validity cannot be predicted accurately.

What are the Best Data Collection Tools for Observation?

Observation involves the active acquisition of information from a primary source. Observation can also involve the perception and recording of data via the use of scientific instruments. The best tools for Observation are:

  • Checklists – These state specific criteria that allow users to gather information and make judgments about what they should know in relation to the outcomes. They offer systematic ways of collecting data about specific behaviors, knowledge, and skills.
  • Direct observation – This is an observational study method of collecting evaluative information. The evaluator watches the subject in his or her usual environment without altering that environment.


Focus Groups

In contrast to quantitative research, which involves numerical data, this data collection method focuses on qualitative research. It falls under the primary category and yields data based on the feelings and opinions of the respondents. It involves asking open-ended questions to a group of individuals, usually 6 to 10 people, to gather their feedback.

Pros

  • Information obtained is usually very detailed.
  • Cost-effective when compared to one-on-one interviews.
  • It reflects speed and efficiency in the supply of results.

Cons

  • Lacking depth in covering the nitty-gritty of a subject matter.
  • Bias might still be evident.
  • Requires interviewer training.
  • The researcher has very little control over the outcome.
  • A few vocal voices can drown out the rest.
  • Difficulty in assembling an all-inclusive group.

What are the Best Data Collection Tools for Focus Groups?

A focus group is a data collection method that is tightly facilitated and structured around a set of questions. The purpose of the meeting is to draw detailed responses to these questions from the participants. The best tools for focus groups are:

  • Two-Way – One group watches another group answer the questions posed by the moderator. After listening to what the other group has to offer, the group that listens is able to facilitate more discussion and could potentially draw different conclusions.
  • Dueling-Moderator – There are two moderators who play the devil’s advocate. The main positive of the dueling-moderator focus group is to facilitate new ideas by introducing new ways of thinking and varying viewpoints.

Combination Research

This method of data collection encompasses the use of innovative methods to enhance participation from both individuals and groups. Also under the primary category, it combines interviews and focus groups to collect qualitative data. This method is key when addressing sensitive subjects.

Pros

  • Encourages participants to give responses.
  • It stimulates a deeper connection between participants.
  • The relative anonymity of respondents increases participation.
  • It improves the richness of the data collected.

Cons

  • It costs the most out of all the top 7.
  • It’s the most time-consuming.

What are the Best Data Collection Tools for Combination Research? 

The Combination Research method involves two or more data collection methods, for instance, interviews as well as questionnaires or a combination of semi-structured telephone interviews and focus groups. The best tools for combination research are: 

  • Online Survey – The two tools combined here are online interviews and questionnaires. This is a questionnaire that the target audience can complete over the Internet. It is timely, effective, and efficient, especially when the data to be collected is quantitative in nature.
  • Dual-Moderator – The two tools combined here are focus groups and structured questionnaires. The structured questionnaires give a direction as to where the research is headed while two moderators take charge of the proceedings. Whilst one ensures the focus group session progresses smoothly, the other makes sure that the topics in question are all covered. Dual-moderator focus groups typically result in a more productive session and essentially lead to an optimum collection of data.

Why Formplus is the Best Data Collection Tool

  • Vast Options for Form Customization 

With Formplus, you can create your unique survey form. With options to change themes, font color, font type, layout, width, and more, you can create an attractive survey form. The builder also gives you as many features as possible to choose from and you do not need to be a graphic designer to create a form.

  • Extensive Analytics

Form Analytics, a feature in Formplus, helps you view the number of respondents, unique visits, total visits, abandonment rate, and average time spent before submission. This tool eliminates the need to manually calculate the received data and/or responses, as well as the conversion rate for your poll.

  • Embed Survey Form on Your Website

Copy the link to your form and embed it as an iframe which will automatically load as your website loads, or as a popup that opens once the respondent clicks on the link. Embed the link on your Twitter page to give instant access to your followers.

  • Geolocation Support

The geolocation feature on Formplus lets you ascertain where individual responses are coming from. It utilises Google Maps to pinpoint the longitude and latitude of the respondent as accurately as possible, along with the responses.

  • Multi-Select feature

This feature helps to conserve horizontal space as it allows you to put multiple options in one field. This translates to including more information on the survey form. 

How to Use Formplus to Collect Online Data in 8 Simple Steps

1. Register or sign up on Formplus builder: Start creating your preferred questionnaire or survey by signing up with either your Google, Facebook, or Email account.

Formplus gives you a free plan with basic features you can use to collect online data. Pricing plans with more features start at $20 monthly, with reasonable discounts for Education and Non-Profit Organizations.

2. Input your survey title and use the form builder choice options to start creating your surveys. 

Use the choice option fields like single select, multiple select, checkbox, radio, and image choices to create your preferred multi-choice surveys online.

3. Do you want customers to rate any of your products or service delivery?

Use the rating option to allow survey respondents to rate your products or services. This is an ideal quantitative research method of collecting data.

4. Beautify your online questionnaire with Formplus Customisation features.

  • Change the theme color
  • Add your brand’s logo and image to the forms
  • Change the form width and layout
  • Edit the submission button if you want
  • Change text font color and sizes
  • Do you already have custom CSS to beautify your questionnaire? If yes, just copy and paste it into the CSS option.

5. Edit your survey questionnaire settings for your specific needs

Choose where to store your files and responses. Select a submission deadline, choose a timezone, limit respondents’ responses, enable Captcha to prevent spam, and collect location data of customers.

Set an introductory message for respondents to see before they begin the survey, toggle the “start button”, set a post-submission message, or redirect respondents to another page when they submit their questionnaires.

Change the Email Notifications inventory and initiate an autoresponder message to all your survey questionnaire respondents. You can also transfer your forms to other users who can become form administrators.

6. Share links to your survey questionnaire page with customers.

There’s an option to copy and share the link as “Popup” or “Embed code”. The data collection tool automatically creates a QR code for the survey questionnaire, which you can download and share as appropriate.

Congratulations if you’ve made it to this stage. You can start sharing the link to your survey questionnaire with your customers.

7. View your Responses to the Survey Questionnaire

Choose how your summary is presented from the available options: single, table, or cards.

8. Allow Formplus Analytics to interpret your Survey Questionnaire Data

With online form builder analytics, a business can determine:

  • The number of times the survey questionnaire was filled
  • The number of customers reached
  • Abandonment Rate: The rate at which customers exit the form without submitting it.
  • Conversion Rate: The percentage of customers who completed the online form
  • Average time spent per visit
  • Location of customers/respondents.
  • The type of device used by the customer to complete the survey questionnaire.
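
The abandonment and conversion rates defined above can be sketched in a few lines. This example assumes both rates are measured against total form visits and that every visit ends in either a submission or an abandonment; that is an illustrative assumption, not a description of how Formplus calculates them. The function name `form_rates` and the figures are hypothetical.

```python
def form_rates(visits, submissions):
    """Return (conversion_rate, abandonment_rate) as percentages.

    Assumes every visit either ends in a submission or is abandoned,
    so the two rates add up to 100.
    """
    if visits <= 0:
        raise ValueError("visits must be positive")
    if not 0 <= submissions <= visits:
        raise ValueError("submissions must be between 0 and visits")
    conversion = 100 * submissions / visits
    abandonment = 100 - conversion
    return conversion, abandonment


# Hypothetical figures: 400 visits, 260 completed forms.
conversion, abandonment = form_rates(visits=400, submissions=260)
print(f"conversion: {conversion:.1f}%, abandonment: {abandonment:.1f}%")
# prints "conversion: 65.0%, abandonment: 35.0%"
```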

7 Tips to Create The Best Surveys For Data Collections

  • Define the goal of your survey – Once the goal of your survey is outlined, it will help you decide which questions are the top priority. A clear, attainable goal states why the survey is being run, e.g. “The goal of this survey is to understand why employees are leaving an establishment.”
  • Use close-ended clearly defined questions – Avoid open-ended questions and ensure you’re not suggesting your preferred answer to the respondent. If possible offer a range of answers with choice options and ratings.
  • Survey outlook should be attractive and Inviting – An attractive-looking survey encourages a higher number of recipients to respond to the survey. Check out Formplus Builder for colorful options to integrate into your survey design. You could use images and videos to keep participants glued to their screens.
  • Assure Respondents about the safety of their data – You want your respondents to be assured whilst disclosing details of their personal information to you. It’s your duty to inform the respondents that the data they provide is confidential and only collected for the purpose of research.
  • Ensure your survey can be completed in record time – Ideally, in a typical survey, users should be able to respond in 100 seconds. It is pertinent to note that they, the respondents, are doing you a favor. Don’t stress them. Be brief and get straight to the point.
  • Do a trial survey – Preview your survey before sending out your surveys to the intended respondents. Make a trial version which you’ll send to a few individuals. Based on their responses, you can draw inferences and decide whether or not your survey is ready for the big time.
  • Attach a reward upon completion for users – Give your respondents something to look forward to at the end of the survey. Think of it as a penny for their troubles. It could well be the encouragement they need to not abandon the survey midway.

Try out Formplus today. You can start making your own surveys with the Formplus online survey builder. By applying these tips, you will definitely get the most out of your online surveys.

Top Survey Templates For Data Collection 

  • Customer Satisfaction Survey Template 

On the template, you can collect data to measure customer satisfaction over key areas like the commodity purchase and the level of service they received. It also gives insight as to which products the customer enjoyed, how often they buy such a product, and whether or not the customer is likely to recommend the product to a friend or acquaintance. 

  • Demographic Survey Template

With this template, you would be able to measure, with accuracy, the ratio of male to female, age range, and the number of unemployed persons in a particular country as well as obtain their personal details such as names and addresses.

Respondents are also able to state their religious and political views about the country under review.

  • Feedback Form Template

The online feedback form template captures the details of a product and/or service used. It identifies the product or service and documents how long the customer has used it.

The overall satisfaction is measured as well as the delivery of the services. The likelihood that the customer also recommends said product is also measured.

  • Online Questionnaire Template

The online questionnaire template houses the respondent’s data as well as educational qualifications to collect information to be used for academic research.

Respondents can also provide their gender, race, and field of study as well as present living conditions as prerequisite data for the research study.

  • Student Data Sheet Form Template 

The template is a data sheet containing all the relevant information of a student. The student’s name, home address, guardian’s name, record of attendance as well as performance in school is well represented on this template. This is a perfect data collection method to deploy for a school or an education organization.

Also included is a record for interaction with others as well as a space for a short comment on the overall performance and attitude of the student. 

  • Interview Consent Form Template

This online interview consent form template allows the interviewee to sign off their consent to use the interview data for research or report to journalists. With premium features like short text fields, upload, e-signature, etc., Formplus Builder is the perfect tool to create your preferred online consent forms without coding experience.

What is the Best Data Collection Method for Qualitative Data?

Answer: Combination Research

The best data collection method for gathering qualitative data, which generally relies on the feelings, opinions, and beliefs of the respondents, is Combination Research.

Combination research is the best fit because it encompasses the attributes of Interviews and Focus Groups. It is also useful when gathering data that is sensitive in nature. It can be described as an all-purpose qualitative data collection method.

Above all, combination research improves the richness of data collected when compared with other data collection methods for qualitative data.

What is the Best Data Collection Method for Quantitative Research Data?

Answer: Questionnaire

The best data collection method a researcher can employ in gathering quantitative data, that is, data that can be represented in numbers and figures and deduced mathematically, is the Questionnaire.

These can be administered to a large number of respondents while saving costs. For quantitative data that may be bulky or voluminous in nature, the use of a Questionnaire makes such data easy to visualize and analyze.

Another key advantage of the Questionnaire is that it can be used to compare and contrast previous research work done to measure changes.

Technology-Enabled Data Collection Methods

Technology has revolutionized the way data is collected, so many diverse methods are now available. It has provided efficient and innovative methods that anyone, especially researchers and organizations, can use. Below are some technology-enabled data collection methods:

  • Online Surveys: Online surveys have gained popularity due to their ease of use and wide reach. You can distribute them through email or social media, or embed them on websites. Online surveys allow for quick data collection, automated data capture, and real-time analysis. They also offer features like skip logic, validation checks, and multimedia integration.
  • Mobile Surveys: With the widespread use of smartphones, mobile surveys’ popularity is also on the rise. Mobile surveys leverage the capabilities of mobile devices, and this allows respondents to participate at their convenience. This includes multimedia elements, location-based information, and real-time feedback. Mobile surveys are the best for capturing in-the-moment experiences or opinions.
  • Social Media Listening: Social media platforms are a good source of unstructured data that you can analyze to gain insights into customer sentiment and trends. Social media listening involves monitoring and analyzing social media conversations, mentions, and hashtags to understand public opinion, identify emerging topics, and assess brand reputation.
  • Wearable Devices and Sensors: Wearable devices, such as fitness trackers or smartwatches, and sensors embedded in everyday objects can capture continuous data on various physiological and environmental variables. This data can provide you with insights into health behaviors, activity patterns, sleep quality, and environmental conditions, among others.
  • Big Data Analytics: Big data analytics leverages large volumes of structured and unstructured data from various sources, such as transaction records, social media, and internet browsing. Advanced analytics techniques, like machine learning and natural language processing, can extract meaningful insights and patterns from this data, enabling organizations to make data-driven decisions.
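
As a small illustration of the social media listening item in the list above, the sketch below counts hashtags across a handful of posts. The posts are invented, and a real project would collect them through a platform's API, which is not shown here.

```python
import re
from collections import Counter

# Hypothetical posts; a real project would pull these from a platform's API.
posts = [
    "Loving the new update! #ProductX #happy",
    "Checkout keeps failing on mobile #ProductX #bug",
    "Support was quick to respond. #ProductX #happy",
]

hashtag_pattern = re.compile(r"#(\w+)")

# Count each hashtag once per post, ignoring case.
counts = Counter(
    tag
    for post in posts
    for tag in {t.lower() for t in hashtag_pattern.findall(post)}
)
print(counts.most_common(3))
```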

Faulty Data Collection Practices – Common Mistakes & Sources of Error

While technology-enabled data collection methods offer numerous advantages, there are some pitfalls and sources of error that you should be aware of. Here are some common mistakes and sources of error in data collection:

  • Population Specification Error: Population specification error occurs when the target population is not clearly defined or misidentified. This error leads to a mismatch between the research objectives and the actual population being studied, resulting in biased or inaccurate findings.
  • Sample Frame Error: Sample frame error occurs when the sampling frame, the list or source from which the sample is drawn, does not adequately represent the target population. This error can introduce selection bias and affect the generalizability of the findings.
  • Selection Error: Selection error occurs when the process of selecting participants or units for the study introduces bias. It can happen due to nonrandom sampling methods, inadequate sampling techniques, or self-selection bias. Selection error compromises the representativeness of the sample and affects the validity of the results.
  • Nonresponse Error: Nonresponse error occurs when selected participants choose not to participate or fail to respond to the data collection effort. Nonresponse bias can result in an unrepresentative sample if those who choose not to respond differ systematically from those who do respond. Efforts should be made to mitigate nonresponse and encourage participation to minimize this error.
  • Measurement Error: Measurement error arises from inaccuracies or inconsistencies in the measurement process. It can happen due to poorly designed survey instruments, ambiguous questions, respondent bias, or errors in data entry or coding. Measurement errors can lead to distorted or unreliable data, affecting the validity and reliability of the findings.

To mitigate these errors and ensure high-quality data collection, you should carefully plan your data collection procedures and validate your measurement tools. You should also use appropriate sampling techniques, employ randomization where possible, and minimize nonresponse through effective communication and incentives. Conduct regular checks, and implement validation and data cleaning procedures to identify and rectify errors during data analysis.
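
The randomization and nonresponse points above can be sketched as follows. The sampling frame, the sample size, and the assumption that the first 35 people contacted responded are all made up for illustration.

```python
import random

# Hypothetical sampling frame: IDs for everyone in the target population.
sampling_frame = [f"person_{i:03d}" for i in range(1, 201)]

rng = random.Random(42)  # fixed seed so the draw can be reproduced
sample = rng.sample(sampling_frame, k=50)  # simple random sample, no repeats

# Hypothetical outcome: suppose the first 35 people contacted responded.
respondents = set(sample[:35])

response_rate = len(respondents) / len(sample)
nonresponse_rate = 1 - response_rate
print(f"response rate: {response_rate:.0%}, nonresponse: {nonresponse_rate:.0%}")
```

Drawing the sample at random from a complete frame addresses selection error, while tracking the response rate shows how much room there is for nonresponse bias.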

Best Practices for Data Collection

  • Clearly Define Objectives: Clearly define the research objectives and questions to guide the data collection process. This helps ensure that the collected data aligns with the research goals and provides relevant insights.
  • Plan Ahead: Develop a detailed data collection plan that includes the timeline, resources needed, and specific procedures to follow. This helps maintain consistency and efficiency throughout the data collection process.
  • Choose the Right Method: Select data collection methods that are appropriate for the research objectives and target population. Consider factors such as feasibility, cost-effectiveness, and the ability to capture the required data accurately.
  • Pilot Test: Before full-scale data collection, conduct a pilot test to identify any issues with the data collection instruments or procedures. This allows for refinement and improvement before data collection with the actual sample.
  • Train Data Collectors: If data collection involves human interaction, ensure that data collectors are properly trained on the data collection protocols, instruments, and ethical considerations. Consistent training helps minimize errors and maintain data quality.
  • Maintain Consistency: Follow standardized procedures throughout the data collection process to ensure consistency across data collectors and time. This includes using consistent measurement scales, instructions, and data recording methods.
  • Minimize Bias: Be aware of potential sources of bias in data collection and take steps to minimize their impact. Use randomization techniques, employ diverse data collectors, and implement strategies to mitigate response biases.
  • Ensure Data Quality: Implement quality control measures to ensure the accuracy, completeness, and reliability of the collected data. Conduct regular checks for data entry errors, inconsistencies, and missing values.
  • Maintain Data Confidentiality: Protect the privacy and confidentiality of participants’ data by implementing appropriate security measures. Ensure compliance with data protection regulations and obtain informed consent from participants.
  • Document the Process: Keep detailed documentation of the data collection process, including any deviations from the original plan, challenges encountered, and decisions made. This documentation facilitates transparency, replicability, and future analysis.
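The quality checks mentioned under "Ensure Data Quality" can be partly automated. The following is a rough Python sketch, assuming each response is stored as a dictionary; the field names (`age`, `satisfaction`) and the valid ranges are hypothetical and would come from your own instrument.

```python
def find_quality_issues(records, required_fields, valid_ranges):
    """Return (record_index, field, problem) tuples for missing or out-of-range values."""
    issues = []
    for index, record in enumerate(records):
        # Flag required fields that are absent or left blank.
        for field in required_fields:
            value = record.get(field)
            if value is None or value == "":
                issues.append((index, field, "missing"))
        # Flag numeric values that fall outside the allowed range.
        for field, (low, high) in valid_ranges.items():
            value = record.get(field)
            if value is None or value == "":
                continue  # already reported as missing if the field is required
            if not (low <= value <= high):
                issues.append((index, field, "out of range"))
    return issues

# Hypothetical survey records (illustrative data only).
records = [
    {"age": 34, "satisfaction": 4},
    {"age": None, "satisfaction": 5},
    {"age": 29, "satisfaction": 9},
]
issues = find_quality_issues(
    records,
    required_fields=["age", "satisfaction"],
    valid_ranges={"age": (18, 99), "satisfaction": (1, 5)},
)
```

A report like this only locates suspect entries; deciding whether to correct, re-collect, or exclude them remains a judgment for the researcher and should be documented.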

FAQs about Data Collection

  • What are secondary sources of data collection? Secondary sources are data that have already been gathered by someone else and are available for you to use as a researcher. These sources can include published research papers, government reports, statistical databases, and other existing datasets.
  • What are the primary sources of data collection? Primary sources involve collecting data directly from the original, firsthand source. You can do this through surveys, interviews, observations, experiments, or other direct interactions with individuals or subjects of study.
  • How many types of data are there? There are two main types of data: qualitative and quantitative. Qualitative data is non-numeric and it includes information in the form of words, images, or descriptions. Quantitative data, on the other hand, is numeric and you can measure and analyze it statistically.

  • Can J Hosp Pharm
  • v.68(3); May-Jun 2015


Qualitative Research: Data Collection, Analysis, and Management


In an earlier paper, 1 we presented an introduction to using qualitative research methods in pharmacy practice. In this article, we review some principles of the collection, analysis, and management of qualitative data to help pharmacists interested in doing research in their practice to continue their learning in this area. Qualitative research can help researchers to access the thoughts and feelings of research participants, which can enable development of an understanding of the meaning that people ascribe to their experiences. Whereas quantitative research methods can be used to determine how many people undertake particular behaviours, qualitative methods can help researchers to understand how and why such behaviours take place. Within the context of pharmacy practice research, qualitative approaches have been used to examine a diverse array of topics, including the perceptions of key stakeholders regarding prescribing by pharmacists and the postgraduation employment experiences of young pharmacists (see “Further Reading” section at the end of this article).

In the previous paper, 1 we outlined 3 commonly used methodologies: ethnography 2 , grounded theory 3 , and phenomenology. 4 Briefly, ethnography involves researchers using direct observation to study participants in their “real life” environment, sometimes over extended periods. Grounded theory and its later modified versions (e.g., Strauss and Corbin 5 ) use face-to-face interviews and interactions such as focus groups to explore a particular research phenomenon and may help in clarifying a less-well-understood problem, situation, or context. Phenomenology shares some features with grounded theory (such as an exploration of participants’ behaviour) and uses similar techniques to collect data, but it focuses on understanding how human beings experience their world. It gives researchers the opportunity to put themselves in another person’s shoes and to understand the subjective experiences of participants. 6 Some researchers use qualitative methodologies but adopt a different standpoint, and an example of this appears in the work of Thurston and others, 7 discussed later in this paper.

Qualitative work requires reflection on the part of researchers, both before and during the research process, as a way of providing context and understanding for readers. When being reflexive, researchers should not try to simply ignore or avoid their own biases (as this would likely be impossible); instead, reflexivity requires researchers to reflect upon and clearly articulate their position and subjectivities (world view, perspectives, biases), so that readers can better understand the filters through which questions were asked, data were gathered and analyzed, and findings were reported. From this perspective, bias and subjectivity are not inherently negative but they are unavoidable; as a result, it is best that they be articulated up-front in a manner that is clear and coherent for readers.


What qualitative study seeks to convey is why people have thoughts and feelings that might affect the way they behave. Such study may occur in any number of contexts, but here, we focus on pharmacy practice and the way people behave with regard to medicines use (e.g., to understand patients’ reasons for nonadherence with medication therapy or to explore physicians’ resistance to pharmacists’ clinical suggestions). As we suggested in our earlier article, 1 an important point about qualitative research is that there is no attempt to generalize the findings to a wider population. Qualitative research is used to gain insights into people’s feelings and thoughts, which may provide the basis for a future stand-alone qualitative study or may help researchers to map out survey instruments for use in a quantitative study. It is also possible to use different types of research in the same study, an approach known as “mixed methods” research, and further reading on this topic may be found at the end of this paper.

The role of the researcher in qualitative research is to attempt to access the thoughts and feelings of study participants. This is not an easy task, as it involves asking people to talk about things that may be very personal to them. Sometimes the experiences being explored are fresh in the participant’s mind, whereas on other occasions reliving past experiences may be difficult. However the data are being collected, a primary responsibility of the researcher is to safeguard participants and their data. Mechanisms for such safeguarding must be clearly articulated to participants and must be approved by a relevant research ethics review board before the research begins. Researchers and practitioners new to qualitative research should seek advice from an experienced qualitative researcher before embarking on their project.


Whatever philosophical standpoint the researcher is taking and whatever the data collection method (e.g., focus group, one-to-one interviews), the process will involve the generation of large amounts of data. In addition to the variety of study methodologies available, there are also different ways of making a record of what is said and done during an interview or focus group, such as taking handwritten notes or video-recording. If the researcher is audio- or video-recording data collection, then the recordings must be transcribed verbatim before data analysis can begin. As a rough guide, it can take an experienced researcher/transcriber 8 hours to transcribe one 45-minute audio-recorded interview, a process that will generate 20–30 pages of written dialogue.
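To see what this rough guide implies for a whole project, the figures can be scaled up. The Python sketch below assumes that transcription time and page count grow linearly with recording length, which is a simplification made for this example rather than a claim from the guide itself; the function name is hypothetical.

```python
# Figures from the rough guide: one 45-minute interview takes about
# 8 hours to transcribe and yields 20-30 pages.
REFERENCE_MINUTES = 45
HOURS_PER_REFERENCE = 8
PAGES_LOW, PAGES_HIGH = 20, 30

def estimate_transcription(total_audio_minutes):
    """Estimate transcription hours and page range, assuming linear scaling."""
    scale = total_audio_minutes / REFERENCE_MINUTES
    return {
        "hours": scale * HOURS_PER_REFERENCE,
        "pages_low": scale * PAGES_LOW,
        "pages_high": scale * PAGES_HIGH,
    }

# Ten 45-minute interviews = 450 minutes of audio.
estimate = estimate_transcription(450)
```

Under that assumption, ten 45-minute interviews come to about 80 hours of transcription and 200 to 300 pages of text, which is worth budgeting for before data collection starts.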

Many researchers will also maintain a folder of “field notes” to complement audio-taped interviews. Field notes allow the researcher to maintain and comment upon impressions, environmental contexts, behaviours, and nonverbal cues that may not be adequately captured through the audio-recording; they are typically handwritten in a small notebook at the same time the interview takes place. Field notes can provide important context to the interpretation of audio-taped data and can help remind the researcher of situational factors that may be important during data analysis. Such notes need not be formal, but they should be maintained and secured in a similar manner to audio tapes and transcripts, as they contain sensitive information and are relevant to the research. For more information about collecting qualitative data, please see the “Further Reading” section at the end of this paper.


If, as suggested earlier, doing qualitative research is about putting oneself in another person’s shoes and seeing the world from that person’s perspective, the most important part of data analysis and management is to be true to the participants. It is their voices that the researcher is trying to hear, so that they can be interpreted and reported on for others to read and learn from. To illustrate this point, consider the anonymized transcript excerpt presented in Appendix 1 , which is taken from a research interview conducted by one of the authors (J.S.). We refer to this excerpt throughout the remainder of this paper to illustrate how data can be managed, analyzed, and presented.

Interpretation of Data

Interpretation of the data will depend on the theoretical standpoint taken by researchers. For example, the title of the research report by Thurston and others, 7 “Discordant indigenous and provider frames explain challenges in improving access to arthritis care: a qualitative study using constructivist grounded theory,” indicates at least 2 theoretical standpoints. The first is the culture of the indigenous population of Canada and the place of this population in society, and the second is the social constructivist theory used in the constructivist grounded theory method. With regard to the first standpoint, it can be surmised that, to have decided to conduct the research, the researchers must have felt that there was anecdotal evidence of differences in access to arthritis care for patients from indigenous and non-indigenous backgrounds. With regard to the second standpoint, it can be surmised that the researchers used social constructivist theory because it assumes that behaviour is socially constructed; in other words, people do things because of the expectations of those in their personal world or in the wider society in which they live. (Please see the “Further Reading” section for resources providing more information about social constructivist theory and reflexivity.) Thus, these 2 standpoints (and there may have been others relevant to the research of Thurston and others 7 ) will have affected the way in which these researchers interpreted the experiences of the indigenous population participants and those providing their care. Another standpoint is feminist standpoint theory which, among other things, focuses on marginalized groups in society. Such theories are helpful to researchers, as they enable us to think about things from a different perspective. Being aware of the standpoints you are taking in your own research is one of the foundations of qualitative work. Without such awareness, it is easy to slip into interpreting other people’s narratives from your own viewpoint, rather than that of the participants.

To analyze the example in Appendix 1 , we will adopt a phenomenological approach because we want to understand how the participant experienced the illness and we want to try to see the experience from that person’s perspective. It is important for the researcher to reflect upon and articulate his or her starting point for such analysis; in this example, the coder could reflect upon her own experience as a female of a majority ethnocultural group who has lived within middle class and upper middle class settings. This personal history therefore forms the filter through which the data will be examined. This filter does not diminish the quality or significance of the analysis, since every researcher has his or her own filters; however, by explicitly stating and acknowledging what these filters are, the researcher makes it easier for readers to contextualize the work.

Transcribing and Checking

For the purposes of this paper it is assumed that interviews or focus groups have been audio-recorded. As mentioned above, transcribing is an arduous process, even for the most experienced transcribers, but it must be done to convert the spoken word to the written word to facilitate analysis. For anyone new to conducting qualitative research, it is beneficial to transcribe at least one interview and one focus group. It is only by doing this that researchers realize how difficult the task is, and this realization affects their expectations when asking others to transcribe. If the research project has sufficient funding, then a professional transcriber can be hired to do the work. If this is the case, then it is a good idea to sit down with the transcriber, if possible, and talk through the research and what the participants were talking about. This background knowledge for the transcriber is especially important in research in which people are using jargon or medical terms (as in pharmacy practice). Involving your transcriber in this way makes the work both easier and more rewarding, as he or she will feel part of the team. Transcription editing software is also available, but it is expensive. For example, ELAN (more formally known as EUDICO Linguistic Annotator, developed at the Technical University of Berlin) 8 is a tool that can help keep data organized by linking media and data files (particularly valuable if, for example, video-taping of interviews is complemented by transcriptions). It can also be helpful in searching complex data sets. Products such as ELAN do not actually automatically transcribe interviews or complete analyses, and they do require some time and effort to learn; nonetheless, for some research applications, it may be valuable to consider such software tools.

All audio recordings should be transcribed verbatim, regardless of how intelligible the transcript may be when it is read back. Lines of text should be numbered. Once the transcription is complete, the researcher should read it while listening to the recording and do the following: correct any spelling or other errors; anonymize the transcript so that the participant cannot be identified from anything that is said (e.g., names, places, significant events); insert notations for pauses, laughter, looks of discomfort; insert any punctuation, such as commas and full stops (periods) (see Appendix 1 for examples of inserted punctuation); and include any other contextual information that might have affected the participant (e.g., temperature or comfort of the room).
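Two of these steps, numbering lines and anonymizing identifying terms, lend themselves to a small script. The Python sketch below is only an aid, not a substitute for reading the transcript against the recording; the function names and the sample name and place ("Patel", "Leeds") are hypothetical, and the "XXX" placeholder simply mirrors the style used in Appendix 1.

```python
import re

def anonymize(text, identifying_terms, placeholder="XXX"):
    """Replace each identifying term (whole words, any case) with a placeholder."""
    for term in identifying_terms:
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", flags=re.IGNORECASE)
        text = pattern.sub(placeholder, text)
    return text

def number_lines(lines, start=1):
    """Prefix each transcript line with its line number."""
    return [f"{number} {line}" for number, line in enumerate(lines, start=start)]

# Hypothetical raw transcript lines (illustrative only).
raw_lines = [
    "I saw Dr Patel on the ward.",
    "That was in Leeds [pause] a long time ago.",
]
cleaned = number_lines([anonymize(line, ["Patel", "Leeds"]) for line in raw_lines])
```

The list of identifying terms has to be compiled by the researcher for each transcript, and a manual read-through is still needed to catch indirect identifiers such as significant events.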

Dealing with the transcription of a focus group is slightly more difficult, as multiple voices are involved. One way of transcribing such data is to “tag” each voice (e.g., Voice A, Voice B). In addition, the focus group will usually have 2 facilitators, whose respective roles will help in making sense of the data. While one facilitator guides participants through the topic, the other can make notes about context and group dynamics. More information about group dynamics and focus groups can be found in resources listed in the “Further Reading” section.

Reading between the Lines

During the process outlined above, the researcher can begin to get a feel for the participant’s experience of the phenomenon in question and can start to think about things that could be pursued in subsequent interviews or focus groups (if appropriate). In this way, one participant’s narrative informs the next, and the researcher can continue to interview until nothing new is being heard or, as it says in the textbooks, “saturation is reached”. While continuing with the processes of coding and theming (described in the next 2 sections), it is important to consider not just what the person is saying but also what they are not saying. For example, is a lengthy pause an indication that the participant is finding the subject difficult, or is the person simply deciding what to say? The aim of the whole process from data collection to presentation is to tell the participants’ stories using exemplars from their own narratives, thus grounding the research findings in the participants’ lived experiences.

Smith 9 suggested a qualitative research method known as interpretative phenomenological analysis, which has 2 basic tenets: first, that it is rooted in phenomenology, attempting to understand the meaning that individuals ascribe to their lived experiences, and second, that the researcher must attempt to interpret this meaning in the context of the research. That the researcher has some knowledge and expertise in the subject of the research means that he or she can have considerable scope in interpreting the participant’s experiences. Larkin and others 10 discussed the importance of not just providing a description of what participants say. Rather, interpretative phenomenological analysis is about getting underneath what a person is saying to try to truly understand the world from his or her perspective.

Once all of the research interviews have been transcribed and checked, it is time to begin coding. Field notes compiled during an interview can be a useful complementary source of information to facilitate this process, as the gap in time between an interview, transcribing, and coding can result in memory bias regarding nonverbal or environmental context issues that may affect interpretation of data.

Coding refers to the identification of topics, issues, similarities, and differences that are revealed through the participants’ narratives and interpreted by the researcher. This process enables the researcher to begin to understand the world from each participant’s perspective. Coding can be done by hand on a hard copy of the transcript, by making notes in the margin or by highlighting and naming sections of text. More commonly, researchers use qualitative research software (e.g., NVivo, QSR International Pty Ltd; www.qsrinternational.com/products_nvivo.aspx ) to help manage their transcriptions. It is advised that researchers undertake a formal course in the use of such software or seek supervision from a researcher experienced in these tools.
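For readers who want to picture how codes attach to a transcript, here is a minimal Python sketch of one possible record structure. It is not how NVivo or any other package stores codes; the class and function names are hypothetical, and the two example codes and line ranges are taken from the discussion of Appendix 1 in this paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CodedSegment:
    """One code applied to a range of numbered transcript lines."""
    start_line: int
    end_line: int
    code: str
    note: str = ""  # the researcher's interpretive comment, if any

def codes_covering(segments, line_number):
    """Return the codes whose line range includes the given line."""
    return [s.code for s in segments if s.start_line <= line_number <= s.end_line]

segments = [
    CodedSegment(8, 11, "diagnosis of mental health condition",
                 note="descriptive only; pauses may signal difficulty recalling"),
    CodedSegment(19, 19, "health care professionals' consultation skills"),
]
```

Keeping a free-text note beside each code leaves room for the deeper reading that a purely descriptive label would miss.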

Returning to Appendix 1 and reading from lines 8–11, a code for this section might be “diagnosis of mental health condition”, but this would just be a description of what the participant is talking about at that point. If we read a little more deeply, we can ask ourselves how the participant might have come to feel that the doctor assumed he or she was aware of the diagnosis or indeed that they had only just been told the diagnosis. There are a number of pauses in the narrative that might suggest the participant is finding it difficult to recall that experience. Later in the text, the participant says “nobody asked me any questions about my life” (line 19). This could be coded simply as “health care professionals’ consultation skills”, but that would not reflect how the participant must have felt never to be asked anything about his or her personal life, about the participant as a human being. At the end of this excerpt, the participant just trails off, recalling that no-one showed any interest, which makes for very moving reading. For practitioners in pharmacy, it might also be pertinent to explore the participant’s experience of akathisia and why this was left untreated for 20 years.

One of the questions that arises about qualitative research relates to the reliability of the interpretation and representation of the participants’ narratives. There are no statistical tests that can be used to check reliability and validity as there are in quantitative research. However, work by Lincoln and Guba 11 suggests that there are other ways to “establish confidence in the ‘truth’ of the findings” (p. 218). They call this confidence “trustworthiness” and suggest that there are 4 criteria of trustworthiness: credibility (confidence in the “truth” of the findings), transferability (showing that the findings have applicability in other contexts), dependability (showing that the findings are consistent and could be repeated), and confirmability (the extent to which the findings of a study are shaped by the respondents and not researcher bias, motivation, or interest).

One way of establishing the “credibility” of the coding is to ask another researcher to code the same transcript and then to discuss any similarities and differences in the 2 resulting sets of codes. This simple act can result in revisions to the codes and can help to clarify and confirm the research findings.
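A simple way to prepare for that discussion is to list which codes the two researchers share and which appear in only one set. The Python sketch below assumes each coder's output is a collection of code labels; the function name and the third label ("side effects left untreated") are hypothetical. It is a bookkeeping aid for the conversation, not a statistical test of reliability.

```python
def compare_code_sets(codes_a, codes_b):
    """Summarize overlap between two coders' code labels for the same transcript."""
    a, b = set(codes_a), set(codes_b)
    return {
        "shared": sorted(a & b),   # codes both researchers applied
        "only_a": sorted(a - b),   # codes only the first researcher applied
        "only_b": sorted(b - a),   # codes only the second researcher applied
    }

comparison = compare_code_sets(
    ["diagnosis of mental health condition", "not being listened to"],
    ["diagnosis of mental health condition", "side effects left untreated"],
)
```

Codes that appear in only one set are the natural starting points for the conversation between the two researchers.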

Theming refers to the drawing together of codes from one or more transcripts to present the findings of qualitative research in a coherent and meaningful way. For example, there may be examples across participants’ narratives of the way in which they were treated in hospital, such as “not being listened to” or “lack of interest in personal experiences” (see Appendix 1 ). These may be drawn together as a theme running through the narratives that could be named “the patient’s experience of hospital care”. The importance of going through this process is that at its conclusion, it will be possible to present the data from the interviews using quotations from the individual transcripts to illustrate the source of the researchers’ interpretations. Thus, when the findings are organized for presentation, each theme can become the heading of a section in the report or presentation. Underneath each theme will be the codes, examples from the transcripts, and the researcher’s own interpretation of what the themes mean. Implications for real life (e.g., the treatment of people with chronic mental health problems) should also be given.
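The layout described here, with each theme as a heading and its codes and quotations underneath, can be sketched as a nested structure. The Python example below is one possible arrangement, with hypothetical function and variable names; the theme and code labels come from the example above, the first quotation is from Appendix 1, and the second is a placeholder rather than real data.

```python
from collections import defaultdict

def group_codes_into_themes(coded_quotes, code_to_theme):
    """Build {theme: {code: [quotations]}} from (code, quotation) pairs."""
    themes = defaultdict(lambda: defaultdict(list))
    for code, quote in coded_quotes:
        theme = code_to_theme.get(code, "unassigned")  # codes not yet themed
        themes[theme][code].append(quote)
    return {theme: dict(codes) for theme, codes in themes.items()}

code_to_theme = {
    "not being listened to": "the patient's experience of hospital care",
    "lack of interest in personal experiences": "the patient's experience of hospital care",
}
coded_quotes = [
    ("lack of interest in personal experiences",
     "nobody asked me any questions about my life"),
    ("not being listened to", "(quotation from another transcript)"),
]
report_outline = group_codes_into_themes(coded_quotes, code_to_theme)
```

The researcher's own interpretation of each theme is then written around the quotations that this outline brings together.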


In this final section of this paper, we describe some ways of drawing together or “synthesizing” research findings to represent, as faithfully as possible, the meaning that participants ascribe to their life experiences. This synthesis is the aim of the final stage of qualitative research. For most readers, the synthesis of data presented by the researcher is of crucial significance—this is usually where “the story” of the participants can be distilled, summarized, and told in a manner that is both respectful to those participants and meaningful to readers. There are a number of ways in which researchers can synthesize and present their findings, but any conclusions drawn by the researchers must be supported by direct quotations from the participants. In this way, it is made clear to the reader that the themes under discussion have emerged from the participants’ interviews and not the mind of the researcher. The work of Latif and others 12 gives an example of how qualitative research findings might be presented.

Planning and Writing the Report

As has been suggested above, if researchers code and theme their material appropriately, they will naturally find the headings for sections of their report. Qualitative researchers tend to report “findings” rather than “results”, as the latter term typically implies that the data have come from a quantitative source. The final presentation of the research will usually be in the form of a report or a paper and so should follow accepted academic guidelines. In particular, the article should begin with an introduction, including a literature review and rationale for the research. There should be a section on the chosen methodology and a brief discussion about why qualitative methodology was most appropriate for the study question and why one particular methodology (e.g., interpretative phenomenological analysis rather than grounded theory) was selected to guide the research. The method itself should then be described, including ethics approval, choice of participants, mode of recruitment, and method of data collection (e.g., semistructured interviews or focus groups), followed by the research findings, which will be the main body of the report or paper. The findings should be written as if a story is being told; as such, it is not necessary to have a lengthy discussion section at the end. This is because much of the discussion will take place around the participants’ quotes, such that all that is needed to close the report or paper is a summary, limitations of the research, and the implications that the research has for practice. As stated earlier, it is not the intention of qualitative research to allow the findings to be generalized, and therefore this is not, in itself, a limitation.

Planning out the way that findings are to be presented is helpful. It is useful to insert the headings of the sections (the themes) and then make a note of the codes that exemplify the thoughts and feelings of your participants. It is generally advisable to put in the quotations that you want to use for each theme, using each quotation only once. After all this is done, the telling of the story can begin as you give your voice to the experiences of the participants, writing around their quotations. Do not be afraid to draw assumptions from the participants’ narratives, as this is necessary to give an in-depth account of the phenomena in question. Discuss these assumptions, drawing on your participants’ words to support you as you move from one code to another and from one theme to the next. Finally, as appropriate, it is possible to include examples from literature or policy documents that add support for your findings. As an exercise, you may wish to code and theme the sample excerpt in Appendix 1 and tell the participant’s story in your own way. Further reading about “doing” qualitative research can be found at the end of this paper.


Qualitative research can help researchers to access the thoughts and feelings of research participants, which can enable development of an understanding of the meaning that people ascribe to their experiences. It can be used in pharmacy practice research to explore how patients feel about their health and their treatment. Qualitative research has been used by pharmacists to explore a variety of questions and problems (see the “Further Reading” section for examples). An understanding of these issues can help pharmacists and other health care professionals to tailor health care to match the individual needs of patients and to develop a concordant relationship. Doing qualitative research is not easy and may require a complete rethink of how research is conducted, particularly for researchers who are more familiar with quantitative approaches. There are many ways of conducting qualitative research, and this paper has covered some of the practical issues regarding data collection, analysis, and management. Further reading around the subject will be essential to truly understand this method of accessing people’s thoughts and feelings to enable researchers to tell participants’ stories.

Appendix 1. Excerpt from a sample transcript

The participant (age late 50s) had suffered from a chronic mental health illness for 30 years. The participant had become a “revolving door patient,” someone who is frequently in and out of hospital. As the participant talked about past experiences, the researcher asked:

  • What was treatment like 30 years ago?
  • Umm—well it was pretty much they could do what they wanted with you because I was put into the er, the er kind of system er, I was just on endless section threes.
  • Really…
  • But what I didn’t realize until later was that if you haven’t actually posed a threat to someone or yourself they can’t really do that but I didn’t know that. So wh-when I first went into hospital they put me on the forensic ward ’cause they said, “We don’t think you’ll stay here we think you’ll just run-run away.” So they put me then onto the acute admissions ward and – er – I can remember one of the first things I recall when I got onto that ward was sitting down with a er a Dr XXX. He had a book this thick [gestures] and on each page it was like three questions and he went through all these questions and I answered all these questions. So we’re there for I don’t maybe two hours doing all that and he asked me he said “well when did somebody tell you then that you have schizophrenia” I said “well nobody’s told me that” so he seemed very surprised but nobody had actually [pause] whe-when I first went up there under police escort erm the senior kind of consultants people I’d been to where I was staying and ermm so er [pause] I . . . the, I can remember the very first night that I was there and given this injection in this muscle here [gestures] and just having dreadful side effects the next day I woke up [pause]
  • . . . and I suffered that akathesia I swear to you, every minute of every day for about 20 years.
  • Oh how awful.
  • And that side of it just makes life impossible so the care on the wards [pause] umm I don’t know it’s kind of, it’s kind of hard to put into words [pause]. Because I’m not saying they were sort of like not friendly or interested but then nobody ever seemed to want to talk about your life [pause] nobody asked me any questions about my life. The only questions that came into was they asked me if I’d be a volunteer for these student exams and things and I said “yeah” so all the questions were like “oh what jobs have you done,” er about your relationships and things and er but nobody actually sat down and had a talk and showed some interest in you as a person you were just there basically [pause] um labelled and you know there was there was [pause] but umm [pause] yeah . . .

This article is the 10th in the CJHP Research Primer Series, an initiative of the CJHP Editorial Board and the CSHP Research Committee. The planned 2-year series is intended to appeal to relatively inexperienced researchers, with the goal of building research capacity among practising pharmacists. The articles, presenting simple but rigorous guidance to encourage and support novice researchers, are being solicited from authors with appropriate expertise.

Previous articles in this series:

Bond CM. The research jigsaw: how to get started. Can J Hosp Pharm. 2014;67(1):28–30.

Tully MP. Research: articulating questions, generating hypotheses, and choosing study designs. Can J Hosp Pharm. 2014;67(1):31–4.

Loewen P. Ethical issues in pharmacy practice research: an introductory guide. Can J Hosp Pharm. 2014;67(2):133–7.

Tsuyuki RT. Designing pharmacy practice research trials. Can J Hosp Pharm. 2014;67(3):226–9.

Bresee LC. An introduction to developing surveys for pharmacy practice research. Can J Hosp Pharm. 2014;67(4):286–91.

Gamble JM. An introduction to the fundamentals of cohort and case–control studies. Can J Hosp Pharm. 2014;67(5):366–72.

Austin Z, Sutton J. Qualitative research: getting started. Can J Hosp Pharm. 2014;67(6):436–40.

Houle S. An introduction to the fundamentals of randomized controlled trials in pharmacy research. Can J Hosp Pharm. 2015;68(1):28–32.

Charrois TL. Systematic reviews: What do you need to know to get started? Can J Hosp Pharm. 2015;68(2):144–8.

Competing interests: None declared.

Further Reading

Examples of qualitative research in pharmacy practice.

  • Farrell B, Pottie K, Woodend K, Yao V, Dolovich L, Kennie N, et al. Shifts in expectations: evaluating physicians’ perceptions as pharmacists integrated into family practice. J Interprof Care. 2010;24(1):80–9.
  • Gregory P, Austin Z. Postgraduation employment experiences of new pharmacists in Ontario in 2012–2013. Can Pharm J. 2014;147(5):290–9.
  • Marks PZ, Jennings B, Farrell B, Kennie-Kaulbach N, Jorgenson D, Pearson-Sharpe J, et al. “I gained a skill and a change in attitude”: a case study describing how an online continuing professional education course for pharmacists supported achievement of its transfer to practice outcomes. Can J Univ Contin Educ. 2014;40(2):1–18.
  • Nair KM, Dolovich L, Brazil K, Raina P. It’s all about relationships: a qualitative study of health researchers’ perspectives on interdisciplinary research. BMC Health Serv Res. 2008;8:110.
  • Pojskic N, MacKeigan L, Boon H, Austin Z. Initial perceptions of key stakeholders in Ontario regarding independent prescriptive authority for pharmacists. Res Soc Adm Pharm. 2014;10(2):341–54.

Qualitative Research in General

  • Breakwell GM, Hammond S, Fife-Schaw C. Research methods in psychology. Thousand Oaks (CA): Sage Publications; 1995.
  • Given LM. 100 questions (and answers) about qualitative research. Thousand Oaks (CA): Sage Publications; 2015.
  • Miles B, Huberman AM. Qualitative data analysis. Thousand Oaks (CA): Sage Publications; 2009.
  • Patton M. Qualitative research and evaluation methods. Thousand Oaks (CA): Sage Publications; 2002.
  • Willig C. Introducing qualitative research in psychology. Buckingham (UK): Open University Press; 2001.

Group Dynamics in Focus Groups

  • Farnsworth J, Boon B. Analysing group dynamics within the focus group. Qual Res. 2010;10(5):605–24.

Social Constructivism

  • Social constructivism. Berkeley (CA): University of California, Berkeley, Berkeley Graduate Division, Graduate Student Instruction Teaching & Resource Center; [cited 2015 June 4]. Available from: http://gsi.berkeley.edu/gsi-guide-contents/learning-theory-research/social-constructivism/

Mixed Methods

  • Creswell J. Research design: qualitative, quantitative, and mixed methods approaches. Thousand Oaks (CA): Sage Publications; 2009.

Collecting Qualitative Data

  • Arksey H, Knight P. Interviewing for social scientists: an introductory resource with examples. Thousand Oaks (CA): Sage Publications; 1999.
  • Guest G, Namey EE, Mitchell ML. Collecting qualitative data: a field manual for applied research. Thousand Oaks (CA): Sage Publications; 2013.

Constructivist Grounded Theory

  • Charmaz K. Grounded theory: objectivist and constructivist methods. In: Denzin N, Lincoln Y, editors. Handbook of qualitative research. 2nd ed. Thousand Oaks (CA): Sage Publications; 2000. pp. 509–35.

Grad Coach

What Is Research Methodology? A Plain-Language Explanation & Definition (With Examples)

By Derek Jansen (MBA) and Kerryn Warren (PhD) | June 2020 (Last updated April 2023)

If you’re new to formal academic research, it’s quite likely that you’re feeling a little overwhelmed by all the technical lingo that gets thrown around. And who could blame you – “research methodology”, “research methods”, “sampling strategies”… it all seems never-ending!

In this post, we’ll demystify the landscape with plain-language explanations and loads of examples (including easy-to-follow videos), so that you can approach your dissertation, thesis or research project with confidence. Let’s get started.

Research Methodology 101

  • What exactly research methodology means
  • What qualitative, quantitative and mixed methods are
  • What sampling strategy is
  • What data collection methods are
  • What data analysis methods are
  • How to choose your research methodology
  • Example of a research methodology

What is research methodology?

Research methodology simply refers to the practical “how” of a research study. More specifically, it’s about how a researcher systematically designs a study to ensure valid and reliable results that address the research aims, objectives and research questions. In particular, it covers how the researcher went about deciding:

  • What type of data to collect (e.g., qualitative or quantitative data)
  • Who to collect it from (i.e., the sampling strategy)
  • How to collect it (i.e., the data collection method)
  • How to analyse it (i.e., the data analysis methods)

Within any formal piece of academic research (be it a dissertation, thesis or journal article), you’ll find a research methodology chapter or section which covers the aspects mentioned above. Importantly, a good methodology chapter explains not just what methodological choices were made, but also why they were made. In other words, the methodology chapter should justify the design choices, by showing that the chosen methods and techniques are the best fit for the research aims, objectives and research questions.

So, it’s the same as research design?

Not quite. As we mentioned, research methodology refers to the collection of practical decisions regarding what data you’ll collect, from whom, how you’ll collect it and how you’ll analyse it. Research design, on the other hand, is more about the overall strategy you’ll adopt in your study. For example, research design covers whether you’ll use an experimental design in which you manipulate one variable while controlling others.

What are qualitative, quantitative and mixed-methods?

Qualitative, quantitative and mixed-methods are different types of methodological approaches, distinguished by their focus on words, numbers or both. This is a bit of an oversimplification, but it’s a good starting point for understanding.

Let’s take a closer look.

Qualitative research refers to research which focuses on collecting and analysing words (written or spoken) and textual or visual data, whereas quantitative research focuses on measurement and testing using numerical data. Qualitative analysis can also focus on other “softer” data points, such as body language or visual elements.

It’s quite common for a qualitative methodology to be used when the research aims and research questions are exploratory in nature. For example, a qualitative methodology might be used to understand people’s perceptions about an event that took place, or a political candidate running for president.

In contrast, a quantitative methodology is typically used when the research aims and research questions are confirmatory in nature. For example, a quantitative methodology might be used to measure the relationship between two variables (e.g. personality type and likelihood to commit a crime) or to test a set of hypotheses.

As you’ve probably guessed, the mixed-method methodology attempts to combine the best of both qualitative and quantitative methodologies to integrate perspectives and create a rich picture.

What is sampling strategy?

Simply put, sampling is about deciding who (or where) you’re going to collect your data from. Why does this matter? Well, generally it’s not possible to collect data from every single person in your group of interest (this is called the “population”), so you’ll need to engage a smaller portion of that group that’s accessible and manageable (this is called the “sample”).

How you go about selecting the sample (i.e., your sampling strategy) will have a major impact on your study. There are many different sampling methods you can choose from, but the two overarching categories are probability sampling and non-probability sampling.

Probability sampling involves drawing a random sample from the group of people you’re interested in, so that every member has a known chance of being selected. This is comparable to throwing the names of all potential participants into a hat, shaking it up, and picking out the “winners”. By using a random sample, you’ll minimise the risk of selection bias and the results of your study will be more generalisable to the entire population.

Non-probability sampling, on the other hand, doesn’t use a random sample. For example, it might involve using a convenience sample, which means you’d only interview or survey people that you have access to (perhaps your friends, family or work colleagues), rather than a truly random sample. With non-probability sampling, the results are typically not generalisable.
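To make the contrast between the two categories concrete, here is a minimal Python sketch. The population of participant IDs, the sample size and the function names are all hypothetical, invented purely for illustration. The first function mimics the "names in a hat" draw (simple random sampling, one form of probability sampling), while the second simply takes whoever is easiest to reach (a convenience sample).

```python
import random

def simple_random_sample(population, n, seed=None):
    """Probability sampling: draw n distinct members uniformly at random."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(population, n)

def convenience_sample(population, n):
    """Non-probability sampling: take the first n members you can reach."""
    return population[:n]

# Hypothetical population of 100 participant IDs (P001 ... P100).
population = [f"P{i:03d}" for i in range(1, 101)]

random_sample = simple_random_sample(population, 10, seed=42)
easy_sample = convenience_sample(population, 10)

print("Random sample:     ", random_sample)
print("Convenience sample:", easy_sample)
```

In the random draw, every ID has the same chance of ending up in the sample. The convenience sample always returns the same first ten IDs, which is where the risk of selection bias comes from.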

What are data collection methods?

As the name suggests, data collection methods simply refer to the way in which you go about collecting the data for your study. Some of the most common data collection methods include:

  • Interviews (which can be unstructured, semi-structured or structured)
  • Focus groups and group interviews
  • Surveys (online or physical surveys)
  • Observations (watching and recording activities)
  • Biophysical measurements (e.g., blood pressure, heart rate, etc.)
  • Documents and records (e.g., financial reports, court records, etc.)

The choice of which data collection method to use depends on your overall research aims and research questions, as well as practicalities and resource constraints. For example, if your research is exploratory in nature, qualitative methods such as interviews and focus groups would likely be a good fit. Conversely, if your research aims to measure specific variables or test hypotheses, large-scale surveys that produce large volumes of numerical data would likely be a better fit.

What are data analysis methods?

Data analysis methods refer to the methods and techniques that you’ll use to make sense of your data. These can be grouped according to whether the research is qualitative (words-based) or quantitative (numbers-based).

Popular data analysis methods in qualitative research include:

  • Qualitative content analysis
  • Thematic analysis
  • Discourse analysis
  • Narrative analysis
  • Interpretative phenomenological analysis (IPA)
  • Visual analysis (of photographs, videos, art, etc.)

Qualitative data analysis begins with data coding, after which an analysis method is applied. In some cases, more than one analysis method is used, depending on the research aims and research questions.
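As a rough illustration of what coded data can look like once the coding step is done, the sketch below tallies how often each code was applied and which participants it came from. Everything in it is hypothetical: the excerpts, the participant labels and the code names are invented for the example, and real projects typically manage codes in dedicated qualitative analysis software rather than a script.

```python
from collections import Counter, defaultdict

# Hypothetical coded segments: each excerpt carries the codes a researcher assigned.
coded_segments = [
    {"participant": "P1", "text": "I never felt listened to on the ward.",
     "codes": ["not_heard", "ward_experience"]},
    {"participant": "P1", "text": "The side effects made daily life very hard.",
     "codes": ["side_effects"]},
    {"participant": "P2", "text": "Staff were friendly but never asked about my life.",
     "codes": ["not_heard"]},
    {"participant": "P2", "text": "The medication left me restless for years.",
     "codes": ["side_effects"]},
    {"participant": "P3", "text": "Nobody explained my diagnosis to me.",
     "codes": ["not_heard", "diagnosis"]},
]

# How many segments received each code.
code_counts = Counter(code for seg in coded_segments for code in seg["codes"])

# Which participants each code appears for.
participants_per_code = defaultdict(set)
for seg in coded_segments:
    for code in seg["codes"]:
        participants_per_code[code].add(seg["participant"])

for code, count in code_counts.most_common():
    who = sorted(participants_per_code[code])
    print(f"{code}: {count} segment(s), participants {who}")
```

A tally like this is only a starting point. Grouping codes into themes and interpreting them is the job of the analysis method itself (for example, thematic analysis).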

Moving on to the quantitative side of things, popular data analysis methods in this type of research include:

  • Descriptive statistics (e.g. means, medians, modes)
  • Inferential statistics (e.g. correlation, regression, structural equation modelling)
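The two families of methods listed above can be sketched in a few lines of Python using only the standard library. The data below (hours studied and test scores for eight respondents) is made up for illustration, and the helper `pearson_r` is our own small function rather than part of any library. It computes the Pearson correlation coefficient as one simple example of examining the relationship between two variables.

```python
import math
import statistics

# Hypothetical survey data for 8 respondents.
hours = [2, 3, 3, 5, 6, 7, 8, 9]           # hours studied
scores = [50, 55, 53, 65, 70, 72, 80, 85]  # test scores

# Descriptive statistics: summarise a single variable.
mean_score = statistics.mean(scores)
median_score = statistics.median(scores)
mode_hours = statistics.mode(hours)

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariation / (spread_x * spread_y)

# Relationship between two variables.
r = pearson_r(hours, scores)

print(f"Mean score: {mean_score}, median score: {median_score}, modal hours: {mode_hours}")
print(f"Correlation between hours and scores: {r:.3f}")
```

A coefficient close to +1, as in this invented data set, indicates a strong positive linear relationship. Whether that relationship would hold in the wider population is what formal inferential tests are designed to assess.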

Again, the choice of which data analysis method to use depends on your overall research aims and objectives, as well as practicalities and resource constraints.

How do I choose a research methodology?

As you’ve probably picked up by now, your research aims and objectives have a major influence on the research methodology. So, the starting point for developing your research methodology is to take a step back and look at the big picture of your research, before you make methodology decisions. The first question you need to ask yourself is whether your research is exploratory or confirmatory in nature.

If your research aims and objectives are primarily exploratory in nature, your research will likely be qualitative and therefore you might consider qualitative data collection methods (e.g. interviews) and analysis methods (e.g. qualitative content analysis). 

Conversely, if your research aims and objectives are looking to measure or test something (i.e. they’re confirmatory), then your research will quite likely be quantitative in nature, and you might consider quantitative data collection methods (e.g. surveys) and analyses (e.g. statistical analysis).

Designing your research and working out your methodology is a large topic, which we cover extensively on the blog. For now, however, the key takeaway is that you should always start with your research aims, objectives and research questions (the golden thread). Every methodological choice you make needs to align with those three components.

Example of a research methodology chapter

In the video below, we provide a detailed walkthrough of a research methodology from an actual dissertation, as well as an overview of our free methodology template.

how do i reference this?


MLA Jansen, Derek, and Kerryn Warren. “What (Exactly) Is Research Methodology?” Grad Coach, June 2021, gradcoach.com/what-is-research-methodology/.

APA Jansen, D., & Warren, K. (2021, June). What (Exactly) Is Research Methodology? Grad Coach. https://gradcoach.com/what-is-research-methodology/



Atonisah Jonathan

Appreciate the presentation. Very useful step-by-step guidelines to follow.

Bello Suleiman

I appreciate sir


wow! This is super insightful for me. Thank you!

Emerita Guzman

Indeed this material is very helpful! Kudos writers/authors.


I want to say thank you very much, I got a lot of info and knowledge. Be blessed.

Akanji wasiu

I want present a seminar paper on Optimisation of Deep learning-based models on vulnerability detection in digital transactions.

Need assistance

Clement Lokwar

Dear Sir, I want to be assisted on my research on Sanitation and Water management in emergencies areas.

Peter Sone Kome

I am deeply grateful for the knowledge gained. I will be getting in touch shortly as I want to be assisted in my ongoing research.


The information shared is informative, crisp and clear. Kudos Team! And thanks a lot!

Bipin pokhrel

hello i want to study


Hello!! Grad coach teams. I am extremely happy in your tutorial or consultation. i am really benefited all material and briefing. Thank you very much for your generous helps. Please keep it up. If you add in your briefing, references for further reading, it will be very nice.


All I have to say is, thank u gyz.


Good, l thanks

Artak Ghonyan

thank you, it is very useful


How to align data collection to an organization or project’s overall mission 

Data Commons Guide

  • Learn how to design and use a logic model
  • Create a logic model for one of your projects or programs
  • Determine key metrics and data needed to evaluate your work
  • Determine your collection tool(s)
  • ‘So what’ and next steps

  • Exploring Program Logic
  • Logic Models

Guide Objectives

  • Understand the design and value of a logic model to optimize data collection
  • Learn how to create your own organizational or project logic models
  • Align data, metrics, and data collection objectives to the logic model(s)

Social impact organizations (SIOs) need to regularly collect data from their stakeholders and the communities they serve to evaluate their impact and processes. Collecting such data is often done actively via surveys, interviews, or focus groups. It can also be through passive data collection, such as from attendance lists, user product logs, webpage performance, etc.   With the various approaches to collecting data and the various requests for it (via clients, funders, and project teams), getting the data you need can be a messy process.  

Multiple data sources and varying evaluation needs can make systematic data collection difficult. This guide will teach you how to design a logic model to ensure your data collection is intentional, optimized, and aligned with your mission.   

Guide Specific Disclaimer

A logic model is both a means to align stakeholders around a desired impact and a starting point for developing key metrics and systematic evaluations for your organization’s work. For a logic model to become a valuable tool, it must be created in collaboration with key stakeholders. It’s designed to be flexible and should be periodically updated to reflect new insights or changes in your organization’s strategy. 

The first step is to understand the flow of a logic model and its key components. A logic model is a systematic framework that helps align the activities and data collection of an organization (or a project) with its overarching mission to deliver impact.  

You can review the resources in this step to better understand logic models and how they are developed. 

Key components of a logic model

Inputs: The resources that an organization commits to a program to produce the intended outputs, outcomes, and impact. Examples of resources are: people, technology, or equipment. 

Activities: The actions or events undergone using the documented inputs. Examples of activities are: holding a course, running an awareness campaign, or creating a data visualization. 

Outputs: The immediate result of a program’s activities. Examples of outputs are: number of course graduates, amount of revenue generated, or creation of a product. 

Outcomes: Socially meaningful changes that result from the outputs. Generally defined in terms of expected changes in knowledge, skills, attitudes, behavior, condition, or status. For example, if an output is the number of graduates of a program, the associated outcome could be a greater number of people attaining jobs in a certain sector.

Impact: The results that can be directly attributed to the outcomes of a given program or collection of programs. For example, if the outcome is a greater number of people attaining jobs, an associated impact could be a higher income per capita in a region.  
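The five components above can be pictured as one row of linked fields. The following is a minimal Python sketch, not part of the guide's template: the class name `LogicModelRow` and the method `is_complete` are hypothetical, and the example values simply reuse the examples given in the definitions above.

```python
from dataclasses import dataclass, fields


@dataclass
class LogicModelRow:
    """One row of a logic model (hypothetical structure, for illustration only)."""
    inputs: str
    activities: str
    outputs: str
    outcomes: str
    impact: str

    def is_complete(self) -> bool:
        # A row should connect every component, so no field may be blank.
        return all(getattr(self, f.name).strip() for f in fields(self))


# Example values taken from the component definitions in the guide.
row = LogicModelRow(
    inputs="people, technology, equipment",
    activities="holding a course",
    outputs="number of course graduates",
    outcomes="more people attaining jobs in a certain sector",
    impact="higher income per capita in a region",
)
print(row.is_complete())
```

A check like `is_complete` is one way to confirm that each row makes a connection across all five components before moving on to metrics.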

Now that you’re familiar with the concept and value of a logic model, it’s time to craft one for a specific project or program. The first step is to download the Logic Model Template.

If it is your first time making a logic model, we recommend starting with a specific program or project you know well rather than creating a logic model that covers the entire organization.  

When setting up your logic model, you can start by writing a problem statement. This is the key problem your project or program is trying to solve. An example of this is: While jobs in STEM fields pay higher-than-average salaries, there are far fewer women than men in these fields in California.

After you have entered your problem statement, you can enter the information for each column. We recommend you start with the Impact column. You may find you need to jump between columns when filling it out, but the important thing is to make a linear connection between each component along a row. 

You can reference the example on the template to see the level of detail you should be providing for each section. 

It is important to note that there are many variations of the logic model. Depending on what resource you are referring to, it may use slightly different terminology and use different components to the core ones provided in our template.

Once you have developed your logic model, you can now use it to determine your data collection needs. To do so, you can move to the Metrics & Data tab on the template. 

While filling in the information on this tab, consider the metrics needed to measure success and performance regarding the Output, Outcome, and Impact components of the logic model, and the underlying data needed to develop these metrics. Provide this information in Columns C and D.  

Filling out this Metrics & Data tab will bring you one step closer to aligning your data collection tools and methodology to the logic model. 

Metrics may be based on one data point, multiple data points, and/or multiple types of data. For example, measuring the change in graduation rates across a period of time requires multiple data points:  

  • number of graduates in year Y 
  • number of graduates in year X  
  • number of overall students in year Y 
  • number of overall students in year X

If there are multiple data points needed to calculate a metric, list them all in the relevant cell in Column D. 
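To illustrate how the four data points listed above combine into a single metric, here is a small Python sketch. It assumes the graduation rate is defined as graduates divided by overall students and that year X precedes year Y; the function names and the figures are hypothetical and not taken from the guide.

```python
def graduation_rate(graduates: int, students: int) -> float:
    """Share of students who graduated in a given year (assumed definition)."""
    if students <= 0:
        raise ValueError("students must be positive")
    return graduates / students


def rate_change(grads_x: int, students_x: int,
                grads_y: int, students_y: int) -> float:
    """Change in graduation rate from year X to year Y, in percentage points."""
    return 100 * (graduation_rate(grads_y, students_y)
                  - graduation_rate(grads_x, students_x))


# Hypothetical figures for illustration only.
change = rate_change(grads_x=40, students_x=100, grads_y=66, students_y=120)
print(f"{change:+.1f} percentage points")
```

The sketch makes the dependency explicit: if any one of the four data points is missing, the metric cannot be calculated, which is why they are all listed in Column D.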

Now that you know what you need to measure (i.e. the components of the logic model) and how you need to measure it (i.e. the metrics), you can now determine the best way to collect the data. Data sources can be from active data collection (surveys, focus groups, interviews, etc.), passive data collection (observations, program applications, etc.), or external datasets. 

In this process, it is important to take a step back and understand the objectives of your data collection before deciding on the tools and frequency of collecting data (via surveys, focus groups, etc.). For example, if the data is collected for real-time performance evaluations and improvement, it may need to be updated frequently. On the other hand, if it’s collected primarily for reporting purposes, six-month or annual assessments may be suitable.

Fill out this information in Column E, listing the data source (or various data sources) for each dataset noted in Column D. For the “type of data collection”, this could be active or passive data collection. 
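One way to picture a filled-in row of the Metrics & Data tab is as a small record. The Python sketch below is an assumption-laden illustration: the class `MetricPlan`, the helper `missing_sources`, and the example metrics are hypothetical, and the three collection types mirror the active, passive, and external sources mentioned above rather than any fixed list in the template.

```python
from dataclasses import dataclass, field
from typing import List

# Assumed set of values, based on the source types named in the guide.
COLLECTION_TYPES = {"active", "passive", "external"}


@dataclass
class MetricPlan:
    component: str                  # Output, Outcome, or Impact
    metric: str                     # Column C: the metric
    data_points: List[str]          # Column D: underlying data
    data_sources: List[str] = field(default_factory=list)  # Column E
    collection_type: str = "active"

    def __post_init__(self) -> None:
        if self.collection_type not in COLLECTION_TYPES:
            raise ValueError(f"unknown collection type: {self.collection_type}")


def missing_sources(plans: List[MetricPlan]) -> List[str]:
    """Return the metrics that still have no data source listed."""
    return [p.metric for p in plans if not p.data_sources]


# Hypothetical rows for illustration only.
plans = [
    MetricPlan("Output", "Change in graduation rate",
               ["graduates in year X", "graduates in year Y",
                "students in year X", "students in year Y"],
               ["attendance lists"], "passive"),
    MetricPlan("Outcome", "Job placement rate",
               ["graduates employed in sector", "graduates surveyed"]),
]
print(missing_sources(plans))
```

Listing the metrics that lack a source is a quick way to spot gaps in Column E before choosing collection tools.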

After documenting how your logic model ties into your data collection plans, a natural next step is to develop the tools for collecting data. Surveys are often a primary tool in this regard. Consult our How to streamline data collection through surveys guide for a comprehensive approach to survey design.

  • Open access
  • Published: 18 March 2024

Utilization of EHRs for clinical trials: a systematic review

  • Leila R. Kalankesh 1,3 &
  • Elham Monaghesh 2,3

BMC Medical Research Methodology volume 24, Article number: 70 (2024)

Background and objective

Clinical trials are of high importance for medical progress. This study conducted a systematic review to identify the applications of EHRs in supporting and enhancing clinical trials.

Materials and methods

A systematic search of PubMed was conducted on 12/3/2023 to identify relevant studies on the use of EHRs in clinical trials. Studies were included if they (1) were full-text journal articles, (2) were written in English, and (3) examined applications of EHR data to support clinical trial processes (e.g. recruitment, screening, data collection). Two reviewers used a standardized form to extract data on study design, EHR-enabled process(es), related outcomes, and limitations.

Results

Following full-text review, 19 studies met the predefined eligibility criteria and were included. Overall, included studies consistently demonstrated that EHR data integration improves clinical trial feasibility and efficiency in recruitment, screening, data collection, and trial design.


Conclusion

According to the results of the present study, the use of Electronic Health Records in conducting clinical trials is very helpful. Researchers are therefore encouraged to use EHRs in their studies for easy access to more accurate and comprehensive data. EHRs collect all individual data, including demographic, clinical, diagnostic, and therapeutic data, and make all of it available seamlessly. Future studies should consider the cost-effectiveness of using EHRs in clinical trials.

Clinical trials are of high importance for medical progress [ 1 ]. Well-designed and well-executed clinical trials provide the foundational data for evidence-based medicine [ 2 ] and are the standard for evaluating the benefits and harms of medical interventions [ 3 ]. Numerous factors contribute to the success of clinical trials, such as appropriate trial design (e.g. randomization, blinding, and controls), thorough training of research staff, recruitment of an adequate sample size by identifying and enrolling qualified participants in a timely manner [ 4 , 5 ], and maintaining good participation through study completion [ 2 , 6 ].

Strategic selection of study sites with access to suitable patient populations can optimize recruitment. Moreover, developing practical yet scientifically sound protocols through careful planning and analysis helps ensure trials are completed in an accurate and cost-effective manner [ 2 ]. Traditionally, many trials have relied heavily on physician referrals to identify and attract potential participants [ 7 ]. While essential, sole dependence on this approach has limitations, including referral bias and logistical challenges that could hamper recruitment. To strengthen the recruiting process, manually reviewing patients’ electronic records to identify and diagnose eligible candidates for clinical trials has become a standard practice [ 8 ]. However, this manual chart review method is often time-consuming and resource-intensive [ 9 ].

To modernize clinical recruitment and conduct, new tools have been developed that enable data-driven insights into patient populations within EHR systems [ 10 ]. To digitalize processes, the TransCelerate eSource initiative, launched in January 2016, aims to help stakeholders understand the eSource landscape and make optimal use of electronic data sources to improve clinical science and clinical trial implementation. The eSource initiative also aligns well with other TransCelerate initiatives designed to help modernize trial execution and the ways patients are enrolled in clinical trials [ 38 ].

EHR systems contain comprehensive demographic, medical and treatment history collected during routine care, which offer potential to efficiently pre-scan, identify and recruit appropriate patients for clinical trials [ 11 , 12 , 13 , 14 ]. Specifically, recruiting patients through the EHR allows pre-assessment of eligibility criteria, selection of targeted population, and automated outreach to participants [ 15 ]. EHRs also provide ongoing access to detailed patient data that may decrease redundant measurements and data collection during trials [ 12 ]. Overall, EHR-enabled recruitment and workflow processes have potential to make clinical trials more cost-effective and feasible [ 11 ]. This study conducted a systematic review to identify the applications of EHR in supporting and enhancing clinical trials.

Study design

This systematic review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines.

Literature search

A systematic search of PubMed was conducted on 12/3/2023 to identify relevant studies on the use of EHRs in clinical trials. The search included a combination of Medical Subject Headings (MeSH terms) and keywords related to electronic health records (EHR OR electronic medical record) AND clinical trials. The search was limited to title and abstract fields. No date or language limits were applied. The specific Boolean search syntax was:

("EHR"[Title/Abstract] OR "Electronic health record"[Title/Abstract] OR "Electronic health records"[Title/Abstract] OR "EMR"[Title/Abstract] OR "Electronic medical record"[Title/Abstract] OR "Electronic medical records"[Title/Abstract]) AND (clinical trial* [Title/Abstract]).

Reference lists of included studies were hand-searched to identify additional relevant articles. The search was performed without any time limit.
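The Boolean search string above follows a regular pattern: each EHR term is quoted, tagged with the Title/Abstract field, and joined with OR, then combined with the clinical trial term using AND. The Python sketch below only assembles that string; it does not call PubMed, and the names `EHR_TERMS`, `FIELD`, and `build_query` are hypothetical helpers introduced for illustration.

```python
# Terms copied from the search syntax reported in the paper.
EHR_TERMS = [
    "EHR",
    "Electronic health record",
    "Electronic health records",
    "EMR",
    "Electronic medical record",
    "Electronic medical records",
]
FIELD = "[Title/Abstract]"


def build_query(ehr_terms, trial_term="clinical trial*"):
    """Join quoted, field-tagged EHR terms with OR, then AND the trial term."""
    ehr_block = " OR ".join(f'"{term}"{FIELD}' for term in ehr_terms)
    return f"({ehr_block}) AND ({trial_term} {FIELD})"


query = build_query(EHR_TERMS)
print(query)
```

Keeping the terms in a list makes it straightforward to document the strategy or adapt it to another synonym set.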

Eligibility criteria

Studies were included if they (1) were full-text journal articles, (2) were written in English, and (3) examined applications of EHR data to support clinical trial processes (e.g. recruitment, screening, data collection). Reviews, letters, abstracts, editorials and other non-research studies were excluded.

Study selection and data extraction

Two researchers (EM and LRK) independently screened titles and abstracts of retrieved records to identify potentially eligible studies. After obtaining full texts of potential articles, the two investigators independently assessed eligibility based on predefined criteria. Disagreements were resolved through discussion and consensus. A form was used by two reviewers to extract data on: study design, EHR-enabled process(es), related outcomes, and limitations.

Evidence synthesis

A qualitative synthesis was conducted summarizing key outcomes and limitations of included studies grouped by the EHR-enabled process examined. The study authors met regularly to discuss consensus on findings.

The systematic literature search yielded 2161 records, out of which 312 were selected for full-text review after screening titles and abstracts. After conducting a thorough review of the full-texts and resolving disagreements regarding 2 articles, a total of 19 studies that met the predefined eligibility criteria were included in the final qualitative synthesis (Fig.  1 ).

figure 1

PRISMA flow diagram illustrating study selection for utilization of EHRs in Clinical Trials
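The three counts reported in the text (2161 records, 312 full-text reviews, 19 included studies) imply how many records dropped out at each stage. The Python sketch below derives those differences by simple subtraction. This is an assumption: the text of this section does not itemize exclusions, and the figure may break them down differently (for example, by duplicates or hand-searched additions). The names `stages` and `exclusions` are hypothetical.

```python
# Stage counts as reported in the text of the paper.
stages = [
    ("records identified", 2161),
    ("full-text review", 312),
    ("included in synthesis", 19),
]


def exclusions(stage_counts):
    """Difference between consecutive stage counts (assumed to be exclusions)."""
    return [
        (f"{name_a} -> {name_b}", count_a - count_b)
        for (name_a, count_a), (name_b, count_b)
        in zip(stage_counts, stage_counts[1:])
    ]


for step, dropped in exclusions(stages):
    print(f"{step}: {dropped} not carried forward")
```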

Characteristics of included studies

The key characteristics of the 19 included studies are summarized in Table  1 . The studies were published in a variety of international journals, with the majority (14/19) from the United States. The remaining studies originated from China, Switzerland, Germany, Belgium, and Finland. The sample sizes ranged from 165 to 5,529,407 patients.

Clinical Trial Processes and Outcomes

Nineteen studies examined the impacts of EHR use on clinical trial processes and outcomes. Table 2 summarizes the key findings on EHR applications for recruitment, screening, data collection, and trial design. Overall, the included studies consistently demonstrated that utilization of EHR data improved clinical trial feasibility and efficiency in the following ways:

Recruitment: 19 studies evaluating EHR-enabled recruitment have reported increased enrollment efficiency compared to standard practices.

Screening: In 5 studies, EHR pre-screening excluded patients prior to full eligibility screening, reducing unnecessary procedures.

Data collection: In 3 studies, using EHR data reduced data collection costs compared to standard methods.

Trial Design: In one study examining this application, EHR data informed optimization of eligibility criteria to improve statistical power for a COVID-19 trial.
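The pre-screening idea in the Screening item above can be sketched as a filter over patient records. The Python example below is illustrative only: the `Patient` fields, the `prescreen` function, and the criteria are hypothetical, real EHR data models vary widely, and patients flagged this way would still need full eligibility screening.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Patient:
    """Hypothetical, heavily simplified EHR record."""
    patient_id: str
    age: int
    diagnoses: Set[str]


def prescreen(patients: List[Patient], min_age: int, max_age: int,
              required_diagnosis: str, excluded_diagnoses: Set[str]) -> List[str]:
    """Return IDs of patients who pass the coarse pre-screening criteria."""
    candidates = []
    for p in patients:
        if not (min_age <= p.age <= max_age):
            continue  # outside the age range
        if required_diagnosis not in p.diagnoses:
            continue  # lacks the target condition
        if p.diagnoses & excluded_diagnoses:
            continue  # has an exclusionary condition
        candidates.append(p.patient_id)
    return candidates


# Hypothetical records and criteria for illustration only.
patients = [
    Patient("p1", 54, {"type 2 diabetes"}),
    Patient("p2", 17, {"type 2 diabetes"}),
    Patient("p3", 61, {"type 2 diabetes", "renal failure"}),
    Patient("p4", 48, {"hypertension"}),
]
print(prescreen(patients, 18, 75, "type 2 diabetes", {"renal failure"}))
```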

Purposes of using EHR

The most frequent application of EHR data was to identify and recruit eligible participants into clinical trials. By containing diverse information on demographics, clinical history, diagnoses, and more, EHRs allowed pre-screening and outreach to potential candidates that met enrollment criteria. In several studies, EHR data was leveraged for secondary research purposes including data collection, data analysis and optimizing trial design [ 16 , 27 , 28 , 29 , 30 , 38 ]. Specifically, one study utilized EHR data from COVID-19 patients to inform eligibility criteria selection and improve statistical power for COVID-19 trials [ 23 ]. Overall, the primary use case was to enable secondary research applications of EHR data beyond routine clinical care to facilitate clinical trial processes. Key limitations of these applications included potential for selection bias, generalizability concerns in single health system populations, and heterogeneity in methods and endpoints assessed across studies. Further investigation using standardized methodology is needed to realize the full potential of EHR-enabled clinical research.

This systematic review aimed to identify applications and impacts of electronic health record (EHR) use in clinical trials. The included studies demonstrated EHR data has been leveraged to serve various key functions, including identifying eligible participants, facilitating recruitment, enabling data collection and analysis, and optimizing trial design.

One study analyzed EHR data from 59,639 patients who had encounters with the health care system. The results showed that EHR data could serve as a promising clinical tool to assist physicians in the early identification of patients suitable for palliative care counseling [ 35 ]. Although this study used EHRs for therapeutic purposes, it suggests that EHR data can be effective in identifying individuals who match a given target profile.

Another study found that primary care electronic health record data could be used effectively to identify patients who have been prescribed specific medications and patients who are potentially experiencing drug side effects [ 36 ]. In general, based on the results of this study, EHR can also be utilized in clinical trials for purposes other than patient care and in particular for the secondary use of this tool. In fact, according to the studies [ 20 , 28 , 29 , 30 , 31 , 32 ], the use of EHR serves various purposes in clinical trials, including identifying eligible participants, facilitating their recruitment and analyzing patient data to assess outcomes and measure the safety and efficacy of the intervention.

EHRs can also serve as a database of the data needed in clinical trials. For example, a study in Brazil used EHR data to obtain a benchmark for stroke patients [ 37 ].

According to the results of the present study, the use of EHRs in conducting clinical trials is very helpful. Researchers are therefore encouraged to use EHRs in their studies for easy access to more accurate and comprehensive data. EHRs collect all individual data, including demographic, clinical, diagnostic, and therapeutic data, so that all data are available seamlessly. Real-time access to patient data directly from EHRs could eliminate the need for manual data entry, minimizing errors and ensuring data integrity.

Moreover, EHRs enable the seamless integration of clinical trial data with other relevant health information, providing a more comprehensive picture of patient health and facilitating the evaluation of long-term outcomes. Given the increasing use and effectiveness of EHRs in clinical trials, future studies should determine their cost-effectiveness; such research would be useful for the wider scientific community. Future studies could also investigate and report metrics that reflect the effectiveness of EHRs for patient registration, supported by statistics that illustrate it.

One limitation of the present study was the lack of access to some databases due to sanctions. Another is the lack of a similar study that comprehensively examines the role and effectiveness of EHRs in clinical trials. Only a small number of studies have examined how EHRs are used in clinical trials and how effective they are.

Another limitation relates to comparing the included studies. EHR systems differ considerably between countries, and even within a country, in many respects, including the type of system used, the culture of each country, the level of EHR implementation, and the technical infrastructure. Comparison between systems was therefore limited.

According to the results of the present study, EHRs are used in clinical trials for various purposes. While the evidence is promising, several limitations should be considered when interpreting it. Many EHR studies may rely on single health system populations, limiting the generalizability of findings. Heterogeneity in the methods and endpoints used to evaluate the same EHR processes is another issue. Additional limitations include the potential for selection and referral bias. More research is needed to develop standardized methodology and reporting for EHR-enabled clinical trials. Future research should aim to optimize EHRs for supporting clinical trials. This may be realized through enhanced interoperability and data sharing between EHR systems, which would facilitate multi-site trials and expand access to diverse patient populations beyond single health systems. Standardization of data formats, development of shared platforms, and policies enabling access are needed. Integration of clinical trial-specific modules into EHRs is required to simplify participant screening, recruitment, enrollment, and data collection; this could include dashboards, automated alerts, and documentation templates. Advanced analytics and machine learning applied to EHR data can also be part of the agenda for future research. Stronger privacy protections and cybersecurity measures should be in place to securely operationalize EHR data for research while maintaining patient confidentiality.

There is also a gap in cost-effectiveness studies that quantify financial benefits and guide investments in EHR-enabled research infrastructure.

Availability of data and materials

Not applicable.

Kohl CD, Garde S, Knaup P. Facilitating secondary use of medical data by using openEHR archetypes. Stud Health Technol Inform. 2010;160(Pt 2):1117–21.

Laaksonen N, Varjonen J-M, Blomster M, Palomäki A, Vasankari T, Airaksinen J, et al. Assessing an electronic health record research platform for identification of clinical trial participants. Contemp Clin Trials Commun. 2021;21:100692.

Bothwell L, Greene J, Podolsky S, Jones D. Assessing the gold standard-lessons from the history of RCTs. N Engl J Med. 2016;374(22):2175–81.

Foster JM, Sawyer SM, Smith L, Reddel HK, Usherwood T. Barriers and facilitators to patient recruitment to a cluster randomized controlled trial in primary care: lessons for future trials. BMC Med Res Methodol. 2015;15(1):1–9.

Farrell B, Kenyon S, Shakur H. Managing clinical trials. Trials. 2010;11(1):1–6.

Menachemi N, Collum TH. Benefits and drawbacks of electronic health record systems. Risk Manag Healthcare Policy. 2011;4:47.

Mapstone J, Elbourne DD, Roberts IG. Strategies to improve recruitment to research studies. Cochrane Database of Systematic Reviews. 2002(3).

Vickers AJ. How to improve accrual to clinical trials of symptom control 2: design issues. J Soc Integr Oncol. 2007;5(2):61.

Doods J, Bache R, McGilchrist M, Daniel C, Dugas M, Fritz F. Piloting the EHR4CR feasibility platform across Europe. Methods Inf Med. 2014;53(04):264–8.

Kellar E, Bornstein SM, Caban A, Célingant C, Crouthamel M, Johnson C, et al. Optimizing the use of electronic data sources in clinical trials: the landscape, part 1. Ther Innov Reg Sci. 2016;50(6):682–96.

Mc Cord KA, Ewald H, Ladanie A, Briel M, Speich B, Bucher HC, et al. Current use and costs of electronic health records for clinical trial research: a descriptive study. CMAJ Open. 2019;7(1):E23.

Zuidgeest MG, Goetz I, Groenwold RH, Irving E, van Thiel GJ, Grobbee DE, et al. Series: pragmatic trials and real world evidence: paper 1 introduction. J Clin Epidemiol. 2017;88:7–13.

Mc Cord KA, Salman RAS, Treweek S, Gardner H, Strech D, Whiteley W, et al. Routinely collected data for randomized trials: promises, barriers, and implications. Trials. 2018;19(1):1–9.

Beaver JA, Howie LJ, Pelosof L, Kim T, Liu J, Goldberg KB, et al. A 25-year experience of US food and drug administration accelerated approval of malignant hematology and oncology drugs and biologics: a review. JAMA Oncol. 2018;4(6):849–56.

Li G, Sajobi TT, Menon BK, Korngut L, Lowerison M, James M, et al. Registry-based randomized controlled trials-what are the advantages, challenges, and areas for future research? J Clin Epidemiol. 2016;80:16–24.

Ateya MB, Delaney BC, Speedie SM. The value of structured data elements from electronic health records for identifying subjects for primary care clinical trials. BMC Med Inform Decis Mak. 2016;16:1.




The research protocol was approved and supported by the Student Research Committee, Tabriz University of Medical Sciences (grant number: IR.TBZMED.VCR.REC.1401.152).

The authors received no financial support for the research, authorship, and/or publication of this article.

Author information

Authors and affiliations.

Tabriz Health Services Management Research Center, Tabriz University of Medical Sciences, Tabriz, Iran

Leila R. Kalankesh

Student Research Committee, Tabriz University of Medical Sciences, Tabriz, Iran

Elham Monaghesh

Department of Health Information Technology, School of Management and Medical Informatics, Tabriz University of Medical Sciences, Tabriz, Iran

Leila R. Kalankesh & Elham Monaghesh



E.M.: writing the main manuscript text, data curation, figure preparation, writing (review and editing). L.K.: validation, investigation, conceptualization, methodology, supervision. All authors reviewed the manuscript.

Corresponding author

Correspondence to Elham Monaghesh.

Ethics declarations

Ethics approval and consent to participate.

The study was approved by the ethics committee of Tabriz University of Medical Sciences (IR.TBZMED.VCR.REC.1401.152).

Consent for publication

Competing interests.

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article.

Kalankesh, L.R., Monaghesh, E. Utilization of EHRs for clinical trials: a systematic review. BMC Med Res Methodol 24, 70 (2024). https://doi.org/10.1186/s12874-024-02177-7


Received: 03 September 2023

Accepted: 08 February 2024

Published: 18 March 2024

DOI: https://doi.org/10.1186/s12874-024-02177-7


  • Electronic Health Record
  • Clinical trials

BMC Medical Research Methodology

ISSN: 1471-2288


  • Open access
  • Published: 23 March 2024

Technology, data, people, and partnerships in addressing unmet social needs within Medicaid Managed Care

  • Rachel Hogg-Graham 1
  • Allison M. Scott 2
  • Emily R. Clear 1
  • Elizabeth N. Riley 1
  • Teresa M. Waters 3

BMC Health Services Research volume 24, Article number: 368 (2024)


Individuals with unmet social needs experience adverse health outcomes and are subject to greater inequities in health and social outcomes. Given the high prevalence of unmet needs among Medicaid enrollees, many Medicaid managed care organizations (MCOs) are now screening enrollees for unmet social needs and connecting them to community-based organizations (CBOs) with knowledge and resources to address identified needs. The use of screening and referral technology and data sharing are often considered key components in programs integrating health and social services. Despite this emphasis on technology and data collection, research suggests substantial barriers exist in operationalizing effective systems.

We used qualitative methods to examine cross-sector perspectives on the use of data and technology to facilitate MCO and CBO partnerships in Kentucky, a state with high Medicaid enrollment, to address enrollee social needs. We recruited participants through targeted sampling, and conducted 46 in-depth interviews with 26 representatives from all six Kentucky MCOs and 20 CBO leaders. Qualitative descriptive analysis, an inductive approach, was used to identify salient themes.

We found that MCOs and CBOs have differing levels of need for data, varying incentives for collecting and sharing data, and differing valuations of what data can or should do. Four themes emerged from interviewees’ descriptions of how they use data, including 1) to screen for patient needs, 2) to case manage, 3) to evaluate the effectiveness of programs, and 4) to partner with each other. Underlying these data use themes were areas of alignment between MCOs/CBOs, areas of incongruence, and areas of tension (both practical and ideological). The inability to interface with community partners for data privacy and ownership concerns contributes to division. Our findings suggest a disconnect between MCOs and CBOs regarding terms of their technology interfacing despite their shared mission of meeting the unmet social needs of enrollees.


While data and technology can be used to identify enrollee needs and determine the most critical need, they are not sufficient to resolve challenges. People and relationships across sectors are vital in connecting enrollees with the community resources to resolve unmet needs.



Individuals with unmet social needs, like food and housing insecurity and transportation challenges, experience higher rates of adverse health outcomes [1, 2, 3, 4, 5, 6, 7] and are subject to greater inequities in health and social outcomes [8]. Unmet social needs are especially prevalent among Medicaid enrollees [9]. For this reason, state Medicaid programs are particularly interested in testing strategies that encourage and incentivize Medicaid managed care organizations (MCOs) to identify and address the complex social needs of enrollees [10, 11]. Many Medicaid MCOs are now screening enrollees for their unmet social needs and connecting them to community-based organizations (CBOs) better equipped with knowledge and resources to address these needs [12, 13].

Screening and referral technology and data sharing are often considered key components of programs integrating health and social services to address social needs [12, 14]. Data sharing infrastructure has been highlighted as a way to streamline coordination and social need resolution [12, 14]. In some instances, successful integration has facilitated strong connections between health and social services organizations, ensuring that patients move efficiently between sectors [14, 15, 16]. Despite this emphasis on technology and data collection and some positive integration, research suggests substantial barriers exist in operationalizing effective systems [12, 17]. CBOs often have limited financial and personnel resources to put toward advanced social need screening and referral systems [12, 17, 18, 19]. The reliance on grant funding and other time-limited resource streams likely presents another barrier to the adoption of tools [17]. CBOs can also be hesitant to adopt technology and data systems owned by MCOs, hospitals, and other clinically oriented organizations because of data privacy and HIPAA-related issues [16, 20].

Research examining health and community partnerships has identified technology adoption by CBOs and other social services organizations as an important barrier to collaboration [14, 15, 17]. Most prior studies examining data and technology include clinical organization perspectives on the use of tools but do not include robust information from community partners [12, 14, 16]. Further, those studies that do include perspectives from multiple organization types on the integration of health and social services are not focused on adopting screening and referral systems. Technology typically emerges in subthemes, and the evidence included does not provide in-depth information on benefits and challenges from both community and clinical partners [17].

This study examines CBO and MCO perspectives on the use of technology in social need screening and referral. The qualitative analysis presented here is part of a larger mixed methods study examining how Kentucky (KY) MCOs address unmet social needs in partnership with community organizations [21]. KY offers a unique opportunity to examine strategies addressing Medicaid enrollee needs. Just under 29% of all KY residents are enrolled in Medicaid, giving it the third-highest enrollment among US states [22]. KY is also geographically diverse, with distinct urban, rural, and Appalachian regions.

Setting and study population

A project Stakeholder Advisory Board (SAB), including representatives from all Medicaid MCOs, academia, a community-based organization, the State Department for Medicaid Services, and enrollees, met quarterly to provide expertise, guide research, and assist with the dissemination of study results. MCO representatives serving on our SAB were asked to 1) identify individuals in their organization leading efforts to address unmet social needs and population health outcomes among their enrollees and 2) identify CBOs they work closely with in their social need referral process. As part of a targeted sampling strategy, the research team invited the identified contacts via email to participate in key informant interviews about how MCOs and CBOs address social needs. To be included, participants had to be at least 18 years old, be employed at an MCO or CBO in Kentucky, and be willing to engage in an interview in English. A total of 32 MCO contacts and 33 CBOs were invited, giving response rates of 81% and 58%, respectively.


Our sample of 46 participants comprised 26 representatives from 6 MCOs (ranging from 3 to 6 participants per MCO) and 20 representatives from 19 unique CBOs. MCO participants represented various organizational roles, including vice presidents, directors, population health, case management, and community engagement. CBO participants represented roles including directors, Chief Executive Officers, Chief Operating Officers, Medical Coordinators, Presidents, Chief Engagement officers, program managers, and outreach coordinators. The services provided by community-based organizations included food security, health, housing, employment, and work readiness, refugee and immigrant services, and community support; many CBOs addressed multiple social needs. CBO interviewees represented organizations operating in both urban and rural areas of the state.
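The reported response rates can be checked against the sample counts given above. The short sketch below is only an illustration: the helper name `response_rate` is hypothetical, and it assumes the CBO rate is computed from the 19 unique CBOs rather than the 20 CBO interviewees, since that is the count that reproduces the reported 58%.

```python
def response_rate(responded: int, invited: int) -> int:
    """Return a response rate as a whole-number percentage."""
    return round(100 * responded / invited)


# Counts are taken from the text; the choice of CBO numerator is an assumption.
mco_rate = response_rate(26, 32)  # 26 MCO interviewees out of 32 invited contacts
cbo_rate = response_rate(19, 33)  # 19 unique CBOs out of 33 invited CBOs

print(f"MCO response rate: {mco_rate}%")
print(f"CBO response rate: {cbo_rate}%")
```

Counting the 20 individual CBO interviewees instead would give roughly 61%, which suggests the rate was calculated at the organization level.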

Data collection

In-depth one-on-one interviews with 46 stakeholders from identified CBOs (n = 20) and MCOs (n = 26) were conducted between May 24, 2021, and November 8, 2021. Interviews were conducted via Zoom, audio-recorded, and transcribed verbatim. The qualitative researcher and facilitator conducting these interviews have extensive training and experience with structural interviewing using a semi-structured interview guide. The guide used was developed for this study [23].

Data analysis

We conducted an iterative content analysis of the transcribed interview data using qualitative descriptive analysis [24], an inductive, low-inference method designed to gain an accurate understanding of a phenomenon in the everyday terms of stakeholders. Our data analysis unfolded in two stages. The first stage involved open coding [25], in which the transcripts were independently coded by two authors and one study team member (AM, ER, and HS), who then met to discuss and reach consensus on the central themes in the data related to technology and data sharing. In this meeting, the coders identified four themes of data use: to screen for patient needs, to case manage, to evaluate the effectiveness of programs, and to partner with each other. The second stage of analysis involved focused coding, with the three individuals again independently coding transcripts for subthemes within each identified central theme. The coders met again to compare findings and finalize themes (and subthemes for Theme 4). At this time, we recognized that there were areas of alignment, incongruence, and tension between the responses of participants from MCOs and CBOs, and we reached agreement in this meeting about which themes demonstrated each dynamic. Finally, all authors met a third time to review the subthemes and select illustrative quotations for each. All analytic decisions were made through discussion until consensus was reached. We used a team-based approach to reaching consensus, which supported the dependability and trustworthiness of the data [26]. This paper focuses on responses addressing technology platforms and data sharing to support MCO and CBO partnerships.

We identified several themes related to the use of technology and data in MCO-CBO partnerships to address enrollee social needs. MCOs and CBOs noted differing levels of need for data, differing incentives for collecting and sharing data, and differing valuations of what data can or should do. MCO and CBO interviewees described how they collect and use data in their work, which fell into four major themes: to screen for patient needs, to case manage, to evaluate the effectiveness of programs, and to partner with each other. Within these themes, the interview responses illuminated areas of alignment between MCOs/CBOs, incongruence, and tension (both practical and ideological; see Table 1).

Theme 1. Alignment on collecting data to identify and prioritize patient needs

Using data to identify and prioritize patient needs was largely an area of alignment for MCOs and CBOs. All MCOs and nearly all CBOs recognized the value of data in this area. As one CBO noted,

“By completing the needs assessment with our families, it helps the case managers understand your immediate needs.”

Similarly, MCOs often used the data for targeted programming and social needs referrals,

“When our members are enrolled, we attempt to engage them in our health risk assessment. And so that health risk assessment is going to not only ask them questions about their specific health, but also about some additional needs that would help us be able to identify them at enrollment and also to be able to target them for programs and other [benefits].”

Several MCO and CBO interviewees also discussed using the data to understand individual enrollee/client needs and to track overall trends among their clients. As one MCO shared,

“The end of 2021, we had a tremendous amount of referrals for food. And so maybe we need to look at doing some of our community investment work and partnering with additional providers and community partners that are in that space for next year.”

There were some differences between MCOs and CBOs in the formality and degree to which social need data was collected. MCO interviewees, particularly those on the front lines of this work, could describe detailed and comprehensive data screening metrics for patient needs and how needs were tracked in their data systems. Using data on patient needs to identify areas for intervention was described as an essential part of patient care:

“We use the screening data, not just to meet the individual member need, but to also inform health equity and types of programs that we bring to play...”

CBO interviewees, on the other hand, varied more in their responses about the importance of using data on social needs at an organizational level. Most described data as having potential value but stopped short of calling it essential for their operations. One CBO stated,

“I don't know what I would do with the information if we had it.”

Conversely, one food-oriented CBO reported that they collect demographic data and use that to help with distribution,

“So think about the local pantry that I talked about earlier. Because we know, we drive a truck into [KY County]. We know that the last five times that we've been in [KY County], we saw, on average, 150 households at each of those five visits. That tells us how much product to put on the truck so that we don't run out.”

Theme 2. Differences in organizational capacity, mission, and resources influenced variability in data use to support case management

Using data to support case management activities was an area of both alignment and incongruence between MCOs and CBOs. All MCOs and many CBOs saw value in using data systems to identify resources available, track referrals and follow-ups, keep notes, and stay in contact with patients. However, there was considerable variability in the sophistication of the data systems. Most MCOs reported elaborate data tracking systems designed specifically for screening, referral, and tracking (e.g., combining medical records applications with Unite Us [27] or Find Help, formerly Aunt Bertha [28]). Some CBOs have systems designed specifically for tracking data (e.g., Electronic Health Systems or Vesta [29]), whereas others employ systems not designed specifically for tracking (e.g., Microsoft Excel spreadsheets). Most CBOs used informal data collection to screen for needs (e.g., Post-it notes, memory, a hand-written planner), and several CBOs reported that they did not use formal data systems to screen and track patient needs at all,

“Are you kidding me? No books. What I usually tell anybody who's working with me is to either email me or text me, and that's my filing system.”

MCO interviewees were more likely to report using data analytics to support and enhance case management. Frontline MCO workers spoke about this aspect of data use more often than executives, and many saw data systems as the answer to case management problems. As one MCO stated,

“We do have a case management system that keeps track. So, we are able to schedule calls. They're able to pop back up on a calling queue, so that we're able to check in with members and attempt to continuously reach out to them. So, that's kind of how we try to make sure that those members don't fall through the cracks by continuously following up.”

Most CBOs indicated that case management occurred but was more personalized and less attached to data and technology use,

“We have a database that we use for client notes. We just record case notes in there. Some of our caseworkers keep basic Excel spreadsheets on their specific clients and what they're working on. Most of that would be informal.”

Only one MCO specifically mentioned the limits of data systems for tracking and the need for a personal touch in case management, a perspective more in line with most CBO interviewees. The MCO shared this when discussing platform capabilities, stating,

“We have a case management platform, of course, where we document everything, because just like everywhere else, if you don't write it down, it didn't happen, but a lot of it is just that manual follow-up and that human touch.”

The variability in tracking system sophistication and capabilities between MCOs and CBOs was also frequently highlighted as one of the critical challenges in collaboration and a notable source of frustration for both sides. When discussing their partnerships with MCOs and data sharing, one CBO stated,

“They really wanted to know about it. And so had to spend considerable time with them about, ‘This is what we do, this is how stuff works.’ And including it's like, ‘No, we can't track. We have no way of tracking [MCO] clientele through the [KY food security] program.’”

While MCO interviewees often noted this tension in collaboration, they were aware that capacity and resources typically made it harder for CBOs to track and collect data. One MCO interviewee noted,

“I think the challenge is just the data piece and the complexity of the regulations that we have to navigate, all for good reason. When you're talking about how to best leverage those community resources, if we can't kind of have those data exchanges, it makes it so much more difficult. And so when you're trying to get at outcomes or have simplified referral processes, it just makes it harder because you may not be able to get through, they may not have the HIPAA, the high-tech clearance or whatever it is. It's expensive for them to have to do that.”

Theme 3. Funding and reimbursement structures shaped how MCOs and CBOs used data to evaluate program effectiveness

We found limited alignment between MCO and CBO perspectives on using data to evaluate social need programming and partnerships. Instead, evaluation was an area fraught with incongruencies and tension between the two sectors. The financial incentives and pressures for using data differ substantially between MCOs and CBOs. MCOs reported using data to evaluate the financial impact or effectiveness of programs (particularly claims data/utilization metrics) and partnerships to justify investments or show MCO executives that meeting unmet social needs is good business. As one MCO interviewee explained,

“I think every anything that we’re doing with the community-based partner, we’re studying all that. We’re studying the reduction, so I’m able to say, okay, because we have this member in this [CBO program], in this residential treatment program, not only mama’s healthier, baby is not born exposed to opiates, no NICU, ER utilization down. I think that’s the neat thing, there’s your answer, right?”

One reason MCOs seem to be driving data collection for demonstrated effectiveness/return on investment is that they are heavily regulated in terms of how they can invest funds,

“We are doing payment innovation, we want to take money out of what’s being spent on health care and invest it into social services and that is not easy.”

As another MCO highlighted, continued investment often depends on what they can demonstrate,

“Sometimes, there are finance guidelines, right? Like when I’m fighting for my budget, they’ll say, ‘Well, where’s the return on investment numbers?’.”

Conversely, only a few CBOs used data-driven evaluation to support their financial operations. When CBOs did report using data for evaluation, it was typically in relation to using outcomes data in grant writing to gain funding specifically from MCOs, data which may not serve any other useful purpose for the CBO. As one CBO stated,

“Another kind of pain point, and for like one of the managed care companies that we contract with, they give us $8,000 a year. But the requirements to receive that $8,000 is very data heavy. We have to go through and pull all this data, get different releases signed with the participants. It’s great to have extra money, but it’s also a lot of work and nothing really being tied to it, if that makes sense. They just want the data to be able to review and any good outcomes and success stories and stuff like that, which is great. But it’s a lot of work for not a lot of money.”

Theme 4. Tension in using data to partner with other MCOs and CBOs

Both MCO and CBO interviewees described several reasons why they engage in data sharing within MCO-CBO partnerships (e.g., to garner funding, demonstrate effectiveness, or enhance case management), even if the values and importance placed on data sharing differed between agency types. When data sharing existed or was being contemplated, interviewees still described several barriers to sharing, both practical and ideological.

Overwhelmingly, CBO interviewees expressed a perception that they had to report data to the MCOs to prove impact so MCOs would maintain the partnership or provide funding. The first subtheme revealed a notable ideological difference between the MCOs/CBOs regarding whether data was useful to evaluate program effectiveness. While data-driven evaluation is routine and relied upon by most MCOs, many CBO interviewees perceived that data and metrics could harm their operations by diverting time and energy from serving clients, and that there is much about program effectiveness that simply cannot be captured using formal data tracking systems. When discussing the course of their partnerships with MCOs, one CBO highlighted,

“So what does that support look like? Well, it is financial support for it. And, initially, it was very much focused on their clientele with [MCO] clientele and trying to track metrics about the impact that having access to better nutrition was going to have on the outcomes for their folks, right? So over the course of two years, I mean, we were able to show, "we," and I mean that collectively, we're able to show that it does have a positive impact. I mean, for [MCO], I think it's safe to say that they realize that it is more cost-effective to invest upfront in increasing access to healthy food better than the back end, to drugs and health care costs and all that kind of stuff. So they have, again, they have maintained that partnership.”

Indeed, most MCOs expressed wanting data from their CBO partners to justify the relationship and a reluctance to build relationships if data capacity is not present. One MCO discussed this directly, stating,

“They come us and they send us their flyer and they're like, "We want [MCO] to partner with us on our heart walk and we want you to give us $20,000." We still get a lot of people that do that because that's their old business model. Most of the time, we don't engage with those types of organizations. I always say, we want to hear from someone and I will take a meeting always if a community-based organization says, "We have an evidence-based solution that is solving for X," or "We have a solution that is solving for X and we want to work with you to help us prove that it's evidence-based," or we have research capabilities...”

Subtheme 2 illustrates how underlying the data sharing tension between CBOs and MCOs are challenges related to the need for more effective and user-friendly interfacing between tracking and referral systems, as well as the limited capacity of CBOs to track and analyze data. As mentioned, the sophistication of CBO data systems is highly variable, and even those organizations with more advanced tracking systems struggle with data sharing. When asked about data sharing, one CBO noted,

“Well that's another pain point. In my history, in my experience, every health plan has their own data system that don't talk to one another, that are very convoluted and messy. Right now we're filling stuff in on an Excel spreadsheet.”

Several MCOs also highlighted this as a challenge. As one MCO stated,

“Our system is designed to deal with hospital systems and health care providers, there's many different levels. I mean we go through a pretty comprehensive system and you have to have all kinds of, meet all kinds of requirements, share data, and different pieces that for a small community-based organization providing housing services, they might not even have the capacity to meet those requirements.”

Although some CBOs reported sharing data with MCOs willingly and saw this sharing as a natural facet of their partnership, other CBOs described significant concerns about data privacy and ownership (subtheme 3). They noted how important data privacy was to the clients they served and how their organization valued serving their clients without the need to collect personal data or share it. Some CBO interviewees indicated that sharing or even collecting private client data might compromise their ability to do their work and serve their clients well,

“We respect their privacy, and we will never do any sharing of their data. In fact, a lot of people who come to us, one of the reasons they're with us is because we do not require them to show an ID.”

Subtheme 4 revealed how CBO and MCO interviewees expressed concerns about relying on data and technology as the solution to social need screening and referral systems building. Interviewees felt that data does not adequately capture utilization or partnership benefits. Primarily, this was attributed to issues related to data quality. One MCO interviewee highlighted this when discussing the challenges of understanding the quality of social need services:

“We also don't have a really long track record of managing quality for this type of provider. We have very distinct report cards and quality cards for every hospital in the state of Kentucky. I can tell you what the outcomes for [Hospital 1] compared to [Hospital 2] and compared to [Hospital 3]. We have very clear metrics on those types of things. We do not have that for the sort of soft services, especially since we don't pay for them.”

Most CBOs articulated challenges with data quality centered on their perception that data does not tell the whole story about what is happening at their organization and in the community. As one CBO noted,

“ We have a people problem. And I think right now there are a lot of hospitals and other organizations, MCOs, that want to kind of tech their way out of this. [T]hey're looking for technological [solutions] to try to streamline and expand services to folks. And that's just not really the answer. You need people.”

MCO interviewees recognized that databases and their tracking systems may be limited in what they capture. In subtheme 5, several noted that their limited technological ability to comprehensively track organizations in a community was a significant limitation. Maintaining accurate data has also been challenging because of community organization turnover and closures. As one MCO highlighted,

“These national repositories don't have the local knowledge so they don't know the churches that do the hot meals and they don't know the small organizations that are getting up and off their feet and tied to this one or that one, or it's an offshoot of whatever. There are some smaller organizations that don't always get into those big directories and you don't always know about them unless you have boots on the ground, people who live and work in the community and actually know what those are.”

Similarly, another MCO highlighted CBO data capacity as a major challenge in their partnerships, stating,

“Biggest challenges. I guess, you could say data might be the challenges, to close the loop around the return on investment on some of these organizations that are not ... They just don't have the staffing, or the professional leadership, if you will, to do all the tracking. The ones that do, do it very well. The ones that don't, it's just that they don't have the resources.”

In the final subtheme, all MCO interviewees acknowledged that CBOs are doing good work , even if that cannot be quantified, and the ability to share that data is often related to CBO capacity and resources. One MCO shared,

“[Food Pantry CBO] who's just like [Named Female] and her husband [Named Male], they might be the greatest people and we might know that members like going there versus the other food bank because [Named Female] like bakes brownies and gives them a hug and we want to quantify that but also it's just not realistic because they don't have the infrastructure sometimes that's needed to prove the business case, solidify the partnership and ultimately inform policy.”

Our study found alignment as well as discordance between MCOs and CBOs about how and when to leverage technology and data despite their shared mission to meet the unmet social needs of enrollees. Our findings offer important insights regarding why data and technology may create a barrier to effective MCO-CBO partnerships, potentially hindering efforts to improve health and social outcomes. They also provide guidance and identify key considerations for developing programs and partnerships that may be more effective in coordinating efforts between the two organizations.

As we observed in Themes 1 (Alignment on collecting data to identify and prioritize patient needs) and 2 (Differences in organizational capacity, mission, and resources influenced variability in data use to support case management), results suggest that data and technology can be important tools in screening and referral for social needs, but they are far from a universal panacea. Our data indicate that both logistical and cultural disconnects between MCOs and CBOs significantly limit data collection and sharing for coordination of services. On the logistical side, CBOs have extremely limited capacity (software, workforce) to collect and share data. Several participants reported serious concerns with collecting and sharing confidential client information. To make matters worse, MCOs use a range of proprietary and sophisticated referral and tracking systems that severely tax the resources and capacity of CBOs. On the cultural side, while MCOs view data and technology as essential to partnering with CBOs to meet enrollee social needs, CBOs do not. In fact, as we found in Theme 3 (Funding and reimbursement structures shaped how MCOs and CBOs used data to evaluate program effectiveness), many CBOs see data collection as a necessary evil to garner funding from potential donors. Instead, they emphasize the relationship-honoring aspects of their work as a core value.

Solutions that only focus on providing data collection and tracking technology to CBOs are unlikely to be completely successful because they fail to address the disparate cultures found in MCOs vs. CBOs. This conclusion is robustly supported by Theme 4 from our analysis (Tension in using data to partner with other MCOs and CBOs). In many ways, CBOs may view MCO efforts to grow their technological capacity as imposing profit-seeking values, norms, and structure rather than seeking true understanding and partnership. CBOs’ low enthusiasm for and capacity to use data can create difficulty for MCOs when MCOs rely on CBOs for data to justify their funding streams and partnerships. This fundamental disconnect is likely to severely impede partnership efforts without reevaluating the strengths and values each sector brings to the collaborative [ 30 ].

Successful partnerships are built on shared interest and trust [ 31 ]. Our study suggests a strong alignment between MCOs and CBOs in addressing the social needs of highly vulnerable Medicaid beneficiaries. This values alignment may offer a foundation for partnership. Our work underscores a key finding across studies on cross-sector partnerships integrating health and social services: more work must be done to build trust and understand each other’s organizational values [ 17 , 19 , 32 ]. MCOs and CBOs need each other to address social determinants of health (SDOH) effectively. MCOs have the resources and responsibility for finding more effective ways to support their beneficiaries. CBOs are ‘on the ground’ and have the trust of the clients they serve (many of whom are Medicaid enrollees). Forums that create a level playing field for both types of organizations and facilitate safe conversations to build trust are essential.

The Department of Health and Human Services (DHHS) has developed a three-pronged strategy for addressing SDOH: (1) better data, (2) improving health and social services connections, and (3) whole-of-government collaborations [ 8 ]. Our study suggests that their second strategy is essential and could be far more difficult than many imagine. Facilitating honest conversations about identifying and addressing the challenges in building these connections is a critical first step. Because many challenges involve “hearts and minds” and organizational culture, addressing these challenges will need to be a slow and iterative process. Moving forward, organizations like MCOs and other clinical partners must carefully consider how data and social need screening and referral technology can be a value-add to CBOs and not another burden on their already strained capacity.


While our sample included at least one representative from all six state MCOs and nineteen different CBOs, study results may not generalize to other states. However, many of the MCOs in KY operate in national markets and often use similar strategies in different geographic areas. Our insights likely shed light on similar efforts and challenges in other states and markets. Future studies examining the use of data and technology nationally in social need resolutions would provide confirmation of the results we present and any potential geographic variability. Additionally, participant perspectives may not necessarily represent their MCOs or CBOs. Finally, our cross-sectional view of technology and referral platforms provides a snapshot of current processes; a more in-depth longitudinal study would capture changes over time as technology constantly evolves.

Despite a shared mission to meet unmet social needs, MCOs and CBOs do not agree on how and when to leverage technology and data. This discordance is a significant barrier to effective partnerships. Technology offers powerful tools for identifying and prioritizing enrollee needs and connecting them with services. However, trust and a shared understanding of organizational cultures and goals are critically needed to allow technology to realize its potential. Current efforts to build effective MCO-CBO partnerships should focus on creating a level playing field for all organizations and a space for honest conversations that can build strong connections and sustainable relationships across sectors.

Availability of data and materials

Deidentified aggregated data is available from the corresponding author ([email protected]) on reasonable request.

Gottlieb L, Tobey R, Cantor J, Hessler D, Adler NE. Integrating social and medical data to improve population health: opportunities and barriers. Health Aff. 2016;35(11):2116–23.

Seligman HK, Laraia BA, Kushel MB. Food insecurity is associated with chronic disease among low-income NHANES participants. J Nutr. 2010;140(2):304–10.

Silverman J, Krieger J, Kiefer M, Hebert P, Robinson J, Nelson K. The relationship between food insecurity and depression, diabetes distress and medication adherence among low-income patients with poorly-controlled diabetes. J Gen Intern Med. 2015;30:1476–80.

Berkowitz SA, Hulberg AC, Hong C, Stowell BJ, Tirozzi KJ, Traore CY, Atlas SJ. Addressing basic resource needs to improve primary care quality: a community collaboration programme. BMJ Qual Saf. 2016;25(3):164–72.

Cole MB, Nguyen KH. Unmet social needs among low-income adults in the United States: Associations with health care access and quality. Health Serv Res. 2020;55:873–82.

Fiori KP, Heller CG, Rehm CD, Parsons A, Flattau A, Braganza S, Lue K, Lauria M, Racine A. Unmet social needs and no-show visits in primary care in a US northeastern urban health system, 2018–2019. Am J Public Health. 2020;110(S2):S242–50.

Alley DE, Asomugha CN, Conway PH, Sanghavi DM. Accountable health communities—addressing social needs through Medicare and Medicaid. N Engl J Med. 2016;374(1):8–11.

De Lew N, Sommers BD. Addressing social determinants of health in federal programs. JAMA Health Forum. 2022;3(3):e221064.

Thompson T, McQueen A, Croston M, Luke A, Caito N, Quinn K, Funaro J, Kreuter MW. Social needs and health-related outcomes among Medicaid beneficiaries. Health Educ Behav. 2019;46(3):436–44.

Moreno-Camacho CA, Montoya-Torres JR, Jaegler A, Gondran N. Sustainability metrics for real case applications of the supply chain network design problem: A systematic literature review. J Clean Prod. 2019;10(231):600–18.

Apenteng BA, Kimsey L, Opoku ST, Owens C, Peden AH, Mase WA. Addressing the social needs of Medicaid enrollees through managed care: lessons and promising practices from the field. Popul Health Manag. 2022;25(1):119–25.

Cartier Y, Fichtenberg C, Gottlieb LM. Implementing Community Resource Referral Technology: Facilitators And Barriers Described By Early Adopters: A review of new technology platforms to facilitate referrals from health care organizations to social service organizations. Health Aff. 2020;39(4):662–9.

Center for Health Care Strategies [Internet]. Supporting social service and health care partnerships to address health-related social needs: case study series. [updated 2018; cited 2023 Nov 9] Available from: https://www.chcs.org/project/partnership-healthy-outcomes-bridging-community-based-human-services-health-care/ . Accessed March 2, 2023.

Klein S, Hostetter M. Leveraging Technology to Find Solutions to Patients’ Unmet Social Needs. The Commonwealth Fund; June 21, 2017. Available from: https://www.commonwealthfund.org/publications/2017/jun/leveraging-technology-find-solutions-patients-unmet-social-needs

Blavin F, Smith LB, Ramos C, Ozanich G, Horn A. Opportunities to Improve Data Interoperability and Integration to Support Value-Based Care. 2022.

Massar RE, Berry CA, Paul MM. Social needs screening and referral in pediatric primary care clinics: a multiple case study. BMC Health Serv Res. 2022;22(1):1369.

Hogg‐Graham R, Edwards K, L Ely T, Mochizuki M, Varda D. Exploring the capacity of community‐based organisations to absorb health system patient referrals for unmet social needs. Health Soc Care Commun. 2021;29(2):487–95.

Amarashingham R, Xie B, Karam A, Nguyen N, Kapoor B. Using community partnerships to integrate health and social services for high-need, high-cost patients. Issue Brief (Commonw Fund). 2018;2018:1–11.

Agonafer EP, Carson SL, Nunez V, Poole K, Hong CS, Morales M, et al. Community-based organizations’ perspectives on improving health and social service integration. BMC Public Health. 2021;21(1):1–12.

Petchel S, Gelmon S, Goldberg B. The Organizational Risks Of Cross-Sector Partnerships: A Comparison Of Health And Human Services Perspectives: A legal and policy review to identify potential funding streams specifically for Accountable Communities For Health infrastructure activities. Health Aff. 2020;39(4):574–81.

Hogg-Graham R, Scott AM, Stahl H, Riley E, Clear ER, Waters TM. COVID-19 and MCO-community partnerships to address enrollee social needs. Am J Managed Care. 2023;29(3).

KFF. Medicaid State Fact Sheets. 2023. Available from: https://www.kff.org/interactive/medicaid-state-fact-sheets/ .

Hogg-Graham R, Scott AM, Waters TM. Medicaid Managed Care Organizations and Community Based Organizations Social Need Strategies Interview Guide. University of Kentucky. 2021.

Sandelowski M. Using qualitative research. Qual Health Res. 2004;14(10):1366–86.

Thornberg R, Charmaz K. Grounded theory and theoretical coding. The SAGE handbook of qualitative data analysis. 2014;2014(5):153–69.

Cascio MA, Lee E, Vaudrin N, Freedman DA. A team-based approach to open coding: Considerations for creating intercoder consensus. Field Methods. 2019;31(2):116–30.

Unite Us. Cross-sector collaboration software powered by community. [updated 2023; cited 2023 Nov 9]. Available from: https://uniteus.com/ .

FindHelp.org. Social Care Technology [Internet]. [updated 2023; cited 2023 Nov 9]. Available from: https://company.findhelp.com/ .

The Partnership Center. Vesta [Internet]. [updated 2023; cited 2023 Nov 9]. 2023; Available from: https://thepcl.net/vesta.html .

Varda DM, Retrum JH. Collaborative performance as a function of network members’ perceptions of success. Public Perform Manag Rev. 2015;38(4):632–53.

Varda DM, Chandra A, Stern SA, Lurie N. Core dimensions of connectivity in public health collaboratives. J Public Health Manag Pract. 2008;14(5):E1-7.

Byhoff E, Taylor LA. Massachusetts community-based organization perspectives on Medicaid redesign. Am J Prev Med. 2019;57(6):S74–81.


The authors would like to thank the Study Advisory Board for their help in guiding the research.

This research was supported by a Robert Wood Johnson Foundation grant as part of the Research in Transforming Health and Health Systems Program (Grant ID 77256). Research reported in this publication was also supported by the Kentucky Cabinet for Health and Family Services, Department for Medicaid Services under Agreement C2517 titled “Medicaid Managed Care Organizational Strategies to Address Enrollee Unmet Social Needs.” The content is solely the responsibility of the authors and does not necessarily represent the official views of the Cabinet for Health and Family Services, Department for Medicaid Services.

Author information

Authors and affiliations.

Department of Health Management and Policy, College of Public Health, University of Kentucky, 111 Washington Ave, 107B, Lexington, KY, USA

Rachel Hogg-Graham, Emily R. Clear & Elizabeth N. Riley

Department of Communication, University of Kentucky, Lexington, KY, USA

Allison M. Scott

Institute for Public and Preventive Health, Augusta University, Augusta, GA, USA

Teresa M. Waters


Concept and design (RH-G, AMS, TMW); acquisition of data (RH-G, AMS, ERC, TMW); analysis and interpretation of data (RH-G, AMS, ER, TMW); drafting of the manuscript (RH-G, AMS, ER, ERC, TMW); critical revision of the manuscript for important intellectual content (RH-G, AMS, ER, TMW); provision of patients or study materials (RH-G, ERC); obtaining funding (RH-G, TMW); administrative, technical, or logistic support (RH-G, ERC, TMW); and supervision (RH-G).

Corresponding author

Correspondence to Rachel Hogg-Graham .

Ethics declarations

Ethics approval and consent to participate.

All research activities involving human subjects have been reviewed and approved by the Institutional Review Board at the University of Kentucky. Informed consent was obtained verbally from all participants. All methods were carried out in accordance with relevant guidelines and regulations.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Material 1.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Hogg-Graham, R., Scott, A.M., Clear, E.R. et al. Technology, data, people, and partnerships in addressing unmet social needs within Medicaid Managed Care. BMC Health Serv Res 24 , 368 (2024). https://doi.org/10.1186/s12913-024-10705-w

Received : 13 November 2023

Accepted : 11 February 2024

Published : 23 March 2024

DOI : https://doi.org/10.1186/s12913-024-10705-w

  • Social determinants of health
  • Managed care organizations
  • Health care organizations and systems

BMC Health Services Research

ISSN: 1472-6963

What Is Qualitative Research? | Methods & Examples

Published on June 19, 2020 by Pritha Bhandari . Revised on June 22, 2023.

Qualitative research involves collecting and analyzing non-numerical data (e.g., text, video, or audio) to understand concepts, opinions, or experiences. It can be used to gather in-depth insights into a problem or generate new ideas for research.

Qualitative research is the opposite of quantitative research , which involves collecting and analyzing numerical data for statistical analysis.

Qualitative research is commonly used in the humanities and social sciences, in subjects such as anthropology, sociology, education, health sciences, history, etc.

  • How does social media shape body image in teenagers?
  • How do children and adults interpret healthy eating in the UK?
  • What factors influence employee retention in a large organization?
  • How is anxiety experienced around the world?
  • How can teachers integrate social issues into science curriculums?

Table of contents

  • Approaches to qualitative research
  • Qualitative research methods
  • Qualitative data analysis
  • Advantages of qualitative research
  • Disadvantages of qualitative research
  • Other interesting articles
  • Frequently asked questions about qualitative research

Qualitative research is used to understand how people experience the world. While there are many approaches to qualitative research, they tend to be flexible and focus on retaining rich meaning when interpreting data.

Common approaches include grounded theory, ethnography , action research , phenomenological research, and narrative research. They share some similarities, but emphasize different aims and perspectives.

Note that qualitative research is at risk for certain research biases including the Hawthorne effect , observer bias , recall bias , and social desirability bias . While not always totally avoidable, awareness of potential biases as you collect and analyze your data can prevent them from impacting your work too much.

Each of the research approaches involves using one or more data collection methods. These are some of the most common qualitative methods:

  • Observations: recording what you have seen, heard, or encountered in detailed field notes.
  • Interviews:  personally asking people questions in one-on-one conversations.
  • Focus groups: asking questions and generating discussion among a group of people.
  • Surveys : distributing questionnaires with open-ended questions.
  • Secondary research: collecting existing data in the form of texts, images, audio or video recordings, etc.

For example, to study a company’s culture, you might combine several of these methods:

  • You take field notes with observations and reflect on your own experiences of the company culture.
  • You distribute open-ended surveys to employees across all the company’s offices by email to find out if the culture varies across locations.
  • You conduct in-depth interviews with employees in your office to learn about their experiences and perspectives in greater detail.

Qualitative researchers often consider themselves “instruments” in research because all observations, interpretations and analyses are filtered through their own personal lens.

For this reason, when writing up your methodology for qualitative research, it’s important to reflect on your approach and to thoroughly explain the choices you made in collecting and analyzing the data.

Qualitative data can take the form of texts, photos, videos and audio. For example, you might be working with interview transcripts, survey responses, fieldnotes, or recordings from natural settings.

Most types of qualitative data analysis share the same five steps:

  • Prepare and organize your data. This may mean transcribing interviews or typing up fieldnotes.
  • Review and explore your data. Examine the data for patterns or repeated ideas that emerge.
  • Develop a data coding system. Based on your initial ideas, establish a set of codes that you can apply to categorize your data.
  • Assign codes to the data. For example, in qualitative survey analysis, this may mean going through each participant’s responses and tagging them with codes in a spreadsheet. As you go through your data, you can create new codes to add to your system if necessary.
  • Identify recurring themes. Link codes together into cohesive, overarching themes.
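
The coding steps above can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the responses, the codebook, and the grouping of codes into themes are invented for illustration, and keyword matching is only a crude stand-in for the researcher's own judgment when assigning codes.

```python
from collections import Counter

# Steps 1-2: hypothetical open-ended survey responses, already transcribed.
responses = {
    "P1": "My manager trusts me to set my own hours, and the pay is fair.",
    "P2": "I stay because of my team, even though the pay could be better.",
    "P3": "Flexible hours matter most to me.",
}

# Step 3: a hypothetical coding system (code -> keywords that trigger it).
codebook = {
    "flexibility": ["hours", "flexible"],
    "compensation": ["pay", "salary"],
    "relationships": ["team", "manager"],
}

# Step 5: an assumed grouping of codes into overarching themes.
themes = {
    "working conditions": ["flexibility", "compensation"],
    "social ties": ["relationships"],
}


def assign_codes(text, codebook):
    """Step 4: tag a response with every code whose keyword appears in it."""
    lowered = text.lower()
    return sorted(
        code
        for code, keywords in codebook.items()
        if any(keyword in lowered for keyword in keywords)
    )


# Apply the codebook to each participant's response.
coded = {pid: assign_codes(text, codebook) for pid, text in responses.items()}

# Count how often each code occurs, then roll the counts up into themes.
code_counts = Counter(code for codes in coded.values() for code in codes)
theme_counts = {
    theme: sum(code_counts[code] for code in codes)
    for theme, codes in themes.items()
}

for pid, codes in coded.items():
    print(pid, codes)
print(theme_counts)
```

In practice, you would read each response yourself and add new codes to the codebook as they emerge; the counts are only a starting point for interpreting themes, not a result in themselves.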

There are several specific approaches to analyzing qualitative data. Although these methods share similar processes, they emphasize different concepts.

Qualitative research often tries to preserve the voice and perspective of participants and can be adjusted as new research questions arise. Qualitative research is good for:

  • Flexibility

The data collection and analysis process can be adapted as new ideas or patterns emerge. They are not rigidly decided beforehand.

  • Natural settings

Data collection occurs in real-world contexts or in naturalistic ways.

  • Meaningful insights

Detailed descriptions of people’s experiences, feelings and perceptions can be used in designing, testing or improving systems or products.

  • Generation of new ideas

Open-ended responses mean that researchers can uncover novel problems or opportunities that they wouldn’t have thought of otherwise.

Researchers must consider practical and theoretical limitations in analyzing and interpreting their data. Qualitative research suffers from:

  • Unreliability

The real-world setting often makes qualitative research unreliable because of uncontrolled factors that affect the data.

  • Subjectivity

Due to the researcher’s primary role in analyzing and interpreting data, qualitative research cannot be replicated . The researcher decides what is important and what is irrelevant in data analysis, so interpretations of the same data can vary greatly.

  • Limited generalizability

Small samples are often used to gather detailed data about specific contexts. Despite rigorous analysis procedures, it is difficult to draw generalizable conclusions because the data may be biased and unrepresentative of the wider population .

  • Labor-intensive

Although software can be used to manage and record large amounts of text, data analysis often has to be checked or performed manually.

Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to systematically measure variables and test hypotheses . Qualitative methods allow you to explore concepts and experiences in more detail.

There are five common approaches to qualitative research :

  • Grounded theory involves collecting data in order to develop new theories.
  • Ethnography involves immersing yourself in a group or organization to understand its culture.
  • Narrative research involves interpreting stories to understand how people make sense of their experiences and perceptions.
  • Phenomenological research involves investigating phenomena through people’s lived experiences.
  • Action research links theory and practice in several cycles to drive innovative changes.

Data collection is the systematic process by which observations or measurements are gathered in research. It is used in many different contexts by academics, governments, businesses, and other organizations.

There are various approaches to qualitative data analysis , but they all share five steps in common:

  • Prepare and organize your data.
  • Review and explore your data.
  • Develop a data coding system.
  • Assign codes to the data.
  • Identify recurring themes.

The specifics of each step depend on the focus of the analysis. Some common approaches include textual analysis , thematic analysis , and discourse analysis .

Cite this Scribbr article

Bhandari, P. (2023, June 22). What Is Qualitative Research? | Methods & Examples. Scribbr. Retrieved March 26, 2024, from https://www.scribbr.com/methodology/qualitative-research/

Published on 27.3.2024 in Vol 26 (2024)

This is a member publication of University of Toronto

Bridging and Bonding Social Capital by Analyzing the Demographics, User Activities, and Social Network Dynamics of Sexual Assault Centers on Twitter: Mixed Methods Study

Original Paper

  • Jia Xue 1,2, PhD
  • Qiaoru Zhang 3*, BA
  • Yun Zhang 3*, MI
  • Hong Shi 3*, MI
  • Chengda Zheng 3*, MI
  • Jingchuan Fan 1, MSW, MPH
  • Linxiao Zhang 1, MA, MSc
  • Chen Chen 3, PhD
  • Luye Li 4, PhD
  • Micheal L Shier 1, PhD

1 Factor Inwentash Faculty of Social Work, University of Toronto, Toronto, ON, Canada

2 Faculty of Information, University of Toronto, Toronto, ON, Canada

3 Artificial Intelligence for Justice Lab, University of Toronto, Toronto, ON, Canada

4 Department of Sociology, Anthropology, Social Work, and Criminal Justice, Seton Hall University, South Orange, NJ, United States

*these authors contributed equally

Corresponding Author:

Jia Xue, PhD

Factor Inwentash Faculty of Social Work

University of Toronto

246 Bloor Street West

Toronto, ON, M5S 1V4

Phone: 1 416 946 5429

Email: [email protected]

Background: Social media platforms have gained popularity as communication tools for organizations to engage with clients and the public, disseminate information, and raise awareness about social issues. From a social capital perspective, relationship building is seen as an investment, involving a complex interplay of tangible and intangible resources. Social media–based social capital signifies the diverse social networks that organizations can foster through their engagement on social media platforms. Literature underscores the great significance of further investigation into the scope and nature of social media use, particularly within sectors dedicated to service delivery, such as sexual assault organizations.

Objective: This study aims to fill a research gap by investigating the use of Twitter by sexual assault support agencies in Canada. It seeks to understand the demographics, user activities, and social network structure within these organizations on Twitter, focusing on building social capital. The research questions explore the demographic profile, geographic distribution, and Twitter activity of these organizations as well as the social network dynamics of bridging and bonding social capital.

Methods: This study used purposive sampling to investigate sexual assault centers in Canada with active Twitter accounts, resulting in the identification of 124 centers. The Twitter handles were collected, yielding 113 unique handles, and their corresponding Twitter IDs were obtained and validated. A total of 294,350 tweets were collected from these centers, covering >93.54% of their Twitter activity. Preprocessing was conducted to prepare the data, and descriptive analysis was used to determine the center demographics and age. Furthermore, geolocation mapping was performed to visualize the center locations. Social network analysis was used to explore the intricate relationships within the network of sexual assault center Twitter accounts, using various metrics to assess the network structure and connectivity dynamics.

Results: The results highlight the substantial presence of sexual assault organizations on Twitter, particularly in provinces such as Ontario, British Columbia, and Quebec, underscoring the importance of tailored engagement strategies considering regional disparities. The analysis of Twitter account creation years shows a peak in 2012, followed by a decline in new account creations in subsequent years. The monthly tweet activity shows November as the most active month, whereas July had the lowest activity. The study also reveals variations in Twitter activity, account creation patterns, and social network dynamics, identifying influential social queens and marginalized entities within the network.

Conclusions: This study presents a comprehensive landscape of the demographics and activities of sexual assault centers in Canada on Twitter. This study suggests that future research should explore the long-term consequences of social media use and examine stakeholder perceptions, providing valuable insights to improve communication practices within the nonprofit human services sector and further the missions of these organizations.


Use of Social Media by Nonprofit Organizations

Social media platforms, including Twitter (subsequently rebranded as X [X Corp]), have gained popularity among nonprofit advocacy organizations as essential tools for communication and public engagement [ 1 , 2 ]. Nonprofit organizations are increasingly recognizing the strategic value of social media in fostering public engagement, securing donations, disseminating information, recruiting volunteers, and raising awareness about social issues [ 3 - 8 ]. Today, most large and mid-sized nonprofit organizations actively maintain at least 1 social media account, underscoring the extensive use of social media within the nonprofit realm [ 9 ].

Twitter, for instance, offers nonprofit organizations a platform to create profiles, establish networks, and engage socially through features such as tweeting, sharing multimedia content, replying, and retweeting [ 10 ]. Recognized as a cost-effective means of consistently reaching a broader audience [ 11 , 12 ], Twitter proves especially valuable for nonprofit organizations, often facing limited financial resources and dedicated communication staff [ 12 ], including sexual assault centers, to actively engage with key stakeholders and spark meaningful conversations [ 13 , 14 ]. Moreover, engaging in dialogues with other Twitter users forms a central aspect of communication for these organizations, facilitating increased supporter involvement, knowledge dissemination, and the creation of supportive communities [ 15 , 16 ]. Nonprofit organizations have successfully captured their followers’ attention by regularly tweeting, responding to specific tweets, and retweeting other users’ content [ 2 ]. This social media engagement can be harnessed by nonprofit organizations to share educational information and advocate for social causes [ 17 ].

Bridging and Bonding Social Capital

Social capital plays a critical role in understanding the effectiveness of nonprofit organizations, as it is embedded within their networks, enabling them to enhance their adaptive capabilities by consolidating shared interests and harnessing diverse resources [ 18 , 19 ]. Within the context of social capital, Putnam [ 20 ] distinguishes between 2 fundamental forms: bridging and bonding social capital. Bridging social capital encompasses the distant and weak connections between individuals from diverse backgrounds, facilitating information flow. This often manifests as nonmutual following relationships between organizations and a diverse public. In contrast, bonding social capital revolves around preexisting and robust ties that reinforce homogeneity among groups, fostering emotional and social support. An illustrative example of this is the mutual following relationships observed between similar organizations [ 21 , 22 ].

From a social capital perspective, relationship building is seen as an investment. It involves a complex interplay of tangible and intangible resources, both embedded within existing relationships and generated through the act of forging new ones [ 23 , 24 ]. The success of nonprofit organizations relies significantly on their capacity to establish high-quality relationships with key stakeholders, including donors, clients, grant makers, seekers, and the broader public [ 25 , 26 ]. The social capital of nonprofits comprises the wealth of resources intricately embedded within these strategic alliances and stakeholder relationships [ 27 ]. Xu and Saxton [ 26 ] propose that the effective acquisition of social capital, at an elevated level, relies on the scope and quality of stakeholder connections. Their study introduces and demonstrates the significance of 2 primary stakeholder engagement strategies: content-based and connection-based strategies. This study underscores that the attainment of social capital is less about the number of stakeholder engagements and more about the breadth of those engagements. This breadth includes diverse stakeholder connections.

Social Media–Based Social Capital

Social media–based social capital signifies the diverse social networks that organizations can foster through their engagement on social media platforms [ 26 ]. The potential of social media to nurture and sustain web-based–offline social capital is substantial, although its effectiveness varies across platforms and strategies [ 28 ]. Platforms such as Facebook (Meta Platforms), Twitter, and Instagram (Meta Platforms) offer distinctive usability features that influence the dynamics of bridging and bonding social capital among their users. An important study conducted by Phua et al [ 22 ] examined the impact of 4 major social networking sites (Facebook, Twitter, Instagram, and Snapchat [Snap Inc]) on the development of web-based bridging and bonding social capital among 297 users. Their findings indicate that Twitter users exhibit the highest levels of bridging social capital, followed by Instagram, Facebook, and Snapchat. Conversely, when it comes to bonding social capital, Snapchat users demonstrate the highest levels, followed by Facebook, Instagram, and Twitter. Furthermore, research suggests a direct correlation between the number of followers and the development of bonding social capital [ 21 ]. Another study by Xu and Saxton [ 26 ], focusing on 198 community foundations, reinforces the importance of social media engagement strategies tailored to multiple intersectoral stakeholders and diverse communication patterns, which substantially contribute to the development of social media–based social capital.

In the context of nonprofit organizations, studies by Henry and Bosman [ 29 ] and Lee and Shon [ 30 ] underscore the positive impact of web-based social capital generated through social networking sites on charitable outcomes. These studies reveal that the quantity of Twitter followers is linked positively with personal contributions, although not necessarily with full-time equivalent volunteers [ 30 ]. Moreover, Xu and Saxton [ 26 ], drawing from Twitter data consisting of 198 community foundations, highlight the pivotal role of stakeholder engagement diversity over connection quantities. They emphasize the significance of using multiple communicative cues, such as message elements, and targeting intersectoral and interregional stakeholders in the successful acquisition of social capital through social networking sites. Leveraging social media platforms offers numerous advantages to nonprofit organizations, including the engagement of a donor base within the general population [ 1 , 31 ], the facilitation of communication strategies through the dissemination of information to a broader global audience [ 32 ], and the support of advocacy efforts for social change and community mobilization [ 17 ]. Svensson et al [ 33 ] examined the Twitter use of sport-for-development organizations and identified varying levels of engagement across different entities, potentially limiting the cultivation of social media–based social capital within this sector. Investigating the extent of social media use serves as a valuable tool to inform recommendations aimed at enhancing nonprofits’ web-based presence and fostering social media–based social capital [ 2 , 4 , 34 ].

These findings underscore the great significance of further investigation into the scope and nature of social media use, particularly within sectors dedicated to service delivery, such as sexual assault programs and organizations, which share a common mission and focus of their efforts. The endeavor to augment social capital through social media within a given sector has the potential to expand the donor and volunteer base, engage the community in matters affecting everyone, and catalyze broader social change at the policy level by mobilizing concerned citizens.

Aim of the Study

This study aims to address the existing research gap surrounding the use of social media platforms such as Twitter by specific organizations, such as sexual assault support agencies. This study intends to investigate user activities, demographics, and social network structures within these organizations on Twitter. By doing so, we aim to contribute to a better understanding of the current state of social media adoption and social network structures within sexual assault organizations in Canada. In addition, this study provides valuable insights and recommendations for building social capital among the sexual assault organizations on social media. To achieve these goals, we formulated the following research questions (RQs):

  • RQ1a: How prevalent are sexual assault centers in Canada with official Twitter accounts, and which provinces and territories have the highest number of centers actively using Twitter?
  • RQ1b: What are the geographic locations of sexual assault centers with official Twitter accounts in Canada?
  • RQ1c: In what years were the Twitter accounts of sexual assault centers in Canada established, and are there any differences in account creation among provinces and territories?
  • RQ1d: What is the average age of these centers since establishing their official Twitter accounts, and do any differences in account creation exist among provinces and territories?
  • RQ2a: How many sexual assault centers maintain an active Twitter account each year in each province or territory?
  • RQ2b: What are the Twitter activity and posting patterns of sexual assault centers while they are active on Twitter?
  • RQ2c: How do the Twitter activity and posting patterns vary across provinces and territories?
  • RQ3a: What are the variations in network size, specifically in terms of followers and followings, among sexual assault centers in different provinces and territories in Canada?
  • RQ3b: What is the relationship between followers and followings of these organizations on Twitter?
  • RQ3c: What insights can be gained from the social network structure of sexual assault centers on Twitter?

This study used purposive sampling to select sexual assault centers in Canada. Our sampling frame was developed by combining lists of sexual assault centers by province and territory from 2 sources: the Canadian Association of Sexual Assault Centres website and the Sexual Assault Centres, Crisis Lines, and Support Services directory. After removing duplicates, our sample frame consisted of 350 sexual assault centers across 10 provinces and 3 territories, providing basic information such as center name, phone number, email, and website. Our inclusion criteria were that the sexual assault center had a Twitter account and had posted at least 1 tweet. To confirm eligibility, a research assistant manually searched the home page of these centers and Twitter pages and conducted Google searches. We determined that 127 organizations had Twitter accounts, but 3 of them had never tweeted anything. As a result, our final sample consisted of 124 Twitter accounts belonging to sexual assault centers across 9 provinces and the Yukon and Northwest Territories. It should be noted that there were no sexual assault centers in Prince Edward Island and Nunavut that used Twitter.

Twitter Handles’ Acquisition

We collected the Twitter account name, location (eg, Toronto, Ontario), and Twitter handle (eg, @ABCD) for each sexual assault center’s Twitter account. The Twitter handle, represented as “@name,” is used by followers when replying to, mentioning, and sending direct messages to an account. We identified 11 duplicate Twitter handles among the sampled centers. As a result, our final sampling list consisted of 113 unique Twitter handles obtained from 124 centers. We gathered this information directly from the home page of each sexual assault center’s Twitter account.

Data Collection

Acquisition of Twitter IDs

To collect the data necessary for this study, we obtained Twitter IDs for the 113 unique Twitter handles in our sample. A Twitter ID (eg, 12345678) is a unique numeric value associated with each Twitter handle, and it cannot be changed. We converted each Twitter handle to its corresponding Twitter ID. To ensure the accuracy of our conversions, 2 research assistants verified the results using 3 different websites: TweeterID, CodeOfaNinja, and Comment Picker.

Collection of Tweets

We used the 113 Twitter IDs associated with the sampled 124 sexual assault centers to collect their corresponding tweets. To accomplish this, we used the full archive end point and the timeline end point of Twitter’s academic search application programming interface (API), which allowed us to retrieve tweets published as early as 2006 [ 35 ]. We accessed the Twitter API using native REST API requests. Our data collection process was conducted on March 15, 2023. We downloaded all tweets posted by the sampled centers from the date of each account’s establishment to March 15, 2023. Our data set included 294,350 tweets from 124 sexual assault centers, crisis lines, or support services. We obtained a substantial portion of the total number of tweets published by each Twitter ID on Twitter, specifically, >93.54%.

Data Features

We collected several features for each individual tweet message, including the user ID (user_id_str), user account creation date (user_created_at), user location (user_location), username (user_name), user screen name (user_screen_name), tweet creation time (tweet_created_time), full text of the tweet (full_text), and full text of any retweeted status (retweeted_status_full_text).

On Twitter, users commonly use functions such as retweets, replies, mentions, and hashtags. Retweets refer to publicly shared tweets between users and their followers. Users can also add their own comments and media before retweeting. In addition, users can participate in conversations on Twitter by replying to other users and mentioning them in their tweets. Finally, hashtags allow users to easily follow and search for topics of interest.

Data Analysis

Preprocessing of Raw Data

To address our RQs, we preprocessed the raw data using the following steps:

  • We removed URLs from the tweets.
  • We removed all punctuation marks, with the exception of apostrophes, which are important for contextual meaning in certain words (eg, “We’re”).
  • We removed any bigram from the set if either of its elements belonged to the list of stop words. For instance, the phrase “The increasing awareness about sexual assault” would generate the bigrams “the increasing,” “increasing awareness,” “awareness about,” “about sexual,” and “sexual assault.” In this case, the bigrams “the increasing,” “awareness about,” and “about sexual” would be removed because they contain the stop words “the” and “about,” leaving the bigrams “increasing awareness” and “sexual assault” in the set.
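The three preprocessing steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the stop-word list, the regular expressions, the lowercasing step, and the function names `clean_text` and `filtered_bigrams` are hypothetical and are not taken from the study.

```python
import re

# Assumption: a tiny illustrative stop-word list; the study's actual list is not given.
STOP_WORDS = {"the", "about", "a", "an", "of", "and", "to", "in"}

# Assumption: URLs are recognized by an http(s):// or www. prefix.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
# Matches any character that is not a word character (letter, digit, underscore),
# whitespace, or an apostrophe (straight or curly), so apostrophes survive.
PUNCT_PATTERN = re.compile(r"[^\w\s'’]")


def clean_text(text):
    """Remove URLs and punctuation (except apostrophes), then tokenize."""
    text = URL_PATTERN.sub(" ", text)
    text = PUNCT_PATTERN.sub(" ", text)
    # Assumption: tokens are lowercased so that "The" matches the stop word "the".
    return text.lower().split()


def filtered_bigrams(tokens):
    """Keep only bigrams in which neither element is a stop word."""
    pairs = zip(tokens, tokens[1:])
    return [f"{a} {b}" for a, b in pairs
            if a not in STOP_WORDS and b not in STOP_WORDS]


tokens = clean_text("The increasing awareness about sexual assault: https://example.org/x")
print(filtered_bigrams(tokens))  # ['increasing awareness', 'sexual assault']
```

Applied to the example phrase from the text, the sketch keeps exactly the two bigrams that contain no stop word.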

Descriptive Analysis

Descriptive analysis was used to calculate the number of sexual assault centers in each province, the number of centers created each year, and their average age. The age of each sexual assault center’s Twitter account was determined by subtracting the account’s establishment date from the data collection date in March 2023. For example, we used R’s difftime method (R Core Team) to calculate the age of center A’s Twitter account, which was created on March 12, 2009. By subtracting “2009-03-12” from “2023-03-15” to obtain the time difference, we determined that this center has an age of 14 years.
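The study performed this calculation with R’s difftime; the same date arithmetic can be sketched in Python, the language used here for illustration. The function names and the 365.25-day year used for the fractional age are assumptions, not details reported in the text.

```python
from datetime import date

COLLECTION_DATE = date(2023, 3, 15)  # data collection date stated in the text


def account_age_years(created, collected=COLLECTION_DATE):
    """Whole years elapsed between account creation and data collection."""
    years = collected.year - created.year
    # Subtract one year if the anniversary has not yet occurred in the collection year.
    if (collected.month, collected.day) < (created.month, created.day):
        years -= 1
    return years


def account_age_fractional(created, collected=COLLECTION_DATE):
    """Age in fractional years; assumes an average year length of 365.25 days."""
    return (collected - created).days / 365.25


print(account_age_years(date(2009, 3, 12)))  # 14
```

For center A’s account in the example, the whole-year age is 14, which matches the figure given in the text.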

Geolocation Mapping of Sexual Assault Centers

We used the Twitter accounts’ IDs and sexual assault center locations to determine the actual locations of each tweet sent by the 124 centers. The locations were plotted and visualized on a map of Canada. One research assistant (QZ) manually retrieved center location information, including the city, region, and province, from the centers’ official websites and obtained the longitudes and latitudes of the cities where the centers were located. The Google API was used to calibrate the geolocations if the absolute distance discrepancy between manually identified geolocations and Google map geolocations was >6 km. Finally, we developed a Python script to automatically generate D3.js for mapping all the centers with their latitudes and longitudes (the script is available upon request). We used Figma to indicate the cities on the map [ 36 ].
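The 6 km discrepancy check can be illustrated with a short Python sketch. The text does not say how the distance was computed, so the great-circle (haversine) formula, the function names, and the coordinates in the usage lines are assumptions made for illustration only.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius
THRESHOLD_KM = 6.0        # discrepancy threshold stated in the text


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def needs_calibration(manual, geocoded, threshold_km=THRESHOLD_KM):
    """True when the manual and geocoded points are more than threshold_km apart."""
    return haversine_km(*manual, *geocoded) > threshold_km


# Hypothetical coordinates: the second point lies 0.1 degrees of latitude north.
print(needs_calibration((43.65, -79.38), (43.75, -79.38)))  # True
```

A pair of points flagged in this way would then be corrected against the geocoding service, as the paragraph describes.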

Social Network Analysis

Social network analysis is one of the most effective techniques for visualizing and assessing network connectivity dynamics, offering insights into patterns of connection and disconnection among participants at a given moment. In our study, we used social network analysis to construct networks from Twitter accounts, where nodes represented accounts and directed edges symbolized follower relationships. We used Pyvis [ 37 ] and NetworkX in Python to create the network, resulting in 111 nodes and 995 edges, which is a visual representation of the relationships among the sexual assault centers’ Twitter accounts. To delve deeper, we applied a range of metrics: (1) density, which quantifies the proportion of possible connections that are actually present in the network; (2) degree centrality, which evaluates a node’s significance by examining its connections, distinguishing between incoming (in-degree) and outgoing (out-degree) connections [ 38 ]; (3) eigenvector centrality, which measures a node’s influence in a network by considering the relative scores of connected nodes and is based on the concept that connections to nodes with higher scores exert a more significant influence on determining the node’s score, in contrast to connections with nodes having lower scores [ 39 ]; (4) modularity, indicating the network’s community organization strength through clustering [ 40 , 41 ]; (5) betweenness centrality, evaluating an individual’s role as a bridge between unconnected entities, fostering vital connections among clusters, communities, and organizations [ 38 ]; and (6) closeness, gauging a node’s centrality in a connected graph by summing the shortest path lengths to all other nodes [ 42 ]. These metrics collectively provided a comprehensive understanding of the social network’s structure, shedding light on its various facets and opportunities for relationship cultivation and network analysis.
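To make the first two metrics concrete, the following dependency-free Python sketch computes density and in-degree and out-degree centrality for a directed “x follows y” edge list. It is not the study’s code, which used NetworkX. The toy edge list and function names are hypothetical, and the sketch assumes no self-loops and normalizes degree by n − 1, the convention NetworkX also uses for degree centrality.

```python
def density(n_nodes, n_edges):
    """Directed-graph density: actual edges over the n * (n - 1) possible ones."""
    return n_edges / (n_nodes * (n_nodes - 1))


def degree_centrality(edges):
    """Return (in_centrality, out_centrality) dicts for 'x follows y' edges.

    Only nodes that appear in at least one edge are included.
    """
    nodes = {node for edge in edges for node in edge}
    scale = len(nodes) - 1  # each node can connect to at most n - 1 others
    in_deg = dict.fromkeys(nodes, 0)
    out_deg = dict.fromkeys(nodes, 0)
    for follower, followed in edges:
        out_deg[follower] += 1
        in_deg[followed] += 1
    in_c = {n: d / scale for n, d in in_deg.items()}
    out_c = {n: d / scale for n, d in out_deg.items()}
    return in_c, out_c


# Hypothetical toy network: A, C, and D follow B; B follows A back.
edges = [("A", "B"), ("C", "B"), ("D", "B"), ("B", "A")]
in_c, out_c = degree_centrality(edges)
print(in_c["B"])                      # 1.0 (followed by every other node)
print(round(density(111, 995), 4))    # 0.0815
```

With the 111 nodes and 995 edges reported above, this definition gives a density of about 0.08, meaning roughly 8% of all possible follow relationships among the centers are present.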

Ethical Considerations

The data set and analyses relied on publicly accessible secondary Twitter data; thus, no ethics approval or organizational consent was necessary. The study data presented in this manuscript were subjected to anonymization and deidentification procedures. All personally identifiable information, including but not limited to individual organizations’ identities, pictures, user-specific data, or tweets that have not been rephrased, has been meticulously removed from the data set to ensure complete anonymity.

Demographic Profile of Sexual Assault Centers on Twitter

Prevalence of Sexual Assault Centers on Twitter

We found that 124 (35.4%) centers out of the 350 sampled sexual assault centers have an official Twitter account and have posted at least 1 tweet since their establishment. We investigated the locations of the 124 sexual assault centers in Canada that have official Twitter accounts. The province with the highest number of sexual assault centers was Ontario (n=34), followed by British Columbia (BC; n=24) and Quebec (n=23). These 3 provinces accounted for two-thirds (81/124, 65.3%) of all sampled sexual assault centers in Canada. Newfoundland and Labrador and Yukon have only 1 sexual assault center each. Figure 1 shows the prevalence of sexual assault centers in Canada that have posted tweets.

The Geographic Distribution of Sexual Assault Centers on Twitter

To visualize the distribution of sexual assault centers with Twitter accounts, we created geographic distribution maps for Ontario, BC, and Quebec, which had the highest number of centers in our sample. Additional geographic distributions of Twitter accounts in the remaining provinces and territories are presented in Multimedia Appendix 1 .

Our sample included 34 sexual assault centers from 27 cities in Ontario, shown in Figure 2 . Our analysis revealed that most of these centers were concentrated in the southeast region of the province, which is also where most of Ontario’s population resides [ 43 ]. Specifically, many centers were found in the cities of Toronto, Ottawa, Peterborough, Timmins, Brampton, and London, which have larger populations in Ontario.

British Columbia

A total of 24 sexual assault centers were identified in BC, spread across 15 cities, shown in Figure 3 . Vancouver had the highest number of centers in the province, followed by Surrey. As per population distribution, most of the population in BC resides in the southern part of the province [ 44 ], and similarly, most of the sampled centers are located in the southern region of BC.

In Quebec, there are 23 sexual assault centers located in 17 cities, with 5 centers located in Montreal, shown in Figure 4 . We found that nearly all the centers are situated in the southern region of Quebec, where most of the province’s population resides [ 45 ]. Notably, no centers were located in the northern region of the province.

Twitter Account Creation Year by Sexual Assault Centers on Twitter: Provincial and Territorial Differences

We analyzed the year of establishment of the Twitter accounts and assessed whether there were any differences in account creation among provinces and territories. Figure 5 shows our analysis results of the sampled sexual assault centers’ Twitter account creation year. The first Twitter account was created in 2009, and the number of new accounts created by sexual assault centers gradually increased from 2009 to 2012, peaking in 2012 with 23 new accounts. However, from 2013 to 2017, the number of Twitter accounts created by sexual assault centers decreased. In 2019, only 1 sexual assault center created a Twitter account, and there were no new accounts in 2021. In 2022, we identified 3 centers that had established new Twitter accounts. Notably, all 3 centers that created new Twitter accounts had been in operation for >30 years, as confirmed by our examination of their official profiles.

In addition, we analyzed the distribution of the Twitter accounts created across provinces from 2009 to 2023 and found provincial differences. A total of 8 sexual assault centers from 4 provinces, including Alberta, BC, Nova Scotia, and Ontario, created their Twitter accounts in 2009. From 2009 to 2017, we observed a recurring trend among centers located in Ontario and BC, the 2 provinces with the highest number of centers in Canada, where they established new Twitter accounts on an annual basis. Ontario contributed the most new Twitter accounts in 2011 (n=5) and 2012 (n=13). In Quebec, the third-largest province in terms of the number of centers with Twitter accounts, all sexual assault centers began using Twitter after 2010, and remarkably, 7 new accounts were created in that year alone. We also noted that most centers in Saskatchewan established their Twitter accounts in 2014. In addition, 3 sexual assault centers located in the Canadian territories also created their Twitter accounts. Specifically, in 2019, one organization in Yukon established a Twitter account, whereas in 2010, two centers located in the Northwest Territories created their Twitter accounts.

Average Age of Sexual Assault Centers on Twitter by Province and Territory

We calculated the age of each sexual assault center’s Twitter account since its creation. We determined the duration of time that each sexual assault center had its official Twitter account by subtracting the account creation date from the most recent month of collected tweets (March 2023). Then, we computed the average length of time for all sexual assault centers in each province and presented the results in Table 1 .

Table 1 displays the average length of time, SD, and range of the duration of years after the establishment of Twitter accounts by sexual assault centers in each province. For example, 13 sexual assault centers in Alberta created Twitter accounts, and the average number of years since their accounts’ establishment was 10.26 (SD 2.39) years. The range of 5 to 14 indicated that the earliest Twitter account was created in 2009 (March 2023: 14 y), whereas the most recent account was established in 2018 (March 2023: 5 y). These 13 sexual assault centers accounted for 10.5% (13/124) of the total 124 sexual assault centers in Canada.

To obtain a comprehensive understanding of Twitter presence and engagement, we used the metadata of Twitter users related to sexual assault centers in each province, which we obtained from the Twitter timeline API. Specifically, we extracted data such as “followers_count,” “friends_count,” “favorites_count,” and “listed_count” to determine the total number of followers, following, favorites, and listed users, respectively, for each province. We also aggregated the collected data by province to calculate the total number of tweets posted for each province.

User Activity of Sexual Assault Centers on Twitter

Active Twitter Accounts in Canadian Sexual Assault Centers by Province and Territory

We analyzed the data to determine the number of active Twitter accounts maintained by sexual assault centers in each Canadian province and territory each year. We defined an active account as one that posted at least 1 tweet in a given year. The results indicated a steady increase in the number of active Twitter accounts in Ontario and BC since 2009, as shown in Figure 6 . In Quebec, there was an increase in the number of centers from 3 in 2011 to 9 in 2015, but this trend reversed in the following years, indicating that many Twitter accounts became inactive. Although there was an increase in the number of registered Twitter accounts from 13 in 2015 to 16 in 2018, only 4 of these accounts remained active in 2020, down from 9 in 2015. Alberta ranks third or fourth in terms of active Twitter accounts.
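The “active account” definition above (at least 1 tweet in a given year) lends itself to a simple set-based count. The sketch below is a minimal Python illustration. The record layout of (province, account ID, year) tuples, the function name, and the sample rows are hypothetical and do not come from the study’s data.

```python
from collections import defaultdict


def active_accounts_by_year(tweets):
    """Count accounts with at least 1 tweet per (province, year).

    tweets: iterable of (province, account_id, year) tuples, one per tweet.
    """
    active = defaultdict(set)
    for province, account_id, year in tweets:
        # A set ignores repeat tweets, so each account is counted once per year.
        active[(province, year)].add(account_id)
    return {key: len(ids) for key, ids in active.items()}


# Hypothetical rows: account 1 tweets twice in 2015, account 2 once.
rows = [
    ("Ontario", 1, 2015),
    ("Ontario", 1, 2015),
    ("Ontario", 2, 2015),
    ("Quebec", 3, 2015),
    ("Ontario", 1, 2016),
]
counts = active_accounts_by_year(rows)
print(counts[("Ontario", 2015)])  # 2
```

Because an account contributes to a year only if it tweeted in that year, an account that falls silent simply disappears from later counts, which is how the declines described above would appear.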

In Manitoba, 2 sexual assault centers had active Twitter accounts until 2020. Only 1 center in New Brunswick registered a Twitter account in 2013 but was inactive in 2018 and 2020 while maintaining activity in other years. In Newfoundland and Labrador, the only sexual assault center remained active on Twitter from 2012 to 2020. Nova Scotia consistently showed an increasing trend in the number of active Twitter accounts from 2009 to 2020. Meanwhile, the only center in the Northwest Territories maintained an active Twitter account since 2010. In Saskatchewan, the number of active Twitter accounts increased to 6 in 2015 but decreased to 3 in 2020.

Twitter Activity and Posting Patterns of Sexual Assault Centers

We also examined the Twitter activity of sexual assault centers in Canada, investigating the popular times for tweeting and the number of tweets posted per month. Figure 7 shows the total number of tweets posted by all sampled centers aggregated by month. Over a 12-year period, these centers posted an average of 12,849 tweets per month. The most active month was November, with a total of 16,239 tweets, whereas the least active was July, with only 9079 tweets. March and May were also high-activity months, with >15,000 tweets each, whereas August was the second least active month, with approximately 9500 tweets.

We analyzed the monthly average tweet count of sexual assault centers during their active status (n=92). We computed the average number of tweets sent by each center per month from the first tweet after creating their account to the last tweet during the data collection period. Results showed that a large majority of the centers have a relatively low tweeting frequency, with the largest group of centers (over 20) averaging between 0 and 10 tweets per month. The distribution is right-skewed, showing that as the average monthly tweeting volume increases, the number of centers engaging at that level decreases. A small number of centers tweet between 10 and 30 times per month, and very few centers exceed this range. There are occasional outliers, with one center in particular averaging a significantly higher number of tweets per month, at around 180. This center is an extreme outlier in comparison to the rest of the data set. Overall, our findings suggest that sexual assault centers tend to use Twitter moderately, with the bulk of them tweeting fewer than 20 times per month and only an exceptional few tweeting much more frequently.
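The per-center monthly average described above can be sketched in Python. The text does not state how the active period was converted into months, so counting every calendar month from the first tweet to the last inclusively is an assumption, as are the function names and sample dates.

```python
from datetime import date


def months_spanned(first, last):
    """Calendar months from first to last, counting both end months (assumption)."""
    return (last.year - first.year) * 12 + (last.month - first.month) + 1


def monthly_average(tweet_dates):
    """Average tweets per month between a center's first and last tweet."""
    first, last = min(tweet_dates), max(tweet_dates)
    return len(tweet_dates) / months_spanned(first, last)


# Hypothetical center: 3 tweets spread over January to March 2020.
sample = [date(2020, 1, 5), date(2020, 1, 20), date(2020, 3, 1)]
print(monthly_average(sample))  # 1.0
```

Computing this value for every center and plotting the results as a histogram would yield a distribution of the kind described in the paragraph above.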

Comparative Analysis of Tweet Activity Across Provinces and Territories

To answer RQ2c, we further analyzed the total number of tweets posted by centers in different provinces and territories each month and compared the number of tweets posted across provinces and territories each year. Figure 8 shows the annual tweet activity generated from all sexual assault center accounts across provinces and territories, with data collected until March 15, 2023. Our analysis of the tweet activity revealed a gradual increase in tweeting volume across provinces and territories until 2017. Notably, Ontario exhibited the highest frequency of tweet activity in 2016, with approximately 17,000 tweets. However, with the exception of accounts in BC and Nova Scotia, the tweet activity gradually declined from 2017 to March 2023, returning to activity levels last observed in 2013 or 2014. It is worth noting that some provinces and territories, such as New Brunswick and Northwest Territories, had <100 tweets in the peak month, and hence, they were not included in our figure. The findings suggest a potential decrease in Twitter activity among sexual assault centers in recent years.

Social Network Dynamics of Sexual Assault Centers in Canada

Network Size

The sexual assault organizations under investigation had an average follower count of 1543 (SD 1555) and an average following count of 819 (SD 876). The range of follower counts was quite diverse, starting at a minimum of 8 followers for one organization and reaching a maximum of 7228 followers for another organization. Similarly, the following counts also showed significant variation, with one center having the lowest count of 1 following, whereas another center had the highest following count of 4458. More detailed information about the followers, followings, and social network analysis measurements is provided in Multimedia Appendix 2 .

Within the network of sampled centers, sexual assault organizations across 11 provinces and territories exhibited varying average follower counts, ranging from as low as 1 follower in New Brunswick to 17 followers in Ontario, as shown in Table 2 . The organization with the highest number of followers, totaling 46, was located in Ontario. In contrast, some sexual assault organizations in provinces such as Alberta and BC had no followers at all. Similarly, the average number of followings by these organizations across the 11 provinces and territories ranged from 0 in New Brunswick to 18 in Ontario. The organization with the most followings was also situated in Ontario, with a total of 38 followings, whereas several organizations in provinces such as New Brunswick and Quebec had no followings.

Relationship Between Followers and Followings on Twitter

We analyzed the Twitter user network by exploring the connections between the followers and the following lists. Figure 9 shows the log-log plot of the correlation between followers and followings. Each point on the graph represents an individual user, with the x-axis representing the user’s follower count and the y-axis indicating their following count. The plot shows that followings tend to increase as the number of followers increases. In the middle of the plot, users with a medium number of followers have a high number of followings.
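
As a rough illustration of how the trend in a log-log plot can be quantified, the sketch below computes a Pearson correlation on log-transformed counts. The account data are hypothetical and are not taken from the study, and the use of log10(count + 1) is an assumption made so that accounts with zero followers or followings stay defined.

```python
import math

# Hypothetical (followers, followings) pairs; NOT data from the study.
accounts = [(5, 2), (40, 60), (300, 450), (1500, 800), (7000, 4000)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Assumption: log10(count + 1) keeps zero counts defined on a log scale.
log_followers = [math.log10(f + 1) for f, _ in accounts]
log_followings = [math.log10(g + 1) for _, g in accounts]

r = pearson(log_followers, log_followings)
print(f"Pearson r on the log-log scale: {r:.3f}")
```

A value of r close to 1 would correspond to points lying near a rising straight line in the log-log plot.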

Analysis of Twitter Social Network Structure and Node Categorization

Figure 10 presents a full network map, illustrating the relationships between followers and followings among 111 sexual assault centers on Twitter. This graph features 111 nodes and 995 edges. Each node represents a sexual assault center on Twitter, and each edge is directional, with arrows symbolizing an “x follows y” relationship. In this context, y is the node being followed, whereas x is the follower node, that is, the node that follows others.
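
The directed “x follows y” structure can be sketched with a plain edge list. The node names and edges below are hypothetical; the point is only that a node's follower count is its in-degree and its following count is its out-degree.

```python
from collections import Counter

# Hypothetical edge list: each pair (x, y) means "x follows y".
edges = [("A", "B"), ("C", "B"), ("B", "A"), ("D", "B"), ("A", "C")]

# In-degree = number of followers; out-degree = number of followings.
followers = Counter(y for _, y in edges)
followings = Counter(x for x, _ in edges)

nodes = sorted({n for edge in edges for n in edge})
for node in nodes:
    print(node, "followers:", followers[node], "followings:", followings[node])
```

Because every edge contributes exactly one follower and one following, the two totals always match the number of edges.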

Among the 111 sexual assault centers on Twitter, the average count of both followers and followings was approximately 9. There is a notable variability in these counts, with a SD of 9.15 for followings and 9.85 for followers. The maximum number of followings observed was 38, whereas the maximum number of followers reached was 46, with a minimum count of 0.
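
The shared average of roughly 9 follows directly from the graph size reported above: in any directed network, the mean number of followers and the mean number of followings both equal the edge count divided by the node count. A quick check using the two figures from the text:

```python
nodes = 111  # centers in the network (from the text)
edges = 995  # directed "x follows y" links (from the text)

# Each edge adds one following for x and one follower for y,
# so mean in-degree and mean out-degree are both edges / nodes.
mean_degree = edges / nodes
print(round(mean_degree, 2))  # 8.96
```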

We grouped the nodes into categories to account for the variability in the number of followers. The graph shows nodes in various colors, each representing a specific range of followers. Red denotes those with <11 followers, green indicates 11 to 21 followers, blue represents 21 to 31 followers, purple signifies 31 to 41 followers, and yellow indicates those with ≥41 followers. Notably, a single node stands out with 46 followers, which places it in the yellow (≥41) category. Its in-degree centrality is 0.42, the highest value among all the nodes, underscoring its significance. In addition, it has a closeness score of 0.49, ranking within the top 1% of all nodes, which implies close proximity to other nodes and less dependency on others for information transmission. Furthermore, its betweenness score is 0.076, which places it among the top 7 nodes in the betweenness ranking and indicates that it lies on a substantial number of shortest paths. We classify the sexual assault centers that meet the criteria of high closeness, high in-degree centrality, low betweenness, and >41 followers and followings as social queens.
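
To make the two headline measures concrete, here is a minimal sketch on a hypothetical toy network. In-degree centrality is taken as the share of other nodes that follow a node (for the node described above, 46 of the 110 other centers gives about 0.42). The closeness variant shown, computed over incoming paths and scaled by the share of nodes that can reach the node, is an assumption; the text does not say which definition or software was used.

```python
from collections import deque

# Hypothetical edges; (x, y) means "x follows y". "Z" is an isolated node.
edges = [("A", "Q"), ("B", "Q"), ("C", "Q"), ("Q", "A"), ("B", "A")]
nodes = sorted({n for e in edges for n in e} | {"Z"})

# Reverse adjacency: for each node, the list of nodes that follow it.
followed_by = {n: [] for n in nodes}
for x, y in edges:
    followed_by[y].append(x)


def in_degree_centrality(node):
    """Share of the other nodes that follow this node."""
    return len(followed_by[node]) / (len(nodes) - 1)


def closeness(node):
    """Closeness over incoming paths (assumed variant, see lead-in)."""
    dist = {node: 0}
    queue = deque([node])
    while queue:  # breadth-first search along reversed edges
        current = queue.popleft()
        for follower in followed_by[current]:
            if follower not in dist:
                dist[follower] = dist[current] + 1
                queue.append(follower)
    reached = len(dist) - 1
    total = sum(dist.values())
    if total == 0:
        return 0.0  # nobody can reach this node
    return (reached / (len(nodes) - 1)) * (reached / total)


for n in nodes:
    print(n, round(in_degree_centrality(n), 2), round(closeness(n), 2))
```

In this toy network, "Q" plays the role of a highly followed hub, while the isolated node "Z" scores 0 on both measures.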

The figure also draws attention to a cluster of sexual assault centers characterized by a smaller number of followers and followings. We classified these nodes as marginalized entities. These centers have a limited impact on information transmission on Twitter, leading to their closeness and other metrics registering at 0.

Identification of Modular Patterns and Key Nodes in Twitter Network

Figure 11 is a section from Figure 10, in which distinct patterns emerge as certain nodes cluster together into modularities. As an example, we can examine a particular section of Figure 10 where a modularity forms: a small cluster consisting of several nodes that mutually follow each other. We incorporated eigenvector centrality to interpret this modularity. Within the bottom-right corner of this modularity, we find that accounts 24 and 72 (anonymized Twitter handle names) exhibit relatively high values of eigenvector centrality. A higher eigenvector centrality score implies greater significance of the node when compared with its neighboring points. Furthermore, the importance of the node itself is directly linked to the significance of the neighboring nodes connected to it. Consequently, these nodes can be regarded as the “social queens” within this particular modularity.
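
The idea that a node's importance depends on the importance of its neighbors can be sketched with power iteration on a small cluster of mutual follows. The cluster below is hypothetical, and the shifted iteration (each node also keeps its own score) is one common way to make the computation converge, not necessarily the procedure used in the study.

```python
import math

# Hypothetical cluster of mutual follows (each pair follows the other).
mutual = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
nodes = sorted({n for pair in mutual for n in pair})
neighbors = {n: set() for n in nodes}
for u, v in mutual:
    neighbors[u].add(v)
    neighbors[v].add(u)


def eigenvector_centrality(max_iter=1000, tol=1e-10):
    """Power iteration: a node's score is driven by its neighbors' scores."""
    score = {n: 1.0 for n in nodes}
    for _ in range(max_iter):
        # Keeping the node's own score shifts the matrix by the identity,
        # which leaves the leading eigenvector unchanged and aids convergence.
        new = {n: score[n] + sum(score[m] for m in neighbors[n]) for n in nodes}
        norm = math.sqrt(sum(v * v for v in new.values()))
        new = {n: v / norm for n, v in new.items()}
        if max(abs(new[n] - score[n]) for n in nodes) < tol:
            return new
        score = new
    return score


scores = eigenvector_centrality()
for n in nodes:
    print(n, round(scores[n], 3))
```

Here "C" ends up with the highest score because it is connected to every other node, while "D", attached only to "C", scores lowest.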

Principal Findings

This study represents a pioneering effort to conduct a comprehensive analysis of Twitter’s social network, user activities, and demographics within the context of sexual assault centers in Canada. By mapping and analyzing their Twitter practices, this research contributes to a better understanding of the social media landscape of sexual assault support organizations in Canada. The findings underscore the potential of Twitter as a platform for sexual assault organizations to build social capital, enhance their influence, and expand their reach. Moreover, it highlights the need for tailored engagement strategies that consider regional disparities and the unique characteristics of each province and territory. Our findings align with the broader literature on social capital, specifically bridging and bonding social capital. Among the various social media platforms, Twitter emerges as a valuable data set to study sexual violence [ 46 ] as well as a notable facilitator of bridging social capital, consistent with previous research that has underscored Twitter’s ability to connect organizations with a diverse and expansive audience [ 22 ].

The results of this study reveal a substantial presence of sexual assault centers in Canada on Twitter, signifying their acknowledgment of Twitter as a valuable communication and social capital development tool. Out of the 350 sampled centers, 124 (35.4%) maintain an active Twitter presence, highlighting the significant proportion of sexual assault organizations in Canada that recognize Twitter’s efficacy as a communication medium for engaging with their stakeholders and the public. This trend aligns with the broader nonprofit sector, where most large and mid-sized nonprofits maintain at least 1 social media account [ 9 ]. However, it is worth noting that effective communication on Twitter may be hindered by a reliance on broadcasting rather than engaging in dialogue, as observed in previous research [ 47 ]. The success of acquiring social capital through social media appears to depend on the extent and quality of stakeholder connections, emphasizing the importance of diverse engagement strategies and the diversity and complexity of message elements [ 26 ]. In the context of sexual assault centers and support services, social media–based social capital holds significant potential for increasing the donor and volunteer base, engaging with the community on social issues, and promoting wider social change [ 33 ].

The geographic distribution of sexual assault centers with Twitter accounts highlights regional variations in Twitter use and engagement. Ontario, BC, and Quebec emerged as the provinces with the highest number of centers using Twitter, collectively accounting for two-thirds of all sampled centers in Canada, indicating higher levels of social capital in those regions owing to increased opportunities for information sharing, emotional kinship, trust, and social support [ 20 ]. The concentration of centers in these provinces aligns with their higher population densities and emphasizes the importance of social media platforms, such as Twitter, in reaching a broader audience consistently [ 43 ]. The southeast regions of Ontario and Quebec as well as the southern region of BC showed a higher concentration of sexual assault centers with Twitter accounts, likely reflecting the higher population densities in these areas. Furthermore, spatial distribution may influence the topics and issues addressed in their tweets. In BC, the distribution of sexual assault centers was also concentrated in specific regions, such as Vancouver. The content of tweets from centers in these regions may reflect local concerns and initiatives. It is essential to consider the unique characteristics and needs of each region when developing communication strategies and leveraging social media platforms for social capital enhancement.

The average age of sexual assault centers’ Twitter accounts was calculated to determine the duration of their presence on the platform. The findings showed that the average duration varied across provinces and territories, ranging from 5 to 14 years. The Northwest Territories had the longest average duration of 12.77 years, indicating a relatively early adoption and subsequent use of Twitter among sexual assault centers in the territory. In contrast, provinces or territories such as Yukon and New Brunswick had a shorter average duration. These variations in account age reflect differences in the timing of adoption and highlight the diverse trajectories of Twitter use among sexual assault centers across Canada. These differences may be influenced by organizational factors, regional context, or resource availability. Centers with older Twitter accounts may have accumulated more followers and established stronger web-based communities, whereas newer accounts may need to focus on building and expanding their web-based presence.

Patterns of account creation offer insights into the temporal dynamics of engagement and emphasize the need for continuous and consistent communication efforts. The decline in the creation of new Twitter accounts by sexual assault centers in recent years may signal a saturation point, where most centers have already established their Twitter presence. Alternatively, it could be attributed to factors such as resource constraints, changing organizational priorities, limited staff dedicated to communication practices, or a shift in focus to other social media platforms.

The examination of the average monthly tweet count for active sexual assault centers provides valuable insights into their Twitter activity levels. The results indicate variations in tweet frequency across provinces, with Ontario and BC consistently demonstrating higher tweet volumes compared with other provinces. This observation aligns with the higher number of active Twitter accounts and underscores the importance of ongoing engagement and dialogue with stakeholders through regular tweets.

The findings related to social network dynamics reveal the landscape of Twitter engagement among sexual assault organizations. On average, these organizations have amassed approximately 1543 followers, demonstrating their capacity to reach a substantial audience. Simultaneously, they follow an average of 819 other accounts, indicating their active involvement within the Twitter community. This indicates the potential for these organizations to disseminate information, provide support, and raise awareness about their critical missions. This study aligns with the perspective emphasizing the importance of bridging social capital facilitated by Twitter’s ability to connect with various stakeholders, including service recipients, donors, and the general public [ 26 ]. It highlights the potential and disparities in bridging social capital among sexual assault centers across provinces and territories in Canada, suggesting that these organizations can better leverage Twitter to establish connections beyond their immediate constituencies. This aligns with the notion that social media platforms such as Twitter can extend an organization’s reach and promote the flow of information across various stakeholders [ 22 ].

The findings also uncovered regional disparities in Twitter engagement among sexual assault organizations in Canada. Sexual assault organizations across Canada’s provinces exhibited varying degrees of Twitter activity. Although some provinces, such as Ontario, displayed robust engagement, others, such as New Brunswick, had limited presence and following. Our observation resonates with previous studies (eg, [ 9 ]) that have emphasized the role of regional context in shaping nonprofit organizations’ social media use. These regional disparities suggest the need for tailored strategies to maximize the impact of Twitter engagement, considering the unique characteristics and needs of each province.

Within the context of nonprofit organizations, research has indicated a positive relationship between follower count and bonding social capital [ 21 ]. Our study aligns with this perspective by demonstrating a positive association between followers and followings. As follower counts increase, there is a corresponding increase in followings, indicating a proactive approach by organizations to engage with their audience. This observation underscores the importance of reciprocity and interaction on Twitter. This suggests that a larger number of followers on Twitter can contribute to increased financial support, in line with the positive impact of web-based social capital generated through social networking sites on charitable outcomes [ 29 , 30 ].

Within the intricate network of sexual assault centers on Twitter, we identified nodes with distinct characteristics. Some organizations emerged as “social queens,” characterized by high in-degree centrality, closeness scores, and low betweenness, coupled with substantial followers and followings. These “social queens” play pivotal roles in information transmission, networking, and community building. This finding suggests that organizations can strategically use Twitter to enhance their influence and reach within their fields of operation. However, this study also highlights the presence of marginalized entities with limited follower counts, which may face challenges in impacting information transmission on Twitter. This underscores the importance of proactive engagement strategies for organizations seeking to maximize their impact through social media. Sexual assault organizations can benefit from a comprehensive understanding of their social network structures, enabling them to identify opportunities to strengthen their social capital, expand their donor base, and effectively engage the community.


This study had some limitations. First, the findings are specific to sexual assault centers in Canada and may not be applicable to other countries or regions owing to cultural, social, and organizational differences. Each country or region may have unique characteristics that influence the use of Twitter and other social media platforms by sexual assault organizations. Second, the study focuses solely on Twitter data and does not consider the use of other platforms such as Facebook, Instagram, or Snapchat, which could provide additional insights into communication strategies and social media practices. Therefore, the findings of this study may not provide a comprehensive understanding of the organizations’ overall social media use. Third, the study is cross-sectional, providing a snapshot of Twitter use at a specific time, and does not capture longitudinal changes or trends. A longitudinal study would offer more detailed insights into the evolution of Twitter practices and the effectiveness of communication strategies used by these organizations. For example, we lack information about growth and changes in the number of followers over time. The follower counts remained static at the time of data collection. Future studies may need to explore how follower counts evolve dynamically to gain deeper insights. Fourth, the study primarily focused on describing Twitter use at the organizational level rather than evaluating the effectiveness or outcomes of the communication strategies used. Further research is required to assess the impact and outcomes of social media use in this context. Fifth, this study did not have access to demographic information related to the organizational size of the nonprofits, which typically includes factors such as the number of employees, volunteers, or annual budget. Unfortunately, Twitter does not provide access to such data, resulting in its absence from this study. 
Finally, the study does not delve deeper into the content of tweets posted by sexual assault organizations on Twitter. Future studies could explore the thematic analysis of tweets, sentiment analysis to understand the emotional tone of their messages, and the effectiveness of specific content strategies used by these organizations to engage their audience and advocate for their cause.


In conclusion, this study provides valuable insights into the current use and social structure of Twitter by sexual assault centers, crisis lines, and support services in Canada. The findings highlight the widespread adoption of Twitter among these organizations and the potential for leveraging social media platforms to build social capital. By recognizing regional disparities, identifying key players, and understanding the dynamics of followers and followings, sexual assault organizations can better navigate the Twitter landscape to further their missions of promoting awareness and support for survivors of sexual assault. Further research in this area can explore the long-term impact of social media use on organizational outcomes and stakeholder perceptions, and can offer insights into enhancing social capital within the nonprofit sector and beyond. Such work would provide additional guidance for effective communication practices in the nonprofit human services sector and ultimately contribute to the broader goals of these organizations.

Conflicts of Interest

None declared.

Sexual assault centers with official Twitter accounts in Alberta, New Brunswick, Newfoundland and Labrador, Nova Scotia, Manitoba, Saskatchewan, Northwest Territories, and Yukon.

Followers, followings, and measurement of social network analysis.

  • Guo C, Saxton GD. Getting attention: an organizational-level analysis. In: The Quest for Attention. Redwood City, CA. Stanford University Press; 2020.
  • Guo C, Saxton GD. Speaking and being heard: how nonprofit advocacy organizations gain attention on social media. Nonprofit Volunt Sect Q. 2018;47(1):5-26.
  • Bhati A, McDonnell D. Success in an online giving day: the role of social media in fundraising. Nonprofit Volunt Sect Q. 2020;49(1):74-92.
  • Campbell DA, Lambright KT. Are you out there? internet presence of nonprofit human service organizations. Nonprofit Volunt Sect Q. 2019;48(6):1296-1311.
  • Jung K, Valero JN. Assessing the evolutionary structure of homeless network: social media use, keywords, and influential stakeholders. Technol Forecast Soc Change. Sep 2016;110:51-60.
  • Maxwell SP, Carboni JL. Social media management. Nonprofit Manag Leadersh. Sep 19, 2016;27(2):251-260.
  • Zhou H, Ye S. Legitimacy, worthiness, and social network: an empirical study of the key factors influencing crowdfunding outcomes for nonprofit projects. Voluntas. Jun 4, 2018;30:849-864.
  • Zhou H, Ye S. Fundraising in the digital era: legitimacy, social network, and political ties matter in China. Voluntas. Apr 01, 2019;32:498-511.
  • Nah S, Saxton GD. Modeling the adoption and use of social media by nonprofit organizations. New Media Soc. 2013;15(2):294-313.
  • Ellison NB, Vitak J, Gray R, Lampe C. Cultivating social resources on social network sites: Facebook relationship maintenance behaviors and their role in social capital processes. J Comput Mediat Commun. Jul 01, 2014;19(4):855-870.
  • Li C, Bernoff J. Groundswell: Winning in a World Transformed by Social Technologies. Brighton, MA. Harvard Business Review Press; 2011.
  • Park H, Rodgers S, Stemmle J. Analyzing health organizations' use of Twitter for promoting health literacy. J Health Commun. 2013;18(4):410-425.
  • Lovejoy K, Waters RD, Saxton GD. Engaging stakeholders through Twitter: how nonprofit organizations are getting more out of 140 characters or less. Public Relat Rev. Jun 2012;38(2):313-318.
  • Xue J, Chen J, Chen C, Hu R, Zhu T. The hidden pandemic of family violence during COVID-19: unsupervised learning of tweets. J Med Internet Res. Nov 06, 2020;22(11):e24361.
  • Lovejoy K, Saxton GD. Information, community, and action: how nonprofit organizations use social media. J Comput Mediat Commun. Apr 01, 2012;17(3):337-353.
  • Xue J, Macropol K, Jia Y, Zhu T, Gelles RJ. Harnessing big data for social justice: an exploration of violence against women‐related conversations on Twitter. Human Behav Emerg Technol. Jul 26, 2019;1(3):269-279.
  • Guo C, Saxton GD. Tweeting social change: how social media are changing nonprofit advocacy. Nonprofit Volunt Sect Q. 2014;43(1):57-79.
  • Kapucu N, Demiroz F. A social network analysis approach to strengthening nonprofit collaboration. J Appl Manag Entrep. Jan 2015;20(1):87-101.
  • Putnam RD. Tuning in, tuning out: the strange disappearance of social capital in America. PS Political Sci Politics. 1995;28(4):664-683.
  • Putnam RD. Bowling Alone: The Collapse and Revival of American Community. New York, NY. Simon & Schuster; 2001.
  • Hofer M, Aubert V. Perceived bridging and bonding social capital on Twitter: differentiating between followers and followees. Comput Hum Behav. Nov 2013;29(6):2134-2142.
  • Phua J, Jin SV, Kim J. Uses and gratifications of social networking sites for bridging and bonding social capital: a comparison of Facebook, Twitter, Instagram, and Snapchat. Comput Hum Behav. Jul 2017;72:115-122.
  • Bourdieu P. Distinction: a social critique of the judgment of taste. In: Food and Culture, 4th Edition. Milton Park, UK. Routledge; 2017.
  • Lin N. Building a network theory of social capital. In: Social Capital, 1st Edition. Milton Park, UK. Routledge; 2001.
  • Lai CH, Fu JS. Humanitarian relief and development organizations’ stakeholder targeting communication on social media and beyond. Voluntas. Mar 05, 2020;32:120-135.
  • Xu W, Saxton GD. Does stakeholder engagement pay off on social media? a social capital perspective. Nonprofit Volunt Sect Q. Aug 02, 2018;48(1):28-49.
  • Doerfel ML, Atouba Y, Harris JL. (Un)obtrusive control in emergent networks: examining funding agencies’ control over nonprofit networks. Nonprofit Volunt Sect Q. 2017;46(3):469-487.
  • Williams JR. The use of online social networking sites to nurture and cultivate bonding social capital: a systematic review of the literature from 1997 to 2018. New Media Soc. 2019;21(11-12):2710-2729.
  • Henry R, Bosman L. Strategic management and social media: an empirical analysis of electronic social capital and online fundraising. In: Olivas-Luján MR, Bondarouk T, editors. Social Media in Strategic Management. Bingley, UK. Emerald Group Publishing Limited; 2013;43-62.
  • Lee YJ, Shon J. Nonprofits’ online social capital and charitable support. J Nonprofit Public Sect Mark. 2023;35(3):290-307.
  • Saxton GD, Wang L. The social network effect: the determinants of giving through social media. Nonprofit Volunt Sect Q. Apr 24, 2013;43(5):850-868.
  • Seo H, Vu HT. Transnational nonprofits’ social media use: a survey of communications professionals and an analysis of organizational characteristics. Nonprofit Volunt Sect Q. Feb 28, 2020;49(4):849-870.
  • Svensson PG, Mahoney TQ, Hambrick ME. Twitter as a communication tool for nonprofits: a study of sport-for-development organizations. Nonprofit Volunt Sect Q. Oct 16, 2014;44(6):1086-1106.
  • Yang A, Liu W. Coalition networks for the green new deal: nonprofit public policy advocacy in the age of social media. Nonprofit Volunt Sect Q. Sep 20, 2022;52(5):1284-1307.
  • Makice K. Twitter API: Up and Running. Sebastopol, CA. O'Reilly Media, Inc; Mar 2009.
  • Amr T, Stamboliyska R. Practical D3.js. New York, NY. Apress; 2016.
  • Interactive network visualizations. Pyvis. URL: https://pyvis.readthedocs.io/en/latest/ [accessed 2024-03-01]
  • Conducting a social network analysis. Converge. URL: https://www.converge.net/toolkit/conducting-a-social-network-analysis [accessed 2024-03-01]
  • Bonacich P. Some unique properties of eigenvector centrality. Soc Netw. Oct 2007;29(4):555-564.
  • McCurdie T, Sanderson P, Aitken LM. Applying social network analysis to the examination of interruptions in healthcare. Appl Ergon. Feb 2018;67:50-60.
  • Newman ME, Girvan M. Finding and evaluating community structure in networks. Phys Rev E Stat Nonlin Soft Matter Phys. Feb 2004;69(2 Pt 2):026113.
  • Closeness centrality (centrality measure). GeeksforGeeks. URL: https://www.geeksforgeeks.org/closeness-centrality-centrality-measure/ [accessed 2024-03-01]
  • Canada Ontario Density 2016.png. Wikimedia Commons. URL: https://commons.wikimedia.org/wiki/File:Canada_Ontario_Density_2016.png#file [accessed 2024-03-01]
  • McPhee A. Canada British Columbia Density 2016.png. Wikimedia Commons. 2019. URL: https://commons.wikimedia.org/wiki/File:Canada_British_Columbia_Density_2016.png
  • McPhee A. Canada Quebec Density 2016.png. Wikimedia Commons. 2019. URL: https://commons.wikimedia.org/wiki/File:Canada_Quebec_Density_2016.png
  • Xue J, Zhang B, Zhang Q, Hu R, Jiang J, Liu N, et al. Using Twitter-based data for sexual violence research: scoping review. J Med Internet Res. May 15, 2023;25:e46084.
  • Stabile B, Grant A, Purohit H, Sharan Bonala S. Take back the tweet: social media use by anti‐gender‐based violence organizations. Sex Gend Policy. Apr 29, 2021;4:38-56.


Edited by A Mavragani; submitted 04.07.23; peer-reviewed by W Ceron; comments to author 04.12.23; revised version received 23.12.23; accepted 13.02.24; published 27.03.24.

©Jia Xue, Qiaoru Zhang, Yun Zhang, Hong Shi, Chengda Zheng, Jingchuan Fan, Linxiao Zhang, Chen Chen, Luye Li, Micheal L Shier. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 27.03.2024.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.

What the data says about abortion in the U.S.

Pew Research Center has conducted many surveys about abortion over the years, providing a lens into Americans’ views on whether the procedure should be legal, among a host of other questions.

In a Center survey conducted nearly a year after the Supreme Court’s June 2022 decision that ended the constitutional right to abortion, 62% of U.S. adults said the practice should be legal in all or most cases, while 36% said it should be illegal in all or most cases. Another survey conducted a few months before the decision showed that relatively few Americans take an absolutist view on the issue.

Find answers to common questions about abortion in America, based on data from the Centers for Disease Control and Prevention (CDC) and the Guttmacher Institute, which have tracked these patterns for several decades:

  • How many abortions are there in the U.S. each year?
  • How has the number of abortions in the U.S. changed over time?
  • What is the abortion rate among women in the U.S.? How has it changed over time?
  • What are the most common types of abortion?
  • How many abortion providers are there in the U.S., and how has that number changed?
  • What percentage of abortions are for women who live in a different state from the abortion provider?
  • What are the demographics of women who have had abortions?
  • When during pregnancy do most abortions occur?
  • How often are there medical complications from abortion?

This compilation of data on abortion in the United States draws mainly from two sources: the Centers for Disease Control and Prevention (CDC) and the Guttmacher Institute, both of which have regularly compiled national abortion data for approximately half a century, and which collect their data in different ways.

The CDC data that is highlighted in this post comes from the agency’s “abortion surveillance” reports, which have been published annually since 1974 (and which have included data from 1969). Its figures from 1973 through 1996 include data from all 50 states, the District of Columbia and New York City – 52 “reporting areas” in all. Since 1997, the CDC’s totals have lacked data from some states (most notably California) for the years that those states did not report data to the agency. The four reporting areas that did not submit data to the CDC in 2021 – California, Maryland, New Hampshire and New Jersey – accounted for approximately 25% of all legal induced abortions in the U.S. in 2020, according to Guttmacher’s data. Most states, though, do have data in the reports, and the figures for the vast majority of them came from each state’s central health agency, while for some states, the figures came from hospitals and other medical facilities.

Discussion of CDC abortion data involving women’s state of residence, marital status, race, ethnicity, age, abortion history and the number of previous live births excludes the low share of abortions where that information was not supplied. Read the methodology for the CDC’s latest abortion surveillance report, which includes data from 2021, for more details. Previous reports can be found at stacks.cdc.gov by entering “abortion surveillance” into the search box.

For the numbers of deaths caused by induced abortions in 1963 and 1965, this analysis looks at reports by the then-U.S. Department of Health, Education and Welfare, a precursor to the Department of Health and Human Services. In computing those figures, we excluded abortions listed in the report under the categories “spontaneous or unspecified” or as “other.” (“Spontaneous abortion” is another way of referring to miscarriages.)

Guttmacher data in this post comes from national surveys of abortion providers that Guttmacher has conducted 19 times since 1973. Guttmacher compiles its figures after contacting every known provider of abortions – clinics, hospitals and physicians’ offices – in the country. It uses questionnaires and health department data, and it provides estimates for abortion providers that don’t respond to its inquiries. (In 2020, the last year for which it has released data on the number of abortions in the U.S., it used estimates for 12% of abortions.) For most of the 2000s, Guttmacher has conducted these national surveys every three years, each time getting abortion data for the prior two years. For each interim year, Guttmacher has calculated estimates based on trends from its own figures and from other data.

The latest full summary of Guttmacher data came in the institute’s report titled “Abortion Incidence and Service Availability in the United States, 2020.” It includes figures for 2020 and 2019 and estimates for 2018. The report includes a methods section.

In addition, this post uses data from StatPearls, an online health care resource, on complications from abortion.

How many abortions are there in the U.S. each year?

An exact answer is hard to come by. The CDC and the Guttmacher Institute have each tried to measure this for around half a century, but they use different methods and publish different figures.

The last year for which the CDC reported a yearly national total for abortions is 2021. It found there were 625,978 abortions in the District of Columbia and the 46 states with available data that year, up from 597,355 in those states and D.C. in 2020. The corresponding figure for 2019 was 607,720.

The last year for which Guttmacher reported a yearly national total was 2020. It said there were 930,160 abortions that year in all 50 states and the District of Columbia, compared with 916,460 in 2019.

  • How the CDC gets its data: It compiles figures that are voluntarily reported by states’ central health agencies, including separate figures for New York City and the District of Columbia. Its latest totals do not include figures from California, Maryland, New Hampshire or New Jersey, which did not report data to the CDC. (Read the methodology from the latest CDC report.)
  • How Guttmacher gets its data: It compiles its figures after contacting every known abortion provider – clinics, hospitals and physicians’ offices – in the country. It uses questionnaires and health department data, then provides estimates for abortion providers that don’t respond. Guttmacher’s figures are higher than the CDC’s in part because they include data (and in some instances, estimates) from all 50 states. (Read the institute’s latest full report and methodology.)

While the Guttmacher Institute supports abortion rights, its empirical data on abortions in the U.S. has been widely cited by groups and publications across the political spectrum, including by a number of those that disagree with its positions.

These estimates from Guttmacher and the CDC are results of multiyear efforts to collect data on abortion across the U.S. Last year, Guttmacher also began publishing less precise estimates every few months, based on a much smaller sample of providers.

The figures reported by these organizations include only legal induced abortions conducted by clinics, hospitals or physicians’ offices, or those that make use of abortion pills dispensed from certified facilities such as clinics or physicians’ offices. They do not account for the use of abortion pills that were obtained outside of clinical settings.


A line chart showing the changing number of legal abortions in the U.S. since the 1970s.

The annual number of U.S. abortions rose for years after Roe v. Wade legalized the procedure in 1973, reaching its highest levels around the late 1980s and early 1990s, according to both the CDC and Guttmacher. Since then, abortions have generally decreased at what a CDC analysis called  “a slow yet steady pace.”

Guttmacher says the number of abortions occurring in the U.S. in 2020 was 40% lower than it was in 1991. According to the CDC, the number was 36% lower in 2021 than in 1991, looking just at the District of Columbia and the 46 states that reported both of those years.

(The corresponding line graph shows the long-term trend in the number of legal abortions reported by both organizations. To allow for consistent comparisons over time, the CDC figures in the chart have been adjusted to ensure that the same states are counted from one year to the next. Using that approach, the CDC figure for 2021 is 622,108 legal abortions.)
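The consistent-states adjustment described above can be sketched in a few lines. This is only a minimal illustration, assuming a hypothetical mapping from year to per-area counts; the area names and counts are invented placeholders rather than CDC data, and the CDC's actual procedure may differ.

```python
from typing import Dict


def consistent_totals(counts_by_year: Dict[int, Dict[str, int]]) -> Dict[int, int]:
    """Sum counts per year using only areas that reported in every year."""
    # Areas present in all years: intersection of each year's reporting areas.
    common = set.intersection(*(set(areas) for areas in counts_by_year.values()))
    return {
        year: sum(count for area, count in areas.items() if area in common)
        for year, areas in counts_by_year.items()
    }


# Hypothetical placeholder data: area "C" did not report in 2021,
# so it is dropped from both years before summing.
example = {
    2020: {"A": 100, "B": 200, "C": 50},
    2021: {"A": 110, "B": 190},
}
totals = consistent_totals(example)
print(totals)
```

Dropping areas that are missing in any year makes the yearly totals comparable, at the cost of understating the national total in every year.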

There have been occasional breaks in this long-term pattern of decline – during the middle of the first decade of the 2000s, and then again in the late 2010s. The CDC reported modest 1% and 2% increases in abortions in 2018 and 2019, and then, after a 2% decrease in 2020, a 5% increase in 2021. Guttmacher reported an 8% increase over the three-year period from 2017 to 2020.
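The year-over-year percentages can be checked against the CDC totals quoted earlier in this post. The sketch below assumes those as-reported totals (D.C. and 46 states) are the right inputs; the CDC's own percentages may rest on slightly different, adjusted figures, and the function and variable names are illustrative only.

```python
def percent_change(old: float, new: float) -> float:
    """Return the change from old to new as a percentage of old."""
    return (new - old) / old * 100


# CDC yearly totals as quoted earlier in this post.
cdc_totals = {2019: 607_720, 2020: 597_355, 2021: 625_978}

change_2020 = percent_change(cdc_totals[2019], cdc_totals[2020])
change_2021 = percent_change(cdc_totals[2020], cdc_totals[2021])

print(f"2019 to 2020: {change_2020:+.1f}%")
print(f"2020 to 2021: {change_2021:+.1f}%")
```

Rounded to whole percentages, these come out to a 2% decrease for 2020 and a 5% increase for 2021, in line with the figures cited above.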

As noted above, these figures do not include abortions that use pills obtained outside of clinical settings.

Guttmacher says that in 2020 there were 14.4 abortions in the U.S. per 1,000 women ages 15 to 44. Its data shows that the rate of abortions among women has generally been declining in the U.S. since 1981, when it reported there were 29.3 abortions per 1,000 women in that age range.

The CDC says that in 2021, there were 11.6 abortions in the U.S. per 1,000 women ages 15 to 44. (That figure excludes data from California, the District of Columbia, Maryland, New Hampshire and New Jersey.) Like Guttmacher’s data, the CDC’s figures also suggest a general decline in the abortion rate over time. In 1980, when the CDC reported on all 50 states and D.C., it said there were 25 abortions per 1,000 women ages 15 to 44.

That said, both Guttmacher and the CDC say there were slight increases in the rate of abortions during the late 2010s and early 2020s. Guttmacher says the abortion rate per 1,000 women ages 15 to 44 rose from 13.5 in 2017 to 14.4 in 2020. The CDC says it rose from 11.2 per 1,000 in 2017 to 11.4 in 2019, before falling back to 11.1 in 2020 and then rising again to 11.6 in 2021. (The CDC’s figures for those years exclude data from California, D.C., Maryland, New Hampshire and New Jersey.)
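An abortion rate of this kind is the number of abortions divided by the number of women ages 15 to 44, multiplied by 1,000. The sketch below shows that arithmetic and, as a rough back-of-the-envelope check, the population implied by Guttmacher's 2020 total and rate. The implied population is an approximation derived from a rounded rate, not a figure published by Guttmacher, and the function names are illustrative.

```python
def rate_per_1000(events: float, population: float) -> float:
    """Events per 1,000 people in the reference population."""
    return events / population * 1000


def implied_population(events: float, rate: float) -> float:
    """Population implied by an event count and a per-1,000 rate."""
    return events / rate * 1000


# Guttmacher's 2020 figures as quoted in this post.
abortions_2020 = 930_160
rate_2020 = 14.4  # abortions per 1,000 women ages 15 to 44

women_15_to_44 = implied_population(abortions_2020, rate_2020)
print(f"Implied women ages 15 to 44: about {women_15_to_44 / 1e6:.1f} million")
```

Because the CDC and Guttmacher cover different sets of states, their rates use different denominators as well as different counts, so the two series are not directly interchangeable.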

The CDC broadly divides abortions into two categories: surgical abortions and medication abortions, which involve pills. Since the Food and Drug Administration first approved abortion pills in 2000, their use has increased over time as a share of abortions nationally, according to both the CDC and Guttmacher.

The majority of abortions in the U.S. now involve pills, according to both the CDC and Guttmacher. The CDC says 56% of U.S. abortions in 2021 involved pills, up from 53% in 2020 and 44% in 2019. Its figures for 2021 include the District of Columbia and 44 states that provided this data; its figures for 2020 include D.C. and 44 states (though not all of the same states as in 2021), and its figures for 2019 include D.C. and 45 states.

Guttmacher, which measures this every three years, says 53% of U.S. abortions involved pills in 2020, up from 39% in 2017.

Two pills commonly used together for medication abortions are mifepristone, which, taken first, blocks hormones that support a pregnancy, and misoprostol, which then causes the uterus to empty. According to the FDA, medication abortions are safe  until 10 weeks into pregnancy.

Surgical abortions conducted  during the first trimester  of pregnancy typically use a suction process, while the relatively few surgical abortions that occur  during the second trimester  of a pregnancy typically use a process called dilation and evacuation, according to the UCLA School of Medicine.

In 2020, there were 1,603 facilities in the U.S. that provided abortions, according to Guttmacher. This included 807 clinics, 530 hospitals and 266 physicians’ offices.

A horizontal stacked bar chart showing the total number of abortion providers down since 1982.

While clinics make up half of the facilities that provide abortions, they are the sites where the vast majority (96%) of abortions are administered, either through procedures or the distribution of pills, according to Guttmacher’s 2020 data. (This includes 54% of abortions that are administered at specialized abortion clinics and 43% at nonspecialized clinics.) Hospitals made up 33% of the facilities that provided abortions in 2020 but accounted for only 3% of abortions that year, while just 1% of abortions were conducted by physicians’ offices.
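The facility shares follow directly from Guttmacher's 2020 counts quoted above. A minimal sketch of that calculation, with illustrative variable names:

```python
# Guttmacher's 2020 facility counts as quoted in this post.
facilities_2020 = {
    "clinics": 807,
    "hospitals": 530,
    "physicians' offices": 266,
}

total_facilities = sum(facilities_2020.values())
shares = {
    kind: count / total_facilities * 100
    for kind, count in facilities_2020.items()
}

for kind, share in shares.items():
    print(f"{kind}: {share:.0f}% of {total_facilities} facilities")
```

These are shares of facilities, not of abortions performed; the 96%, 3% and 1% figures above describe where abortions occur and cannot be derived from the facility counts alone.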

Looking just at clinics – that is, the total number of specialized abortion clinics and nonspecialized clinics in the U.S. – Guttmacher found the total virtually unchanged between 2017 (808 clinics) and 2020 (807 clinics). However, there were regional differences. In the Midwest, the number of clinics that provide abortions increased by 11% during those years, and in the West by 6%. The number of clinics  decreased  during those years by 9% in the Northeast and 3% in the South.

The total number of abortion providers has declined dramatically since the 1980s. In 1982, according to Guttmacher, there were 2,908 facilities providing abortions in the U.S., including 789 clinics, 1,405 hospitals and 714 physicians’ offices.
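To see where the decline came from, the 1982 counts can be compared with the 2020 counts by facility type. This sketch uses only the Guttmacher figures quoted in this post; the percentages it prints are computed here for illustration and are not published Guttmacher statistics.

```python
def percent_change(old: float, new: float) -> float:
    """Return the change from old to new as a percentage of old."""
    return (new - old) / old * 100


# Guttmacher facility counts as quoted in this post.
facilities_1982 = {"clinics": 789, "hospitals": 1_405, "physicians' offices": 714}
facilities_2020 = {"clinics": 807, "hospitals": 530, "physicians' offices": 266}

changes = {
    kind: percent_change(facilities_1982[kind], facilities_2020[kind])
    for kind in facilities_1982
}
total_change = percent_change(
    sum(facilities_1982.values()), sum(facilities_2020.values())
)

for kind, change in changes.items():
    print(f"{kind}: {change:+.0f}%")
print(f"all facilities: {total_change:+.0f}%")
```

On these counts, the number of clinics is slightly higher than in 1982, while hospitals and physicians' offices account for the overall drop.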

The CDC does not track the number of abortion providers.

In the District of Columbia and the 46 states that provided abortion and residency information to the CDC in 2021, 10.9% of all abortions were performed on women known to live outside the state where the abortion occurred – slightly higher than the percentage in 2020 (9.7%). That year, D.C. and 46 states (though not the same ones as in 2021) reported abortion and residency data. (The total number of abortions used in these calculations included figures for women with both known and unknown residential status.)

The share of reported abortions performed on women outside their state of residence was much higher before the 1973 Roe decision that stopped states from banning abortion. In 1972, 41% of all abortions in D.C. and the 20 states that provided this information to the CDC that year were performed on women outside their state of residence. In 1973, the corresponding figure was 21% in the District of Columbia and the 41 states that provided this information, and in 1974 it was 11% in D.C. and the 43 states that provided data.

In the District of Columbia and the 46 states that reported age data to  the CDC in 2021, the majority of women who had abortions (57%) were in their 20s, while about three-in-ten (31%) were in their 30s. Teens ages 13 to 19 accounted for 8% of those who had abortions, while women ages 40 to 44 accounted for about 4%.

The vast majority of women who had abortions in 2021 were unmarried (87%), while married women accounted for 13%, according to the CDC, which had data on this from 37 states.

A pie chart showing that, in 2021, majority of abortions were for women who had never had one before.

In the District of Columbia, New York City (but not the rest of New York) and the 31 states that reported racial and ethnic data on abortion to  the CDC , 42% of all women who had abortions in 2021 were non-Hispanic Black, while 30% were non-Hispanic White, 22% were Hispanic and 6% were of other races.

Looking at abortion rates among those ages 15 to 44, there were 28.6 abortions per 1,000 non-Hispanic Black women in 2021; 12.3 abortions per 1,000 Hispanic women; 6.4 abortions per 1,000 non-Hispanic White women; and 9.2 abortions per 1,000 women of other races, the  CDC reported  from those same 31 states, D.C. and New York City.

For 57% of U.S. women who had induced abortions in 2021, it was the first time they had ever had one,  according to the CDC.  For nearly a quarter (24%), it was their second abortion. For 11% of women who had an abortion that year, it was their third, and for 8% it was their fourth or more. These CDC figures include data from 41 states and New York City, but not the rest of New York.

A bar chart showing that most U.S. abortions in 2021 were for women who had previously given birth.

Nearly four-in-ten women who had abortions in 2021 (39%) had no previous live births at the time they had an abortion, according to the CDC. Almost a quarter (24%) of women who had abortions in 2021 had one previous live birth, 20% had two previous live births, 10% had three, and 7% had four or more previous live births. These CDC figures include data from 41 states and New York City, but not the rest of New York.

The vast majority of abortions occur during the first trimester of a pregnancy. In 2021, 93% of abortions occurred during the first trimester – that is, at or before 13 weeks of gestation, according to the CDC. An additional 6% occurred between 14 and 20 weeks of pregnancy, and about 1% were performed at 21 weeks or more of gestation. These CDC figures include data from 40 states and New York City, but not the rest of New York.

About 2% of all abortions in the U.S. involve some type of complication for the woman, according to an article in StatPearls, an online health care resource. “Most complications are considered minor such as pain, bleeding, infection and post-anesthesia complications,” according to the article.

The CDC calculates case-fatality rates for women from induced abortions – that is, how many women die from abortion-related complications, for every 100,000 legal abortions that occur in the U.S. The rate was lowest during the most recent period examined by the agency (2013 to 2020), when there were 0.45 deaths to women per 100,000 legal induced abortions. The case-fatality rate reported by the CDC was highest during the first period examined by the agency (1973 to 1977), when it was 2.09 deaths to women per 100,000 legal induced abortions. During the five-year periods in between, the figure ranged from 0.52 (from 1993 to 1997) to 0.78 (from 1978 to 1982).

The CDC calculates death rates by five-year and seven-year periods because of year-to-year fluctuation in the numbers and due to the relatively low number of women who die from legal induced abortions.
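The effect of pooling several years can be illustrated with a short sketch. The yearly counts below are hypothetical placeholders chosen for easy arithmetic, not CDC data, and the function names are illustrative; the point is only that single-year rates swing widely when deaths are few, while the pooled rate is more stable.

```python
from typing import Sequence


def rate_per_100k(deaths: float, abortions: float) -> float:
    """Deaths per 100,000 legal induced abortions."""
    return deaths / abortions * 100_000


def pooled_rate(deaths: Sequence[int], abortions: Sequence[int]) -> float:
    """Case-fatality rate over a multiyear period: total deaths over total abortions."""
    return rate_per_100k(sum(deaths), sum(abortions))


# Hypothetical placeholder counts for three consecutive years.
deaths_by_year = [1, 4, 2]
abortions_by_year = [500_000, 500_000, 400_000]

yearly_rates = [
    rate_per_100k(d, a) for d, a in zip(deaths_by_year, abortions_by_year)
]
period_rate = pooled_rate(deaths_by_year, abortions_by_year)

print("single-year rates:", [round(r, 2) for r in yearly_rates])
print("pooled rate:", round(period_rate, 2))
```

In this made-up example, one additional death in a single year would shift that year's rate noticeably while barely moving the pooled figure.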

In 2020, the last year for which the CDC has information, six women in the U.S. died due to complications from induced abortions. Four women died in this way in 2019, two in 2018, and three in 2017. (These deaths all followed legal abortions.) Since 1990, the annual number of deaths among women due to legal induced abortion has ranged from two to 12.

The annual number of reported deaths from induced abortions (legal and illegal) tended to be higher in the 1980s, when it ranged from nine to 16, and from 1972 to 1979, when it ranged from 13 to 63. One driver of the decline was the drop in deaths from illegal abortions. There were 39 deaths from illegal abortions in 1972, the last full year before Roe v. Wade. The total fell to 19 in 1973 and to single digits or zero every year after that. (The number of deaths from legal abortions has also declined since then, though with some slight variation over time.)

The number of deaths from induced abortions was considerably higher in the 1960s than afterward. For instance, there were 119 deaths from induced abortions in  1963  and 99 in  1965 , according to reports by the then-U.S. Department of Health, Education and Welfare, a precursor to the Department of Health and Human Services. The CDC is a division of Health and Human Services.

Note: This is an update of a post originally published May 27, 2022, and first updated June 24, 2022.


About Pew Research Center: Pew Research Center is a nonpartisan fact tank that informs the public about the issues, attitudes and trends shaping the world. It conducts public opinion polling, demographic research, media content analysis and other empirical social science research. Pew Research Center does not take policy positions. It is a subsidiary of The Pew Charitable Trusts.


  1. 7 Data Collection Methods & Tools For Research

  2. Data Collection Methods

  3. Data Collection Strategies: Master the Art of Data Collection With Our

  4. data collection in research methodology

  5. Tools for data analysis in research

  6. Data Demystified: A Definitive Guide to Data Collection Methods


  1. 7. "Collection of Data

  2. Research Design: Choosing your Data Collection Methods

  3. Research Design: Planning your Data Collection Procedures


  5. 7. "Collection of Data

  6. Primary data and Secondary Data, sources of data collection in research, research methodology


  1. Data Collection

    Data collection is the process of gathering and collecting information from various sources to analyze and make informed decisions based on the data collected. This can involve various methods, such as surveys, interviews, experiments, and observation. In order for data collection to be effective, it is important to have a clear understanding ...

  2. Data Collection

    Data collection is a systematic process of gathering observations or measurements. Whether you are performing research for business, governmental or academic purposes, data collection allows you to gain first-hand knowledge and original insights into your research problem. While methods and aims may differ between fields, the overall process of ...

  3. What Is Data Collection: Methods, Types, Tools

    Data collection is the process of collecting and evaluating information or data from multiple sources to find answers to research problems, answer questions, evaluate outcomes, and forecast trends and probabilities. It is an essential phase in all types of research, analysis, and decision-making, including that done in the social sciences ...

  4. Data Collection in Research: Examples, Steps, and FAQs

    Data collection is the process of gathering information from various sources via different research methods and consolidating it into a single database or repository so researchers can use it for further analysis. Data collection aims to provide information that individuals, businesses, and organizations can use to solve problems, track progress, and make decisions.

  5. Data Collection Methods

    Data Collection Methods. Data collection is a process of collecting information from all the relevant sources to find answers to the research problem, test the hypothesis (if you are following deductive approach) and evaluate the outcomes. Data collection methods can be divided into two categories: secondary methods of data collection and ...

  6. What Is Data Collection? A Guide for Aspiring Data Scientists

    During the data collection process, researchers must identify the different data types, sources of data, and methods being employed since there are many different methods to collect data for analysis. Many fields, including commercial, government and research, rely heavily on data collection.

  7. Best Practices in Data Collection and Preparation: Recommendations for

    We offer best-practice recommendations for journal reviewers, editors, and authors regarding data collection and preparation. Our recommendations are applicable to research adopting different epistemological and ontological perspectives—including both quantitative and qualitative approaches—as well as research addressing micro (i.e., individuals, teams) and macro (i.e., organizations ...

  8. Design: Selection of Data Collection Methods

    Data collection methods are important, because how the information collected is used and what explanations it can generate are determined by the methodology and analytical approach applied by the researcher. 1, 2 Five key data collection methods are presented here, with their strengths and limitations described in the online supplemental material.

  9. Data Collection Methods: A Comprehensive View

    The data obtained by primary data collection methods is exceptionally accurate and geared to the research's motive. They are divided into two categories: quantitative and qualitative. We'll explore the specifics later. Secondary data collection. Secondary data is the information that's been used in the past.

  10. 4.5 Data Collection Methods

    4.5 Data Collection Methods Choosing the most appropriate and practical data collection method is an important decision that must be made carefully. It is important to recognise that the quality of data collected in a qualitative manner is a direct reflection of the skill and competence of the researcher.

  11. What Is a Research Methodology?

    Your research methodology discusses and explains the data collection and analysis methods you used in your research. A key part of your thesis, dissertation, or research paper, the methodology chapter explains what you did and how you did it, allowing readers to evaluate the reliability and validity of your research and your dissertation topic.

  12. Data Collection Methods: Sources & Examples

    Some common data collection methods include surveys, interviews, observations, focus groups, experiments, and secondary data analysis. The data collected through these methods can then be analyzed and used to support or refute research hypotheses and draw conclusions about the study's subject matter.

  13. PDF Methods of Data Collection in Quantitative, Qualitative, and Mixed Research

    There are actually two kinds of mixing of the six major methods of data collection (Johnson & Turner, 2003). The first is intermethod mixing, which means two or more of the different methods of data collection are used in a research study. This is seen in the two examples in the previous paragraph.

  14. Data Collection Methods and Tools for Research; A Step-by-Step Guide to

    Data Collection, Research Methodology, Data Collection Methods, Academic Research Paper, Data Collection Techniques. I. INTRODUCTION Different methods for gathering information regarding specific variables of the study aiming to employ them in the data analysis phase to achieve the results of the study, gain the answer of the research ...

  15. (PDF) Data Collection Methods and Tools for Research; A Step-by-Step

    One of the main stages in a research study is data collection that enables the researcher to find answers to research questions. Data collection is the process of collecting data aiming to gain ...

  16. What is Data Collection? Methods, Types, Examples

    Data collection is the systematic process of gathering and recording information or data from various sources for analysis, interpretation, and decision-making. It is a fundamental step in research, business operations, and virtually every field where information is used to understand, improve, or make informed choices.

  17. Data Collection Methods

    Step 2: Choose your data collection method. Based on the data you want to collect, decide which method is best suited for your research. Experimental research is primarily a quantitative method. Interviews, focus groups, and ethnographies are qualitative methods. Surveys, observations, archival research, and secondary data collection can be ...

  18. 7 Data Collection Methods & Tools For Research

    The qualitative research methods of data collection do not involve the collection of data that involves numbers or a need to be deduced through a mathematical calculation, rather it is based on the non-quantifiable elements like the feeling or emotion of the researcher. An example of such a method is an open-ended questionnaire.

  19. Qualitative Research: Data Collection, Analysis, and Management

    The method itself should then be described, including ethics approval, choice of participants, mode of recruitment, and method of data collection (e.g., semistructured interviews or focus groups), followed by the research findings, which will be the main body of the report or paper.

  20. What Is Research Methodology? Definition + Examples

    As we mentioned, research methodology refers to the collection of practical decisions regarding what data you'll collect, from who, how you'll collect it and how you'll analyse it. Research design, on the other hand, is more about the overall strategy you'll adopt in your study. For example, whether you'll use an experimental design ...

  21. What Is a Research Design

    A research design is a strategy for answering your research question using empirical data. Creating a research design means making decisions about: Your overall research objectives and approach. Whether you'll rely on primary research or secondary research. Your sampling methods or criteria for selecting subjects. Your data collection methods.

  22. CHAPTER 3

    Abstract. As it is indicated in the title, this chapter includes the research methodology of the dissertation. In more details, in this part the author outlines the research strategy, the research ...

  23. Guide: How to align data collection to an organization or project's

    Filling out this Metrics & Data tab will bring you one step closer to aligning your data collection tools and methodology to the logic model. Metrics may be based on one data point, multiple data points, and/or multiple types of data. For example, measuring the change in graduation rates across a period of time requires multiple data points:

  24. Utilization of EHRs for clinical trials: a systematic review

    Data collection: In 3 studies using EHR data reduced data collection costs compared to standard methods. Trial Design: In one study examining this application, EHR data informed optimization of eligibility criteria to improve statistical power for a COVID-19 trial. ... In several studies, EHR data was leveraged for secondary research purposes ...

  25. DOC640 Module 2 Discussion

    One significant concept discussed in Farquhar's Case Study Research for Business (2012) is the importance of coherence between research objectives and data collection methods. Farquhar emphasizes that the vocabulary used in research objectives should align with the type of data a researcher plans to collect (qualitative or quantitative).

  26. Technology, data, people, and partnerships in addressing unmet social

    Despite this emphasis on technology and data collection, research suggests substantial barriers exist in operationalizing effective systems. Methods We used qualitative methods to examine cross-sector perspectives on the use of data and technology to facilitate MCO and CBO partnerships in Kentucky, a state with high Medicaid enrollment, to ...

  27. What Is Qualitative Research?

    Qualitative research involves collecting and analyzing non-numerical data (e.g., text, video, or audio) to understand concepts, opinions, or experiences. It can be used to gather in-depth insights into a problem or generate new ideas for research. Qualitative research is the opposite of quantitative research, which involves collecting and ...

  28. Journal of Medical Internet Research

    Background: Social media platforms have gained popularity as communication tools for organizations to engage with clients and the public, disseminate information, and raise awareness about social issues. From a social capital perspective, relationship building is seen as an investment, involving a complex interplay of tangible and intangible resources.

  29. Full article: Awarding digital badges: research from a first-year

    Material and methods. A modified version of Stefaniak and Carey's (Citation 2019) Framework for Successful Badge Program Implementation - framed around 'Badge Instructional Design', 'Badge System Platform' and 'Badge Program Implementation' - was adopted (Figure 1) to structure the project.Our approach was distinct in that we added an iterative 'research and reflection ...

  30. What the data says about abortion in the U.S.

    The CDC data that is highlighted in this post comes from the agency's "abortion surveillance" reports, which have been published annually since 1974 (and which have included data from 1969). Its figures from 1973 through 1996 include data from all 50 states, the District of Columbia and New York City - 52 "reporting areas" in all.