

ORIGINAL ARTICLE 

Year: 2022 | Volume: 9 | Issue: 1 | Page: 70-76

How to conduct descriptive statistics online: A brief hands-on guide for biomedical researchers
Himel Mondal^{1}, Sharada Mayee Swain^{2}, Shaikat Mondal^{3}
^{1} Department of Physiology, Fakir Mohan Medical College and Hospital, Balasore, Odisha, India; ^{2} Department of Physiology, Hi-Tech Medical College and Hospital, Bhubaneswar, Odisha, India; ^{3} Department of Physiology, Raiganj Government Medical College and Hospital, Raiganj, West Bengal, India
Date of Submission: 16-Oct-2021
Date of Acceptance: 25-Oct-2021
Date of Web Publication: 23-Mar-2022
Correspondence Address: Himel Mondal Department of Physiology, Fakir Mohan Medical College and Hospital, Balasore, Odisha India
Source of Support: None, Conflict of Interest: None
DOI: 10.4103/ijves.ijves_103_21
Background: Descriptive statistics is the first step of data analysis. In biomedical research, inferential statistical tests are almost always conducted after descriptive statistics have been used to summarize the data. Many resource-limited settings may not have dedicated software for carrying out these tests. Aim: This article aimed to provide a brief technical guide on conducting descriptive statistics, with visualization, without any dedicated statistical software package. Methods: We searched for online tools that provide a free service for conducting descriptive statistics. The example data were fabricated for conducting the tests online. The visualization of the data (i.e., figures) was explained in brief, wherever necessary. Results: We described the method to graph and summarize the data using a pie chart, frequency table, stem and leaf display, histogram, frequency polygon, box plot, bar chart, stacked bar chart, line graph, dot plot, central tendency, variance, quantile-quantile plot, scatter plot, and Venn diagram. All these tests and visualizations were done online without any installed dedicated software package. Conclusion: This article provides a brief technical guide for conducting common descriptive statistical tests online. Researchers in resource-limited settings may use these services to summarize and visualize their data online on public domain websites.
Keywords: Data analysis, descriptive statistics, research design, software, statistical analysis
How to cite this article: Mondal H, Swain SM, Mondal S. How to conduct descriptive statistics online: A brief hands-on guide for biomedical researchers. Indian J Vasc Endovasc Surg 2022;9:70-6
How to cite this URL: Mondal H, Swain SM, Mondal S. How to conduct descriptive statistics online: A brief hands-on guide for biomedical researchers. Indian J Vasc Endovasc Surg [serial online] 2022 [cited 2022 May 28];9:70-6. Available from: https://www.indjvascsurg.org/text.asp?2022/9/1/70/340492
Introduction   
In biomedical research, data are collected from a sample to draw a conclusion about the population from which the sample was recruited. In the data analysis flow, a researcher first obtains a summary (e.g., mean and variance) of the data from the sample. This is known as descriptive statistics. When these data are further analyzed (e.g., comparing the means of two groups, analysis of variance) to draw conclusions about the population, it is known as inferential statistics.^{[1]} In published biomedical research, inferential statistics are typically conducted after descriptive statistics, and results are also presented with descriptive statistics such as percentage, mean, median, and standard deviation.^{[2]}
The first step of descriptive statistics is sorting the data into groups. Sometimes, sorted or grouped data are presented with figures to aid the understanding of the sample data. The next step is to summarize the data, commonly with a measure of central tendency and a measure of dispersion, such as the mean with standard deviation or the median with interquartile range.^{[3]}
Although descriptive statistics for a small number of observations can be calculated manually, a large data set needs a calculator. Several spreadsheet software packages (both free and paid) can help to conduct some descriptive statistics. However, they do not provide a full range of descriptive statistical tests with presentable visual output. In addition, dedicated software packages (both free and paid) are available for statistical analysis. However, there are some settings where novice researchers may not have access to these software packages.
In this context, we aimed to provide a brief technical guide on how anyone can conduct descriptive statistical tests online without any dedicated software. However, a computer connected to the internet is a prerequisite for the tests. We presume that this compilation would help novice researchers in resource-limited settings to carry out descriptive statistics with ease.
Methods   
Ethics
This study does not involve any human or animal research participants. All the data were fabricated for the conduct of the tests online. The websites that we described in this study provide free service through public domain websites. Hence, no ethical clearance was obtained for this study.
The basic concept of variable
In [Figure 1], we present a part of a study report to show the types of variables. A variable is a characteristic that can be measured, either qualitatively or quantitatively. Qualitative variables are also called categorical variables. These are dichotomous (2 categories), nominal (≥3 categories without any order), and ordinal (≥3 categories in order). Quantitative variables are also called numerical variables. These are continuous (measured on a continuous scale) and discrete (can only take certain numerical values).^{[2],[3],[4],[5]}
Websites
The websites that were used in this study are listed in [Table 1]. For a single test, we described a single website. Multiple websites may offer a particular test, and readers are encouraged to explore more according to their interest. Conversely, some websites offer multiple tests for descriptive statistics. However, we tried to make the list diverse. We checked the websites and included those that provide the test without any registration or fees. High-resolution images of the results can be found in the supplementary file.
Table 1: Common descriptive statistical test and websites offering free conduct of the tests
Descriptive statistics and visualization
In the section below, we describe how anyone can conduct the tests online and save the result for presenting it in a manuscript. All the websites were live when we wrote this article. However, we cannot guarantee that the websites will remain freely available forever.
Pie chart
Example
You conducted a survey on the first preference of bibliographic database among a sample of 169 biomedical researchers. You would like to present the qualitative/categorical data in relative frequency in a pie chart.^{[6]}
Steps
 Go to https://www.statskingdom.com/chartmaker.html
 Select the “Type” as “Pie Chart;” write the title “Preference of bibliographic database;” write the “Category” titles serially, replacing A, B, C, etc. (you can copy and paste the categories too); type the corresponding values in “Group1.” Click on the “Calculate” button
 Click on “Save image” and the image will open in a new window. Now, right click on the image and save it using the “Save image as” option.
Result
The pie chart with percentages is shown in [Figure 2]a. Although the pie chart is visually appealing, it is not recommended for presenting data with only a few categories (e.g., number of male, female, and intersex participants), which can be expressed as numbers and percentages in the text.
Figure 2: Descriptive statistics visualization – (a) pie chart, (b) frequency table, (c) stem and leaf display, (d) histogram, (e) frequency polygon, (f) box plot, (g) bar chart, (h) stacked bar chart, and (i) line graph. High-resolution figures are available in supplementary file
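The percentages drawn on a pie chart are relative frequencies, and they can be cross-checked offline. The following is a minimal Python sketch, not part of the workflow described in this article; the category names, the counts, and the function name relative_frequencies are hypothetical and chosen only so that the counts sum to 169.

```python
# Hypothetical survey counts (not the article's data); they sum to 169.
counts = {"PubMed": 80, "Scopus": 40, "Web of Science": 29, "Other": 20}


def relative_frequencies(counts):
    """Return each category's share of the total as a percentage."""
    total = sum(counts.values())
    return {name: 100 * n / total for name, n in counts.items()}


shares = relative_frequencies(counts)
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```

The shares should add up to 100%, which is a quick sanity check on any pie chart produced online.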
Frequency table
Example
You recorded the age of 35 athletes in completed years. You would like to get the frequency distribution of your numerical/quantitative discrete data.^{[7]}
Steps
 Go to https://www.socscistatistics.com/descriptive/frequencydistribution/default.aspx
 Copy the numbers and paste them in the box. Click on the “Generate” button
 You need to take a screenshot as the website does not provide an option to save the table as an image. You can also change the number of classes according to your choice from the “Edit Frequency Table” option.
Result
The frequency table is shown in [Figure 2]b. According to the class distribution, the highest number of athletes (10 [28.6%]) was in 23–27 years of age.
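To make the idea of class intervals concrete, the grouping behind a frequency table can be sketched in Python. This is an illustrative sketch under stated assumptions: the ages are hypothetical, the class width of 5 years mirrors the 23–27 class mentioned above, and the function frequency_table is not taken from any website.

```python
from collections import Counter

# Hypothetical ages in completed years (not the article's data set).
ages = [18, 19, 21, 22, 23, 23, 24, 25, 26, 27, 28, 30, 31, 33, 35]


def frequency_table(values, start, width):
    """Count values per class; assumes every value is >= start."""
    bins = Counter((v - start) // width for v in values)
    rows = []
    for k in range(max(bins) + 1):
        low = start + k * width
        count = bins.get(k, 0)
        # Each row: lower limit, upper limit, frequency, relative frequency (%).
        rows.append((low, low + width - 1, count, 100 * count / len(values)))
    return rows


table = frequency_table(ages, start=18, width=5)
for low, high, count, pct in table:
    print(f"{low}-{high}: {count} ({pct:.1f}%)")
```

Changing the width argument corresponds to changing the number of classes on the website.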
Stem and leaf display
Example
You have recorded the age of 30 research participants in completed years. The data set is not a large one; hence, it can be presented in a stem and leaf display.^{[8]}
Steps
 Go to https://www.calculatorsoup.com/calculators/statistics/stemleaf.php
 Copy the data and paste it in the box below “Enter Data Set.” Click on the “Calculate” button
 You need to take a screenshot of the result.
Result
The stem and leaf plot is shown in [Figure 2]c. The stem (1, 2, 3, and 4) is multiplied by 10 and the leaf value is added to get the actual value. In the plot shown in [Figure 2]c, one participant was 18 years old, one was 19, one was 20, one was 21, two were 22, and so on.
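The rule "stem times 10 plus leaf" can be illustrated with a short Python sketch. It is only an illustration: the ages are hypothetical (the first six follow the values read out above), and the function stem_and_leaf assumes two-digit values.

```python
from collections import defaultdict

# Hypothetical ages; the first six follow the values read out above.
ages = [18, 19, 20, 21, 22, 22, 25, 31, 34, 40]


def stem_and_leaf(values):
    """Group two-digit values by tens digit (stem) and units digit (leaf)."""
    groups = defaultdict(list)
    for v in sorted(values):
        groups[v // 10].append(v % 10)
    return dict(groups)


plot = stem_and_leaf(ages)
for stem, leaves in sorted(plot.items()):
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Reading a row back is the reverse operation: stem 2 with leaves 0 1 2 2 5 stands for 20, 21, 22, 22, and 25.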
Histogram
Example
You measured weight (in kg) of 20 research participants and you wanted to make a histogram with this data set.^{[9]}
Steps
 Go to https://www.aatbio.com/tools/onlinehistogrammaker
 Paste the data under “Data Entry;” click on “Process data.” Click on “Calculate histogram.” You may also click and rename the X axis, Y axis, and the title of the graph
 Right click on the histogram and click on the “Download Graph.” If it does not work, take a screenshot of the graph.
Result
The histogram is shown in [Figure 2]d. The weight is plotted on the X-axis, and the number of observations is shown on the Y-axis. The histogram provides a rough idea about the distribution of the data.
Frequency polygon
Example
You measured the heart rates (in beats per minute) of 17 research participants in three grades of exercises. You would like to check the comparative distribution shape (i.e., comparing three histograms) of the numerical continuous data.^{[7]}
Steps
 Go to https://www.socscistatistics.com/descriptive/polygon/default.aspx
 Copy each column of data and paste them in “Distribution 1,” “Distribution 2”, and “Distribution 3.” Click on the “Generate” button
 Right click on the image and use “Save image as” option to save the figure. You can also customize the axis name, series names and number of classes from “Edit Polygon” option.
Result
In [Figure 2]e, the distribution shapes of heart rate in mild, moderate, and vigorous exercise can be compared. The frequency and polygon table details are also shown on the result page.
Box plot
Example
You measured the body weight of 14 sedentary, 14 active, and 14 athlete research participants. You wanted to compare these three sets of data with a box-and-whisker graph.^{[7]}
Steps
 Go to https://goodcalculators.com/boxplotmaker
 Copy data and paste in “Group 1,” “Group 2,” and “Group 3;” name “Y-axis Title” as “Weight (kg).” Click on the “Draw” button
 Click on “Save as Portable Network Graphics (.png)” to save the box plot image.
Result
[Figure 2]f shows the box plot of the weight of three groups. There is an option to “+Add Group” when you have more than three groups. In the box plot, the upper whisker (upper line) extends to the maximum value and the lower whisker to the minimum value, excluding any outliers. The bold line in the box indicates the median, and the box itself indicates the interquartile range (from quartile 1 [25th percentile] at the bottom to quartile 3 [75th percentile] at the top). The circle inside the box indicates the mean, and a circle beyond the whiskers indicates an outlier. In this example, there was no outlier.
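The elements of a box plot correspond to the five-number summary, which can be sketched with Python's standard statistics module. The weights are hypothetical, the function five_number_summary is illustrative, and the "inclusive" quartile method is only one of several conventions, so a website may report slightly different quartiles for the same data.

```python
import statistics

# Hypothetical body weights in kg for one group (not the article's data).
weights = [52, 55, 58, 60, 61, 63, 64, 66, 68, 70, 72, 75, 79, 84]


def five_number_summary(values):
    """Minimum, quartile 1, median, quartile 3, and maximum of a data set."""
    # method="inclusive" is an assumption; other tools may use another rule.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


low, q1, median, q3, high = five_number_summary(weights)
print(f"min={low}, Q1={q1}, median={median}, Q3={q3}, max={high}")
print(f"interquartile range = {q3 - q1}")
```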
Bar chart
Example
You collected the number of publications of two authors in the last 5 years in PubMed Central. You wanted a comparison bar chart of the publications.^{[10]}
Steps
 Go to https://www.statskingdom.com/chartmaker.html
 Keep the “Type” as “Bar Chart;” write the title in the “Title” box; copy and paste the years in the “Category” column and the values of the first author in the “Group1” column; click on “Insert column” and paste the values of the second author in “Group2.” Click on the “Calculate” button. You can customize the chart by clicking on “More option;” you can name the axes, fix the range, and change the colors.
 Click on “Save image” and the image will be shown; right click on the image and use “Save image as” to save it.
Result
The comparative numbers of publications of the two authors are shown in [Figure 2]g. More columns can be added as needed.
Stacked bar chart
Example
You conducted a survey with 32 research participants on the knowledge about, attitude toward, and practice on COVID-19. The result was coded in three categories – correct, wrong, and equivocal. You wanted to make a stacked bar chart of the findings.^{[11]}
Steps
 Go to https://graphmaker.imageonline.co/stackedbarchart.php
 Write the “Line Names” as “Correct,” “Wrong,” and “Equivocal;” write the “Chart Title” as “Knowledge, Attitude, and Practice on COVID-19” and “X Axis” as “Responses;” edit the “Input chart parameters” according to the values (e.g., Knowledge: 23, 12, 7); delete the unnecessary rows. The chart would be automatically generated.
 Click on the “DownloadChart” to download the image file of the chart.
Result
The stacked bar chart is shown in [Figure 2]h. From the chart, comparative correct, wrong, and equivocal responses could be observed.
Line graph
Example
You collected data on the number of publications on Yoga and Acupuncture available in PubMed in the last 10 years. You wanted to compare the trend visually over this 10-year period.^{[12]}
Steps
 Go to https://www.rapidtables.com/tools/linegraph.html
 Write the “Graph title” as “Yoga and Acupuncture publication in last 10 years;” name the “Horizontal axis” as “Year” and “Vertical axis” as “Number of publication;” write the years in “Data labels” separated by comma and a space, select the “Number of lines” as “2 lines;” copy and paste the data of Yoga publication in “Line 1 data values” and Acupuncture publication number in “Line 2 data values;” you can make the line curved by selecting the “Curved line” option. Click on the “Draw” button
 Click on the download icon to save the image.
Result
The number of publications on Yoga and Acupuncture in the last 10 years in PubMed is shown in [Figure 2]i. More lines can be added to the figure according to the number of data sets.
Dot plot
Example
Twenty research participants completed a Stroop test, and the time was recorded in seconds. You would like to make a dot plot to graph your numerical data.^{[13]}
Steps
 Go to https://www.geogebra.org/m/BxqJ4Vag
 Copy the data set and paste in the Column A. The plot will be generated
 Take a screenshot of the plot.
Result
The dot plot is shown in [Figure 3]a. The plot also includes the mean, median, and standard deviation.
Figure 3: Descriptive statistics visualization – (a) dot plot, (b) central tendency, (c) variance, (d) quantile-quantile plot, (e) scatter plot, and (f) Venn diagram. High-resolution figures are available in supplementary file
Central tendency
Example
You measured the body weight of 34 employees and you wanted to get an idea about their mean weight.^{[14]}
Steps
 Go to https://www.calculatorsoup.com/calculators/statistics/meanmedianmode.php
 Copy the data and paste it in the box “Enter Data Set.” Click on the “Calculate” button
 Take a screenshot of the result to save it for any future need.
Result
The calculator shows the mean, median, mode, minimum, maximum, quartiles, and interquartile range [Figure 3]b. Hence, the central tendency of both normally distributed data (commonly expressed as mean and standard deviation) and non-normally distributed data (commonly expressed as median and quartile 1–quartile 3) can be reported from this result.
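The three measures of central tendency can also be checked with Python's standard statistics module. This is a minimal sketch; the weights are hypothetical and include one deliberately large value to show how the mean and the median react differently.

```python
import statistics

# Hypothetical body weights in kg (not the article's data set).
weights = [58, 60, 60, 62, 65, 67, 70, 72, 75, 91]

mean = statistics.mean(weights)      # arithmetic average
median = statistics.median(weights)  # middle value of the sorted data
mode = statistics.mode(weights)      # most frequent value

print(f"mean={mean}, median={median}, mode={mode}")
```

Here the single large value (91 kg) pulls the mean above the median, which is one reason the median is usually preferred for skewed data.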
Variance
Example
You measured the body weight of 34 employees and you wanted to get an idea about the variance of the data set. Variance indicates how far the numbers are spread from the mean and from each other. It is calculated by taking the average of the squared differences from the mean.^{[15]}
Steps
 Go to https://statscalculator.com
 Clear the data already present in the “Observations” box; paste your data in the box. Click on the “Calculate” button
 Take a screenshot of the output for further usage.
Result
[Figure 3]c shows part of the result. When the variance is calculated for a population, the denominator in the calculation is the number of observations (n). When the variance is calculated for a sample, the denominator in the calculation is n − 1 [Figure 3]c.
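The two denominators can be compared directly in a few lines of Python. By the standard definitions, the population variance divides the sum of squared deviations by n and the sample variance divides it by n − 1; the standard library functions statistics.pvariance and statistics.variance follow the same definitions. The weights below are hypothetical.

```python
import statistics

# Hypothetical body weights in kg (not the article's data set).
weights = [60, 62, 65, 67, 71]

n = len(weights)
mean = sum(weights) / n
squared_deviations = sum((w - mean) ** 2 for w in weights)

population_variance = squared_deviations / n    # denominator n
sample_variance = squared_deviations / (n - 1)  # denominator n - 1

print(f"population variance = {population_variance:.2f}")
print(f"sample variance     = {sample_variance:.2f}")
```

Because n − 1 is smaller than n, the sample variance is always the larger of the two for the same data; the difference shrinks as the number of observations grows.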
Quantile-quantile plot
Example
You measured the body weight of 34 employees and you wanted to get the normal quantile-quantile (Q-Q) plot of the distribution. A quantile is a point that divides the observations into equal groups. For example, the median is a quantile that divides the sample into two equal parts; quartiles are quantiles that divide the observations into four equal groups.^{[16]}
Steps
 Go to http://www.wessa.net/rwasp_varia1.wasp#output
 Clear the data present in the “Data” box; copy and paste the data in the box. Click on the “Compute” button
 Click on the “New Window” to open the image in a new window. Right click on the image and use “Save Image As” option to save the image.
Result
The Q-Q plot is shown in [Figure 3]d. The X-axis plots the theoretical quantiles and the Y-axis plots the sample quantiles. The Q-Q plot is a graphical method to get a rough idea about the nature of the distribution. When the data are normally distributed, the points fall close to a straight reference line (a 45° line when both axes are on the same standardized scale). Our data did not show that pattern. Hence, they may not be normally distributed.
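How the points of a normal Q-Q plot are obtained can be sketched with Python's standard library. This is an illustrative sketch, not the method used by the website: the weights are hypothetical, the function normal_qq_points is an assumption, and the plotting positions (i − 0.5)/n are one common convention among several.

```python
import statistics

# Hypothetical body weights in kg (not the article's data set).
weights = [55, 58, 60, 61, 63, 64, 66, 68, 71, 75, 82]


def normal_qq_points(values):
    """Pair each sorted observation with a standard normal quantile."""
    n = len(values)
    standard_normal = statistics.NormalDist()  # mean 0, standard deviation 1
    # Plotting positions (i - 0.5) / n are an assumed convention; tools differ.
    theoretical = [standard_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, sorted(values)))


points = normal_qq_points(weights)
for theoretical, observed in points:
    print(f"{theoretical:6.2f}  {observed}")
```

Plotting the first column on the X-axis against the second on the Y-axis gives the Q-Q plot; marked curvature away from a straight line suggests a departure from normality.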
Scatter plot
Example
You measured height (cm) and weight (kg) of 19 research participants. You wanted to make a scatterplot with this bivariate (studying two variables) numerical continuous data.^{[17]}
Steps
 Go to https://mathcracker.com/scatter_plot
 Copy the height data and paste it in “X data (comma or space separated);” copy the weight data and paste it in “Y data (comma or space separated);” write “Relationship of height and weight” in “Type the title (optional);” put “Height (cm)” in “Name of X variable (optional)” and “Weight (kg)” in “Name of Y variable (optional).” Click on the “GRAPH IT” button
 Right click on the graph and use “Save image as” option to save the image.
Result
The scatterplot is shown in [Figure 3]e. This graph is purely descriptive and does not show any regression line or correlation.
Venn diagram
Example
You collected data on the most liked five chapters in physiology from three groups of students. You would like to check and present visually about their common and unique choices in a Venn diagram.^{[18]}
Steps
 Go to https://bioinformatics.psb.ugent.be/webtools/Venn
 Copy the chapter names chosen by the first group of students into the “list 1” box; enter “Group 1” in “Provide name for the list (optional);” do the same for group 2 and group 3. Click on the “Submit” button
 Click on the “Save Image As PNG” to save the image.
Result
The Venn diagram is shown in [Figure 3]f. The website also shows the text about the overlapping and uniqueness of items (chapters). You can either save it as screenshot or save it as text by clicking on “Save text result.”
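The overlapping and unique items reported alongside a Venn diagram are plain set operations, which can be sketched in Python. The chapter names below are hypothetical and serve only to illustrate intersection and difference.

```python
# Hypothetical chapter choices (not the article's data set).
group1 = {"Cardiovascular", "Renal", "Respiratory", "Endocrine", "Blood"}
group2 = {"Cardiovascular", "Renal", "Nerve-muscle", "Endocrine", "Digestive"}
group3 = {"Cardiovascular", "Respiratory", "Nerve-muscle", "Endocrine", "Reproductive"}

# Central region of the diagram: chapters chosen by all three groups.
common_to_all = group1 & group2 & group3
# Region belonging to group 1 alone.
only_group1 = group1 - group2 - group3
# Region shared by groups 1 and 2 but not by group 3.
shared_1_and_2_only = (group1 & group2) - group3

print("All three groups:", sorted(common_to_all))
print("Group 1 only:", sorted(only_group1))
print("Groups 1 and 2 only:", sorted(shared_1_and_2_only))
```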
Results   
We were able to conduct common descriptive statistical tests online on public domain websites. We only included the websites that provide their service without requiring an account. Hence, anyone can just open the websites and conduct the tests. We listed brief guides to graph and summarize data using a pie chart, frequency table, stem and leaf display, histogram, frequency polygon, box plot, bar chart, stacked bar chart, line graph, dot plot, central tendency, variance, quantile-quantile plot, scatter plot, and Venn diagram.
The output visualization can be found in [Figure 2] and [Figure 3]. The list of the websites can be found in [Table 1]. For each test, the associated data, website link, and high-resolution image can be found in the supplementary file available in Figshare (http://dx.doi.org/10.6084/m9.figshare.16903072).
Discussion   
Biomedical researchers only occasionally receive formal training in biostatistics. However, a hands-on universal training program for all researchers is still needed to create a competent pool of physician-researchers.^{[19]} In India, postgraduate medical students and medical teachers are currently being trained in research methodology as a compulsory step for eligibility for university examinations or promotion. We assume this would boost the research competency of future physician-researchers.^{[20]}
In this article, we provided a quick hands-on guide to conduct various types of descriptive statistical tests. Descriptive statistics is the first step to organize, summarize, and visualize the data. The inferential tests come later. Anyone can download the fabricated example data and conduct the tests themselves following the steps to get a real-life experience. We encourage readers to try the tests with their own data for further experience.
The tests we described can be done online (on an internet browser) without any dedicated and installed software package. Hence, any user can conduct the tests even in public access computers that are connected to the internet. However, the websites we described here may discontinue their services at any point in time. That was the reason why we included diverse websites so that researchers can get an alternative if needed for their tests.
Conclusion   
This article provided a brief technical guide on how to conduct common descriptive statistics online and free of cost. Any researcher can carry out these tests without any dedicated and costly statistical software packages. However, a computer with an internet connection is the minimum requirement. The data visualization has also been described. The output visual elements can be used for presenting the data in a seminar or a manuscript. We presume that this article would help novice researchers in resource-limited settings where institutional access to statistical software is not available.
Supplementary file
http://dx.doi.org/10.6084/m9.figshare.16903072.
Acknowledgements
We thank Sarika Mondal and Ahana Aarshi for their support during the preparation of the manuscript.
Financial support and sponsorship
Nil.
Conflicts of interest
There are no conflicts of interest.
References   
1.  Guetterman TC. Basics of statistics for primary care research. Fam Med Community Health 2019;7:e000067.
2.  Kaliyadan F, Kulkarni V. Types of variables, descriptive statistics, and sample size. Indian Dermatol Online J 2019;10:82-6.
3.  Nick TG. Descriptive statistics. Methods Mol Biol 2007;404:33-52.
4.  Mayya SS, Monteiro AD, Ganapathy S. Types of biological variables. J Thorac Dis 2017;9:1730-3.
5.  Kaur P, Stoltzfus J, Yellapu V. Descriptive statistics. Int J Acad Med 2018;4:60-3.
6.  Duquia RP, Bastos JL, Bonamigo RR, González-Chica DA, Martínez-Mesa J. Presenting data in tables and charts. An Bras Dermatol 2014;89:280-5.
7.  Manikandan S. Frequency distribution. J Pharmacol Pharmacother 2011;2:54-6.
8.  Hazra A, Gogtay N. Biostatistics series module 1: Basics of biostatistics. Indian J Dermatol 2016;61:10-20.
9.  Shreffler J, Huecker MR. Exploratory Data Analysis: Frequencies, Descriptive Statistics, Histograms, and Boxplots. In: StatPearls. Treasure Island (FL): StatPearls Publishing; 2021. Available from: https://www.ncbi.nlm.nih.gov/books/NBK557570/. [Last updated on 2021 Mar 01].
10.  He Y, Yu X, Gan Y, Zhu T, Xiong S, Peng J, et al. Bar charts detection and analysis in biomedical literature of PubMed Central. AMIA Annu Symp Proc 2017;2017:859-65.
11.  Streit M, Gehlenborg N. Bar charts and box plots. Nat Methods 2014;11:117.
12.  Peebles D, Ali N. Expert interpretation of bar and line graphs: The role of graphicacy in reducing the effect of graph format. Front Psychol 2015;6:1673.
13.  Cornelius V, Cro S, Phillips R. Advantages of visualisations to evaluate and communicate adverse event information in randomised controlled trials. Trials 2020;21:1028.
14.  Rodrigues CFS, Lima FJC, Barbosa FT. Importance of using basic statistics adequately in clinical research. Rev Bras Anestesiol 2017;67:619-25.
15.  
16.  Voorman A, Lumley T, McKnight B, Rice K. Behavior of QQ-plots and genomic control in studies of gene-environment interaction. PLoS One 2011;6:e19416.
17.  Slutsky DJ. The effective use of graphs. J Wrist Surg 2014;3:67-8.
18.  Chen H, Boutros PC. VennDiagram: A package for the generation of highly-customizable Venn and Euler diagrams in R. BMC Bioinformatics 2011;12:35.
19.  Federer LM, Lu YL, Joubert DJ. Data literacy training needs of biomedical researchers. J Med Libr Assoc 2016;104:52-7.
20.  The cultures of academic medicine in India. Natl Med J India 2019;32:308-10.
