Hi, this is Omar. In today's article, we will review Gaussian Distribution Histograms and create an example using Node JS and LightningChart JS.
Gaussian Distribution is a concept of probability and statistics that helps us see how data is distributed around an average value, forming a symmetrical bell that shows the grouping of values around the mean.
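A normal distribution is fully described by two parameters: the mean and the standard deviation. The mean sets the central point, and the standard deviation shows how widely the data spreads around it. Skewness is a separate property: a normal distribution is symmetric, so a left or right skew in the data indicates a departure from normality.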
An important property within the distribution is that around 68% of the data are within one standard deviation of the central value, approximately 95% of the data are within two standard deviations, and approximately 99.7% of the data are within 3 standard deviations.
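These percentages can be checked numerically. The snippet below is a minimal sketch and not part of the tutorial project: the helper names normalPdf and probabilityWithin are made up for illustration, and the area under the curve is approximated with the trapezoidal rule.

```javascript
// Probability density of the standard normal distribution (mean 0, standard deviation 1)
const normalPdf = (x) => Math.exp(-0.5 * x * x) / Math.sqrt(2 * Math.PI)

// Approximate P(-k <= X <= k) with the trapezoidal rule (hypothetical helper)
const probabilityWithin = (k, steps = 10000) => {
    const width = (2 * k) / steps
    let area = 0
    for (let i = 0; i < steps; i++) {
        const left = -k + i * width
        const right = left + width
        area += 0.5 * (normalPdf(left) + normalPdf(right)) * width
    }
    return area
}

for (const k of [1, 2, 3]) {
    console.log(`within ${k} standard deviation(s): ${(probabilityWithin(k) * 100).toFixed(1)}%`)
}
```

The printed values should be close to the 68%, 95%, and 99.7% figures mentioned above.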
Visualizing a Gaussian distribution is valuable in data analysis, for example when exploring data from natural and artificial phenomena, making statistical inferences (such as hypothesis testing), simulating processes, and making predictions.
When we need to visualize the distribution of continuous numeric data, a histogram is usually the best choice. In a Gaussian distribution histogram, we can see whether the data is normally distributed or skewed.
A histogram divides the range of the dataset into groups of the same width, usually called bins, and counts the number of values that fall into each of them.
These bins (intervals or groups) are a key element to identify the distribution in a dataset.
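To make the idea of bins concrete, here is a small sketch that counts values into equal-width bins. The helper name countPerBin is hypothetical, and the sketch assumes the values are not all identical (otherwise the bin width would be zero).

```javascript
// Count how many values fall into each of `binCount` equal-width bins (hypothetical helper).
// Assumes the values are not all identical, so the bin width is greater than zero.
const countPerBin = (values, binCount) => {
    const min = Math.min(...values)
    const max = Math.max(...values)
    const binWidth = (max - min) / binCount
    const counts = new Array(binCount).fill(0)
    for (const value of values) {
        // The maximum value would land at index binCount, so clamp it into the last bin
        const index = Math.min(Math.floor((value - min) / binWidth), binCount - 1)
        counts[index]++
    }
    return counts
}

console.log(countPerBin([1, 2, 2, 3, 3, 3, 4, 4, 5], 4)) // [ 1, 2, 3, 3 ]
```

With four bins over the range 1 to 5, each bin is one unit wide, and the counts add up to the nine input values.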
I will now review how you can display data in histograms using LightningChart JS which is especially useful for developers who require statistical charts for data applications.
Template Setup
1) Download the template to follow the tutorial.
2) After downloading the template, you’ll see a file tree like this:
3) Open a new terminal and run the npm install command.
Starting with the chart
At the time of writing, the most recent versions are LightningChart JS 5.1.0 and XYData 1.4.0. I recommend checking for newer releases and updating the dependencies, because some LightningChart JS tools do not exist in previous versions.
In the project’s package.json file you can find the LightningChart JS dependencies:
"dependencies": {
"@arction/lcjs": "^5.1.0",
"@arction/xydata": "^1.4.0",
"webgl-obj-loader": "^2.0.8"
}
1) Importing libraries
We will start by importing the necessary libraries to create our chart.
// Import LightningChartJS
const lcjs = require('@arction/lcjs')
const {
lightningChart,
AxisTickStrategies,
BarChartTypes,
BarChartSorting,
Themes
} = lcjs
2) Add license key (free)
Once the LightningChart JS libraries are installed and imported into our chart.ts file, we need a license key. A trial license is free. We store the key in a variable that will be used when creating the JavaScript Gaussian Distribution chart.
let license = undefined
try {
license = 'xxxxxxxxxxxxx'
} catch (e) {}
3) Properties
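// Assumed opening line (not shown in the original snippet): create the chart with the license key
const barChart = lightningChart({ license })
.BarChart({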
theme: Themes.cyberSpace,
type: BarChartTypes.Vertical
})
.setTitle('Histogram')
.setSorting(BarChartSorting.Disabled)
.setValueLabels(undefined)
.setData(histogramData)
.setCursorResultTableFormatter((builder, category, value, bar) => builder
.addRow('Range:', '', category)
.addRow('Amount of values:', '', bar.chart.valueAxis.formatValue(value))
)
- theme: defines the look and feel of the chart. The theme must be specified when the chart is created (here, Themes.cyberSpace).
- setTitle: sets the title shown at the top of the chart.
- setSorting: automatic sorting must be disabled to preserve the bell curve shape.
Setting up bins
We need to add an HTML element to the chart to modify the number of bins directly on the user interface. This element will allow us to enter up to 1000 bins and will update the chart immediately.
We have to access the chart container:
const barDiv = barChart.engine.container
const inputDiv = document.createElement('div')
barDiv.append(inputDiv)
inputDiv.style.position = "absolute"
inputDiv.style.top = "0"
The **createElement** function creates an HTML element, in this case a DIV. Basic positioning can be specified with the position and top properties of the element's style, which is exposed through the ElementCSSInlineStyle interface. This configuration applies to the div that will contain the text input. To create the “Number of bins” label, we follow the same procedure:
const label = document.createElement('label')
inputDiv.append(label)
label.innerHTML = "Number of bins:"
label.style.position = "relative"
Now we will create the input element, which has its own configuration to limit the range of accepted values:
const binInput = document.createElement('input')
inputDiv.append(binInput)
barChart.setTitleMargin({top: 25, bottom: -10})
binInput.type = "number"
binInput.min = "1"
binInput.max = "1000"
binInput.value = "100"
binInput.style.position = "relative"
binInput.style.height = "30px"
The input allows us to add similar visual properties to the previous two elements but also allows us to specify the type of value it will accept. In this case, the input will only allow numbers, with a minimum value of 1 and a maximum value of 1000 for the bins.
binInput.addEventListener('input', () => {
const inputValue = parseInt(binInput.value, 10)
if (Number.isInteger(inputValue) && inputValue > 0 && inputValue <= 1000) {
barChart.setData([])
const histogramData = calculateHistogramBins(values, inputValue)
barChart.setData(histogramData).setSorting(BarChartSorting.Disabled)
}
})
This last snippet adds a listener to the input element. The listener runs each time the user enters a value and, when that value is between 1 and 1000, recalculates the bins and updates the chart.
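The range check inside the listener can also be pulled out into a small pure function, which makes it easy to test without a browser. This is only a sketch: parseBinCount is a hypothetical helper, not part of the template.

```javascript
// Hypothetical helper: returns a bin count between min and max, or null when the input is not usable
const parseBinCount = (rawValue, min = 1, max = 1000) => {
    const parsed = Number.parseInt(rawValue, 10)
    if (!Number.isInteger(parsed) || parsed < min || parsed > max) {
        return null
    }
    return parsed
}

console.log(parseBinCount('100'))  // 100
console.log(parseBinCount(''))     // null
console.log(parseBinCount('0'))    // null
console.log(parseBinCount('1001')) // null
```

Inside the listener, the chart would only be updated when the helper returns a number. Note that parseInt truncates decimals, so an input such as '12.7' is treated as 12.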
Setting up axes
barChart.valueAxis.setTickStrategy(AxisTickStrategies.Numeric, ticks =>
ticks.setMajorTickStyle(major =>
major.setGridStrokeStyle(
barChart.getTheme().xAxisNumericTicks.majorTickStyle.gridStrokeStyle
)
).setMinorTickStyle(( tickStyle ) =>
tickStyle.setGridStrokeStyle(
barChart.getTheme().yAxisNumericTicks.minorTickStyle.gridStrokeStyle
)
)
)
The setTickStrategy function defines the positioning and formatting logic of the axis ticks as well as their style. In this case, the grid lines take their style from the theme used for the chart. Lastly, we assign the fill style of the first bin to all the bins:
const bars = barChart.getBars()
const fillStyle = bars[0].getFillStyle()
bars.forEach(bar => { bar.setFillStyle(fillStyle) })
Creating Data Points
To generate the data points, we will use the generateGaussianRandom function, which produces normally distributed random samples using the polar form of the Box-Muller transform (also known as the Marsaglia polar method).
const generateGaussianRandom = (length) => {
const samples = []
for (let i = 0; i < length; i++) {
let u = 0, v = 0, s = 0
while (s === 0 || s >= 1) {
u = Math.random() * 2 - 1
v = Math.random() * 2 - 1
s = u * u + v * v
}
const temp = Math.sqrt(-2 * Math.log(s) / s)
const sample = u * temp
samples.push(sample)
}
return samples
}
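The function above returns samples with mean 0 and standard deviation 1. If you need a different mean or spread, each sample can be shifted and scaled as mean + stdDev * z. The following sketch repeats the same polar method with those two extra parameters and then checks the result. The names gaussianSamples and summarize, and the parameters mean, stdDev, and random, are my own assumptions for illustration.

```javascript
// Same polar method as above, with a configurable mean and standard deviation.
// The parameter names (mean, stdDev, random) are assumptions made for this sketch.
const gaussianSamples = (length, mean = 0, stdDev = 1, random = Math.random) => {
    const samples = []
    while (samples.length < length) {
        const u = random() * 2 - 1
        const v = random() * 2 - 1
        const s = u * u + v * v
        if (s === 0 || s >= 1) continue // reject the origin and points outside the unit circle
        const z = u * Math.sqrt((-2 * Math.log(s)) / s) // standard normal value
        samples.push(mean + stdDev * z)
    }
    return samples
}

// Sample mean and standard deviation, to compare against the requested parameters
const summarize = (values) => {
    const mean = values.reduce((sum, x) => sum + x, 0) / values.length
    const variance = values.reduce((sum, x) => sum + (x - mean) ** 2, 0) / (values.length - 1)
    return { mean, stdDev: Math.sqrt(variance) }
}

console.log(summarize(gaussianSamples(50000, 10, 2)))
```

Because the samples are random, the printed mean and standard deviation vary from run to run, but with 50,000 samples they should land close to 10 and 2.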
Once the sample is generated, we group it into bins and count the values in each one using the **calculateHistogramBins** function.
const calculateHistogramBins = (data, numberOfBins) => {
const minValue = Math.min(...data)
const maxValue = Math.max(...data)
const binSize = (maxValue - minValue) / numberOfBins
We need three values: the minimum, the maximum, and the number of intervals. The minimum and maximum are taken from the data argument. Subtracting the minimum from the maximum gives the range, which is divided by the number of intervals.
This gives us the size (width) of each bin. Each interval starts at the minimum value plus the interval number in the loop (i) multiplied by the bin size, and ends one bin size later. The start and end are rounded to two decimals and added to the bins array:
// Calculate bin intervals
const bins = []
for (let i = 0; i < numberOfBins; i++) {
const binStart = minValue + i * binSize
const binEnd = minValue + (i + 1) * binSize
bins.push({
binStart: Math.round(binStart * 100) / 100,
binEnd: Math.round(binEnd * 100) / 100,
values: Array(),
})
}
bins[numberOfBins - 1].binEnd = maxValue
Now we assign each value to its bin. The bin index is the value minus the minimum value, divided by the bin size and rounded down:
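data.forEach(value => {
// Clamp the index so the maximum value lands in the last bin instead of being dropped
const binIndex = Math.min(Math.floor((value - minValue) / binSize), numberOfBins - 1);
if (binIndex >= 0 && binIndex < numberOfBins) {
bins[binIndex].values.push(value);
}
})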
If the index is at least zero and less than the specified number of bins, the value is added to the bin at that index in the bins array. Finally, we will assign values to category and value:
const barChartData = []
bins.forEach(interval => {
barChartData.push({
category: `${
(interval.binStart + (interval.binStart === minValue ? 0 : 0.01)).toFixed(2)}—${
interval.binEnd < 0 ? `(${interval.binEnd.toFixed(2)})` : interval.binEnd.toFixed(2)}`,
value: interval.values.length
})
})
return barChartData
}
Two conditions are applied when building the category label:
If the value of binStart is equal to the minimum value obtained at the beginning, 0 is added to it. Otherwise 0.01 is added, presumably so that the start of a bin does not repeat the end of the previous one.
If the value of binEnd is less than 0, it is enclosed in parentheses.
The value property is the number of values inside the bin. Finally, the barChartData array is returned as the result.
Run the project
To run the Gaussian distribution chart project, run the npm start command in the terminal to view the chart on a local server.
Conclusion
Thank you for coming this far, today's topic was quite interesting although quite long. The histogram could be defined as a statistical graph that represents the distribution of a set of data through bars, each of which represents a category or interval. Bins are also sometimes called "intervals", "classes", or "buckets".
To create a histogram with LightningChart JS, we used the BarChart with a vertical orientation. The histogram is made up of bars (bins), which were created with the help of the generateGaussianRandom and calculateHistogramBins functions.
The first function generates the sample or dataset, while the second helps us calculate the intervals, which was the most complicated part of this exercise.
In your case, if you already have a set of intervals, it will only be necessary to generate a data source that the chart can consume, such as a JSON file that contains the Category and Value elements.
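As a rough sketch of that last case, suppose the intervals were already counted elsewhere. The field names from, to, and count, and the helper toBarChartData, are hypothetical. Only the category and value properties come from this article.

```javascript
// Hypothetical input: intervals that were already counted elsewhere
const intervals = [
    { from: 0, to: 10, count: 4 },
    { from: 10, to: 20, count: 9 },
    { from: 20, to: 30, count: 3 },
]

// Map each interval to the { category, value } shape used for the bar chart data in this article
const toBarChartData = (items) =>
    items.map(({ from, to, count }) => ({
        category: `${from} to ${to}`,
        value: count,
    }))

console.log(JSON.stringify(toBarChartData(intervals), null, 2))
```

The resulting array can be passed to setData in the same way as the output of calculateHistogramBins.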
If you have any questions or comments, do not hesitate to contact me. Thank you!
Written by:
Omar Urbano | Software Engineer & Technical Writer
Send me your questions via LinkedIn