A short time ago, I wrote a service that fetches historical data from Yahoo! Finance in order to generate Monte Carlo simulations based on a security's performance. For those of you who may be unfamiliar with Monte Carlo simulations, they are a probabilistic modeling technique that assesses potential outcomes in complex scenarios involving random variables. By simulating multiple probability scenarios, Monte Carlo simulations help analyze risk and uncertainty across various fields like investing, business, physics, and engineering.
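To make the idea concrete, here is a minimal, hypothetical sketch of a Monte Carlo run in Node.js: it repeatedly applies randomly sampled daily returns to a starting price and collects the range of possible outcomes. The parameters (drift, volatility, horizon, number of runs) are made up purely for illustration and are not taken from my actual service:

// Illustrative only: made-up parameters, uniform random shocks.
function simulateFinalPrices({ startPrice, dailyDrift, dailyVolatility, days, runs }) {
  const finalPrices = [];
  for (let run = 0; run < runs; run++) {
    let price = startPrice;
    for (let day = 0; day < days; day++) {
      // Random daily return centered on the drift.
      const shock = dailyVolatility * (Math.random() * 2 - 1);
      price *= 1 + dailyDrift + shock;
    }
    finalPrices.push(price);
  }
  return finalPrices;
}

const outcomes = simulateFinalPrices({
  startPrice: 100,
  dailyDrift: 0.0003,
  dailyVolatility: 0.02,
  days: 252,
  runs: 1000
});
console.log(Math.min(...outcomes), Math.max(...outcomes));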
One of the great things about Yahoo! Finance is that it lets you download historical data in CSV format. The only problem is that my service - like many - serves its response in JSON format. That required performing a transformation on the incoming data. This article will cover exactly how I achieved that in Node.js using the excellent Node.js csvtojson library.
Comparing the CSV and JSON Formats
Let's quickly go over both formats and establish what needs to happen in order to go from one to the other.
CSV (Comma-Separated Values) is a simple file format used to store tabular data. In a CSV file, each line represents a row of data, and values within each row are separated by commas. It's one of the most popular formats for importing data into a database. For example, here's a sample of a Yahoo! Finance response:
Date,Open,High,Low,Close,Adj Close,Volume
2004-08-19,2.490664,2.591785,2.390042,2.499133,2.499133,897427216
2004-08-20,2.515820,2.716817,2.503118,2.697639,2.697639,458857488
2004-08-23,2.758411,2.826406,2.716070,2.724787,2.724787,366857939
2004-08-24,2.770615,2.779581,2.579581,2.611960,2.611960,306396159
2004-08-25,2.614201,2.689918,2.587302,2.640104,2.640104,184645512
2004-08-26,2.613952,2.688672,2.606729,2.687676,2.687676,142572401
2004-08-27,2.692408,2.705360,2.632383,2.643840,2.643840,124826132
Meanwhile, JSON (JavaScript Object Notation) is a lightweight, text-based data interchange format that is easy for humans to read and write and simple for machines to parse and generate. It consists of two primary structures: objects (key-value pairs enclosed in curly braces {}) and arrays (ordered lists enclosed in square brackets []). To get an idea what it looks like, here is a partial response from my API service:
{
  "stockData": [
    {
      "Date": "2019-01-02T00:00:00.000Z",
      "Close": 52.233059
    },
    {
      "Date": "2019-01-03T00:00:00.000Z",
      "Close": 50.745255
    },
    {
      "Date": "2019-01-04T00:00:00.000Z",
      "Close": 53.474648
    },
    {
      "Date": "2019-01-07T00:00:00.000Z",
      "Close": 53.35878
    },
    {
      "Date": "2019-01-08T00:00:00.000Z",
      "Close": 53.752831
    },
    // etc...
  ]
}
Getting Started With csvtojson
There are a few similar libraries, with csvtojson being one of the most active. To install it using the Node Package Manager (npm), run the following in the terminal:
npm i csvtojson
Next, add the following import statement to the top of your Node script:
const csvtojson = require("csvtojson");
And just like that, we're ready to use the library!
Transforming the Data
To transform the data, all we need to do is call csvtojson() with a few options and chain the fromString() method onto it:
try {
  const response = await fetch(url);
  if (response.status !== 200) {
    throw new Error('Fetch failed. Received a response of '
      + response.status);
  }
  const data = await csvtojson({
    checkType: true,
    colParser: {
      "Date": dt => new Date(dt),
      "Open": "omit",
      "High": "omit",
      "Low": "omit",
      "Volume": "omit"
    }
  }).fromString(await response.text());
  return data;
} catch (err) {
  // handle errors
}
There is also a fromStream() method, but I found it easier to handle fetch-related errors by separating the calls.
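For reference, a fromStream() version might look something like this when reading from a local CSV file rather than a fetch response (the file name below is just a placeholder):

const fs = require("fs");
const csvtojson = require("csvtojson");

// Same options as above, but fed from a readable stream rather than a string.
async function loadLocalCsv() {
  return csvtojson({
    checkType: true,
    colParser: {
      "Date": dt => new Date(dt),
      "Open": "omit",
      "High": "omit",
      "Low": "omit",
      "Volume": "omit"
    }
  }).fromStream(fs.createReadStream("historical-prices.csv")); // placeholder file name
}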
A bit about the options:
By default, csvtojson treats every value as a string. To override this behaviour, we can set the checkType property to true. Doing so tells csvtojson to pick an appropriate type parser based on each cell value: if a cell value is "5", the numberParser will be used, and every value in that column will be transformed with it. There are built-in parsers for strings and numbers, but not for dates, which is why the "Date" column uses a custom conversion function.
There is one other built-in parser: a value of "omit" omits the whole column.
The result is an array of row objects, so we can access fields with dot notation, e.g., data[0].Date, or with bracket notation for headers that contain spaces, e.g., data[0]['Adj Close'].
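Here is a small, self-contained sketch showing checkType, "omit", and a custom Date parser working together on a made-up CSV string:

const csvtojson = require("csvtojson");

const sampleCsv = [
  "Date,Open,Adj Close,Volume",
  "2004-08-19,2.490664,2.499133,897427216",
  "2004-08-20,2.515820,2.697639,458857488"
].join("\n");

csvtojson({
  checkType: true,                 // numbers become numbers instead of strings
  colParser: {
    "Date": dt => new Date(dt),    // custom parser: string -> Date object
    "Open": "omit",                // drop these columns entirely
    "Volume": "omit"
  }
})
  .fromString(sampleCsv)
  .then(data => {
    console.log(data[0].Date instanceof Date);  // true
    console.log(typeof data[0]["Adj Close"]);   // "number"
  });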
Changing a Property Name
The "Close" attribute in the API response object is actually the "Adj Close", but I didn't want to call it that, so I changed the attribute name. To transform the result that is sent to downstream, we can use the .subscribe() method for each json object, i.e., for each transformed row.
To do that, I assign the "Adj Close" value to the new attribute and then delete the existing column. This is all done within a Promise due to the asynchronous nature of the transformation process:
const data = await csvtojson({
  checkType: true,
  colParser: {
    "Date": dt => new Date(dt),
    "Open": "omit",
    "High": "omit",
    "Low": "omit",
    "Volume": "omit"
  }
}).fromString(await response.text()).subscribe(jsonObj =>
  new Promise(resolve => {
    // use the adj close, then drop the original property
    jsonObj.Close = jsonObj['Adj Close'];
    delete jsonObj['Adj Close'];
    resolve();
  })
);
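At this point the data array is ready to be wrapped and sent to the client under the stockData key, as in the sample response shown earlier. As a sketch, a hypothetical Express handler could look like this (Express, the route, and the fetchAndTransform() helper are my assumptions here, not necessarily what the real service uses):

const express = require("express");
const app = express();

app.get("/historical/:symbol", async (req, res) => {
  try {
    // fetchAndTransform() is a hypothetical helper wrapping the fetch + csvtojson logic above.
    const data = await fetchAndTransform(req.params.symbol);
    res.json({ stockData: data });
  } catch (err) {
    res.status(502).json({ error: "Unable to fetch historical data" });
  }
});

app.listen(3000);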
Conclusion
In this tutorial we learned how to transform Yahoo! Finance's historical price data for a security from CSV to JSON using the Node.js csvtojson library.
You can try the Monte Carlo simulator API on RapidAPI. I also wrote an API to Get the Stock Buy Price to Earn Desired Returns and a Total Credit Card Interest Calculator. All are free to use up to a certain threshold and then require a paid subscription beyond that.
If you have any questions about my APIs or would like to inquire about having a custom one built, feel free to email me at raylsstr(AT)gmail(DOT)com.