Introduction
Google Finance is a data-rich page that gives traders and investors access to international exchanges, real-time financial news, and financial analysis, keeping them up to date with the current market scenario.
A data-rich website like this has a couple of advantages:
- It helps traders and investors with pricing analysis.
- A company's financial data can help people decide which stock to purchase in the future.
- Technical indicators and analysis can be used to study trends in past stock prices and predict the future trend of share prices.
Requirements:
Web Parsing with CSS selectors
Searching for tags in raw HTML is both difficult and time-consuming. It is better to use the CSS Selectors Gadget to pick the right tags and make your web scraping journey easier.
This gadget can help you come up with the right CSS selector for your needs. Here is the link to the tutorial, which will teach you how to use this gadget to select the best CSS selectors.
User Agents
The User-Agent header identifies the application, operating system, vendor, and version of the requesting user agent. Sending a realistic one helps the scraper's visit to Google look like it comes from a real user.
You can also rotate User Agents. Read more about this in this article: How to fake and rotate User Agents using Python 3.
If you want to further safeguard your IP from being blocked by Google, you can try these 10 Tips to avoid getting Blocked while Scraping Google.
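As a rough illustration of the rotation idea in Node JS, here is a minimal sketch. The pool of User-Agent strings and the helper names pickUserAgent and buildHeaders are assumptions made for this example, not part of any library.

```javascript
// Hypothetical pool of User-Agent strings; replace these with current ones.
const USER_AGENTS = [
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36",
  "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36",
  "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36",
];

// Return one entry of `pool` at random.
// `random` can be swapped out to make the choice predictable in tests.
function pickUserAgent(pool = USER_AGENTS, random = Math.random) {
  const index = Math.floor(random() * pool.length);
  return pool[index];
}

// Build the header object that would be passed to the HTTP client.
function buildHeaders() {
  return { "User-Agent": pickUserAgent() };
}
```

Calling buildHeaders() before each request gives every request a User-Agent drawn from the pool, so repeated visits do not all carry the same string.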
Install Libraries
To start scraping Google Finance results, we first need to install some NPM libraries.
So before starting, we have to ensure that we have set up our Node JS project and installed both packages: Unirest JS and Cheerio JS. Both are available from npm (npm i unirest cheerio).
Process:
Let's start scraping the Google Finance Page. We will be using Unirest JS to extract the raw HTML data and parse this data with the help of Cheerio JS.
Copy this link and open it in your browser so we can start selecting the tags to parse the required data.
https://www.google.com/finance/?hl=en
Now, we will make a GET request to our target URL using Unirest JS to extract the HTML data.
const unirest = require("unirest");
const cheerio = require("cheerio");
const getFinanceData = async () => {
  const url = "https://www.google.com/finance/?hl=en";
  const response = await unirest
    .get(url)
    .header({"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36"});
  const $ = cheerio.load(response.body);
Step-by-step explanation:
- In the sixth line, we made a GET request to the target URL.
- In the next line, we passed User-Agent as a header with the URL, so our bot can mimic a real organic user.
- Next, we load the response in a cheerio instance variable.
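Before handing the response to cheerio, it can help to check that the request actually succeeded. The sketch below is one possible guard, not part of the original scraper. It assumes the response object exposes a numeric status and a string body, as Unirest responses do, and the helper name isUsableResponse is made up for this example.

```javascript
// Decide whether a response is worth passing to cheerio.load().
// Assumption: `response.status` is the HTTP status code and
// `response.body` is the raw HTML string.
function isUsableResponse(response) {
  if (!response) return false;
  const okStatus = response.status >= 200 && response.status < 300;
  const hasBody =
    typeof response.body === "string" && response.body.trim().length > 0;
  return okStatus && hasBody;
}
```

Inside getFinanceData, a check such as if (!isUsableResponse(response)) throw new Error("Request failed") placed right after the request would stop the parser from running on an error page or an empty body.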
Now, we will prepare our parser by searching for the tags with the help of the CSS selector gadget mentioned above in the Requirements section.
As you can see, the above tabs are under the tag .H8Ch1 .SxcTic. So, its parser will look like this:
let interested_top = [];
$(".H8Ch1 .SxcTic").each((i, el) => {
  // Collect one object per card in the top block
  interested_top.push({
    stock_name: $(el).find(".ZvmM7").text(),
    price: $(el).find(".YMlKec").text(),
    change_in_price: $(el).find(".P2Luy").text(),
    change_in_percentage: $(el).find(".JwB6zf").text()
  });
});
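Every field collected this way is a string, for example a price with a currency symbol and thousands separators. For pricing analysis it is usually handier to have numbers. The helper below is a small sketch of one way to convert them. The name toNumber is an assumption for this example, and it assumes English-style formatting (comma for thousands, dot for decimals), which matches the hl=en page.

```javascript
// Convert scraped strings such as "₹2,593.70", "-$0.22", "+₹7.20" or "2.37%"
// into plain numbers. Returns NaN when no number can be found.
// Assumption: comma is the thousands separator and dot is the decimal point.
function toNumber(text) {
  if (typeof text !== "string") return NaN;
  const cleaned = text
    .replace(/\u2212/g, "-") // treat a typographic minus sign as a hyphen
    .replace(/[^0-9.+-]/g, ""); // drop currency symbols, commas, percent signs
  if (cleaned === "") return NaN;
  return Number.parseFloat(cleaned);
}
```

For instance, toNumber(item.price) and toNumber(item.change_in_price) could be applied to each entry of interested_top once the parser has run.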
Now, we will parse the financial news results.
So, we found the tag for financial news results. This is how we parse it:
let financial_news = [];
$(".yY3Lee").each((i, el) => {
  financial_news.push({
    title: $(el).find(".Yfwt5").text(),
    link: $(el).find("a").attr("href"),
    source: $(el).find(".sfyJob").text(),
    time: $(el).find(".Adak").text()
  });
});
Similarly, if we follow the same process to find the tags for the other blocks and tabs, our whole parser looks like this:
let interested_top = [];
$(".H8Ch1 .SxcTic").each((i, el) => {
  interested_top.push({
    stock_name: $(el).find(".ZvmM7").text(),
    price: $(el).find(".YMlKec").text(),
    change_in_price: $(el).find(".P2Luy").text(),
    change_in_percentage: $(el).find(".JwB6zf").text()
  });
});
let financial_news = [];
// Sometimes the scraper returns empty news; if so, run the program again
$(".yY3Lee").each((i, el) => {
  financial_news.push({
    title: $(el).find(".Yfwt5").text(),
    link: $(el).find("a").attr("href"),
    source: $(el).find(".sfyJob").text(),
    time: $(el).find(".Adak").text()
  });
});
let market_trends = [];
$(".gR2U6").each((i, el) => {
  market_trends[i] = $(el).text();
});
let interested_bottom = [];
$(".tOzDHb").each((i, el) => {
  interested_bottom.push({
    stock_name: $(el).find(".RwFyvf").text(),
    price: $(el).find(".YMlKec").text(),
    change_in_percentage: $(el).find(".JwB6zf").text()
  });
});
let calendar_results = [];
$(".kQQz8e").each((i, el) => {
  calendar_results.push({
    stock_name: $(el).find(".qNqwJf").text(),
    date_and_time: $(el).find(".fbt0Xc").text(),
    // The href is relative ("./quote/..."), so swap the leading "." for the site base
    link: $(el).find("a").attr("href")?.replace(".", "https://www.google.com/finance")
  });
});
let most_followed_on_google = [];
$(".NaLFgc").each((i, el) => {
  most_followed_on_google.push({
    stock_name: $(el).find(".TwnKPb").text(),
    following: $(el).find(".Iap8Fc").text().replace(" following", ""),
    change_in_percentage: $(el).find(".JwB6zf").text(),
    link: $(el).attr("href")?.replace(".", "https://www.google.com/finance")
  });
});
console.log("interested_top:", interested_top);
console.log("financial_news:", financial_news);
console.log("market_trends:", market_trends);
console.log("interested_bottom:", interested_bottom);
console.log("calendar_results:", calendar_results);
console.log("most_followed_on_google:", most_followed_on_google);
And here are the results:
interested_top: [
  {
    stock_name: 'BSE Ltd',
    price: '₹586.75',
    change_in_price: '-₹14.25',
    change_in_percentage: '2.37%'
  },
  {
    stock_name: 'ITC Ltd',
    price: '₹360.70',
    change_in_price: '+₹7.20',
    change_in_percentage: '2.04%'
  },
  {
    stock_name: 'S&P 500',
    price: '3,766.18',
    change_in_price: '-61.93',
    change_in_percentage: '1.62%'
  },
  {
    stock_name: 'Dow Jones Industrial Average',
    price: '32,648.88',
    change_in_price: '-511.95',
    change_in_percentage: '1.54%'
  },
  {
    stock_name: 'NuStar Energy L.P.',
    price: '$16.27',
    change_in_price: '-$0.22',
    change_in_percentage: '1.33%'
  },
  {
    stock_name: 'Reliance Industries Ltd',
    price: '₹2,593.70',
    change_in_price: '-₹12.90',
    change_in_percentage: '0.49%'
  }
]
financial_news: [
  {
    title: 'Why has Facebook parent Meta fired employees',
    link: 'https://indianexpress.com/article/explained/meta-to-fire-11000-people-why-facebook-parent-is-cutting-jobs-what-next-8259364/',
    source: 'The Indian Express',
    time: '4 hours ago'
  },
  {
    title: `"Just Killed It": Elon Musk On Twitter's 'Official' Tag, Hours After Launch`,
    link: 'https://www.ndtv.com/world-news/twitter-will-do-lots-of-dumb-things-elon-musk-after-official-tag-vanishes-3505913',
    source: 'NDTV.com',
    time: '30 minutes ago'
  },
  {
    title: 'Tata Motors Q2 Results: Firm sees higher-than-expected net loss of ₹945 cr; \n' +
      'revenue jumps 30% | Mint',
    link: 'https://www.livemint.com/companies/company-results/tata-motors-q2-results-firm-sees-higher-than-expected-net-loss-of-rs-945-cr-revenue-jumps-30-11667983557507.html',
    source: 'Mint',
    time: '8 hours ago'
  },
  {
    title: "Govt to sell SUUTI's 1.55% stake in Axis Bank via OFS on November 10-11",
    link: 'https://www.moneycontrol.com/news/business/govt-to-sell-1-5-stake-in-axis-bank-via-ofs-on-nov-10-11-9484291.html',
    source: 'Moneycontrol',
    time: '3 hours ago'
  },
  {
    title: 'Large and mega deals dry up for TCS, Infosys, Wipro, HCL Tech',
    link: 'https://www.moneycontrol.com/news/business/large-and-mega-deals-dry-up-for-tcs-infosys-wipro-hcl-tech-9474901.html',
    source: 'Moneycontrol',
    time: '14 hours ago'
  },
  {
    title: 'HRtech platform Keka raises $57M in Series A round',
    link: 'https://yourstory.com/2022/11/hrtech-platform-keka-raises-series-a-westbridge-capital',
    source: 'YourStory',
    time: '13 hours ago'
  },
  {
    title: 'Mobile plans of Amazon Prime, Hotstar, Netflix and others: How much they \n' +
      'cost, benefits and other details',
    link: 'https://www.gadgetsnow.com/slideshows/mobile-plans-of-amazon-prime-hotstar-netflix-and-others-how-much-they-cost-benefits-and-other-details/photolist/95386786.cms',
    source: 'Gadgets Now',
    time: '16 hours ago'
  },
  {
    title: 'Royal Enfield at EICMA 2022',
    link: 'https://www.youtube.com/watch?v=V-CvJs4ZZVU',
    source: 'YouTube',
    time: '1 day ago'
  },
  {
    title: 'In an interaction with Yash Bhuva, Marketing Head, Sheetal Cool Products Ltd',
    link: 'https://www.dsij.in/dsijarticledetail/in-an-interaction-with-yash-bhuva-marketing-head-sheetal-cool-products-ltd-27207-1',
    source: 'Dalal Street',
    time: '1 day ago'
  }
]
market_trends: [ 'Market indexes', 'Climate leaders', 'Crypto', 'Currencies' ]
interested_bottom: [
  {
    stock_name: 'BSE SENSEX',
    price: '61,033.55',
    change_in_percentage: '0.25%'
  },
  {
    stock_name: 'NIFTY 50',
    price: '18,157.00',
    change_in_percentage: '0.25%'
  },
  {
    stock_name: 'Dow Jones Industrial Average',
    price: '32,648.88',
    change_in_percentage: '1.54%'
  },
  {
    stock_name: 'BSE Ltd',
    price: '₹586.75',
    change_in_percentage: '2.37%'
  },
  {
    stock_name: 'S&P 500',
    price: '3,766.18',
    change_in_percentage: '1.62%'
  },
  {
    stock_name: 'Reliance Industries Ltd',
    price: '₹2,593.70',
    change_in_percentage: '0.49%'
  },
  {
    stock_name: 'Yes Bank Limited',
    price: '₹16.55',
    change_in_percentage: '0.00%'
  },
  {
    stock_name: 'State Bank of India',
    price: '₹614.60',
    change_in_percentage: '0.073%'
  },
  {
    stock_name: 'ITC Ltd',
    price: '₹360.70',
    change_in_percentage: '2.04%'
  },
  {
    stock_name: 'NuStar Energy L.P.',
    price: '$16.27',
    change_in_percentage: '1.33%'
  },
  {
    stock_name: 'Mahindra And Mahindra Ltd',
    price: '₹1,334.70',
    change_in_percentage: '1.23%'
  },
  {
    stock_name: 'Future Retail Ltd',
    price: '₹3.35',
    change_in_percentage: '4.69%'
  },
  {
    stock_name: 'Vodafone Idea Ltd',
    price: '₹8.50',
    change_in_percentage: '1.80%'
  },
  {
    stock_name: 'OFS Capital Corp',
    price: '$10.48',
    change_in_percentage: '0.73%'
  },
  { stock_name: 'VIX', price: '26.13', change_in_percentage: '2.31%' },
  {
    stock_name: 'NCC Ltd',
    price: '₹72.05',
    change_in_percentage: '0.14%'
  },
  {
    stock_name: 'Indusind Bank Ltd',
    price: '₹1,149.00',
    change_in_percentage: '0.42%'
  },
  {
    stock_name: 'Nasdaq Composite',
    price: '10,391.74',
    change_in_percentage: '2.11%'
  }
]
calendar_results: [
  {
    stock_name: 'PI Industries Ltd.',
    date_and_time: 'Nov 9, 2022, 2:30 PM',
    link: 'https://www.google.com/finance/quote/PIIND:NSE'
  },
  {
    stock_name: 'Tata Motors',
    date_and_time: 'Nov 9, 2022, 6:30 PM',
    link: 'https://www.google.com/finance/quote/TATAMOTORS:NSE'
  },
  {
    stock_name: 'Occidental Petroleum',
    date_and_time: 'Nov 9, 2022, 11:30 PM',
    link: 'https://www.google.com/finance/quote/OXY:NYSE'
  },
  {
    stock_name: 'Adani Green Energy',
    date_and_time: 'Nov 10, 2022, 12:08 PM',
    link: 'https://www.google.com/finance/quote/ADANIGREEN:NSE'
  },
  {
    stock_name: 'Hindalco Industries',
    date_and_time: 'Nov 11, 2022, 4:00 PM',
    link: 'https://www.google.com/finance/quote/HINDALCO:NSE'
  },
  {
    stock_name: 'Zomato',
    date_and_time: 'Nov 11, 2022, 5:00 PM',
    link: 'https://www.google.com/finance/quote/ZOMATO:NSE'
  }
]
most_followed_on_google: [
  {
    stock_name: 'Reliance Industries Ltd',
    following: '306K',
    change_in_percentage: '0.49%',
    link: 'https://www.google.com/finance/quote/RELIANCE:NSE'
  },
  {
    stock_name: 'State Bank of India',
    following: '272K',
    change_in_percentage: '0.07%',
    link: 'https://www.google.com/finance/quote/SBIN:NSE'
  },
  {
    stock_name: 'Yes Bank Limited',
    following: '228K',
    change_in_percentage: '0.00%',
    link: 'https://www.google.com/finance/quote/YESBANK:NSE'
  },
  {
    stock_name: 'Tata Motors Limited Fully Paid Ord. Shrs',
    following: '197K',
    change_in_percentage: '0.44%',
    link: 'https://www.google.com/finance/quote/TATAMOTORS:NSE'
  },
  {
    stock_name: 'Infosys Ltd',
    following: '164K',
    change_in_percentage: '0.14%',
    link: 'https://www.google.com/finance/quote/INFY:NSE'
  },
  {
    stock_name: 'Tata Consultancy Services Limited',
    following: '156K',
    change_in_percentage: '0.73%',
    link: 'https://www.google.com/finance/quote/TCS:NSE'
  }
]
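Notice that some news titles in the results carry an embedded line break ("\n") copied from the page markup. If you want single-line titles, a small clean-up step can be added after parsing. This is only a sketch, and the helper names normalizeText and cleanNewsItem are made up for this example.

```javascript
// Collapse any run of whitespace (spaces, tabs, line breaks) into one space
// and trim both ends. Missing values become an empty string.
function normalizeText(text) {
  return String(text ?? "").replace(/\s+/g, " ").trim();
}

// Return a copy of a news item with its text fields tidied up.
// The link is left untouched.
function cleanNewsItem(item) {
  return {
    ...item,
    title: normalizeText(item.title),
    source: normalizeText(item.source),
    time: normalizeText(item.time),
  };
}
```

With these in place, financial_news.map(cleanNewsItem) would give the same list with every title on one line.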
Here is the full code:
const unirest = require("unirest");
const cheerio = require("cheerio");
const getFinanceData = async () => {
  const url = "https://www.google.com/finance/?hl=en";
  // Fetch the raw HTML with a browser-like User-Agent
  const response = await unirest
    .get(url)
    .header({"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36"});
  // Load the HTML into cheerio so we can query it with CSS selectors
  const $ = cheerio.load(response.body);
  let interested_top = [];
  $(".H8Ch1 .SxcTic").each((i, el) => {
    interested_top.push({
      stock_name: $(el).find(".ZvmM7").text(),
      price: $(el).find(".YMlKec").text(),
      change_in_price: $(el).find(".P2Luy").text(),
      change_in_percentage: $(el).find(".JwB6zf").text()
    });
  });
  let financial_news = [];
  // Sometimes the scraper returns empty news; if so, run the program again
  $(".yY3Lee").each((i, el) => {
    financial_news.push({
      title: $(el).find(".Yfwt5").text(),
      link: $(el).find("a").attr("href"),
      source: $(el).find(".sfyJob").text(),
      time: $(el).find(".Adak").text()
    });
  });
  let market_trends = [];
  $(".gR2U6").each((i, el) => {
    market_trends[i] = $(el).text();
  });
  let interested_bottom = [];
  $(".tOzDHb").each((i, el) => {
    interested_bottom.push({
      stock_name: $(el).find(".RwFyvf").text(),
      price: $(el).find(".YMlKec").text(),
      change_in_percentage: $(el).find(".JwB6zf").text()
    });
  });
  let calendar_results = [];
  $(".kQQz8e").each((i, el) => {
    calendar_results.push({
      stock_name: $(el).find(".qNqwJf").text(),
      date_and_time: $(el).find(".fbt0Xc").text(),
      // attr() returns undefined when there is no href, so the ?. guards replace()
      link: $(el).find("a").attr("href")?.replace(".", "https://www.google.com/finance")
    });
  });
  let most_followed_on_google = [];
  $(".NaLFgc").each((i, el) => {
    most_followed_on_google.push({
      stock_name: $(el).find(".TwnKPb").text(),
      following: $(el).find(".Iap8Fc").text().replace(" following", ""),
      change_in_percentage: $(el).find(".JwB6zf").text(),
      link: $(el).attr("href")?.replace(".", "https://www.google.com/finance")
    });
  });
  console.log("interested_top:", interested_top);
  console.log("financial_news:", financial_news);
  console.log("market_trends:", market_trends);
  console.log("interested_bottom:", interested_bottom);
  console.log("calendar_results:", calendar_results);
  console.log("most_followed_on_google:", most_followed_on_google);
};
getFinanceData();
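Since the scraper may occasionally come back with an empty news list, re-running it by hand can be replaced with a small retry loop. The sketch below is one way to do that. The helper retryWhileEmpty and its options are assumptions for this example, and it presumes the scraping function returns its results (the getFinanceData above only logs them, so it would need a return statement added).

```javascript
// Call `task` until its result is non-empty or the attempts run out.
// `task` is any async function; `isEmpty` decides what counts as empty.
async function retryWhileEmpty(
  task,
  { attempts = 3, delayMs = 1000, isEmpty = (r) => !r || r.length === 0 } = {}
) {
  let result;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    result = await task();
    if (!isEmpty(result)) return result;
    // Wait a little before trying again, except after the final attempt
    if (attempt < attempts) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return result; // still empty after the last attempt
}
```

For example, if a hypothetical getFinancialNews function returned the financial_news array, then await retryWhileEmpty(getFinancialNews, { attempts: 3, delayMs: 2000 }) would try up to three times with a two-second pause between tries.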
Serpdog's Google Search API
Currently, serpdog.io doesn't have an API for scraping Google Finance results, but if you want a custom scraper for Google Finance, you can contact me on my drift messenger ID.
Our users also get 100 free requests on the first sign-up.
Conclusion:
In this tutorial, we learned why stock market data is worth scraping, what the benefits of this data are, and how we can scrape it with the help of Node JS.
Feel free to message me if I missed something. Follow me on Twitter. Thanks for reading!
Additional Resources
- Web Scraping Google With Node JS - A Complete Guide
- Web Scraping Google Without Getting Blocked
- Scrape Google Organic Search Results
- Scrape Google Shopping Results
- Scrape Google Maps Reviews
Author:
My name is Darshan and I am the founder of serpdog.io. I love to create scrapers. I am currently working with several MNCs to provide them with Google Search Data through a seamless data pipeline.