Writing a scraping tool is a tedious process: you have to use a headless browser or an API (but then it wouldn't really be called scraping, would it?).
It takes a lot of time to develop and run such a tool. Whenever possible, it's best to avoid writing a standalone application for that.
My goal was to gather links to some tweets listed on Twitter's search page. I went to the search page, pasted this little snippet into the DevTools console, and kept scrolling until I was satisfied with the results.
It will probably stop working in the near future, since it depends on the exact aria-label text of the timeline node and on the position of a link inside each result, but you can of course take a look at Twitter's DOM and modify it to your needs.
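// Collect unique tweet links while scrolling through the search results.
const links = new Set();
window.addEventListener('scroll', () =>
  // Each child of the timeline's first child is one rendered search result.
  [...document.querySelector('[aria-label="Timeline: Search timeline"]').children[0].children].forEach((el) => {
    // The fourth anchor is assumed to be the tweet's permalink.
    const singleLink = el.querySelectorAll('a')[3];
    if (singleLink) {
      links.add(singleLink.getAttribute('href'));
    }
  }),
);
// Run this line again after scrolling to see what has been collected.
console.log(links);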
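If the snippet breaks because the position of the link changes, one possible adjustment is to match links by their URL shape instead of by index. The sketch below is not what I originally ran; it assumes tweet permalinks look like `/<user>/status/<id>`, and the names `STATUS_PATH`, `collectStatusLinks` and `statusLinks` are made up for this example.

```javascript
// Assumption: a tweet permalink has the shape "/<user>/status/<numeric id>".
// Longer paths such as ".../photo/1" are deliberately ignored.
const STATUS_PATH = /^\/[^/]+\/status\/\d+$/;

// Hypothetical helper: adds every href that matches STATUS_PATH to `seen`
// and returns the same Set, so duplicates are dropped automatically.
function collectStatusLinks(hrefs, seen = new Set()) {
  for (const href of hrefs) {
    if (typeof href === 'string' && STATUS_PATH.test(href)) {
      seen.add(href);
    }
  }
  return seen;
}

const statusLinks = new Set();

// Browser wiring: only runs where a DOM exists, e.g. when pasted into DevTools.
if (typeof window !== 'undefined' && typeof document !== 'undefined') {
  window.addEventListener('scroll', () => {
    // Grab every anchor whose href mentions "/status/", then filter strictly.
    const anchors = document.querySelectorAll('a[href*="/status/"]');
    collectStatusLinks(
      [...anchors].map((a) => a.getAttribute('href')),
      statusLinks,
    );
  });
}
```

After scrolling for a while, typing `statusLinks` in the console shows what has been gathered so far. This version no longer depends on the aria-label or on the order of anchors, but it still depends on the URL shape staying the same.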