Bartłomiej Stefański

Originally published at bstefanski.com

🤖 Quickly scrape tweets without API or headless browser

Writing a scraping tool is a boring process: you have to use a headless browser or an API (but that wouldn't be called scraping, would it?).

It takes a lot of time to develop and run such a tool. Whenever possible, it's best to avoid writing a standalone application for that.

My goal was to gather links to some tweets listed on Twitter's search page. I went to the search page, pasted the little snippet below into the DevTools console, and kept scrolling until I was satisfied with the results.

It will probably stop working in the near future, since it depends entirely on the text of an aria-label attribute and on the position of the links inside each tweet, but you can of course take a look at Twitter's DOM and modify it to your needs.


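// Collect unique tweet links while scrolling through the search results.
const links = new Set();

window.addEventListener('scroll', () => {
  // The timeline container; its first child wraps the individual result cells.
  const timeline = document.querySelector('[aria-label="Timeline: Search timeline"]');
  if (!timeline || !timeline.children[0]) return;

  [...timeline.children[0].children].forEach((el) => {
    // The fourth anchor in a cell is presumably the tweet's own link.
    const singleLink = el.querySelectorAll('a')[3];
    if (singleLink) {
      links.add(singleLink.getAttribute('href'));
    }
  });
});

// The Set keeps filling as you scroll; inspect it again once you are done.
console.log(links);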
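
Once you have scrolled far enough, you will probably want the links as plain text. Here is a minimal sketch of how that could look. The helper name toTweetUrls, the https://twitter.com base, and the /<user>/status/<id> path shape are my assumptions, not something the snippet above guarantees, so adjust them to whatever the collected hrefs actually look like.

```javascript
// Hypothetical helper (not part of the original snippet): keep only hrefs
// that look like tweet permalinks and turn them into absolute URLs.
function toTweetUrls(hrefs, base = 'https://twitter.com') {
  // Assumed permalink shape: /<user>/status/<numeric id>
  const statusPath = /^\/[^/]+\/status\/\d+$/;
  return [...hrefs]
    // getAttribute('href') may return null, so skip non-string entries.
    .filter((href) => typeof href === 'string' && statusPath.test(href))
    // Resolve each relative path against the assumed base URL.
    .map((href) => new URL(href, base).href);
}

// In the DevTools console, after scrolling:
// copy(toTweetUrls(links).join('\n'));
```

The commented-out last line uses copy(), a console utility available in the Chrome and Firefox DevTools, to put the result on the clipboard. The filter is there because picking the fourth anchor by position can just as easily grab something that is not a tweet link.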
