Here’s my take on a different way to do web scraping. It may already have been done, but it could be very interesting in the right scenario.

  1. Use Python, AutoHotkey, or a combination of both to navigate to the web pages you want to extract data from.
  2. Then take a full-page screenshot of each page.
  3. Then use GPT vision to extract whatever data/text you want from the screenshot.
  4. Finally, run another GPT-3 pass that parses the reply into clean JSON, and you have successfully extracted the data.
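
The steps above can be sketched roughly like this in Python. This is only a minimal sketch under some assumptions: `take_screenshot`, `ask_vision_model`, and `clean_with_model` are hypothetical callables you would supply yourself (for example, a browser automation tool for the screenshot and whatever GPT client you use for the model calls). If you skip the second GPT pass, the sketch falls back to a simple local JSON extraction, which is my own stand-in and not part of the steps above.

```python
import base64
import json
import re
from typing import Callable, Optional


def to_data_url(png_bytes: bytes) -> str:
    """Encode raw PNG bytes as a data URL so the image can be sent inline."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return f"data:image/png;base64,{encoded}"


def extract_json(reply: str) -> dict:
    """Parse the JSON object embedded in a model reply.

    Replies often wrap the JSON in extra prose, so this takes the span from
    the first '{' to the last '}' and parses only that part.
    """
    match = re.search(r"\{.*\}", reply, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model reply")
    return json.loads(match.group(0))


def scrape_via_screenshot(
    url: str,
    take_screenshot: Callable[[str], bytes],      # hypothetical: steps 1-2
    ask_vision_model: Callable[[str, str], str],  # hypothetical: step 3
    prompt: str,
    clean_with_model: Optional[Callable[[str], str]] = None,  # hypothetical: step 4
) -> dict:
    """Run the screenshot -> vision model -> JSON pipeline for one URL."""
    png_bytes = take_screenshot(url)
    reply = ask_vision_model(prompt, to_data_url(png_bytes))
    if clean_with_model is not None:
        # Optional second model pass that rewrites the reply as clean JSON.
        reply = clean_with_model(reply)
    return extract_json(reply)


if __name__ == "__main__":
    # Stub callables so the sketch runs without a browser or an API key.
    def fake_screenshot(url: str) -> bytes:
        return b"not a real PNG"

    def fake_vision_model(prompt: str, image_data_url: str) -> str:
        return 'Here is the data: {"title": "Example", "price": 9.99} Hope that helps!'

    result = scrape_via_screenshot(
        "https://example.com",
        fake_screenshot,
        fake_vision_model,
        "Extract the title and price as JSON.",
    )
    print(result)
```

Passing the screenshot and model steps in as plain functions keeps the pipeline easy to test with stubs, and lets you swap tools later without touching the rest.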

I’m trying this out, and it’s pretty fun to see and build!

You never know when a small experiment like this will become part of a larger project!