Posted 14 Days Ago Job ID: 2097649 25 quotes received

HTML Content Research Using Common Crawl

Fixed Price $250-$500  ·  W9 Required for U.S.
Quotes (25)  ·  Premium Quotes (1)  ·  Invited (0)  ·  Hired (0)

Send before: December 20, 2024

Programming & Development Programming & Software

We are seeking a skilled freelancer to identify websites that use a specific technology by searching their HTML for unique identifiers (keywords). The task involves querying Common Crawl datasets with AWS Athena, or an equivalent method, and building a report of the URLs of the matching sites. You should have experience with large-scale data extraction and analysis, familiarity with Common Crawl and AWS Athena (or similar tools), and the ability to deliver results in an organized, actionable format.
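
For illustration only, here is a minimal Python sketch of one way the work could be structured; it is not a required approach. It assumes the Common Crawl columnar index is registered in Athena as `ccindex.ccindex` with its usual columns, that the index returns capture locations rather than page bodies (so the HTML still has to be fetched from the WARC files and scanned separately), and that the identifiers are simple case-insensitive substrings. The keywords in `IDENTIFIERS` and all function names are hypothetical placeholders.

```python
import csv
import io
import re

# Hypothetical identifiers; the real keywords depend on the target technology.
IDENTIFIERS = ["example-widget.js", "data-example-id"]


def build_index_query(crawl_id: str, limit: int = 1000) -> str:
    """Return Athena SQL that lists WARC locations of successful HTML captures.

    Assumption: the columnar index is available as the table ccindex.ccindex.
    """
    if not re.fullmatch(r"CC-MAIN-\d{4}-\d{2}", crawl_id):
        raise ValueError(f"unexpected crawl id: {crawl_id!r}")
    return (
        "SELECT url, warc_filename, warc_record_offset, warc_record_length\n"
        "FROM ccindex.ccindex\n"
        f"WHERE crawl = '{crawl_id}'\n"
        "  AND subset = 'warc'\n"
        "  AND fetch_status = 200\n"
        "  AND content_mime_detected = 'text/html'\n"
        f"LIMIT {int(limit)}"
    )


def find_identifiers(html: str, identifiers: list[str]) -> list[str]:
    """Return the identifiers that occur in the HTML, ignoring case."""
    lowered = html.lower()
    return [kw for kw in identifiers if kw.lower() in lowered]


def build_report(pages: dict[str, str], identifiers: list[str]) -> str:
    """Build a CSV report (url, matched identifiers) for pages with a match."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["url", "matched_identifiers"])
    for url, html in sorted(pages.items()):
        matched = find_identifiers(html, identifiers)
        if matched:
            writer.writerow([url, ";".join(matched)])
    return buffer.getvalue()
```

In this sketch, `pages` stands in for HTML bodies already retrieved from the WARC records that the query points to; fetching those records, running the query, and controlling Athena scan costs are left out.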


To be considered for this job, please provide an example of a similar project and explain how you would approach this one. Sample data, if you can provide some, would also help.

Andrew D, United States