Key Responsibilities
Create clear documentation and a workflow guide for web scraping teams that explain exactly how data should be collected and structured.
Define data structures and field requirements so the scraper collects consistent and usable data.
Review and validate scraped data for accuracy, completeness, and formatting issues.
Organize and process large datasets into a clean, structured format ready for database import.
Load validated data into a database system and export clean datasets that can be used for CRM imports or product listings.
Develop automated data validation and cleaning processes to eliminate manual processing in Excel or Google Sheets.
Ensure the system can scale to extremely large datasets.
Requirements
Experience with large-scale data processing
Strong understanding of data structures and database design
Experience working with web scraping datasets
Ability to design data validation and integrity checks
Experience with Python, SQL, or similar tools for data processing
Experience handling very large datasets
Ideal Candidate
Thinks in terms of systems and scalable workflows
Comfortable working with large datasets
Focused on data accuracy, integrity, and automation
Able to document clear instructions for technical teams
If you have experience building scalable data pipelines and organizing large datasets, we would like to hear from you.