<p>We are looking for software engineers to join our Delivery team to work on web crawler development with Scrapy, our flagship open source project.</p>
<p>Are you interested in building web crawlers harnessing the Scrapinghub platform, which powers crawls of over 3 billion pages a month?</p>
<p>Do you like working in a company with a strong open source foundation?</p>
<p><strong>Job Responsibilities:</strong></p>
<ul>
<li>Design, develop and maintain Scrapy web crawlers (see the sketch at the end of this posting for a flavor of the work)</li>
<li>Leverage the Scrapinghub platform and our open source projects to perform distributed information extraction, retrieval and data processing</li>
<li>Identify and resolve performance and scalability issues with distributed crawling at scale</li>
<li>Help identify, debug and fix problems with open source projects, including Scrapy</li>
</ul>
<p>Scrapinghub&rsquo;s platform and data offerings have grown tremendously over the past couple of years, and many big projects are waiting in the pipeline. In this role you would be a key part of delivering them.</p>
<p>Here&rsquo;s what we&rsquo;re looking for:</p>
<p><strong>Requirements</strong></p>
<ul>
<li>2+ years of software development experience.</li>
<li>Solid Python knowledge.</li>
<li>Familiarity with Linux/UNIX, HTTP, HTML, JavaScript and networking.</li>
<li>Good written and spoken English communication.</li>
<li>Availability to work full time.</li>
</ul>
<p><strong>Bonus points for:</strong></p>
<ul>
<li>Scrapy experience (a big plus).</li>
<li>Familiarity with techniques and tools for crawling, extracting and processing data (e.g. Scrapy, NLTK, pandas, scikit-learn, MapReduce, NoSQL).</li>
</ul>
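<p>For a flavor of the day-to-day work, here is a minimal sketch of the kind of Scrapy spider this role involves. The spider name and target site are illustrative only (quotes.toscrape.com is a public Scrapy demo site), not a real project assignment:</p>
<pre><code>import scrapy


class QuotesSpider(scrapy.Spider):
    # Hypothetical example spider; name and start_urls are illustrative.
    name = "quotes"
    start_urls = ["https://quotes.toscrape.com/"]

    def parse(self, response):
        # Yield one scraped item per quote block found on the page.
        for quote in response.css("div.quote"):
            yield {
                "text": quote.css("span.text::text").get(),
                "author": quote.css("small.author::text").get(),
            }
        # Follow the pagination link, reusing this callback for the next page.
        next_page = response.css("li.next a::attr(href)").get()
        if next_page is not None:
            yield response.follow(next_page, callback=self.parse)
</code></pre>
<p>Running it with <code>scrapy runspider spider.py -o quotes.json</code> crawls the demo site and writes the scraped items to a JSON file.</p>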