Web crawling administrator
We are a company that downloads open, public data from many different websites. These websites need to be supervised by a person who has knowledge of and experience with HTML.
We are looking for a source administrator to look after our sources: the list of sites that our system crawls regularly.
Existing sources need to be repaired when the sites change structure, and sometimes we want to add new sources.
A source consists of one or more hubs: sections or starting URLs. For example, a city could have two hubs: one for published meeting protocols and one for public announcements. The job for the source admin is to identify the different hubs, and configure our system to crawl them correctly.
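To make the source and hub idea concrete, here is a minimal sketch in Python. It is purely illustrative and does not show our actual configuration format: the `Source` and `Hub` classes, the `validate` method, and the example.org URLs are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List
from urllib.parse import urlparse


@dataclass
class Hub:
    """One section of a site, identified by a starting URL (hypothetical model)."""
    name: str
    start_url: str


@dataclass
class Source:
    """A site to crawl regularly, made up of one or more hubs (hypothetical model)."""
    name: str
    hubs: List[Hub] = field(default_factory=list)

    def validate(self) -> List[str]:
        """Return a list of human-readable problems; empty means the source looks sane."""
        problems = []
        if not self.hubs:
            problems.append(f"{self.name}: a source needs at least one hub")
        for hub in self.hubs:
            parts = urlparse(hub.start_url)
            # Only accept absolute http(s) URLs as crawl starting points.
            if parts.scheme not in ("http", "https") or not parts.netloc:
                problems.append(
                    f"{self.name}/{hub.name}: invalid start URL {hub.start_url!r}"
                )
        return problems


# The city example from the text: one source with two hubs (placeholder URLs).
city = Source(
    name="example-city",
    hubs=[
        Hub("meeting-protocols", "https://www.example.org/protocols"),
        Hub("public-announcements", "https://www.example.org/announcements"),
    ],
)
```

In this sketch, repairing a source after a site changes structure would amount to updating a hub's starting URL and checking the source again.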
The job of the source admin is more about understanding the structure of websites than about programming. Our source admin is therefore likely to have a light background in web development rather than in systems development.
There may also be system administration duties (we run Ubuntu Linux servers on Google Compute Engine) that are needed to keep the crawlers running. These adjacent skills are not required, and the duties will not be part of the job during the first months, but they are possible ways for the role's responsibilities to grow.