Internet Archive

Software Engineer - Archiving and Data Services

Internet Archive, San Francisco, California, United States, 94199


Interested in a mission-driven job ensuring perpetual open access to information for a global audience? Enjoy helping scale the use of services and products critical to hundreds of national and international non-profits, libraries, universities, cultural heritage institutions, and mission-driven organizations? If so, the Internet Archive is seeking a Software Engineer for our Archiving & Data Services team.

Internet Archive (IA) is a non-profit digital library, top 200 website at archive.org, and an archive of over 99 petabytes of digital information running in many self-owned and operated data centers. Internet Archive also provides mission-aligned services to thousands of organizations working collaboratively to advance our shared goal of “Universal Access to All Knowledge.” The Archiving & Data Services group provides a suite of paid, SaaS, and free products, as well as community programs, focused on the archiving, management, analysis, and accessibility of digital information. Its services are used by over 1,500 organizations around the world.

We are looking for a motivated, detail-oriented Software Engineer to join our team. The role’s preliminary duties will focus on IA Scholar (scholar.archive.org), our service supporting persistent access to open scholarship. However, many of our engineers support multiple products across our portfolio of services, so the role will also contribute to other products related to digital archiving and access. This position offers the opportunity to work with a range of technologies and gain deep knowledge about the collection, management, presentation, and preservation of open access scholarship, data, and primary sources in digital form. Our services work with petabytes of archived data and facilitate the discovery and use of these archived digital collections. The Software Engineer will have the unique opportunity to build things that further open access to information and advance the public good.

Key Responsibilities:

Collaborate with team members to understand user needs, design new features, support the acquisition of and access to scholarship, and improve the performance and reliability of IA Scholar and other department products.
Implement, test, and maintain software across our stack (Python, Elasticsearch, Rust, Postgres, Kafka, and HTML/CSS/JS).
Develop, monitor, and maintain web services to facilitate seamless access to archived digital collections in many forms and formats.
Work with large-scale data processing to ingest, normalize, index, and provide access to scholarly works, data, and digital archives.
Participate in code reviews to ensure the quality and stability of our software.
Document software and features for internal and external users.

Qualifications and Skills:

Degree in Computer Science or a related field, or equivalent experience, strongly preferred.
Proficiency in Python, with familiarity in Elasticsearch, Rust, Postgres, Kafka, and HTML/CSS/JS preferred.
Experience with web crawling and/or scraping is a plus.
A strong understanding of web services and distributed systems.
Excellent problem-solving skills, attention to detail, and ability to work both independently and collaboratively.
Experience with Hadoop/HDFS or large-scale data processing is a plus.
Ansible, GitLab, GitHub, Sentry, Grafana, and JIRA are other tools we use.
Our independently operated data centers run Ubuntu Linux VMs and our department runs everything from the VM up, so Linux experience is preferred.
An interest in the open access movement and the Internet Archive’s mission to provide Universal Access to All Knowledge is expected.

Job Details:

This is a remote-first position working in a distributed team. Candidates will need some working-hours overlap with a distributed team based primarily in North America (mostly Pacific Time) for collaborative work and meetings. The role reports to the Senior Engineering Manager, Archiving & Data Services.

Benefits & Perks:

The Internet Archive is a remote-first workplace and provides a comprehensive benefits package including PTO, paid holidays, and medical benefits. Depending on where you live, we also provide these additional benefits: dental, vision, health savings accounts, flex spending accounts, commuter benefits, short-term disability, long-term disability, and retirement programs.

At the Internet Archive, we believe we do our best work when our employees bring together diverse ideas. Members of all groups underrepresented in the tech industry and library world are strongly encouraged to apply. We are proud to be an equal opportunity workplace and are committed to equal employment opportunity regardless of race, color, religion, national origin, age, sex, marital status, ancestry, physical or mental disability, genetic information, veteran status, gender identity or expression, sexual orientation, or any other characteristic protected by applicable federal, state, or local law.
