Apple Senior Data Infrastructure Engineer in Santa Clara Valley, California

Senior Data Infrastructure Engineer

Job Number: 113671241

Santa Clara Valley, California, United States

Posted: 16-Apr-2018

Weekly Hours: 40.00

Job Summary

In this role, you’ll work closely with a small team of engineers and statisticians to design, build, and maintain systems that enable rapid analysis of large datasets.

Key Qualifications

  • Distributed systems concepts (CAP, “Fallacies of Distributed Computing,” etc.)

  • Unstructured storage (distributed filesystems, blob storage)

  • Structured / indexed storage (column stores, faceted search)

  • Distributed processing models (map/reduce, RPC services)

  • Distributed schedulers and resource allocators (e.g., Apache Mesos and YARN)

Description

We’re seeking candidates who are confident in the systems they build, but humble and cognizant of the limitations of their software and infrastructure. This role also requires great communication skills, as you’ll contribute to functional specs and design documents that describe the systems you build to coworkers, other teams, and those who join after you. You might ship bugs, but we hope you also employ strategies to reduce risk through thoughtful design, unit and integration tests, stress tests, CI, instrumentation, and monitoring.

Education

BS/MS in CS or equivalent experience.

Additional Requirements

We believe that effective systems design should be sympathetic to the underlying hardware. We hope you have some experience with:

  • Storage fundamentals (disk types and drive layouts, random and sequential IO, compaction)

  • Compute fundamentals (concurrency models, distributed single-threaded versus coordinated multithreaded scheduling, synchronous and asynchronous IO)

  • Networking fundamentals (data locality, datacenter network layouts, multi-datacenter systems design)

You might have specific experience with Apache projects like:

  • HDFS or other distributed file systems

  • HBase, Cassandra, or similar distributed databases

  • Kafka or another distributed replicated log

  • ZooKeeper or a similar coordination system

  • Mesos, YARN, or other resource allocation and scheduling systems

We also hope you have experience with (or strong interest in learning) modern Java and Python.

We don’t expect you to be an expert in everything described above, but we do hope you have strong experience in a few areas, and enough curiosity and desire to become skilled in the rest. If this position sounds like a good match for your interests and skills, please consider applying. You’ve found a unique team.