Netflix’s Emmy and its Simian Army

Netflix recently upset the status quo when its in-house series “House of Cards,” starring Kevin Spacey, beat offerings from the networks and cable channels to win an Emmy Award for directing.  Hollywood studios and television executives were taken aback.  They have been slow to provide content over the public Internet because of their cozy and profitable (some would say “greedy”) relationship with cable companies, and they now see this streaming-video company threatening their monopoly on delivering new content to home viewers through the expensive cable pipe.

Netflix is one of the largest customers of Amazon Web Services and one of the largest users of cloud computing in the world.  Operating in countries around the globe, where it offers Hollywood movies dubbed or subtitled alongside local content, the company consumes a large portion of Internet bandwidth.  It was one of the earliest big customers for Amazon’s cloud services, and it has recently given away its streaming-video technology, including data center operation tools, as open source through the Netflix Open Source Software (OSS) project.

Network World reports that one of these OSS tools is the Simian (“ape”) Army, a suite of tools rather than a single program.  Data center operators can use it to test the failover ability of their architecture by randomly shutting down virtual machines.  One member of the suite, Chaos Gorilla, takes down an entire Availability Zone in the Amazon cloud.  FoxPlay, Hulu, MovieCity, and other streaming providers could learn from Netflix, since their offerings trail Netflix in the ability to keep streaming when the Internet slows or when something goes wrong in the cloud.
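To make the idea concrete, here is a minimal sketch in Python of what "randomly shutting down virtual machines" and "shutting down an entire Availability Zone" might look like against a simulated fleet.  It is not Netflix's code and it calls no Amazon API; the Instance class, the zone names, and every function name below are assumptions invented for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Instance:
    """Hypothetical stand-in for one virtual machine in one zone."""
    name: str
    zone: str
    running: bool = True


def build_fleet(zones, per_zone):
    """Create per_zone simulated instances in each zone."""
    return [Instance(f"{zone}-vm{i}", zone) for zone in zones for i in range(per_zone)]


def kill_random_instance(fleet, rng):
    """Chaos Monkey style: stop one randomly chosen running instance."""
    running = [vm for vm in fleet if vm.running]
    if not running:
        return None
    victim = rng.choice(running)
    victim.running = False
    return victim


def kill_zone(fleet, zone):
    """Chaos Gorilla style: stop every running instance in one zone."""
    killed = [vm for vm in fleet if vm.zone == zone and vm.running]
    for vm in killed:
        vm.running = False
    return killed


def service_is_up(fleet, minimum=1):
    """The simulated service survives while at least `minimum` instances run."""
    return sum(vm.running for vm in fleet) >= minimum


if __name__ == "__main__":
    rng = random.Random()
    fleet = build_fleet(["zone-a", "zone-b", "zone-c"], per_zone=2)

    victim = kill_random_instance(fleet, rng)
    print("stopped", victim.name, "| service up:", service_is_up(fleet))

    kill_zone(fleet, "zone-a")
    print("stopped zone-a | service up:", service_is_up(fleet))
```

Because the fleet is spread across three zones, the simulated service stays up after either kind of failure.  Put every instance in a single zone and the zone-wide outage takes the service down, which is the kind of weakness such a test is meant to expose.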

Potentially, any data center could bring these OSS tools in-house to automate management and other tasks.

The design of the OSS website shows Netflix either making fun of itself or showing off its brand.  Each of its open-source products is linked to a movie poster.  Under the section “Romantic Comedies,” we find tools for managing Cassandra, an open-source database that is an Apache project rather than a Netflix one.  To access the Simian Army, click on the picture of the “Mobster.”

Ariel Tseitlin, director of cloud solutions for Netflix, told CIO magazine some of the lessons the company learned while keeping its services running for its 38 million subscribers.  One is to divide the system into the smallest pieces possible, so that when one piece goes down the others keep working and change gears, stepping in with something else that is useful to the customer.  Another is redundancy, which is obviously good advice.  A third is testing.  We already mentioned Chaos Gorilla; Chaos Monkey is a smaller-scale tool that shuts down individual virtual machines instead of an entire Amazon Availability Zone.  The point of both is to design for resiliency and then test, test, test.
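The first lesson, small pieces that step in with "something else that is useful" when a neighbor fails, can be sketched in a few lines of Python.  This is only an illustration of the fallback idea, not how Netflix builds its pages; the function names, the failing recommendation component, and the placeholder titles are all hypothetical.

```python
def personalized_recommendations(user_id):
    """Hypothetical component that is currently down."""
    raise ConnectionError("recommendation service unavailable")


def popular_titles():
    """Hypothetical simpler component that needs no per-user data."""
    return ["Popular title 1", "Popular title 2"]


def with_fallback(primary, fallback):
    """Return primary(); if it raises any exception, return fallback() instead."""
    try:
        return primary()
    except Exception:
        return fallback()


def home_page(user_id):
    """Build the page even when the personalized component has failed."""
    rows = with_fallback(
        lambda: personalized_recommendations(user_id),
        popular_titles,
    )
    return {"user": user_id, "rows": rows}


if __name__ == "__main__":
    print(home_page(7))
```

The customer still gets a page of titles to watch, just a less tailored one, instead of an error screen.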

Studying the way Netflix does business can help you with yours.