Tuesday, 1 November 2011

Facebook MySQL DBA gives tips on server management

SAN FRANCISCO -- Facebook MySQL database engineer Rob Wultsch has one main idea for managing thousands of servers: keep it simple, stupid.

KISS is a common and popular acronym, and in the case of Facebook, Wultsch said it is key to being able to quickly provision servers, reduce single points of failure and increase automation.

“Simple systems are easy to put back together,” he said. “Complex systems are not.”
One of Wultsch’s main tenets is that servers fail, and there’s nothing anyone can do about it. So database and systems administrators should counter that with simplicity. Have as few hardware stock-keeping units (SKUs) as possible, find an operating system and firmware that work, and stick with them as long as possible. Variety and upgrades are scary.

“It’s better if you have one SKU,” he said, “or maybe a big box and a small box SKU. Then you can just swap them out, and diagnosis doesn’t have to be part of the equation.”
Wultsch said the MySQL database servers at Facebook are better managed, with far fewer single points of failure, than at his previous job as a MySQL database administrator (DBA) at GoDaddy.com. As the server environment grows larger, making sure there is homogeneity at the hardware, operating system, database and application levels is key to managing changes, handling backups and rolling back when necessary.
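
As a rough illustration of what checking for homogeneity could look like, the sketch below counts the distinct hardware and software combinations in a fleet and lists the hosts that differ from the most common one. It is not a description of Facebook's tooling; the `Host` fields, host names and version strings are all hypothetical and stand in for whatever an inventory system actually records.

```python
from collections import Counter
from typing import NamedTuple


class Host(NamedTuple):
    # Hypothetical inventory attributes, chosen only for illustration.
    name: str
    sku: str
    os_version: str
    mysql_version: str


def config_counts(hosts):
    """Count how many hosts share each (sku, os_version, mysql_version) combination."""
    return Counter((h.sku, h.os_version, h.mysql_version) for h in hosts)


def outliers(hosts):
    """Return the names of hosts whose configuration differs from the most common one."""
    counts = config_counts(hosts)
    if not counts:
        return []
    baseline, _ = counts.most_common(1)[0]
    return [h.name for h in hosts
            if (h.sku, h.os_version, h.mysql_version) != baseline]


# Made-up fleet: two identical hosts and one with a different OS version.
fleet = [
    Host("db001", "big-box", "os-a", "5.1"),
    Host("db002", "big-box", "os-a", "5.1"),
    Host("db003", "big-box", "os-b", "5.1"),
]

print(len(config_counts(fleet)))  # 2 distinct configurations
print(outliers(fleet))            # ['db003']
```

The fewer distinct combinations such a report shows, the closer the fleet is to the "one SKU" ideal described above.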

Wultsch gave some other ideas for handling large database server environments:
  • Make sure there are warm spare servers. Have more of them, sitting in the rack, ready to be used. “This goes back to the thing that backups fail,” he said.
  • Be able to quickly provision and reprovision hosts. “At my last job it took hours to put a new OS on a server. At my new job it’s easy: run one command, go to lunch and by the time I get back it’s probably done.”
  • Have external support. “It’s good to be able to call out when things go very wrong.”
Managing people in large MySQL database server environments is as important as the technical lessons, if not more so, Wultsch said.
First, burnout is “very real.”

“Being a DBA is long hours. It’s high-stress, even though it’s pretty good pay,” he said. “So a lot of people don’t like to do it.”

Wultsch said that a DBA in large-scale environments must be able to program well in order to properly manage thousands of servers. At Facebook, DBAs must be able to program in a “P language” such as Perl or Python, have good Linux and Bash scripting skills and decent database knowledge.
Just as important is being humble.

“We deal with too much to know anything amazingly well,” he said. “We do what we think is right and sometimes we’re wrong.”

Wultsch added that the ramp-up at a large-scale company is long. At Facebook, it usually takes six months before a new DBA can do anything useful and a full year to be fully up to speed. He said it was the same situation at GoDaddy.com.

Finally, he said that errors happen, and it should be an organization’s goal to minimize them. That is done through policy, by not paging database administrators in the middle of the night every time something small goes awry.

“When people don’t sleep because they get paged every time a monitor blips, they are likely to make more mistakes,” he said.
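
One way such a paging policy might be expressed is to page only after several consecutive failed checks, so that a single blip never wakes anyone. The sketch below is an assumption about how this could work, not Facebook's actual monitoring setup; the `PagingPolicy` class and its `threshold` parameter are hypothetical.

```python
class PagingPolicy:
    """Page a human only after `threshold` consecutive failed checks."""

    def __init__(self, threshold=3):
        self.threshold = threshold      # hypothetical tuning knob
        self.consecutive_failures = 0
        self.paged = False              # avoid re-paging for the same incident

    def observe(self, check_ok):
        """Record one check result; return True if a page should be sent now."""
        if check_ok:
            # Recovery resets the incident, so a later outage can page again.
            self.consecutive_failures = 0
            self.paged = False
            return False
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.threshold and not self.paged:
            self.paged = True
            return True
        return False


policy = PagingPolicy(threshold=3)
# One isolated blip, then a sustained failure, then recovery.
results = [True, False, True, False, False, False, False, True]
pages = [policy.observe(ok) for ok in results]
print(pages)  # [False, False, False, False, False, True, False, False]
```

In this run the isolated failure is ignored, and the sustained failure produces exactly one page, on its third consecutive failed check.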