JOB POST DATE: Jan 10th, 2003
EMAIL RESUME TO: Brad Fitzpatrick <brad@danga.com>
(or questions)
=================
Short description
=================
Linux sysadmin to manage servers for LiveJournal.com, an online
community/"blogging" website with 800,000 users and millions of page
views/day.
Also, management of the upcoming launch of our photo hosting site, PicPix.com.
================
Necessary skills
================
-- mysql
     - replication
     - tuning
     - monitoring
-- apache
-- ntp
-- backup strategies
-- linux (debian, redhat)
-- monitoring/graphing tools (netsaint(nagios), mrtg, rrdtool, cricket, etc.)
   we'll expect you to graph and monitor tons of things.
-- mail (we use postfix, but we're flexible if you prefer otherwise)
-- dns
-- nfs
==========================
Hardware/software involved
==========================
[as of Jan 2003, but new hardware is almost always on order]
2 load balancers (BIG-IP)
15 database servers (MySQL, mix of RedHat and Debian)
-- some inactive, used only as occasional stand-ins
-- some incredibly light, replicating a small subset of tables
   for isolation, like directory search (uses InnoDB, not MyISAM)
   or mail (postfix)
20+ web servers (Debian (mostly netbooting), mod_perl and/or apache+mod_proxy+lingerd)
3 100 Mbps switches: 1 public, 2 internal (linked with gigabit fiber)
1 Gigabit switch (for backup network)
misc machines, disk arrays, ...
mail machines
Offsite, for static content (cheaper bandwidth):
2 LVS boxes (locked down, "managed") + 2 of our machines (TUX + mod_perl)
===========
Programming
===========
Only enough to get your job done. We're not looking for a sysadmin +
programmer. If everything's running smoothly and all your work is
done, we want you to relax.
Past experience has shown us that most programmer + sysadmins
prefer to just program. So, we're looking for somebody who perhaps
knows how to program, but doesn't necessarily enjoy it. :)
=====
Other
=====
-- good communication skills, *especially* if you're remote
-- regular status updates
-- proactive investigation into how to improve things
-- proactive configuration of fail-over services. if it's not necessary
until something else breaks, it's necessary right away.