July 2011 Archives

Why did I get an MBA? Read this article

An interesting article in New York magazine discusses the “bamboo ceiling” Asian-Americans face in U.S. workplaces. I think the same discussion applies to academic hotshots of any color. This is a pretty accurate explanation of why I chose to get an MBA instead of furthering my technical skills with a science degree.

“In order to succeed, you have to understand which rules you’re supposed to break. If you break the wrong rules, you’re finished. And so the easiest thing to do is follow all the rules. But then you consign yourself to a lower status. The real trick is understanding what rules are not meant for you.”

An important point is that while lower-level jobs tend to be fairly meritocratic, many other factors start entering the picture as you rise into management – IQ becomes less relevant and EQ becomes essential. But paying attention in school and doing what you’re told won’t prepare you for this, especially if you weren’t raised in an environment that emphasizes leadership and assertiveness.

Having the go-home option

In maintaining my digital systems, one principle that has served me well is making sure there’s a quick way to work around any potential point of failure. For example, by hosting your domain’s DNS records at a separate company from your web host, you have the option to cut over to a different provider in the event that something goes wrong (and do it immediately, rather than waiting hours or days for your failing web host to give up control of your DNS settings).

Here are some ways to avoid single points of failure:

DNS hosting is really useful. I use EasyDNS. It costs slightly more than letting your web hosting company handle DNS, but having the option to cut them out instantly if something goes wrong is essential.
Be careful of cloud-based services that don’t provide a true “export” option that dumps ALL of your data in a non-proprietary format. For example, while Google Calendar allows you to dump all of your appointments to an iCal file, there is no way to extract all of your emails from Gmail in a common format. Nor is there a way to batch-download comments or photos from Facebook. Clearly this is to the benefit of the providers because it causes lock-in. But as a user, be aware that your data can be held hostage at any time. (This is why I’m starting to use this blog more instead of posting updates on Facebook – I’m getting more worried about the lack of backup and search options over there.)
When performing maintenance on any computer system, always ALWAYS make sure you’ve got a working, up-to-date backup that you can cut over to in case something goes wrong. Fortunately, virtualization and cloud hosting are making this much easier to accomplish than it used to be.
Anyone who’s worked with Maya extensively has horror stories about corrupt .mb files. Despite the waste of space that saving in ASCII format entails, it’s much safer to have your production data in a form that you can hack or fix with a text editor.
On that subject, ALWAYS keep backup copies of Final Cut Pro files. I’ve had a few bad experiences where FCP decided to “eat” one of my .fcp files, leaving it in an unreadable state. (this is why FCP gets an asterisk in my list of trusted tools)

Moving data to the cloud

I’ve already moved my public internet servers over to Amazon’s EC2 cloud, and am now planning how to use it for backing up my important data. Later I might even move my primary file server into AWS.

Let’s consider how I might set up a data backup system. I have about 100GB of “core” files I need to protect. This includes things like email, documents, 3D production data, source files for websites, custom software, and Linux system images. The biggest uses of space are texture maps and some large datasets for my 3D work. The 100GB figure does not include some massive things I won’t move to the cloud yet, like finished animations (~200GB in compressed video; ~2TB of raw frames) or my personal media collection (a few tens of GB).

Amazon gives you two ways to store data in their cloud: first, there’s S3, the classic heavy-duty data repository. S3 is supposed to be extremely reliable, but you can’t use it like a filesystem. It’s designed for uploading and downloading large “chunks” of data. Second, you can store files on an EBS volume, which is somewhat like a physical hard drive in one of Amazon’s data centers. You can put a regular filesystem on EBS, but it’s not designed to be as reliable as S3, and it can only be accessed or served out through a running EC2 system.

I think S3 makes most sense for backup purposes, although I don’t want to use any of the hacks that make it appear like a filesystem. I also don’t want to upload all 100GB as a monolithic glob. I think I will divide my core data into smaller chunks, say around 10GB each, of related data (e.g., system software, email, per-project 3D source files, etc). Then I’ll upload each chunk to S3 as a single archive. This makes partial backups/restores easier, and seems to fit best with how S3 is designed to operate.
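
As a rough illustration of the "one archive per chunk" idea, here is a minimal Python sketch. It only builds the archives; the upload step is left as a caller-supplied function, since the post doesn't say which S3 client would be used. The function name `build_archives`, the `groups` mapping, and the `upload` callback are all hypothetical names invented for this example.

```python
import tarfile
from pathlib import Path


def build_archives(groups, out_dir, upload=None):
    """Create one .tar.gz per named group of directories.

    groups  -- dict mapping a chunk name (e.g. "email") to a list of paths
    out_dir -- directory where the archives are written
    upload  -- optional callable taking an archive path (hypothetical hook
               for whatever S3 client you use; nothing is uploaded by default)
    Returns the list of archive paths, in the same order as `groups`.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    archives = []
    for name, paths in groups.items():
        archive = out_dir / f"{name}.tar.gz"
        with tarfile.open(archive, "w:gz") as tar:
            for p in paths:
                p = Path(p)
                # Store each directory under its own name, not its full path.
                tar.add(p, arcname=p.name)
        archives.append(archive)
        if upload is not None:
            upload(archive)
    return archives


# Example call (paths are placeholders):
# build_archives({"email": ["/data/mail"], "project-a": ["/data/3d/project-a"]},
#                "/tmp/backup-staging")
```

Keeping each chunk as its own archive means a single project or mailbox can be restored by fetching just one object rather than the whole 100GB.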

Long-term, I will probably end up using an EC2 instance as my main file server, so I’ll have to store things in a regular filesystem on EBS. In this case I’ll still use S3 for “offline” storage, while keeping a smaller set of “online” data in EBS. This is just like an old-fashioned system of on-line/near-line storage areas.

One drawback to this arrangement is that Amazon will end up double- or triple-charging you for data storage: once for the S3 backup, once for the EBS copy, and again for any EBS snapshot images. So I’ll probably end up paying more like $0.30-$0.50/GB per month for this setup. Still, the cost is quite reasonable compared to the depreciation on local hardware, plus the headaches of maintaining the system myself.

Sony: Not the Greatest Brand

Campaign magazine just published a survey that indicated Sony was the top brand in Asia (http://www.bbc.co.uk/news/business-14009880). My perspective on them is quite different. It’s a negative brand. When I hear “Sony” I think:

Too many proprietary interfaces and file formats, in everything from consumer to pro video equipment
Flimsy VAIO laptops and PCs loaded with shovelware
The PlayStation 3, a monument to complacency and hubris
Poorly-managed online games (Star Wars NGE, anybody?)
edit: how could I forget about the DRM rootkit on Sony audio CDs?

I will never buy Sony video hardware. There are just too many instances of “Super XDCAM PRO II EX” formats that are “really just motion JPEG, but changed just enough so that none of your existing software tools work with it.”

Engineer: “Listen, I know this sounds crazy, but wouldn’t it be cool if our awesome new HD camera actually made, you know, Quicktime movies?”

Boss: “A camera that records in a non-proprietary format? Come on now, you’ve got to be kidding. Go back to your desk and design us a proprietary variant of MPEG-4, then hire some fly-by-night sweatshop off elance.com to develop us a crummy Windows driver for it. That’s The Sony Way!”

Thank goodness somebody at Canon had the INSANE idea of making a camera that actually records in a directly editable format. They rightfully deserve to own the video hardware market.

Now, how about some great Taiwanese brands? ASUS and Giant Bicycles are known world-wide, 85 Degrees is getting big in China now, and domestically Uni-President and Eslite go a long way…

Computer stuff that doesn’t break (quite as often)

When people hear I do 3D animation, they often say “Wow, you must use a lot of software in your work.” Actually, I probably use fewer software packages than most people. All technology is “broken until proven working” to me (see previous post), and I only rely on tools that have passed this test.

Here are the things I’m willing to count on for “mission-critical” work:

Linux (including only the kernel and Debian distribution, and excluding glibc)
Windows (though not really any version after XP)
Mac OSX (as a desktop client only)
Emacs
PuTTY SSH
Web browsers: Chrome, Firefox, Safari
EasyDNS for DNS hosting
Photoshop
Lightwave*
Pixar’s RenderMan
Final Cut Pro*
interp, and other software I write myself

* Lightwave and FCP make the list only because I’ve used them long enough to know how to stay away from their significant weak spots.

New things that are probably going to get added once I accumulate some more experience with them:

Maya (core features only, and it’ll be a “*”)
Amazon Web Services

I do use a few things that are not on this list, but if you ask me about them, I’ll usually say something like “well, I do use X, not so much because it’s great, but because I haven’t found something less bad yet.” (actually my attitude towards FCP is about like this… it’s unreliable and lacks features that 3D artists need, but Premiere and Avid are not acceptable substitutes).

Computers seem to break a lot

My default stance on any computer product is “broken until proven working.” I never assume anything will work as advertised until I’ve tested it myself. But sometimes even this level of paranoia is not enough, as illustrated by this past weekend.

tl;dr version: in the past few days, the following basic things have failed on me:

Intel’s Gigabit ethernet driver
The Linux boot loader
Brand-new hard drives
C library string functions

I should not have to deal with problems with these basic building blocks. This is 2011, not 1989.

I’m going to accelerate the transition of my whole computing infrastructure to the cloud. I’m perfectly happy to pay Amazon staff to handle all these nit-picky problems for me.

Detailed account of what happened, for posterity:

– I wanted to try setting up a VPN with OpenVPN on my Linux server, but I hadn’t compiled the necessary “tun” module into the kernel. No problem, I’ll just recompile it, and might as well upgrade to the latest kernel version at the same time.

– Oops, now my render nodes won’t connect to the network. It turns out the Intel Gigabit Ethernet driver included with the new kernel acts flaky on my hardware. Tried forward-porting the old driver to the new kernel, but there were too many API changes. Gave up and wrote a script that checks the Ethernet connection every 10 minutes and resets it if it’s down.

– Oh, and now the server complains that the kernel image is getting too big for LILO. Well, I guess I might as well join the modern era and upgrade to the new GRUB bootloader.

– Oops. GRUB won’t even install, complaining of some device error, apparently because of changes to how udev exposes devices in the latest kernel. So I guess I can’t use GRUB now, because the version Debian ships isn’t compatible with newer kernels. Gave up, and now I just pray LILO doesn’t fail in the future.

– Got a batch of four new 2TB Western Digital “Green” drives to replace the 250GB drives in my file server. After sliding them in, discovered that they cripple read/write performance down to <10MB/sec and time out frequently. (Yes, I checked Google and found lots of reports about the 4KB sector size causing problems, but that’s not my issue – I am SURE my partitions are aligned correctly.) No way I’m going to rely on these drives for my server. Ordered new Hitachi drives.

– Brought my Debian packages up to date, including updating glibc from 2.5 to 2.11.

– Oops. Now any C program I compile segfaults immediately upon the first call to any string function (what???). Eventually discovered that the new glibc plays tricks with linker symbols in a way that my older binutils can’t handle. Very disappointed that there is no error message for this – things compile fine, then just refuse to run. Silent failure is a Cardinal Sin of software. Upgraded binutils and all is well again.
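
For reference, the Ethernet watchdog mentioned in the account above (check the connection every 10 minutes, reset it if it's down) could look roughly like the Python sketch below. This is not the actual script, just one way to do it. The ping target `192.168.1.1`, the interface name `eth0`, and the use of Debian's `ifdown`/`ifup` to bounce the interface are all assumptions.

```python
import subprocess
import time

CHECK_INTERVAL_SECONDS = 10 * 60  # "every 10 minutes"


def link_is_up(host="192.168.1.1"):
    """Return True if a single ping to `host` succeeds (address is a placeholder)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def reset_interface(iface="eth0"):
    """Bounce the interface with ifdown/ifup (assumed tools and interface name)."""
    subprocess.run(["ifdown", iface], check=False)
    subprocess.run(["ifup", iface], check=False)


def check_once(is_up=link_is_up, reset=reset_interface):
    """Run one check; reset only if the link is down. Returns True if a reset ran."""
    if is_up():
        return False
    reset()
    return True


def watch(interval=CHECK_INTERVAL_SECONDS):
    """Loop forever, checking the link once per interval."""
    while True:
        check_once()
        time.sleep(interval)
```

Passing the check and reset steps in as functions makes the logic easy to test without touching a real network interface. In practice a cron entry calling `check_once()` every 10 minutes would do the same job as the `watch()` loop.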