thedwick

August 21, 2008

A Plug for the Cleantech Open

I just went to a volunteer event for the California Cleantech Open last night and I’ve got to say, I’m impressed. Often these kinds of things are filled with about 30% people who are sharp and will make a difference and 70% hangers-on who just want to be a part of the action. But this organization is different.

In 3 years, they’ve managed to match startups with $200 million in funding. Every person I talked to last night seemed sharp and impressive and really knew their stuff. They just signed on a half-dozen national labs to partner with their startups. If we’re really going to make a difference attacking climate change, I’m sure the solution(s) will pass through their doors in one way or another. I just hope in such an impressive crowd I can find a way to help out.

Filed under: Green, Technology — trcull @ 8:22 pm

August 13, 2008

A Whole New Benchmark for “Late”

No matter how late your project is, I doubt it’s as late as Chandler. Six years and eight million dollars to get out version 1.0 is a pretty low bar to have to beat.

Filed under: Technology — trcull @ 7:30 pm

August 11, 2008

The Very Height of Awesome

Some scientists have figured out a way to bend light with nano materials such that you could theoretically make something invisible. How freaking awesome is that?

Filed under: Technology — trcull @ 8:21 pm

July 20, 2008

Always Enough Room for Innovation

I was just wrapping up a present and found an interesting bit of innovation: some bright and underappreciated soul at Hallmark had thought to print a little grid on the back side of their wrapping paper. Suddenly, cutting in a straight line is 100x easier.

That’s just living proof that even for products as old as wrapping paper, there’s room to improve. Thanks Hallmark guy, whoever you are.

Filed under: Technology — trcull @ 8:36 am

July 11, 2008

Profiler Now Included in Standard JDK

One of my work buddies just pointed out a new tool in the JDK to help do performance profiling called VisualVM. It looks like it does 80% of what a commercial profiler will do, but it’s free!

I haven’t tried it yet, but I’d be interested in hearing about people’s experiences.

Filed under: Technology — trcull @ 4:39 pm

July 1, 2008

TimeMachine and Linkstation NAS: How do I get thee to reconcile?

I recently bought my very first Mac ever. After years of struggling with crappy movie editing software, draining system resources and general bloat on Windows I finally bought myself a beautiful 24″ iMac and so far the experience has been every bit the bundle of awesomeness I’d hoped it would be. Except for TimeMachine.

I’ve got a 500GB Buffalo Linkstation I used for backup and for storing music files for a year. It’s a great machine that just works and serves my two laptops and a desktop very well. The problem is that the backup software that comes with it only works on Windows.

Crap.

So, rather than let my Linkstation become an expensive bookend, I hit the Internet to find an answer and wasn’t entirely pleased with the options:

1) Turn on this undocumented and apparently not very stable feature on OS X.

2) Rebaseline my Linkstation with an “opened” firmware and use rsync.

3) Purchase some kind of third party backup software for Mac, like SuperDuper!, that handles backups instead of TimeMachine.

I went with option 2. Yeah, I know option 1 sounds easier, but the stuff I read really made it sound unstable and like a waste of time, and anyway the geek in me was really warming to the idea of a Linkstation that was basically a really cheap Linux computer with a giant hard drive.

Many hours later I’m ready to declare success. So what if, for what those hours would cost at my billable rate, I could have just bought a couple of Time Capsules? Instead I learned something new that reinforced my technical chops.

So dear readers, here’s how you do it. Specifically, here’s how you back up from an iMac running OS X 10.5.2 to a Buffalo Linkstation Pro LS-GL running on an ARM processor. With any other combination your results may vary, so make sure you read up before you start.

Back up all your data
The stuff you’re about to do has the potential to brick your Linkstation if you screw it up. Yes, brick it, as in won’t turn on, unrecoverable, gone, expensive book-end. There are some instructions on un-bricking a Linkstation but thankfully I didn’t have to try them out.

Download the jtymod firmware for Linkstation
Some kind soul created a hacked firmware called jtymod that is basically the standard firmware with telnet, ssh, and a few other necessary services enabled. Make sure you get the “fixed” jtymod version not the regular jtymod version. I don’t know what the fix was for, but the forum posts I read made it sound pretty necessary. Once you’ve downloaded it, you just unzip it and you have both the installer and the binary images you need to install. This part worked like a dream for me so I don’t have any advice for troubleshooting. If you run into trouble, the Linkstation hacker wiki is an excellent resource.

Install it using the included installer exe
It took me a while to figure out that the installer was included in the zip file. I looked all over Google for general instructions on “how to install firmware images on a Linkstation” before I finally figured out I didn’t need them. Just run the exe included in the firmware zip file. Maybe you’re smarter than I am and figured that out yourself.

Reset the Root Password
You should see an option to reset the root password on your Linkstation now, as shown in this screen shot. After you clear the password, you should be able to set it again by logging in as “root” with a blank password and then changing it:

ssh root@192.168.1.1

passwd root

Write a simple shell script using rsync
The BackupUsers.sh shell script is stone simple:

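#!/bin/sh
# Start each run with fresh log files (-f: no error if they don't exist yet)
rm -f /Users/trcull/Documents/BackupUsers.stdout
rm -f /Users/trcull/Documents/BackupUsers.stderr
echo "Starting BackupUsers.sh" > /Users/trcull/Documents/BackupUsers.stdout
date >> /Users/trcull/Documents/BackupUsers.stdout
# Wait a minute before starting the copy
sleep 60
# Append rsync's output and errors to the log instead of overwriting the header lines
rsync -va /Users admin@192.168.1.6:/mnt/disk1/macbackup >> /Users/trcull/Documents/BackupUsers.stdout 2>&1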

You can see some much more involved examples on the rsync site and on the Linkstation wiki rsync page.

Set up some shared keys so you don’t have to type a password
If you want this job to run unattended, you have to set up some shared keys between your Linkstation and your Mac.

Configure it to run using launchd
Apple has decided to replace crond with something called launchd. There’s a decent writeup for using launchd to get you started. I ended up creating a plist file that looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
	<key>Debug</key>
	<true />
	<key>ExitTimeOut</key>
	<integer>7100</integer>
	<key>Label</key>
	<string>net.kroppel.BackupUsers</string>
	<key>OnDemand</key>
	<false />
	<key>Program</key>
	<string>/Library/Scripts/CustomScripts/BackupUsers.sh</string>
	<key>StartInterval</key>
	<integer>7200</integer>
	<key>ThrottleInterval</key>
	<integer>7000</integer>
	<key>UserName</key>
	<string>trcull</string>
</dict>
</plist>

You may wonder why I have a ThrottleInterval in there. That’s because my rsync exits with a code 23 because it doesn’t have permissions on some of the directories it’s trying to copy. This causes launchd to think the script failed and needs to be spawned again. It took me days of my Linkstation constantly thrashing on its drive before I realized what was going on.

Done!
Now you’re in backup heaven. Still glad you didn’t just give in and buy a Time Capsule instead?

Filed under: Green, Technology — trcull @ 10:39 pm

June 27, 2008

ResultSet: Watch the Scroll Type

It turns out that using ResultSet.TYPE_SCROLL_INSENSITIVE or ResultSet.TYPE_FORWARD_ONLY on a java.sql.ResultSet can potentially give you a vastly different memory footprint. For example, we discovered today that a process pulling 42,000 rows out of a database and converting them into objects might take 70MB to do its job, or 800MB to do its job, depending on which type you use (at least with a Sybase jconn2 JDBC driver).

That get your interest? Read on to see my observations and my wild-ass guess at why.

Again profiling with JProfiler (really, you should not be doing performance analysis without a profiler or you will never find these kinds of things) I discovered that, in particular, ResultSet.getInt() using jconn2 is an incredible memory pig. It uses 7 times as much memory in temporary objects as calls to getDate(), getString() and getDouble(). I’m talking about objects that it uses in the process of building the “int” it returns to you, not the actual “int” (which is clearly very small). So, why do you care? If you’re using TYPE_FORWARD_ONLY you don’t have to care much unless you’re just trying to get the garbage collector to run less often. But if you’re using TYPE_SCROLL_INSENSITIVE (I can’t speak for TYPE_SCROLL_SENSITIVE because I didn’t test it) then you should care. A lot.

The reason is that the ability to scroll backwards and forwards with your result set doesn’t come free. It requires the result set to hold onto a lot more data internally instead of letting it go right away. So, consequently, with a forward only result set we were seeing the memory released every time we moved from one row to another, but with the scrolling result set it was holding onto all the memory until the very bitter end, when the result set was finally closed. For the application in question, that meant on our highest volume day of the year we simply couldn’t start. Even though the data we were trying to cache on startup only totalled 70MB or so in the end, it required (and held onto until the result set was closed) a whopping 800MB of memory to build that 70MB of data. That 800MB, when combined with the other data already in memory, meant we exceeded the 1.3GB limit for a JVM running on 32-bit Windows. We were stuck.
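For anyone who wants to see what asking for a forward-only cursor looks like, here is a minimal sketch. The class name ForwardOnlyLoader, the method loadIds and the idea of reading a single int column are all made up for illustration; only the java.sql calls are real. It also uses try-with-resources, so it assumes Java 7 or later. How much memory this saves will depend on your driver.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reads the first column of every row exactly once, front to back.
final class ForwardOnlyLoader {
    private ForwardOnlyLoader() {}

    static List<Integer> loadIds(Connection connection, String sql) throws SQLException {
        List<Integer> ids = new ArrayList<Integer>();
        // Ask explicitly for a forward-only, read-only cursor rather than a scrollable one.
        try (PreparedStatement statement = connection.prepareStatement(
                     sql, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
             ResultSet rows = statement.executeQuery()) {
            while (rows.next()) {
                // Copy out what we need; we never go back to an earlier row.
                ids.add(rows.getInt(1));
            }
        }
        return ids;
    }
}
```

If the code only ever walks the rows once, as a cache loader usually does, there is nothing to gain from a scrollable result set, so spelling out the type at the call site makes the choice visible to the next person who reads it.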

Credit for this last find goes to someone I work with named Shibu; without him I’d probably still be at work trying to figure out where those extra 730MB were coming from.

Filed under: Technology — trcull @ 8:04 pm

June 20, 2008

hashCode(): The Easiest Way to Kill Application Performance

There’s no faster way to kill your application’s performance than by implementing an inefficient hashCode() function in some low-level, commonly used class. As with SimpleDateFormat problems, I’ve seen this hashCode() problem twice now “in the wild” and the results aren’t pretty. Both times, the problem could only be diagnosed by using a profiler (JProfiler in both cases), which bolsters even further the argument that you’re wasting your time trying to do performance improvement if you’re not using a profiler.

Why does hashCode() matter so much, you ask? Think about what happens every time your application puts an object into a java.util.Collection class, especially (wait for it….) Hashtable and HashMap. Notice something common there? Yes, that’s it, the word “hash”, as in “hash function”, which means that each of those collection classes will call your humble little hashCode() function millions of times. So if your hashCode() function is doing anything more exotic than returning an int it had already pre-calculated in the constructor then you’ve got problems.

The first time I saw this problem in the wild was an application that was displaying domain objects in a Swing (more precisely, JIDE) table by storing the domain objects directly in a table model. The table model was both sortable and filterable, so the sorting and filtering algorithms ended up calling hashCode methods several million times over the course of a minute, especially when the user was scrolling back and forth. Ultimately, the problem came down to an Identifier class that was often used as a key:

public class Identifier {
       private Object type;
       private Object value;

       public Identifier(Object type, Object value){
               this.type = type;
               this.value = value;
       }

       // ...some stuff, including equals()...

       public int hashCode() {
               // recomputed from both fields on every single call
               return type.hashCode() ^ value.hashCode();
       }
}

The application wasn’t very responsive, so we profiled it and discovered that this implementation took something crazy like 80% of the time used by the Swing event thread simply calling hashCode().
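Here is a rough sketch of the “pre-calculate it in the constructor” idea applied to a class like this one. CachedIdentifier is a made-up name, not the class from the real application, and the sketch only makes sense if type and value are themselves immutable, so that their hash codes never change after construction.

```java
// Hypothetical variant of Identifier that pays for the hash once, at construction.
final class CachedIdentifier {
    private final Object type;
    private final Object value;
    private final int hash; // assumes type and value never change after construction

    CachedIdentifier(Object type, Object value) {
        this.type = type;
        this.value = value;
        // Same formula as the original class, evaluated exactly once.
        this.hash = type.hashCode() ^ value.hashCode();
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof CachedIdentifier)) return false;
        CachedIdentifier that = (CachedIdentifier) other;
        return type.equals(that.type) && value.equals(that.value);
    }

    @Override
    public int hashCode() {
        // Just a field read, no matter how many million times a table model asks.
        return hash;
    }
}
```

The trade-off is one extra int per instance in exchange for hashCode() costing a field read.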

The second time, we had an application that was taking forever and a day to start up and prime its caches of data. One of many culprits turned out to be a custom class to implement the concept of a compound key:

import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class Key {
       private Map members;

       public Key(Map members){
               this.members = members;
       }

       // ...some stuff, including equals()...

       public int hashCode() {
               Iterator i = members.entrySet().iterator();
               List sortedSet = new ArrayList();
               while (i.hasNext()){
                       Map.Entry entry = (Map.Entry) i.next();
                       sortedSet.add(entry.getValue().toString().trim());
               }

               // sort is because members.entrySet() has no guaranteed order
               Collections.sort(sortedSet);

               Iterator j = sortedSet.iterator();
               String hashStr = "";
               while (j.hasNext()){
                       hashStr += j.next();
               }
               return hashStr.hashCode();
       }
}

This code is just downright awful, performance or no performance, though in its defense I will say it was written many years ago by a team of guys who were both junior and new to Java from VB6. Profiling the application revealed that it was spending 70% of its cache loading time simply calling Key.hashCode()–and far more time calling hashCode() than it even spent pulling the data it was caching out of the database.
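For comparison, here is one way a compound key like that could be sketched so the expensive work happens once. This is not what the original team wrote and not necessarily what we ended up shipping; CachedKey is a made-up name. It also swaps in a different hash: instead of sorting and concatenating trimmed strings, it leans on Map.hashCode(), which is specified as the sum of the entry hash codes and so does not depend on iteration order. It assumes the member values are immutable.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical compound key that copies its members and hashes them once.
final class CachedKey {
    private final Map<String, Object> members;
    private final int hash;

    CachedKey(Map<String, ?> members) {
        // Defensive copy: later changes to the caller's map cannot alter this key.
        this.members = Collections.unmodifiableMap(new HashMap<String, Object>(members));
        // Order-independent by definition, so no sorting is needed.
        this.hash = this.members.hashCode();
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof CachedKey)) return false;
        CachedKey that = (CachedKey) other;
        // Cheap hash comparison first, full map comparison only if the hashes match.
        return hash == that.hash && members.equals(that.members);
    }

    @Override
    public int hashCode() {
        return hash;
    }
}
```

Note that this version hashes the member names as well as the values and does no trimming, so it is not a drop-in replacement if other code depends on the old hash values.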

Both these cases had two things in common: 1) the classes were very low level, ubiquitous, and probably written early in the project, and 2) they both ended up being a key in a Map. Both of them also prove a mantra I’m developing: if you’re trying to do performance tuning without the hard data you get from a profiler, then you are wasting your time. Without a profiler I would never have thought to look at them.

Filed under: Technology — trcull @ 8:57 pm

May 12, 2008

Difference Between Java and Ruby

Want to know the difference between Java (or more precisely, Struts) and Ruby (or more precisely, Rails)? Here is an example that says it all.

Filed under: Ruby, Technology — trcull @ 9:33 pm

April 21, 2008

SimpleDateFormat: Performance Pig

Just yesterday I came across this problem “in the wild” for the third time in my career so far: an application with performance problems creating tons of java.text.SimpleDateFormat instances. So, I have to get this out there: creating a new instance of SimpleDateFormat is incredibly expensive and should be minimized. In the case that prompted this post, I was using JProfiler to profile this code that parses a CSV file and discovered that 50% of the time it took to suck in the file and make 55,000 objects out of it was spent solely in the constructor of SimpleDateFormat. It created and then threw away a new one every time it had to parse a date. Whew!

“Great,” you think, “I’ll just create one, static instance, slap it in a field in a DateUtils helper class and life will be good.”

Well, more precisely, life will be good about 97% of the time. A few days after you roll that code into production you’ll discover the second cool fact that’s good to know: SimpleDateFormat is not thread safe. Your code will work just fine most of the time and all of your regression tests will probably pass, but once your system gets under a production load you’ll see the occasional exception.

“Fine,” you think, “I’ll just slap a ‘synchronized’ around my use of that one, static instance.”

Ok, fine, you could do that and you’d be more or less ok, but the problem is that you’ve now taken a very common operation (date formatting and parsing) and crammed all of your otherwise-lovely, super-parallel application through a single pipe to get it done.

What would be better is to use a ThreadLocal variable so you can have your cake and eat it, too:

import java.text.DateFormat;
import java.text.ParseException;

public class DateUtils {

    public static final String MY_STANDARD_DATE_FORMAT = "yyyyMMdd";

    public static java.util.Date parseDate(String dateString) throws ParseException {
        return getFormat().parse(dateString);
    }

    // each thread lazily gets its own SimpleDateFormat instance
    private static ThreadLocal format = new ThreadLocal(){
        protected synchronized Object initialValue() {
            return new java.text.SimpleDateFormat(MY_STANDARD_DATE_FORMAT);
        }
    };

    private static DateFormat getFormat(){
        return (DateFormat) format.get();
    }
}

I hope this code works because I wrote it on the fly and haven’t tried to run it, but you get the point.
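For readers on a newer JDK, the same idea can be sketched a little more compactly with generics and ThreadLocal.withInitial, which needs Java 8 or later. ThreadLocalDateParser and its two methods are names I made up for this sketch; the pattern string is the one from the class above.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical Java 8+ variant: one SimpleDateFormat per thread, created lazily.
final class ThreadLocalDateParser {
    static final String PATTERN = "yyyyMMdd";

    // Each thread that calls FORMAT.get() receives its own private instance,
    // so no locking is needed and the constructor runs once per thread.
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
            ThreadLocal.withInitial(() -> new SimpleDateFormat(PATTERN));

    private ThreadLocalDateParser() {}

    static Date parse(String text) throws ParseException {
        return FORMAT.get().parse(text);
    }

    static String format(Date date) {
        return FORMAT.get().format(date);
    }
}
```

Callers just use ThreadLocalDateParser.parse("20080421") from whatever thread they happen to be on. In a thread pool the per-thread instances live as long as the pooled threads do, which is usually the point, but it is worth knowing about.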

Filed under: Technology — trcull @ 8:35 pm