Archive for December, 2005

December 18th, 2005

Bridging the Gap, Part 2

by Tim Cull

Part one of this series talked about some of the processes involved in writing software in industry, but what about the cool stuff? What about technologies?

When I was in school, I had two entire quarters of data structures and algorithm analysis. There I learned to code linked lists, hash tables, heaps, queues, etc. And I learned to analyze them and distill them down to their Big-O performance profile.

And in seven years of software development, I have never written or analyzed a data structure like that again. Definitely, the knowledge I gained from that class has informed much of what I’ve done, but actually writing linked lists myself is a thing of the past.

There’s much more to software development than just pure computer science. Even though you’ve spent 4 (or more) years in school learning abstract technologies, there are certain applied technologies that nearly every recent college grad needs to know but usually hasn’t yet been taught.

1. SQL and Relational Databases. If you are getting a job writing business software, then you need to know this backwards and forwards. And I’m not talking about just MySQL; you should download and play with one of the developer versions of Oracle, Sybase, or Microsoft SQL Server and familiarize yourself with them. You should understand what an index is, when to use one, and why you can’t just use them everywhere. You should understand joins. You should understand data modelling and normalization. And of course, you should know the syntax for basic select, insert, update and delete. It also doesn’t hurt to familiarize yourself with the database layer of one or more programming languages, such as JDBC (Java), ODBC (C++), or ADO (.Net).

2. Graphical User Interface (GUI) libraries. Sure, you know Java, C#, C or C++. But one thing I’ve come to realize is that learning the basic syntax of any language is easy, while getting to know the libraries people use to write useful software in that language takes a very long time. A good place to start is with the GUI libraries because most (but admittedly not all) software needs to interact with actual users in one way or another. Some examples are: MFC (C++), Swing/AWT/SWT (Java), Win Forms/Web Forms (.Net), HTML/JavaScript/AJAX (various).

3. Distributed computing. Businesses, especially large ones, tend to have their software infrastructure spread all over the place. Different development teams have different schedules and have made different technology choices. Additionally, some businesses have a real need for scale as far as the number of transactions they can handle at the same time. These two needs are really what have spawned a whole bevy of distributed computing technologies over the years. The latest that you should familiarize yourself with are Enterprise Java Beans (Java) and web services (various, especially .Net).

4. Unit testing. JUnit, NUnit, PHPUnit, or any *Unit will do. They all evolved from JUnit anyway. You’ll never really realize how valuable a full suite of automated regression tests is until the very first time you have to refactor a big piece of existing software. After that, you’ll never forget how valuable they are. Automated unit tests are also at the heart of Agile development, which is all the rage these days.
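To make items 1 and 4 a little more concrete, here is a minimal sketch in Python, using the standard library's sqlite3 and unittest modules (unittest is another member of the *Unit family). It is only an illustration under stated assumptions: the customers and orders tables, the index name, and the total_by_customer function are all made up for this example, and SQLite stands in for whatever database you would really use.

```python
import sqlite3
import unittest


def create_schema(conn):
    """Create two normalized tables plus an index on the join column.

    The table, column, and index names are hypothetical.
    """
    conn.executescript(
        """
        CREATE TABLE customers (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE orders (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            amount      REAL NOT NULL
        );
        -- The index speeds up lookups by customer_id, but it has to be
        -- maintained on every write, which is why you do not index everything.
        CREATE INDEX idx_orders_customer ON orders(customer_id);
        """
    )


def total_by_customer(conn):
    """Join orders to customers and return {customer name: total amount}."""
    rows = conn.execute(
        """
        SELECT c.name, SUM(o.amount)
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.name
        """
    ).fetchall()
    return dict(rows)


class TotalByCustomerTest(unittest.TestCase):
    def setUp(self):
        # Every test gets a fresh in-memory database with known rows.
        self.conn = sqlite3.connect(":memory:")
        create_schema(self.conn)
        self.conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ann')")
        self.conn.execute("INSERT INTO customers (id, name) VALUES (2, 'Bob')")
        self.conn.executemany(
            "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
            [(1, 10.0), (1, 5.0), (2, 7.5)],
        )

    def tearDown(self):
        self.conn.close()

    def test_totals_are_summed_per_customer(self):
        self.assertEqual(total_by_customer(self.conn), {"Ann": 15.0, "Bob": 7.5})

    def test_update_and_delete_change_totals(self):
        self.conn.execute("UPDATE orders SET amount = 20.0 WHERE customer_id = 2")
        self.conn.execute("DELETE FROM orders WHERE customer_id = 1")
        # Ann has no orders left, so the inner join drops her from the result.
        self.assertEqual(total_by_customer(self.conn), {"Bob": 20.0})
```

Saved to a file, the tests can be run with python -m unittest. If someone later rewrites the query inside total_by_customer, these two tests are the safety net that tells you whether the refactoring changed its behavior.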

Not sure what Agile development is? Stay tuned for part 3, which will introduce some development methodologies.

December 5th, 2005

Bridging the Gap

by Tim Cull

Many people assume that once they have a bachelor’s or master’s degree in computer science, they’ve got everything they need to succeed as a professional. But it takes a couple of years struggling in the early part of your career to realize that, in fact, there’s a lifetime of learning ahead, and most of it doesn’t (directly) deal with computer science.

College does a great job of laying a theoretical foundation to start from and giving you a common technical vocabulary to communicate with others. In college, you develop small, focused programs specifically designed to explore a certain concept. In some cases, you even work for a few weeks on small teams. But no matter how big or long those projects may seem in college, they’re nothing compared to the scale of the software you’re going to end up writing in industry.

Interesting things happen with scale. I mean scale in many different ways: large code base, large user base, large developer teams, large time horizons, large amounts of money at stake, etc. Each of these big gorillas brings with it its own challenges and its own tools to deal with those challenges.

Let’s start with source control. When you’re just working on a three-member team for a couple of weeks in the school lab on a throw-away program, using something like CVS might seem like more of a pain than it’s worth. And what your professors won’t tell you is that in that scenario it really is more of a pain than it’s worth. But if they’re good professors, they encourage you to use it anyway, just to get used to it.

Why? Try imagining that project stretching on for years instead of weeks. Imagine dozens of contributors coming and going. Imagine trying to share a code base with people in a different building, different state or even different country. Imagine trying to eyeball every one of literally thousands of files to see what’s changed each time you want to compile a release. Now you start to get an idea of why source control is important and why, literally, people have written entire books about version control.

But back up a minute, what’s a “release”? When you’re in college, your development cycle consists of two phases: 1) write and test the code in the lab, and 2) email (or whatever) it to your professor. Things get a lot more complicated in industry, though. If you do internal software development, then when you finish your code, it’s going to move into a production environment where it will immediately start affecting real people, real business processes, and real money. If you work for a software vendor, your code is going to be packaged up with a whole marketing blitz, rounds of support training, and installation at customer sites where it will…start affecting real people, real business processes, and real money.

Wait a minute, what’s a “production environment?” In school, you do your coding in the same lab everyone else is using to do who knows what. In industry, (smart) companies have strict divisions between environments. At a minimum, they have a development environment and a production environment. Development is where you do your coding and production is where the company makes its money. There’s no end to the environments many businesses will have:
Build: a clean environment where you sanity check that you didn’t forget to check something into source control and that your code plays well with everyone else’s code.
QA: when you think you’re done coding, you install your code in the QA environment and test it some more, usually integrated with everyone else’s code.
UAT: stands for User Acceptance Testing. Usually the same as QA, but sometimes on its own. This is where you turn your creation loose on non-developers so they can test out the ride (and find the bugs you didn’t find).
Production: where the money is made. Often, you will be physically prevented from accessing this environment and will have to rely on a different team of people to install your changes to it. At a minimum, you will have to fill out some paperwork or go through a ticketing and approval process to install your changes here.
BCP: stands for Business Continuity Planning. An exact copy of your production environment that’s hosted somewhere physically separate from your production environment. If you have a component failure (i.e., a database goes down, a hard drive fails, or the network fails) or if the city your production environment sits in is vaporized by terrorists, you fail over to the BCP environment. Think businesses don’t plan to that level of paranoia? Think again.

That’s it for part 1 of this series. Ahead in parts 2 and 3: some technologies and development methodologies you didn’t learn in school.