Ditching A Language
I can't go into the full background and a couple of details have been changed to protect the innocent, but I was chatting with a company that I'll call Acme. They faced a situation that I've seen before and that usually ends badly. The code base they have looks like this:
- Roughly a million lines of legacy spaghetti code
- Very little use of existing libraries ("not invented here" syndrome)
- Siloed developers
- Hard to maintain and extend
- Prospective developers see the code and "nope" the heck out of there
I have spoken to quite a few companies in this mess and Acme had a solution for dealing with it: they were going to rewrite the code base in another language.
Oh really?
A million lines of code, in heavy use, is going to be rewritten? Let's do some rough math.
We know that lines of code is a rubbish metric for productivity, but in this case, it's really all I have to work with. I pulled out an old project of mine and saw that I did six months of development on it. How do I calculate the lines of code I've written? Well, I can't, not really, but for the sake of argument, let's make a quick guess. Given a starting and ending commit, here's how I can calculate how many lines of code I've changed over the course of the project (using git):
git log --numstat --pretty="%H" --author=ovid $first_commit..$last_commit \
| perl -anE 'if (3 == @F){ $a+=$F[0];$b+=$F[1] } END { say "+$a -$b" }'
And that printed my total insertions and deletions: +76294, -32686. That makes for just over 100,000 lines of code changed in six months, but even then, it's a rubbish, rubbish metric. No matter how I play with the numbers, I get between 300 and 1,000 lines of code a day. I'm productive, but that number seems very high. On the other hand, the project lead commented that I was productive enough that it was hard to do code reviews, so maybe that's not too far off.
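For anyone without Perl to hand, the same tally can be sketched in Python. This is only an illustration, assuming the output of `git log --numstat` has already been captured as text; the function name `tally_numstat` and the sample data are made up for the example.

```python
def tally_numstat(text):
    """Sum insertions and deletions from `git log --numstat` output."""
    added = deleted = 0
    for line in text.splitlines():
        # numstat lines have three tab-separated fields: added, deleted, path
        fields = line.split("\t")
        if len(fields) != 3:
            continue  # commit hashes and blank lines
        ins, dels, _path = fields
        # binary files report "-" for both counts, so skip them
        if ins.isdigit() and dels.isdigit():
            added += int(ins)
            deleted += int(dels)
    return added, deleted


# Hypothetical sample: one commit hash, two text files, one binary file.
sample = "abc123\n\n10\t2\tlib/Foo.pm\n-\t-\timg/logo.png\n5\t7\tt/foo.t\n"
print("+%d -%d" % tally_numstat(sample))  # prints "+15 -9"
```

Like the one-liner above, this counts churn (lines touched), not the size of the code base, so it is just as rough a metric.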
Let's forget about the 10 lines of code a day that we sometimes hear and pick 500 lines of code a day. That's being generous. Very generous. Further, let's assume that we can maintain this pace and that the new code base will be written in a language with similar verbosity. At 500 lines a day, a million lines takes 2,000 days, or roughly 5.5 person years of effort to rewrite the code base, but that assumes you're working seven days a week, 365 days a year. In reality US workers typically work roughly 2,000 hours per year, or about 250 days out of the year. That means it would take eight person years of effort to replicate the above code base (over ten years for the average annual hours here in France).
And that's assuming you could consistently crank out 500 lines of code a day. So we'll forget about the lessons of The Mythical Man-Month and assume 8 developers could crank this out in a year. Assuming that each developer annually costs the company around €100,000 (that's salary, insurance, taxes, training, supporting equipment, etc.), and assuming at least two other employees will be there to coordinate the project, that's €1,000,000 they're going to spend.
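The back-of-the-envelope numbers above can be reproduced in a few lines of Python. This is only a sketch of the arithmetic; the function names are made up for illustration, and the inputs are the same generous assumptions used in the text (including the assumption that the two coordinators cost the same as a developer).

```python
def person_years(total_loc, loc_per_day, work_days_per_year):
    """Person years needed to write total_loc at a steady daily pace."""
    return total_loc / loc_per_day / work_days_per_year


def yearly_cost(developers, coordinators, cost_per_head):
    """One year's cost, assuming everyone costs the same per head."""
    return (developers + coordinators) * cost_per_head


# A million lines at 500 lines a day, with no days off at all.
print(round(person_years(1_000_000, 500, 365), 1))  # prints 5.5

# The same pace at roughly 250 working days a year.
print(person_years(1_000_000, 500, 250))  # prints 8.0

# Eight developers plus two coordinators at 100,000 euros each.
print(yearly_cost(8, 2, 100_000))  # prints 1000000
```

Dropping `loc_per_day` to something less heroic shows how quickly the estimate balloons.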
No, that's not true. That's one million euros they're going to flush down the toilet, and then they'll reach for the euro roll again. Acme's developers have meetings. They have "off days". They have to spend time understanding the original code base: they can't just transcribe it. Work will still have to continue on the old code base. Features added to the old would have to be added to the new. Features taken for granted in one code base won't be available in the other. Modules working in one code base won't be available in the other (or worse, may be available but you'll miss all the interface differences). And then there's the work coordination necessary between eight devs. Further, because of the lack of structure in Acme's old code base, simply transcribing it would turn one steaming pile of ones and zeros into another steaming pile of ones and zeros. I suspect the new team would be lucky to get 500 lines of code a day for the entire team. Very lucky. This project, even if wildly successful, is going to cost many millions of euros.
And now Acme has to have the new development team and the old development team sticking around, greatly increasing their costs (or retrain the existing devs). At the BBC, as soon as they announced they were switching from Perl, several Perl devs announced they were quitting and taking their hard-earned business knowledge with them. Why stick with a company when you know you don't have a future? (Actually, your future is likely guaranteed, but grudgingly.) And I might add that Perl is still heavily used at the BBC several years later, because "see above". (And guess why COBOL is still so heavily used? See above.)
Note that the above isn't an analysis. It's a "pie in the sky" best case pipe dream that simply won't happen. You can pull this off for smaller code bases, but even then it's painful, expensive, and time-consuming. The industry is riddled with projects (and companies) that have failed because of rewrites (anyone remember Netscape?).
I have been at multiple companies that have decided to change programming languages, but they're generally not foolish enough to blindly rewrite their systems[1]. I worked with two companies that switched to Ruby on Rails and subsequently ditched it (both for Perl), only to find that they don't have the time to replace it. So now they have legacy Rails apps that they have to maintain because they don't have the time or money to simply rewrite them. And these were small compared to a million line code base.
So Acme has decided to spend millions of euros to switch from their current high risk/low reward position to a new higher risk/low reward position. I fail to see the cunningness here. Unfortunately, I understand what happened. At another company, the CEO panicked in a financial crisis and fired the dev team that had been working for one and a half years to develop a complicated project in part because an outsourcing company in India promised they could replicate it in two months — for a lot less money. Non-technical people often have no understanding of how hard our work is.
Acme's solution to their current woes clearly fits the definition of "large project" and we know from decades of painful experience that large projects fail. They're so disastrous that almost one fifth of large software projects threaten the very existence of the company. Would you dare go to the Board of Directors and say "I have a multi-million euro software project that if it succeeds, will make it easier to hire developers, but it will probably fail and has almost a 1 in 5 chance of bankrupting us?" No, you wouldn't, but Acme did.
1. Except for an insurance company that decided to switch their accounting software from COBOL to C++. They gave their COBOL devs a two-week training course in C++ and told 'em to rewrite the system. I don't need to tell you how that turned out.
Too many lines of text. (By the way, why count changed lines? Just count your lines of code as of the last commit; when you rewrite something, you are assuming that you won't rewrite the same lines twice.)
But to put it briefly: if you decide to change your car's engine in the middle of a race, what could possibly go wrong?
That's really not the case anymore. There are no new Perl products; existing devs maintain existing Perl products that get no new features, and all are being trained in the other, more common languages in use. Those who kicked up sand have either accepted it or moved on.
First, that "10 SLOC/day" figure is for the whole project team on a large project, not just for one coder hacking away. It includes the time to gather, write and review the requirements, the time for each developer to keep up with the changes made by all the other developers, the time of the dedicated test team, formal system testing with written logs of each step, and don't forget the boss and the administrative assistant.
Second, the 1 MSLOC of legacy spaghetti will usually be much smaller when rewritten, as long as the organisation resists Second System Syndrome. In between the terseness of a higher level language, removal of duplicated cut-and-paste code and obsolete functionality, an order of magnitude improvement might well be achievable. Replacing 1 MSLOC with 100 kSLOC looks like a much better deal, especially if there is a 10% annual churn on the existing codebase.
On the other hand, if I remember correctly, Youporn switched in 2012 too, and successfully.
Everybody always points out Netscape - and yet some projects manage a rewrite successfully.
I'd say it's a lot more interesting to look at what they did right, because some really do manage it, and I can't imagine that this is only because they have amazing wizards at hand. There must be more.
Adam Turoff has responded to you on the topic of Netscape before, Ovid. :-)
http://notes-on-haskell.blogspot.de/2007/08/rewriting-software.html
http://notes-on-haskell.blogspot.de/2007/09/rewriting-software-part-2.html
So... what is the solution?
There are times where a legacy technology has limitations that ultimately prevent progress, and create maintenance nightmares. A good example is some of the older database technologies where a 1TB database machine could cost 100k (example: Sybase).
You could try to maintain a series of expensive databases, but between replication, backups, dev boxes, etc., suddenly your costs to keep the old technology are really high. And if these are databases storing expensive financial data, well, maybe the risk of hitting your 1TB limit is an expensive risk to have on your plate.
And it might take 10MM to migrate to a new technology, but in this case it would probably be worth a switch.
When technology is a commodity, and working poorly is still working, then maybe it's hard to justify a switch. But if your old technology starts to limit your performance and introduce risks or problems that detract from your competitive advantage, then you might not have a choice.
A couple comments.
You can use the cloc (count lines of code) utility to get a more accurate count of lines of code in most languages:
http://cloc.sourceforge.net/
cloc does not count blank lines or comment lines.
Lines of code is a very rough measure, partly because simple changes in style can cause large changes in the count even using fancy line counting tools such as cloc that try to avoid misleading things like counting comment lines or blank lines.
Even accounting for style and similar issues, there seems to be wide variation from project to project in productivity using lines of code (or other metrics) depending on the inherent complexity of the algorithm or task, the required "quality" of the program or system, and other factors. So the 10 lines of code per day, or the output of models like Barry Boehm's COCOMO (which comes out at roughly 10 lines of code per day), is unfortunately very rough and may not be applicable at all to some projects.
John
10 lines of code? 500 lines of code?
I'm not sure how citing useless metrics adds anything to this...
This is why you should refactor early & often.
"Prospective developers see the code and "nope" the heck out of there"
great line.
The number of lines you code per day will not be a good metric for productivity.
Their current situation is untenable and the solution will bankrupt them. That's one way for a company to fail.
You're telling us about the big rewrite: http://chadfowler.com/blog/2006/12/27/the-big-rewrite/ .
Which is often doomed.
The good way to do it is progressive refactoring - find a way to interface legacy code with newly written code (IPC, sockets, whatever) and progressively switch off parts of the old codebase. This of course requires intervention on the old codebase, but if at any time you run out of time or money you can just stop porting, and you can start porting the piece that's most buggy or that would need improvement.
It's not easy to do, but it can be done with a higher success rate than the big rewrite.
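To make the idea concrete, here is a minimal sketch in Python of routing work between old and new code. Everything in it is hypothetical: `Router`, `legacy_handler` and `new_handler` are made-up names, and in a real system the legacy handler would be a call into the old code base over IPC, a socket, or whatever interface is available.

```python
def legacy_handler(request):
    # Stand-in for a call into the old code base (IPC, socket, etc.).
    return "legacy:" + request


def new_handler(request):
    # Stand-in for a freshly written replacement of one feature.
    return "new:" + request


class Router:
    """Send each feature to new code if it has been ported, else to legacy."""

    def __init__(self, legacy):
        self.legacy = legacy
        self.ported = {}

    def port(self, feature, handler):
        # Switch one feature over to the new implementation.
        self.ported[feature] = handler

    def handle(self, feature, request):
        handler = self.ported.get(feature, self.legacy)
        return handler(request)


router = Router(legacy_handler)
print(router.handle("invoice", "42"))  # prints "legacy:42"

router.port("invoice", new_handler)
print(router.handle("invoice", "42"))  # prints "new:42"
print(router.handle("report", "7"))    # prints "legacy:7"
```

If the time or money runs out, whatever has not been ported simply keeps going through the legacy path.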
Why this will succeed #1: the new language, unlike that old one, makes it impossible to write spaghetti code.
Why this will succeed #2: the original developers had "write spaghetti code" as their mission statement. The new developers now know that this is a bad idea for million-line programs, and so they are sure to avoid that trap.
Why this will succeed #3: the managers on the new project will put code cleanliness above all other priorities. If the programmers say they need an extra 6 months (say, to reverse engineer an undocumented feature of the old program), the management will gladly agree to this every time.
I'm predicting great success!
Have you considered using an automatic translation app? I think I have seen these on the iPhone. Basically, you photograph each page of the original source code and then click the TRANSLATE button. The app then generates the same code in the target language.
This should be a lot better than doing it by hand.
Many codebases that are under constant development and need to deliver new features at a steady pace end up in a situation where even simple changes are either a lot of work, carry a lot of risk, or both. In particular when you have added a lot of features that the underlying architecture was never intended for.
Sometimes you should start over.
The size of the code doesn't necessarily tell the whole story either. Years ago I was moved to a project that was struggling. The codebase wasn't particularly large (less than 80kLOC), but it was rather messy and there was a lot of trivial duplication. When I was done rewriting the code from scratch I think I ended up with two libraries that were about 800-1000 LOC each and a handful of programs on the order of 50-100 LOC each. In addition I wrote perhaps 200 LOC in unit tests and the equivalent of perhaps 4-5 pages of documentation. "Fixing" the original code would have been a pointless exercise because the original author neither understood the problem he was trying to solve nor did he seem to have any kind of plan.
Changing languages may not be a bad idea if what you are doing is catered for by the new language or libraries available in the other language. If you can build on known good libraries.
(Choice of language is one of those hard decisions where occasionally someone has to be the grownup and stand up to the whims of inexperienced developers. Right now thousands of developers are suffering because some hipster dipshit threw temper tantrums to be allowed to use some "interesting" language he/she wanted to learn at the time, but that the organization had zero experience with)
Hi, I think your article is very good; I actually read it twice. I feel your pain about changing the language of an entire project. What are your thoughts on, instead of moving an entire codebase to another language, extracting parts of the app and building small services in whatever language you want?
Thanks for the article. David
What on earth makes you think that the end result of the rewrite should have as many lines of code as the original?
Rewrites are a brilliant idea, and they can be done in a controlled fashion.
I don't get why you have to assume that a million lines of code will still be a million lines of code after being rewritten. If you are rewriting it well, chances are that it might reduce the LOC by 5 times, maybe even more (it all depends on how screwed up the legacy code was).
I do get the point that rewriting code is a whole new problem altogether, but your math does not add up!
Joel Spolsky also argued against big rewrites: http://www.joelonsoftware.com/articles/fog0000000069.html
Hi, Summary: Refactor for the win.
I am working on a point of sale system for a company in a similar situation. We have a 500K vb.net system which seems to have been auto-translated from VB6. The design is wretched and the coding worse.
A review of the costs associated with a rewrite came to more than $1M so it was decided to refactor and improve the UI where possible. This seems to have been the best possible choice - the project is now nearing completion.
The refactor cost $100k for programming and testing staff and a further $200K for new computers with bigger touch screens (allowing improved UI).
Along the way we managed to lose 50% of the original code with a huge improvement in load time. We also added dozens of new features and fixed innumerable original bugs.
You can't reduce LOC by five times. You can reduce it at most by 100%; that is, one time.
Bjørn you are now officially one of my heroes :-)
Some management people often overlook basic stuff, maybe because some of them never had a job as a developer.
Give me a manager who has been a developer for 10 hard years of complications and related stuff, and he will be able to cope with these "hard" decisions.
LOC is actually not a problem for many Java programmers, thanks to their great IDEs.